From patchwork Thu Jun 21 01:53:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932466 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Pc1lF5Yw"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4X72JZJz9s4w for ; Thu, 21 Jun 2018 11:55:03 +1000 (AEST) Received: from localhost ([::1]:52477 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooO-0003yK-Qm for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39102) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonb-0003wx-8g for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonZ-0002uA-6v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:11 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:45790) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonY-0002t6-Sw for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:09 -0400 Received: by mail-pl0-x244.google.com with SMTP id o18-v6so58224pll.12 for ; Wed, 20 Jun 2018 18:54:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TtyyOtHe5R2tPmByYKa9JiSI9Mt4avEsP8TOldIsB08=; b=Pc1lF5YwMM3AeAh2r3jwPwQTED0OxlgWv6erb3xYZTSrnqyWm7/jr1uFvuJHJxRHV9 ybsm6KBUuP7aJhth7MO8oKIYIISSnV+SL90jOWdC3Ha6hRcQIu3ApkhO70KqwEkFWgsn uSZ9uBv25bqS2jwfb1QwOicxQrTyPyRdMf9Aw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TtyyOtHe5R2tPmByYKa9JiSI9Mt4avEsP8TOldIsB08=; b=e+t1bGCJ8xHLgdXlOFdkCK6YMddt+oO6pr5Rn3IbmdoJx2FFnKUBpf6Z+ThRwN7+JT e7kcEpP6310RyLBhpBtK+O0U/XitbqW+X4vKGG7KRL8X/399eqnsayklm/ut4i3zSoxt xk+0qPsiDdkegg5Q0ox2V1WiCP5ROkNL2FDh5B2Mu/J5Fir1Z94D9ZzP6B1ikQmefXhH ac8W59ll+xUSELmtZ7F3/rqr5GH2eyvdZBGjuwwYKSJivWekoZV5zT5BySFczKpJ7rXf N1IzsChcphEIXgzKTTJezzbfs/vaTarajqQPyd0L5JJFQ/iN+XMONpqhn4bODS5bABpA ABbg== X-Gm-Message-State: APt69E2eKYny7CMtO3lKH3dWK/XYtdGr1tGB9WYQBCoE+bkJ9xasBF+6 3Uwg8qvU/O2Nw8AhH1nPHtwR3qeaQZo= X-Google-Smtp-Source: ADUXVKJnm4Xxjn2XGHWkriJ3E26HcxcQKEQqSG5pJowh1rDSbf9ABCmqhTFQM392sLARFWfN7n1rMQ== X-Received: by 2002:a17:902:b706:: with SMTP id d6-v6mr26453597pls.105.1529546047442; Wed, 20 Jun 2018 18:54:07 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:25 -1000 Message-Id: <20180621015359.12018-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v5 01/35] target/arm: Implement SVE Memory Contiguous Load Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée --- target/arm/helper-sve.h | 35 +++++++++ target/arm/sve_helper.c | 153 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 121 +++++++++++++++++++++++++++++ target/arm/sve.decode | 34 +++++++++ 4 files changed, 343 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e76084992..fcc9ba5f50 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 128bbf9b04..4e6ad282f9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2810,3 +2810,156 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } + +/* + * Load contiguous data, protected by a governing predicate. + */ +#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \ +static void do_##NAME(CPUARMState *env, void *vd, void *vg, \ + target_ulong addr, intptr_t oprsz, \ + uintptr_t ra) \ +{ \ + intptr_t i = 0; \ + do { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + m = FN(env, addr, ra); \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } while (i < oprsz); \ +} \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + do_##NAME(env, &env->vfp.zregs[simd_data(desc)], vg, \ + addr, simd_oprsz(desc), GETPC()); \ +} + +#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + *(TYPEE *)(d4 + H(i)) = m4; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4) +DO_LD1(sve_ld1bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, ) +DO_LD1(sve_ld1bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, ) + +DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4) +DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4) +DO_LD1(sve_ld1hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, ) +DO_LD1(sve_ld1hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, ) + +DO_LD1(sve_ld1sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, ) +DO_LD1(sve_ld1sds_r, cpu_ldl_data_ra, uint64_t, int32_t, ) + +DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) + +DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) + +DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) + +DO_LD1(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD2(sve_ld2dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD3(sve_ld3dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) +DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) + +#undef DO_LD1 +#undef DO_LD2 +#undef DO_LD3 +#undef DO_LD4 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 226c97579c..3543daff48 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -42,6 +42,8 @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); + /* * Helpers for extracting complex instruction fields. */ @@ -82,6 +84,15 @@ static inline int expand_imm_sh8u(int x) return (uint8_t)x << (x & 0x100 ? 8 : 0); } +/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype) + * with unsigned data. C.f. SVE Memory Contiguous Load Group. + */ +static inline int msz_dtype(int msz) +{ + static const uint8_t dtype[4] = { 0, 5, 10, 15 }; + return dtype[msz]; +} + /* * Include the generated decoder. */ @@ -3526,3 +3537,113 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) } return true; } + +/* + *** SVE Memory - Contiguous Load Group + */ + +/* The memory mode of the dtype. */ +static const TCGMemOp dtype_mop[16] = { + MO_UB, MO_UB, MO_UB, MO_UB, + MO_SL, MO_UW, MO_UW, MO_UW, + MO_SW, MO_SW, MO_UL, MO_UL, + MO_SB, MO_SB, MO_SB, MO_Q +}; + +#define dtype_msz(x) (dtype_mop[x] & MO_SIZE) + +/* The vector element size of dtype. */ +static const uint8_t dtype_esz[16] = { + 0, 1, 2, 3, + 3, 1, 2, 3, + 3, 2, 2, 3, + 3, 2, 1, 3 +}; + +static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + gen_helper_gvec_mem *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* For e.g. LD4, there are not enough arguments to pass all 4 + * registers as pointers, so encode the regno into the data field. + * For consistency, do this even for LD1. + */ + desc = tcg_const_i32(simd_desc(vsz, vsz, zt)); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fn(cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static void do_ld_zpa(DisasContext *s, int zt, int pg, + TCGv_i64 addr, int dtype, int nreg) +{ + static gen_helper_gvec_mem * const fns[16][4] = { + { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r, + gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r }, + { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r, + gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r }, + { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r, + gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r }, + }; + gen_helper_gvec_mem *fn = fns[dtype][nreg]; + + /* While there are holes in the table, they are not + * accessible via the instruction encoding. + */ + assert(fn != NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), + (a->nreg + 1) << dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); + } + return true; +} + +static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> dtype_esz[a->dtype]; + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) + << dtype_msz(a->dtype)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6f436f9096..cfb12da639 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -45,6 +45,9 @@ # Unsigned 8-bit immediate, optionally shifted left by 8. %sh8_i8u 5:9 !function=expand_imm_sh8u +# Unsigned load of msz into esz=2, represented as a dtype. +%msz_dtype 23:2 !function=msz_dtype + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -71,6 +74,8 @@ &incdec2_cnt rd rn pat esz imm d u &incdec_pred rd pg esz d u &incdec2_pred rd rn pg esz d u +&rprr_load rd pg rn rm dtype nreg +&rpri_load rd pg rn imm dtype nreg ########################################################################### # Named instruction formats. These are generally used to @@ -170,6 +175,15 @@ @incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ &incdec2_pred rn=%reg_movprfx +# Loads; user must fill in NREG. +@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load +@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load + +@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_load dtype=%msz_dtype +@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ + &rpri_load dtype=%msz_dtype + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -665,3 +679,23 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 + +### SVE Memory Contiguous Load Group + +# SVE contiguous load (scalar plus scalar) +LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 + +# SVE contiguous load (scalar plus immediate) +LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0 + +# SVE contiguous non-temporal load (scalar plus scalar) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus scalar) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz + +# SVE contiguous non-temporal load (scalar plus immediate) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus immediate) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz From patchwork Thu Jun 21 01:53:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932467 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="TRzZFb7A"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4X848dxz9s5b for ; Thu, 21 Jun 2018 11:55:04 +1000 (AEST) Received: from localhost ([::1]:52478 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooQ-0003zU-0F for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:55:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39118) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonc-0003xF-MG for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonb-0002wF-0L for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:12 -0400 Received: from mail-pl0-x229.google.com ([2607:f8b0:400e:c01::229]:43894) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVona-0002vM-O9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:10 -0400 Received: by mail-pl0-x229.google.com with SMTP id c41-v6so745871plj.10 for ; Wed, 20 Jun 2018 18:54:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Gdgl5W4Y8Z1LGarHK+ke+sNkDHVsnZS3jcF1WRaF47o=; b=TRzZFb7ArDesOmDb1til8eQb2dKP0bLliOzqL+QGt8G/rWDB4/bTO/GHqMFKrwOcb4 WhqRl2MLiyKgJOetljbTLSM340fd6nWDxKC2EkJC7eo/ausfk+9C4hfZQnCZifX3FtHo moRLX7hlAOSOVOI+HtQRs1giduSmXanUn9+xA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Gdgl5W4Y8Z1LGarHK+ke+sNkDHVsnZS3jcF1WRaF47o=; b=C/pUVyEww9rgNT8tmeCfkf4uY58cHZI0ujqwDB9VxQxW6kpr2y5rDuUcliIuf9AdWG 82S3e5y0WIQqGPapNkJQAf8I2nROM0vs+StlKlurts8ohdEGa+TNwwa0uNp0IWF4bcSC zWC9AWzyryFiaBecTz0SVDZPrWLGHbvMLZga5r5pXG1Ezm+ZFNKBqT2vz+Qd/t7va+W6 W808H1tkiIR970+Y0OaKO1zalA7SFXYdH3qjd+cebJgIUDL2puatBvRk9f1Uq+gAplg3 +ggsC6hXeACHYu5/fKZl75uU45FrV6LWZErzGAk1E27SZlcYiApfkJMxlQBMbN6dxDxb NGoQ== X-Gm-Message-State: APt69E2PqTcQ2nIOt8KbnV4y9t1496K+m4Qj570bAvLwcYby+vajm82x bhmu1qZtEyulRDkfDwi2v/6XJvtqJ7M= X-Google-Smtp-Source: ADUXVKJcM7yMgl2CMAcm9hsxQV6L9Sh0wlQaWimqP91QFtqeXpKTO7Ge5nQ92Tm+rmzQt+2y4cIgzA== X-Received: by 2002:a17:902:b285:: with SMTP id u5-v6mr26781847plr.183.1529546049415; Wed, 20 Jun 2018 18:54:09 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.07 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:26 -1000 Message-Id: <20180621015359.12018-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::229 Subject: [Qemu-devel] [PATCH v5 02/35] target/arm: Implement SVE Contiguous Load, first-fault and no-fault X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée --- target/arm/helper-sve.h | 40 ++++++++++ target/arm/sve_helper.c | 156 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 69 ++++++++++++++++ target/arm/sve.decode | 6 ++ 4 files changed, 271 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fcc9ba5f50..7338abbbcf 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -754,3 +754,43 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldff1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldff1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4e6ad282f9..6e1b539ce3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2963,3 +2963,159 @@ DO_LD4(sve_ld4dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) #undef DO_LD2 #undef DO_LD3 #undef DO_LD4 + +/* + * Load contiguous data, first-fault and no-fault. + */ + +#ifdef CONFIG_USER_ONLY + +/* Fault on byte I. All bits in FFR from I are cleared. The vector + * result from I is CONSTRAINED UNPREDICTABLE; we choose the MERGE + * option, which leaves subsequent data unchanged. + */ +static void __attribute__((cold)) +record_fault(CPUARMState *env, intptr_t i, intptr_t oprsz) +{ + uint64_t *ffr = env->vfp.pregs[FFR_PRED_NUM].p; + if (i & 63) { + ffr[i / 64] &= MAKE_64BIT_MASK(0, (i & 63) - 1); + i = ROUND_UP(i, 64); + } + for (; i < oprsz; i += 64) { + ffr[i / 64] = 0; + } +} + +/* Hold the mmap lock during the operation so that there is no race + * between page_check_range and the load operation. We expect the + * usual case to have no faults at all, so we check the whole range + * first and if successful defer to the normal load operation. + * + * TODO: Change mmap_lock to a rwlock so that multiple readers + * can run simultaneously. This will probably help other uses + * within QEMU as well. + */ +#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \ +static void do_sve_ldff1##PART(CPUARMState *env, void *vd, void *vg, \ + target_ulong addr, intptr_t oprsz, \ + bool first, uintptr_t ra) \ +{ \ + intptr_t i = 0; \ + do { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + if (!first && \ + page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \ + record_fault(env, i, oprsz); \ + return; \ + } \ + m = FN(env, addr, ra); \ + first = false; \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } while (i < oprsz); \ +} \ +void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + mmap_lock(); \ + if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \ + do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \ + } else { \ + do_sve_ldff1##PART(env, vd, vg, addr, oprsz, true, GETPC()); \ + } \ + mmap_unlock(); \ +} + +/* No-fault loads are like first-fault loads without the + * first faulting special case. + */ +#define DO_LDNF1(PART) \ +void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + mmap_lock(); \ + if (likely(page_check_range(addr, oprsz, PAGE_READ) == 0)) { \ + do_sve_ld1##PART(env, vd, vg, addr, oprsz, GETPC()); \ + } else { \ + do_sve_ldff1##PART(env, vd, vg, addr, oprsz, false, GETPC()); \ + } \ + mmap_unlock(); \ +} + +#else + +/* TODO: System mode is not yet supported. + * This would probably use tlb_vaddr_to_host. + */ +#define DO_LDFF1(PART, FN, TYPEE, TYPEM, H) \ +void HELPER(sve_ldff1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#define DO_LDNF1(PART) \ +void HELPER(sve_ldnf1##PART)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#endif + +DO_LDFF1(bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LDFF1(bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LDFF1(bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LDFF1(bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LDFF1(bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4) +DO_LDFF1(bdu_r, cpu_ldub_data_ra, uint64_t, uint8_t, ) +DO_LDFF1(bds_r, cpu_ldsb_data_ra, uint64_t, int8_t, ) + +DO_LDFF1(hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LDFF1(hsu_r, cpu_lduw_data_ra, uint32_t, uint16_t, H1_4) +DO_LDFF1(hss_r, cpu_ldsw_data_ra, uint32_t, int8_t, H1_4) +DO_LDFF1(hdu_r, cpu_lduw_data_ra, uint64_t, uint16_t, ) +DO_LDFF1(hds_r, cpu_ldsw_data_ra, uint64_t, int16_t, ) + +DO_LDFF1(ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LDFF1(sdu_r, cpu_ldl_data_ra, uint64_t, uint32_t, ) +DO_LDFF1(sds_r, cpu_ldl_data_ra, uint64_t, int32_t, ) + +DO_LDFF1(dd_r, cpu_ldq_data_ra, uint64_t, uint64_t, ) + +#undef DO_LDFF1 + +DO_LDNF1(bb_r) +DO_LDNF1(bhu_r) +DO_LDNF1(bhs_r) +DO_LDNF1(bsu_r) +DO_LDNF1(bss_r) +DO_LDNF1(bdu_r) +DO_LDNF1(bds_r) + +DO_LDNF1(hh_r) +DO_LDNF1(hsu_r) +DO_LDNF1(hss_r) +DO_LDNF1(hdu_r) +DO_LDNF1(hds_r) + +DO_LDNF1(ss_r) +DO_LDNF1(sdu_r) +DO_LDNF1(sds_r) + +DO_LDNF1(dd_r) + +#undef DO_LDNF1 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3543daff48..09f77b5405 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3647,3 +3647,72 @@ static bool trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) } return true; } + +static bool trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + static gen_helper_gvec_mem * const fns[16] = { + gen_helper_sve_ldff1bb_r, + gen_helper_sve_ldff1bhu_r, + gen_helper_sve_ldff1bsu_r, + gen_helper_sve_ldff1bdu_r, + + gen_helper_sve_ldff1sds_r, + gen_helper_sve_ldff1hh_r, + gen_helper_sve_ldff1hsu_r, + gen_helper_sve_ldff1hdu_r, + + gen_helper_sve_ldff1hds_r, + gen_helper_sve_ldff1hss_r, + gen_helper_sve_ldff1ss_r, + gen_helper_sve_ldff1sdu_r, + + gen_helper_sve_ldff1bds_r, + gen_helper_sve_ldff1bss_r, + gen_helper_sve_ldff1bhs_r, + gen_helper_sve_ldff1dd_r, + }; + + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]); + } + return true; +} + +static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + static gen_helper_gvec_mem * const fns[16] = { + gen_helper_sve_ldnf1bb_r, + gen_helper_sve_ldnf1bhu_r, + gen_helper_sve_ldnf1bsu_r, + gen_helper_sve_ldnf1bdu_r, + + gen_helper_sve_ldnf1sds_r, + gen_helper_sve_ldnf1hh_r, + gen_helper_sve_ldnf1hsu_r, + gen_helper_sve_ldnf1hdu_r, + + gen_helper_sve_ldnf1hds_r, + gen_helper_sve_ldnf1hss_r, + gen_helper_sve_ldnf1ss_r, + gen_helper_sve_ldnf1sdu_r, + + gen_helper_sve_ldnf1bds_r, + gen_helper_sve_ldnf1bss_r, + gen_helper_sve_ldnf1bhs_r, + gen_helper_sve_ldnf1dd_r, + }; + + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> dtype_esz[a->dtype]; + int off = (a->imm * elements) << dtype_msz(a->dtype); + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), off); + do_mem_zpa(s, a->rd, a->pg, addr, fns[a->dtype]); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index cfb12da639..afbed57de1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -685,9 +685,15 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 # SVE contiguous load (scalar plus scalar) LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 +# SVE contiguous first-fault load (scalar plus scalar) +LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=0 + # SVE contiguous load (scalar plus immediate) LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0 +# SVE contiguous non-fault load (scalar plus immediate) +LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=0 + # SVE contiguous non-temporal load (scalar plus scalar) # LDNT1B, LDNT1H, LDNT1W, LDNT1D # SVE load multiple structures (scalar plus scalar) From patchwork Thu Jun 21 01:53:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932468 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="SEbhed5a"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4XD2L3Kz9s3R for ; Thu, 21 Jun 2018 11:55:08 +1000 (AEST) Received: from localhost ([::1]:52479 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooT-00041y-Ue for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:55:05 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39140) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVone-0003yJ-NP for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonc-0002xV-U2 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:35619) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonc-0002x1-KC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:12 -0400 Received: by mail-pl0-x233.google.com with SMTP id k1-v6so756619plt.2 for ; Wed, 20 Jun 2018 18:54:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=sBIcMa1gXsMhdLwRtMVs9CJ9jMS0ijsQQUBNA3BsdtM=; b=SEbhed5ahN/dd2hWA5H5hW5PR0+mpZvEzXzTL00z6n/J7lNZbBHUOWq7s+Xf2V8LeM WbYkXy55VSDmjN/9LO3+7Rr5XshRafsFUYq1BeZ3HGRksAKLL2WvrEEvRzry+dWhXVBg bjtQQJfXUpdR/u0m22a+0Cml92A05qn0EryfI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=sBIcMa1gXsMhdLwRtMVs9CJ9jMS0ijsQQUBNA3BsdtM=; b=TtG21WBjMND60sO3jqmn8LYZurDjgwDngAm873y8HUYyR5Zmneyt+/nSRODiyonrwx rN7iwFQFEqTmO1bpEYRinJ5AhoZzDV70ySvGJ/9NbjrXlQK70GKJjjQRfEGCn1DDst6u OFu/zGTA07Fau40D8quwfPPzh+zf/Hh0vV8qRhltZtqGQ9tyUKkcOOwc18FqvdAPl/PI 3H/kfJcc9J+WypRi6i7cRe1pvH9zjuF43Cxm5TbcRPaI/5Kxm9mGwB1ezz4yrfJLYviA Zcw9EhCTPLsCZdJXQj0ovFwk2jWawCxpQyHTd1ZuBKI7LS/XnBeehbW45fO8bx7I2I3k 0wUw== X-Gm-Message-State: APt69E2YFhF+o04if+MpKmn+wNpL7uypRxVKndEkV/wY/gGGh6G5Qkg2 X4Wf1lpoUaxnkm0nvzT89mvO17vorK8= X-Google-Smtp-Source: ADUXVKLHSa4v+Xu19x1SJ28+G2cVqGg4qZ6MPt3uVDjZiLuLgqSVMQIPMxMSrDdMOM22WfXyjq/dWA== X-Received: by 2002:a17:902:5501:: with SMTP id f1-v6mr26161465pli.108.1529546051222; Wed, 20 Jun 2018 18:54:11 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:27 -1000 Message-Id: <20180621015359.12018-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 03/35] target/arm: Implement SVE Memory Contiguous Store Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée --- target/arm/helper-sve.h | 29 +++++ target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 65 ++++++++++++ target/arm/sve.decode | 38 +++++++ 4 files changed, 343 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7338abbbcf..b768128951 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -794,3 +794,32 @@ DEF_HELPER_FLAGS_4(sve_ldnf1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ldnf1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6e1b539ce3..f20774e240 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3119,3 +3119,214 @@ DO_LDNF1(sds_r) DO_LDNF1(dd_r) #undef DO_LDNF1 + +/* + * Store contiguous data, protected by a governing predicate. + */ +#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m = *(TYPEE *)(vd + H(i)); \ + FN(env, addr, m, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + uint64_t *d = &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + FN(env, addr, d[i], ra); \ + } \ + addr += sizeof(TYPEM); \ + } \ +} + +#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + TYPEM m4 = *(TYPEE *)(d4 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2) +DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4) +DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t) + +DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4) +DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t) + +DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t) + +DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST3(sve_st3bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) + +DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) + +DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) + +DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t) + +void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + } + addr += 2 * 8; + } +} + +void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + } + addr += 3 * 8; + } +} + +void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + cpu_stq_data_ra(env, addr + 24, d4[i], ra); + } + addr += 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 09f77b5405..b25fe96b77 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3716,3 +3716,68 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) } return true; } + +static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + int msz, int esz, int nreg) +{ + static gen_helper_gvec_mem * const fn_single[4][4] = { + { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r, + gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r }, + { NULL, gen_helper_sve_st1hh_r, + gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r }, + { NULL, NULL, + gen_helper_sve_st1ss_r, gen_helper_sve_st1sd_r }, + { NULL, NULL, NULL, gen_helper_sve_st1dd_r }, + }; + static gen_helper_gvec_mem * const fn_multiple[3][4] = { + { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r, + gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r }, + { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r, + gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r }, + { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r, + gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r }, + }; + gen_helper_gvec_mem *fn; + + if (nreg == 0) { + /* ST1 */ + fn = fn_single[msz][esz]; + } else { + /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */ + assert(msz == esz); + fn = fn_multiple[nreg - 1][msz]; + } + assert(fn != NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn) +{ + if (a->rm == 31 || a->msz > a->esz) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); + } + return true; +} + +static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn) +{ + if (a->msz > a->esz) { + return false; + } + if (sve_access_check(s)) { + int vsz = vec_full_reg_size(s); + int elements = vsz >> a->esz; + TCGv_i64 addr = new_tmp_a64(s); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) << a->msz); + do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index afbed57de1..6e159faaec 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -27,6 +27,7 @@ %imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 +%size_23 23:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -76,6 +77,8 @@ &incdec2_pred rd rn pg esz d u &rprr_load rd pg rn rm dtype nreg &rpri_load rd pg rn imm dtype nreg +&rprr_store rd pg rn rm msz esz nreg +&rpri_store rd pg rn imm msz esz nreg ########################################################################### # Named instruction formats. These are generally used to @@ -184,6 +187,12 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Stores; user must fill in ESZ, MSZ, NREG as needed. +@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store +@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store +@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_store nreg=0 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -705,3 +714,32 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz # SVE load multiple structures (scalar plus immediate) # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz + +### SVE Memory Store Group + +# SVE contiguous store (scalar plus immediate) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ + @rpri_store_msz nreg=0 + +# SVE contiguous store (scalar plus scalar) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +# Enumerate msz lest we conflict with STR_zri. +ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=0 +ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=1 +ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=2 +ST_zprr 1110010 11 11 ..... 010 ... ..... ..... \ + @rprr_store msz=3 esz=3 nreg=0 + +# SVE contiguous non-temporal store (scalar plus immediate) (nreg == 0) +# SVE store multiple structures (scalar plus immediate) (nreg != 0) +ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \ + @rpri_store_msz esz=%size_23 + +# SVE contiguous non-temporal store (scalar plus scalar) (nreg == 0) +# SVE store multiple structures (scalar plus scalar) (nreg != 0) +ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ + @rprr_store esz=%size_23 From patchwork Thu Jun 21 01:53:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932470 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Exx/AZ1e"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4bm5yrwz9s3R for ; Thu, 21 Jun 2018 11:58:12 +1000 (AEST) Received: from localhost ([::1]:52495 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVorS-0006iH-EO for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:58:10 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39146) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonf-0003zF-GI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVone-0002yc-JI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:15 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:41725) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVone-0002xp-DW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:14 -0400 Received: by mail-pf0-x234.google.com with SMTP id a11-v6so683565pff.8 for ; Wed, 20 Jun 2018 18:54:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=MIKKw3KUuErnHai2QdqLnnCjJ3TTAEPOo7LFHNMeMjQ=; b=Exx/AZ1ezbHNXloM/rQKKAhGXVwwQbf8w2RLsNmYkRhuRm8qZqCGQoMUAf79ON8ENG bGLPAU/03nPbDkNRZV36TiFuMDMPdbn+WCrpFT5ws3+K/dczqnCtikNctVftzc9EWEa7 rZVxD7oe1Y8Nce56ZGSO76UM+iSTiDYQB+MDM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=MIKKw3KUuErnHai2QdqLnnCjJ3TTAEPOo7LFHNMeMjQ=; b=oz+8SNvtaV4nTqMaWDYjgAx4jFkkVatTbzhr/ikEMTLhXTGf6oc0+ew0X1eV76WgbN GPd/QN3VB+lQNsvSDLanJCC1fgppqUtHRKxQrX2Yay4ICmb+OUTdnr0vxh2KKwW/gSeb lEHvYjItiGRZiTAslrV90Rar3plbHe8ZNYFx7GBaxVrvQHwMLu4wfsa4d9z2c91Z9cvG TcQ4Mk2sPtGjLHqDTdqTNLtHP2ZC0degpYQoE6CjZxqPSpmEWrk4KK7f33dotbuaFrDu U21Q2AOEtBj4ZlEGtpxbqiReGzypmH/jdo7molBq6SjluV9Ka7EOlrJL722kEMoI6h2J lbZA== X-Gm-Message-State: APt69E35P5dg4tm3OQGIrJbXgPeVbfATKpT6ovjlhY3F2rqlYzM7FopB 1f5uMCalljPe+JEDzVbk/H/keIc4ZUc= X-Google-Smtp-Source: ADUXVKLY4kZ2MDuTuOQ84T0fbVfID3XfPFlC5aTK341vd+KGHr712hdX/VN/0yh0pybqaLyLL1QaFQ== X-Received: by 2002:a62:4556:: with SMTP id s83-v6mr25352737pfa.73.1529546053055; Wed, 20 Jun 2018 18:54:13 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:12 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:28 -1000 Message-Id: <20180621015359.12018-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 04/35] target/arm: Implement SVE load and broadcast quadword X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée --- target/arm/translate-sve.c | 52 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 9 +++++++ 2 files changed, 61 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b25fe96b77..83de87ee0e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3717,6 +3717,58 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) +{ + static gen_helper_gvec_mem * const fns[4] = { + gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r, + gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* Load the first quadword using the normal predicated load helpers. */ + desc = tcg_const_i32(simd_desc(16, 16, zt)); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fns[msz](cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + + /* Replicate that first quadword. */ + if (vsz > 16) { + unsigned dofs = vec_full_reg_offset(s, zt); + tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16); + } +} + +static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + int msz = dtype_msz(a->dtype); + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ldrq(s, a->rd, a->pg, addr, msz); + } + return true; +} + +static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); + do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); + } + return true; +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6e159faaec..606c4f623c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -715,6 +715,15 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz +# SVE load and broadcast quadword (scalar plus scalar) +LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ + @rprr_load_msz nreg=0 + +# SVE load and broadcast quadword (scalar plus immediate) +# LD1RQB, LD1RQH, LD1RQS, LD1RQD +LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ + @rpri_load_msz nreg=0 + ### SVE Memory Store Group # SVE contiguous store (scalar plus immediate) From patchwork Thu Jun 21 01:53:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932471 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="CqeRnEsx"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4bn57y6z9s3T for ; Thu, 21 Jun 2018 11:58:13 +1000 (AEST) Received: from localhost ([::1]:52494 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVorT-0006hx-AF for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:58:11 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39168) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonk-000444-2O for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVong-000300-6b for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:36275) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonf-0002zP-UI for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:16 -0400 Received: by mail-pl0-x244.google.com with SMTP id a7-v6so757304plp.3 for ; Wed, 20 Jun 2018 18:54:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jeoMd9BTrp3hGuVBUBWNDT8zmWrdd5+QUIlWX/rjX9U=; b=CqeRnEsxsXAV8dgaNdD8BcGACuEXpBxwBoooJMM5T1vP9JHFCXACf3deCWcfr5b5yN 4q3ntELPvZD3hKTGBO3I1CvTGn0Ycf4+yk9Ty07GZyfW12qoZuiuPoFw6wSYj+yIxgVu X1S9d3MMd6wCyOKpg0V6N/fxnU8XqJ7YqmwqE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jeoMd9BTrp3hGuVBUBWNDT8zmWrdd5+QUIlWX/rjX9U=; b=eBwkaoQGj4N8QUPadXugyjROHIOFYVPcSBmvH+b+C9gUPsMxrr3ST7y4b/TTYQNLnM ywjI/RWx20vrAM08WBWAaLTHE5+wa6nYI0QdLvCzgDA6y0VAMJlFNDQacezRekKrDoUt CMbcsQR8Nmx0DccKETuFZNA+gUBivgMYgPeEtRmJbx4q29IriqzDbE+wJco1CEXnipZU A5cK0gx/WiSuyWPivGH0lQNu6I0K3+z/0ea4eBKRzV98EYqNksZajBX0tSbRT1JcBsTw V8F2pZSa9gT6rfLHLe877+Q2xGCxwtb/Aypk+UFc/7+uVd/x453jmvuik7V/J+Pw/CtX h4jw== X-Gm-Message-State: APt69E1xsg2K5krm2rMOo/S0u3S8lQOuZMPKw1y0vgaiGJ1Ue9iQhqcf 7vKpOaL8y+yZMlZ0PGtueqq4NN4Y+80= X-Google-Smtp-Source: ADUXVKJlBjc6a53CE63CuchDPTi036/BIsk5lBOjit5BnpiIHvLT5lotzzxEXoWocYF3/PnVYAzwaA== X-Received: by 2002:a17:902:9f84:: with SMTP id g4-v6mr26463657plq.339.1529546054641; Wed, 20 Jun 2018 18:54:14 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.13 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:29 -1000 Message-Id: <20180621015359.12018-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v5 05/35] target/arm: Implement SVE integer convert to floating-point X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell Reviewed-by: Alex Bennée --- target/arm/helper-sve.h | 30 +++++++++++++ target/arm/sve_helper.c | 38 ++++++++++++++++ target/arm/translate-sve.c | 90 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 22 ++++++++++ 4 files changed, 180 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b768128951..185112e1d2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f20774e240..a2f034820a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,44 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general two-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) +DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) +DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) +DO_ZPZ_FP(sve_scvt_sd, uint64_t, , int32_to_float64) +DO_ZPZ_FP(sve_scvt_dh, uint64_t, , int64_to_float16) +DO_ZPZ_FP(sve_scvt_ds, uint64_t, , int64_to_float32) +DO_ZPZ_FP(sve_scvt_dd, uint64_t, , int64_to_float64) + +DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16) +DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16) +DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32) +DO_ZPZ_FP(sve_ucvt_sd, uint64_t, , uint32_to_float64) +DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16) +DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32) +DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) + +#undef DO_ZPZ_FP + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 83de87ee0e..7639e589f5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3425,6 +3425,96 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 + +/* + *** SVE Floating Point Unary Operations Prediated Group + */ + +static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, + bool is_fp16, gen_helper_gvec_3_ptr *fn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(is_fp16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); +} + +static bool trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh); +} + +static bool trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh); +} + +static bool trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss); +} + +static bool trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds); +} + +static bool trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd); +} + +static bool trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd); +} + +static bool trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh); +} + +static bool trans_UCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh); +} + +static bool trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh); +} + +static bool trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss); +} + +static bool trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds); +} + +static bool trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd); +} + +static bool trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 606c4f623c..3abdb87cf5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -133,6 +133,9 @@ @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +# One register operand, with governing predicate, no vector element size +@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 + # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -681,6 +684,25 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Unary Operations Predicated Group + +# SVE integer convert to floating-point +SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 + +UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Thu Jun 21 01:53:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932472 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Wrxeh8Ps"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4bv4XMzz9s3R for ; Thu, 21 Jun 2018 11:58:19 +1000 (AEST) Received: from localhost ([::1]:52497 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVorZ-0006mw-6B for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:58:17 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39169) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonk-000445-2Z for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoni-000318-3K for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:45166) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonh-00030p-Qv for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:18 -0400 Received: by mail-pl0-x233.google.com with SMTP id o18-v6so58394pll.12 for ; Wed, 20 Jun 2018 18:54:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AprqyjT71U0BaW5ZsVknsigpNMYP5WHOGkZ3Sh7Oqeo=; b=Wrxeh8Psd52w4kPEeZmLspqgBP+x6twYEKIo8HC4SeaRilL+vSQBw1Xug7cjfPkYM9 ADOcpHdykwUQSqBGXs5diKclRGZ+nZda8pOCWobgRcNQzfz/BBR4/4cM8LSOpSXKqZ6g F/0uG2QNJZXw9+NJ93Zp5FkUjMZqVn37i/3yY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AprqyjT71U0BaW5ZsVknsigpNMYP5WHOGkZ3Sh7Oqeo=; b=sa7hJl9EswSeS2493RwDeKM+M5kFZfu7EPccBjknFnKCXdHgRyyuRFw+yw3pqk2wRg v4sdMcm6dOdKZ+p4YATzLLOOPZg/LzyDucTeQ4xS4XtseaCn6VtdcPywC+1XoVabiQXB EZ6dw31IkklwVLPAEmOr4bozwDDzoaRAtH1FdGj3pSn7vDAza7Dy3lw4LFzXJZERU+qu xSFG3Yg5WSJCB12VhZLhWbCVXRCK0L4rWoFF96uIbjLIS5F6fXOj5oeGfTZYITJPTgPe liHOxCFuQPmzo0kPjyDHjBjDZ2hSag+9e99q4Poq1q+Ffrg5j8wD1qEnUx4WPtbY+Ueq pzaw== X-Gm-Message-State: APt69E1+uKGKRiSPFNkqTjqd7bcTPvH9sgfr5luxPR3FvNxH4Fvsk7mP L9q6q3LCf8p4cfzF8GSCGjngvq814yQ= X-Google-Smtp-Source: ADUXVKJagts6hNsK05v7ttAg6YH83xZ94cQe88UmP0xs8w0P0oiOmpp7F7RE6myAyvC6zYf38aeL1Q== X-Received: by 2002:a17:902:585:: with SMTP id f5-v6mr14171274plf.142.1529546056520; Wed, 20 Jun 2018 18:54:16 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.14 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:30 -1000 Message-Id: <20180621015359.12018-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 06/35] target/arm: Implement SVE floating-point arithmetic (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 77 +++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 89 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 46 ++++++++++++++++++++ target/arm/sve.decode | 17 ++++++++ 4 files changed, 229 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 185112e1d2..4097b55f0e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a2f034820a..6bf30a3e66 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,95 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general three-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add) +DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add) +DO_ZPZZ_FP(sve_fadd_d, uint64_t, , float64_add) + +DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub) +DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub) +DO_ZPZZ_FP(sve_fsub_d, uint64_t, , float64_sub) + +DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul) +DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul) +DO_ZPZZ_FP(sve_fmul_d, uint64_t, , float64_mul) + +DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div) +DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div) +DO_ZPZZ_FP(sve_fdiv_d, uint64_t, , float64_div) + +DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min) +DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min) +DO_ZPZZ_FP(sve_fmin_d, uint64_t, , float64_min) + +DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max) +DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max) +DO_ZPZZ_FP(sve_fmax_d, uint64_t, , float64_max) + +DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum) +DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum) +DO_ZPZZ_FP(sve_fminnum_d, uint64_t, , float64_minnum) + +DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_d, uint64_t, , float64_maxnum) + +static inline float16 abd_h(float16 a, float16 b, float_status *s) +{ + return float16_abs(float16_sub(a, b, s)); +} + +static inline float32 abd_s(float32 a, float32 b, float_status *s) +{ + return float32_abs(float32_sub(a, b, s)); +} + +static inline float64 abd_d(float64 a, float64 b, float_status *s) +{ + return float64_abs(float64_sub(a, b, s)); +} + +DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h) +DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s) +DO_ZPZZ_FP(sve_fabd_d, uint64_t, , abd_d) + +static inline float64 scalbn_d(float64 a, int64_t b, float_status *s) +{ + int b_int = MIN(MAX(b, INT_MIN), INT_MAX); + return float64_scalbn(a, b_int, s); +} + +DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn) +DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn) +DO_ZPZZ_FP(sve_fscalbn_d, int64_t, , scalbn_d) + +DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh) +DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs) +DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd) + +#undef DO_ZPZZ_FP + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7639e589f5..4df5360da9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3425,6 +3425,52 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 +/* + *** SVE Floating Point Arithmetic - Predicated Group + */ + +static bool do_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (fn == NULL) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +#define DO_FP3(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zpzz, fadd) +DO_FP3(FSUB_zpzz, fsub) +DO_FP3(FMUL_zpzz, fmul) +DO_FP3(FMIN_zpzz, fmin) +DO_FP3(FMAX_zpzz, fmax) +DO_FP3(FMINNM_zpzz, fminnum) +DO_FP3(FMAXNM_zpzz, fmaxnum) +DO_FP3(FABD, fabd) +DO_FP3(FSCALE, fscalbn) +DO_FP3(FDIV, fdiv) +DO_FP3(FMULX, fmulx) + +#undef DO_FP3 /* *** SVE Floating Point Unary Operations Prediated Group diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 3abdb87cf5..636212a638 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -684,6 +684,23 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Arithmetic Predicated Group + +# SVE floating-point arithmetic (predicated) +FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm +FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR +FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm +FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm +FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm +FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm +FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm +FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm +FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm +FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR +FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point From patchwork Thu Jun 21 01:53:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932475 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Iu7CDaR5"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4gG2khTz9s3R for ; Thu, 21 Jun 2018 12:01:14 +1000 (AEST) Received: from localhost ([::1]:52511 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVouN-0000dm-TN for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:01:11 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39184) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonl-00045Z-Kh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonk-000321-4W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:21 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:38383) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonj-00031i-RW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:20 -0400 Received: by mail-pf0-x234.google.com with SMTP id b74-v6so688247pfl.5 for ; Wed, 20 Jun 2018 18:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jINFLC02lMv+InyN1iespI0Zzx3CuWqfLW8uGBHW3LI=; b=Iu7CDaR5A0qYg9rvkRicFbrbTGtKK6l0qfEy4W4Wjx8XmZwj4BJr2tdbB/t5sst889 vtd/5KvWw6ae3eDb4rZqM4qG/hQboQDijbyQZOcH8itVczk8NidmA+XXHU5uCdIUJWxS JmEyy6qTc0Up6v9X39STi0HEzeKWciaucAbAE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jINFLC02lMv+InyN1iespI0Zzx3CuWqfLW8uGBHW3LI=; b=TXeSHKRHkzjxGV8dcaqEjriYD6UZKaUqdC/GlV7aNp7BNyw2JxXVOn1Xl3cSBLG243 NkMm9tbZJUP37VKAiWSlE1z9TFm8Uwsz5jABcm5g2pWRVyaAomr/PghXT4w9+y9i95Tw n8+IkfmkuFk2fVk903whWlAyNUK/hisXPVuTj+EM2kr8Pqi8yQ99X+wSQwORY5Z8CPBE 4c+hRlCSvD4fsyv/ya9slbgkoHnyyL1l+Fbbave+rUqpOd0JNPxo+MK8bLf/sjiYHYrP 0QJRfdevGTaNKnzmeYx9pHd2dj5PyNnbw46wrT9v5PA9FGxZxfCMOe0kzRQ/E8fJJNmx uFbA== X-Gm-Message-State: APt69E3a0t5yjyBRrIKou+xoCytecylOrh7zlwAUEBLEV6rSxVgKKQ2P ju5Hgs8U4O3H8t23mSgw8gq+V6MXeQk= X-Google-Smtp-Source: ADUXVKKFcaeoXv2QjNj/AHOslMMhZ6KjImvlfNYgGeC5Ja7cZaiW6RdquCXw2kyLyHS9how8Iukvdg== X-Received: by 2002:a65:48c9:: with SMTP id o9-v6mr21085398pgs.262.1529546058526; Wed, 20 Jun 2018 18:54:18 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.16 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:31 -1000 Message-Id: <20180621015359.12018-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 07/35] target/arm: Implement SVE FP Multiply-Add Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++ target/arm/sve_helper.c | 158 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 49 ++++++++++++ target/arm/sve.decode | 17 ++++ 4 files changed, 240 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4097b55f0e..eb0645dd43 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6bf30a3e66..aeb4ccadd9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2938,6 +2938,164 @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) #undef DO_ZPZ_FP +/* 4-operand predicated multiply-add. This requires 7 operands to pass + * "properly", so we need to encode some of the registers into DESC. + */ +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); + +static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc, + uint16_t neg1, uint16_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 2; + if (likely((pg >> (i & 63)) & 1)) { + float16 e1, e2, e3, r; + + e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1; + e2 = *(uint16_t *)(vm + H1_2(i)); + e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3; + r = float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint16_t *)(vd + H1_2(i)) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0); +} + +void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000); +} + +void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000); +} + +static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc, + uint32_t neg1, uint32_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 4; + if (likely((pg >> (i & 63)) & 1)) { + float32 e1, e2, e3, r; + + e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1; + e2 = *(uint32_t *)(vm + H1_4(i)); + e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3; + r = float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint32_t *)(vd + H1_4(i)) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0); +} + +void HELPER(sve_fnmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0x80000000); +} + +void HELPER(sve_fnmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_s(env, vg, desc, 0, 0x80000000); +} + +static void do_fmla_zpzzz_d(CPUARMState *env, void *vg, uint32_t desc, + uint64_t neg1, uint64_t neg3) +{ + intptr_t i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + i -= 8; + if (likely((pg >> (i & 63)) & 1)) { + float64 e1, e2, e3, r; + + e1 = *(uint64_t *)(vn + i) ^ neg1; + e2 = *(uint64_t *)(vm + i); + e3 = *(uint64_t *)(va + i) ^ neg3; + r = float64_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + *(uint64_t *)(vd + i) = r; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, 0, 0); +} + +void HELPER(sve_fmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, 0); +} + +void HELPER(sve_fnmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, INT64_MIN); +} + +void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4df5360da9..acad6374ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3472,6 +3472,55 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); + +static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) +{ + if (fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg = tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. + */ + desc = deposit32(a->rd, 5, 5, a->rn); + desc = deposit32(desc, 10, 5, a->rm); + desc = deposit32(desc, 15, 5, a->ra); + desc = simd_desc(vsz, vsz, desc); + + t_desc = tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); + return true; +} + +#define DO_FMLA(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_sve_fmla * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_fmla(s, a, fns[a->esz]); \ +} + +DO_FMLA(FMLA_zpzzz, fmla_zpzzz) +DO_FMLA(FMLS_zpzzz, fmls_zpzzz) +DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz) +DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) + +#undef DO_FMLA + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 636212a638..70e5a3aeb5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -128,6 +128,8 @@ &rprrr_esz ra=%reg_movprfx @rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \ &rprrr_esz rn=%reg_movprfx +@rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ + &rprrr_esz rn=%reg_movprfx # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -701,6 +703,21 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm +### SVE FP Multiply-Add Group + +# SVE floating-point multiply-accumulate writing addend +FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm +FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm +FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm +FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... @rda_pg_rn_rm + +# SVE floating-point multiply-accumulate writing multiplicand +# FMAD, FMSB, FNMAD, FNMS +FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra +FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra +FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra +FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point From patchwork Thu Jun 21 01:53:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932473 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="X0b2Xv/f"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4cy19v0z9s3R for ; Thu, 21 Jun 2018 11:59:14 +1000 (AEST) Received: from localhost ([::1]:52498 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVosR-0007T3-Oa for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 21:59:11 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39195) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonn-00047F-7A for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonm-00032y-4S for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:46298) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonl-00032b-Sf for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:22 -0400 Received: by mail-pf0-x22a.google.com with SMTP id q1-v6so676563pff.13 for ; Wed, 20 Jun 2018 18:54:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=zMUCpNHqvYI8Hwem4/9P2JgKC2kBIp0MNfhRl9e2Emo=; b=X0b2Xv/fpekxUwxARyRF5KM/iKI9DXWj/G0sK9mxEnRpfsERL0kh83FaCzqp4z+sKQ KfvUWm5n2SzFpI39LhkHWYRGIj2cNsiy8fScfv5rQ+C2tzgJ+6AwafZ9ZMZ9l6lMwfbO W+p+pcm8loYraOlmOF/RjtWlZBFq2AVdT4JRw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=zMUCpNHqvYI8Hwem4/9P2JgKC2kBIp0MNfhRl9e2Emo=; b=gY38APSSvgRPFZ+uf6eBbeve6M3FkzbUHsLP4JaBp3fx29tzXJf2G2PoassXojLrAS YgiiqkVElk/gW2E/ura9D5W5G3vlz4AaA4qEebTjgXyHUdh9bZ1ikT7bOOArAmUoUCUM 6M0wMa0ugtrR4qfV4Pmx5NTjJgE6W7DOZSUG34w8dFfJXGlkSmRqafmZLyNnK5xToDEZ jon4TkfrROts5sbim6Lx8I6T5yQ0rF0TZUTor8xzVqpV53GNUUXkpJjtFMfOHORXsmFO sef2Z/iVgLW4jAmDXCopziPa76zmFRtOHUAmsn7McrunxRS9XiRZ/iuubrfxImj0PMpL 2Vtw== X-Gm-Message-State: APt69E3px0yhOWojsMyQHKK3L2QITz8drR4vjMBzjKHYFRi4OzCM0zMo Mthi2UpRzGCf9Xi95W1RrXExfxIEouI= X-Google-Smtp-Source: ADUXVKI09D+s9TAn/b98HgtXpL0XW+lknV2Y+6BE4vI60glsNKl7BwYEGCUjuNP80WKuQZsZhHAauw== X-Received: by 2002:a65:65ca:: with SMTP id y10-v6mr20821260pgv.359.1529546060529; Wed, 20 Jun 2018 18:54:20 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.19 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:32 -1000 Message-Id: <20180621015359.12018-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v5 08/35] target/arm: Implement SVE Floating Point Accumulating Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 7 +++++ target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 45 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 ++++ 4 files changed, 113 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index eb0645dd43..68e55a8d03 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index aeb4ccadd9..990e5f3900 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2811,6 +2811,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float16 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float16 mm = *(float16 *)(vm + H1_2(i)); + result = float16_add(result, mm, status); + } + i += sizeof(float16), pg >>= sizeof(float16); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float32 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float32 mm = *(float32 *)(vm + H1_2(i)); + result = float32_add(result, mm, status); + } + i += sizeof(float32), pg >>= sizeof(float32); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8; + uint64_t *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + nn = float64_add(nn, m[i], status); + } + } + + return nn; +} + /* Fully general three-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index acad6374ef..483ad33179 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3383,6 +3383,51 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Accumulating Reduction Group + */ + +static bool trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + static fadda_fn * const fns[3] = { + gen_helper_sve_fadda_h, + gen_helper_sve_fadda_s, + gen_helper_sve_fadda_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_rm, t_pg, t_fpst; + TCGv_i64 t_val; + TCGv_i32 t_desc; + + if (a->esz == 0) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz); + t_rm = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + t_fpst = get_fpstatus_ptr(a->esz == MO_16); + t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + + fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_fpst); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_rm); + + write_fp_dreg(s, a->rd, t_val); + tcg_temp_free_i64(t_val); + return true; +} + /* *** SVE Floating Point Arithmetic - Unpredicated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 70e5a3aeb5..ba10cddb8a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -676,6 +676,11 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Accumulating Reduction Group + +# SVE floating-point serial reduction (predicated) +FADDA 01100101 .. 011 000 001 ... ..... ..... @rdn_pg_rm + ### SVE Floating Point Arithmetic - Unpredicated Group # SVE floating-point arithmetic (unpredicated) From patchwork Thu Jun 21 01:53:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932474 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Oyf1MzH+"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4gC46Mxz9s3R for ; Thu, 21 Jun 2018 12:01:11 +1000 (AEST) Received: from localhost ([::1]:52510 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVouL-0000cI-3A for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:01:09 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39206) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonp-00049F-0v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonn-00033s-OK for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:25 -0400 Received: from mail-pl0-x242.google.com ([2607:f8b0:400e:c01::242]:38157) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonn-00033P-Gj for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:23 -0400 Received: by mail-pl0-x242.google.com with SMTP id d10-v6so757258plo.5 for ; Wed, 20 Jun 2018 18:54:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=4PBOZ+xP2oHjWmgB6IgGTqH/sc5fgdOszz58FGX7GC4=; b=Oyf1MzH+3mQJZTcDMLYUGxG1kN+kOtogDBVfDt545Cjfzb2UnhAMW5H6dkH+SJaOYj mmUJg9HK0mj1Rtu5PCjCT/Ih5yBg4DlxyNnmhxojtNy2FPlpaHU1UBqEGjLqhJv9+ait YC4nj5M+l38Zqlv4AqlwNQ+TUZx1aeb6QtZug= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=4PBOZ+xP2oHjWmgB6IgGTqH/sc5fgdOszz58FGX7GC4=; b=g1BbFQ5Pxuud2wek4EDRhhxFOo4lWq/EnapYjCPqYJrorPWTCwwW3SU/+rBq5XRmdw 6vRH1zHXh1LIPmT6krEtpyGfu30/HUQs27iD4StK4IAnD/3MuQGuhsOu5qtaRsW9ps08 BHI6Y3X6nFeZGG9RFSDnSYwC3q3N4lffC5eqiDZhUxBRj35qhAbMYdIWU0QqfawKqe0W y7SuRGEg20HyP9+4KcsIPviZ6FAuDHxxnSalNcWo4cVo3NP7mfiPA8+6nuQFffyHucbK mHMS0GFE33ZgaLrFz5WWfRCuqSmCxEfQD59HH7BvxWr8FDkfCR+OmnkqynGAZkKjYS6N 1b1g== X-Gm-Message-State: APt69E2tezeKajKGlCIf559vFwdLstTAXpSkwYtvHvsVjDL7+PG8iO0b e2BE7Gh/cZyM5D3IRPRtrW7Ya+XcxiI= X-Google-Smtp-Source: ADUXVKJQpxAeCuRVyzkc8wFLC5o1PhR3YY9F4AjlgN9U8YovAG6H7aYZshk3Ne2HTbi5wmM2IpbfwQ== X-Received: by 2002:a17:902:64cf:: with SMTP id y15-v6mr26307178pli.53.1529546062178; Wed, 20 Jun 2018 18:54:22 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.20 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:33 -1000 Message-Id: <20180621015359.12018-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::242 Subject: [Qemu-devel] [PATCH v5 09/35] target/arm: Implement SVE load and broadcast element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 5 +++ target/arm/sve_helper.c | 41 +++++++++++++++++++++++++ target/arm/translate-sve.c | 62 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 +++ 4 files changed, 113 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 68e55a8d03..a5d3bb121c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_movz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 990e5f3900..a9c98bca32 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -995,6 +995,47 @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc) } } +/* Copy Zn into Zn, and store zero into inactive elements. */ +void HELPER(sve_movz_b)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_movz_h)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_movz_s)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] & expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_movz_d)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[1] & -(uint64_t)(pg[H1(i)] & 1); + } +} + /* Three-operand expander, immediate operand, controlled by a predicate. */ #define DO_ZPZI(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 483ad33179..954d6653d3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -606,6 +606,20 @@ static bool do_clr_zp(DisasContext *s, int rd, int pg, int esz) return true; } +/* Copy Zn into Zd, storing zeros into inactive elements. */ +static void do_movz_zpz(DisasContext *s, int rd, int rn, int pg, int esz) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_movz_b, gen_helper_sve_movz_h, + gen_helper_sve_movz_s, gen_helper_sve_movz_d, + }; + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + static bool do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, gen_helper_gvec_3 *fn) { @@ -3999,6 +4013,54 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) return true; } +/* Load and broadcast element. */ +static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = vec_full_reg_size(s); + unsigned psz = pred_full_reg_size(s); + unsigned esz = dtype_esz[a->dtype]; + TCGLabel *over = gen_new_label(); + TCGv_i64 temp; + + /* If the guarding predicate has no bits set, no load occurs. */ + if (psz <= 8) { + /* Reduce the pred_esz_masks value simply to reduce the + * size of the code generated here. + */ + uint64_t psz_mask = MAKE_64BIT_MASK(0, psz * 8); + temp = tcg_temp_new_i64(); + tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg)); + tcg_gen_andi_i64(temp, temp, pred_esz_masks[esz] & psz_mask); + tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over); + tcg_temp_free_i64(temp); + } else { + TCGv_i32 t32 = tcg_temp_new_i32(); + find_last_active(s, t32, esz, a->pg); + tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over); + tcg_temp_free_i32(t32); + } + + /* Load the data. */ + temp = tcg_temp_new_i64(); + tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << esz); + tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s), + s->be_data | dtype_mop[a->dtype]); + + /* Broadcast to *all* elements. */ + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, temp); + tcg_temp_free_i64(temp); + + /* Zero the inactive elements. */ + gen_set_label(over); + do_movz_zpz(s, a->rd, a->rd, a->pg, esz); + return true; +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ba10cddb8a..56039e2193 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -28,6 +28,7 @@ %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %size_23 23:2 +%dtype_23_13 23:2 13:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -750,6 +751,10 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 +# SVE load and broadcast element +LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ + &rpri_load dtype=%dtype_23_13 nreg=0 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar) From patchwork Thu Jun 21 01:53:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932479 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="IVeJyioO"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4l81F0Yz9s3R for ; Thu, 21 Jun 2018 12:04:36 +1000 (AEST) Received: from localhost ([::1]:52527 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoxd-00030h-IK for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:04:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39217) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonq-0004BX-Gk for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonp-00034m-Eb for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:26 -0400 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:33126) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonp-00034H-7P for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:25 -0400 Received: by mail-pl0-x241.google.com with SMTP id 6-v6so763889plb.0 for ; Wed, 20 Jun 2018 18:54:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SEyYQK35r/FvibNeEMXdwWPmyznCQnXYKZpyUq8JXnM=; b=IVeJyioO3nu3qs8l61Dr5svG97k3vhpH810KPNoFvYZRIsl+LB62dfo3DXOg8mSLDm LesSgf+TVqzTiNta6a2HhoknS6f8CsAaEjPcp55eJzCvUNohZOEphYzdJR1fwgxfIX/r 4YSUuxKQ7P5vKc6yU2WBACCIsZwsywxCIAAas= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SEyYQK35r/FvibNeEMXdwWPmyznCQnXYKZpyUq8JXnM=; b=b5VIlntjQxh+RAZc/Bazj43cMtkIITR5oUwHZUrClm14F/gTMEAZCZCXpaZiRqS7Op +iB0U9YADLWh0pxgEqAGep8/1vjUJFlR/eikGQ2h3M5nLpAgAhjqPI1gylmQ2c/aT/A9 9FOjiQIsM8eV+NKcx3WX8LaR7ASE6iPC2yTSXURrispqLJVmrVGthE2tWedHcljd/tI+ FbZzvkPcUondO8DXS4+a7h6PK2gXgOvn/usPCfvI0PeJN4LFhog6AIq4LKOuMul2W9K+ EQTmqy+KT27cXDKutRw1btbcndnAoQb8XrJ4OMrg4Sju6sixuVMEH3bffjBcBSdGcORy E+UA== X-Gm-Message-State: APt69E2BCsLI9fX1Pbl/OaHMnUBteIDt+qT49qnlXDBxAo4QPvLAo1Jn lUgTFxHkIcZmSNnfV8KcKssmjyawduM= X-Google-Smtp-Source: ADUXVKKjdUF2Ejs/MOIZ9KhWdvMSVmeoC06v9Uh3fjRAlQMTRtomBekCdmNaYkxXZPKw5O8LjmWxUg== X-Received: by 2002:a17:902:b18b:: with SMTP id s11-v6mr26451373plr.190.1529546063925; Wed, 20 Jun 2018 18:54:23 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.22 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:34 -1000 Message-Id: <20180621015359.12018-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v5 10/35] target/arm: Implement SVE store vector/predicate register X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 103 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 6 +++ 2 files changed, 109 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 954d6653d3..50f1ff75ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3762,6 +3762,89 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len, tcg_temp_free_i64(t0); } +/* Similarly for stores. */ +static void do_str(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ + uint32_t len_align = QEMU_ALIGN_DOWN(len, 8); + uint32_t len_remain = len % 8; + uint32_t nparts = len / 8 + ctpop8(len_remain); + int midx = get_mem_index(s); + TCGv_i64 addr, t0; + + addr = tcg_temp_new_i64(); + t0 = tcg_temp_new_i64(); + + /* Note that unpredicated load/store of vector/predicate registers + * are defined as a stream of bytes, which equates to little-endian + * operations on larger quantities. There is no nice way to force + * a little-endian store for aarch64_be-linux-user out of line. + * + * Attempt to keep code expansion to a minimum by limiting the + * amount of unrolling done. + */ + if (nparts <= 4) { + int i; + + for (i = 0; i < len_align; i += 8) { + tcg_gen_ld_i64(t0, cpu_env, vofs + i); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + } + } else { + TCGLabel *loop = gen_new_label(); + TCGv_ptr t2, i = tcg_const_local_ptr(0); + + gen_set_label(loop); + + t2 = tcg_temp_new_ptr(); + tcg_gen_add_ptr(t2, cpu_env, i); + tcg_gen_ld_i64(t0, t2, vofs); + + /* Minimize the number of local temps that must be re-read from + * the stack each iteration. Instead, re-compute values other + * than the loop counter. + */ + tcg_gen_addi_ptr(t2, i, imm); + tcg_gen_extu_ptr_i64(addr, t2); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn)); + tcg_temp_free_ptr(t2); + + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ); + + tcg_gen_addi_ptr(i, i, 8); + + tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop); + tcg_temp_free_ptr(i); + } + + /* Predicate register stores can be any multiple of 2. */ + if (len_remain) { + tcg_gen_ld_i64(t0, cpu_env, vofs + len_align); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align); + + switch (len_remain) { + case 2: + case 4: + case 8: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain)); + break; + + case 6: + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL); + tcg_gen_addi_i64(addr, addr, 4); + tcg_gen_shri_i64(addr, addr, 32); + tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW); + break; + + default: + g_assert_not_reached(); + } + } + tcg_temp_free_i64(addr); + tcg_temp_free_i64(t0); +} + static bool trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) { if (sve_access_check(s)) { @@ -3782,6 +3865,26 @@ static bool trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) return true; } +static bool trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int size = vec_full_reg_size(s); + int off = vec_full_reg_offset(s, a->rd); + do_str(s, off, size, a->rn, a->imm * size); + } + return true; +} + +static bool trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int size = pred_full_reg_size(s); + int off = pred_full_reg_offset(s, a->rd); + do_str(s, off, size, a->rn, a->imm * size); + } + return true; +} + /* *** SVE Memory - Contiguous Load Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 56039e2193..c088e51493 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -792,6 +792,12 @@ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ ### SVE Memory Store Group +# SVE store predicate register +STR_pri 1110010 11 0. ..... 000 ... ..... 0 .... @pd_rn_i9 + +# SVE store vector register +STR_zri 1110010 11 0. ..... 010 ... ..... ..... @rd_rn_i9 + # SVE contiguous store (scalar plus immediate) # ST1B, ST1H, ST1W, ST1D; require msz <= esz ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ From patchwork Thu Jun 21 01:53:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932478 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="IGntk7hp"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4l21XShz9s3R for ; Thu, 21 Jun 2018 12:04:30 +1000 (AEST) Received: from localhost ([::1]:52526 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoxX-0002xu-Nf for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:04:27 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39234) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonu-0004PB-Jm for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonr-000361-8a for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:30 -0400 Received: from mail-pl0-x230.google.com ([2607:f8b0:400e:c01::230]:35617) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonq-00035O-W9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:27 -0400 Received: by mail-pl0-x230.google.com with SMTP id k1-v6so756899plt.2 for ; Wed, 20 Jun 2018 18:54:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=1mQ1TMjGOJewWJHYTgzA7FQK3l6kp40/y15IFPWJw84=; b=IGntk7hpd9zGMATxmEv6WefRle1U8d/B00/WI/Efr+1dQoCLjpEXto9HLxOMHPabo6 VnPHbLLOa9bj9f+PbUfinqdV/xwDUbv+jf+qYVmbqWtNmRZ+phg4JH2o4r6RY+lnfmYn sK8EXITIsFSSuFZ/0bcjndVyFUKBAfTZxFlf0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=1mQ1TMjGOJewWJHYTgzA7FQK3l6kp40/y15IFPWJw84=; b=eY4Dclx64rCK4g8q2kybpJ5xed2v+k2zHXYj6JpVHxXGsfI2AzkPXi2a3JNFHvP9B2 MTxFcZQ4oKdcpdJDwvWe5BrnjEsdY/pJ8CT06cF87BX0jRzkeauYUX0P5ra5j2Vc4mGB bIuxpRl7I59VJLVokQ7b1+OEeEuKcQpx0NjQ7/OqI1XFfYgaXWq6ySzrymfZ2+ZIN3tO TsVk8rpzYTbMVnMiAMKN64JejFpaQ7e8scHqbRrg2vtlxTewXkGBiNg6ELGJOKK2gw8l 64i3llwcPd6rZzjfllaAHEIHr106S+fkqCdWxvuXCSzcL9qAbiE5nBvHHq3O6fv6Elr/ dvQg== X-Gm-Message-State: APt69E3pRi6+ur5mgAZ9YT5tBdhvWtKD/yl0F9X5WfjeoFs2V4F1pRn6 3fhXWhsLHN6I9zqaS7FHQwXaMxF1gDU= X-Google-Smtp-Source: ADUXVKJ4O8psLWj5lWE7Ww409p23ap8Ol2b6Iw/WgCisIOhYlTlKCEXm6razGIMTUnSPyGX3WkJT9w== X-Received: by 2002:a17:902:650a:: with SMTP id b10-v6mr26566350plk.45.1529546065677; Wed, 20 Jun 2018 18:54:25 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.24 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:35 -1000 Message-Id: <20180621015359.12018-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::230 Subject: [Qemu-devel] [PATCH v5 11/35] target/arm: Implement SVE scatter stores X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 41 +++++++++++++++++++++ target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 74 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 39 ++++++++++++++++++++ 4 files changed, 216 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a5d3bb121c..8880128f9c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -958,3 +958,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a9c98bca32..ed4861a292 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3712,3 +3712,65 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, addr += 4 * 8; } } + +/* Stores with a vector index. */ + +#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + uint8_t pp = pg[H1(i)]; \ + if (pp & 0x01) { \ + target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \ + FN(env, base + off, d[H4(i * 2)], ra); \ + } \ + if (pp & 0x10) { \ + target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \ + FN(env, base + off, d[H4(i * 2 + 1)], ra); \ + } \ + } \ +} + +#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + if (pg[H1(i)] & 1) { \ + target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \ + FN(env, base + off, d[i], ra); \ + } \ + } \ +} + +DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra) + +DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra) +DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra) +DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra) +DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 50f1ff75ef..6e1907cedd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -43,6 +43,8 @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); +typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i64, TCGv_i32); /* * Helpers for extracting complex instruction fields. @@ -4228,3 +4230,75 @@ static bool trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn) } return true; } + +/* + *** SVE gather loads / scatter stores + */ + +static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, + TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale)); + TCGv_ptr t_zm = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + TCGv_ptr t_zt = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm)); + tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt)); + fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc); + + tcg_temp_free_ptr(t_zt); + tcg_temp_free_ptr(t_zm); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) +{ + /* Indexed by [xs][msz]. */ + static gen_helper_gvec_mem_scatter * const fn32[2][3] = { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, + }; + static gen_helper_gvec_mem_scatter * const fn64[3][4] = { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, + }; + gen_helper_gvec_mem_scatter *fn; + + if (a->esz < a->msz || (a->msz == 0 && a->scale)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + switch (a->esz) { + case MO_32: + fn = fn32[a->xs][a->msz]; + break; + case MO_64: + fn = fn64[a->xs][a->msz]; + break; + default: + g_assert_not_reached(); + } + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c088e51493..2ca0fd85e6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -80,6 +80,7 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_scatter_store rd pg rn rm esz msz xs scale ########################################################################### # Named instruction formats. These are generally used to @@ -198,6 +199,8 @@ @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store @rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ &rprr_store nreg=0 +@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_scatter_store ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -824,3 +827,39 @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \ # SVE store multiple structures (scalar plus scalar) (nreg != 0) ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \ @rprr_store esz=%size_23 + +# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets) +# Require msz > 0 && msz <= esz. +ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=1 +ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=2 scale=1 + +# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets) +# Require msz <= esz. +ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=0 +ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=2 scale=0 + +# SVE 64-bit scatter store (scalar plus 64-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=1 + +# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=0 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) +# Require msz > 0 +ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=3 scale=1 +ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=3 scale=1 + +# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset) +ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=3 scale=0 +ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \ + @rprr_scatter_store xs=1 esz=3 scale=0 From patchwork Thu Jun 21 01:53:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932483 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="PiwcLZ7B"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4pg35VYz9s3R for ; Thu, 21 Jun 2018 12:07:39 +1000 (AEST) Received: from localhost ([::1]:52544 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp0b-0005SG-2m for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:07:37 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39233) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonu-0004P9-JT for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVons-00036p-U6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:30 -0400 Received: from mail-pf0-x229.google.com ([2607:f8b0:400e:c00::229]:33683) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVons-00036R-NH for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:28 -0400 Received: by mail-pf0-x229.google.com with SMTP id b17-v6so694612pfi.0 for ; Wed, 20 Jun 2018 18:54:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ABjeW7FWo1HVKOTk5GV1YLP/uN+2VusXjD7Yv5Umy/k=; b=PiwcLZ7BYFg6q1immBv7qlBb5cLdO0M7fxvQmHG1CcOQd10qYjR+XlzSD9LIkHdXI+ sk9OeaWUsMHmpmeyB/oEonlBg5bJ+lYfHlPEGlGHGgqvrxc9V5TQIjRIa0a/djSxvK9U SUAbWQVRlqZqEwI20DXbyJ7LyzQilj5xSnlnE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ABjeW7FWo1HVKOTk5GV1YLP/uN+2VusXjD7Yv5Umy/k=; b=W2V0BaHuSg//4FjNhjnOaqltCKWD3uQZFqIRJ2IOxxrXqN9OPdGEpd7Y+WoRk/CE8p Tu+nkIhY8l9KQ+yO61x9II+5nBFYeaKPq7ddUF0Hj3DJsnDmB8oA1W4OvFU08xsx+hGI HfMDdfBEly95zEXBpyHdzBVU19bxkp6c8kqBHyS63U3VtC+QuLeSN9ZsSkkcLUxG8Ehx +wBibQactB3rmSqtYLSSKtSbaZv+dwdjdveePB0vJrjp1MciFcvceJyXAzgFCT5G/nNp 8L8Sv38srSe2Lj1sk8hi+3uB6DNmq/TIyBuwvr6LhgvEBnIELfBi4Y89w7aNbrDnVjzJ ogfg== X-Gm-Message-State: APt69E2xBhi4dKlX//7g0p+Swf8kG+j1Dt2Zd+9o/k7Gs/vv/wPiRQ4y XKpMnWqHe41pBoLXSsNmVQ9C/rNYfes= X-Google-Smtp-Source: ADUXVKKpMH70IGO2dAjGaJxKo/Gv/uiXJShJ/u2BRkW75JrowAVyrXU2fSxHk97v4dKSGrs6rkLeig== X-Received: by 2002:a65:5686:: with SMTP id v6-v6mr20906014pgs.141.1529546067428; Wed, 20 Jun 2018 18:54:27 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:36 -1000 Message-Id: <20180621015359.12018-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::229 Subject: [Qemu-devel] [PATCH v5 12/35] target/arm: Implement SVE prefetches X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 21 +++++++++++++++++++++ target/arm/sve.decode | 23 +++++++++++++++++++++++ 2 files changed, 44 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6e1907cedd..c054e3268b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4302,3 +4302,24 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) cpu_reg_sp(s, a->rn), fn); return true; } + +/* + * Prefetches + */ + +static bool trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn) +{ + /* Prefetch is a nop within QEMU. */ + sve_access_check(s); + return true; +} + +static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn) +{ + if (a->rm == 31) { + return false; + } + /* Prefetch is a nop within QEMU. */ + sve_access_check(s); + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2ca0fd85e6..a20f98b70c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -793,6 +793,29 @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ @rpri_load_msz nreg=0 +# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets) +PRF 1000010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 32-bit gather prefetch (vector plus immediate) +PRF 1000010 -- 00 ----- 111 --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus immediate) +PRF 1000010 11 1- ----- 0-- --- ----- 0 ---- + +# SVE contiguous prefetch (scalar plus scalar) +PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ---- + +### SVE Memory 64-bit Gather Group + +# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) +PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets) +PRF 1100010 00 -1 ----- 0-- --- ----- 0 ---- + +# SVE 64-bit gather prefetch (vector plus immediate) +PRF 1100010 -- 00 ----- 111 --- ----- 0 ---- + ### SVE Memory Store Group # SVE store predicate register From patchwork Thu Jun 21 01:53:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932476 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="FVBp4pQx"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4gS532Kz9s3R for ; Thu, 21 Jun 2018 12:01:24 +1000 (AEST) Received: from localhost ([::1]:52513 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVouY-0000jf-8T for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:01:22 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39251) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonx-0004S6-1D for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonv-000389-7c for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:33 -0400 Received: from mail-pg0-x232.google.com ([2607:f8b0:400e:c05::232]:32899) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonu-00037b-U2 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:31 -0400 Received: by mail-pg0-x232.google.com with SMTP id e11-v6so650176pgq.0 for ; Wed, 20 Jun 2018 18:54:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=K1yCRF/mY5J+RXM8YLsiTaO0HBwTPISeSv7DvvgqE/M=; b=FVBp4pQxf8qFaNCIoMrhRuks4RSSbupuHODvyJARQVdZvAjiHoOFUneemJA/yi5gzt +kbyTraTuuodArrbu8+xxZQOVEiMcwsbJC00XAjnlyrQs1TyD62ldtDYkhwXHvOtKUIu LbRpxnK713JGHlvSaBYbyhH0rbVkXVKqFb9Jw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=K1yCRF/mY5J+RXM8YLsiTaO0HBwTPISeSv7DvvgqE/M=; b=pLhk1JhzwFWlFDuuY4krkCAAIY6m+/fe9Hem4yH+rK9vGNa+UPRh/qPeyhm2omYQ3C ezih43hJF08DO6A01SZloCV+1H8HYtqQ/ltnFAHIcNEbeb7cYi9LwOOzGJUcc0I8A7jt 67UZ9kcIzBeyeIdTMOljP2gV5j2sIXYCczl4zyfL4l3L0jAHM913eJfaoFi/Lr4pkYLg v1OFkyHTCdJnphxNjQI2RzXksRgMHv4bzI1RvYYFBXTKBcHV/ztp7/EyMJURiND5kzOC aoJzvKsF330IvtvUP5UTEZmadlcZQu8CGwalYoE6djCGxgizhm+s24K+CxUugMjTbTX0 7LiQ== X-Gm-Message-State: APt69E3fQlD1VW9BHLyzSJNXCkDG3EZzAobAxFSDBHJGSVTHTIa3dCwc ba7Ckky38AzyP/JOvt9SBRjM81Ofrnk= X-Google-Smtp-Source: ADUXVKJ7+6ClY4J+Ebf+qQNV65SJ77saDem0TuzxTxyUehnsmtmgY46Y+jNaTwxkZ+GkZYghE+Fs5g== X-Received: by 2002:a62:df89:: with SMTP id d9-v6mr5768561pfl.147.1529546069377; Wed, 20 Jun 2018 18:54:29 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:37 -1000 Message-Id: <20180621015359.12018-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::232 Subject: [Qemu-devel] [PATCH v5 13/35] target/arm: Implement SVE gather loads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 67 ++++++++++++++++++++++++ target/arm/sve_helper.c | 77 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 53 +++++++++++++++++++ 4 files changed, 301 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8880128f9c..aeb62afc34 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -959,6 +959,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ed4861a292..9b99718156 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3713,6 +3713,83 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, } } +/* Loads with a vector index. */ + +#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + for (i = 0; i < oprsz; i++) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + target_ulong off = *(TYPEI *)(vm + H1_4(i)); \ + m = FN(env, base + (off << scale), ra); \ + } \ + *(uint32_t *)(vd + H1_4(i)) = m; \ + i += 4, pg >>= 4; \ + } while (i & 15); \ + } \ +} + +#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i++) { \ + TYPEM mm = 0; \ + if (pg[H1(i)] & 1) { \ + target_ulong off = (TYPEI)m[i]; \ + mm = FN(env, base + (off << scale), ra); \ + } \ + d[i] = mm; \ + } \ +} + +DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra) + +DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra) +DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra) +DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) +DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) +DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) + /* Stores with a vector index. */ #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c054e3268b..a53554f8d5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4255,6 +4255,110 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, tcg_temp_free_i32(desc); } +/* Indexed by [xs][u][msz]. */ +static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = { + { { gen_helper_sve_ldbss_zsu, + gen_helper_sve_ldhss_zsu, + NULL, }, + { gen_helper_sve_ldbsu_zsu, + gen_helper_sve_ldhsu_zsu, + gen_helper_sve_ldssu_zsu, } }, + { { gen_helper_sve_ldbss_zss, + gen_helper_sve_ldhss_zss, + NULL, }, + { gen_helper_sve_ldbsu_zss, + gen_helper_sve_ldhsu_zss, + gen_helper_sve_ldssu_zss, } }, +}; + +static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = { + { { gen_helper_sve_ldbds_zsu, + gen_helper_sve_ldhds_zsu, + gen_helper_sve_ldsds_zsu, + NULL, }, + { gen_helper_sve_ldbdu_zsu, + gen_helper_sve_ldhdu_zsu, + gen_helper_sve_ldsdu_zsu, + gen_helper_sve_ldddu_zsu, } }, + { { gen_helper_sve_ldbds_zss, + gen_helper_sve_ldhds_zss, + gen_helper_sve_ldsds_zss, + NULL, }, + { gen_helper_sve_ldbdu_zss, + gen_helper_sve_ldhdu_zss, + gen_helper_sve_ldsdu_zss, + gen_helper_sve_ldddu_zss, } }, + { { gen_helper_sve_ldbds_zd, + gen_helper_sve_ldhds_zd, + gen_helper_sve_ldsds_zd, + NULL, }, + { gen_helper_sve_ldbdu_zd, + gen_helper_sve_ldhdu_zd, + gen_helper_sve_ldsdu_zd, + gen_helper_sve_ldddu_zd, } }, +}; + +static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + + if (a->esz < a->msz + || (a->msz == 0 && a->scale) + || (a->esz == a->msz && !a->u)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + /* TODO: handle LDFF1. */ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[a->xs][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[a->xs][a->u][a->msz]; + break; + } + assert(fn != NULL); + + do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz, + cpu_reg_sp(s, a->rn), fn); + return true; +} + +static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz || (a->esz == a->msz && !a->u)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + /* TODO: handle LDFF1. */ + switch (a->esz) { + case MO_32: + fn = gather_load_fn32[0][a->u][a->msz]; + break; + case MO_64: + fn = gather_load_fn64[2][a->u][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x]) + * by loading the immediate into the scalar parameter. + */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); + return true; +} + static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { /* Indexed by [xs][msz]. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a20f98b70c..fb73a22c0e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -80,6 +80,8 @@ &rpri_load rd pg rn imm dtype nreg &rprr_store rd pg rn rm msz esz nreg &rpri_store rd pg rn imm msz esz nreg +&rprr_gather_load rd pg rn rm esz msz u ff xs scale +&rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale ########################################################################### @@ -194,6 +196,18 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Gather Loads. +@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load +@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=2 +@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \ + &rpri_gather_load + # Stores; user must fill in ESZ, MSZ, NREG as needed. @rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store @@ -758,6 +772,19 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ &rpri_load dtype=%dtype_23_13 nreg=0 +# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets) +# SVE 32-bit gather load (scalar plus 32-bit scaled offsets) +LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=2 msz=0 scale=0 +LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=1 +LD1_zprz 1000010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=2 msz=2 + +# SVE 32-bit gather load (vector plus immediate) +LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=2 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar) @@ -807,6 +834,32 @@ PRF_rr 1000010 -- 00 rm:5 110 --- ----- 0 ---- ### SVE Memory 64-bit Gather Group +# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets) +# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets) +LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 .. ..... 0.. ... ..... ..... \ + @rprr_g_load_xs_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets) +# SVE 64-bit gather load (scalar plus 64-bit scaled offsets) +LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \ + @rprr_g_load_u esz=3 msz=0 scale=0 +LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=1 +LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=2 +LD1_zprz 1100010 11 1. ..... 1.. ... ..... ..... \ + @rprr_g_load_u_sc esz=3 msz=3 + +# SVE 64-bit gather load (vector plus immediate) +LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \ + @rpri_g_load esz=3 + # SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets) PRF 1100010 00 11 ----- 1-- --- ----- 0 ---- From patchwork Thu Jun 21 01:53:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932484 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="emhfObw7"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4pg3Tj8z9s3T for ; Thu, 21 Jun 2018 12:07:39 +1000 (AEST) Received: from localhost ([::1]:52543 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp0b-0005RP-53 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:07:37 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39259) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonz-0004S9-8x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVonw-000391-Nx for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:34 -0400 Received: from mail-pf0-x232.google.com ([2607:f8b0:400e:c00::232]:35928) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVonw-00038g-FD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:32 -0400 Received: by mail-pf0-x232.google.com with SMTP id a12-v6so689301pfi.3 for ; Wed, 20 Jun 2018 18:54:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DhPHednOJI+L5QL2OOrjnicM5jhrFbP7kdaN7+aPugs=; b=emhfObw7jBmVCq4NgsiYnUzo/2H5lfDP7j8X9N/0gtuwlk2VuxtcpEnge8NpLCo1vu DchC3dxWTTBSkZVuXxHdW2R8fTqiqqYSrU6hHn2Nu9mkgkTgoTQuhJSdrLV16m0eRgKf G/8N16oGp3q0Dv2DTFMPTay7R3X9EpiuX3ncE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DhPHednOJI+L5QL2OOrjnicM5jhrFbP7kdaN7+aPugs=; b=r/iaY3K2cjdhZ0wIWs8gUAf6Xor8zccY2yXlQY64WrenlHWHO6H+yx9OFt4HNYaLXP jOjj9Hckiawk0cYDGX3gkVrMkbimQ63ydkjt2P1rdZ4+7rdwCjLVHMccdOqrzeD6kOd+ 9MD794Zm6IsU/BOzkxr9EquNVccj9U28jMbzJ0o+wwUD5CUZ0H3dFiVF9r74gfF1ad9z VMcMR6GWcRib8ubZLK/oV+lG6nMw9cTbX8CJ3gY79JA7dvym+M9rZ6OPnx2v2HPPphFH uax7IQf+Fm1QZHDfcxv7Nz3fV/DPGz6xiKGvsQ48t8wji0ecBM6nU8e1gI6oGSGzcNWN vNfQ== X-Gm-Message-State: APt69E0/wnFLKz3Dy+UIY2VMYnN0N3ouQLS9cqJuiCw+Y3InXaelAPW0 c6AQ0xW58yRfyU4rnqlhrilG7IYj/DY= X-Google-Smtp-Source: ADUXVKJJlVewWbUlN7LlOZGVJzmULto8ct5XB7SjNG63tqeK0inn6kgD86da3G/9jEtCVEl0EzsaHA== X-Received: by 2002:a62:9385:: with SMTP id r5-v6mr25212938pfk.59.1529546071051; Wed, 20 Jun 2018 18:54:31 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.29 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:38 -1000 Message-Id: <20180621015359.12018-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::232 Subject: [Qemu-devel] [PATCH v5 14/35] target/arm: Implement SVE first-fault gather loads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 67 ++++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 126 ++++++++++++++++++++++++------------- 3 files changed, 236 insertions(+), 45 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aeb62afc34..55e8a908d4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1026,6 +1026,73 @@ DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhsu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffssu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhss_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zsu, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zss, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + +DEF_HELPER_FLAGS_6(sve_ldffbdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsdu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffddu_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffbds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffhds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_6(sve_ldffsds_zd, TCG_CALL_NO_WG, + void, env, ptr, ptr, ptr, tl, i32) + DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9b99718156..38f7cc2274 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3790,6 +3790,94 @@ DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) +/* First fault loads with a vector index. */ + +#ifdef CONFIG_USER_ONLY + +#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + unsigned scale = simd_data(desc); \ + uintptr_t ra = GETPC(); \ + bool first = true; \ + mmap_lock(); \ + for (i = 0; i < oprsz; i++) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + target_ulong off = *(TYPEI *)(vm + H(i)); \ + target_ulong addr = base + (off << scale); \ + if (!first && \ + page_check_range(addr, sizeof(TYPEM), PAGE_READ)) { \ + record_fault(env, i, oprsz); \ + goto exit; \ + } \ + m = FN(env, addr, ra); \ + first = false; \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + } while (i & 15); \ + } \ + exit: \ + mmap_unlock(); \ +} + +#else + +#define DO_LDFF1_ZPZ(NAME, TYPEE, TYPEI, TYPEM, FN, H) \ +void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \ + target_ulong base, uint32_t desc) \ +{ \ + g_assert_not_reached(); \ +} + +#endif + +#define DO_LDFF1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \ + DO_LDFF1_ZPZ(NAME, uint32_t, TYPEI, TYPEM, FN, H1_4) +#define DO_LDFF1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \ + DO_LDFF1_ZPZ(NAME, uint64_t, TYPEI, TYPEM, FN, ) + +DO_LDFF1_ZPZ_S(sve_ldffbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra) + +DO_LDFF1_ZPZ_S(sve_ldffbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffssu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffbss_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_S(sve_ldffhss_zss, int32_t, int16_t, cpu_lduw_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zss, int32_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zss, int32_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zss, int32_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zss, int32_t, int32_t, cpu_ldl_data_ra) + +DO_LDFF1_ZPZ_D(sve_ldffbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffbds_zd, uint64_t, int8_t, cpu_ldub_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffhds_zd, uint64_t, int16_t, cpu_lduw_data_ra) +DO_LDFF1_ZPZ_D(sve_ldffsds_zd, uint64_t, int32_t, cpu_ldl_data_ra) + /* Stores with a vector index. */ #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a53554f8d5..11c1bf112c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4255,47 +4255,85 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale, tcg_temp_free_i32(desc); } -/* Indexed by [xs][u][msz]. */ -static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = { - { { gen_helper_sve_ldbss_zsu, - gen_helper_sve_ldhss_zsu, - NULL, }, - { gen_helper_sve_ldbsu_zsu, - gen_helper_sve_ldhsu_zsu, - gen_helper_sve_ldssu_zsu, } }, - { { gen_helper_sve_ldbss_zss, - gen_helper_sve_ldhss_zss, - NULL, }, - { gen_helper_sve_ldbsu_zss, - gen_helper_sve_ldhsu_zss, - gen_helper_sve_ldssu_zss, } }, +/* Indexed by [ff][xs][u][msz]. */ +static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][2][3] = { + { { { gen_helper_sve_ldbss_zsu, + gen_helper_sve_ldhss_zsu, + NULL, }, + { gen_helper_sve_ldbsu_zsu, + gen_helper_sve_ldhsu_zsu, + gen_helper_sve_ldssu_zsu, } }, + { { gen_helper_sve_ldbss_zss, + gen_helper_sve_ldhss_zss, + NULL, }, + { gen_helper_sve_ldbsu_zss, + gen_helper_sve_ldhsu_zss, + gen_helper_sve_ldssu_zss, } }, }, + + { { { gen_helper_sve_ldffbss_zsu, + gen_helper_sve_ldffhss_zsu, + NULL, }, + { gen_helper_sve_ldffbsu_zsu, + gen_helper_sve_ldffhsu_zsu, + gen_helper_sve_ldffssu_zsu, } }, + { { gen_helper_sve_ldffbss_zss, + gen_helper_sve_ldffhss_zss, + NULL, }, + { gen_helper_sve_ldffbsu_zss, + gen_helper_sve_ldffhsu_zss, + gen_helper_sve_ldffssu_zss, } } } }; -static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = { - { { gen_helper_sve_ldbds_zsu, - gen_helper_sve_ldhds_zsu, - gen_helper_sve_ldsds_zsu, - NULL, }, - { gen_helper_sve_ldbdu_zsu, - gen_helper_sve_ldhdu_zsu, - gen_helper_sve_ldsdu_zsu, - gen_helper_sve_ldddu_zsu, } }, - { { gen_helper_sve_ldbds_zss, - gen_helper_sve_ldhds_zss, - gen_helper_sve_ldsds_zss, - NULL, }, - { gen_helper_sve_ldbdu_zss, - gen_helper_sve_ldhdu_zss, - gen_helper_sve_ldsdu_zss, - gen_helper_sve_ldddu_zss, } }, - { { gen_helper_sve_ldbds_zd, - gen_helper_sve_ldhds_zd, - gen_helper_sve_ldsds_zd, - NULL, }, - { gen_helper_sve_ldbdu_zd, - gen_helper_sve_ldhdu_zd, - gen_helper_sve_ldsdu_zd, - gen_helper_sve_ldddu_zd, } }, +static gen_helper_gvec_mem_scatter * const gather_load_fn64[2][3][2][4] = { + { { { gen_helper_sve_ldbds_zsu, + gen_helper_sve_ldhds_zsu, + gen_helper_sve_ldsds_zsu, + NULL, }, + { gen_helper_sve_ldbdu_zsu, + gen_helper_sve_ldhdu_zsu, + gen_helper_sve_ldsdu_zsu, + gen_helper_sve_ldddu_zsu, } }, + { { gen_helper_sve_ldbds_zss, + gen_helper_sve_ldhds_zss, + gen_helper_sve_ldsds_zss, + NULL, }, + { gen_helper_sve_ldbdu_zss, + gen_helper_sve_ldhdu_zss, + gen_helper_sve_ldsdu_zss, + gen_helper_sve_ldddu_zss, } }, + { { gen_helper_sve_ldbds_zd, + gen_helper_sve_ldhds_zd, + gen_helper_sve_ldsds_zd, + NULL, }, + { gen_helper_sve_ldbdu_zd, + gen_helper_sve_ldhdu_zd, + gen_helper_sve_ldsdu_zd, + gen_helper_sve_ldddu_zd, } } }, + + { { { gen_helper_sve_ldffbds_zsu, + gen_helper_sve_ldffhds_zsu, + gen_helper_sve_ldffsds_zsu, + NULL, }, + { gen_helper_sve_ldffbdu_zsu, + gen_helper_sve_ldffhdu_zsu, + gen_helper_sve_ldffsdu_zsu, + gen_helper_sve_ldffddu_zsu, } }, + { { gen_helper_sve_ldffbds_zss, + gen_helper_sve_ldffhds_zss, + gen_helper_sve_ldffsds_zss, + NULL, }, + { gen_helper_sve_ldffbdu_zss, + gen_helper_sve_ldffhdu_zss, + gen_helper_sve_ldffsdu_zss, + gen_helper_sve_ldffddu_zss, } }, + { { gen_helper_sve_ldffbds_zd, + gen_helper_sve_ldffhds_zd, + gen_helper_sve_ldffsds_zd, + NULL, }, + { gen_helper_sve_ldffbdu_zd, + gen_helper_sve_ldffhdu_zd, + gen_helper_sve_ldffsdu_zd, + gen_helper_sve_ldffddu_zd, } } } }; static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) @@ -4311,13 +4349,12 @@ static bool trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn) return true; } - /* TODO: handle LDFF1. */ switch (a->esz) { case MO_32: - fn = gather_load_fn32[a->xs][a->u][a->msz]; + fn = gather_load_fn32[a->ff][a->xs][a->u][a->msz]; break; case MO_64: - fn = gather_load_fn64[a->xs][a->u][a->msz]; + fn = gather_load_fn64[a->ff][a->xs][a->u][a->msz]; break; } assert(fn != NULL); @@ -4339,13 +4376,12 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) return true; } - /* TODO: handle LDFF1. */ switch (a->esz) { case MO_32: - fn = gather_load_fn32[0][a->u][a->msz]; + fn = gather_load_fn32[a->ff][0][a->u][a->msz]; break; case MO_64: - fn = gather_load_fn64[2][a->u][a->msz]; + fn = gather_load_fn64[a->ff][2][a->u][a->msz]; break; } assert(fn != NULL); From patchwork Thu Jun 21 01:53:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932477 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Bn4ppvSG"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4hj0rcLz9s3R for ; Thu, 21 Jun 2018 12:02:29 +1000 (AEST) Received: from localhost ([::1]:52519 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVovX-0001TI-P1 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:02:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39273) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVonz-0004SA-E6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVony-0003A3-Ax for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:35 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:45163) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVony-00039Y-2c for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:34 -0400 Received: by mail-pl0-x22f.google.com with SMTP id o18-v6so58673pll.12 for ; Wed, 20 Jun 2018 18:54:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=K/RaemtBMKMDMQiaXGvolQPI9idWFbIOOZMoovbxENI=; b=Bn4ppvSGM/GQJ+fdsN6FJSqINnqdxfomlfNFtdEPFvX6j8vaIWfaSV0WfGSb94E5mf GMpToHtkg/p0QpoSD4O3KVJuxvzvdoWNJ7DolkjH7GGeSG4kZzojqMdAucAVVXeHWhhf xfXytbP9mLaO3AQJiEbtq+1z9GWBjVWDMSSbA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=K/RaemtBMKMDMQiaXGvolQPI9idWFbIOOZMoovbxENI=; b=dpC0gt/X9j2HywNhlwUNsKuSB9P/d80OaNIbPm880aywP/pHBY5SoE/1xttYG8Bfb1 I32xzUzeGPP3cHxONxyTgoVad3Cnylyv4dhjXUdfLZfyItSGqFaEZ23bGLecn5tho1gl qSjViRYC3IvAF3BBCa0zmBMkW43yvc03LhgFB9GNu/rzZVqDlqPp515DwcP8VhmIszQE zK217qpLVKSTdgJkyuEiCP+033GsYb2aNG/gTDS2uy810uAZgI7uBz1oA/pdccRqHxao IX3OqKn6m0j9BvfhNenCaCKO2hxIHc28RB+nANfe9oFlX8jqSBx1xRI2fpUNxxVkNxRE fzNw== X-Gm-Message-State: APt69E3FHdg+oqKEnZT1menEJP+ClHXWJE34WSjBA3USWPdRbAcWppSi 7CzW93e3vWCek/gbeIaNUCmYl0XMF3M= X-Google-Smtp-Source: ADUXVKIMg0HiOMmxlDXVad92ZLinGDSq/as5HHR2L70HBgVMXeLlmya+a0mMBStFwd1Yk66dkLNX5Q== X-Received: by 2002:a17:902:b217:: with SMTP id t23-v6mr26425520plr.312.1529546072766; Wed, 20 Jun 2018 18:54:32 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.31 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:39 -1000 Message-Id: <20180621015359.12018-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 15/35] target/arm: Implement SVE scatter store vector immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 83 ++++++++++++++++++++++++++------------ target/arm/sve.decode | 11 +++++ 2 files changed, 69 insertions(+), 25 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 11c1bf112c..fdc3231e63 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4395,31 +4395,33 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn) return true; } +/* Indexed by [xs][msz]. */ +static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = { + { gen_helper_sve_stbs_zsu, + gen_helper_sve_sths_zsu, + gen_helper_sve_stss_zsu, }, + { gen_helper_sve_stbs_zss, + gen_helper_sve_sths_zss, + gen_helper_sve_stss_zss, }, +}; + +static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = { + { gen_helper_sve_stbd_zsu, + gen_helper_sve_sthd_zsu, + gen_helper_sve_stsd_zsu, + gen_helper_sve_stdd_zsu, }, + { gen_helper_sve_stbd_zss, + gen_helper_sve_sthd_zss, + gen_helper_sve_stsd_zss, + gen_helper_sve_stdd_zss, }, + { gen_helper_sve_stbd_zd, + gen_helper_sve_sthd_zd, + gen_helper_sve_stsd_zd, + gen_helper_sve_stdd_zd, }, +}; + static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) { - /* Indexed by [xs][msz]. */ - static gen_helper_gvec_mem_scatter * const fn32[2][3] = { - { gen_helper_sve_stbs_zsu, - gen_helper_sve_sths_zsu, - gen_helper_sve_stss_zsu, }, - { gen_helper_sve_stbs_zss, - gen_helper_sve_sths_zss, - gen_helper_sve_stss_zss, }, - }; - static gen_helper_gvec_mem_scatter * const fn64[3][4] = { - { gen_helper_sve_stbd_zsu, - gen_helper_sve_sthd_zsu, - gen_helper_sve_stsd_zsu, - gen_helper_sve_stdd_zsu, }, - { gen_helper_sve_stbd_zss, - gen_helper_sve_sthd_zss, - gen_helper_sve_stsd_zss, - gen_helper_sve_stdd_zss, }, - { gen_helper_sve_stbd_zd, - gen_helper_sve_sthd_zd, - gen_helper_sve_stsd_zd, - gen_helper_sve_stdd_zd, }, - }; gen_helper_gvec_mem_scatter *fn; if (a->esz < a->msz || (a->msz == 0 && a->scale)) { @@ -4430,10 +4432,10 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) } switch (a->esz) { case MO_32: - fn = fn32[a->xs][a->msz]; + fn = scatter_store_fn32[a->xs][a->msz]; break; case MO_64: - fn = fn64[a->xs][a->msz]; + fn = scatter_store_fn64[a->xs][a->msz]; break; default: g_assert_not_reached(); @@ -4443,6 +4445,37 @@ static bool trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn) return true; } +static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn) +{ + gen_helper_gvec_mem_scatter *fn = NULL; + TCGv_i64 imm; + + if (a->esz < a->msz) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + switch (a->esz) { + case MO_32: + fn = scatter_store_fn32[0][a->msz]; + break; + case MO_64: + fn = scatter_store_fn64[2][a->msz]; + break; + } + assert(fn != NULL); + + /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x]) + * by loading the immediate into the scalar parameter. + */ + imm = tcg_const_i64(a->imm << a->msz); + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn); + tcg_temp_free_i64(imm); + return true; +} + /* * Prefetches */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fb73a22c0e..311ae6dbdf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -83,6 +83,7 @@ &rprr_gather_load rd pg rn rm esz msz u ff xs scale &rpri_gather_load rd pg rn imm esz msz u ff &rprr_scatter_store rd pg rn rm esz msz xs scale +&rpri_scatter_store rd pg rn imm esz msz ########################################################################### # Named instruction formats. These are generally used to @@ -215,6 +216,8 @@ &rprr_store nreg=0 @rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \ &rprr_scatter_store +@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ + &rpri_scatter_store ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -927,6 +930,14 @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \ ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \ @rprr_scatter_store xs=2 esz=3 scale=0 +# SVE 64-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \ + @rpri_scatter_store esz=3 + +# SVE 32-bit scatter store (vector plus immediate) +ST1_zpiz 1110010 .. 11 ..... 101 ... ..... ..... \ + @rpri_scatter_store esz=2 + # SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset) # Require msz > 0 ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \ From patchwork Thu Jun 21 01:53:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932490 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="EIJu97IH"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4tT5LNcz9s3R for ; Thu, 21 Jun 2018 12:10:57 +1000 (AEST) Received: from localhost ([::1]:52562 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp3n-0007uv-AH for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:10:55 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39301) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo4-0004XC-5i for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo0-0003B3-9a for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: from mail-pl0-x233.google.com ([2607:f8b0:400e:c01::233]:44757) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo0-0003Ah-0s for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:36 -0400 Received: by mail-pl0-x233.google.com with SMTP id z9-v6so746293plk.11 for ; Wed, 20 Jun 2018 18:54:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9e2ZEGdVMVF9+P5x8vHEftV4O66gf5blUE3ikkuxwT0=; b=EIJu97IH3/CM7JcmPhHXBLxcA6LCp8e+EG9Lveyro8IqzcWR4hxZVr6FDeJ8xuFtnY x6+xtwTyS9K+lXunAe/MHkWXUIx+I+qzLK1Hjm3QWjK2NNjfsHuF1eUoFEaqC+dJPasf K+TKAGmWAnp+52zIRMvzI6s19/PfgTj+hqQCo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9e2ZEGdVMVF9+P5x8vHEftV4O66gf5blUE3ikkuxwT0=; b=Du24WsEkSGoPgW7WKxPUSwRVsRKdVuU0AXWT5pZxJeCt5RGVdllyDOK2DiLt7dZHdW if72jYrLSoSFZVW64ZR92UGkO2v5xK5UmFLn9lc97ujElsZN1WNbgKOi99HOutaApO7K zXXKNEJBJKO++RQUY2DoF9fAzZhTRlgKw05FwYjI0+QTlEDOIbb4yvhQmC3W+51bjSO9 IcixDV0nv/HrdEgTTu0En8aogcXybyG931GXvgHicCwitNVZy/n9ILi/UwWeTI6Qg3ic GzHbIbwWdIo0CHx7g6XJhLndbCWmYrzTkrNQZNYSywHmpiBXD6CoJ4ckByZPMH+o0TXi mVlA== X-Gm-Message-State: APt69E2X2jGER0H4u+x+L8isLpVmVB1q9HZpvAUz8D+gnP2oY43XEUWY uE4W0Slx/33zJZ6t+GhmRZOXjUUREZE= X-Google-Smtp-Source: ADUXVKKEk7+3SaVctry/oCdF+WpKk+26Ey/Fu3REfgV8F6qQFNqQuvioZKIXDLiHrnpR6f0TYOZ6pA== X-Received: by 2002:a17:902:a989:: with SMTP id bh9-v6mr26782187plb.245.1529546074765; Wed, 20 Jun 2018 18:54:34 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:40 -1000 Message-Id: <20180621015359.12018-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::233 Subject: [Qemu-devel] [PATCH v5 16/35] target/arm: Implement SVE floating-point compare vectors X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 49 ++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 62 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 40 ++++++++++++++++++++++++ target/arm/sve.decode | 11 +++++++ 4 files changed, 162 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 55e8a908d4..6089b3a53f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 38f7cc2274..2fd34d722b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3193,6 +3193,68 @@ void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); } +/* Two operand floating-point comparison controlled by a predicate. + * Unlike the integer version, we are not allowed to optimistically + * compare operands, since the comparison may have side effects wrt + * the FPSR. + */ +#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \ + uint64_t *d = vd, *g = vg; \ + do { \ + uint64_t out = 0, pg = g[j]; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + out |= OP(TYPE, nn, mm, status); \ + } \ + } while (i & 63); \ + d[j--] = out; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZZ_D(NAME, OP) \ + DO_FPCMP_PPZZ(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZZ_ALL(NAME, OP) \ + DO_FPCMP_PPZZ_H(NAME, OP) \ + DO_FPCMP_PPZZ_S(NAME, OP) \ + DO_FPCMP_PPZZ_D(NAME, OP) + +#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 +#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 +#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 +#define DO_FCMUO(TYPE, X, Y, ST) \ + TYPE##_compare_quiet(X, Y, ST) == float_relation_unordered +#define DO_FACGE(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <= 0 +#define DO_FACGT(TYPE, X, Y, ST) \ + TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0 + +DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE) +DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT) +DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ) +DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE) +DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO) +DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE) +DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) + +#undef DO_FPCMP_PPZZ_ALL +#undef DO_FPCMP_PPZZ_D +#undef DO_FPCMP_PPZZ_S +#undef DO_FPCMP_PPZZ_H +#undef DO_FPCMP_PPZZ + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fdc3231e63..a68126b369 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3533,6 +3533,46 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (fn == NULL) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + +#define DO_FPCMP(NAME, name) \ +static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + return do_fp_cmp(s, a, fns[a->esz]); \ +} + +DO_FPCMP(FCMGE, fcmge) +DO_FPCMP(FCMGT, fcmgt) +DO_FPCMP(FCMEQ, fcmeq) +DO_FPCMP(FCMNE, fcmne) +DO_FPCMP(FCMUO, fcmuo) +DO_FPCMP(FACGE, facge) +DO_FPCMP(FACGT, facgt) + +#undef DO_FPCMP + typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 311ae6dbdf..6f4a36422d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -320,6 +320,17 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn +### SVE Floating Point Compare - Vectors Group + +# SVE floating-point compare vectors +FCMGE_ppzz 01100101 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +FCMGT_ppzz 01100101 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +FCMEQ_ppzz 01100101 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +FCMNE_ppzz 01100101 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +FCMUO_ppzz 01100101 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +FACGE_ppzz 01100101 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +FACGT_ppzz 01100101 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Integer Multiply-Add Group # SVE integer multiply-add writing addend (predicated) From patchwork Thu Jun 21 01:53:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932481 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="agiJRlvR"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4mP1JgFz9s3R for ; Thu, 21 Jun 2018 12:05:41 +1000 (AEST) Received: from localhost ([::1]:52531 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoyg-0003qV-J1 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:05:38 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39300) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo4-0004XB-5V for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo2-0003CZ-Eh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: from mail-pg0-x22b.google.com ([2607:f8b0:400e:c05::22b]:40125) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo2-0003C3-5W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:38 -0400 Received: by mail-pg0-x22b.google.com with SMTP id w8-v6so640749pgp.7 for ; Wed, 20 Jun 2018 18:54:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9jJ4qWntsM8u8UvCttYqEQPIwhB/EyI5Rg+m1004LwY=; b=agiJRlvRlyasQ1opd7JSdagT0MZLsdOECu6ZOZZiYBKGIVEPzjY/JMo8hSGuob3Tkn zy5/s838UTl10sG3j4rCUcLxbzXvoAODmEFus/cc9TUS5IwOQw4L26Qz7e6Kn+CWKGc0 2jtOvjBVYEbGCPW2zaCoAWxnlnkUr3gqsxNUU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9jJ4qWntsM8u8UvCttYqEQPIwhB/EyI5Rg+m1004LwY=; b=a7cJiLQk3Rvhd65eyubokuGeD+y1hs7OOAy68T9S8I8l8g3GLMGdvSGOXVaBaybzEU uSQH/vTwrnVTnWw8bEzZrfzTQgGP2c8Sr0iyT74LAOpdUkZlGKoumlvBDa6R23nOWp61 NWW5t7hqCsm8IeQOk67Hrv/JwElfGuM1Uot8ZKgYLCCA47VQswga4fDnyHFuO/trsvuE Aphh04GCEsEDn/y+bxdmNTox0xd1ixK7hSs1spS+OVxYeAHcZq942RDGxw6fIVDVH642 /8MHaVYMldkaRJ7Xfwe+0w9EboenNabzrn3ncQgn7qB90lDLTIwN750VRtkKwWTz1xIO eYiw== X-Gm-Message-State: APt69E2zLT1ssvvWAvodhzG2jmnFYPPgOzN/zUrEJCWaQIjLhuj5Lxgh HD38D1pnICvXpkRkhVW1i5G+dKswVPs= X-Google-Smtp-Source: ADUXVKJMHATRQZ0pxITiyCxdazMA+uEyIFhfvGb/pXnqY7RAC10xwlSnwIPzCvga7LBegebgkFmG8Q== X-Received: by 2002:a65:5a4f:: with SMTP id z15-v6mr20049760pgs.283.1529546076773; Wed, 20 Jun 2018 18:54:36 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.34 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:41 -1000 Message-Id: <20180621015359.12018-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22b Subject: [Qemu-devel] [PATCH v5 17/35] target/arm: Implement SVE floating-point arithmetic with immediate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 56 ++++++++++++++++++++++++++++ target/arm/sve_helper.c | 69 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 14 +++++++ 4 files changed, 214 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6089b3a53f..087819ec2b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i64, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 2fd34d722b..a40df62414 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2997,6 +2997,75 @@ DO_ZPZZ_FP(sve_fmulx_d, uint64_t, , helper_vfp_mulxd) #undef DO_ZPZZ_FP +/* Three-operand expander, with one scalar operand, controlled by + * a predicate, with the extra float_status parameter. + */ +#define DO_ZPZS_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + TYPE mm = scalar; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPE); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add) +DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add) +DO_ZPZS_FP(sve_fadds_d, float64, , float64_add) + +DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub) +DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub) +DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub) + +DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul) +DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul) +DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul) + +static inline float16 subr_h(float16 a, float16 b, float_status *s) +{ + return float16_sub(b, a, s); +} + +static inline float32 subr_s(float32 a, float32 b, float_status *s) +{ + return float32_sub(b, a, s); +} + +static inline float64 subr_d(float64 a, float64 b, float_status *s) +{ + return float64_sub(b, a, s); +} + +DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h) +DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s) +DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d) + +DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum) +DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum) +DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum) + +DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum) +DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum) +DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum) + +DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max) +DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max) +DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max) + +DO_ZPZS_FP(sve_fmins_h, float16, H1_2, float16_min) +DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min) +DO_ZPZS_FP(sve_fmins_d, float64, , float64_min) + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a68126b369..12cfadf4e9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -32,6 +32,7 @@ #include "exec/log.h" #include "trace-tcg.h" #include "translate-a64.h" +#include "fpu/softfloat.h" typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, @@ -3533,6 +3534,80 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr, + TCGv_i64, TCGv_ptr, TCGv_i32); + +static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16, + TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_zd, t_zn, t_pg, status; + TCGv_i32 desc; + + t_zd = tcg_temp_new_ptr(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + + status = get_fpstatus_ptr(is_fp16); + desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + fn(t_zd, t_zn, t_pg, scalar, status, desc); + + tcg_temp_free_i32(desc); + tcg_temp_free_ptr(status); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_zd); +} + +static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm, + gen_helper_sve_fp2scalar *fn) +{ + TCGv_i64 temp = tcg_const_i64(imm); + do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn); + tcg_temp_free_i64(temp); +} + +#define DO_FP_IMM(NAME, name, const0, const1) \ +static bool trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_sve_fp2scalar * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d \ + }; \ + static uint64_t const val[3][2] = { \ + { float16_##const0, float16_##const1 }, \ + { float32_##const0, float32_##const1 }, \ + { float64_##const0, float64_##const1 }, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \ + } \ + return true; \ +} + +#define float16_two make_float16(0x4000) +#define float32_two make_float32(0x40000000) +#define float64_two make_float64(0x4000000000000000ULL) + +DO_FP_IMM(FADD, fadds, half, one) +DO_FP_IMM(FSUB, fsubs, half, one) +DO_FP_IMM(FMUL, fmuls, half, two) +DO_FP_IMM(FSUBR, fsubrs, half, one) +DO_FP_IMM(FMAXNM, fmaxnms, zero, one) +DO_FP_IMM(FMINNM, fminnms, zero, one) +DO_FP_IMM(FMAX, fmaxs, zero, one) +DO_FP_IMM(FMIN, fmins, zero, one) + +#undef DO_FP_IMM + static bool do_fp_cmp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6f4a36422d..0068e415ac 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -160,6 +160,10 @@ @rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \ &rpri_esz rn=%reg_movprfx +# Two register operand, one one-bit floating-point operand. +@rdn_i1 ........ esz:2 ......... pg:3 .... imm:1 rd:5 \ + &rpri_esz rn=%reg_movprfx + # Two register operand, one encoded bitmask. @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx @@ -740,6 +744,16 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm +# SVE floating-point arithmetic with immediate (predicated) +FADD_zpzi 01100101 .. 011 000 100 ... 0000 . ..... @rdn_i1 +FSUB_zpzi 01100101 .. 011 001 100 ... 0000 . ..... @rdn_i1 +FMUL_zpzi 01100101 .. 011 010 100 ... 0000 . ..... @rdn_i1 +FSUBR_zpzi 01100101 .. 011 011 100 ... 0000 . ..... @rdn_i1 +FMAXNM_zpzi 01100101 .. 011 100 100 ... 0000 . ..... @rdn_i1 +FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 +FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 +FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1 + ### SVE FP Multiply-Add Group # SVE floating-point multiply-accumulate writing addend From patchwork Thu Jun 21 01:53:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932485 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="avrrm2FT"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4qc4SSlz9s3R for ; Thu, 21 Jun 2018 12:08:28 +1000 (AEST) Received: from localhost ([::1]:52549 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp1O-00066A-44 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:08:26 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39316) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo5-0004YX-G9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo4-0003DL-9U for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:41 -0400 Received: from mail-pg0-x22e.google.com ([2607:f8b0:400e:c05::22e]:34589) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo4-0003Cz-0x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:40 -0400 Received: by mail-pg0-x22e.google.com with SMTP id q4-v6so650289pgr.1 for ; Wed, 20 Jun 2018 18:54:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jp8pCSaPEPFwVcyoLa5oiFh6g5GNPr1vfSN3Kxz3GBU=; b=avrrm2FT+CqmhvMSc0/aaluOOU56WJzxxjaVMuOccjzXLOCVVSO3nFU3hZ8znXAM68 iVlMNUOyhfxGl8L1mNM6Rvcf89bPqRp93iqPGYI/XJDHZx06oAeCDjzKAV0UYgRxGW3H GSJkpqKhDih930J/m1Ao/Dd5IHZArcr/Tjq3A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jp8pCSaPEPFwVcyoLa5oiFh6g5GNPr1vfSN3Kxz3GBU=; b=gvYsw+bz/UcdOxEiUuiwRpWKF74Twxs/QSBE3GVBws8BVQzJamXqmX6RsUUxlbUM3p eNRvAkkZeSKi6xV4zdiwDKEwYxidcd6RnkkgQxsyTmu5qi/SI8COnrN5C7FvZHf0M049 vNWlk83C1pxYd9V5jqPPwAhjWx/EKYzrGxTyKkuNjAbp/coFQXFKfwQ1mXqEMsWfIBh1 B0VV23K3TaSJkOJdgoPtj4a8iy/H0qoMaLEGsDoHavhSH6ApzqtDtpJOGGpMd016aQts J8EIEGf9bWTPhfZFe8POkl9r6oKdkIqrKgmnrNchO5LF8rpDezwmbyqGW0dquRsmH/Ul z/5g== X-Gm-Message-State: APt69E1i+ilpICuQRJBvUz9YNJb5iOCoRhWkyqMEVXpcndEE9UGgifDi 8faG54orqij/xXO/TfdY81jaKmnr8jM= X-Google-Smtp-Source: ADUXVKJaDAJYrouzjakfMMEOcEyCIfLVjIa9att0mHH2wsw8z4Lp5nmHOAbyrqDm6oetbuiHdMU+oQ== X-Received: by 2002:a62:138c:: with SMTP id 12-v6mr25466554pft.34.1529546078605; Wed, 20 Jun 2018 18:54:38 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.36 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:42 -1000 Message-Id: <20180621015359.12018-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22e Subject: [Qemu-devel] [PATCH v5 18/35] target/arm: Implement SVE Floating Point Multiply Indexed Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 14 +++++++++++ target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 48 ++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 19 +++++++++++++++ 4 files changed, 131 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 879a7229e9..56439ac1e4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -620,6 +620,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 12cfadf4e9..e4ba84cadd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3400,6 +3400,56 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Multiply-Add Indexed Group + */ + +static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, uint32_t insn) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_gvec_fmla_idx_h, + gen_helper_gvec_fmla_idx_s, + gen_helper_gvec_fmla_idx_d, + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, a->index * 2 + a->sub, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + +/* + *** SVE Floating Point Multiply Indexed Group + */ + +static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_gvec_fmul_idx_h, + gen_helper_gvec_fmul_idx_s, + gen_helper_gvec_fmul_idx_d, + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->index, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f504dd53c8..97af75a61b 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -495,3 +495,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) #endif #undef DO_3OP + +/* For the indexed ops, SVE applies the index per 128-bit vector segment. + * For AdvSIMD, there is of course only one such vector segment. + */ + +#define DO_MUL_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + intptr_t idx = simd_data(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_mul(n[i + j], mm, stat); \ + } \ + } \ +} + +DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) +DO_MUL_IDX(gvec_fmul_idx_s, float32, H4) +DO_MUL_IDX(gvec_fmul_idx_d, float64, ) + +#undef DO_MUL_IDX + +#define DO_FMLA_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \ + void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \ + intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + op1_neg <<= (8 * sizeof(TYPE) - 1); \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_muladd(n[i + j] ^ op1_neg, \ + mm, a[i + j], 0, stat); \ + } \ + } \ +} + +DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2) +DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4) +DO_FMLA_IDX(gvec_fmla_idx_d, float64, ) + +#undef DO_FMLA_IDX diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0068e415ac..f64e74fa9d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -29,6 +29,7 @@ %imm9_16_10 16:s6 10:3 %size_23 23:2 %dtype_23_13 23:2 13:2 +%index3_22_19 22:1 19:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -712,6 +713,24 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Multiply-Add Indexed Group + +# SVE floating-point multiply-add (indexed) +FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx index=%index3_22_19 esz=1 +FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=2 +FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=3 + +### SVE FP Multiply Indexed Group + +# SVE floating-point multiply (indexed) +FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ + index=%index3_22_19 esz=1 +FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 +FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932504 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="ffGshJrO"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4xB5bpcz9s2L for ; Thu, 21 Jun 2018 12:13:18 +1000 (AEST) Received: from localhost ([::1]:52576 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp64-0001Km-De for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:13:16 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39328) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo7-0004bG-Ug for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo6-0003Gz-G6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:44 -0400 Received: from mail-pg0-x22c.google.com ([2607:f8b0:400e:c05::22c]:35482) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo6-0003Fo-77 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:42 -0400 Received: by mail-pg0-x22c.google.com with SMTP id i7-v6so648518pgp.2 for ; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=UzNv4pLhvlwFa3l7wUH4tyICe6dAZvXuJw329+CC+Ss=; b=ffGshJrOFEfOSZmQgPcThaJYrK8RvDPP0rzSVZ1hgJAFwbRpZg3g2Lch/zgze+wLPZ VMeIbNofb/kDMTq6c7j/BxMy70zE9/Js4xwAzxDy0IgkDhNMBeGQydMjpX2y27prQwzf LRcjmEMHCGZ3yNgCbYJzI4O0zoMtHpoK192k0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=UzNv4pLhvlwFa3l7wUH4tyICe6dAZvXuJw329+CC+Ss=; b=MKjDeOXoEEhJ3gKZdBgfL+kfZYW+3r/WfqBpmcNeTSBT+yT/AW0AwbW8M8SBHEXxgV 7WEmbqWwL/irSYTS3AOXb2632lQmZUOCKA3iCuq5om/4IVBKWCoMmgDI9ddKoVxqOLjN GSpRNiVohmbCt1ZC9H8zsq4IEE/F1AfsaGKBusRFk5L4t2zjU7OS6q+6Dea/BQIIk9qE dqy6Oku3qECsj+LG/lXCt1yCrGKAutgcjuyPH2VAFHQThKjX00IgWUd7C08H1sQ7RjPk PQr3HQjMwa9rxXRnfcDfeSoR1+qq9jHqjvB3vbwa3wntmwcWomV3o/R8VPC2BPliFb6a Xjdg== X-Gm-Message-State: APt69E3SWHIxLcm+3CjJf2gY7pFaH/ehuNkPxsHLLtxEv5nzh4BitAW3 GIvyh/TRvtaF6Qf0aTyB0V1amgSpDbs= X-Google-Smtp-Source: ADUXVKJOfNvGf10k24aLuh5TeDacxWnxmvcZYTdAy+WmYiR5wBgmWfyeE8MXFQQxYbfh+l93fNNdQw== X-Received: by 2002:a65:5884:: with SMTP id d4-v6mr20324762pgu.292.1529546080823; Wed, 20 Jun 2018 18:54:40 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.38 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:43 -1000 Message-Id: <20180621015359.12018-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22c Subject: [Qemu-devel] [PATCH v5 19/35] target/arm: Implement SVE FP Fast Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 35 ++++++++++++++++++++++ target/arm/sve_helper.c | 61 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 57 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++++ 4 files changed, 161 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 087819ec2b..ff69d143a0 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a40df62414..befea9ba54 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2852,6 +2852,67 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Recursive reduction on a function; + * C.f. the ARM ARM function ReducePredicated. + * + * While it would be possible to write this without the DATA temporary, + * it is much simpler to process the predicate register this way. + * The recursion is bounded to depth 7 (128 fp16 elements), so there's + * little to gain with a more complex non-recursive form. + */ +#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \ +static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \ +{ \ + if (n == 1) { \ + return *data; \ + } else { \ + uintptr_t half = n / 2; \ + TYPE lo = NAME##_reduce(data, status, half); \ + TYPE hi = NAME##_reduce(data + half, status, half); \ + return TYPE##_##FUNC(lo, hi, status); \ + } \ +} \ +uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \ +{ \ + uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \ + TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ + for (; i < maxsz; i += sizeof(TYPE)) { \ + *(TYPE *)((void *)data + i) = IDENT; \ + } \ + return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \ +} + +DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero) +DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero) +DO_REDUCE(sve_faddv_d, float64, , add, float64_zero) + +/* Identity is floatN_default_nan, without the function call. */ +DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00) +DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000) +DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00) +DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000) +DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity) +DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity) +DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity) + +DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity)) +DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity)) +DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity)) + +#undef DO_REDUCE + uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, void *status, uint32_t desc) { diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e4ba84cadd..47d64f2fc7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3450,6 +3450,63 @@ static bool trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) return true; } +/* + *** SVE Floating Point Fast Reduction Group + */ + +typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); + +static void do_reduce(DisasContext *s, arg_rpr_esz *a, + gen_helper_fp_reduce *fn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned p2vsz = pow2ceil(vsz); + TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0)); + TCGv_ptr t_zn, t_pg, status; + TCGv_i64 temp; + + temp = tcg_temp_new_i64(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + status = get_fpstatus_ptr(a->esz == MO_16); + + fn(temp, t_zn, t_pg, status, t_desc); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(status); + tcg_temp_free_i32(t_desc); + + write_fp_dreg(s, a->rd, temp); + tcg_temp_free_i64(temp); +} + +#define DO_VPZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_fp_reduce * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_reduce(s, a, fns[a->esz - 1]); \ + } \ + return true; \ +} + +DO_VPZ(FADDV, faddv) +DO_VPZ(FMINNMV, fminnmv) +DO_VPZ(FMAXNMV, fmaxnmv) +DO_VPZ(FMINV, fminv) +DO_VPZ(FMAXV, fmaxv) + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f64e74fa9d..39a803621f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -731,6 +731,14 @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 +### SVE FP Fast Reduction Group + +FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn +FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn +FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn +FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn +FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932486 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="iJW0uLe9"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4sR5Xhjz9s3R for ; Thu, 21 Jun 2018 12:10:03 +1000 (AEST) Received: from localhost ([::1]:52556 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp2v-0007DW-Dr for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:10:01 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39336) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoo9-0004bI-21 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoo8-0003I8-18 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:45 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:38072) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo7-0003Hl-PM for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:43 -0400 Received: by mail-pl0-x22f.google.com with SMTP id d10-v6so757627plo.5 for ; Wed, 20 Jun 2018 18:54:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=RCtfZpKBlW4ElNRMaWaS9zgg8qe947nw1g14SfBucD0=; b=iJW0uLe9bOPvuuQn+kE+zY+X/rWphnXbttEfStGrY6TM2IjUFmjg7G+cW5AQKXp/Rv cNS8Eai3+gi1t24aUU1kxwdFJkmDP6dG5IoX+nLNViMMLc9CkMSDSOu1iyyzrfiB6G8C Q0VpLRGwedekZ2/z9sz1ZrBkt8sM5fqnpF2iI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RCtfZpKBlW4ElNRMaWaS9zgg8qe947nw1g14SfBucD0=; b=uZK5TCKFcuK6XoPv13FEsBjgTXJNbc9nYVqkvRZL3KhAWtneySW+vRJGSQxRhuRx72 MmKNZUu+9tfsUKJ499rJ//EVs4NYgyDe2lA9n/0/BgrQIHyH6Clwp3l5Ar9r9Mjh1TB4 0cWzTYlBf/9J7s4hcn1Db5LsK1HiDEsQgnHUrZsRE/aPFMwPGbJe4VbP0JuClowHmpFj ZM9SVpFYgAUGDSAWPyYEgOgfly8QdBl1+U2vJH+SjMJbbsTk4PMs0s+I1MhET1Ym3unC +w5ja6yIV+RiSYonQ9QZC4ZeO4P30CzraOJnRrTsr8s79t1CZJ6rbjqA5jUJIgBpwKMI qkEA== X-Gm-Message-State: APt69E3nsT5tCRoJWLu4aB+uC+vKiWcEAZSISR/oVRLuRajPXtuk/AZX cbZs87sJr7gKZsbNqbnJq3droTxdWsA= X-Google-Smtp-Source: ADUXVKKVh/SZ9rJH9tGcumlq+t9Kl3xsATtWak21XTYpCvukH0bVJiOPhV6jHv1TXJXCXb6iKklUbQ== X-Received: by 2002:a17:902:125:: with SMTP id 34-v6mr26252825plb.42.1529546082491; Wed, 20 Jun 2018 18:54:42 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.40 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:44 -1000 Message-Id: <20180621015359.12018-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 20/35] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 8 +++++++ target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 20 ++++++++++++++++ target/arm/sve.decode | 5 ++++ 4 files changed, 80 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 56439ac1e4..ad9cb6c7d5 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -601,6 +601,14 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 47d64f2fc7..d7957cddbd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3507,6 +3507,53 @@ DO_VPZ(FMAXNMV, fmaxnmv) DO_VPZ(FMINV, fminv) DO_VPZ(FMAXV, fmaxv) +/* + *** SVE Floating Point Unary Operations - Unpredicated Group + */ + +static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +static bool trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frecpe_h, + gen_helper_gvec_frecpe_s, + gen_helper_gvec_frecpe_d, + }; + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + do_zz_fp(s, a, fns[a->esz - 1]); + } + return true; +} + +static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frsqrte_h, + gen_helper_gvec_frsqrte_s, + gen_helper_gvec_frsqrte_d, + }; + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + do_zz_fp(s, a, fns[a->esz - 1]); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 97af75a61b..073e5c58e7 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -427,6 +427,26 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +#define DO_2OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = FUNC(n[i], stat); \ + } \ +} + +DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16) +DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32) +DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64) + +DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16) +DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32) +DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64) + +#undef DO_2OP + /* Floating-point trigonometric starting value. * See the ARM ARM pseudocode function FPTrigSMul. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 39a803621f..191be9463d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -739,6 +739,11 @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn +## SVE Floating Point Unary Operations - Unpredicated Group + +FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn +FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932509 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="UmbomvWx"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4zx0Q1mz9s2L for ; Thu, 21 Jun 2018 12:15:39 +1000 (AEST) Received: from localhost ([::1]:52586 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp8J-000399-IY for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:15:35 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39369) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooE-0004gh-19 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooA-0003JL-0M for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:50 -0400 Received: from mail-pl0-x234.google.com ([2607:f8b0:400e:c01::234]:45168) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoo9-0003Ij-OZ for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:45 -0400 Received: by mail-pl0-x234.google.com with SMTP id o18-v6so58914pll.12 for ; Wed, 20 Jun 2018 18:54:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=G7HJFf01ahwyJ3S0o1PGDi5YnhvesRg4vJI2sGihuSA=; b=UmbomvWx7hafV6A8fKPrlz7fy1b2Mo3KDedLFZh1MbgFEU2RCs2tvAEWLhneDFIwzb +L+sE2TLGMMw4bHvVPyGnm9yw5wNLJg3APO9WNIMz2nlAP2xu6rBGeHaGybzSsqqgCXa Zxl9Kn6D0f0jQhSkI6FNxSna6b7Ec+71Zq5rQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=G7HJFf01ahwyJ3S0o1PGDi5YnhvesRg4vJI2sGihuSA=; b=FqD8sEWkGAwc5zlMw7XrhkX0y+DdrKNWBYew+9lTw6WT/ViTAZxPV6Q9X3S6kmJZXq Q4mk3CC8XTkJNA/4L0mowf1ZdaKgb3c7iJAKXwb2SOM1kekeN35P412T4j4g+4tsEBuR HQpcIZTIYfgfQANMCXNBmiueI152ptsM7A8NxRQ8+aGJvImbNu56MBqc65pTBpYkHr6O zJTnbf3pU1qwKWI4voMnrZtzXHRT4ZIIKRBm7iSSXYJvHpotPbvaVXB9YJh11MbM3ECn fXzSL496Nm6YDFgrKjxAY0KSZEqwonkMKuzQikJ1huMNrdiH4tbz2lRSoJtnao5Tu0M1 /Ajg== X-Gm-Message-State: APt69E1r7co+M6FpPPRttY9/MGjcrXh3HnLlB7scvTbMC8L4RFaNHIuf sVzd9R8DD9ODGCPQ7S0TUIA/YTzXBOg= X-Google-Smtp-Source: ADUXVKJUbtcq2gsJOGqr4HXVKHHujJCAS0cV6ZxrxHxxsS/5JjnYohUza3EHXbvn9HskNKBTqtG3NQ== X-Received: by 2002:a17:902:5501:: with SMTP id f1-v6mr26162868pli.108.1529546084438; Wed, 20 Jun 2018 18:54:44 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.42 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:45 -1000 Message-Id: <20180621015359.12018-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::234 Subject: [Qemu-devel] [PATCH v5 21/35] target/arm: Implement SVE FP Compare with Zero Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 42 +++++++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 43 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 +++++++++ 4 files changed, 138 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff69d143a0..44a98440c9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index befea9ba54..06e963e9c2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3362,6 +3362,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ #define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 #define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0 +#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0 #define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 #define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 #define DO_FCMUO(TYPE, X, Y, ST) \ @@ -3385,6 +3387,47 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) #undef DO_FPCMP_PPZZ_H #undef DO_FPCMP_PPZZ +/* One operand floating-point comparison against zero, controlled + * by a predicate. + */ +#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc), j = (i - 1) >> 6; \ + uint64_t *d = vd, *g = vg; \ + do { \ + uint64_t out = 0, pg = g[j]; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + if ((pg >> (i & 63)) & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= OP(TYPE, nn, 0, status); \ + } \ + } while (i & 63); \ + d[j--] = out; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZ0_D(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZ0_ALL(NAME, OP) \ + DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0_D(NAME, OP) + +DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE) +DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT) +DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE) +DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) +DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) +DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d7957cddbd..0c55501935 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3554,6 +3554,49 @@ static bool trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) return true; } +/* + *** SVE Floating Point Compare with Zero Group + */ + +static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_PPZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + return false; \ + } \ + if (sve_access_check(s)) { \ + do_ppz_fp(s, a, fns[a->esz - 1]); \ + } \ + return true; \ +} + +DO_PPZ(FCMGE_ppz0, fcmge0) +DO_PPZ(FCMGT_ppz0, fcmgt0) +DO_PPZ(FCMLE_ppz0, fcmle0) +DO_PPZ(FCMLT_ppz0, fcmlt0) +DO_PPZ(FCMEQ_ppz0, fcmeq0) +DO_PPZ(FCMNE_ppz0, fcmne0) + +#undef DO_PPZ + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 191be9463d..5dea2e5e88 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -140,6 +140,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz # One register operand, with governing predicate, no vector element size @rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 @@ -744,6 +745,15 @@ FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn FRECPE 01100101 .. 001 110 001100 ..... ..... @rd_rn FRSQRTE 01100101 .. 001 111 001100 ..... ..... @rd_rn +### SVE FP Compare with Zero Group + +FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn +FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn +FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn +FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn +FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn +FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated) From patchwork Thu Jun 21 01:53:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932498 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="aEjp4Cfg"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4w6601lz9s2L for ; Thu, 21 Jun 2018 12:12:22 +1000 (AEST) Received: from localhost ([::1]:52570 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp5A-0000b2-Du for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:12:20 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39368) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooE-0004gg-0s for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooC-0003KT-35 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:50 -0400 Received: from mail-pl0-x22b.google.com ([2607:f8b0:400e:c01::22b]:39902) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooB-0003K2-QY for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:47 -0400 Received: by mail-pl0-x22b.google.com with SMTP id s24-v6so752994plq.6 for ; Wed, 20 Jun 2018 18:54:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=YM1glD7stHNvZqQN3AWRG/IvykMWfYM5AgvXp5srQR0=; b=aEjp4CfgU4VlL5WiVthx2wYXBFXcqdH3LemPp3n9hn72GsrgZO9zhZXM+KscqZIoj1 7pfReehU4z2UIPP69KrvHf3LWpVVV9z1luhhmRwN8ASS9HBycIiFiktvdO07gKkDEOwl I9B5bXQE0Uz8Olv89i8RyWCyvjjtP2TAdKYgI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=YM1glD7stHNvZqQN3AWRG/IvykMWfYM5AgvXp5srQR0=; b=s8xG3+Q7ww7COIxaLusHmoNtPTwHg5BI3K94urs4+BcrSfUbvFdWG49BWO8z6NlfmK PAMz4KO/IJ3nq9JiKfnO6eZlQUxslp5Qc5y+Ezh4B71G3qtV+0Hwd1RY3x2QM3C1lhJu CV4fpToLuu5KzRDc202ZpjJ22s/vM2ATpvEJujQ7btvxIs1OXmzOs9awFWQXotCmDnSJ x4+ZCySXwjifdMQZlX4KEtIIQetQlG1Rf5g5ImjCMiMEz/pDrSLv/wsPJAcMX/3RORT/ gp8GBxkXRP5qo8W9w0LlAu6ojseWf/Cs6Iu4N5oC1qM2O/tOrVgz2RBnbzNko6JfE6Yn E4bQ== X-Gm-Message-State: APt69E3fXhw2qwbw4v5VroPMD02HlJ9cUC9AhX5vtUS24KJpZTytPRyb cshwjhBgVhs6FtSAJ0223NUPtlttC8E= X-Google-Smtp-Source: ADUXVKIvPtfa6nNvTIxHRTdkvRk/3kI05rORpTwKZut8LopvKNo4SIm1oKEq7h0ohI4lO5Ox0UOCsA== X-Received: by 2002:a17:902:b40f:: with SMTP id x15-v6mr26487854plr.270.1529546086488; Wed, 20 Jun 2018 18:54:46 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.44 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:46 -1000 Message-Id: <20180621015359.12018-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22b Subject: [Qemu-devel] [PATCH v5 22/35] target/arm: Implement SVE floating-point trig multiply-add coefficient X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 70 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 27 +++++++++++++++ target/arm/sve.decode | 3 ++ 4 files changed, 104 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 44a98440c9..aca137fc37 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 06e963e9c2..3dc35d8b12 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3428,6 +3428,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) +/* FP Trig Multiply-Add. */ + +void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float16 coeff[16] = { + 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16); + intptr_t x = simd_data(desc); + float16 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float16 mm = m[i]; + intptr_t xx = x; + if (float16_is_neg(mm)) { + mm = float16_abs(mm); + xx += 8; + } + d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float32 coeff[16] = { + 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9, + 0x36369d6d, 0x00000000, 0x00000000, 0x00000000, + 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705, + 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32); + intptr_t x = simd_data(desc); + float32 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float32 mm = m[i]; + intptr_t xx = x; + if (float32_is_neg(mm)) { + mm = float32_abs(mm); + xx += 8; + } + d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float64 coeff[16] = { + 0x3ff0000000000000ull, 0xbfc5555555555543ull, + 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull, + 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull, + 0x3de5d8408868552full, 0x0000000000000000ull, + 0x3ff0000000000000ull, 0xbfe0000000000000ull, + 0x3fa5555555555536ull, 0xbf56c16c16c13a0bull, + 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull, + 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64); + intptr_t x = simd_data(desc); + float64 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float64 mm = m[i]; + intptr_t xx = x; + if (float64_is_neg(mm)) { + mm = float64_abs(mm); + xx += 8; + } + d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0c55501935..fb225d56a1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3597,6 +3597,33 @@ DO_PPZ(FCMNE_ppz0, fcmne0) #undef DO_PPZ +/* + *** SVE floating-point trig multiply-add coefficient + */ + +static bool trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_ftmad_h, + gen_helper_sve_ftmad_s, + gen_helper_sve_ftmad_d, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->imm, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5dea2e5e88..e29c598783 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -796,6 +796,9 @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... @rdn_i1 +# SVE floating-point trig multiply-add coefficient +FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=%reg_movprfx + ### SVE FP Multiply-Add Group # SVE floating-point multiply-accumulate writing addend From patchwork Thu Jun 21 01:53:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932488 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="DDVtR+IP"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4tL3bjlz9s3T for ; Thu, 21 Jun 2018 12:10:50 +1000 (AEST) Received: from localhost ([::1]:52561 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp3g-0007q8-5I for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:10:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39391) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooG-0004is-Cb for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooF-0003MF-BK for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:52 -0400 Received: from mail-pf0-x232.google.com ([2607:f8b0:400e:c00::232]:37729) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooF-0003Lj-2f for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:51 -0400 Received: by mail-pf0-x232.google.com with SMTP id y5-v6so685998pfn.4 for ; Wed, 20 Jun 2018 18:54:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fcBqKOfz/OhWxZj3vl+NSnfdZzqnqtvhx9FbSAfYtQI=; b=DDVtR+IPTnfPti8/xA5a86+TdFJjdcq/TOilp6C4x6q2iwxo+R+wePs2ge2XcMtHMW Qwb4Dop8Z5Hhu1KKJ4SMiBSUOaeFdX1ORZh6JZ5oZOR8JKClLrMh2wuuoTRhev5Amwxg 4ceWPT98Yrrb9g0yYfSr8kLKK5CVt/f7zCwR4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fcBqKOfz/OhWxZj3vl+NSnfdZzqnqtvhx9FbSAfYtQI=; b=Iy1XLUzmrLzJTFLBasDp/6aa4rvmK5tCC8HfPyLhgpgDw7k85HcxYrfSUytgxx1Lwl 4frbG21UKgPXfAsrADDDFcGujnAzTCdcM6p4z3+tWwE0ZR1apYm1j1KgNTAcsaPrA0cQ OLjVWbfe+Iu6exKN2wstpPZ6LwK2lxPZdaX4WM/tapYE6n1N8d6Zrnwsb8gvx/rNMAv0 lp10B+tZR8f9RIjZ3PDz9ZLTM1j45F/4afK3jLEHiEES+ak6HFK4EtdocAPQ3grQ8vch Kwor9eZx3O/cjEdTInKpOOOYL6T4EwuREyftWLhiOjleDSAZpK12O9KYWJujuxPX4Zzz X+3w== X-Gm-Message-State: APt69E3CtT/9O1e3tEfIKO9qOWvGhbb/o0FHA482QWQVkgjvOwCAdJJm rOcu0F0YYFlgUP879bCRzjHjiRZ2Nro= X-Google-Smtp-Source: ADUXVKKZYm/0t5Qi20O8fIUfOCUjppQXDx5bZQpY+Av94/o0Nagv+j9nVCXAlZs45khdqvnb12pr0g== X-Received: by 2002:a62:4f4f:: with SMTP id d76-v6mr25014999pfb.188.1529546089754; Wed, 20 Jun 2018 18:54:49 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:47 -1000 Message-Id: <20180621015359.12018-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::232 Subject: [Qemu-devel] [PATCH v5 23/35] target/arm: Implement SVE floating-point convert precision X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++++ target/arm/sve_helper.c | 27 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index aca137fc37..4c379dbb05 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3dc35d8b12..e1bbe1f550 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3147,6 +3147,33 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ } while (i != 0); \ } +static inline float32 sve_f16_to_f32(float16 f, float_status *s) +{ + return float16_to_float32(f, true, s); +} + +static inline float64 sve_f16_to_f64(float16 f, float_status *s) +{ + return float16_to_float64(f, true, s); +} + +static inline float16 sve_f32_to_f16(float32 f, float_status *s) +{ + return float32_to_float16(f, true, s); +} + +static inline float16 sve_f64_to_f16(float64 f, float_status *s) +{ + return float64_to_float16(f, true, s); +} + +DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16) +DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32) +DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16) +DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64) +DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32) +DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fb225d56a1..c08c1263f1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3940,6 +3940,36 @@ static bool do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, return true; } +static bool trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh); +} + +static bool trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs); +} + +static bool trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh); +} + +static bool trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd); +} + +static bool trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds); +} + +static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e29c598783..fd45f51029 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -816,6 +816,14 @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra ### SVE FP Unary Operations Predicated Group +# SVE floating-point convert precision +FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932507 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="NWopxSu0"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4yp27khz9s2L for ; Thu, 21 Jun 2018 12:14:42 +1000 (AEST) Received: from localhost ([::1]:52581 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp7P-0002Ow-Vt for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:14:40 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39409) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooJ-0004lF-7I for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooH-0003Nj-DL for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:55 -0400 Received: from mail-pf0-x22b.google.com ([2607:f8b0:400e:c00::22b]:46300) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooH-0003N0-51 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:53 -0400 Received: by mail-pf0-x22b.google.com with SMTP id q1-v6so677086pff.13 for ; Wed, 20 Jun 2018 18:54:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=chAs1KQCufmRrHCN4/ljpOmhhg2MDuWbn1GZ6QxL+Fs=; b=NWopxSu0548Zu8bzdF529X8cVpnrSkLPg+hDZcQKCYoG4EuFKf87WUIaMHRcq+elZn NGDyHhnyjYuJJFzq2GXOtI1X0pTYPbEEmgPv++Tz09/2Tlddqk+Kd233MqzE96oSyma+ h2hIeGnSs0SfIRexroUl1wKyl9amW/oGuKXQk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=chAs1KQCufmRrHCN4/ljpOmhhg2MDuWbn1GZ6QxL+Fs=; b=Yq5A/niaHE5Tjjjyn31KNSahUb/uaAoEjPltMLJENe/NWeYG7E42KEQWbqZfDOjLWE IH+xvxOMAwcMogjIhHVS5dwghn8J002A+CzDgdSyGY7Wt0THqNynqKuTNe2u03lkJQPC A9X/Nh6xF+uyQqDVyzIQsZCH26tH8aXQcEUlY3db6T58jm/GOCpp0FO/LZf+kYJZbOhS 3YKoSiJovKWkKFUrgX2krSMDkqc6MDSmUdQi2oMLUcoBOU0NFjw4JI7no2xu7jUFHiFc JTcFQ3eJE/f/Bg7Jjc9cZn5wT/F1r2fx0kg4kdRUx0vt0IRmQ+WnRUdGQwYkckTxgIto EeAw== X-Gm-Message-State: APt69E09fth6Tzcip1CX9Wt2dGgIi/x8/tYh8WzwTD6hs1UncB2Zwqw8 JrsPDeoQxepjZ5jEKmh/UwTEvXHodX8= X-Google-Smtp-Source: ADUXVKKB2UWuGaHGl+q8FndEgNC24jWsTCNjTXNvX82paM3cqgkSxA9D7PagPypxWAshpETLpsWg3A== X-Received: by 2002:a62:df89:: with SMTP id d9-v6mr5769449pfl.147.1529546091633; Wed, 20 Jun 2018 18:54:51 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:48 -1000 Message-Id: <20180621015359.12018-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22b Subject: [Qemu-devel] [PATCH v5 24/35] target/arm: Implement SVE floating-point convert to integer X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 30 +++++++++++++ target/arm/helper.h | 12 +++--- target/arm/helper.c | 2 +- target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 16 +++++++ 6 files changed, 211 insertions(+), 7 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4c379dbb05..37fa9eb9bb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index ad9cb6c7d5..8607077dda 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -134,12 +134,12 @@ DEF_HELPER_2(vfp_touid, i32, f64, ptr) DEF_HELPER_2(vfp_touizh, i32, f16, ptr) DEF_HELPER_2(vfp_touizs, i32, f32, ptr) DEF_HELPER_2(vfp_touizd, i32, f64, ptr) -DEF_HELPER_2(vfp_tosih, i32, f16, ptr) -DEF_HELPER_2(vfp_tosis, i32, f32, ptr) -DEF_HELPER_2(vfp_tosid, i32, f64, ptr) -DEF_HELPER_2(vfp_tosizh, i32, f16, ptr) -DEF_HELPER_2(vfp_tosizs, i32, f32, ptr) -DEF_HELPER_2(vfp_tosizd, i32, f64, ptr) +DEF_HELPER_2(vfp_tosih, s32, f16, ptr) +DEF_HELPER_2(vfp_tosis, s32, f32, ptr) +DEF_HELPER_2(vfp_tosid, s32, f64, ptr) +DEF_HELPER_2(vfp_tosizh, s32, f16, ptr) +DEF_HELPER_2(vfp_tosizs, s32, f32, ptr) +DEF_HELPER_2(vfp_tosizd, s32, f64, ptr) DEF_HELPER_3(vfp_toshs_round_to_zero, i32, f32, i32, ptr) DEF_HELPER_3(vfp_tosls_round_to_zero, i32, f32, i32, ptr) diff --git a/target/arm/helper.c b/target/arm/helper.c index 1248d84e6f..a36f5b1899 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -11360,7 +11360,7 @@ ftype HELPER(name)(uint32_t x, void *fpstp) \ } #define CONV_FTOI(name, ftype, fsz, sign, round) \ -uint32_t HELPER(name)(ftype x, void *fpstp) \ +sign##int32_t HELPER(name)(ftype x, void *fpstp) \ { \ float_status *fpst = fpstp; \ if (float##fsz##_is_any_nan(x)) { \ diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e1bbe1f550..497efbf3a8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3167,6 +3167,78 @@ static inline float16 sve_f64_to_f16(float64 f, float_status *s) return float64_to_float16(f, true, s); } +static inline int16_t vfp_float16_to_int16_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_int16_round_to_zero(f, s); +} + +static inline int64_t vfp_float16_to_int64_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_int64_round_to_zero(f, s); +} + +static inline int64_t vfp_float32_to_int64_rtz(float32 f, float_status *s) +{ + if (float32_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float32_to_int64_round_to_zero(f, s); +} + +static inline int64_t vfp_float64_to_int64_rtz(float64 f, float_status *s) +{ + if (float64_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float64_to_int64_round_to_zero(f, s); +} + +static inline uint16_t vfp_float16_to_uint16_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_uint16_round_to_zero(f, s); +} + +static inline uint64_t vfp_float16_to_uint64_rtz(float16 f, float_status *s) +{ + if (float16_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float16_to_uint64_round_to_zero(f, s); +} + +static inline uint64_t vfp_float32_to_uint64_rtz(float32 f, float_status *s) +{ + if (float32_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float32_to_uint64_round_to_zero(f, s); +} + +static inline uint64_t vfp_float64_to_uint64_rtz(float64 f, float_status *s) +{ + if (float64_is_any_nan(f)) { + float_raise(float_flag_invalid, s); + return 0; + } + return float64_to_uint64_round_to_zero(f, s); +} + DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, sve_f32_to_f16) DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, sve_f16_to_f32) DO_ZPZ_FP(sve_fcvt_dh, uint64_t, , sve_f64_to_f16) @@ -3174,6 +3246,22 @@ DO_ZPZ_FP(sve_fcvt_hd, uint64_t, , sve_f16_to_f64) DO_ZPZ_FP(sve_fcvt_ds, uint64_t, , float64_to_float32) DO_ZPZ_FP(sve_fcvt_sd, uint64_t, , float32_to_float64) +DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, vfp_float16_to_int16_rtz) +DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, helper_vfp_tosizh) +DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, helper_vfp_tosizs) +DO_ZPZ_FP(sve_fcvtzs_hd, uint64_t, , vfp_float16_to_int64_rtz) +DO_ZPZ_FP(sve_fcvtzs_sd, uint64_t, , vfp_float32_to_int64_rtz) +DO_ZPZ_FP(sve_fcvtzs_ds, uint64_t, , helper_vfp_tosizd) +DO_ZPZ_FP(sve_fcvtzs_dd, uint64_t, , vfp_float64_to_int64_rtz) + +DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, vfp_float16_to_uint16_rtz) +DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, helper_vfp_touizh) +DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, helper_vfp_touizs) +DO_ZPZ_FP(sve_fcvtzu_hd, uint64_t, , vfp_float16_to_uint64_rtz) +DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz) +DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd) +DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c08c1263f1..fd94c5337e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3970,6 +3970,76 @@ static bool trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); } +static bool trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh); +} + +static bool trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh); +} + +static bool trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs); +} + +static bool trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs); +} + +static bool trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd); +} + +static bool trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd); +} + +static bool trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss); +} + +static bool trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss); +} + +static bool trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd); +} + +static bool trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_sd); +} + +static bool trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds); +} + +static bool trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds); +} + +static bool trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd); +} + +static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fd45f51029..c4597d4fbe 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -824,6 +824,22 @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 +# SVE floating-point convert to integer +FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932503 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Ek5cy461"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4x40wLyz9s2L for ; Thu, 21 Jun 2018 12:13:12 +1000 (AEST) Received: from localhost ([::1]:52574 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp5x-0001FD-Nj for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:13:09 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39417) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooK-0004lG-5v for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooI-0003OZ-U0 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:56 -0400 Received: from mail-pg0-x22d.google.com ([2607:f8b0:400e:c05::22d]:36153) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooI-0003O6-LC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:54 -0400 Received: by mail-pg0-x22d.google.com with SMTP id m5-v6so647544pgd.3 for ; Wed, 20 Jun 2018 18:54:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=QIBCwmPy3yQFPEkh4dvSFFLAvuGXcxOXoxq2dHESWn8=; b=Ek5cy461GCteWjNZEZ4h8VxhkYlxZpZFz6sBIoWOWbxKs/Muui3X9hvs/WNjeB0ZG1 as7Ck+Z2hRa363uvHF0LkkU+fjgdUOV9Zd//CShEw0AJhxBcGif+scg/H+Izs4D/QNV2 sSFuaQ4QR4jgdQ8s5lUfYlu2W2HWihXR3SSDg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QIBCwmPy3yQFPEkh4dvSFFLAvuGXcxOXoxq2dHESWn8=; b=lnPP407+rDxRNxQy1ZfotS2WAXRVWTnkivpr/I1rLp5aK8LmVrrOOGPys1aMS0hm1+ MUwpte4Xxibw2D48QiK5+1JyTq07l7yCNBZ1px8UorGOI6kbDfO0vhPlTzfykFVUaiMe rBINQYiKi+/DSuTjJhaL8do0SzK87j6Vc5iuSqlxN/qozIb2UA3+awQvOc31d1iyXvea WPCp3VhsItMJFmr3W1zqEJXjr24OPvPDPE8ecI93W6M56qtz1IshcLSGvlj6P9Ebp4uf /pIELLZrkh2qMaNzV3TcTpqW4p0381y+BHXShu0k5rzGUkE/A9/rShGRfr2Gd0jjZh9+ I3TA== X-Gm-Message-State: APt69E3P2MIYWLrj+3tBZmYthJPTvRyG7dGJcRl+WKY6hNxZ+oQyD2GW ebEjAswuC2uf5kAr/EiHGx4q/CFK+cI= X-Google-Smtp-Source: ADUXVKJSeigqcGhFFxyvOX8GwwVPLstQJrSQS145P3Pt1iwXbZF11qZC7y4iws5YcJdAq2KIrkYp4A== X-Received: by 2002:a63:8048:: with SMTP id j69-v6mr21062875pgd.429.1529546093325; Wed, 20 Jun 2018 18:54:53 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.51 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:49 -1000 Message-Id: <20180621015359.12018-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22d Subject: [Qemu-devel] [PATCH v5 25/35] target/arm: Implement SVE floating-point round to integral value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 14 +++++++ target/arm/sve_helper.c | 8 ++++ target/arm/translate-sve.c | 77 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 9 +++++ 4 files changed, 108 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 37fa9eb9bb..36168c5bb2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 497efbf3a8..3b7a2f58c1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3262,6 +3262,14 @@ DO_ZPZ_FP(sve_fcvtzu_sd, uint64_t, , vfp_float32_to_uint64_rtz) DO_ZPZ_FP(sve_fcvtzu_ds, uint64_t, , helper_vfp_touizd) DO_ZPZ_FP(sve_fcvtzu_dd, uint64_t, , vfp_float64_to_uint64_rtz) +DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth) +DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints) +DO_ZPZ_FP(sve_frint_d, uint64_t, , helper_rintd) + +DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) +DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) +DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fd94c5337e..15e4f2888b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4040,6 +4040,83 @@ static bool trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); } +static gen_helper_gvec_3_ptr * const frint_fns[3] = { + gen_helper_sve_frint_h, + gen_helper_sve_frint_s, + gen_helper_sve_frint_d +}; + +static bool trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, + frint_fns[a->esz - 1]); +} + +static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_frintx_h, + gen_helper_sve_frintx_s, + gen_helper_sve_frintx_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 tmode = tcg_const_i32(mode); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + gen_helper_set_rmode(tmode, tmode, status); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, frint_fns[a->esz - 1]); + + gen_helper_set_rmode(tmode, tmode, status); + tcg_temp_free_i32(tmode); + tcg_temp_free_ptr(status); + } + return true; +} + +static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_nearest_even); +} + +static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_up); +} + +static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_down); +} + +static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_to_zero); +} + +static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_frint_mode(s, a, float_round_ties_away); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c4597d4fbe..eeab3d485f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -840,6 +840,15 @@ FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +# SVE floating-point round to integral value +FRINTN 01100101 .. 000 000 101 ... ..... ..... @rd_pg_rn +FRINTP 01100101 .. 000 001 101 ... ..... ..... @rd_pg_rn +FRINTM 01100101 .. 000 010 101 ... ..... ..... @rd_pg_rn +FRINTZ 01100101 .. 000 011 101 ... ..... ..... @rd_pg_rn +FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn +FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn +FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932511 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="H1HfCJQ0"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B53Z0Xb9z9s3R for ; Thu, 21 Jun 2018 12:18:49 +1000 (AEST) Received: from localhost ([::1]:52604 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpBP-0005dX-Gv for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:18:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39465) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooO-0004oz-8g for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooK-0003Q9-Ux for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:38078) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooK-0003PO-PL for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:56 -0400 Received: by mail-pl0-x235.google.com with SMTP id d10-v6so757826plo.5 for ; Wed, 20 Jun 2018 18:54:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LHWFtiyOKa+HrAxinQWN0q0OwYqueH9VPd2ad8t6RQs=; b=H1HfCJQ0nxatWBoXR0gS5E1IEyiaFdmQB2rJ61wgdCLtLpRrOHH0ZV7svJ+lLzTvzA Jyk8ziuFE6wscp0t5qgH4/nl7EDP6orGh3QDttx8jdYD7rZB0w7r6nh0S2bAuGm2yPb8 O+0Y7ElA1AgNCJ96irnVsfzPV5meiET+uNL0Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LHWFtiyOKa+HrAxinQWN0q0OwYqueH9VPd2ad8t6RQs=; b=sTYoExAkkhDxJj30tyiqaLw/g/f7eLFVUWtx3WLQ7mMw2F48oVNaSewe9D+sQneAGa hupDNSYTBAyIymWEQe7gFswx/lEQRmLs6SANFOynzohlk7NkwwJffDG7KrUaLm/I1E2V putLOOS667LVwytPJGoJxLuj4QGjU+nFy4GpsYLZ5Fh+nyEmjnOO1+lJebbaCPHKTk9X ACHQTcgK1HBY82x8hyOzVkuTE2+nBcRmibJ/TJe0KgCyEvF3KC3FsL1llPD5tqP11zVS N2GpzgSwR/N1Q9RcQWooFoto1rN9sy8mrWFsIo96mLgvHCCF50ENkJ8HKgE6MiefdeXR Hq4w== X-Gm-Message-State: APt69E24xNpyS/By82YxuCs4fBKMkQQzaxV3NQYwxhkYkyRsj3swu9ee wd63RXfc5zpgusT9QV+aJ/kMoKHnM3U= X-Google-Smtp-Source: ADUXVKL9Y+/YMu5ofjA4SJOzQ8ZyMEPX4EAXa7w3BT5nB6L0S/2oHG1Pn1xfY6zcsP3/A0K85mfQ/Q== X-Received: by 2002:a17:902:b582:: with SMTP id a2-v6mr26632985pls.335.1529546095408; Wed, 20 Jun 2018 18:54:55 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.53 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:50 -1000 Message-Id: <20180621015359.12018-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 26/35] target/arm: Implement SVE floating-point unary operations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 14 ++++++++++++++ target/arm/sve_helper.c | 8 ++++++++ target/arm/translate-sve.c | 26 ++++++++++++++++++++++++++ target/arm/sve.decode | 4 ++++ 4 files changed, 52 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 36168c5bb2..891346a5ac 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3b7a2f58c1..5309cf0866 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3270,6 +3270,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int) DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int) DO_ZPZ_FP(sve_frintx_d, uint64_t, , float64_round_to_int) +DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16) +DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32) +DO_ZPZ_FP(sve_frecpx_d, uint64_t, , helper_frecpx_f64) + +DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt) +DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt) +DO_ZPZ_FP(sve_fsqrt_d, uint64_t, , float64_sqrt) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 15e4f2888b..308c04de89 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4117,6 +4117,32 @@ static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_frint_mode(s, a, float_round_ties_away); } +static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_frecpx_h, + gen_helper_sve_frecpx_s, + gen_helper_sve_frecpx_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + +static bool trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_fsqrt_h, + gen_helper_sve_fsqrt_s, + gen_helper_sve_fsqrt_d + }; + if (a->esz == 0) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); +} + static bool trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { return do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index eeab3d485f..94d7b157b4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -849,6 +849,10 @@ FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn FRINTX 01100101 .. 000 110 101 ... ..... ..... @rd_pg_rn FRINTI 01100101 .. 000 111 101 ... ..... ..... @rd_pg_rn +# SVE floating-point unary operations +FRECPX 01100101 .. 001 100 101 ... ..... ..... @rd_pg_rn +FSQRT 01100101 .. 001 101 101 ... ..... ..... @rd_pg_rn + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Thu Jun 21 01:53:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932513 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="AFLbihHr"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B5636fttz9s2L for ; Thu, 21 Jun 2018 12:20:58 +1000 (AEST) Received: from localhost ([::1]:52620 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpDT-0000Rx-70 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:20:55 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39464) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooO-0004ov-8C for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooM-0003Qt-Ga for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: from mail-pg0-x232.google.com ([2607:f8b0:400e:c05::232]:44093) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooM-0003Qe-Ab for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:54:58 -0400 Received: by mail-pg0-x232.google.com with SMTP id g26-v6so636820pgn.11 for ; Wed, 20 Jun 2018 18:54:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/0h8A3BZDF8RXvY+bWdtcrfvhJEFmIbQwJk4x2l8+Dk=; b=AFLbihHryh2nCHH15bOVKtmQXN2BBDJWIdp/FaNFfUOtsm3fAedPN81Cq7odxYIusQ zd3OPyXErv+P9eUG9BiS0fpiAJr8FRyOOps0X1ITtKaMcMDeH2LOTbxy+H9yJIawFLVq zHY0cu2IdvGWl/dA9J3IvMCoVfi8V0hLN1mPY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/0h8A3BZDF8RXvY+bWdtcrfvhJEFmIbQwJk4x2l8+Dk=; b=mSbjkEtSneLUxlpOJ9i7N/JZvppN3NJsMOXXE7s3o7JImpIu4YAhSoub8Q9M4CA9Eq v6VATYJxR4SJ5PtLouA7RJmKSlFTm5WDITFGezD4NU1LvqlWkqNXN5UG5gj/5PN45dcm dzBxof1mUORPsHaycutztZbQ/3Lsf9iDASG0HL4EgfjUPEVY4IUlqVT5wh3Vsy+Ktx7w LmqwIjxZYiMzbaoFs523ZMwGqBlOv6Xb8TEaB6MCEcxiW366K5shEpCmxmlnhs5eAYGP xyjCw2u/qGhtmOe4O+SICJANOaLLHJ5BZh3YGQUUI5YdkYEuohNzvP09/zj0t+DWO7Yd S8zA== X-Gm-Message-State: APt69E1z5Zo/7AzJ5eorxtzQzUhPoM2TSj/xRz+XW8yCv0x84lqPvNx1 5gwjyZWxow3r0HjqzRkPUOeLhb7qkAI= X-Google-Smtp-Source: ADUXVKIkRuVf2RJ8+VmDPJy5xqhbEqwR5l9ddc2Uzde9rWg/37MfG7k+ij8eucW8moW/ih++c8mGwQ== X-Received: by 2002:a62:8202:: with SMTP id w2-v6mr25345344pfd.32.1529546097046; Wed, 20 Jun 2018 18:54:57 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.55 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:51 -1000 Message-Id: <20180621015359.12018-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::232 Subject: [Qemu-devel] [PATCH v5 27/35] target/arm: Implement SVE MOVPRFX X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/translate-sve.c | 60 +++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 7 +++++ 2 files changed, 66 insertions(+), 1 deletion(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 308c04de89..067c219b54 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -351,6 +351,23 @@ static bool do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn) return true; } +/* Select active elememnts from Zn and inactive elements from Zm, + * storing the result in Zd. + */ +static void do_sel_z(DisasContext *s, int rd, int rn, int rm, int pg, int esz) +{ + static gen_helper_gvec_4 * const fns[4] = { + gen_helper_sve_sel_zpzz_b, gen_helper_sve_sel_zpzz_h, + gen_helper_sve_sel_zpzz_s, gen_helper_sve_sel_zpzz_d + }; + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + #define DO_ZPZZ(NAME, name) \ static bool trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \ uint32_t insn) \ @@ -401,7 +418,13 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) return do_zpzz_ool(s, a, fns[a->esz]); } -DO_ZPZZ(SEL, sel) +static bool trans_SEL_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_sel_z(s, a->rd, a->rn, a->rm, a->pg, a->esz); + } + return true; +} #undef DO_ZPZZ @@ -5038,3 +5061,38 @@ static bool trans_PRF_rr(DisasContext *s, arg_PRF_rr *a, uint32_t insn) sve_access_check(s); return true; } + +/* + * Move Prefix + * + * TODO: The implementation so far could handle predicated merging movprfx. + * The helper functions as written take an extra source register to + * use in the operation, but the result is only written when predication + * succeeds. For unpredicated movprfx, we need to rearrange the helpers + * to allow the final write back to the destination to be unconditional. + * For predicated zering movprfz, we need to rearrange the helpers to + * allow the final write back to zero inactives. + * + * In the meantime, just emit the moves. + */ + +static bool trans_MOVPRFX(DisasContext *s, arg_MOVPRFX *a, uint32_t insn) +{ + return do_mov_z(s, a->rd, a->rn); +} + +static bool trans_MOVPRFX_m(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_sel_z(s, a->rd, a->rn, a->rd, a->pg, a->esz); + } + return true; +} + +static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz); + } + return true; +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 94d7b157b4..85f2b39776 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -266,6 +266,10 @@ ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn +# SVE constructive prefix (predicated) +MOVPRFX_z 00000100 .. 010 000 001 ... ..... ..... @rd_pg_rn +MOVPRFX_m 00000100 .. 010 001 001 ... ..... ..... @rd_pg_rn + # SVE integer add reduction (predicated) # Note that saddv requires size != 3. UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn @@ -414,6 +418,9 @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ### SVE Integer Misc - Unpredicated Group +# SVE constructive prefix (unpredicated) +MOVPRFX 00000100 00 1 00000 101111 rn:5 rd:5 + # SVE floating-point exponential accelerator # Note esz != 0 FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn From patchwork Thu Jun 21 01:53:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932514 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="IaRvnxPH"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B57V0bpTz9s2L for ; Thu, 21 Jun 2018 12:22:14 +1000 (AEST) Received: from localhost ([::1]:52631 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpEh-0001bM-RF for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:22:11 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39492) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooQ-0004rB-8L for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooO-0003Sc-H6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:02 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:43907) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooO-0003Rs-7u for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:00 -0400 Received: by mail-pl0-x235.google.com with SMTP id c41-v6so746764plj.10 for ; Wed, 20 Jun 2018 18:55:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=li9h4tydA6NYZ7tm8NAvM0RkuO1DvOT8U6gU0EDKrec=; b=IaRvnxPHffmQGsxyZo10mwgRGxXD2Jcri9Crz7ytPRU/yckYyxv7X/Na3029U8H4at dBnb8v8SHwsw1UyX5iyNkONbyYbr1xqIWKEG1BTMpGF0fy77OSFeTHQ4Gd8b568EiJ8g TBzhlhOS9cRrBd9C2WsM/G454UsJYsFrntSok= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=li9h4tydA6NYZ7tm8NAvM0RkuO1DvOT8U6gU0EDKrec=; b=DjtZdkzKUJL4+L9AlJzcns5NK7Lq0HbtKpnGJVvrclXmYLnSxvySM3pk4XKLTneDyH +neIHvye2t7AVk97z7RY/zBMMVwr0EXHMrKnIJrojxvEw5QzsHO5Mc4frRNG7gq+dnkc shMW2F8eY7dHjwChgHLr6VZclEjfNXlHxT7IGJokEyJbbPZsanmvUCnQ8oYoCXZDidEH x/7ZwfreiwsbvRnb/I0QJYRR4b23cDH2A+bz1G37WZmgsHxKJU7+f2o8xc0H3oJTwHuJ kJeA4OpA31fxlEHlOY68gjMXLcADvVkEgd1uWOaa5+zCbeGmvqDavJwyD0CdacqJYXhT ecSw== X-Gm-Message-State: APt69E3h5HECrMOjBi44rJs6TBE2oKuA9ZAkQbq0M386aJhfYVQZmOVb U+zOM7QztCYvl6tbyAzEY/sTK+S3Qxo= X-Google-Smtp-Source: ADUXVKKmMlFi5hbJLQHT4OiV7vvfRGSaCghBcVs1BdpqqqYGW7lKH2Cw1Wv33cJ/X9KL0gC5GRbxsw== X-Received: by 2002:a17:902:b18b:: with SMTP id s11-v6mr26452889plr.190.1529546098935; Wed, 20 Jun 2018 18:54:58 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:54:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:52 -1000 Message-Id: <20180621015359.12018-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 28/35] target/arm: Implement SVE floating-point complex add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 7 +++ target/arm/sve_helper.c | 100 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 24 +++++++++ target/arm/sve.decode | 4 ++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 891346a5ac..0bd9fe2f28 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1092,6 +1092,13 @@ DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5309cf0866..ee7fc23bb9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3629,6 +3629,106 @@ void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) } } +/* + * FP Complex Add + */ + +void HELPER(sve_fcadd_h)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float16 neg_imag = float16_set_sign(0, simd_data(desc)); + float16 neg_real = float16_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float16 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float16); + i -= 2 * sizeof(float16); + + e0 = *(float16 *)(vn + H1_2(i)); + e1 = *(float16 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float16 *)(vn + H1_2(j)); + e3 = *(float16 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float16 *)(vd + H1_2(i)) = float16_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float16 *)(vd + H1_2(j)) = float16_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcadd_s)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float32 neg_imag = float32_set_sign(0, simd_data(desc)); + float32 neg_real = float32_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float32 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float32); + i -= 2 * sizeof(float32); + + e0 = *(float32 *)(vn + H1_2(i)); + e1 = *(float32 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float32 *)(vn + H1_2(j)); + e3 = *(float32 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float32 *)(vd + H1_2(i)) = float32_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float32 *)(vd + H1_2(j)) = float32_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, + void *vs, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + uint64_t *g = vg; + float64 neg_imag = float64_set_sign(0, simd_data(desc)); + float64 neg_real = float64_chs(neg_imag); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float64 e0, e1, e2, e3; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float64); + i -= 2 * sizeof(float64); + + e0 = *(float64 *)(vn + H1_2(i)); + e1 = *(float64 *)(vm + H1_2(j)) ^ neg_real; + e2 = *(float64 *)(vn + H1_2(j)); + e3 = *(float64 *)(vm + H1_2(i)) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + *(float64 *)(vd + H1_2(i)) = float64_add(e0, e1, vs); + } + if (likely((pg >> (j & 63)) & 1)) { + *(float64 *)(vd + H1_2(j)) = float64_add(e2, e3, vs); + } + } while (i & 63); + } while (i != 0); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 067c219b54..7a39be9bdd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3895,6 +3895,30 @@ DO_FPCMP(FACGT, facgt) #undef DO_FPCMP +static bool trans_FCADD(DisasContext *s, arg_FCADD *a, uint32_t insn) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_sve_fcadd_h, + gen_helper_sve_fcadd_s, + gen_helper_sve_fcadd_d + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, a->rot, fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 85f2b39776..7b5ada1311 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -721,6 +721,10 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +# SVE floating-point complex add (predicated) +FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ + rn=%reg_movprfx + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932480 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="GO4WWikd"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4lN1cTgz9s3R for ; Thu, 21 Jun 2018 12:04:48 +1000 (AEST) Received: from localhost ([::1]:52529 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoxp-0003AQ-Rk for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:04:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39507) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooS-0004tA-1x for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooQ-0003Tx-MC for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:04 -0400 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:35973) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooQ-0003TM-Dh for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:02 -0400 Received: by mail-pf0-x242.google.com with SMTP id a12-v6so689780pfi.3 for ; Wed, 20 Jun 2018 18:55:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=knsSl75J+cMYx24an5DGn279NUCI/wNVkLxKUvInCr0=; b=GO4WWikdwc74BHSAljIwlBGSvTkcetCiWOnbzCpa773a3vk+fsoO/MuyA0SOnKRETq mIWA17Tqhb9tTWt/IKfS4qUCO6IvGjMFJ1/hvFPECg/FKX9vMsLKvxRkkwq9MA+jmJPe ra3wvMhmC7Zim2aukV5tJpXOEyFIaSOVpF2l8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=knsSl75J+cMYx24an5DGn279NUCI/wNVkLxKUvInCr0=; b=WuxLD1iRrzGOLGguuoB/fJoNzzAv/EApF2+JICyKW2XVfLTl0Zsz0EusKJXJ2j5B4C 2e+YTjYmTLgyl5MKkknKbsvv0IfzLN3sgiuAsvaAPEEuN7wlcNJROgX4jidN3D17znqq cqpRe1Ds+IK4OLADZZu9HXbG2SrWN7TYy61g338ugdh3DiVj8XJP82XlvPCA2TbPiex6 tSLR9pQmvIGOJoC7HYB5eoJpUP/5G/RaLzQJ1EKr26RE+nnFOLODK3Mmhs9T0OiQLzwQ lKPQ+0jXi6iskiohLdlrEEpfoygxXxKqtSyk9utZrJk/UPj8DMYe2odQVrAbTOqiBBSL iNZw== X-Gm-Message-State: APt69E0OAvii7n8KkPt6nmEOKVM/shsHIRI6A6oEJNG9W1NqcCgMrvvI OkCjwMXrAmad4/FGVkin7GqhxuMWEHY= X-Google-Smtp-Source: ADUXVKJy4edaPKnJq4eEKTIKm1XCt1AIn2lMlAFke3rh4Xom97EhkIxTIihz9eN3aOQBwuMkbdD6vQ== X-Received: by 2002:aa7:83d1:: with SMTP id j17-v6mr25500360pfn.236.1529546100920; Wed, 20 Jun 2018 18:55:00 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.54.59 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:53 -1000 Message-Id: <20180621015359.12018-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v5 29/35] target/arm: Implement SVE fp complex multiply add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper-sve.h | 4 + target/arm/sve_helper.c | 162 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 37 +++++++++ target/arm/sve.decode | 4 + 4 files changed, 207 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0bd9fe2f28..023952a9a4 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1115,6 +1115,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ee7fc23bb9..cd3dfc8b26 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3729,6 +3729,168 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *vm, void *vg, } while (i != 0); } +/* + * FP Complex Multiply + */ + +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32); + +void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float16 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float16_set_sign(0, (rot & 2) != 0); + neg_real = float16_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float16 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float16); + i -= 2 * sizeof(float16); + + nr = *(float16 *)(vn + H1_2(i)); + ni = *(float16 *)(vn + H1_2(j)); + mr = *(float16 *)(vm + H1_2(i)); + mi = *(float16 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float16 *)(va + H1_2(i)); + d = float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16); + *(float16 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float16 *)(va + H1_2(j)); + d = float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16); + *(float16 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float32 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float32_set_sign(0, (rot & 2) != 0); + neg_real = float32_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float32 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float32); + i -= 2 * sizeof(float32); + + nr = *(float32 *)(vn + H1_2(i)); + ni = *(float32 *)(vn + H1_2(j)); + mr = *(float32 *)(vm + H1_2(i)); + mi = *(float32 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float32 *)(va + H1_2(i)); + d = float32_muladd(e2, e1, d, 0, &env->vfp.fp_status); + *(float32 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float32 *)(va + H1_2(j)); + d = float32_muladd(e4, e3, d, 0, &env->vfp.fp_status); + *(float32 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + +void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +{ + intptr_t j, i = simd_oprsz(desc); + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); + unsigned rot = extract32(desc, SIMD_DATA_SHIFT + 20, 2); + bool flip = rot & 1; + float64 neg_imag, neg_real; + void *vd = &env->vfp.zregs[rd]; + void *vn = &env->vfp.zregs[rn]; + void *vm = &env->vfp.zregs[rm]; + void *va = &env->vfp.zregs[ra]; + uint64_t *g = vg; + + neg_imag = float64_set_sign(0, (rot & 2) != 0); + neg_real = float64_set_sign(0, rot == 1 || rot == 2); + + do { + uint64_t pg = g[(i - 1) >> 6]; + do { + float64 e1, e2, e3, e4, nr, ni, mr, mi, d; + + /* I holds the real index; J holds the imag index. */ + j = i - sizeof(float64); + i -= 2 * sizeof(float64); + + nr = *(float64 *)(vn + H1_2(i)); + ni = *(float64 *)(vn + H1_2(j)); + mr = *(float64 *)(vm + H1_2(i)); + mi = *(float64 *)(vm + H1_2(j)); + + e2 = (flip ? ni : nr); + e1 = (flip ? mi : mr) ^ neg_real; + e4 = e2; + e3 = (flip ? mr : mi) ^ neg_imag; + + if (likely((pg >> (i & 63)) & 1)) { + d = *(float64 *)(va + H1_2(i)); + d = float64_muladd(e2, e1, d, 0, &env->vfp.fp_status); + *(float64 *)(vd + H1_2(i)) = d; + } + if (likely((pg >> (j & 63)) & 1)) { + d = *(float64 *)(va + H1_2(j)); + d = float64_muladd(e4, e3, d, 0, &env->vfp.fp_status); + *(float64 *)(vd + H1_2(j)) = d; + } + } while (i & 63); + } while (i != 0); +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7a39be9bdd..6487fe760a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3968,6 +3968,43 @@ DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) #undef DO_FMLA +static bool trans_FCMLA_zpzzz(DisasContext *s, + arg_FCMLA_zpzzz *a, uint32_t insn) +{ + static gen_helper_sve_fmla * const fns[3] = { + gen_helper_sve_fcmla_zpzzz_h, + gen_helper_sve_fcmla_zpzzz_s, + gen_helper_sve_fcmla_zpzzz_d, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg = tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. + */ + desc = deposit32(a->rd, 5, 5, a->rn); + desc = deposit32(desc, 10, 5, a->rm); + desc = deposit32(desc, 15, 5, a->ra); + desc = deposit32(desc, 20, 2, a->rot); + desc = sextract32(desc, 0, 22); + desc = simd_desc(vsz, vsz, desc); + + t_desc = tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fns[a->esz - 1](cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); + } + return true; +} + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7b5ada1311..da89697700 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -725,6 +725,10 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx +# SVE floating-point complex multiply-add (predicated) +FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \ + ra=%reg_movprfx + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932515 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="iZl8jqOl"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B58q5KJbz9s4v for ; Thu, 21 Jun 2018 12:23:23 +1000 (AEST) Received: from localhost ([::1]:52635 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpFp-0002Be-FH for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:23:21 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39524) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooT-0004uH-5T for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooS-0003Ut-Bq for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:05 -0400 Received: from mail-pl0-x235.google.com ([2607:f8b0:400e:c01::235]:36790) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooS-0003UV-5W for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:04 -0400 Received: by mail-pl0-x235.google.com with SMTP id a7-v6so758236plp.3 for ; Wed, 20 Jun 2018 18:55:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=NuYhoUhPplWoUIEzViyhFHHhbuisvWgK7aXq7dj7Bws=; b=iZl8jqOl7KrFWi80RF2LGXQt82p2pfqtffZyTACNRDOWHwWYYl/rx6ZJdHdS3RfWNf nVR7y1alA9zxk71vOKbKH5njN1hGyc6Owt1HtRhV0QZmyx1ECW1DdfW14TeF63xhtayC xwRMMBtEMxH5jIr0WlGbJggCEQallIpWJ3VLU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=NuYhoUhPplWoUIEzViyhFHHhbuisvWgK7aXq7dj7Bws=; b=QyDoObEchFY5h02hd1ug4Nf3iQEZdToXng2eiDL6H/jwpiuK5TxtNs6Y5elhdeVpFK 47f10oS+b6WEE44gbU5h2WtftCnb87+rlfAwiKhocP9nVNn6C/ukHyWD4t61csN7N+hq skQhxb+Hh4/xfqkbrbeyqnCb/OxoQ5rIuLjMtCdDQgDDniFB5/dkshjNaY+fvDIXHcUG o26BjjzE8LYrjD30o/BUm/ZAY5jTuUn5cXtdIi+Yz70/nQIscZR6bUNpMcK/NRHwMXYB SpEG3RCUjc4e8smBJGldTJXR6KkjJBQoFGmIHK6GSG4ontXQryPxa5PJRsfgtJewEwEG fhGQ== X-Gm-Message-State: APt69E0e8q7KX3KPqRjn5jeQ59aFny74qcLTFJD+fMYKJ5c5EQNuEIGU 9dBYVtuRAX+ZHFhtjbP5XcRfbJLKqDc= X-Google-Smtp-Source: ADUXVKKOtCJ6YS0tsmGLo1DjKrgp0ikOfRmwB9xaeHxVKoCu0dbkabGQNHRQSzmv9m6Qa1IYDfW3FQ== X-Received: by 2002:a17:902:9a06:: with SMTP id v6-v6mr26242519plp.21.1529546102935; Wed, 20 Jun 2018 18:55:02 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.01 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:54 -1000 Message-Id: <20180621015359.12018-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::235 Subject: [Qemu-devel] [PATCH v5 30/35] target/arm: Pass index to AdvSIMD FCMLA (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The original commit failed to pass, or use, the index. Fixes: d17b7cdcf4ea Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 21 ++++++++++++--------- target/arm/vec_helper.c | 10 ++++++---- 2 files changed, 18 insertions(+), 13 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 8d8a4cecb0..038e48278f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12669,15 +12669,18 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ case 0x17: /* FCMLA #270 */ - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_reg_offset(s, rm, index, size), fpst, - is_q ? 16 : 8, vec_full_reg_size(s), - extract32(insn, 13, 2), /* rot */ - size == MO_64 - ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - tcg_temp_free_ptr(fpst); + { + int rot = extract32(insn, 13, 2); + int data = index * 4 + rot; + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_reg_offset(s, rm, index, size), fpst, + is_q ? 16 : 8, vec_full_reg_size(s), data, + size == MO_64 + ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); + tcg_temp_free_ptr(fpst); + } return; } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 073e5c58e7..8f2dc4b989 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -317,10 +317,11 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; - float16 e1 = m[H2(flip)]; - float16 e3 = m[H2(1 - flip)]; + float16 e1 = m[H2(2 * index + flip)]; + float16 e3 = m[H2(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 15; @@ -377,10 +378,11 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); + intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; uintptr_t i; - float32 e1 = m[H4(flip)]; - float32 e3 = m[H4(1 - flip)]; + float32 e1 = m[H4(2 * index + flip)]; + float32 e3 = m[H4(2 * index + 1 - flip)]; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; From patchwork Thu Jun 21 01:53:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932510 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="AHckJE8T"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B51W2Cdmz9s3R for ; Thu, 21 Jun 2018 12:17:03 +1000 (AEST) Received: from localhost ([::1]:52597 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp9g-0004GD-U3 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:17:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39567) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooY-0004zQ-3S for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooU-0003WH-7h for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:10 -0400 Received: from mail-pl0-x22b.google.com ([2607:f8b0:400e:c01::22b]:38069) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooU-0003Vg-04 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:06 -0400 Received: by mail-pl0-x22b.google.com with SMTP id d10-v6so757999plo.5 for ; Wed, 20 Jun 2018 18:55:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cTkQOLsJ7w2QJCExRny2kUXfmlF9jgGlwgL2VbhhBnA=; b=AHckJE8Tqqlj9FzWbFYAQTxzHTf5m9XO/dr6iag24efxQvjWTx6rtwQ+JL2kL6RlDp +VHDChJNjUNAgJrgeV4CmWPjxvB0xcJEl+GQsYy5948wK9dni+gNnRM8/+Wd6PkfWtK7 jpdMHLlF/QJb1+gCbjoFpDR/9Gbtpvx8kuUPQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cTkQOLsJ7w2QJCExRny2kUXfmlF9jgGlwgL2VbhhBnA=; b=j+fYh21dZ30JrVSYdZkHWtKhkv9dcg86JZFpo8hv+QbaaoZ0Lw32mut0jtKwGuj7SV CRLOP2qNbAF2xcD2FDR883hbWMnpVTSYPkvlv+tIizxJ+oFV4Tg2g7yxBgAJDNCm61Cn coX+3JMRjpqUvhxh6W42AlUEx6OeF01JxlwFWEF622WEalSY7BHZgkjp2uSMO4zVP4a/ VU9BsWuGiYC8HZzTdIOkpfFEgqPr3EcLqpEXzCFTFEqk9hQPiMOqW8kEPGpetdmFZGbH fJpYsOx/+tNcM+dh4NPWpEYq4zfobMTQxJjZBrvuVUTNY3TvSiQyBWqMNN/o6mTrSb5J UG6Q== X-Gm-Message-State: APt69E2CDHUhV02W+ODflvH/vNZ0Kna1ZNJRtjJZpKonPldtHtLnIb0M DfDjW8k3XEbGjGDtnNDUKFnc6VEZpzM= X-Google-Smtp-Source: ADUXVKLGAeNcNQLF/PS5/sSCtneqzX8+wLe/CNAFsMTzJTYxva1MNa0D5I8Hey9ke4eOhoByKYbeEQ== X-Received: by 2002:a17:902:650a:: with SMTP id b10-v6mr26568033plk.45.1529546104693; Wed, 20 Jun 2018 18:55:04 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.03 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:55 -1000 Message-Id: <20180621015359.12018-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22b Subject: [Qemu-devel] [PATCH v5 31/35] target/arm: Implement SVE fp complex multiply add (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Enhance the existing helpers to support SVE, which takes the index from each 128-bit segment. The change has no effect for AdvSIMD, since there is only one such segment. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 23 ++++++++++++++++++ target/arm/vec_helper.c | 50 +++++++++++++++++++++++--------------- target/arm/sve.decode | 6 +++++ 3 files changed, 59 insertions(+), 20 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6487fe760a..209a69cd76 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4005,6 +4005,29 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, return true; } +static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_fcmlah_idx, + gen_helper_gvec_fcmlas_idx, + }; + + tcg_debug_assert(a->esz == 1 || a->esz == 2); + tcg_debug_assert(a->rd == a->ra); + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, + a->index * 4 + a->rot, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); + } + return true; +} + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8f2dc4b989..db5aeb9f24 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -319,22 +319,27 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; - uintptr_t i; - float16 e1 = m[H2(2 * index + flip)]; - float16 e3 = m[H2(2 * index + 1 - flip)]; + intptr_t elements = opr_sz / sizeof(float16); + intptr_t eltspersegment = 16 / sizeof(float16); + intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 15; neg_imag <<= 15; - e1 ^= neg_real; - e3 ^= neg_imag; - for (i = 0; i < opr_sz / 2; i += 2) { - float16 e2 = n[H2(i + flip)]; - float16 e4 = e2; + for (i = 0; i < elements; i += eltspersegment) { + float16 mr = m[H2(i + 2 * index + 0)]; + float16 mi = m[H2(i + 2 * index + 1)]; + float16 e1 = neg_real ^ (flip ? mi : mr); + float16 e3 = neg_imag ^ (flip ? mr : mi); - d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst); - d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst); + for (j = i; j < i + eltspersegment; j += 2) { + float16 e2 = n[H2(j + flip)]; + float16 e4 = e2; + + d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst); + d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst); + } } clear_tail(d, opr_sz, simd_maxsz(desc)); } @@ -380,22 +385,27 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2); uint32_t neg_real = flip ^ neg_imag; - uintptr_t i; - float32 e1 = m[H4(2 * index + flip)]; - float32 e3 = m[H4(2 * index + 1 - flip)]; + intptr_t elements = opr_sz / sizeof(float32); + intptr_t eltspersegment = 16 / sizeof(float32); + intptr_t i, j; /* Shift boolean to the sign bit so we can xor to negate. */ neg_real <<= 31; neg_imag <<= 31; - e1 ^= neg_real; - e3 ^= neg_imag; - for (i = 0; i < opr_sz / 4; i += 2) { - float32 e2 = n[H4(i + flip)]; - float32 e4 = e2; + for (i = 0; i < elements; i += eltspersegment) { + float32 mr = m[H4(i + 2 * index + 0)]; + float32 mi = m[H4(i + 2 * index + 1)]; + float32 e1 = neg_real ^ (flip ? mi : mr); + float32 e3 = neg_imag ^ (flip ? mr : mi); - d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst); - d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst); + for (j = i; j < i + eltspersegment; j += 2) { + float32 e2 = n[H4(j + flip)]; + float32 e4 = e2; + + d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst); + d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst); + } } clear_tail(d, opr_sz, simd_maxsz(desc)); } diff --git a/target/arm/sve.decode b/target/arm/sve.decode index da89697700..b578d104c4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -729,6 +729,12 @@ FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ FCMLA_zpzzz 01100100 esz:2 0 rm:5 0 rot:2 pg:3 rn:5 rd:5 \ ra=%reg_movprfx +# SVE floating-point complex multiply-add (indexed) +FCMLA_zzxz 01100100 10 1 index:2 rm:3 0001 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx esz=1 +FCMLA_zzxz 01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx esz=2 + ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) From patchwork Thu Jun 21 01:53:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932517 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="SJoBpV6F"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B5CW3tKjz9s2L for ; Thu, 21 Jun 2018 12:25:43 +1000 (AEST) Received: from localhost ([::1]:52648 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpI5-0003m7-Ai for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:25:41 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39570) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooY-0004zU-43 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooV-0003XO-ON for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:10 -0400 Received: from mail-pf0-x234.google.com ([2607:f8b0:400e:c00::234]:46309) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooV-0003Wt-HZ for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:07 -0400 Received: by mail-pf0-x234.google.com with SMTP id q1-v6so677329pff.13 for ; Wed, 20 Jun 2018 18:55:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=/jTgOXL9VwGXcuJ4oLcHRPbBUE2s7YvX1Bg6uCTWz1E=; b=SJoBpV6F3WRiO59/SsA0BQQlylcCDgPgeBoDp+ifaPxu/UZRDpe+L2buLPyZMGIQiJ Y5IrwleDOBB89I4JdrkbNGPnH7GuGZ1gtfq/YBKWsA67P+b7a7w4isl4jEm+5qAT5i4K JY7x8zvCZq9CdE2Of1j4i5UJRtATQKwFSGYWM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=/jTgOXL9VwGXcuJ4oLcHRPbBUE2s7YvX1Bg6uCTWz1E=; b=ZQUTioejahW057vf2kWVM2yBX/0WHP6jgEZnwruFp6wgtoKjYV++XOUZUDPHDbPr/F xN4/YK4XQHFpjDk6U6rN5uSxDETCWf7nCt+ZazMl4zrW3A1d4OsLSUr9KoGiHxvcLs5X mXTbdretrTwCs/d4rhNvg9jFjrvUedOvgz5Wm/3drNedswnpX+iS84oObnVF6V9lCrX9 L4tUa5ohJmc6WdSl0DTPquS349bLQKdlN7ibiMS1iFPkneBoLr9lHNBU8AM1Ev1neIAi wCnqH/afs4klVGmex4jpfZWFNRqoV7vj8lZmUh5aQ4JSBYKxjoxR5zOA+65vzNDthUpd BlTg== X-Gm-Message-State: APt69E0shthRutxMMhXdldAMPF2hWzBlk32YtDs8kegVI/l93IrgO3qR XU0N/KRo8LlS8/8/ITirQJvj6iLlBfA= X-Google-Smtp-Source: ADUXVKLaylJbAy/cVCwsnxnVi9tOYxcaHJsmthPcKyz90dgtXFpUwmZyxdzc/tTtqjKpxmAhGRTztg== X-Received: by 2002:a62:f705:: with SMTP id h5-v6mr25491227pfi.169.1529546106269; Wed, 20 Jun 2018 18:55:06 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.04 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:56 -1000 Message-Id: <20180621015359.12018-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::234 Subject: [Qemu-devel] [PATCH v5 32/35] target/arm: Implement SVE dot product (vectors) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/helper.h | 5 +++ target/arm/translate-sve.c | 17 ++++++++++ target/arm/vec_helper.c | 67 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 3 ++ 4 files changed, 92 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8607077dda..e23ce7ff19 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -583,6 +583,11 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 209a69cd76..aa109208e5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3423,6 +3423,23 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[2][2] = { + { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h }, + { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h } + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->u][a->sz]); + } + return true; +} + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index db5aeb9f24..c16a30c3b5 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -194,6 +194,73 @@ void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +/* Integer 8 and 16-bit dot-product. + * + * Note that for the loops herein, host endianness does not matter + * with respect to the ordering of data within the 64-bit lanes. + * All elements are treated equally, no matter where they are. + */ + +void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd; + int8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd; + uint8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd; + int16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd; + uint16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b578d104c4..0b29da9f3a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -721,6 +721,9 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +# SVE integer dot product (unpredicated) +DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 + # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx From patchwork Thu Jun 21 01:53:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932516 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="aOZuopV2"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B5B75Gjlz9s2L for ; Thu, 21 Jun 2018 12:24:31 +1000 (AEST) Received: from localhost ([::1]:52638 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpGv-0002m8-Cm for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:24:29 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39590) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVooZ-00050I-BD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooX-0003Ya-P6 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:11 -0400 Received: from mail-pl0-x22f.google.com ([2607:f8b0:400e:c01::22f]:38073) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooX-0003Xw-GG for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:09 -0400 Received: by mail-pl0-x22f.google.com with SMTP id d10-v6so758077plo.5 for ; Wed, 20 Jun 2018 18:55:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=VTLYFUk8noLkXNDXuc9Ff1/aKekYRtej65w/CxbcRBo=; b=aOZuopV2Xits/v/nAXedaE3FJGRzUWFxnvI87kFStOR1xpGE3QD7HnYVh1fEwifXy4 ccJzOG4YXOjfaAMqIi4s3pOIWvXjxtqtRiAg56V+QIEy9VpNw/kKOazL8n1R3te5+497 DZxM9fUXtZBH1MFKCGCCMKJDADZfpqyEq3DHg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=VTLYFUk8noLkXNDXuc9Ff1/aKekYRtej65w/CxbcRBo=; b=C9DPXN0itOBrSVWPeF+E2fah96auQiREf18B8ISnoAmFN9uiB+eJCte0WvmxJrZiP4 59TUvyZK5Hzu2DzXaEl9ca0WL76aXeuCKtrPVQK6YuAsnYWo++hcy2ACuOMNYU3xpO+K vfBcjS6AKCxcunJpj6Zv3SPdEIc+dZ+UAe1OWqbQs13Cpm1xb6P7diQdaUagLdytPpkK UW8ba/ZS2b9mJQmW0ZaikAaYLlMCoKmDL5Gcv5hECVS+RzaWs+iPwEaGTeApOBc3EMuv bwWiobnSCdmtHIhOmFbj+FZcwCWctE54GyGPm+cY0F81o8xvGwIjsHYXmFf1zlFuG1YB 0R5g== X-Gm-Message-State: APt69E3gKc9szqq7zYnhqM5b9iQRMbnRAoMvt9PWtp+f/Gk1TXpbAkD9 5hmvJfqVwkaFCewKRhc0iHBKQ2TGSZg= X-Google-Smtp-Source: ADUXVKKxLe5DOhFGBqLq5t1GfkEheUt+rnCmciEJp8QG0Xj9FZ4yfZArD8doaybeMAUlMlch44kWLw== X-Received: by 2002:a17:902:22e:: with SMTP id 43-v6mr9206749plc.82.1529546108212; Wed, 20 Jun 2018 18:55:08 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.06 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:57 -1000 Message-Id: <20180621015359.12018-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22f Subject: [Qemu-devel] [PATCH v5 33/35] target/arm: Implement SVE dot product (indexed) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 ++ target/arm/translate-sve.c | 18 +++++++ target/arm/vec_helper.c | 96 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++- 4 files changed, 126 insertions(+), 1 deletion(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index e23ce7ff19..59e8c3bd1b 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -588,6 +588,11 @@ DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa109208e5..af2958be10 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3440,6 +3440,24 @@ static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a, uint32_t insn) return true; } +static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[2][2] = { + { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, + { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->index, fns[a->u][a->sz]); + } + return true; +} + + /* *** SVE Floating Point Multiply-Add Indexed Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c16a30c3b5..3117ee39cd 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -261,6 +261,102 @@ void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + uint32_t *d = vd; + int8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_4; i = j) { + int8_t m0 = m[(i + index) * 4 + 0]; + int8_t m1 = m[(i + index) * 4 + 1]; + int8_t m2 = m[(i + index) * 4 + 2]; + int8_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 4, opr_sz_4)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + uint32_t *d = vd; + uint8_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_4; i = j) { + uint8_t m0 = m[(i + index) * 4 + 0]; + uint8_t m1 = m[(i + index) * 4 + 1]; + uint8_t m2 = m[(i + index) * 4 + 2]; + uint8_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 4, opr_sz_4)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; + intptr_t index = simd_data(desc); + uint64_t *d = vd; + int16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_8; i = j) { + int64_t m0 = m[(i + index) * 4 + 0]; + int64_t m1 = m[(i + index) * 4 + 1]; + int64_t m2 = m[(i + index) * 4 + 2]; + int64_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 2, opr_sz_8)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; + intptr_t index = simd_data(desc); + uint64_t *d = vd; + uint16_t *n = vn, *m = vm; + + for (i = 0; i < opr_sz_8; i = j) { + uint64_t m0 = m[(i + index) * 4 + 0]; + uint64_t m1 = m[(i + index) * 4 + 1]; + uint64_t m2 = m[(i + index) * 4 + 2]; + uint64_t m3 = m[(i + index) * 4 + 3]; + + j = i; + do { + d[j] += n[j * 4 + 0] * m0 + + n[j * 4 + 1] * m1 + + n[j * 4 + 2] * m2 + + n[j * 4 + 3] * m3; + } while (++j < MIN(i + 2, opr_sz_8)); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_fcaddh)(void *vd, void *vn, void *vm, void *vfpst, uint32_t desc) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0b29da9f3a..f69ff285a1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -722,7 +722,13 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s # SVE integer dot product (unpredicated) -DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 +DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx + +# SVE integer dot product (indexed) +DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ + sz=0 ra=%reg_movprfx +DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ + sz=1 ra=%reg_movprfx # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ From patchwork Thu Jun 21 01:53:58 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932518 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Okl5CQko"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B5Dv1xVfz9s2L for ; Thu, 21 Jun 2018 12:26:55 +1000 (AEST) Received: from localhost ([::1]:52651 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVpJF-0004Mv-09 for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:26:53 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39617) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVoob-00050J-0M for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVooZ-0003a3-LD for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:43026) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVooZ-0003ZL-Ft for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:11 -0400 Received: by mail-pl0-x243.google.com with SMTP id c41-v6so747004plj.10 for ; Wed, 20 Jun 2018 18:55:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gLVizP9+OtQMsYfZB5CLtpDd26fS7Q4VxOJvrGxLcWk=; b=Okl5CQkoUBL0dxb31zKdiNx0CoQ7TviNHo/h07VEmSCUWfXIaQ38DlywpQ76qDh4tQ SZuXEDopk0BXZLUKB/yrvjAUH13iy8tYhwwtGVA8/zNQH3Lc1RThWhn9lFpz2LlzzKsN 7mfTIhxhXJ50SVliAvvArNnAScLqYgTe3FfpU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gLVizP9+OtQMsYfZB5CLtpDd26fS7Q4VxOJvrGxLcWk=; b=S9SKte8EbE8RMJJYHD0dBN8SZ+zDcvanlc6qrPzs80tqFVM+U1LkQxz1yN97JtCE9v uqFS6aCuZuvRGRjFCxv8fVAzR+oSDODxYZno3uibhX8trgznbpe4XZDoZjwQG4siKAaT Sm9nZfRbRFYtkFr6iGvOxnEwTJ1yiZEcB6d6RJJmvg0JdqAYOMkhGzAymXowJM0BFpCm P5oHx5wHuGwfAs4Mc0OvPWQDS/cDmOdtusWoeFX9rZ5Sz088eEQ2ry4ydRFAjiZyzsKt 1Bn7q5kr+GAMwNDaobNcmVdEhAGtsDh/ALEVmolFaY4MzaaWvQgDOrUFAS2MyRwl349s Aqtw== X-Gm-Message-State: APt69E39sm1oSZmyGVfyosvOmPSzsphsP5dO6uCvLmzzPfZRrPI07vTM YvwRbsLx5eWUtp1TQyNZm2u0e1RUT1c= X-Google-Smtp-Source: ADUXVKIIYAzyusJZ4Z2CwZitb9w1a4j4Ro+GgDczVUNmrpJEzUHrMJMD58ryb1dMUL8b24ov0nxZ0A== X-Received: by 2002:a17:902:1081:: with SMTP id c1-v6mr26228473pla.153.1529546110071; Wed, 20 Jun 2018 18:55:10 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.08 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:58 -1000 Message-Id: <20180621015359.12018-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v5 34/35] target/arm: Enable SVE for aarch64-linux-user X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Enable ARM_FEATURE_SVE for the generic "max" cpu. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/cpu.c | 7 +++++++ target/arm/cpu64.c | 1 + 2 files changed, 8 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index e1de45e904..8e4f4d8c21 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -164,6 +164,13 @@ static void arm_cpu_reset(CPUState *s) env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE; /* and to the FP/Neon instructions */ env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3); + /* and to the SVE instructions */ + env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3); + env->cp15.cptr_el[3] |= CPTR_EZ; + /* with maximum vector length */ + env->vfp.zcr_el[1] = ARM_MAX_VQ - 1; + env->vfp.zcr_el[2] = ARM_MAX_VQ - 1; + env->vfp.zcr_el[3] = ARM_MAX_VQ - 1; #else /* Reset into the highest available EL */ if (arm_feature(env, ARM_FEATURE_EL3)) { diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index c50dcd4077..0360d7efc5 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -252,6 +252,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_V8_RDM); set_feature(&cpu->env, ARM_FEATURE_V8_FP16); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); + set_feature(&cpu->env, ARM_FEATURE_SVE); /* For usermode -cpu max we can use a larger and more efficient DCZ * blocksize since we don't have to follow what the hardware does. */ From patchwork Thu Jun 21 01:53:59 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 932508 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="MWxRzx4G"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41B4zj52YNz9s3R for ; Thu, 21 Jun 2018 12:15:29 +1000 (AEST) Received: from localhost ([::1]:52584 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVp8B-00033L-Ay for incoming@patchwork.ozlabs.org; Wed, 20 Jun 2018 22:15:27 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39637) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fVood-00052B-N9 for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fVoob-0003bI-Os for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:15 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:42129) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fVoob-0003af-FW for qemu-devel@nongnu.org; Wed, 20 Jun 2018 21:55:13 -0400 Received: by mail-pf0-x22a.google.com with SMTP id w7-v6so683532pfn.9 for ; Wed, 20 Jun 2018 18:55:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=MWxRzx4GMNO4DAGIBuxFSOTcl1DHvFGagls5ki1R7inZcuW1pMMXAeT3qYVa7uIx0G lXpHLhvYFuSIpeJLlMcqdH/fgUeKCDWQ4iY0rgfXIu9Q94x+4H3AriOZtvrd8NOSb7MO n1Ix5VE6dYxvyLBeQN0AVFyICi2W/3l/SD5v8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DAvQ6d8C7WhBxzP4vcx1hFuoE7NEqSILwo/Uh1DOpfk=; b=gsLvqoLiwR5IxexjjaGrSWyvM9WcLxCrbfNsR65hzSMkkWK7YFWxIp1S0ou3mipnMT YJPvkuqGHoyE8MLNPecJ5iq7R0SmBdwfrzAZeKIFzBD7zCLz0mbXPm57Q59qd8EUL7vS 8qGRLn/ARb7hhmw3eWu2ZNq0+/fsvba+GN/9muTojMbzR2oPQth1ksE13IMF9nseDkuA u+BKa9Z7lOSh8zzf6UnCSv4rGc1L9ASdVf0ajgDY2rACfE4wli4i8GpCucveZwCeG+QN D0jDBK2Sgbu0Xxe7biqHNJH7PlmmuM5wY3gSS9sajXcQcqH3RjxZzqzbKfwMlpjjwrgY t79A== X-Gm-Message-State: APt69E2jJMRB7h1GGgRp5fYICwFrGkzcV/77xxFa4rq0q0fysBqHLRVf Q4BKXfi7anHYvaXSgYgtUWH8LW0mFxc= X-Google-Smtp-Source: ADUXVKIC2L7kt7+1NvrDcDBRsHh9PkU/GHlAxGST6REG9C3IVGU3A/15VnHI+Fy3L8ed6ZnLn7+vog== X-Received: by 2002:a65:43c9:: with SMTP id n9-v6mr20527147pgp.399.1529546112151; Wed, 20 Jun 2018 18:55:12 -0700 (PDT) Received: from cloudburst.twiddle.net (mta-98-147-121-51.hawaii.rr.com. [98.147.121.51]) by smtp.gmail.com with ESMTPSA id a27-v6sm6187946pfc.18.2018.06.20.18.55.10 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 20 Jun 2018 18:55:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 20 Jun 2018 15:53:59 -1000 Message-Id: <20180621015359.12018-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180621015359.12018-1-richard.henderson@linaro.org> References: <20180621015359.12018-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v5 35/35] target/arm: Implement ARMv8.2-DotProd X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We've already added the helpers with an SVE patch, all that remains is to wire up the aa64 and aa32 translators. Enable the feature within -cpu max for CONFIG_USER_ONLY. Signed-off-by: Richard Henderson Reviewed-by: Peter Maydell --- target/arm/cpu.h | 1 + linux-user/elfload.c | 1 + target/arm/cpu.c | 1 + target/arm/cpu64.c | 1 + target/arm/translate-a64.c | 36 +++++++++++++++++ target/arm/translate.c | 81 ++++++++++++++++++++++++++------------ 6 files changed, 96 insertions(+), 25 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8488273c5b..23098474e1 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -1480,6 +1480,7 @@ enum arm_features { ARM_FEATURE_V8_SM4, /* implements SM4 part of v8 Crypto Extensions */ ARM_FEATURE_V8_ATOMICS, /* ARMv8.1-Atomics feature */ ARM_FEATURE_V8_RDM, /* implements v8.1 simd round multiply */ + ARM_FEATURE_V8_DOTPROD, /* implements v8.2 simd dot product */ ARM_FEATURE_V8_FP16, /* implements v8.2 half-precision float */ ARM_FEATURE_V8_FCMA, /* has complex number part of v8.3 extensions. */ }; diff --git a/linux-user/elfload.c b/linux-user/elfload.c index 13bc78d0c8..bdb023b477 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -583,6 +583,7 @@ static uint32_t get_elf_hwcap(void) ARM_HWCAP_A64_FPHP | ARM_HWCAP_A64_ASIMDHP); GET_FEATURE(ARM_FEATURE_V8_ATOMICS, ARM_HWCAP_A64_ATOMICS); GET_FEATURE(ARM_FEATURE_V8_RDM, ARM_HWCAP_A64_ASIMDRDM); + GET_FEATURE(ARM_FEATURE_V8_DOTPROD, ARM_HWCAP_A64_ASIMDDP); GET_FEATURE(ARM_FEATURE_V8_FCMA, ARM_HWCAP_A64_FCMA); #undef GET_FEATURE diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 8e4f4d8c21..95ac9c064d 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1794,6 +1794,7 @@ static void arm_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_V8_PMULL); set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); #endif } diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index 0360d7efc5..3b4bc73ffa 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -250,6 +250,7 @@ static void aarch64_max_initfn(Object *obj) set_feature(&cpu->env, ARM_FEATURE_CRC); set_feature(&cpu->env, ARM_FEATURE_V8_ATOMICS); set_feature(&cpu->env, ARM_FEATURE_V8_RDM); + set_feature(&cpu->env, ARM_FEATURE_V8_DOTPROD); set_feature(&cpu->env, ARM_FEATURE_V8_FP16); set_feature(&cpu->env, ARM_FEATURE_V8_FCMA); set_feature(&cpu->env, ARM_FEATURE_SVE); diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 038e48278f..903d6233d3 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -640,6 +640,16 @@ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, vec_full_reg_size(s), gvec_op); } +/* Expand a 3-operand operation using an out-of-line helper. */ +static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, + int rn, int rm, int data, gen_helper_gvec_3 *fn) +{ + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Expand a 3-operand + env pointer operation using * an out-of-line helper. */ @@ -11336,6 +11346,14 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } feature = ARM_FEATURE_V8_RDM; break; + case 0x02: /* SDOT (vector) */ + case 0x12: /* UDOT (vector) */ + if (size != MO_32) { + unallocated_encoding(s); + return; + } + feature = ARM_FEATURE_V8_DOTPROD; + break; case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -11389,6 +11407,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) } return; + case 0x2: /* SDOT / UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); + return; + case 0x8: /* FCMLA, #0 */ case 0x9: /* FCMLA, #90 */ case 0xa: /* FCMLA, #180 */ @@ -12568,6 +12591,13 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) return; } break; + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + if (size != MO_32 || !arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + unallocated_encoding(s); + return; + } + break; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ @@ -12665,6 +12695,12 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) } switch (16 * u + opcode) { + case 0x0e: /* SDOT */ + case 0x1e: /* UDOT */ + gen_gvec_op3_ool(s, is_q, rd, rn, rm, index, + u ? gen_helper_gvec_udot_idx_b + : gen_helper_gvec_sdot_idx_b); + return; case 0x11: /* FCMLA #0 */ case 0x13: /* FCMLA #90 */ case 0x15: /* FCMLA #180 */ diff --git a/target/arm/translate.c b/target/arm/translate.c index f405c82fb2..09f2852b29 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -7748,9 +7748,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) */ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) { - gen_helper_gvec_3_ptr *fn_gvec_ptr; - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7763,8 +7764,8 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) if ((insn & 0xfe200f10) == 0xfc200800) { /* VCMLA -- 1111 110R R.1S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 23, 2); + int size = extract32(insn, 20, 1); + data = extract32(insn, 23, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; @@ -7772,13 +7773,20 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) fn_gvec_ptr = size ? gen_helper_gvec_fcmlas : gen_helper_gvec_fcmlah; } else if ((insn & 0xfea00f10) == 0xfc800800) { /* VCADD -- 1111 110R 1.0S .... .... 1000 ...0 .... */ - size = extract32(insn, 20, 1); - rot = extract32(insn, 24, 1); + int size = extract32(insn, 20, 1); + data = extract32(insn, 24, 1); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } fn_gvec_ptr = size ? gen_helper_gvec_fcadds : gen_helper_gvec_fcaddh; + } else if ((insn & 0xfeb00f00) == 0xfc200d00) { + /* V[US]DOT -- 1111 1100 0.10 .... .... 1101 .Q.U .... */ + bool u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; } else { return 1; } @@ -7793,12 +7801,19 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, fn_gvec_ptr); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; } @@ -7812,8 +7827,10 @@ static int disas_neon_insn_3same_ext(DisasContext *s, uint32_t insn) static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) { - int rd, rn, rm, rot, size, opr_sz; - TCGv_ptr fpst; + gen_helper_gvec_3 *fn_gvec = NULL; + gen_helper_gvec_3_ptr *fn_gvec_ptr = NULL; + int rd, rn, rm, opr_sz; + int data = 0; bool q; q = extract32(insn, 6, 1); @@ -7826,12 +7843,21 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) if ((insn & 0xff000f10) == 0xfe000800) { /* VCMLA (indexed) -- 1111 1110 S.RR .... .... 1000 ...0 .... */ - rot = extract32(insn, 20, 2); - size = extract32(insn, 23, 1); + int size = extract32(insn, 23, 1); + data = extract32(insn, 20, 2); /* rot */ if (!arm_dc_feature(s, ARM_FEATURE_V8_FCMA) || (!size && !arm_dc_feature(s, ARM_FEATURE_V8_FP16))) { return 1; } + fn_gvec_ptr = (size ? gen_helper_gvec_fcmlas_idx + : gen_helper_gvec_fcmlah_idx); + } else if ((insn & 0xffb00f00) == 0xfe200d00) { + /* V[US]DOT -- 1111 1110 0.10 .... .... 1101 .Q.U .... */ + int u = extract32(insn, 4, 1); + if (!arm_dc_feature(s, ARM_FEATURE_V8_DOTPROD)) { + return 1; + } + fn_gvec = u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; } else { return 1; } @@ -7846,14 +7872,19 @@ static int disas_neon_insn_2reg_scalar_ext(DisasContext *s, uint32_t insn) } opr_sz = (1 + q) * 8; - fpst = get_fpstatus_ptr(1); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), fpst, - opr_sz, opr_sz, rot, - size ? gen_helper_gvec_fcmlas_idx - : gen_helper_gvec_fcmlah_idx); - tcg_temp_free_ptr(fpst); + if (fn_gvec_ptr) { + TCGv_ptr fpst = get_fpstatus_ptr(1); + tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), fpst, + opr_sz, opr_sz, data, fn_gvec_ptr); + tcg_temp_free_ptr(fpst); + } else { + tcg_gen_gvec_3_ool(vfp_reg_offset(1, rd), + vfp_reg_offset(1, rn), + vfp_reg_offset(1, rm), + opr_sz, opr_sz, data, fn_gvec); + } return 0; }