From patchwork Wed May 13 16:32:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289485 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=tyMcrGvm; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgGd5N37z9sRY for ; Thu, 14 May 2020 02:33:45 +1000 (AEST) Received: from localhost ([::1]:53654 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuKF-0003bV-D0 for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:33:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47928) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJQ-0003ZH-Hk for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:52 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:54008) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJP-0001z5-9H for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:52 -0400 Received: by mail-pj1-x1042.google.com with SMTP id hi11so11332310pjb.3 for ; Wed, 13 May 2020 09:32:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8A6hG2VJX6G8AstTG9MEpJ8uNR7b68/hVZpp/w046b8=; b=tyMcrGvmgykxmAtwkNFUpWXDVTu7DtHa10DGaXg8ik33MfN79jnxdSvUbJ0LT9VUli JpS44y9KwVrQSXzw967qrdhr7qXSYv+Hq92qrTuxF6lIhIsY7Ie0vnEnAkDW5ZHLP2sL 3jvt9CmpIPlAe3frg3H9UQIDGfN5IU4VgQdzo1Dgr1H/rXyAPIHDUuX47Az0P6A0TzrJ 4PRVS/Km1EqAd4OwbQmVDwXWDngT6rp8ULDqqywljm/zGD0/3tOv+CpaAq/lZKraYkHY W2yDpEQzdlw2kN6YUf9WUkb6JFY4yqTTMvQvJKE/XmudEK1aCeiytM44LXJ7GLhC9/SB yDCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8A6hG2VJX6G8AstTG9MEpJ8uNR7b68/hVZpp/w046b8=; b=TFMRJdYhBuKhCUEEOPgDDUWfgPtKlYaFPZmikG4seKX/4SIMlDWtMp98Icdbg9oENm iNoAajxfWukcWPZeFVnb6dkIB1BZQGNmH3Irg+JzTcPgNPrHwbv78kDQZOwWDmLWgiyp fZoeZs6OyYIpCUhKuXYvbC+yoyI7D4HhUh2plIOupslVFhjmO9NrI1XxY+NKX3skrTp1 Z5KpaVL3joAPc/omlDUhQaAGi89GqjGuFRNT4Z0rAIwy+yixVsncTWnzh8HANwjEbe45 /D70oy45nR8rssRTLyNRsHs6VqZ4Y3TnepQjjE1+bYkZPEwNx1uh4BrFEKVznLq027HB KVvw== X-Gm-Message-State: AGi0PuaGOa0eW91j4lcmN5kLQeCaFJD5gwawHvOieEfIos6xAEHT6DGJ SVKMXLMlvFmphNWKc5zp5/lAWLM42o0= X-Google-Smtp-Source: APiQypJ7AKG51cefykhaRE6MzxGLncMIP1d3nXoXxRZJC8jFoxnuv/o7OwX4DCYAZvMMfa9r7Ukccw== X-Received: by 2002:a17:90b:374f:: with SMTP id ne15mr34846602pjb.181.1589387569108; Wed, 13 May 2020 09:32:49 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 01/16] target/arm: Create gen_gvec_[us]sra Date: Wed, 13 May 2020 09:32:30 -0700 Message-Id: <20200513163245.17915-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 15 +--- target/arm/translate.c | 161 ++++++++++++++++++++++--------------- target/arm/vec_helper.c | 25 ++++++ 5 files changed, 139 insertions(+), 79 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 5817626b20..9bc162345c 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -691,6 +691,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index cb7925ea46..1839a59a8e 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -299,6 +297,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 62e5729904..315de9a9b6 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10188,19 +10188,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. */ - if (shift == 8 << size) { - shift -= 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 74fac1d09c..c18140f2e6 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3874,33 +3874,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_ssra[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; -const GVecGen2i ssra_op[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_ssra, - .load_dest = true, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -3932,33 +3950,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_usra[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; -const GVecGen2i usra_op[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5220,19 +5260,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift = -shift; - /* Shifts larger than the element size are architecturally - * valid. Unsigned results in all zeros; signed results - * in all sign bits. - */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >= 8 << size) { - /* rd += 0 */ + if (u) { + gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 3d534188a8..230085b35e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] += n[i] >> shift; \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRA(gvec_ssra_b, int8_t) +DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Wed May 13 16:32:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289494 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ormT12iD; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgKv2fzcz9sRY for ; Thu, 14 May 2020 02:36:35 +1000 (AEST) Received: from localhost ([::1]:32922 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuMy-0006qz-SA for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:36:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47938) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJS-0003bX-IX for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:54 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:41092) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJQ-0001zq-Tb for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:54 -0400 Received: by mail-pg1-x543.google.com with SMTP id r10so7514572pgv.8 for ; Wed, 13 May 2020 09:32:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XS/mIW9X0kPcGiRIELKdlQNFPmcbFEWtnUU78dllzHU=; b=ormT12iD6QOHXRrN7+7/4bsnrS4trYlzsLobNtEUgiSKCmdzSxNv/fvmvInSb8ZGF5 ue4o66cTjqTyD1dqw55swNyyVQYzlb2McSGUV0hFsPpjR7kJSVvcRbH6iylSYtIzWx65 KDnT7Poce2WTu6IxQxRIbfsqmqt88jo9Voxa1ztPMLKJXVxoBA5tGA9rlCSWPWm/KANM OAth5bX6wXtyuhdOyrlBpLFXFmdHJXWTWeED+p555O+d2d5KwDi/H09h5dgvzLm1FXzC Z3FT6+KgQcdao/bJpk6+g4LtHxtepr01Q3B+EhmPjVSCKwrXstmccnhXkewakW809VW7 1Ocw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XS/mIW9X0kPcGiRIELKdlQNFPmcbFEWtnUU78dllzHU=; b=UKBzUaV/aB8Adg0smJiF34Kzqc1DTYl2mnLU02HUiMQr1fJrEOJau7RDke8Be986nB x3FYl2bWbHWLkvJPeggImFRhj3ksL3VHTcD4Y5a0SW3U9i2YxnBT03e0/D/z+4VRwWP0 VTeUUvtZ3KBVPaOVQH45sxyjKWjuD1V2uU6ZvmdIMwXBJ5dNcdXw2WVSrJ9xHx5w4Nmy dPcAfFkZGV8ckCBwT8nsCRuUuxlm7QlX6WhxE6wp5VUKxwNbbbhzhAe93d33fn1FR03X dXvnOrlqC5vkQyEBOFwpuY19iD0R0t+24oTdDS4rmD8PaAN/dmmj7YmX9e+BB2Wf3beY +eNQ== X-Gm-Message-State: AOAM530f25MERXZ+PPLrfKQ8YMNfgX+gVFJZYWliGNDc8+OoIOdAVbsZ tRZ+NIwPauJIdD9jvpGU/DwrbxYYfps= X-Google-Smtp-Source: ABdhPJxgu6ozRJvmPrTYeFSUlIMWsHFeeDPLpAJci/sVKosxzkFgnonr0eujbvy/pbU4S+iRzEQ/yQ== X-Received: by 2002:a63:1e01:: with SMTP id e1mr117524pge.351.1589387570498; Wed, 13 May 2020 09:32:50 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 02/16] target/arm: Create gen_gvec_{u,s}{rshr,rsra} Date: Wed, 13 May 2020 09:32:31 -0700 Message-Id: <20200513163245.17915-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::543; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x543.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate.h | 9 + target/arm/translate-a64.c | 11 +- target/arm/translate.c | 463 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 50 ++++ 5 files changed, 527 insertions(+), 26 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 9bc162345c..aeb1f52455 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -701,6 +701,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 1839a59a8e..1db3b43a61 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -302,6 +302,15 @@ void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 315de9a9b6..50949d306b 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10218,10 +10218,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, return; case 0x04: /* SRSHR / URSHR (rounding) */ - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_urshr : gen_gvec_srshr, size); + return; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ - accumulate = true; - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? gen_gvec_ursra : gen_gvec_srsra, size); + return; + default: g_assert_not_reached(); } diff --git a/target/arm/translate.c b/target/arm/translate.c index c18140f2e6..aa03dc236b 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4000,6 +4000,422 @@ void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, } } +/* + * Shift one less than the requested amount, and the low bit is + * the rounding bit. For the 8 and 16-bit operations, because we + * mask the low bit, we can perform a normal integer shift instead + * of a vector shift. + */ +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_sar8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_sar16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_sari_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_sari_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, sh - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_sari_vec(vece, d, a, sh); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srshr8_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_srshr16_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_srshr32_i32, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_srshr64_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. + */ + tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr8_i64(t, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr16_i64(t, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + gen_srshr32_i32(t, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + gen_srshr64_i64(t, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + gen_srshr_vec(vece, t, a, sh); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srsra8_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_srsra16_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_srsra32_i32, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_srsra64_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift == (8 << vece)) { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_8, 1)); + tcg_gen_vec_shr8i_i64(d, a, sh); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(t, a, sh - 1); + tcg_gen_andi_i64(t, t, dup_const(MO_16, 1)); + tcg_gen_vec_shr16i_i64(d, a, sh); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_extract_i32(t, a, sh - 1, 1); + tcg_gen_shri_i32(d, a, sh); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_extract_i64(t, a, sh - 1, 1); + tcg_gen_shri_i64(d, a, sh); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, t, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, t, ones); + tcg_gen_shri_vec(vece, d, a, shift); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_urshr8_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_urshr16_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_urshr32_i32, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_urshr64_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in zero. With rounding, this produces a + * copy of the most significant bit. + */ + tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 8) { + tcg_gen_vec_shr8i_i64(t, a, 7); + } else { + gen_urshr8_i64(t, a, sh); + } + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 16) { + tcg_gen_vec_shr16i_i64(t, a, 15); + } else { + gen_urshr16_i64(t, a, sh); + } + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t sh) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + if (sh == 32) { + tcg_gen_shri_i32(t, a, 31); + } else { + gen_urshr32_i32(t, a, sh); + } + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (sh == 64) { + tcg_gen_shri_i64(t, a, 63); + } else { + gen_urshr64_i64(t, a, sh); + } + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (sh == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, sh - 1); + } else { + gen_urshr_vec(vece, t, a, sh); + } + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { uint64_t mask = dup_const(MO_8, 0xff >> shift); @@ -5269,6 +5685,30 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case 2: /* VRSHR */ + /* Right shift comes here negative. */ + shift = -shift; + if (u) { + gen_gvec_urshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srshr(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + + case 3: /* VRSRA */ + /* Right shift comes here negative. */ + shift = -shift; + if (u) { + gen_gvec_ursra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } else { + gen_gvec_srsra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); + } + return 0; + case 4: /* VSRI */ if (!u) { return 1; @@ -5320,13 +5760,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) neon_load_reg64(cpu_V0, rm + pass); tcg_gen_movi_i64(cpu_V1, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - if (u) - gen_helper_neon_rshl_u64(cpu_V0, cpu_V0, cpu_V1); - else - gen_helper_neon_rshl_s64(cpu_V0, cpu_V0, cpu_V1); - break; case 6: /* VQSHLU */ gen_helper_neon_qshlu_s64(cpu_V0, cpu_env, cpu_V0, cpu_V1); @@ -5343,11 +5776,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) default: g_assert_not_reached(); } - if (op == 3) { - /* Accumulate. */ - neon_load_reg64(cpu_V1, rd + pass); - tcg_gen_add_i64(cpu_V0, cpu_V0, cpu_V1); - } neon_store_reg64(cpu_V0, rd + pass); } else { /* size < 3 */ /* Operands in T0 and T1. */ @@ -5355,10 +5783,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tmp2 = tcg_temp_new_i32(); tcg_gen_movi_i32(tmp2, imm); switch (op) { - case 2: /* VRSHR */ - case 3: /* VRSRA */ - GEN_NEON_INTEGER_OP(rshl); - break; case 6: /* VQSHLU */ switch (size) { case 0: @@ -5384,13 +5808,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) g_assert_not_reached(); } tcg_temp_free_i32(tmp2); - - if (op == 3) { - /* Accumulate. */ - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - tcg_temp_free_i32(tmp2); - } neon_store_reg(rd, pass, tmp); } } /* for pass */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 230085b35e..fd8b2bff49 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t) #undef DO_SRA +#define DO_RSHR(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] = (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSHR(gvec_srshr_b, int8_t) +DO_RSHR(gvec_srshr_h, int16_t) +DO_RSHR(gvec_srshr_s, int32_t) +DO_RSHR(gvec_srshr_d, int64_t) + +DO_RSHR(gvec_urshr_b, uint8_t) +DO_RSHR(gvec_urshr_h, uint16_t) +DO_RSHR(gvec_urshr_s, uint32_t) +DO_RSHR(gvec_urshr_d, uint64_t) + +#undef DO_RSHR + +#define DO_RSRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] += (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSRA(gvec_srsra_b, int8_t) +DO_RSRA(gvec_srsra_h, int16_t) +DO_RSRA(gvec_srsra_s, int32_t) +DO_RSRA(gvec_srsra_d, int64_t) + +DO_RSRA(gvec_ursra_b, uint8_t) +DO_RSRA(gvec_ursra_h, uint16_t) +DO_RSRA(gvec_ursra_s, uint32_t) +DO_RSRA(gvec_ursra_d, uint64_t) + +#undef DO_RSRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Wed May 13 16:32:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289495 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=KiTt6z5V; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgL01F8hz9sRf for ; Thu, 14 May 2020 02:36:40 +1000 (AEST) Received: from localhost ([::1]:33414 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuN3-00072n-Oj for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:36:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47950) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJT-0003cQ-6p for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:55 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:37581) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJR-0001zy-JD for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:54 -0400 Received: by mail-pl1-x641.google.com with SMTP id x10so48660plr.4 for ; Wed, 13 May 2020 09:32:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0WbbmeSiaNN4QpouxeGPic6h6RNKP0wk8Zd43DH2ue0=; b=KiTt6z5VmnrOIrPiuaDIFW526+LR40zcVRryWnIVgzOCXaz/apM2zvgL+9MtPMk25A o+h0kHqkXFWhkYHE2MhLLe364wh/P0u9FPB+NGhXboJvOAbWdjTsY5G+mvlWYFzHEXma hTbD0v5qDGFVdA7/en0qjKUhB45wP+wAP6jZg4CbNkFyfbB309V2WuPxTbGuAogog48Y CvlJYYVio3HlTyeT1J8Z9MawMY8UGrB9RP5vvnA82TNLCc0rv8v0oJY7CY/R6UlctU2F h9squYJ/nbDviMEKx12neoQLlaKUVriZwsDjk/whl2g5EyXh7P1CoZqG3XTgtFQjlI// dYVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0WbbmeSiaNN4QpouxeGPic6h6RNKP0wk8Zd43DH2ue0=; b=qVndfiU7of3pTvcuEpZvO/4ieJaqTbXmnYweaKUFwEOQJ5PdqA8oOYAhx4wjMns+8S P5WGbbDcuIg20pRK/rowHIIZXdobnP7vHRXi48jPqs0Wo0abj9AcHBkAS+9vK0L0LzV+ F3R4Y7UTqF3/F04q1g/4jeRo2B5thRlB6cTBcC+RSk9bssCtXdeZ4OCC980olNKCZgv/ zrfalSSY1xS+SzcUAO/H1ARBPDuJ1xJBdWZC/o2ixBOOFcrhBcmQjpifRnPPZ2YSwxQ3 X9XvanJV+G3h9wYG57Hdtu7bqSuLaKjF9xS0a0KslfyzQrp87HXQ3pEiB/7RPH6fKDKB 0w8Q== X-Gm-Message-State: AGi0Pua7QUfvc30T6WXJDagM4ha1nWN0/Gcjn2DyS3iXH3WbmDM6ccJB /J6US273wYgHBvY1ZRkAfEKjNx1zfM8= X-Google-Smtp-Source: APiQypLaA1zwvfaXm7Ynz0KbZ9RrViM/ne5qrZIBlu4st9krpzCISQXzHQNxLFA87oVGv2ftsxoVww== X-Received: by 2002:a17:90a:1743:: with SMTP id 3mr33873672pjm.106.1589387571713; Wed, 13 May 2020 09:32:51 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 03/16] target/arm: Create gen_gvec_{sri,sli} Date: Wed, 13 May 2020 09:32:32 -0700 Message-Id: <20200513163245.17915-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 20 +--- target/arm/translate.c | 186 +++++++++++++++++++++---------------- target/arm/vec_helper.c | 38 ++++++++ 5 files changed, 160 insertions(+), 101 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index aeb1f52455..33c76192d2 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -721,6 +721,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 1db3b43a61..fa5c3f12b9 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -285,8 +285,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i sri_op[4]; -extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; @@ -311,6 +309,11 @@ void gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 50949d306b..2d7dad6c3f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -585,16 +585,6 @@ static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); } -/* Expand a 2-operand + immediate AdvSIMD vector operation using - * an op descriptor. - */ -static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd, - int rn, int64_t imm, const GVecGen2i *gvec_op) -{ - tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -10191,12 +10181,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, gen_gvec_fn2i(s, is_q, rd, rn, shift, is_u ? gen_gvec_usra : gen_gvec_ssra, size); return; + case 0x08: /* SRI */ - /* Shift count same as element size is valid but does nothing. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size); return; case 0x00: /* SSHR / USHR */ @@ -10247,7 +10234,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, } tcg_temp_free_i64(tcg_round); - done: clear_vec_high(s, is_q, rd); } @@ -10272,7 +10258,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert, } if (insert) { - gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sli, size); } else { gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size); } diff --git a/target/arm/translate.c b/target/arm/translate.c index aa03dc236b..3c489852dc 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4454,47 +4454,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); - tcg_gen_shri_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); + tcg_gen_shri_vec(vece, t, a, sh); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 }; +void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shr8_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shr16_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shr32_ins_i32, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shr64_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sri_op[4] = { - { .fni8 = gen_shr8_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_8 }, - { .fni8 = gen_shr16_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_16 }, - { .fni4 = gen_shr32_ins_i32, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_32 }, - { .fni8 = gen_shr64_ins_i64, - .fniv = gen_shr_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* Shift of esize leaves destination unchanged. */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4532,47 +4547,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); - tcg_gen_shli_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_shli_vec(vece, t, a, sh); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 }; +void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shl8_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shl16_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shl32_ins_i32, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shl64_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sli_op[4] = { - { .fni8 = gen_shl8_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_8 }, - { .fni8 = gen_shl16_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_16 }, - { .fni4 = gen_shl32_ins_i32, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_32 }, - { .fni8 = gen_shl64_ins_i64, - .fniv = gen_shl_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [0..esize-1]. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift < (8 << vece)); + + if (shift == 0) { + tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { @@ -5715,20 +5743,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } /* Right shift comes here negative. */ shift = -shift; - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &sri_op[size]); - } + gen_gvec_sri(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); return 0; case 5: /* VSHL, VSLI */ if (u) { /* VSLI */ - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, - vec_size, shift, &sli_op[size]); - } + gen_gvec_sli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { /* VSHL */ /* Shifts larger than the element size are * architecturally valid and results in zero. diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index fd8b2bff49..096fea67ef 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t) #undef DO_RSRA +#define DO_SRI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRI(gvec_sri_b, uint8_t) +DO_SRI(gvec_sri_h, uint16_t) +DO_SRI(gvec_sri_s, uint32_t) +DO_SRI(gvec_sri_d, uint64_t) + +#undef DO_SRI + +#define DO_SLI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SLI(gvec_sli_b, uint8_t) +DO_SLI(gvec_sli_h, uint16_t) +DO_SLI(gvec_sli_s, uint32_t) +DO_SLI(gvec_sli_d, uint64_t) + +#undef DO_SLI + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Wed May 13 16:32:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289497 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=yHu6zi3v; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgNj1Wpqz9sRf for ; Thu, 14 May 2020 02:39:01 +1000 (AEST) Received: from localhost ([::1]:41378 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuPK-0003Ql-Kj for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:38:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47952) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJT-0003d7-N4 for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:55 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:39227) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJS-000206-Mb for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:55 -0400 Received: by mail-pj1-x1044.google.com with SMTP id e6so11215539pjt.4 for ; Wed, 13 May 2020 09:32:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2WJ6plpfPlfef610n9KhYRt52HX2ijIyS18DrRrsIPs=; b=yHu6zi3v58tBaVlgx8pCuRGdfufTf/Nf0iEz1oczi01XRLJgh0KxCS4r8UW/sZRvU8 6TLIjCDJVaUf7qJAWux4U3jhowBNP8C0+sH55PV8fagWm8DmeVmS0cqU1mGXsytaF4mt 8+6PrU5rdaQWoA/AlBru3wLHfb/lbQqiw+oFmHsAK6oFQfQ57Lf8wqlzxNsBHAiM8tKl 4F8Ijqpl+wBzjuaI2wqPMP2TfY/Irg0oSRItTGnQEARow3FRUKeXkoH3hCCfgXXVJGGm xNuePPAt5dfkEHNSWXjQloM9tjppbrfC9CTOIgwt38aX601NLJnDqp2lNE8hL3Mrh5YY 02/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2WJ6plpfPlfef610n9KhYRt52HX2ijIyS18DrRrsIPs=; b=b+Xae/4ut9hpb7l+JDcpJywyGkeHHibqFWDgmL/66oNV/lWk/7wr/2hQQ5k7cs4+mo 5m3djqBC7SQhrxg8dewSP++xbSdUPapSNEDTqa2nLnS6/Pb3RzqMF3111BK2fhVVQ5IT rpP3XlJMIdXC2LdxBUIvMRIKRAf7sF3p0XaODWcW1AE3vQuPQ26NWxgRTYPNAeYFIgpr pRrjUtxnNOmn4LGOq8zz58VBz1ih2Cx3ceM9WtHsyQRCSjzohqokvNchxJUd56z8/n3w Z4v0B3FK/CFYnoXMYo55WaL0K1aJFDZUDh9sQfUJV9jv4pqQRIShevy4DcFuH5fy8YLr 2Chw== X-Gm-Message-State: AOAM533YWSwbyjGWmUMrF7YdjR7GtC/r9ZO2xYziGm1K5CWgr2T4Au0o p6nU/w1koaS0OPdsuB9JcmNyKBf9vy4= X-Google-Smtp-Source: ABdhPJzYtv8yvUdiBqq3u1TVQ5uUscwlDOA0nBsIv+P6GgaUXBMFrq4hDI5k61NU9amNHeyu0zUoCg== X-Received: by 2002:a17:902:8349:: with SMTP id z9mr70073pln.38.1589387573009; Wed, 13 May 2020 09:32:53 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 04/16] target/arm: Remove unnecessary range check for VSHL Date: Wed, 13 May 2020 09:32:33 -0700 Message-Id: <20200513163245.17915-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In 1dc8425e551, while converting to gvec, I added an extra range check against the shift count. This was unnecessary because the encoding of the shift count produces 0 to the element size - 1. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.c | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 3c489852dc..2eec689c5e 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5752,16 +5752,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) gen_gvec_sli(size, rd_ofs, rm_ofs, shift, vec_size, vec_size); } else { /* VSHL */ - /* Shifts larger than the element size are - * architecturally valid and results in zero. - */ - if (shift >= 8 << size) { - tcg_gen_gvec_dup_imm(size, rd_ofs, - vec_size, vec_size, 0); - } else { - tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, - vec_size, vec_size); - } + tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; } From patchwork Wed May 13 16:32:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289501 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=wRs8eg+q; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgS34lMGz9sRf for ; Thu, 14 May 2020 02:41:55 +1000 (AEST) Received: from localhost ([::1]:49760 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuS9-0008Gf-AO for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:41:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47962) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJU-0003er-SG for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:56 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:45584) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJT-00020N-Ud for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:56 -0400 Received: by mail-pl1-x644.google.com with SMTP id u22so32501plq.12 for ; Wed, 13 May 2020 09:32:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=aEeapnyAaYV0er6LNsDf+nnur14ggqXJIcqCiWufNK0=; b=wRs8eg+qSjJsDO8X2oJuG5+dH7gxjcvKT4uab5YBXDLX89wjresuGcN7A1b6i3M+WC pPLWUu65KgafJCrNNdsCNaQyHEQEVLjuYdoX0bnKp4DXVjeIe8cv2adWaapMvHsTeCTe Ctel7rknCZFVfUQGMjSEr0rxL4DNxHtpddXznhmTV0iogjYodGHOVoHPyv7bJYMj4HhB rRc97A4caKXU6jDYiOQRzSzBQQvVGTZ4jg592TlXW5uW4wUBvmdfeXqvfQemVVOFd9gA DHhVCohWLgth+9fGopg+ndgQQq0ArZxgcEsc+C/c6hdR4BqW2lxPKXybH+RUCHEb6tXr pW/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aEeapnyAaYV0er6LNsDf+nnur14ggqXJIcqCiWufNK0=; b=VVVjJRbrt6oGMfKBPvWx9HOS7EAjEqvDJtJgpJdqTOQyNjuIr4DPx+wC6egkbqgmwP S2I0/LM/2VsZKOkDo6/DgDaaZiuGgplUjP5t2tiJoiWNLMVUFgbMdc4QdNWK7JvqW9p4 ZM6QSAgSYSMBJck6rWDDAHgPcJGXBxSWXNWsFS3Ykw+OHXezENR+7ER2vl3OIP6eFYfe Fk33y2zVXyOvww7VGwnnoN6ksMIBcAALMIIh+s4liXSBdKUm+wDD7wqaYK5ZnHX11mhX xPxL2ByWptmsMJcox1rAQ7TLB+M9jfR/z9NXslRHtTEA2WMFDn2TPzCmxshrE4KIMFOj SVPQ== X-Gm-Message-State: AOAM531B8Mfr+CUvopq6We6xUjxYRip3seSQpGnO47ObCe6yp1X9hPlq SSa5l5GaSYXtfvNRJ1mCkM4IsRHcZBA= X-Google-Smtp-Source: ABdhPJwMx6q1zmbly3hq2pgVVP2rF807pecfA3UQQQHpWJCF1dfzvkZjEMITe8PVRFGYD2iKOhDbrQ== X-Received: by 2002:a17:902:b7cc:: with SMTP id v12mr31525plz.39.1589387574244; Wed, 13 May 2020 09:32:54 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 05/16] target/arm: Tidy handle_vec_simd_shri Date: Wed, 13 May 2020 09:32:34 -0700 Message-Id: <20200513163245.17915-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::644; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x644.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Now that we've converted all cases to gvec, there is quite a bit of dead code at the end of the function. Remove it. Sink the call to gen_gvec_fn2i to the end, loading a function pointer within the switch statement. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 56 ++++++++++---------------------------- 1 file changed, 14 insertions(+), 42 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2d7dad6c3f..d5e77f34a7 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10155,16 +10155,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, int size = 32 - clz32(immh) - 1; int immhb = immh << 3 | immb; int shift = 2 * (8 << size) - immhb; - bool accumulate = false; - int dsize = is_q ? 128 : 64; - int esize = 8 << size; - int elements = dsize/esize; - MemOp memop = size | (is_u ? 0 : MO_SIGN); - TCGv_i64 tcg_rn = new_tmp_a64(s); - TCGv_i64 tcg_rd = new_tmp_a64(s); - TCGv_i64 tcg_round; - uint64_t round_const; - int i; + GVecGen2iFn *gvec_fn; if (extract32(immh, 3, 1) && !is_q) { unallocated_encoding(s); @@ -10178,13 +10169,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_usra : gen_gvec_ssra, size); - return; + gvec_fn = is_u ? gen_gvec_usra : gen_gvec_ssra; + break; case 0x08: /* SRI */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, gen_gvec_sri, size); - return; + gvec_fn = gen_gvec_sri; + break; case 0x00: /* SSHR / USHR */ if (is_u) { @@ -10192,49 +10182,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, /* Shift count the same size as element size produces zero. */ tcg_gen_gvec_dup_imm(size, vec_full_reg_offset(s, rd), is_q ? 16 : 8, vec_full_reg_size(s), 0); - } else { - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size); + return; } + gvec_fn = tcg_gen_gvec_shri; } else { /* Shift count the same size as element size produces all sign. */ if (shift == 8 << size) { shift -= 1; } - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size); + gvec_fn = tcg_gen_gvec_sari; } - return; + break; case 0x04: /* SRSHR / URSHR (rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_urshr : gen_gvec_srshr, size); - return; + gvec_fn = is_u ? gen_gvec_urshr : gen_gvec_srshr; + break; case 0x06: /* SRSRA / URSRA (accum + rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? gen_gvec_ursra : gen_gvec_srsra, size); - return; + gvec_fn = is_u ? gen_gvec_ursra : gen_gvec_srsra; + break; default: g_assert_not_reached(); } - round_const = 1ULL << (shift - 1); - tcg_round = tcg_const_i64(round_const); - - for (i = 0; i < elements; i++) { - read_vec_element(s, tcg_rn, rn, i, memop); - if (accumulate) { - read_vec_element(s, tcg_rd, rd, i, memop); - } - - handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round, - accumulate, is_u, size, shift); - - write_vec_element(s, tcg_rd, rd, i, size); - } - tcg_temp_free_i64(tcg_round); - - clear_vec_high(s, is_q, rd); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size); } /* SHL/SLI - Vector shift left */ From patchwork Wed May 13 16:32:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289520 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=oCqJ2f0H; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgVl6wT7z9sRY for ; Thu, 14 May 2020 02:44:15 +1000 (AEST) Received: from localhost ([::1]:57878 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuUO-0003Cc-Fb for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:44:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJW-0003g4-Vm for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:59 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:39228) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJV-00020b-3U for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:58 -0400 Received: by mail-pj1-x1044.google.com with SMTP id e6so11215598pjt.4 for ; Wed, 13 May 2020 09:32:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=l2QCtAJuqJw9W7z9OKqSJ6gRkNQUxusCmyvREvrDgd8=; b=oCqJ2f0Hw74eBQHT3/1vYVWd2us9cmQipR8aRjknPXKf7XxYTlimX2541YF22UAqTe o61ekV6Uz3j4veSejK9cmuhhDacienq2PeEpb3ksfN9qiqpwlk8D0hlm0fTeXn7i3FWQ tQlBHG9BH0Yvs7v0Uoqi3VulW7NqWo3DjHqt4YNQZM3ck1uawxbM2HhM3J9Iow7tp3gE cU9134CQ5b/dNKYvqrJjnhSS3Vedb2+qMn3ZuaC//xyoxllCiWMCLsKyrKN9tVmRIUoZ TtYdVuyTQGCEhpynK6KKznxlCqVWFoJwn4cqNMWBP/ajPCs7p2EKf/Vt1ytwdBSDtZeV VhCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l2QCtAJuqJw9W7z9OKqSJ6gRkNQUxusCmyvREvrDgd8=; b=eNjJprNe2cNF9ku47qTRF74EKHPUm3Q4WWvA7v3SNUT+6qshXwGQIyS8lISsCrzg2i q+GrvI5dFID4Or2TcgRMWhyPKxFNgRkYmZpfJKW7pk7o1d9de/osoB2J3fsClA/eBxyA 0GqRZxrwFU/ykBghPq7KBCeXmphJP4okRRa2M/UqY+TVjLZLxy6shdOG6kMEzD2F+5Uw wMqn8X8f17OX7biX5zNZk4ebHQ8tsDPSmwgHbt+XJhtRvmn2QMt93WVXobpobTNqJV6q sPaoHp2IXq7wEBGEO1AjNy9I7TY5bf/oWE3cLg2liHEwaIMGry9ANxKbxwRuFS6vSIsl c1Gw== X-Gm-Message-State: AGi0Pub326dGgnsCxIHSzoqrFvcyAiRP/iZBiZhfHoJpjTC8ojMDJ5Fs 6RYRPeopf20OVm+DuWWfcUw+Nq25AWk= X-Google-Smtp-Source: APiQypJRTHk3Jn7AzihOsjSLoMD1/eXNP3+jO99cwcsRlwB7TxfSSOS/bUju04yY5N+d3KUgFOo1Jg== X-Received: by 2002:a17:90b:f16:: with SMTP id br22mr35213498pjb.89.1589387575345; Wed, 13 May 2020 09:32:55 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 06/16] target/arm: Create gen_gvec_{ceq,clt,cle,cgt,cge}0 Date: Wed, 13 May 2020 09:32:35 -0700 Message-Id: <20200513163245.17915-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Macro-ize the 5 nearly identical comparisons. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 16 ++- target/arm/translate-a64.c | 22 ++-- target/arm/translate.c | 254 ++++++++----------------------------- 3 files changed, 74 insertions(+), 218 deletions(-) diff --git a/target/arm/translate.h b/target/arm/translate.h index fa5c3f12b9..e35c812cc5 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -275,11 +275,17 @@ static inline void gen_swstep_exception(DisasContext *s, int isv, int ex) uint64_t vfp_expand_imm(int size, uint8_t imm8); /* Vector operations shared between ARM and AArch64. */ -extern const GVecGen2 ceq0_op[4]; -extern const GVecGen2 clt0_op[4]; -extern const GVecGen2 cgt0_op[4]; -extern const GVecGen2 cle0_op[4]; -extern const GVecGen2 cge0_op[4]; +void gen_gvec_ceq0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_clt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cgt0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 mla_op[4]; extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index d5e77f34a7..fef93dc27a 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -577,14 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 2-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op2(DisasContext *s, bool is_q, int rd, - int rn, const GVecGen2 *gvec_op) -{ - tcg_gen_gvec_2(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -12310,13 +12302,21 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0x8: /* CMGT, CMGE */ - gen_gvec_op2(s, is_q, rd, rn, u ? &cge0_op[size] : &cgt0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cge0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cgt0, size); + } return; case 0x9: /* CMEQ, CMLE */ - gen_gvec_op2(s, is_q, rd, rn, u ? &cle0_op[size] : &ceq0_op[size]); + if (u) { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_cle0, size); + } else { + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_ceq0, size); + } return; case 0xa: /* CMLT */ - gen_gvec_op2(s, is_q, rd, rn, &clt0_op[size]); + gen_gvec_fn2(s, is_q, rd, rn, gen_gvec_clt0, size); return; case 0xb: if (u) { /* ABS, NEG */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 2eec689c5e..010a158e63 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3645,204 +3645,59 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, return 1; } -static void gen_ceq0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_ceq0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_EQ, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_ceq0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_EQ, vece, d, a, zero); - tcg_temp_free_vec(zero); -} +#define GEN_CMP0(NAME, COND) \ + static void gen_##NAME##0_i32(TCGv_i32 d, TCGv_i32 a) \ + { \ + tcg_gen_setcondi_i32(COND, d, a, 0); \ + tcg_gen_neg_i32(d, d); \ + } \ + static void gen_##NAME##0_i64(TCGv_i64 d, TCGv_i64 a) \ + { \ + tcg_gen_setcondi_i64(COND, d, a, 0); \ + tcg_gen_neg_i64(d, d); \ + } \ + static void gen_##NAME##0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) \ + { \ + TCGv_vec zero = tcg_const_zeros_vec_matching(d); \ + tcg_gen_cmp_vec(COND, vece, d, a, zero); \ + tcg_temp_free_vec(zero); \ + } \ + void gen_gvec_##NAME##0(unsigned vece, uint32_t d, uint32_t m, \ + uint32_t opr_sz, uint32_t max_sz) \ + { \ + const GVecGen2 op[4] = { \ + { .fno = gen_helper_gvec_##NAME##0_b, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_8 }, \ + { .fno = gen_helper_gvec_##NAME##0_h, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_16 }, \ + { .fni4 = gen_##NAME##0_i32, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .vece = MO_32 }, \ + { .fni8 = gen_##NAME##0_i64, \ + .fniv = gen_##NAME##0_vec, \ + .opt_opc = vecop_list_cmp, \ + .prefer_i64 = TCG_TARGET_REG_BITS == 64, \ + .vece = MO_64 }, \ + }; \ + tcg_gen_gvec_2(d, m, opr_sz, max_sz, &op[vece]); \ + } static const TCGOpcode vecop_list_cmp[] = { INDEX_op_cmp_vec, 0 }; -const GVecGen2 ceq0_op[4] = { - { .fno = gen_helper_gvec_ceq0_b, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_ceq0_h, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_ceq0_i32, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_ceq0_i64, - .fniv = gen_ceq0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +GEN_CMP0(ceq, TCG_COND_EQ) +GEN_CMP0(cle, TCG_COND_LE) +GEN_CMP0(cge, TCG_COND_GE) +GEN_CMP0(clt, TCG_COND_LT) +GEN_CMP0(cgt, TCG_COND_GT) -static void gen_cle0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cle0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cle0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cle0_op[4] = { - { .fno = gen_helper_gvec_cle0_b, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cle0_h, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cle0_i32, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cle0_i64, - .fniv = gen_cle0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cge0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cge0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GE, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cge0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GE, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cge0_op[4] = { - { .fno = gen_helper_gvec_cge0_b, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cge0_h, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cge0_i32, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cge0_i64, - .fniv = gen_cge0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_clt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_clt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_LT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_clt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_LT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 clt0_op[4] = { - { .fno = gen_helper_gvec_clt0_b, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_clt0_h, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_clt0_i32, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_clt0_i64, - .fniv = gen_clt0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; - -static void gen_cgt0_i32(TCGv_i32 d, TCGv_i32 a) -{ - tcg_gen_setcondi_i32(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i32(d, d); -} - -static void gen_cgt0_i64(TCGv_i64 d, TCGv_i64 a) -{ - tcg_gen_setcondi_i64(TCG_COND_GT, d, a, 0); - tcg_gen_neg_i64(d, d); -} - -static void gen_cgt0_vec(unsigned vece, TCGv_vec d, TCGv_vec a) -{ - TCGv_vec zero = tcg_const_zeros_vec_matching(d); - tcg_gen_cmp_vec(TCG_COND_GT, vece, d, a, zero); - tcg_temp_free_vec(zero); -} - -const GVecGen2 cgt0_op[4] = { - { .fno = gen_helper_gvec_cgt0_b, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_8 }, - { .fno = gen_helper_gvec_cgt0_h, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_16 }, - { .fni4 = gen_cgt0_i32, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .vece = MO_32 }, - { .fni8 = gen_cgt0_i64, - .fniv = gen_cgt0_vec, - .opt_opc = vecop_list_cmp, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .vece = MO_64 }, -}; +#undef GEN_CMP0 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -6772,24 +6627,19 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; case NEON_2RM_VCEQ0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &ceq0_op[size]); + gen_gvec_ceq0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cgt0_op[size]); + gen_gvec_cgt0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cle0_op[size]); + gen_gvec_cle0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCGE0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &cge0_op[size]); + gen_gvec_cge0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; case NEON_2RM_VCLT0: - tcg_gen_gvec_2(rd_ofs, rm_ofs, vec_size, - vec_size, &clt0_op[size]); + gen_gvec_clt0(size, rd_ofs, rm_ofs, vec_size, vec_size); break; default: From patchwork Wed May 13 16:32:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289525 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Fzg6MFP6; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgYk6DVrz9sRY for ; Thu, 14 May 2020 02:46:50 +1000 (AEST) Received: from localhost ([::1]:36154 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuWu-0005yT-IQ for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:46:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47984) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJY-0003gz-6W for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:00 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:43320) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJW-000211-Al for qemu-devel@nongnu.org; Wed, 13 May 2020 12:32:59 -0400 Received: by mail-pl1-x644.google.com with SMTP id y9so37077plk.10 for ; Wed, 13 May 2020 09:32:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=R4kMTERo8NxbHW9D1mlQVpVF5MPRlqOT3EITQBFEuRw=; b=Fzg6MFP6APBMA2EWx/zAOcTTEfsR0EujkUH4R4p4i39gYy9kVVeUJtQI08IKpdRzB/ 907ooOFSfth6aFviCWjY1snzIyi8AV/u1LA7aVmULZ0+XuALsLn2iyeUANroQWxFDruD Apsc2pu3EtbxC3Qm/emgdyNcC+vRD8jdlgTktxMHplmLM0nkcf1XpQHpI3ba8wyS5d6U WgVKboG//efR4EtsRvLVH9Nh1yPKjmdILLBy6qxXHufdS8uHacACzaNfL+VymYz1HY5B mDol1rBp7+Vbydw1hvjM260OlBJaIJQxCRD6hLr7ZDlu4YimeufMFc9bD6iATtGbOqvd pbxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=R4kMTERo8NxbHW9D1mlQVpVF5MPRlqOT3EITQBFEuRw=; b=gY5w75FF8MPbnAnvgS0Xim7MtX/vExOUtOBZvnTtTpbG+AYA/qXZSvYbWQBBVW5I6a PuLEBAonOKDO7zbNviisSy3i4UuZEMdDcq0Oje/qCFJ/x1fhwdACKmSupDDCGyOWqzGN zbdM3dtnH2AX6EpzqIelVrvKzs0W8EllkEQ2SiNT8PG9+u78V+DMWJ80PkrFXVM71wwC m2akHk1Zki301QH801jDZeigcEJxZKmGOyXmurSe7FBR2yi13wCG0VVjgXEWITNbWV0f AQQDyOLnVtjonH2cIgp9z4lvW8G1zVSqTq+H0stVRkWIgTmZYMj1CSQygGkB/TNpHnlf EsVQ== X-Gm-Message-State: AOAM533PA+xkFmJS7+/YNJoKHUyIBSIw9s4fuhn5aYzw2k9aGYZxXjsY LfjTigU0954vW8jjYRjqeN6frzO7n3o= X-Google-Smtp-Source: ABdhPJyZC+GnfhPXbnaHj0rKUjVyaGGq6WIM+tF39KVnBbJMmvnJeKHRT3t5czxQW+Fm5Ax6S4v9kQ== X-Received: by 2002:a17:902:9a06:: with SMTP id v6mr37711plp.286.1589387576633; Wed, 13 May 2020 09:32:56 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 07/16] target/arm: Create gen_gvec_{mla,mls} Date: Wed, 13 May 2020 09:32:36 -0700 Message-Id: <20200513163245.17915-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::644; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x644.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 7 +- target/arm/translate-a64.c | 4 +- target/arm/translate-neon.inc.c | 16 +---- target/arm/translate.c | 117 +++++++++++++++++--------------- 4 files changed, 71 insertions(+), 73 deletions(-) diff --git a/target/arm/translate.h b/target/arm/translate.h index e35c812cc5..9354ceba35 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -286,8 +286,11 @@ void gen_gvec_cle0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_cge0(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen3 mla_op[4]; -extern const GVecGen3 mls_op[4]; +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index fef93dc27a..ab9df12e44 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11226,9 +11226,9 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; case 0x12: /* MLA, MLS */ if (u) { - gen_gvec_op3(s, is_q, rd, rn, rm, &mls_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mls, size); } else { - gen_gvec_op3(s, is_q, rd, rn, rm, &mla_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_mla, size); } return; case 0x11: diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c index 50b77b6d71..aefeff498a 100644 --- a/target/arm/translate-neon.inc.c +++ b/target/arm/translate-neon.inc.c @@ -632,6 +632,8 @@ DO_3SAME_NO_SZ_3(VMAX_U, tcg_gen_gvec_umax) DO_3SAME_NO_SZ_3(VMIN_S, tcg_gen_gvec_smin) DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin) DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul) +DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla) +DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls) #define DO_3SAME_CMP(INSN, COND) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ @@ -685,20 +687,6 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a) return do_3same(s, a, gen_VMUL_p_3s); } -#define DO_3SAME_GVEC3_NO_SZ_3(INSN, OPARRAY) \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \ - oprsz, maxsz, &OPARRAY[vece]); \ - } \ - DO_3SAME_NO_SZ_3(INSN, gen_##INSN##_3s) - - -DO_3SAME_GVEC3_NO_SZ_3(VMLA, mla_op) -DO_3SAME_GVEC3_NO_SZ_3(VMLS, mls_op) - #define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ uint32_t rn_ofs, uint32_t rm_ofs, \ diff --git a/target/arm/translate.c b/target/arm/translate.c index 010a158e63..face89a1f7 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4520,62 +4520,69 @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) /* Note that while NEON does not support VMLA and VMLS as 64-bit ops, * these tables are shared with AArch64 which does support them. */ +void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mla8_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mla16_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mla32_i32, + .fniv = gen_mla_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mla64_i64, + .fniv = gen_mla_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} -static const TCGOpcode vecop_list_mla[] = { - INDEX_op_mul_vec, INDEX_op_add_vec, 0 -}; - -static const TCGOpcode vecop_list_mls[] = { - INDEX_op_mul_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen3 mla_op[4] = { - { .fni4 = gen_mla8_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_8 }, - { .fni4 = gen_mla16_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_16 }, - { .fni4 = gen_mla32_i32, - .fniv = gen_mla_vec, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_32 }, - { .fni8 = gen_mla64_i64, - .fniv = gen_mla_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_mla, - .vece = MO_64 }, -}; - -const GVecGen3 mls_op[4] = { - { .fni4 = gen_mls8_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_8 }, - { .fni4 = gen_mls16_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_16 }, - { .fni4 = gen_mls32_i32, - .fniv = gen_mls_vec, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_32 }, - { .fni8 = gen_mls64_i64, - .fniv = gen_mls_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_mls, - .vece = MO_64 }, -}; +void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_mul_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_mls8_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_mls16_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_mls32_i32, + .fniv = gen_mls_vec, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_mls64_i64, + .fniv = gen_mls_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} /* CMTST : test is "if (X & Y != 0)". */ static void gen_cmtst_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) From patchwork Wed May 13 16:32:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289526 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Xnlyr6De; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgdZ1PZyz9sRf for ; Thu, 14 May 2020 02:50:10 +1000 (AEST) Received: from localhost ([::1]:43572 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYua7-0001Gu-Rm for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:50:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47986) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJZ-0003iw-Us for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:02 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:39531) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJX-00021F-Sn for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:00 -0400 Received: by mail-pf1-x444.google.com with SMTP id b190so2464116pfg.6 for ; Wed, 13 May 2020 09:32:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0XPH9ZnuWODr1aczdQy13UdDY+XzlpSDX0OJjahOFSQ=; b=Xnlyr6DewUk05UaxzDoYmd8KQpFrb81ykp6darc4Lz04gdyaBZMiUOQFKqiu2HOj73 SNYw8y5Z0+hLuqQ8y1+9nqdQvsMaCWEoYnumQqlNOdA+eAI1+u9LvRAJXVnDuOMq5FdY 19Q3RBCmD54scSu1SlviTwUbfj7GEhTqXPLtn/KlvVsTkiCyHY9Y/MmCIY8cgYiky79h NLLHAXGy7eeK2E+KYC9ySIhWeykqQKKNNC8Tn4Bv36nxu/UktkqLzgTLG2tX5n8QFeCf RZYw/CXYtFOHItJV8sp8f6ygEe8ZNmYslLs/B8Tu4A1AdLOm/LihTLDmRwFoOUzWwtoO SMrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0XPH9ZnuWODr1aczdQy13UdDY+XzlpSDX0OJjahOFSQ=; b=YZa/znZrZRUVhOD1xFmU4KkbCsecxtDY9D4Uv9818sdaN+ALGQplQpvVj2dLL3NEcq w89y2ORo+oD+v0y7zwDoo40AwBUkSZ1jOv5zJHwiQfkNhZs8kXTy9xFqeNuBc0bmC2w3 bzA6W+2ZIhSFXuwCfGPltgLZyqC0Fiko7kvB+kkUMAad7knsUD+44V+HVV1dqo3ozlwj HZeV/CPG1okeX0n25oZItaVpXbOqno6sPWM4CePWd3NI2hS5NIzQYroCvR/GBaNIpNPD K7rexdHZd/+CSwoOYCNbl40U/S+fNDqWCElzcfxV1bt0zaJhhj1EIpaJB2u1SRZ5hACN 4eFA== X-Gm-Message-State: AOAM5329NRIqUehchxwnWGIAnN4ugypqqr+f9GQADT8JHRbPrNx00PE7 1Umu/2GvBpMrkAlIEbu8I1icfujU+Fg= X-Google-Smtp-Source: ABdhPJzH97qwtARlIRbXKfugiwh1CLjXp5v1/krNL7vZc1PXELbCel70jzzV7xSG3q0g4BX57rtX4g== X-Received: by 2002:a63:6a82:: with SMTP id f124mr152238pgc.378.1589387577859; Wed, 13 May 2020 09:32:57 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 08/16] target/arm: Swap argument order for VSHL during decode Date: Wed, 13 May 2020 09:32:37 -0700 Message-Id: <20200513163245.17915-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::444; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x444.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Rather than perform the argument swap during code generation, perform it during decode. This means it doesn't have to be special cased later, and we can share code with aarch64 code generation. Hopefully the decode comment addresses any confusion that might arise in between. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/neon-dp.decode | 17 +++++++++++++++-- target/arm/translate-neon.inc.c | 3 +-- 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode index ec3a92fe75..593f7fff03 100644 --- a/target/arm/neon-dp.decode +++ b/target/arm/neon-dp.decode @@ -65,8 +65,21 @@ VCGT_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 0 .... @3same VCGE_S_3s 1111 001 0 0 . .. .... .... 0011 . . . 1 .... @3same VCGE_U_3s 1111 001 1 0 . .. .... .... 0011 . . . 1 .... @3same -VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same -VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same +# The _rev suffix indicates that Vn and Vm are reversed. This is +# the case for shifts. In the Arm ARM these insns are documented +# with the Vm and Vn fields in their usual places, but in the +# assembly the operands are listed "backwards", ie in the order +# Dd, Dm, Dn where other insns use Dd, Dn, Dm. For QEMU we choose +# to consider Vm and Vn as being in different fields in the insn, +# which allows us to avoid special-casing shifts in the trans_ +# function code. We would otherwise need to manually swap the operands +# over to call Neon helper functions that are shared with AArch64, +# which does not have this odd reversed-operand situation. +@3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \ + &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp + +VSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 0 .... @3same_rev +VSHL_U_3s 1111 001 1 0 . .. .... .... 0100 . . . 0 .... @3same_rev VMAX_S_3s 1111 001 0 0 . .. .... .... 0110 . . . 0 .... @3same VMAX_U_3s 1111 001 1 0 . .. .... .... 0110 . . . 0 .... @3same diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c index aefeff498a..416302bcc7 100644 --- a/target/arm/translate-neon.inc.c +++ b/target/arm/translate-neon.inc.c @@ -692,8 +692,7 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a) uint32_t rn_ofs, uint32_t rm_ofs, \ uint32_t oprsz, uint32_t maxsz) \ { \ - /* Note the operation is vshl vd,vm,vn */ \ - tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, \ + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \ oprsz, maxsz, &OPARRAY[vece]); \ } \ DO_3SAME(INSN, gen_##INSN##_3s) From patchwork Wed May 13 16:32:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289498 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BJzCsDIC; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgNh5RXJz9sRY for ; Thu, 14 May 2020 02:38:58 +1000 (AEST) Received: from localhost ([::1]:41072 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuPH-00032w-8Z for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:38:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48006) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJc-0003kE-55 for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:04 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:32862) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJa-00021P-8t for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:03 -0400 Received: by mail-pg1-x541.google.com with SMTP id a4so8013607pgc.0 for ; Wed, 13 May 2020 09:33:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=E/r+NCV6DzSIcLUZf6WhXrUleBIA2S8Z0OvC6I1eflc=; b=BJzCsDICEfYtNjBi7wowlbIAGSrGGxWJJbtFa7XhIUoSrMb4Ed+BM+fSjDsmNfr2Gz 2Ly1jj3Ul3HMtJ4YK8T+rUy6R2iQTWh6pdMb3f/WMbs7Q9eN8HTeEfV/ouabAqotlgf+ NWeF4phxW7ev6drzi0EVEa7/GSRygwzW+mc6RuzQ3V8S0ELd46JovP1auE8FQFbYjC/I j3yB9hd1dMSnZM8/0RxZbERUvUw2mnp0It3czqlGioPikOCSKNOLY9fdJlhAwSp8KesP w07n5+nvAsXkKkWqTZUAIzvtdHqz4gWTcaDmGiSsS4yIEH/uPo1O8U0qPUyMYkRQQajk 6OPw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=E/r+NCV6DzSIcLUZf6WhXrUleBIA2S8Z0OvC6I1eflc=; b=sb6dC4RHzpYr9y0hIoGQDH4Ozq+bkrisCLayxU0OJPB1G2xiH7rbB4meh/QvPwYt0e YoG4XGOK6imMOQdVKJal6NjVWCKmU2M+joY8gcQFgVeQwMizsKyEPH0Zj958eApK9uA3 AWbBxGTJzZaKCDERt6bLVSppnzDLDvYKwqZ159IbxiMnrdzo1R2eQiYQg3800mAU2fzb 4TbAYDy8W9Mooht5QiYAdqXBxLslgT2QyddDMFycxh+zEu3IvPfU0ctbxwuCOw/AfO6L dWWkZ02g3eT3zxul2O+pjI7C3L+HN+pJ5WRNEfwlBtI3KLljuN96iV1vSihm5xnL8p1n LPTQ== X-Gm-Message-State: AOAM531TtFteZoJUaYyE7k1EYfXE1/DwNkcEarugDOimSNqT1b6Z7LpT r/bVYcyOJn7pZCjjzppa5A+IeWKqrv8= X-Google-Smtp-Source: ABdhPJwc61Rvk69f4ogoc+Oj3d3ZAAogy7hAqmHz2AISpLHxmzF0NiSvBmTcNEd8SyIKb2cvIFOJTw== X-Received: by 2002:a65:5287:: with SMTP id y7mr174306pgp.86.1589387578999; Wed, 13 May 2020 09:32:58 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 09/16] target/arm: Create gen_gvec_{cmtst,ushl,sshl} Date: Wed, 13 May 2020 09:32:38 -0700 Message-Id: <20200513163245.17915-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::541; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x541.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 10 ++- target/arm/translate-a64.c | 18 ++-- target/arm/translate-neon.inc.c | 23 +---- target/arm/translate.c | 146 +++++++++++++++++--------------- 4 files changed, 95 insertions(+), 102 deletions(-) diff --git a/target/arm/translate.h b/target/arm/translate.h index 9354ceba35..a02a54cabf 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -291,9 +291,13 @@ void gen_gvec_mla(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_mls(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen3 cmtst_op[4]; -extern const GVecGen3 sshl_op[4]; -extern const GVecGen3 ushl_op[4]; +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index ab9df12e44..3956c19ed8 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -577,15 +577,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ -static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, - int rn, int rm, const GVecGen3 *gvec_op) -{ - tcg_gen_gvec_3(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), is_q ? 16 : 8, - vec_full_reg_size(s), gvec_op); -} - /* Expand a 3-operand operation using an out-of-line helper. */ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, int rn, int rm, int data, gen_helper_gvec_3 *fn) @@ -11193,8 +11184,11 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) (u ? uqsub_op : sqsub_op) + size); return; case 0x08: /* SSHL, USHL */ - gen_gvec_op3(s, is_q, rd, rn, rm, - u ? &ushl_op[size] : &sshl_op[size]); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_ushl, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sshl, size); + } return; case 0x0c: /* SMAX, UMAX */ if (u) { @@ -11233,7 +11227,7 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) return; case 0x11: if (!u) { /* CMTST */ - gen_gvec_op3(s, is_q, rd, rn, rm, &cmtst_op[size]); + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_cmtst, size); return; } /* else CMEQ */ diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c index 416302bcc7..e16475c212 100644 --- a/target/arm/translate-neon.inc.c +++ b/target/arm/translate-neon.inc.c @@ -603,6 +603,8 @@ DO_3SAME(VBIC, tcg_gen_gvec_andc) DO_3SAME(VORR, tcg_gen_gvec_or) DO_3SAME(VORN, tcg_gen_gvec_orc) DO_3SAME(VEOR, tcg_gen_gvec_xor) +DO_3SAME(VSHL_S, gen_gvec_sshl) +DO_3SAME(VSHL_U, gen_gvec_ushl) /* These insns are all gvec_bitsel but with the inputs in various orders. */ #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \ @@ -634,6 +636,7 @@ DO_3SAME_NO_SZ_3(VMIN_U, tcg_gen_gvec_umin) DO_3SAME_NO_SZ_3(VMUL, tcg_gen_gvec_mul) DO_3SAME_NO_SZ_3(VMLA, gen_gvec_mla) DO_3SAME_NO_SZ_3(VMLS, gen_gvec_mls) +DO_3SAME_NO_SZ_3(VTST, gen_gvec_cmtst) #define DO_3SAME_CMP(INSN, COND) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ @@ -650,13 +653,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE) DO_3SAME_CMP(VCGE_U, TCG_COND_GEU) DO_3SAME_CMP(VCEQ, TCG_COND_EQ) -static void gen_VTST_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, - uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz) -{ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &cmtst_op[vece]); -} -DO_3SAME_NO_SZ_3(VTST, gen_VTST_3s) - #define DO_3SAME_GVEC4(INSN, OPARRAY) \ static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ uint32_t rn_ofs, uint32_t rm_ofs, \ @@ -686,16 +682,3 @@ static bool trans_VMUL_p_3s(DisasContext *s, arg_3same *a) } return do_3same(s, a, gen_VMUL_p_3s); } - -#define DO_3SAME_GVEC3_SHIFT(INSN, OPARRAY) \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \ - oprsz, maxsz, &OPARRAY[vece]); \ - } \ - DO_3SAME(INSN, gen_##INSN##_3s) - -DO_3SAME_GVEC3_SHIFT(VSHL_S, sshl_op) -DO_3SAME_GVEC3_SHIFT(VSHL_U, ushl_op) diff --git a/target/arm/translate.c b/target/arm/translate.c index face89a1f7..df91ff73e3 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4606,27 +4606,31 @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); } -static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 }; - -const GVecGen3 cmtst_op[4] = { - { .fni4 = gen_helper_neon_tst_u8, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_8 }, - { .fni4 = gen_helper_neon_tst_u16, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_16 }, - { .fni4 = gen_cmtst_i32, - .fniv = gen_cmtst_vec, - .opt_opc = vecop_list_cmtst, - .vece = MO_32 }, - { .fni8 = gen_cmtst_i64, - .fniv = gen_cmtst_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_cmtst, - .vece = MO_64 }, -}; +void gen_gvec_cmtst(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_cmp_vec, 0 }; + static const GVecGen3 ops[4] = { + { .fni4 = gen_helper_neon_tst_u8, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni4 = gen_helper_neon_tst_u16, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_cmtst_i32, + .fniv = gen_cmtst_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_cmtst_i64, + .fniv = gen_cmtst_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} void gen_ushl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) { @@ -4744,29 +4748,33 @@ static void gen_ushl_vec(unsigned vece, TCGv_vec dst, tcg_temp_free_vec(rsh); } -static const TCGOpcode ushl_list[] = { - INDEX_op_neg_vec, INDEX_op_shlv_vec, - INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 -}; - -const GVecGen3 ushl_op[4] = { - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_b, - .opt_opc = ushl_list, - .vece = MO_8 }, - { .fniv = gen_ushl_vec, - .fno = gen_helper_gvec_ushl_h, - .opt_opc = ushl_list, - .vece = MO_16 }, - { .fni4 = gen_ushl_i32, - .fniv = gen_ushl_vec, - .opt_opc = ushl_list, - .vece = MO_32 }, - { .fni8 = gen_ushl_i64, - .fniv = gen_ushl_vec, - .opt_opc = ushl_list, - .vece = MO_64 }, -}; +void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_shlv_vec, + INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ushl_i32, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ushl_i64, + .fniv = gen_ushl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} void gen_sshl_i32(TCGv_i32 dst, TCGv_i32 src, TCGv_i32 shift) { @@ -4878,29 +4886,33 @@ static void gen_sshl_vec(unsigned vece, TCGv_vec dst, tcg_temp_free_vec(tmp); } -static const TCGOpcode sshl_list[] = { - INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, - INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 -}; - -const GVecGen3 sshl_op[4] = { - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_b, - .opt_opc = sshl_list, - .vece = MO_8 }, - { .fniv = gen_sshl_vec, - .fno = gen_helper_gvec_sshl_h, - .opt_opc = sshl_list, - .vece = MO_16 }, - { .fni4 = gen_sshl_i32, - .fniv = gen_sshl_vec, - .opt_opc = sshl_list, - .vece = MO_32 }, - { .fni8 = gen_sshl_i64, - .fniv = gen_sshl_vec, - .opt_opc = sshl_list, - .vece = MO_64 }, -}; +void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, + INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sshl_i32, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sshl_i64, + .fniv = gen_sshl_vec, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) From patchwork Wed May 13 16:32:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289505 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Cwft4MyW; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgS41hwMz9sSf for ; Thu, 14 May 2020 02:41:56 +1000 (AEST) Received: from localhost ([::1]:49702 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuS9-0008FM-Ql for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:41:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48024) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJf-0003lX-4c for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:07 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:40274) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJa-00022f-9Y for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:06 -0400 Received: by mail-pj1-x1042.google.com with SMTP id fu13so11215888pjb.5 for ; Wed, 13 May 2020 09:33:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=skj7edYF/xFborw3YWQ6+XILqBi0CxcV8pcPVCKN/pk=; b=Cwft4MyWGS5EkJ6akZStYYsNvTqVL1GFNIBdbKsSmsMh4qFa8blzv6nwcXJhAneuW8 hK7aNYUaR/YEp/jwS3SoRMxxxtZeYA+eBmWTFqRNX09w7w96fP4PnUVEfhx5jsOckSmb 5xnk8homEU7F3pdW0NUojX4/pTgVRKHXGWvuTQ0QuUJK21BbMkrxQsk6zg+8qEMIDvSu 3qrLsnN0zABzstxR/L6dLui2CRX7Kk8VswrrC5mNjPYKuVFa2hDjbbjKokcVRxyaYp88 2jgsV4APba0OvbvFSJSPEd1ItTGG560JPNEDdNLSPfZaMcirnxzGzWYGyPKzBrV2ude9 D/cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=skj7edYF/xFborw3YWQ6+XILqBi0CxcV8pcPVCKN/pk=; b=cjDTrglOfK7qgeUXA0vAdZU+8rGOmPT7aamhXy5z+VUIUsD4kZQGxOKjOOBBxP58SG ZklWldYelimo4nQlgYvDDvmA6hKMeVD9macdD7TMlPaK9PCU9Q9NJPI9J99HF4h4ODiT N4sJjZC3fVk8sXxogF7O4o0csfrQrbAAODV4DFVYB0X7jCag3q/zbkd+8ibiVVeqsyku NFxeQ7hiPtEiZHGNVCUkngMQhxlXT+INx63ae3lPy1B5xP/eE3crQrIUqVZKsrNbrezz lwvBNMLQrjFzWuadSbOrBIEhQwKP10wVBHVQLMBDellkvl8u+kFAKJsmLzBwLMdvTGED y9mw== X-Gm-Message-State: AGi0PubSeESgLMADR2xVjY7o5diQPkiLRyOUU1ZPD4+mb2/Aoez6t1nv YkCi/fuk7Hm2Og7P7ewG+eIUHBqV4qU= X-Google-Smtp-Source: APiQypJ12JdK3d4EeKRujgP215icejQCT5a5Xwi8IoZ03Z5FwIbq1u731mF6FQKQ2GEAKMVnMBqPmA== X-Received: by 2002:a17:90a:1b67:: with SMTP id q94mr34892204pjq.84.1589387580340; Wed, 13 May 2020 09:33:00 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.32.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:32:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 10/16] target/arm: Create gen_gvec_{uqadd, sqadd, uqsub, sqsub} Date: Wed, 13 May 2020 09:32:39 -0700 Message-Id: <20200513163245.17915-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 13 +- target/arm/translate-a64.c | 22 ++- target/arm/translate-neon.inc.c | 19 +-- target/arm/translate.c | 228 +++++++++++++++++--------------- 4 files changed, 147 insertions(+), 135 deletions(-) diff --git a/target/arm/translate.h b/target/arm/translate.h index a02a54cabf..4e1778c5e0 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -298,16 +298,21 @@ void gen_gvec_sshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_ushl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); -extern const GVecGen4 uqadd_op[4]; -extern const GVecGen4 sqadd_op[4]; -extern const GVecGen4 uqsub_op[4]; -extern const GVecGen4 sqsub_op[4]; void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + void gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); void gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 3956c19ed8..ea5f6ceadc 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11168,20 +11168,18 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) switch (opcode) { case 0x01: /* SQADD, UQADD */ - tcg_gen_gvec_4(vec_full_reg_offset(s, rd), - offsetof(CPUARMState, vfp.qc), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - is_q ? 16 : 8, vec_full_reg_size(s), - (u ? uqadd_op : sqadd_op) + size); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqadd_qc, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqadd_qc, size); + } return; case 0x05: /* SQSUB, UQSUB */ - tcg_gen_gvec_4(vec_full_reg_offset(s, rd), - offsetof(CPUARMState, vfp.qc), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), - is_q ? 16 : 8, vec_full_reg_size(s), - (u ? uqsub_op : sqsub_op) + size); + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uqsub_qc, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqsub_qc, size); + } return; case 0x08: /* SSHL, USHL */ if (u) { diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c index e16475c212..099491b16f 100644 --- a/target/arm/translate-neon.inc.c +++ b/target/arm/translate-neon.inc.c @@ -605,6 +605,10 @@ DO_3SAME(VORN, tcg_gen_gvec_orc) DO_3SAME(VEOR, tcg_gen_gvec_xor) DO_3SAME(VSHL_S, gen_gvec_sshl) DO_3SAME(VSHL_U, gen_gvec_ushl) +DO_3SAME(VQADD_S, gen_gvec_sqadd_qc) +DO_3SAME(VQADD_U, gen_gvec_uqadd_qc) +DO_3SAME(VQSUB_S, gen_gvec_sqsub_qc) +DO_3SAME(VQSUB_U, gen_gvec_uqsub_qc) /* These insns are all gvec_bitsel but with the inputs in various orders. */ #define DO_3SAME_BITSEL(INSN, O1, O2, O3) \ @@ -653,21 +657,6 @@ DO_3SAME_CMP(VCGE_S, TCG_COND_GE) DO_3SAME_CMP(VCGE_U, TCG_COND_GEU) DO_3SAME_CMP(VCEQ, TCG_COND_EQ) -#define DO_3SAME_GVEC4(INSN, OPARRAY) \ - static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \ - uint32_t rn_ofs, uint32_t rm_ofs, \ - uint32_t oprsz, uint32_t maxsz) \ - { \ - tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), \ - rn_ofs, rm_ofs, oprsz, maxsz, &OPARRAY[vece]); \ - } \ - DO_3SAME(INSN, gen_##INSN##_3s) - -DO_3SAME_GVEC4(VQADD_S, sqadd_op) -DO_3SAME_GVEC4(VQADD_U, uqadd_op) -DO_3SAME_GVEC4(VQSUB_S, sqsub_op) -DO_3SAME_GVEC4(VQSUB_U, uqsub_op) - static void gen_VMUL_p_3s(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz) { diff --git a/target/arm/translate.c b/target/arm/translate.c index df91ff73e3..7eb30cde60 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4925,32 +4925,37 @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_uqadd[] = { - INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 -}; - -const GVecGen4 uqadd_op[4] = { - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_b, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_8 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_h, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_16 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_s, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_32 }, - { .fniv = gen_uqadd_vec, - .fno = gen_helper_gvec_uqadd_d, - .write_aofs = true, - .opt_opc = vecop_list_uqadd, - .vece = MO_64 }, -}; +void gen_gvec_uqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_b, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_h, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_s, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fniv = gen_uqadd_vec, + .fno = gen_helper_gvec_uqadd_d, + .write_aofs = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -4963,32 +4968,37 @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_sqadd[] = { - INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 -}; - -const GVecGen4 sqadd_op[4] = { - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_b, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_h, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_s, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqadd_vec, - .fno = gen_helper_gvec_sqadd_d, - .opt_opc = vecop_list_sqadd, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_sqadd_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqadd_vec, + .fno = gen_helper_gvec_sqadd_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5001,32 +5011,37 @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_uqsub[] = { - INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen4 uqsub_op[4] = { - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_b, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_h, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_s, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_uqsub_vec, - .fno = gen_helper_gvec_uqsub_d, - .opt_opc = vecop_list_uqsub, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_uqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_uqsub_vec, + .fno = gen_helper_gvec_uqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) @@ -5039,32 +5054,37 @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } -static const TCGOpcode vecop_list_sqsub[] = { - INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 -}; - -const GVecGen4 sqsub_op[4] = { - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_b, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_8 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_h, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_16 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_s, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_32 }, - { .fniv = gen_sqsub_vec, - .fno = gen_helper_gvec_sqsub_d, - .opt_opc = vecop_list_sqsub, - .write_aofs = true, - .vece = MO_64 }, -}; +void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 + }; + static const GVecGen4 ops[4] = { + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_b, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_8 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_h, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_16 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_s, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_32 }, + { .fniv = gen_sqsub_vec, + .fno = gen_helper_gvec_sqsub_d, + .opt_opc = vecop_list, + .write_aofs = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_4(rd_ofs, offsetof(CPUARMState, vfp.qc), + rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. From patchwork Wed May 13 16:32:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289486 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=NkNaAgio; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgJF0LrCz9sRf for ; Thu, 14 May 2020 02:35:09 +1000 (AEST) Received: from localhost ([::1]:56204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuLa-0004mP-L6 for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:35:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48012) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJd-0003kz-H4 for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:05 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:39570) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJb-00022u-Pz for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:05 -0400 Received: by mail-pj1-x1033.google.com with SMTP id e6so11215716pjt.4 for ; Wed, 13 May 2020 09:33:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NLb4rJjMwiMsLgfx7HOV1muKPW0Fe0mWL0pQOnLqixs=; b=NkNaAgio82eDYKGidh/r6DIupdurwkx5r0sGeI4ArSShWcmRGwUn+Dwggugz0zm3ll Cd/OwvEARRuDLGIk+ECRooxJdqYp773SvmSf8IJm+7STjs7EE6D8NDDIRg/7RUBNfe5q MVz+rdMyRwyh1UU0uzqEKDphrFRw0TGFpaH2PxFoAj3gsTTspLPthpr2Jj9dAsowdHuK dTE1miExlgD0gI1nJaE2Of2qk6K1zUAl8ikiaFEcqWiBciLTlYLj2JxfxyDwChSThirX y/QwkSu5s5DXW2wcC5TgDYL+pDapC3+mz5LKryzTAmnYDOUxnguqMKK+LpBQIuTHgMc8 Cp1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NLb4rJjMwiMsLgfx7HOV1muKPW0Fe0mWL0pQOnLqixs=; b=sgveRowXBzsGP17IMN6JZ52qkv/vhNR2okY4TNmnrZWp+HjfLwAHEaFP9FqModiAU3 TMSwzQzNjjovy+KyPatCyw9ithU6VrbaZTB4tInG+oCi4hPlWTVJgOb7DmzNbgtpxngF CU9yJy976HqZlOzw51OFSYoF9F92IAcp7Lz02FqyvDkNtxRLbs3CQLp4wlzlHggKil1a CCCKlBcFRKxBEWXWUEQdzFwhEKMmXaCmwyDklkxPL2mBpMqen5HGFsOLi+C2SCu9fEgW WewH6zwfUp4UnEqMbF4ltqgANtH7q8A0fUmlQhZPIxGXmlExMpSB5+BXa1gIilgp/gs0 eiWw== X-Gm-Message-State: AGi0PuZZdOqXgf8f3XTawrSkLVtjzahzolOLJs+PZ9QaPny9n3fXtKyN AzHa6fB9hP3NPHhFBqWlj1myZ+OqZnM= X-Google-Smtp-Source: APiQypJkNDOizqmwiiXR0NMaOU6D5KVd7kNjfZ+TT7CIJcUcFT4trR0Xli1AnoU4Bem3L2YeO+X/2Q== X-Received: by 2002:a17:90a:154e:: with SMTP id y14mr36596197pja.180.1589387581810; Wed, 13 May 2020 09:33:01 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 11/16] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32 Date: Wed, 13 May 2020 09:32:40 -0700 Message-Id: <20200513163245.17915-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1033; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1033.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" These operations do not touch fp_status. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/translate-a64.c | 5 ++--- target/arm/translate.c | 12 ++---------- target/arm/vfp_helper.c | 5 ++--- 4 files changed, 8 insertions(+), 18 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 33c76192d2..aed3050965 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -211,8 +211,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr) -DEF_HELPER_2(recpe_u32, i32, i32, ptr) -DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr) +DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32) +DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32) DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32) DEF_HELPER_3(shl_cc, i32, env, i32, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index ea5f6ceadc..367fa403ae 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -9699,7 +9699,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode, switch (opcode) { case 0x3c: /* URECPE */ - gen_helper_recpe_u32(tcg_res, tcg_op, fpst); + gen_helper_recpe_u32(tcg_res, tcg_op); break; case 0x3d: /* FRECPE */ gen_helper_recpe_f32(tcg_res, tcg_op, fpst); @@ -12244,7 +12244,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - need_fpstatus = true; break; case 0x1e: /* FRINT32Z */ case 0x1f: /* FRINT64Z */ @@ -12412,7 +12411,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus); break; case 0x7c: /* URSQRTE */ - gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus); + gen_helper_rsqrte_u32(tcg_res, tcg_op); break; case 0x1e: /* FRINT32Z */ case 0x5e: /* FRINT32X */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 7eb30cde60..391a09b439 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -6875,19 +6875,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } case NEON_2RM_VRECPE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_recpe_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_recpe_u32(tmp, tmp); break; - } case NEON_2RM_VRSQRTE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_rsqrte_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_rsqrte_u32(tmp, tmp); break; - } case NEON_2RM_VRECPE_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 930d6e747f..ec007fce25 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1023,9 +1023,8 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp) return make_float64(val); } -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(recpe_u32)(uint32_t a) { - /* float_status *s = fpstp; */ int input, estimate; if ((a & 0x80000000) == 0) { @@ -1038,7 +1037,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) return deposit32(0, (32 - 9), 9, estimate); } -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(rsqrte_u32)(uint32_t a) { int estimate; From patchwork Wed May 13 16:32:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289487 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ZK4yb9/3; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgJM3T2pz9sSF for ; Thu, 14 May 2020 02:35:15 +1000 (AEST) Received: from localhost ([::1]:56612 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuLg-0004wd-US for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:35:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48014) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJe-0003lG-6L for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:06 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:35600) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJd-00023I-7Y for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:05 -0400 Received: by mail-pg1-x542.google.com with SMTP id t11so8012350pgg.2 for ; Wed, 13 May 2020 09:33:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WJvxSydD4AqaqoBlC1YovV+jM2AoB6yDoOP5tm4PS34=; b=ZK4yb9/3XDesj4y48AAkEGNckVi++SwSZOlwBKfHicR/LBfqsMtV/bWcvlWqBpOAcz o0Gy5AhGeEg/s6qngAgNaHwTtE94KoSCcljM3Npx0fVErYnktwPK/dRWDRaSG+uSbOgs 2mg6N7MSn6v18dRbggWNA3b/9v5KYZpOv7pDxwE5NN7ZFBQfA/u8OUAZUBInp4ZTD4jx gkAcvPOWkXpgbzaMuB2uh+c5IVLsBk/K+VJIor5q3jdeo+Hs2yprVIfwiHRRl1y2x/R+ bF1A5FZbtuekjZtiNk/X2Ey/e8Zpb0QtOsCtSG3hF8qEDdW7zdZZIDMd3MCCfaFF39+C Q+Wg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WJvxSydD4AqaqoBlC1YovV+jM2AoB6yDoOP5tm4PS34=; b=PM25Ov6NDCEunVfS8ZgOfBtFKJYxI2xjde1mtUf84FQVgI1WlXVRbrmYpzP6IZ12iN 4OPP3ywkI9wC5SQi4OvQXc9Fui4jFRc8ECbnGK0s4LpoM38C9pM1HiGFR0kXZ6Upj44l qOLlWnoDiYdb9GmUCSUDNyqtLSLWDDiCTUINSgE8W63pxvwNBsh6AP7+DEjPBoAoshn8 dNUKHxWFAdHvAJmasC79LP9vt94+hJQmVLUIChRiQLFtNoCZs1R5hdjU7bHrVian5tRE dpgwfO+PGY/WOVs06FSDllEhv5Oosrtm/pb5P7+nlQWEhJk/pjl6YHjD5pldp/DRShFs uHeA== X-Gm-Message-State: AOAM532vQ0I18Qs7rVoXql6pbTyYL0ZjO+fGfqYJ0b1dz9N/Of76wLZ+ M2V2GDS2W+iJ1EegNQDh754W7RKaxUg= X-Google-Smtp-Source: ABdhPJx1Co3O/WzgBcrqUmtCZEtkXyHKoEItNaV0crxD3+Ex+zKpimkiGJkxkRHbXq6jgylJtWXlzA== X-Received: by 2002:aa7:9575:: with SMTP id x21mr148296pfq.324.1589387583236; Wed, 13 May 2020 09:33:03 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 12/16] target/arm: Create gen_gvec_{qrdmla,qrdmls} Date: Wed, 13 May 2020 09:32:41 -0700 Message-Id: <20200513163245.17915-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Provide a functional interface for the vector expansion. This fits better with the existing set of helpers that we provide for other operations. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.h | 5 ++++ target/arm/translate-a64.c | 34 ++---------------------- target/arm/translate.c | 54 +++++++++++++++++++------------------- 3 files changed, 34 insertions(+), 59 deletions(-) diff --git a/target/arm/translate.h b/target/arm/translate.h index 4e1778c5e0..aea8a9759d 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -332,6 +332,11 @@ void gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 367fa403ae..4577df3cf4 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -587,18 +587,6 @@ static void gen_gvec_op3_ool(DisasContext *s, bool is_q, int rd, is_q ? 16 : 8, vec_full_reg_size(s), data, fn); } -/* Expand a 3-operand + env pointer operation using - * an out-of-line helper. - */ -static void gen_gvec_op3_env(DisasContext *s, bool is_q, int rd, - int rn, int rm, gen_helper_gvec_3_ptr *fn) -{ - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), - vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), cpu_env, - is_q ? 16 : 8, vec_full_reg_size(s), 0, fn); -} - /* Expand a 3-operand + fpstatus pointer + simd data value operation using * an out-of-line helper. */ @@ -11693,29 +11681,11 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) switch (opcode) { case 0x0: /* SQRDMLAH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlah_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlah_qc, size); return; case 0x1: /* SQRDMLSH (vector) */ - switch (size) { - case 1: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s16); - break; - case 2: - gen_gvec_op3_env(s, is_q, rd, rn, rm, gen_helper_gvec_qrdmlsh_s32); - break; - default: - g_assert_not_reached(); - } + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sqrdmlsh_qc, size); return; case 0x2: /* SDOT / UDOT */ diff --git a/target/arm/translate.c b/target/arm/translate.c index 391a09b439..39626e0df9 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3629,20 +3629,26 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VCVT_UF] = 0x4, }; - -/* Expand v8.1 simd helper. */ -static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn, - int q, int rd, int rn, int rm) +void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { - if (dc_isar_feature(aa32_rdm, s)) { - int opr_sz = (1 + q) * 8; - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, rd), - vfp_reg_offset(1, rn), - vfp_reg_offset(1, rm), cpu_env, - opr_sz, opr_sz, 0, fn); - return 0; - } - return 1; + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); +} + +void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static gen_helper_gvec_3_ptr * const fns[2] = { + gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 + }; + tcg_debug_assert(vece >= 1 && vece <= 2); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, + opr_sz, max_sz, 0, fns[vece - 1]); } #define GEN_CMP0(NAME, COND) \ @@ -5197,13 +5203,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; /* VPADD */ } /* VQRDMLAH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlah_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlah_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1; @@ -5216,13 +5219,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } /* VQRDMLSH */ - switch (size) { - case 1: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s16, - q, rd, rn, rm); - case 2: - return do_v81_helper(s, gen_helper_gvec_qrdmlsh_s32, - q, rd, rn, rm); + if (dc_isar_feature(aa32_rdm, s) && (size == 1 || size == 2)) { + gen_gvec_sqrdmlsh_qc(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + return 0; } return 1; From patchwork Wed May 13 16:32:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289528 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=XOSOiSkN; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49Mghr4Bg7z9sRf for ; Thu, 14 May 2020 02:53:00 +1000 (AEST) Received: from localhost ([::1]:55274 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYucs-0007Pj-6A for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:52:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48030) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJg-0003m2-23 for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:09 -0400 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]:46225) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJe-00023P-AU for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:07 -0400 Received: by mail-pl1-x62b.google.com with SMTP id b12so30685plz.13 for ; Wed, 13 May 2020 09:33:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oVwVd/Y2egFp31An/r8mYLXImVhx97R6Jw16V2BZktI=; b=XOSOiSkNedW0jdr5QVyeVcqKdDBxdSz/KRpA0mrgmfRd7ex48UBbJYNb71LKTKlrjL +K6lTxZKDpMaE8B5KSR0Oj6elxlLbsdIfnfAmVUZuh+iYYgV/fa3OhifF4hKH8y9THl1 r7Sd6BApEtIeOqO81HRRxAxShZM7ZlKgC99C+35HVQcU0cFOznVHq0AFkS0VtwQaQm9z Lx0zWoukkKxu8KREjYD7gkc9f4wgClWYsx0i1BmzQNBjOta1+Bp62KtkbkMqzSEDTfih sZcGMDAEfHdXbVXDfJkciyE+/+QPUQMXdB81sO6ZDrpmUN14GbACrPSXPMNSSg4g02m4 5ooQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oVwVd/Y2egFp31An/r8mYLXImVhx97R6Jw16V2BZktI=; b=MJag6ehAn431a8tdT5L8BSDYaknMrvQkqfFsPD1QmfmDZnSAVogGvNzkKRQzwvNC6t lHG9Gm9ni7v6kuhLu88xIs1ceIaf+HjXdD2TNDse+wLEJTZeAASdDG3WYHsP87GlJLHm /M3I0D660qK+uKGkoGnPHAkjfRRSUneNKMdZwjTWdAiM7JO728U8f3z+d+PenQ8DABA6 2NpmE4eMMHcYdIjcPi2fYKni9o/tkk6i/R4y2z5nU/MqVO879Mkl7xpwDAnI8DmqRZKZ MBW+lWYdB6qlaHD9/fHBRwNtpmYt4tdnyibHoz2WY1qpJoHnc1OOkPSUfPsob/mC2i4G VeLw== X-Gm-Message-State: AOAM532NiiQwGj0F19GKULxgV+l5js1zfxlIXboGX3OfG6TN73wsQGO6 hyTGHpNe+gYaLX1q8YEVMEN5kUg7mJg= X-Google-Smtp-Source: ABdhPJyFFI3z1cocCPw9tkBZPI/PYjm6/6byxlXhQcFB2sVZQTj2kGMq1KovMFrALwOVZvIPEBSYGQ== X-Received: by 2002:a17:902:6949:: with SMTP id k9mr30318plt.47.1589387584548; Wed, 13 May 2020 09:33:04 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 13/16] target/arm: Pass pointer to qc to qrdmla/qrdmls Date: Wed, 13 May 2020 09:32:42 -0700 Message-Id: <20200513163245.17915-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62b; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62b.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Pass a pointer directly to env->vfp.qc[0], rather than env. This will allow SVE2, which does not modify QC, to pass a pointer to dummy storage. Change the return type of inl_qrdml.h_s16 to match the sense of the operation: signed. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate.c | 18 ++++++++--- target/arm/vec_helper.c | 70 +++++++++++++++++++++++------------------ 2 files changed, 54 insertions(+), 34 deletions(-) diff --git a/target/arm/translate.c b/target/arm/translate.c index 39626e0df9..21529a9b8f 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3629,6 +3629,18 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VCVT_UF] = 0x4, }; +static void gen_gvec_fn3_qc(uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, + uint32_t opr_sz, uint32_t max_sz, + gen_helper_gvec_3_ptr *fn) +{ + TCGv_ptr qc_ptr = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(qc_ptr, cpu_env, offsetof(CPUARMState, vfp.qc)); + tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, qc_ptr, + opr_sz, max_sz, 0, fn); + tcg_temp_free_ptr(qc_ptr); +} + void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) { @@ -3636,8 +3648,7 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, gen_helper_gvec_qrdmlah_s16, gen_helper_gvec_qrdmlah_s32 }; tcg_debug_assert(vece >= 1 && vece <= 2); - tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, - opr_sz, max_sz, 0, fns[vece - 1]); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); } void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, @@ -3647,8 +3658,7 @@ void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, gen_helper_gvec_qrdmlsh_s16, gen_helper_gvec_qrdmlsh_s32 }; tcg_debug_assert(vece >= 1 && vece <= 2); - tcg_gen_gvec_3_ptr(rd_ofs, rn_ofs, rm_ofs, cpu_env, - opr_sz, max_sz, 0, fns[vece - 1]); + gen_gvec_fn3_qc(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, fns[vece - 1]); } #define GEN_CMP0(NAME, COND) \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 096fea67ef..6aa2ca0827 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -36,8 +36,6 @@ #define H4(x) (x) #endif -#define SET_QC() env->vfp.qc[0] = 1 - static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) { uint64_t *d = vd + opr_sz; @@ -49,8 +47,8 @@ static void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) } /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ -static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, - int16_t src2, int16_t src3) +static int16_t inl_qrdmlah_s16(int16_t src1, int16_t src2, + int16_t src3, uint32_t *sat) { /* Simplify: * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16 @@ -60,7 +58,7 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, ret = ((int32_t)src3 << 15) + ret + (1 << 14); ret >>= 15; if (ret != (int16_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? -0x8000 : 0x7fff); } return ret; @@ -69,30 +67,30 @@ static uint16_t inl_qrdmlah_s16(CPUARMState *env, int16_t src1, uint32_t HELPER(neon_qrdmlah_s16)(CPUARMState *env, uint32_t src1, uint32_t src2, uint32_t src3) { - uint16_t e1 = inl_qrdmlah_s16(env, src1, src2, src3); - uint16_t e2 = inl_qrdmlah_s16(env, src1 >> 16, src2 >> 16, src3 >> 16); + uint32_t *sat = &env->vfp.qc[0]; + uint16_t e1 = inl_qrdmlah_s16(src1, src2, src3, sat); + uint16_t e2 = inl_qrdmlah_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat); return deposit32(e1, 16, 16, e2); } void HELPER(gvec_qrdmlah_s16)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int16_t *d = vd; int16_t *n = vn; int16_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 2; ++i) { - d[i] = inl_qrdmlah_s16(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlah_s16(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-subtract high half, 16-bit */ -static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, - int16_t src2, int16_t src3) +static int16_t inl_qrdmlsh_s16(int16_t src1, int16_t src2, + int16_t src3, uint32_t *sat) { /* Similarly, using subtraction: * = ((a3 << 16) - ((e1 * e2) << 1) + (1 << 15)) >> 16 @@ -102,7 +100,7 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, ret = ((int32_t)src3 << 15) - ret + (1 << 14); ret >>= 15; if (ret != (int16_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? -0x8000 : 0x7fff); } return ret; @@ -111,85 +109,97 @@ static uint16_t inl_qrdmlsh_s16(CPUARMState *env, int16_t src1, uint32_t HELPER(neon_qrdmlsh_s16)(CPUARMState *env, uint32_t src1, uint32_t src2, uint32_t src3) { - uint16_t e1 = inl_qrdmlsh_s16(env, src1, src2, src3); - uint16_t e2 = inl_qrdmlsh_s16(env, src1 >> 16, src2 >> 16, src3 >> 16); + uint32_t *sat = &env->vfp.qc[0]; + uint16_t e1 = inl_qrdmlsh_s16(src1, src2, src3, sat); + uint16_t e2 = inl_qrdmlsh_s16(src1 >> 16, src2 >> 16, src3 >> 16, sat); return deposit32(e1, 16, 16, e2); } void HELPER(gvec_qrdmlsh_s16)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int16_t *d = vd; int16_t *n = vn; int16_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 2; ++i) { - d[i] = inl_qrdmlsh_s16(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlsh_s16(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ -uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1, - int32_t src2, int32_t src3) +static int32_t inl_qrdmlah_s32(int32_t src1, int32_t src2, + int32_t src3, uint32_t *sat) { /* Simplify similarly to int_qrdmlah_s16 above. */ int64_t ret = (int64_t)src1 * src2; ret = ((int64_t)src3 << 31) + ret + (1 << 30); ret >>= 31; if (ret != (int32_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? INT32_MIN : INT32_MAX); } return ret; } +uint32_t HELPER(neon_qrdmlah_s32)(CPUARMState *env, int32_t src1, + int32_t src2, int32_t src3) +{ + uint32_t *sat = &env->vfp.qc[0]; + return inl_qrdmlah_s32(src1, src2, src3, sat); +} + void HELPER(gvec_qrdmlah_s32)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int32_t *d = vd; int32_t *n = vn; int32_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 4; ++i) { - d[i] = helper_neon_qrdmlah_s32(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlah_s32(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } /* Signed saturating rounding doubling multiply-subtract high half, 32-bit */ -uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1, - int32_t src2, int32_t src3) +static int32_t inl_qrdmlsh_s32(int32_t src1, int32_t src2, + int32_t src3, uint32_t *sat) { /* Simplify similarly to int_qrdmlsh_s16 above. */ int64_t ret = (int64_t)src1 * src2; ret = ((int64_t)src3 << 31) - ret + (1 << 30); ret >>= 31; if (ret != (int32_t)ret) { - SET_QC(); + *sat = 1; ret = (ret < 0 ? INT32_MIN : INT32_MAX); } return ret; } +uint32_t HELPER(neon_qrdmlsh_s32)(CPUARMState *env, int32_t src1, + int32_t src2, int32_t src3) +{ + uint32_t *sat = &env->vfp.qc[0]; + return inl_qrdmlsh_s32(src1, src2, src3, sat); +} + void HELPER(gvec_qrdmlsh_s32)(void *vd, void *vn, void *vm, - void *ve, uint32_t desc) + void *vq, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); int32_t *d = vd; int32_t *n = vn; int32_t *m = vm; - CPUARMState *env = ve; uintptr_t i; for (i = 0; i < opr_sz / 4; ++i) { - d[i] = helper_neon_qrdmlsh_s32(env, n[i], m[i], d[i]); + d[i] = inl_qrdmlsh_s32(n[i], m[i], d[i], vq); } clear_tail(d, opr_sz, simd_maxsz(desc)); } From patchwork Wed May 13 16:32:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289496 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=fUtoNy5w; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgLv3mMWz9sRY for ; Thu, 14 May 2020 02:37:27 +1000 (AEST) Received: from localhost ([::1]:36506 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuNp-0008PI-5r for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:37:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48038) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJh-0003my-UK for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:11 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:36449) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJf-00023Z-OS for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:09 -0400 Received: by mail-pj1-x1044.google.com with SMTP id q24so11219843pjd.1 for ; Wed, 13 May 2020 09:33:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=zUV0ojLtw3LPZ1kr/+clAulYjRQCrH9O1wPeh4lYV08=; b=fUtoNy5wCQjWdkWWSLSHHPB8aJ5tVXP5PTWRvBowJoIsQ/1vb6a34/SYrwoujjuA5d +IMnBJLmJLatuIDiWYPGZdSGHghDATgrSsgr17VeRQxl+S2NApqx8GpuMsdzpc5G2kXu 1n1g7f4LKzvIMkNaWYfi/N2qibIjUXeW4x/4xzmolM6oAq147UUD49pgKQ8QUz+40wBd hjZwxEGryw0Nj6u5hJeX3ZoixztdjVEUOfBH1fGMWWsVDFq9SrLXKqoAEZdgseKhO8lg 4jezj+myFVla8Ck2Z1O+YWjIe8nSDcys13q5NcT+mZrWxYSAvkOuQq+bRi6Y7DK/xRey jC7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zUV0ojLtw3LPZ1kr/+clAulYjRQCrH9O1wPeh4lYV08=; b=VUMyNcyA2lYgI75lW5onhPWf0I7GRqXM6soRs5qwLfl7GT19Ba7vaoTenFzOS33RtY /mP9E4ZfzsQc+fNnzGRwy2jR1wsef0Kdyj0Srg2S2FqE/SxfeuN5mINagxU3UlmMutTB VrGv+dNqBQfoh1aJQuupab5JG+qQ/6qpaWH+6TuobhNcaOLf0ZF4srnkvJoc2SthA3e8 3biaNbM8G4SELviR3JKcXRJWfW1H2Z8Bv7V934OLZs31AlrXN91YJuvxNjAp0ee1Heu8 R7ApOym3e9DCVi4v0VIE88Vft3hFGzfK0ZOqO+5cS8OS6FZj0QgdGeykw4JpdUY3Mh67 D6+A== X-Gm-Message-State: AGi0PubLXuPwVUTEVFGoCeeYGm8OIFaZkRQHw3A9oNEEbRucT2dUMxLJ c0w77TX3/7ECG+nmBekZCDfS/HguViM= X-Google-Smtp-Source: APiQypIrqxN7OI7uq1DqcfXw4hpX0pAyIDb6UCh5c8MBuAoxNhje9iJXt5pFuR7pqRHJECpXbI8Mhg== X-Received: by 2002:a17:90b:3115:: with SMTP id gc21mr33829486pjb.183.1589387585798; Wed, 13 May 2020 09:33:05 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 14/16] target/arm: Clear tail in gvec_fmul_idx_*, gvec_fmla_idx_* Date: Wed, 13 May 2020 09:32:43 -0700 Message-Id: <20200513163245.17915-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-stable@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Must clear the tail for AdvSIMD when SVE is enabled. Fixes: ca40a6e6e39 Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/vec_helper.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 6aa2ca0827..a483841add 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -747,6 +747,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ d[i + j] = TYPE##_mul(n[i + j], mm, stat); \ } \ } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ } DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) @@ -771,6 +772,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \ mm, a[i + j], 0, stat); \ } \ } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ } DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2) From patchwork Wed May 13 16:32:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289521 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=fM4EGvFT; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgWD38hXz9sRY for ; Thu, 14 May 2020 02:44:40 +1000 (AEST) Received: from localhost ([::1]:58912 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuUn-0003c5-Rm for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:44:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48052) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJj-0003nf-Pf for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:11 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:42066) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJh-00023s-IS for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:11 -0400 Received: by mail-pl1-x641.google.com with SMTP id k19so38675pll.9 for ; Wed, 13 May 2020 09:33:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=sMsSJIbayaVcN7fAKLPcCoZeyuYV2bazR+WjJ8tzbTs=; b=fM4EGvFTSPXqTYDA3eCrVoULRLcsBmO9xM4iM9/ZWDPOqk+mcs9WqtH0vX5LljQ9pq fi7LHt8yirMEsi9kVwcG6L626Pba4kSoBUIVBOJG9YTgSAgt/Xb+qJahrKT6ONBTW8qF l22YvLHH2bvFDeYxPGgDsxc8mRfioVtUxxv04xDCyyAwXfFS/6rFlNSKEylLrQ4hc1i4 WBgifZFXt99bB5WCHz+1813PXALfz8NPyLde5bs4un3cbVRPYAhLD6WU4fLDX4Xren/5 sLpO8Rqb7Lk80syXqy+eQpSbV9M4SzlhvvS4NtNheyCXb0eIPrqhuYBxOK/hkbo4HFwU b5Aw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sMsSJIbayaVcN7fAKLPcCoZeyuYV2bazR+WjJ8tzbTs=; b=dbFj/tSbvXpE+izaxbqCUeWw5EEzxrimLWC6fFSWtmzlN6pyKARQOuszqFRd73FQEK DlcuCp28VSCztV5bpOC8PFXOKwX/jZnBl1dpDPRciZYRYx3ZTUZgJoBq4Mdt3jKPkxLK tpAUx/t3vzVx5549zixF8Hr4tFI41CmFKnKUmgRJUiwF2FqfXkNa+zRZo31eV81MTUF1 VFOL8xkibqCX/cZQN5m8nQDs4dTJu9FGYV2KJoIeVNy3BCbveRyb62ufqd0lTjSxpKvx JLYeQMjDZ3nvBadN8oSPtLblIEKgeikc13B+nk3v1AL9QxbSVhSVLOxMRyYDTSYuQFL4 vYbA== X-Gm-Message-State: AGi0PuYjBAQaRoNXMNcjE20E1tDU7MjDJGhQlJJK+oZf6ex98Jk/f6pg jlko3sYYvCHoa+Rb1oLxGNbyCHDrXtY= X-Google-Smtp-Source: APiQypJcgsfyNPYt2rH8tM4Sb3rfBU2T3j1mmUS5sp0l9EompnFhTJpMNkrE9g+BECOoDx8eoH+mIQ== X-Received: by 2002:a17:90a:930c:: with SMTP id p12mr38892146pjo.64.1589387586951; Wed, 13 May 2020 09:33:06 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 15/16] target/arm: Vectorize SABD/UABD Date: Wed, 13 May 2020 09:32:44 -0700 Message-Id: <20200513163245.17915-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 5 ++ target/arm/translate-a64.c | 8 ++- target/arm/translate.c | 133 ++++++++++++++++++++++++++++++++++++- target/arm/vec_helper.c | 24 +++++++ 5 files changed, 176 insertions(+), 4 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index aed3050965..4678d3a6f4 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -731,6 +731,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index aea8a9759d..bbfe3d1393 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -337,6 +337,11 @@ void gen_gvec_sqrdmlah_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_sqrdmlsh_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 4577df3cf4..54b06553a6 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11190,6 +11190,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size); } return; + case 0xe: /* SABD, UABD */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uabd, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -11322,7 +11329,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xe: /* SABD, UABD */ case 0xf: /* SABA, UABA */ { static NeonGenTwoOpFn * const fns[3][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index 21529a9b8f..d288721c23 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5102,6 +5102,126 @@ void gen_gvec_sqsub_qc(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_smin_vec(vece, t, a, b); + tcg_gen_smax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sabd_i32, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sabd_i64, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_umin_vec(vece, t, a, b); + tcg_gen_umax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_uabd_i32, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_uabd_i64, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5236,6 +5356,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 1; + case NEON_3R_VABD: + if (u) { + gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; + case NEON_3R_VADD_VSUB: case NEON_3R_LOGIC: case NEON_3R_VMAX: @@ -5380,9 +5510,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABD: - GEN_NEON_INTEGER_OP(abd); - break; case NEON_3R_VABA: GEN_NEON_INTEGER_OP(abd); tcg_temp_free_i32(tmp2); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a483841add..a4972d02fc 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1407,3 +1407,27 @@ DO_CMP0(gvec_cgt0_h, int16_t, >) DO_CMP0(gvec_cge0_h, int16_t, >=) #undef DO_CMP0 + +#define DO_ABD(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + \ + for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \ + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + +DO_ABD(gvec_sabd_b, int8_t) +DO_ABD(gvec_sabd_h, int16_t) +DO_ABD(gvec_sabd_s, int32_t) +DO_ABD(gvec_sabd_d, int64_t) + +DO_ABD(gvec_uabd_b, uint8_t) +DO_ABD(gvec_uabd_h, uint16_t) +DO_ABD(gvec_uabd_s, uint32_t) +DO_ABD(gvec_uabd_d, uint64_t) + +#undef DO_ABD From patchwork Wed May 13 16:32:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1289499 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=X2N1z92H; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49MgPt1BByz9sRY for ; Thu, 14 May 2020 02:40:02 +1000 (AEST) Received: from localhost ([::1]:44226 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jYuQJ-0005nv-Nb for incoming@patchwork.ozlabs.org; Wed, 13 May 2020 12:39:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jYuJk-0003oq-EE for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:12 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:51783) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jYuJi-00024N-0K for qemu-devel@nongnu.org; Wed, 13 May 2020 12:33:12 -0400 Received: by mail-pj1-x1041.google.com with SMTP id mq3so11344061pjb.1 for ; Wed, 13 May 2020 09:33:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=hVHTEirUzJ20bAilpSh5nRxHdFXu/fWlu+BQCZdG8wI=; b=X2N1z92Hh3WJ7vtCRPIyz5OW5ZQqXBvZcJuxOr1p03Ok1f3zjOOOhcm3QhtN9pf2n8 sJdYPuG8bPGGfm4MDawILYEODVnGPcyuOjT6Vj7gypDXPbBVCBRexH0A76ro2ttMtSjV NnOqtt/Qrq9LCW51kfQOpixKJrng2rjFZvsoHr3+z+OJabLvpBnlUA0agR66mZaDh+Ad GN9twoZhaL5SblFPpVUBznhhkvDVjJ8H3NYsfCKjaXct+Ya5yR6sZwAal5ECsuVNAYLq jHfEsJCYmTqMI8VTXzKJ7tO8kX00FhOPjzPtRDsHzGRZfAA7IAyIgFD3z5rOzoatNQhG 6Ibg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hVHTEirUzJ20bAilpSh5nRxHdFXu/fWlu+BQCZdG8wI=; b=PaxZLCjG6d9OHOoYXVSzFOBUQ1Qz6pSFJvOBevG1VFcXkXfVR3vypaJE4DHLGPyB3i N6/Sd6HUrc29ZqqApzPIMaWINqUn1IEbC7p46/JyL0Fc9vmo+0t+HADF/y/ZRd/vM6kB 6oBykdeQysKVlD+KYzNbkZHncMRvPk7atNIOKHAxkNux3AnmXzABfsQLYzKP32lqg9OW kAuT+eKpIZSGKV1SCXMgBESPB8bsmWGalKG71VWHf3sZU7kt8/EhCpuEzoinLU7MrtuN uDJtEUwO4zYAia8zTJWMQR/XLtD8z/41WdcqvESFxXFlsXglQJCaHoSdcdy8pr/jLz7k Gteg== X-Gm-Message-State: AOAM531XF7LIlewgEt+B9MVr+71/0TxOfl1VSEMbjUe/XtXA+E7dOWgT xkoSyv0jX+QoHsInsrpMk5ZLcmUQrFY= X-Google-Smtp-Source: ABdhPJzY1mxRiT/M4Nihykdno47qUhipVo75UvqvEVQ2aC6ZYkwwQa0Jyukc5ziXlXadHfEjaYrgXA== X-Received: by 2002:a17:902:8494:: with SMTP id c20mr2897plo.305.1589387588187; Wed, 13 May 2020 09:33:08 -0700 (PDT) Received: from localhost.localdomain (174-21-143-238.tukw.qwest.net. [174.21.143.238]) by smtp.gmail.com with ESMTPSA id b11sm158025pgq.50.2020.05.13.09.33.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 May 2020 09:33:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v4 16/16] target/arm: Vectorize SABA/UABA Date: Wed, 13 May 2020 09:32:45 -0700 Message-Id: <20200513163245.17915-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200513163245.17915-1-richard.henderson@linaro.org> References: <20200513163245.17915-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE2. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 17 +++-- target/arm/translate.h | 5 ++ target/arm/neon_helper.c | 10 --- target/arm/translate-a64.c | 17 ++--- target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 24 +++++++ 6 files changed, 174 insertions(+), 33 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 4678d3a6f4..1857f4ee46 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -284,13 +284,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32) DEF_HELPER_2(neon_pmax_u16, i32, i32, i32) DEF_HELPER_2(neon_pmax_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u8, i32, i32, i32) -DEF_HELPER_2(neon_abd_s8, i32, i32, i32) -DEF_HELPER_2(neon_abd_u16, i32, i32, i32) -DEF_HELPER_2(neon_abd_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u32, i32, i32, i32) -DEF_HELPER_2(neon_abd_s32, i32, i32, i32) - DEF_HELPER_2(neon_shl_u16, i32, i32, i32) DEF_HELPER_2(neon_shl_s16, i32, i32, i32) DEF_HELPER_2(neon_rshl_u8, i32, i32, i32) @@ -741,6 +734,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index bbfe3d1393..c937dfe9bf 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -342,6 +342,11 @@ void gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index 448be93fa1..2ef75e04c8 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -576,16 +576,6 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) \ - dest = (src1 > src2) ? (src1 - src2) : (src2 - src1) -NEON_VOP(abd_s8, neon_s8, 4) -NEON_VOP(abd_u8, neon_u8, 4) -NEON_VOP(abd_s16, neon_s16, 2) -NEON_VOP(abd_u16, neon_u16, 2) -NEON_VOP(abd_s32, neon_s32, 1) -NEON_VOP(abd_u32, neon_u32, 1) -#undef NEON_FN - #define NEON_FN(dest, src1, src2) do { \ int8_t tmp; \ tmp = (int8_t)src2; \ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 54b06553a6..991e451644 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11197,6 +11197,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_sabd, size); } return; + case 0xf: /* SABA, UABA */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_uaba, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, gen_gvec_saba, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -11329,16 +11336,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xf: /* SABA, UABA */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 }, - { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 }, - { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x16: /* SQDMULH, SQRDMULH */ { static NeonGenTwoOpEnvFn * const fns[2][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index d288721c23..e3d37ef2e9 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5222,6 +5222,124 @@ void gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_sabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_sabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_sabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_saba_i32, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_saba_i64, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_uabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_uabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_uabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_uaba_i32, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_uaba_i64, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5366,6 +5484,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } return 0; + case NEON_3R_VABA: + if (u) { + gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; + case NEON_3R_VADD_VSUB: case NEON_3R_LOGIC: case NEON_3R_VMAX: @@ -5510,12 +5638,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABA: - GEN_NEON_INTEGER_OP(abd); - tcg_temp_free_i32(tmp2); - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - break; case NEON_3R_VPMAX: GEN_NEON_INTEGER_OP(pmax); break; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a4972d02fc..fa33df859e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1431,3 +1431,27 @@ DO_ABD(gvec_uabd_s, uint32_t) DO_ABD(gvec_uabd_d, uint64_t) #undef DO_ABD + +#define DO_ABA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + \ + for (i = 0; i < opr_sz / sizeof(TYPE); ++i) { \ + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; \ + } \ + clear_tail(d, opr_sz, simd_maxsz(desc)); \ +} + +DO_ABA(gvec_saba_b, int8_t) +DO_ABA(gvec_saba_h, int16_t) +DO_ABA(gvec_saba_s, int32_t) +DO_ABA(gvec_saba_d, int64_t) + +DO_ABA(gvec_uaba_b, uint8_t) +DO_ABA(gvec_uaba_h, uint16_t) +DO_ABA(gvec_uaba_s, uint32_t) +DO_ABA(gvec_uaba_d, uint64_t) + +#undef DO_ABA