From patchwork Thu Mar 26 23:08:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262427 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=dfz/ky64; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLKB6Wl0z9sSL for ; Fri, 27 Mar 2020 10:09:18 +1100 (AEDT) Received: from localhost ([::1]:34128 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbci-0001wa-Bj for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:09:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57836) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcD-0001qV-Ra for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcC-0001Fq-Cn for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:45 -0400 Received: from mail-pj1-x1032.google.com ([2607:f8b0:4864:20::1032]:33986) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcC-0001FB-76 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:44 -0400 Received: by mail-pj1-x1032.google.com with SMTP id q16so3754626pje.1 for ; Thu, 26 Mar 2020 16:08:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Q1ePjC0PZpEMmpsvj5toCEs0OJPgs4+rpqy4ey1oxvI=; b=dfz/ky64gZMpo1AacGx5BZSiGmb1rPh+HEyTYW/sl196DELP9/0DtxJmjngXWaTQuS IEzrQ+IbeU2CHe3VbtBqK0cyHGrEy3m2zX7h2Eo5v39ATkZ+qaS6mRGQ1vpABpoer4rF Iu23TjKptFwojb+XGYOK+QanCH6Xku/OAqDsv+A5y29wJiZolUXqcXFSRasjvkvaUlrW PjN/k92csmCD96PvfqI2bQXNsKof707hm6z3CKB0CHXCOenx0bOcTCpSmGHZaqv1Mnpw 6sm0P0iqnloYSZKysik6HbSMk29Q6k+mkfczAp2mhfYvhHKI0kOgzHqsJVu0MtDs3w2r C0dg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Q1ePjC0PZpEMmpsvj5toCEs0OJPgs4+rpqy4ey1oxvI=; b=S7rcu4V0QcfoC9gK+GvDVFA3vsmEc/1+wc5kGcRGrxK4IR/lqQ7w4IZuyB/q12o8+i FRAXm6xBWQK6zIQx79ILV6ai9tqGZyG/KPkfMJg7H5A//MSU/77rMRQFYud7U1CWabAe ZqDbjC9zz8akIKpnRxrMOwuL3KpYo3AJEG3KCZwptcbQaIVBvhShOaFVzk/69Ri0iAUb ZDnYMZhsr2EBlqhQhECqiQ9OIFeykfaXdSFmbEFAHB7a/ZW5QhLO27m7L8jWl/jitdXs mFDKY5nPgkvV5SdXuKmZRPKi/iw0Fa1128UTkvlwcYIir1xx5T0RnJxPdG+sCs4kroDU xJ6A== X-Gm-Message-State: ANhLgQ1qmC8CUhQawbPXPugM3oWJFqsxCx3NocJfTVRMP3gIJDdRzaCE /7mVV6YvdnqZnqrefZvmNXZ7iPgT7+A= X-Google-Smtp-Source: ADFU+vtBUdFpLLPUs7pnxrUOyxyyCs9anlCbglYMqonWQvCn/hLz23CXEFjkb1Da6MFkS5/KHMBEiw== X-Received: by 2002:a17:90a:e7c8:: with SMTP id kb8mr2534548pjb.79.1585264122682; Thu, 26 Mar 2020 16:08:42 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:42 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 02/31] target/arm: Implement SVE2 Integer Multiply - Unpredicated Date: Thu, 26 Mar 2020 16:08:09 -0700 Message-Id: <20200326230838.31112-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1032 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" For MUL, we can rely on generic support. For SMULH and UMULH, create some trivial helpers. For PMUL, back in a21bb78e5817, we organized helper_gvec_pmul_b in preparation for this use. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++++ target/arm/sve.decode | 9 ++++ target/arm/translate-sve.c | 51 ++++++++++++++++++++ target/arm/vec_helper.c | 96 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 166 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index d5f1c87192..80bc129763 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -688,6 +688,16 @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr) +DEF_HELPER_FLAGS_4(gvec_smulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4f580a25e7..58e0b808e9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1093,3 +1093,12 @@ ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ @rprr_scatter_store xs=0 esz=3 scale=0 ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \ @rprr_scatter_store xs=1 esz=3 scale=0 + +#### SVE2 Support + +### SVE2 Integer Multiply - Unpredicated + +MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm +SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm +UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm +PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index acf962b6b0..e962f45b32 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5829,3 +5829,54 @@ static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a) } return true; } + +/* + * SVE2 Integer Multiply - Unpredicated + */ + +static bool trans_MUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_vector3_z(s, tcg_gen_gvec_mul, a->esz, a->rd, a->rn, a->rm); +} + +static bool do_sve2_zzz_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fn); + } + return true; +} + +static bool trans_SMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_smulh_b, gen_helper_gvec_smulh_h, + gen_helper_gvec_smulh_s, gen_helper_gvec_smulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_UMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_umulh_b, gen_helper_gvec_umulh_h, + gen_helper_gvec_umulh_s, gen_helper_gvec_umulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8017bd88c4..00dc38c9db 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1257,3 +1257,99 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) } } #endif + +/* + * NxN -> N highpart multiply + * + * TODO: expose this as a generic vector operation. + */ + +void HELPER(gvec_smulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((int64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + muls64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((uint64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + mulu64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Thu Mar 26 23:08:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262435 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=iperahye; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLNC6M7Vz9sSH for ; Fri, 27 Mar 2020 10:11:55 +1100 (AEDT) Received: from localhost ([::1]:34530 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbfE-0006Jr-Pt for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:11:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57862) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcE-0001rE-S0 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcD-0001Ge-Io for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:46 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:51269) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcD-0001GF-DW for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:45 -0400 Received: by mail-pj1-x1043.google.com with SMTP id w9so3092430pjh.1 for ; Thu, 26 Mar 2020 16:08:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ENzLI8qMuJSws+8eLNHdlg92DyxjJFX8uRl+A0psg7s=; b=iperahyeiAw+M62lLJODUFHBwlVgapsnkYDNUt8COvK1fUmm1CJjWCE2Hler7y8WFs thYR4AGnwlDpnCKLOz4W2/jPs87Mc9ZC6qQB6TmI5C9/I8Qzt10HEy5wCcfcrImpN++9 q9/ToVSicS6yJNXRwm7YmfeY1qvAz1dE00t+m10yGiwla4weY2NoLjx6l5mYqNOX2zU1 Z9PFttzHnPu6iFSBdTty5pTpAISQ3TP4y8T0HMxc/IyWH4HluXYgXB7QE/vzVfvzqEXl 59jGq9BWoPSBCrgj9jgfG9GAMPto66F6A8AY6ZugID3QINWKSrYUAzjpAwU9dB1vLayU P8ng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ENzLI8qMuJSws+8eLNHdlg92DyxjJFX8uRl+A0psg7s=; b=hDpUGfwmRjHoujr8IGtOhaDcaLs8AngmziqIsvngUBArUFCPrgGeSj24Yy6bRO6cVW QkbqUtx2FA+Ic30Z0OvWmQMapXNO7+2uCdBLA3KTH8o6nQebJUSw+vglCvrbmNLMeUXr DxU9GtPFdZ9w1H1SUBPXfjhCBj4TEiSwBPyDMJarATkg7Oedkx2QNNY4CacFRHgY7LSN IW8CBPdcNLkelxYHbkZaQFSb93uDaoONsypsq5RFxnGaE7v46RbHY6gQN4rIiAssaadn 6uPq4xLPhtedJYpepsHzBWIVTAVMujE3wR5J1qylwXi/9h4jQuh3gVpfsKscm1Jcm+tm 0fVQ== X-Gm-Message-State: ANhLgQ35+YgrOeYszLcWP0v26YnsSWfiMYYOiZ4WCf5P8WyXVpeDDOGh Hw8wmLNYNOfSgypyEfazF1uLnM7FGZ4= X-Google-Smtp-Source: ADFU+vuwOR8dH7IR43LjZXrH8xVcFp6McjSc/gQl9AbGrkJlptwUBYtTAtJNODjwFPtW0/oivVV6iw== X-Received: by 2002:a17:90a:c385:: with SMTP id h5mr2508231pjt.131.1585264123970; Thu, 26 Mar 2020 16:08:43 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 03/31] target/arm: Implement SVE2 integer pairwise add and accumulate long Date: Thu, 26 Mar 2020 16:08:10 -0700 Message-Id: <20200326230838.31112-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1043 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 44 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 39 +++++++++++++++++++++++++++++++++ 4 files changed, 102 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 11d627981d..854cd97fdf 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -158,6 +158,20 @@ DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 58e0b808e9..6691145854 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1102,3 +1102,8 @@ MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 + +### SVE2 Integer - Predicated + +SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn +UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d40b1994aa..7dc17421e9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -517,6 +517,50 @@ DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR) DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR) DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) +static inline uint16_t do_sadalp_h(uint16_t n, uint16_t m) +{ + int8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_sadalp_s(uint32_t n, uint32_t m) +{ + int16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) +{ + int32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) + +static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) +{ + uint8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_uadalp_s(uint32_t n, uint32_t m) +{ + uint16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) +{ + uint32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e962f45b32..bc8321f7cd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5880,3 +5880,42 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) { return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } + +/* + * SVE2 Integer - Predicated + */ + +static bool do_sve2_zpzz_ool(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_ool(s, a, fn); +} + +static bool trans_SADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_sadalp_zpzz_h, + gen_helper_sve2_sadalp_zpzz_s, + gen_helper_sve2_sadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} + +static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_uadalp_zpzz_h, + gen_helper_sve2_uadalp_zpzz_s, + gen_helper_sve2_uadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} From patchwork Thu Mar 26 23:08:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262434 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=u/edcjH8; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLN75D3Xz9sSH for ; Fri, 27 Mar 2020 10:11:51 +1100 (AEDT) Received: from localhost ([::1]:34524 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbfB-0006G7-LE for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:11:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57892) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcG-0001uE-3J for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcE-0001Hj-SA for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:48 -0400 Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629]:46788) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcE-0001Gx-N2 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:46 -0400 Received: by mail-pl1-x629.google.com with SMTP id s23so2720978plq.13 for ; Thu, 26 Mar 2020 16:08:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=HkqstrAP2B7IxXW64bwy+CKbvFuSsclyXxBa82r0Wh4=; b=u/edcjH8d9Y7t/gbupmzjmDrvCYtvDAHHPG9LbB/qWyLc/VUn0n/9SK6TVTqezET1g Z9aPfYv14+ZrYHuWh2UkRfrtewc0X+z45eY86y7772g9W65A3Mx+IfMg4H8ggbPE+btq LUt6CiuDsRpUANEkoS8AmTUC99DH1ZXH0G1v8/hV6ghpuk3ZLZaVLctQNLTENnGmf/UA Euy0wbCbm8VbRdse5etjgTmi5wmOePIfSMhxxzaodZldDn0gfmrshMViMUTcwL7JZRCh nAHAT8PF3M0ZR4K/dRDex2X/a0ZZH1awmCYcYlBPOylBnVxl5s6jTAuUgdprVhfHFaVq 1v5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=HkqstrAP2B7IxXW64bwy+CKbvFuSsclyXxBa82r0Wh4=; b=hSrlz0PwLfwFEKaPZgKlFZ2KSnSR2K41XnmjO8k2kWLUjjNkyZbpi1QhCNBkxZKO4+ +PqrbtJzq2rITnyttfLdbL1aDfB7XJbiQoc8/LXs8Cy9nc7rGmHHetsSwYjl/d4BiDN7 f7nthJuf1qgTSO5r1PPNYVz6U/C8pN66JX5fiCl2tbNzgL6sGX1KeNs8RnBvVoBzagjq UvRZbcaEFuQMSXmhIl//vH2D2SYk6OHc0YF4rB6+n24wzShdRingqyALzLNQtjElOyej t++4lGsqPtI0PViF0IQud+6popD6avvO41Otcv2FbWOT6931s2zW7RJzq5JlHO9891k+ PbPA== X-Gm-Message-State: ANhLgQ2aukKn49Pu+cslILLwti0XZjDXAONHL7xdOofstAOzbXWeXmvc 8Jgkp2mbsyXzGpWVlw3EU6+rUIoNVtA= X-Google-Smtp-Source: ADFU+vthDshpht7UzEYS7uRBokWn1T8HtqNrEMCM3+K4NdsBfIrYjGlCurUM7Rygqwyz1unpHWOw4A== X-Received: by 2002:a17:902:82c5:: with SMTP id u5mr10729656plz.254.1585264125357; Thu, 26 Mar 2020 16:08:45 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:44 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 04/31] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32 Date: Thu, 26 Mar 2020 16:08:11 -0700 Message-Id: <20200326230838.31112-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::629 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" These operations do not touch fp_status. Signed-off-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/translate-a64.c | 5 ++--- target/arm/translate.c | 12 ++---------- target/arm/vfp_helper.c | 4 ++-- 4 files changed, 8 insertions(+), 17 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 80bc129763..938fdbc362 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -213,8 +213,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr) -DEF_HELPER_2(recpe_u32, i32, i32, ptr) -DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr) +DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32) +DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32) DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32) DEF_HELPER_3(shl_cc, i32, env, i32, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index db41e3d72a..2bcf643069 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10220,7 +10220,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode, switch (opcode) { case 0x3c: /* URECPE */ - gen_helper_recpe_u32(tcg_res, tcg_op, fpst); + gen_helper_recpe_u32(tcg_res, tcg_op); break; case 0x3d: /* FRECPE */ gen_helper_recpe_f32(tcg_res, tcg_op, fpst); @@ -12802,7 +12802,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - need_fpstatus = true; break; case 0x1e: /* FRINT32Z */ case 0x1f: /* FRINT64Z */ @@ -12970,7 +12969,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus); break; case 0x7c: /* URSQRTE */ - gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus); + gen_helper_rsqrte_u32(tcg_res, tcg_op); break; case 0x1e: /* FRINT32Z */ case 0x5e: /* FRINT32X */ diff --git a/target/arm/translate.c b/target/arm/translate.c index b38af6149a..cba84987db 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -6711,19 +6711,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } case NEON_2RM_VRECPE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_recpe_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_recpe_u32(tmp, tmp); break; - } case NEON_2RM_VRSQRTE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_rsqrte_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_rsqrte_u32(tmp, tmp); break; - } case NEON_2RM_VRECPE_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 930d6e747f..a792661166 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1023,7 +1023,7 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp) return make_float64(val); } -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(recpe_u32)(uint32_t a) { /* float_status *s = fpstp; */ int input, estimate; @@ -1038,7 +1038,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) return deposit32(0, (32 - 9), 9, estimate); } -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(rsqrte_u32)(uint32_t a) { int estimate; From patchwork Thu Mar 26 23:08:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262438 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Qz0ewG13; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLQm3RCKz9sSM for ; Fri, 27 Mar 2020 10:14:08 +1100 (AEDT) Received: from localhost ([::1]:34580 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbhO-0001L7-C5 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:14:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:57914) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcI-0001xP-26 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcG-0001IY-3Y for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:49 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:36571) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcF-0001I2-UF for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:48 -0400 Received: by mail-pf1-x443.google.com with SMTP id i13so3553360pfe.3 for ; Thu, 26 Mar 2020 16:08:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1a/74RtZJl65FwDaRqFWJWhWsV/XH0AS+LKjeLcIuFE=; b=Qz0ewG13NTs2zcsq5z35cW116x7yv5X9PevHtCa0PiThUIaXkg053Wb+w3nOSnUdDE 2Ryev4Aqz41K268MlINa9PBxEJHsNKX/yI9RuEfAWnhryv20RvXh6y2Y1vUb2Vx4qOHS pzRJuHJnRN8ov2cqV/IHW0et7KAwaqPZ2d3xENNcHdI/1S8r8RzzcfG3xeMgmcF3yyvc H9OtAO+BC/P48eQHCejTOG4mK/gBXcgMLWtVX7+6d8jzpvRhoz1KE4DGnpI5zwx1htBW h0kA7W5FfmizD6LIFASo3usTpKDwCdZwgq2Y+8fUyg/+HZGPY+8YBuW/ysgQEIYo8gNO h1+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1a/74RtZJl65FwDaRqFWJWhWsV/XH0AS+LKjeLcIuFE=; b=j8Nkqzh3I1cMpFroI9uM6kqwXc1xLJDXNgUmswFEyWE2oxxZ2zo72qWCMA2Fl74b2Q V41Uo3kC0PdsJ0Dt7iWmd2vfO+wN5FCZkmPjvmz8KFbT6yJo/3H53GAfzrIhuH3L+Ctm C43wmNy6EMbccb4ORgKpSkl/y6kDudBhEM9E0gDsAA6RxTK7LWIw1VFrrz1qxO3vKAg1 GYhIZhKMfUEoKlN0h6YGNr634KvXh7GEsY/XYBGHRcrDH5Hbh9Z8+N4DYgGkh+pERPMX vtnwezwDa2xmf0uaUgjwfzn8riY5Wc0vqQVAkyd8gXKIUv47kgv3wlUatHibglr0xFVi 4PxQ== X-Gm-Message-State: ANhLgQ24T0X7qna+LRsHYDHIew8VvFdTtZEo11fYsS98Isociif3z/7p WesyAT8qn996aJJMJbnKdb3JsvoTtTg= X-Google-Smtp-Source: ADFU+vtkFuCEYnYE+C8HG0CPv5ecAEOejaS0OkxBYMMX5mFWvk7b5OYq3i+DEYNQFgE4e3j6QeIP8g== X-Received: by 2002:a62:1dcd:: with SMTP id d196mr11633461pfd.296.1585264126502; Thu, 26 Mar 2020 16:08:46 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 05/31] target/arm: Implement SVE2 integer unary operations (predicated) Date: Thu, 26 Mar 2020 16:08:12 -0700 Message-Id: <20200326230838.31112-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++ target/arm/sve.decode | 7 ++++++ target/arm/sve_helper.c | 25 ++++++++++++++++---- target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+), 4 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 854cd97fdf..d3b7c3bd12 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -507,6 +507,19 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqneg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ursqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6691145854..95a9c65451 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1107,3 +1107,10 @@ PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn + +### SVE2 integer unary operations (predicated) + +URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn +URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn +SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn +SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7dc17421e9..16606331fc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -535,8 +535,8 @@ static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) -DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ(sve2_sadalp_zpzz_h, uint16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, uint32_t, H1_4, do_sadalp_s) DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) @@ -557,8 +557,8 @@ static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) -DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) #undef DO_ZPZZ @@ -728,6 +728,23 @@ DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) +#define DO_SQABS(N) (N == -N ? N - 1 : N < 0 ? -N : N) + +DO_ZPZ(sve2_sqabs_b, int8_t, H1, DO_SQABS) +DO_ZPZ(sve2_sqabs_h, int16_t, H1_2, DO_SQABS) +DO_ZPZ(sve2_sqabs_s, int32_t, H1_4, DO_SQABS) +DO_ZPZ_D(sve2_sqabs_d, int64_t, DO_SQABS) + +#define DO_SQNEG(N) (N == -N ? N - 1 : -N) + +DO_ZPZ(sve2_sqneg_b, uint8_t, H1, DO_SQNEG) +DO_ZPZ(sve2_sqneg_h, uint16_t, H1_2, DO_SQNEG) +DO_ZPZ(sve2_sqneg_s, uint32_t, H1_4, DO_SQNEG) +DO_ZPZ_D(sve2_sqneg_d, uint64_t, DO_SQNEG) + +DO_ZPZ(sve2_urecpe_s, uint32_t, H1_4, helper_recpe_u32) +DO_ZPZ(sve2_ursqrte_s, uint32_t, H1_4, helper_rsqrte_u32) + /* Three-operand expander, unpredicated, in which the third operand is "wide". */ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index bc8321f7cd..938ec08673 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5919,3 +5919,50 @@ static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) } return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); } + +/* + * SVE2 integer unary operations (predicated) + */ + +static bool do_sve2_zpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ool(s, a, fn); +} + +static bool trans_URECPE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_urecpe_s); +} + +static bool trans_URSQRTE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_ursqrte_s); +} + +static bool trans_SQABS(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h, + gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h, + gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} From patchwork Thu Mar 26 23:08:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262441 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=nPxpTUTX; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLS85g5tz9sSL for ; Fri, 27 Mar 2020 10:15:20 +1100 (AEDT) Received: from localhost ([::1]:34626 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbiY-0003Mm-MA for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:15:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58063) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcL-00026u-R4 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcI-0001Jg-IO for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:53 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:55236) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcI-0001J4-9Y for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:50 -0400 Received: by mail-pj1-x1044.google.com with SMTP id np9so3088352pjb.4 for ; Thu, 26 Mar 2020 16:08:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jC3lzQRoD9FYu6PU7hez8Vvm2StpyVGzvHCSmbTwtC4=; b=nPxpTUTXZgSbtkfAVT1ucMgmZz4RroG/rSKGEVdxBMA0DjxMpyk4u1jpMy3WY0c6iH YQYoikICQPy9ls+eZlB+NCluAq773ZUu0kRzyjKuGPgBioQYxdwBa1yZHzhgUliRjcer 7durjlmiTkTy/KPMqxibQ1k7ynzRCkePuIsY6e1yGLPN0heS7wPMID6+ge2f5JUNZ7Yg x7/j18ZgELxd7fvXzdLtMLDOukWV7dhnuFaBtKfY78WIkUaRt3D+yuxJz9EQ4fF3Yqyq 95cKTTcrf6LpuNmvvAkwR34bozawG/7EfH+HfVBCoEcex0a1b+cVbrvYYcrPQ+WVDqBW NSkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jC3lzQRoD9FYu6PU7hez8Vvm2StpyVGzvHCSmbTwtC4=; b=qxdCtDNwb6OnJucKh6DMfITDyXb3Iw3M+3+AKFAAGIivM/4SOX/N3vd7WTIS0MB8Ts 4shctRVgl99/ansN5c6O8kJJQUv/U1Ub2kyjMuEe30ED8ittxUBd6YszwkSLLAPWDRYy 1yw0CRQjFr1WBIqh233V27Pfhm51h9Ydd6AqSAdO32EulEMiN5B+rEkmRmkHi9pLnvdm iC0cYxoZ66l4SgjLpZ5bnADiYNEoNDKh10rm8ZdBniJkHb1J1OesaB4s4JXK1CckQZME qdEETQt0eNXRdlb90XHkLA8kLENK+lpWDoyQ4DuK8EIhI8h1qrXpw7Dgf1CBSKcvAh6D Hqjw== X-Gm-Message-State: ANhLgQ2KjfPhlzXZI41NZAFtBrDryDsMOiuCzGQEhBjXRscrkz1AF6Fd 0Elq367xnT4gj2tDYnEzMylWqfz+Mxs= X-Google-Smtp-Source: ADFU+vvDmP4jM8JgQtkPvRUSdXzDBov/5PMCVlbI+rguK/3aFgTiPCksWbI0VnIgYyjf6nDxEvdZug== X-Received: by 2002:a17:902:5984:: with SMTP id p4mr10599828pli.43.1585264127804; Thu, 26 Mar 2020 16:08:47 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 06/31] target/arm: Split out saturating/rounding shifts from neon Date: Thu, 26 Mar 2020 16:08:13 -0700 Message-Id: <20200326230838.31112-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1044 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Split these operations out into a header that can be shared between neon and sve. The "sat" pointer acts both as a boolean for control of saturating behavior and controls the difference in behavior between neon and sve -- QC bit or no QC bit. Implement right-shift rounding as tmp = src >> (shift - 1); dst = (tmp >> 1) + (tmp & 1); This is the same number of instructions as the current tmp = 1 << (shift - 1); dst = (src + tmp) >> shift; without any possibility of intermediate overflow. Signed-off-by: Richard Henderson --- target/arm/vec_internal.h | 161 ++++++++++++ target/arm/neon_helper.c | 507 +++++++------------------------------- 2 files changed, 244 insertions(+), 424 deletions(-) create mode 100644 target/arm/vec_internal.h diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h new file mode 100644 index 0000000000..0d1f9c86c8 --- /dev/null +++ b/target/arm/vec_internal.h @@ -0,0 +1,161 @@ +/* + * ARM AdvSIMD / SVE Vector Helpers + * + * Copyright (c) 2020 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#ifndef TARGET_ARM_VEC_INTERNALS_H +#define TARGET_ARM_VEC_INTERNALS_H + +static inline int32_t do_sqrshl_bhs(int32_t src, int8_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -bits) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 31; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + int32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + int32_t extval = sextract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return (1u << (bits - 1)) - (src >= 0); +} + +static inline uint32_t do_uqrshl_bhs(uint32_t src, int8_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -(bits + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + uint32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + uint32_t extval = extract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return MAKE_64BIT_MASK(0, bits); +} + +static inline int32_t do_suqrshl_bhs(int32_t src, int8_t shift, int bits, + bool round, uint32_t *sat) +{ + if (src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_bhs(src, shift, bits, round, sat); +} + +static inline int64_t do_sqrshl_d(int64_t src, int8_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -64) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 63; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + int64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return src < 0 ? INT64_MIN : INT64_MAX; +} + +static inline uint64_t do_uqrshl_d(uint64_t src, int8_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -(64 + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + uint64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return UINT64_MAX; +} + +static inline int64_t do_suqrshl_d(int64_t src, int8_t shift, + bool round, uint32_t *sat) +{ + if (src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_d(src, shift, round, sat); +} + +#endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index c7a8438b42..e6481a5764 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -11,6 +11,7 @@ #include "cpu.h" #include "exec/helper-proto.h" #include "fpu/softfloat.h" +#include "vec_internal.h" #define SIGNBIT (uint32_t)0x80000000 #define SIGNBIT64 ((uint64_t)1 << 63) @@ -604,496 +605,154 @@ NEON_VOP(abd_s32, neon_s32, 1) NEON_VOP(abd_u32, neon_u32, 1) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 16, false, NULL)) NEON_VOP(shl_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 16, false, NULL)) NEON_VOP(shl_s16, neon_s16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if ((tmp >= (ssize_t)sizeof(src1) * 8) \ - || (tmp <= -(ssize_t)sizeof(src1) * 8)) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 8, true, NULL)) NEON_VOP(rshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 16, true, NULL)) NEON_VOP(rshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_s32)(uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_rshl_s32)(uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if ((shift >= 32) || (shift <= -32)) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_sqrshl_bhs(val, shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_rshl_s64)(uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - int64_t val = valop; - if ((shift >= 64) || (shift <= -64)) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000LL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_sqrshl_d(val, shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (-tmp - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 8, true, NULL)) NEON_VOP(rshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 16, true, NULL)) NEON_VOP(rshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32 || shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_uqrshl_bhs(val, shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - if (shift >= 64 || shift < -64) { - val = 0; - } else if (shift == -64) { - /* Rounding a 1-bit result just preserves that bit. */ - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_uqrshl_d(val, shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u16, neon_u16, 2) -NEON_VOP_ENV(qshl_u32, neon_u32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - val = ~(uint64_t)0; - SET_QC(); - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= -shift; - } else { - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~(uint64_t)0; - } - } - return val; + return do_uqrshl_bhs(val, shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = src1; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> 31; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_uqrshl_d(val, shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s16, neon_s16, 2) -NEON_VOP_ENV(qshl_s32, neon_s32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val >>= 63; - } else if (shift < 0) { - val >>= -shift; - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_bhs(val, shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - if (src1 & (1 << (sizeof(src1) * 8 - 1))) { \ - SET_QC(); \ - dest = 0; \ - } else { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - } \ - }} while (0) -NEON_VOP_ENV(qshlu_s8, neon_u8, 4) -NEON_VOP_ENV(qshlu_s16, neon_u16, 2) +uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_sqrshl_d(val, shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, src2, 8, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s8, neon_s8, 4) #undef NEON_FN -uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, src2, 16, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s16, neon_s16, 2) +#undef NEON_FN + +uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - if ((int32_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u32(env, valop, shiftop); + return do_suqrshl_bhs(val, shift, 32, false, env->vfp.qc); } -uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - if ((int64_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u64(env, valop, shiftop); + return do_suqrshl_d(val, shift, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = ~0; - } else { - dest = 0; - } - } else if (shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = ~0; - } - } - return dest; + return do_uqrshl_bhs(val, shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = ~0; - } - } else if (shift < -64) { - val = 0; - } else if (shift == -64) { - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { \ - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~0; - } - } - return val; + return do_uqrshl_d(val, shift, true, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (typeof(dest))(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } else { - dest = 0; - } - } else if (shift <= -32) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } - } - return dest; + return do_sqrshl_bhs(val, shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_d(val, shift, true, env->vfp.qc); } uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b) From patchwork Thu Mar 26 23:08:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262437 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=nEmAwWsg; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLP52RWTz9sSH for ; Fri, 27 Mar 2020 10:12:41 +1100 (AEDT) Received: from localhost ([::1]:34552 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbfz-0007IZ-74 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:12:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58010) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcK-00024n-OB for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcI-0001K0-Qo for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:52 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:40041) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcI-0001JS-Jc for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:50 -0400 Received: by mail-pj1-x1043.google.com with SMTP id kx8so3020224pjb.5 for ; Thu, 26 Mar 2020 16:08:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qcvLlVnwnl+rx9t724hsUpqaoFTYlb8bMAA1WYSB/kc=; b=nEmAwWsgfi7nh5KGB2o58Lcu+tKp9gwPw+8yRg9qj9LxuYMrwd1N0RClYJYpij3/Qy II655RKPyB0MpzXIxyK5S/GpPbBvpwT3kHkWuAQ2LsmQIBLoxJnFpWoI+wm2ubIWaNFI 47LOcWQE8xvH64jmPIINiXOejQpbRPspJv3Qr1fHOrRPbtSepLWg4R/0nimPTxU0lSyr jFZ1zecAXOtFGB/c/FwPP1zqNbv8UvmQ2IKtsF3J/iH0QgpcSX8vP3kyXIXbwm2YTWNt sOMsPHyXqtvlULv3VojsZNuu8aSViVVa0Ma73+zmcEiB0LiULRs+ZVIWkVuN9wIqWHXE l8OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qcvLlVnwnl+rx9t724hsUpqaoFTYlb8bMAA1WYSB/kc=; b=DPZxwci1g6CZkvRAtgNSCooNvtIwwj9OFfjM7OdQQte7+BG4ih3zx7ZiMEXu8yOp1O x9LL8JIMIJZ/0z3dvMhSRWM4sNF6Nse32Vq8t/0cmTZsRv5GAfv+DZBSZ/t9XuPA4+oK +QGP7OfB77aaDQY/TCI0l1AJUby2WhaNOGhAuNkMsXYbpaX1Mt6eWwUILo5L7N1ckqZz l+vVAmHOOvowFeHOajxVA3ILQCIkpDy7y1LhblBOjUS0GXcJry9KVxKX3tBOzOxvjLVC T8xpB+k0n14ryZyOd9teFta3BZg6eS2Lfmc8pWYaiKFrbS55finZnWEz4cz8D2Far1Sd FKNw== X-Gm-Message-State: ANhLgQ1TRz6zG7W4bWaFJYAUmBKchjtPnFAP6YDdSwQ7NATOxWXSHGUI QWYMDIsongySfMjd8+eaizpxQ19XIns= X-Google-Smtp-Source: ADFU+vuX/VYBHjbq15/03NutASfIwHV06egPtaKUT9E3kQO+aMB0MTJNlj3izZR1EuL1gq1FyQx3KQ== X-Received: by 2002:a17:902:694c:: with SMTP id k12mr10460092plt.173.1585264129055; Thu, 26 Mar 2020 16:08:49 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 07/31] target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated) Date: Thu, 26 Mar 2020 16:08:14 -0700 Message-Id: <20200326230838.31112-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1043 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 ++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++++ target/arm/sve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 18 +++++++++ 4 files changed, 167 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d3b7c3bd12..0eecf33249 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -172,6 +172,60 @@ DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 95a9c65451..f0b6692e43 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1114,3 +1114,20 @@ URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn + +### SVE2 saturating/rounding bitwise shift left (predicated) + +SRSHL 01000100 .. 000 010 100 ... ..... ..... @rdn_pg_rm +URSHL 01000100 .. 000 011 100 ... ..... ..... @rdn_pg_rm +SRSHL 01000100 .. 000 110 100 ... ..... ..... @rdm_pg_rn # SRSHLR +URSHL 01000100 .. 000 111 100 ... ..... ..... @rdm_pg_rn # URSHLR + +SQSHL 01000100 .. 001 000 100 ... ..... ..... @rdn_pg_rm +UQSHL 01000100 .. 001 001 100 ... ..... ..... @rdn_pg_rm +SQSHL 01000100 .. 001 100 100 ... ..... ..... @rdm_pg_rn # SQSHLR +UQSHL 01000100 .. 001 101 100 ... ..... ..... @rdm_pg_rn # UQSHLR + +SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm +UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm +SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR +UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 16606331fc..a7e9b8d341 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -26,6 +26,7 @@ #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" #include "tcg/tcg.h" +#include "vec_internal.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -561,6 +562,83 @@ DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) +#define do_srshl_b(n, m) do_sqrshl_bhs(n, m, 8, true, NULL) +#define do_srshl_h(n, m) do_sqrshl_bhs(n, m, 16, true, NULL) +#define do_srshl_s(n, m) do_sqrshl_bhs(n, m, 32, true, NULL) +#define do_srshl_d(n, m) do_sqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_srshl_zpzz_b, int8_t, H1_2, do_srshl_b) +DO_ZPZZ(sve2_srshl_zpzz_h, int16_t, H1_2, do_srshl_h) +DO_ZPZZ(sve2_srshl_zpzz_s, int32_t, H1_4, do_srshl_s) +DO_ZPZZ_D(sve2_srshl_zpzz_d, int64_t, do_srshl_d) + +#define do_urshl_b(n, m) do_uqrshl_bhs(n, m, 8, true, NULL) +#define do_urshl_h(n, m) do_uqrshl_bhs(n, m, 16, true, NULL) +#define do_urshl_s(n, m) do_uqrshl_bhs(n, m, 32, true, NULL) +#define do_urshl_d(n, m) do_uqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_urshl_zpzz_b, uint8_t, H1_2, do_urshl_b) +DO_ZPZZ(sve2_urshl_zpzz_h, uint16_t, H1_2, do_urshl_h) +DO_ZPZZ(sve2_urshl_zpzz_s, uint32_t, H1_4, do_urshl_s) +DO_ZPZZ_D(sve2_urshl_zpzz_d, uint64_t, do_urshl_d) + +/* Unlike the NEON and AdvSIMD versions, there is no QC bit to set. */ +#define do_sqshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, false, &discard); }) +#define do_sqshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, false, &discard); }) +#define do_sqshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, false, &discard); }) +#define do_sqshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_sqshl_zpzz_b, int8_t, H1_2, do_sqshl_b) +DO_ZPZZ(sve2_sqshl_zpzz_h, int16_t, H1_2, do_sqshl_h) +DO_ZPZZ(sve2_sqshl_zpzz_s, int32_t, H1_4, do_sqshl_s) +DO_ZPZZ_D(sve2_sqshl_zpzz_d, int64_t, do_sqshl_d) + +#define do_uqshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, false, &discard); }) +#define do_uqshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 16, false, &discard); }) +#define do_uqshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, false, &discard); }) +#define do_uqshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_uqshl_zpzz_b, uint8_t, H1_2, do_uqshl_b) +DO_ZPZZ(sve2_uqshl_zpzz_h, uint16_t, H1_2, do_uqshl_h) +DO_ZPZZ(sve2_uqshl_zpzz_s, uint32_t, H1_4, do_uqshl_s) +DO_ZPZZ_D(sve2_uqshl_zpzz_d, uint64_t, do_uqshl_d) + +#define do_sqrshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, true, &discard); }) +#define do_sqrshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, true, &discard); }) +#define do_sqrshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, true, &discard); }) +#define do_sqrshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_sqrshl_zpzz_b, int8_t, H1_2, do_sqrshl_b) +DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) +DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) +DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) + +#define do_uqrshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, true, &discard); }) +#define do_uqrshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 16, true, &discard); }) +#define do_uqrshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, true, &discard); }) +#define do_uqrshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_uqrshl_zpzz_b, uint8_t, H1_2, do_uqrshl_b) +DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) +DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) +DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 938ec08673..45a72b1750 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5966,3 +5966,21 @@ static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) }; return do_sve2_zpz_ool(s, a, fns[a->esz]); } + +#define DO_SVE2_ZPZZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4 * const fns[4] = { \ + gen_helper_sve2_##name##_zpzz_b, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d, \ + }; \ + return do_sve2_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ(SQSHL, sqshl) +DO_SVE2_ZPZZ(SQRSHL, sqrshl) +DO_SVE2_ZPZZ(SRSHL, srshl) + +DO_SVE2_ZPZZ(UQSHL, uqshl) +DO_SVE2_ZPZZ(UQRSHL, uqrshl) +DO_SVE2_ZPZZ(URSHL, urshl) From patchwork Thu Mar 26 23:08:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262439 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=dmdeoTnu; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLRS3R7Wz9sSH for ; Fri, 27 Mar 2020 10:14:44 +1100 (AEDT) Received: from localhost ([::1]:34602 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbhy-0002Fu-Cp for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:14:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58052) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcL-00026H-Hx for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcJ-0001L6-Vz for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:53 -0400 Received: from mail-pj1-x1036.google.com ([2607:f8b0:4864:20::1036]:54173) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcJ-0001KK-Q3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:51 -0400 Received: by mail-pj1-x1036.google.com with SMTP id l36so3088418pjb.3 for ; Thu, 26 Mar 2020 16:08:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=80IQPaqK9ZEvLA4BC4hOR/X+CTW0LTpH0JJGF9vMSPw=; b=dmdeoTnu06QZ1pe3dwlLNBHPKVQ4gwOYsa7GaCugqIBnU72azJBDCtcCD5CR6MnmOC 9CTr39b5EiCvsuYHsq1ibd/eVoj2ZRiStTvUlIrVIt0OW16rMcR5q5ka0qs5yukYThL2 d4eJ7rWeYq+NPwAh4JgS60+ccYKo5i7BrjA9nM7iNPPYEnvYdtWL+LZzfCLLOZaZmMz1 azQtpdahe58sdt/XU1grI1oVDxP5ofE71uBEAW/Y6kBP3vMz/6IeZiXkkjEOnrRF00jr 8GZjyKIZaCDAoxWYL60UNT/Hrz1J4W59Swxg7ETQIm/57QcV7UunpRkrAehjNkvsEeMB G5kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=80IQPaqK9ZEvLA4BC4hOR/X+CTW0LTpH0JJGF9vMSPw=; b=DP91jnTPzWtGV99/qHVP1WvgOfnkdZEvMm1MPXkVG17aiEhOzxpP7r7gM9BBYib0Fk kZ+m/Likm6HwF5PdN2ZHQNo1JLNVSmwA5Bwl0vuFEIrzKZOqXG8VPqi4vh3wPHiUirnH CC8s2VqrNrpTT4TSTIK+ohGgGWYY9IenMkvD8BH082HJtFyI9Rg6EI/eUIix1pZmrqdj smq5CmHoJ4mVPr6DJ0XJJsgHVCI6Za2w0gsa7xrrKLv3RWJ+SRXKn3PhmjUgeKzxYARe 7YGgniTQAivJnZkswKWJy+7HxKsMJUlevalGu+FidE8gWi6vnCbYrATTBn7wEPEQfCoX jJGA== X-Gm-Message-State: ANhLgQ0yjOnt0jCyxrHYbAdTad9YPrOMVNMM8ik1jjRW/yVvdgfs4/lF r77nHo5KyyTf8nb4VbyvJHeoGxBcun0= X-Google-Smtp-Source: ADFU+vtxp3Gke37xHuY+wGBl1cN68a3ta4jSPup1Xiy8DHcCntkLQNE3Dx+IstHSXibfxtdIG85nJg== X-Received: by 2002:a17:902:8648:: with SMTP id y8mr10258682plt.153.1585264130363; Thu, 26 Mar 2020 16:08:50 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 08/31] target/arm: Implement SVE2 integer halving add/subtract (predicated) Date: Thu, 26 Mar 2020 16:08:15 -0700 Message-Id: <20200326230838.31112-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1036 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 11 ++++++++ target/arm/sve_helper.c | 39 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 8 ++++++ 4 files changed, 112 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0eecf33249..149fff1fae 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -226,6 +226,60 @@ DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f0b6692e43..54076bb607 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1131,3 +1131,14 @@ SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR + +### SVE2 integer halving add/subtract (predicated) + +SHADD 01000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm +UHADD 01000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 010 100 ... ..... ..... @rdn_pg_rm +UHSUB 01000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm +SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm +URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR +UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a7e9b8d341..5d75aed7b7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -639,6 +639,45 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) +#define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) +#define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) + +DO_ZPZZ(sve2_shadd_zpzz_b, int8_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_h, int16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_s, int32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_shadd_zpzz_d, int64_t, DO_HADD_D) + +DO_ZPZZ(sve2_uhadd_zpzz_b, uint8_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_h, uint16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_s, uint32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_uhadd_zpzz_d, uint64_t, DO_HADD_D) + +#define DO_RHADD_BHS(n, m) (((int64_t)n + m + 1) >> 1) +#define DO_RHADD_D(n, m) ((n >> 1) + (m >> 1) + ((n | m) & 1)) + +DO_ZPZZ(sve2_srhadd_zpzz_b, int8_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_h, int16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_s, int32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_srhadd_zpzz_d, int64_t, DO_RHADD_D) + +DO_ZPZZ(sve2_urhadd_zpzz_b, uint8_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_h, uint16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_s, uint32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_urhadd_zpzz_d, uint64_t, DO_RHADD_D) + +#define DO_HSUB_BHS(n, m) (((int64_t)n - m) >> 1) +#define DO_HSUB_D(n, m) ((n >> 1) - (m >> 1) - (~n & m & 1)) + +DO_ZPZZ(sve2_shsub_zpzz_b, int8_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_h, int16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_s, int32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_shsub_zpzz_d, int64_t, DO_HSUB_D) + +DO_ZPZZ(sve2_uhsub_zpzz_b, uint8_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 45a72b1750..7d619d7ad4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5984,3 +5984,11 @@ DO_SVE2_ZPZZ(SRSHL, srshl) DO_SVE2_ZPZZ(UQSHL, uqshl) DO_SVE2_ZPZZ(UQRSHL, uqrshl) DO_SVE2_ZPZZ(URSHL, urshl) + +DO_SVE2_ZPZZ(SHADD, shadd) +DO_SVE2_ZPZZ(SRHADD, srhadd) +DO_SVE2_ZPZZ(SHSUB, shsub) + +DO_SVE2_ZPZZ(UHADD, uhadd) +DO_SVE2_ZPZZ(URHADD, urhadd) +DO_SVE2_ZPZZ(UHSUB, uhsub) From patchwork Thu Mar 26 23:08:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262444 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=hLbAw9Gk; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLVG2wzNz9sSL for ; Fri, 27 Mar 2020 10:17:10 +1100 (AEDT) Received: from localhost ([::1]:34692 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbkK-0006FA-8B for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:17:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58141) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcN-0002Ag-Co for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:56 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcL-0001N8-B2 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:54 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:39793) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcL-0001M5-41 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:53 -0400 Received: by mail-pg1-x541.google.com with SMTP id b22so3650789pgb.6 for ; Thu, 26 Mar 2020 16:08:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=0Qn92PslIMK9Z7a8YCtRKYyKEHx9UY3I2q6Qw3BgsLw=; b=hLbAw9GkJTfLGztqk+Vwl/Cib3bAvHo/CfCnwH3dvGqK06hKyTu8n8klONtfC0kVgt HADSW5TxCM8XxDK9MKlTT28IUjujqPPBb+qiu6KqZudjr73B+EWR0PlTB5mqy4dLxz8U uFh1rrc3PE9e+pR/wdpfTcO31ZZXmyNso299OF1xfmp8cFjDx9t8T2L5OVMJQYYR609O gOdetXG7E3pQjJ9CLoonydRkL+8XqfiRmzhh8AK4o/0GpJ29FsV+yLl9NgzywHuAYkV9 OWX7kbjWMYk6WadWh0mMsrVPArO9TnBqJBkUduKxuZsTjCGdlj8SwrJTzwv/98yHmJ9v uCQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=0Qn92PslIMK9Z7a8YCtRKYyKEHx9UY3I2q6Qw3BgsLw=; b=fOWHY09kCBmWExxWEw61VzkIvgS4fvwwWfHBKuMTVNYZ/godOAZ/EGetbHt46rmvz9 NlX3jDO59Cg4Jwi6rz7/mqjWnrc4GnF0toSc4h+365GeyUwkdNooJ6Ob5XH4payt4Rd+ F05oJaUR5Vwb8U9TuXuiHxzO8xpSK7kicSAV3fbvrqdTxSSg133aNPIvViQVZ4rQTlAk 5CkpTSWd+oPpE5yMB/Fqe92OlGg+mVUTsPsZQk6Bnh+8+MUePSzSwITHck7PZ9z1Z5bj G2Un4WLkgbEpcS6gGC+JnGRrpGK4zQ0fIPSY+pK5wIrRk0Nnq2YG4LLCntFGNAMTSpAc DznA== X-Gm-Message-State: ANhLgQ0txrsD44AWlEqcDCJFiBTS4WU7DYq8sDE7OHw7LHypdGe3fntw 5x4ATMDdlRHOm9c73BK39xek4bxS/ps= X-Google-Smtp-Source: ADFU+vtzZizJQavHHiNKnbRJpmg1wIFC8kqLeY9Cai5r2o4NDaz5Eptv6V4OKmtjUZLJNHMfeN86Jw== X-Received: by 2002:a63:a91a:: with SMTP id u26mr3677060pge.236.1585264131657; Thu, 26 Mar 2020 16:08:51 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 09/31] target/arm: Implement SVE2 integer pairwise arithmetic Date: Thu, 26 Mar 2020 16:08:16 -0700 Message-Id: <20200326230838.31112-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 45 +++++++++++++++++++++++++ target/arm/sve.decode | 8 +++++ target/arm/sve_helper.c | 67 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 6 ++++ 4 files changed, 126 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 149fff1fae..028c3b85a8 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -326,6 +326,51 @@ DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 54076bb607..86a6bf7088 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1142,3 +1142,11 @@ SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR + +### SVE2 integer pairwise arithmetic + +ADDP 01000100 .. 010 001 101 ... ..... ..... @rdn_pg_rm +SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm +UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm +SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm +UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5d75aed7b7..d7c181ddb8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -681,6 +681,73 @@ DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) #undef DO_ZPZZ #undef DO_ZPZZ_D +/* + * Three operand expander, operating on element pairs. + * If the slot I is even, the elements from from VN {I, I+1}. + * If the slot I is odd, the elements from from VM {I-1, I}. + */ +#define DO_ZPZZ_PAIR(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + void *p = (i & 1 ? vm : vn); \ + TYPE nn = *(TYPE *)(p + H(i & ~1)); \ + TYPE mm = *(TYPE *)(p + H(i | 1)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZZ_PAIR_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn, *m = vm; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE *p = (i & 1 ? m : n) + (i & ~1); \ + TYPE nn = p[0], mm = p[1]; \ + d[i] = OP(nn, mm); \ + } \ + } \ +} + +DO_ZPZZ_PAIR(sve2_addp_zpzz_b, uint8_t, H1_2, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ_PAIR_D(sve2_addp_zpzz_d, uint64_t, DO_ADD) + +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_b, uint8_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_umaxp_zpzz_d, uint64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_uminp_zpzz_b, uint8_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_uminp_zpzz_d, uint64_t, DO_MIN) + +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_b, int8_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_smaxp_zpzz_d, int64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_sminp_zpzz_b, int8_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN) + +#undef DO_ZPZZ_PAIR +#undef DO_ZPZZ_PAIR_D + /* Three-operand expander, controlled by a predicate, in which the * third operand is "wide". That is, for D = N op M, the same 64-bit * value of M is used with all of the narrower values of N. diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7d619d7ad4..5f137c0e92 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5992,3 +5992,9 @@ DO_SVE2_ZPZZ(SHSUB, shsub) DO_SVE2_ZPZZ(UHADD, uhadd) DO_SVE2_ZPZZ(URHADD, urhadd) DO_SVE2_ZPZZ(UHSUB, uhsub) + +DO_SVE2_ZPZZ(ADDP, addp) +DO_SVE2_ZPZZ(SMAXP, smaxp) +DO_SVE2_ZPZZ(UMAXP, umaxp) +DO_SVE2_ZPZZ(SMINP, sminp) +DO_SVE2_ZPZZ(UMINP, uminp) From patchwork Thu Mar 26 23:08:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262446 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=dOg/KZgb; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLW344mNz9sSM for ; Fri, 27 Mar 2020 10:17:51 +1100 (AEDT) Received: from localhost ([::1]:34712 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbkz-00079P-D3 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:17:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58270) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcP-0002GS-BP for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcN-0001Pj-0D for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:57 -0400 Received: from mail-pj1-x1034.google.com ([2607:f8b0:4864:20::1034]:33988) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcM-0001Ob-Ox for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:54 -0400 Received: by mail-pj1-x1034.google.com with SMTP id q16so3754705pje.1 for ; Thu, 26 Mar 2020 16:08:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TZZOTv3OeZghjguCKNcfyc5uGnx4cEJVuFPp8if3XZ4=; b=dOg/KZgbX2SBMbO9mQ0Ao9i5826dhyaAIyqQl4wUUff6lLjBDZgOGDTkzfcGu3TK/r W4br3S0TCsczqKk4GHtocBDF3a5Oz+qGdzeJMwLcfHYxs3t+et3RN0SB5sUo7Ji3AbIX jrayJCIXq0CjPVr15Dvy65cSNyPC2e4MxMdQyH754ZpJo4auUn+NQWhuFq9k+Wd417qi 7yTWqGq4JKEfEmKb/x0bUbk6BvhisXKwVWOZVytNXv/QrlVecmdd0mPCR+yMQJoAaF17 q/d+Bc+Anz5J98EfMAUp48ve59a19jadmDfzhqfb9sQFRaXwYjU6ZbIsh5yKsAd+rIP6 L7Uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TZZOTv3OeZghjguCKNcfyc5uGnx4cEJVuFPp8if3XZ4=; b=cnuE6t+euSjSP21d/u+ud0ZYF4xr0R0NoXSEGeSsY75sQx8nKpYz+Uypf0BTXt1pBY YfHwebHrFQzE4R2lQ20R+L3wX9vYabEvytac3+DXBESQleTCZg2A98a3Ac9W1vgX4rix IY3Me+JcWsanhV5SV0r0ZW5VmC7ec7+9v9yw+VgheTduFwQDJsH83uj7HUbQb9EAhWH6 5p4WHE66Ytxo8c7BBKF+vuaMPhJl9NmphJ7JUrvPzxFs9tB1h5Z9udvF9YcG7q5SygL0 I/5Ve43PHLrDlp13LdA4Rczt3Mgj/Tu4b57UNh7bHUZNI2Uds8axHeMfh6X7EfGuZPEp EqRA== X-Gm-Message-State: ANhLgQ0dSE4z9VJuY0VZVz4XvNhyRE9o8jWXC/mtMl6fQBp2ZBKMEujk d9GvGSAD14Ol6cEWy8cGQJaAAAcIe5o= X-Google-Smtp-Source: ADFU+vsjOAyTGZ49UgZRNjcLuMxav9HG+zOFihrMxAMo+JimgI4DOemXKgszKGmHg+EMzlRKGEP6/Q== X-Received: by 2002:a17:902:9890:: with SMTP id s16mr10062525plp.71.1585264132994; Thu, 26 Mar 2020 16:08:52 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 10/31] target/arm: Implement SVE2 saturating add/subtract (predicated) Date: Thu, 26 Mar 2020 16:08:17 -0700 Message-Id: <20200326230838.31112-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1034 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 +++++++++++ target/arm/sve.decode | 11 +++ target/arm/sve_helper.c | 182 +++++++++++++++++++++++++------------ target/arm/translate-sve.c | 7 ++ 4 files changed, 198 insertions(+), 56 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 028c3b85a8..368185944a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -371,6 +371,60 @@ DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 86a6bf7088..86aee38668 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1150,3 +1150,14 @@ SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm + +### SVE2 saturating add/subtract (predicated) + +SQADD_zpzz 01000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm +UQADD_zpzz 01000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 010 100 ... ..... ..... @rdn_pg_rm +UQSUB_zpzz 01000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm +SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm +USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR +UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d7c181ddb8..bee00eaa44 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -678,6 +678,123 @@ DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) +static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max) +{ + return val >= max ? max : val <= min ? min : val; +} + +#define DO_SQADD_B(n, m) do_sat_bhs((int64_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SQADD_H(n, m) do_sat_bhs((int64_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SQADD_S(n, m) do_sat_bhs((int64_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqadd_d(int64_t n, int64_t m) +{ + int64_t r = n + m; + if (((r ^ n) & ~(n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqadd_zpzz_b, int8_t, H1_2, DO_SQADD_B) +DO_ZPZZ(sve2_sqadd_zpzz_h, int16_t, H1_2, DO_SQADD_H) +DO_ZPZZ(sve2_sqadd_zpzz_s, int32_t, H1_4, DO_SQADD_S) +DO_ZPZZ_D(sve2_sqadd_zpzz_d, int64_t, do_sqadd_d) + +#define DO_UQADD_B(n, m) do_sat_bhs((int64_t)n + m, 0, UINT8_MAX) +#define DO_UQADD_H(n, m) do_sat_bhs((int64_t)n + m, 0, UINT16_MAX) +#define DO_UQADD_S(n, m) do_sat_bhs((int64_t)n + m, 0, UINT32_MAX) + +static inline uint64_t do_uqadd_d(uint64_t n, uint64_t m) +{ + uint64_t r = n + m; + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_uqadd_zpzz_b, uint8_t, H1_2, DO_UQADD_B) +DO_ZPZZ(sve2_uqadd_zpzz_h, uint16_t, H1_2, DO_UQADD_H) +DO_ZPZZ(sve2_uqadd_zpzz_s, uint32_t, H1_4, DO_UQADD_S) +DO_ZPZZ_D(sve2_uqadd_zpzz_d, uint64_t, do_uqadd_d) + +#define DO_SQSUB_B(n, m) do_sat_bhs((int64_t)n - m, INT8_MIN, INT8_MAX) +#define DO_SQSUB_H(n, m) do_sat_bhs((int64_t)n - m, INT16_MIN, INT16_MAX) +#define DO_SQSUB_S(n, m) do_sat_bhs((int64_t)n - m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqsub_d(int64_t n, int64_t m) +{ + int64_t r = n - m; + if (((r ^ n) & (n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqsub_zpzz_b, int8_t, H1_2, DO_SQSUB_B) +DO_ZPZZ(sve2_sqsub_zpzz_h, int16_t, H1_2, DO_SQSUB_H) +DO_ZPZZ(sve2_sqsub_zpzz_s, int32_t, H1_4, DO_SQSUB_S) +DO_ZPZZ_D(sve2_sqsub_zpzz_d, int64_t, do_sqsub_d) + +#define DO_UQSUB_B(n, m) do_sat_bhs((int64_t)n - m, 0, UINT8_MAX) +#define DO_UQSUB_H(n, m) do_sat_bhs((int64_t)n - m, 0, UINT16_MAX) +#define DO_UQSUB_S(n, m) do_sat_bhs((int64_t)n - m, 0, UINT32_MAX) + +static inline uint64_t do_uqsub_d(uint64_t n, uint64_t m) +{ + return n > m ? n - m : 0; +} + +DO_ZPZZ(sve2_uqsub_zpzz_b, uint8_t, H1_2, DO_UQSUB_B) +DO_ZPZZ(sve2_uqsub_zpzz_h, uint16_t, H1_2, DO_UQSUB_H) +DO_ZPZZ(sve2_uqsub_zpzz_s, uint32_t, H1_4, DO_UQSUB_S) +DO_ZPZZ_D(sve2_uqsub_zpzz_d, uint64_t, do_uqsub_d) + +#define DO_SUQADD_B(n, m) \ + do_sat_bhs((int64_t)(int8_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SUQADD_H(n, m) \ + do_sat_bhs((int64_t)(int16_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SUQADD_S(n, m) \ + do_sat_bhs((int64_t)(int32_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_suqadd_d(int64_t n, uint64_t m) +{ + uint64_t r = n + m; + + /* Note that m - abs(n) cannot underflow. */ + if (n >= 0 && (r < m || r >= INT64_MAX)) { + return INT64_MAX; + } + return r; +} + +DO_ZPZZ(sve2_suqadd_zpzz_b, uint8_t, H1_2, DO_SUQADD_B) +DO_ZPZZ(sve2_suqadd_zpzz_h, uint16_t, H1_2, DO_SUQADD_H) +DO_ZPZZ(sve2_suqadd_zpzz_s, uint32_t, H1_4, DO_SUQADD_S) +DO_ZPZZ_D(sve2_suqadd_zpzz_d, uint64_t, do_suqadd_d) + +#define DO_USQADD_B(n, m) \ + do_sat_bhs((int64_t)n + (int8_t)m, 0, UINT8_MAX) +#define DO_USQADD_H(n, m) \ + do_sat_bhs((int64_t)n + (int16_t)m, 0, UINT16_MAX) +#define DO_USQADD_S(n, m) \ + do_sat_bhs((int64_t)n + (int32_t)m, 0, UINT32_MAX) + +static inline uint64_t do_usqadd_d(uint64_t n, int64_t m) +{ + uint64_t r = n + m; + + if (m < 0) { + return n < -m ? 0 : r; + } + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_usqadd_zpzz_b, uint8_t, H1_2, DO_USQADD_B) +DO_ZPZZ(sve2_usqadd_zpzz_h, uint16_t, H1_2, DO_USQADD_H) +DO_ZPZZ(sve2_usqadd_zpzz_s, uint32_t, H1_4, DO_USQADD_S) +DO_ZPZZ_D(sve2_usqadd_zpzz_d, uint64_t, do_usqadd_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D @@ -1640,13 +1757,7 @@ void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int8_t)) { - int r = *(int8_t *)(a + i) + b; - if (r > INT8_MAX) { - r = INT8_MAX; - } else if (r < INT8_MIN) { - r = INT8_MIN; - } - *(int8_t *)(d + i) = r; + *(int8_t *)(d + i) = DO_SQADD_B(b, *(int8_t *)(a + i)); } } @@ -1655,13 +1766,7 @@ void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int16_t)) { - int r = *(int16_t *)(a + i) + b; - if (r > INT16_MAX) { - r = INT16_MAX; - } else if (r < INT16_MIN) { - r = INT16_MIN; - } - *(int16_t *)(d + i) = r; + *(int16_t *)(d + i) = DO_SQADD_H(b, *(int16_t *)(a + i)); } } @@ -1670,13 +1775,7 @@ void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int32_t)) { - int64_t r = *(int32_t *)(a + i) + b; - if (r > INT32_MAX) { - r = INT32_MAX; - } else if (r < INT32_MIN) { - r = INT32_MIN; - } - *(int32_t *)(d + i) = r; + *(int32_t *)(d + i) = DO_SQADD_S(b, *(int32_t *)(a + i)); } } @@ -1685,13 +1784,7 @@ void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int64_t)) { - int64_t ai = *(int64_t *)(a + i); - int64_t r = ai + b; - if (((r ^ ai) & ~(ai ^ b)) < 0) { - /* Signed overflow. */ - r = (r < 0 ? INT64_MAX : INT64_MIN); - } - *(int64_t *)(d + i) = r; + *(int64_t *)(d + i) = do_sqadd_d(b, *(int64_t *)(a + i)); } } @@ -1704,13 +1797,7 @@ void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint8_t)) { - int r = *(uint8_t *)(a + i) + b; - if (r > UINT8_MAX) { - r = UINT8_MAX; - } else if (r < 0) { - r = 0; - } - *(uint8_t *)(d + i) = r; + *(uint8_t *)(d + i) = DO_UQADD_B(b, *(uint8_t *)(a + i)); } } @@ -1719,13 +1806,7 @@ void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint16_t)) { - int r = *(uint16_t *)(a + i) + b; - if (r > UINT16_MAX) { - r = UINT16_MAX; - } else if (r < 0) { - r = 0; - } - *(uint16_t *)(d + i) = r; + *(uint16_t *)(d + i) = DO_UQADD_H(b, *(uint16_t *)(a + i)); } } @@ -1734,13 +1815,7 @@ void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint32_t)) { - int64_t r = *(uint32_t *)(a + i) + b; - if (r > UINT32_MAX) { - r = UINT32_MAX; - } else if (r < 0) { - r = 0; - } - *(uint32_t *)(d + i) = r; + *(uint32_t *)(d + i) = DO_UQADD_S(b, *(uint32_t *)(a + i)); } } @@ -1749,11 +1824,7 @@ void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t r = *(uint64_t *)(a + i) + b; - if (r < b) { - r = UINT64_MAX; - } - *(uint64_t *)(d + i) = r; + *(uint64_t *)(d + i) = do_uqadd_d(b, *(uint64_t *)(a + i)); } } @@ -1762,8 +1833,7 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t ai = *(uint64_t *)(a + i); - *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b); + *(uint64_t *)(d + i) = do_uqsub_d(*(uint64_t *)(a + i), b); } } diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5f137c0e92..21dfb2455a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5998,3 +5998,10 @@ DO_SVE2_ZPZZ(SMAXP, smaxp) DO_SVE2_ZPZZ(UMAXP, umaxp) DO_SVE2_ZPZZ(SMINP, sminp) DO_SVE2_ZPZZ(UMINP, uminp) + +DO_SVE2_ZPZZ(SQADD_zpzz, sqadd) +DO_SVE2_ZPZZ(UQADD_zpzz, uqadd) +DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) +DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) +DO_SVE2_ZPZZ(SUQADD, suqadd) +DO_SVE2_ZPZZ(USQADD, usqadd) From patchwork Thu Mar 26 23:08:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262429 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=wGx1bq/f; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLKm57nWz9sSH for ; Fri, 27 Mar 2020 10:09:48 +1100 (AEDT) Received: from localhost ([::1]:34372 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbdC-0002xj-Ax for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:09:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58299) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcP-0002Ho-U3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcO-0001RV-7Z for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:57 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:50910) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcO-0001QH-0u for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:56 -0400 Received: by mail-pj1-x1042.google.com with SMTP id v13so3096993pjb.0 for ; Thu, 26 Mar 2020 16:08:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gLLGScFvDeYfc9yE8BUZTiwCqKsBUzUDnFzQPSPVdVg=; b=wGx1bq/f3pT24xZnCyWWby/2e7Fcd8f8s5MWQuZEvEAK9iCT4FPslEJKW5CIhu7xl+ lcZRNe3a8tPFMKFWteMKa3Xb5f/gzWh0RLj2ASFGwHmSi4Irelfp4cKsVjZz7DDqS1cV NbAUQxVtu2SIy2yqsGvh3uFU4MrRaIcEE/rjatHT5i4ODvVX4Xsblk/GLGaR/ee/72Ue vlW0pkXrpAQPUFNzIjK9ubzpKzNHZQWINeWoUWetGzwe0wYV+DdblK4llS9LKkclIIso CeDviFHLDg/U5jZ5/BaF6SoLclIgBb1BPOzpSB8tToNr0x+Zmd4piAfIHqV8BCB6Gx6d f2IA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gLLGScFvDeYfc9yE8BUZTiwCqKsBUzUDnFzQPSPVdVg=; b=mX/h29+Irp+ZNccp/Ij7N5xI6hBD4WO31bP7FHh+/1Xm+tOPX6SBHtcdnFjd1QXyNe IOGdfzmDVlzzLJbltUur7qwTCqBHkzm3mVaahvoDeNV9ltHiQOgP5zrSA2iudEwCFs5O g394ASBku8fzG46J8oef73WZR9n5VVPAxvhgBsuBgRmqBbBL+bc1Yl48caG8iDg+PyiI j8BZXTL+EfrvvpVu9nroJnaiKm4DTOk2mNzsQQEf1gYRfUjxhcnaNDRqSlZxH6TjboUp YzDzkt4zFTEanYwx9w4AoV8rLtKVLKZQBTdMfr8XZTLRbYxPn82l7FW2SqQOu4tkj/b4 RGiw== X-Gm-Message-State: ANhLgQ2u4QIrDDKFF2UKiwNbD/JuMAjXNWbQsKqY1QePRv7Hii9B73i+ JqNhII4gXZtwtip2VpN3LvV8Bxs5HVI= X-Google-Smtp-Source: ADFU+vv/ZdUyeOOyoGNwWFcLtpCIVJusCO35PRiglJ//iBe0Lqs/tDxgjwv0RyWxKtv1mWvrOB+bDw== X-Received: by 2002:a17:902:6bc8:: with SMTP id m8mr10234014plt.223.1585264134560; Thu, 26 Mar 2020 16:08:54 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 11/31] target/arm: Implement SVE2 integer add/subtract long Date: Thu, 26 Mar 2020 16:08:18 -0700 Message-Id: <20200326230838.31112-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1042 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 24 ++++++++++++++++++++ target/arm/sve.decode | 19 ++++++++++++++++ target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 46 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 368185944a..475fce7f3a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1372,6 +1372,30 @@ DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int) DEF_HELPER_FLAGS_4(sve_str, TCG_CALL_NO_WG, void, env, ptr, tl, int) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 86aee38668..a239fd3479 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1161,3 +1161,22 @@ SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR + +#### SVE2 Widening Integer Arithmetic + +## SVE2 integer add/subtract long + +SADDLB 01000101 .. 0 ..... 00 0000 ..... ..... @rd_rn_rm +SADDLT 01000101 .. 0 ..... 00 0001 ..... ..... @rd_rn_rm +UADDLB 01000101 .. 0 ..... 00 0010 ..... ..... @rd_rn_rm +UADDLT 01000101 .. 0 ..... 00 0011 ..... ..... @rd_rn_rm + +SSUBLB 01000101 .. 0 ..... 00 0100 ..... ..... @rd_rn_rm +SSUBLT 01000101 .. 0 ..... 00 0101 ..... ..... @rd_rn_rm +USUBLB 01000101 .. 0 ..... 00 0110 ..... ..... @rd_rn_rm +USUBLT 01000101 .. 0 ..... 00 0111 ..... ..... @rd_rn_rm + +SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm +SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm +UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm +UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index bee00eaa44..7d7a59f620 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1088,6 +1088,49 @@ DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) #undef DO_ZPZ #undef DO_ZPZ_D +/* + * Three-operand expander, unpredicated, in which the two inputs are + * selected from the top or bottom half of the wide column. + */ +#define DO_ZZZ_TB(NAME, TYPE, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = (simd_data(desc) & 1) * sizeof(TYPE); \ + int sel2 = (simd_data(desc) & 2) * (sizeof(TYPE) / 2); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel1); \ + TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \ + *(TYPE *)(vd + i) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_TB(sve2_saddl_h, int16_t, int8_t, DO_ADD) +DO_ZZZ_TB(sve2_saddl_s, int32_t, int16_t, DO_ADD) +DO_ZZZ_TB(sve2_saddl_d, int64_t, int32_t, DO_ADD) + +DO_ZZZ_TB(sve2_ssubl_h, int16_t, int8_t, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_s, int32_t, int16_t, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_d, int64_t, int32_t, DO_SUB) + +DO_ZZZ_TB(sve2_sabdl_h, int16_t, int8_t, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_s, int32_t, int16_t, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_d, int64_t, int32_t, DO_ABD) + +DO_ZZZ_TB(sve2_uaddl_h, uint16_t, uint8_t, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_s, uint32_t, uint16_t, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_d, uint64_t, uint32_t, DO_ADD) + +DO_ZZZ_TB(sve2_usubl_h, uint16_t, uint8_t, DO_SUB) +DO_ZZZ_TB(sve2_usubl_s, uint32_t, uint16_t, DO_SUB) +DO_ZZZ_TB(sve2_usubl_d, uint64_t, uint32_t, DO_SUB) + +DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD) + +#undef DO_ZZZ_TB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21dfb2455a..ee8a6fd912 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6005,3 +6005,49 @@ DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) DO_SVE2_ZPZZ(SUQADD, suqadd) DO_SVE2_ZPZZ(USQADD, usqadd) + +/* + * SVE2 Widening Integer Arithmetic + */ + +static bool do_sve2_zzw_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_ZZZ_TB(NAME, name, SEL1, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], (SEL2 << 1) | SEL1); \ +} + +DO_SVE2_ZZZ_TB(SADDLB, saddl, false, false) +DO_SVE2_ZZZ_TB(SSUBLB, ssubl, false, false) +DO_SVE2_ZZZ_TB(SABDLB, sabdl, false, false) + +DO_SVE2_ZZZ_TB(UADDLB, uaddl, false, false) +DO_SVE2_ZZZ_TB(USUBLB, usubl, false, false) +DO_SVE2_ZZZ_TB(UABDLB, uabdl, false, false) + +DO_SVE2_ZZZ_TB(SADDLT, saddl, true, true) +DO_SVE2_ZZZ_TB(SSUBLT, ssubl, true, true) +DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) + +DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) +DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) +DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) From patchwork Thu Mar 26 23:08:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262436 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=bCT8eHrH; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLP122mxz9sSM for ; Fri, 27 Mar 2020 10:12:37 +1100 (AEDT) Received: from localhost ([::1]:34548 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbfv-00077k-40 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:12:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58342) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcQ-0002Jk-Ha for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcP-0001Te-EF for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:58 -0400 Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]:45972) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcP-0001SB-8l for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:57 -0400 Received: by mail-pg1-x52e.google.com with SMTP id o26so3638237pgc.12 for ; Thu, 26 Mar 2020 16:08:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Sz8Q6/lROF919J+K+cTqN6AeyEkEwGuAjIPol0tSqWA=; b=bCT8eHrH03fihfc4mys2UWYgxhCZJ3Eapf7D7S00Lg9bWrQGq0DSmu744se0BHLjGh DwuCyP4cypLCDASWuWTY6OYPCIoJP1OLzkFkhNICLoiZyEoLV17lB9IIWLuilfQ8FFDE HVOdsAKdSfZrHcS5MnA6JlC/maqVdgt+OfI5+ajF4MwkVuONJgvrmm8uUa6wfnwxaza/ YWdK9bCWF9lafluE/31aNF/sgdtqLlWBnUQ/1YZQWTbPdfm+A+sFngvccHk+p8nIkYhi m2BdmtW0TikIeH+bZkBbXm/V01DSVsfBIb3TonHZ2I0JUU56JrX01i/B8IKv7jBzbOKD FumQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Sz8Q6/lROF919J+K+cTqN6AeyEkEwGuAjIPol0tSqWA=; b=bVN4zZvwj8pA3KvftM9AKQ4vgD0P6Sqw1PFO8t/H+Gvr1IaFR0bs/zR8IxNqPrxScW Be5yY8J6fhtw2ZAquapwUqv2kZlG4uwgYkrbNmA5ljkFB+9MCQWB1LoSuTrdNajd7Sqq A9NdeW/8a77zW0wtextgA1xIrelHooNBmosQMjlHYo+EpswoRM9A496nl3/0AsdWoY8r CacJ/LMfSIENMt7HolHuG7swz6zfWU2x8isY6bw3Iv+2Dj/f4I4rnhr6zC9q3kwU2giA AoMr5SNkDlqIAAkOmspYsTd517jFca7NLHRooJ8P+Vpj6JNWsmo3Hq+BYDyMzoHYSUJG Bpsw== X-Gm-Message-State: ANhLgQ29f8hgYUy8gLGxSeKXlSkLPr3Lgyfo66SgXNpYcFb9xsqdMbTR QuL5u66ueSn5X4D9oTQrvedtqpdZtSc= X-Google-Smtp-Source: ADFU+vufQa/VgUXFEoKcEMOTG6Rk0bxrgJP56bBnhXIWWQTqr/kGxpV35bkjBKY1YGqrGvfEIQ72dw== X-Received: by 2002:a63:ff4e:: with SMTP id s14mr11080157pgk.269.1585264135783; Thu, 26 Mar 2020 16:08:55 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 12/31] target/arm: Implement SVE2 integer add/subtract interleaved long Date: Thu, 26 Mar 2020 16:08:19 -0700 Message-Id: <20200326230838.31112-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::52e X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 7 +++++++ target/arm/translate-sve.c | 3 +++ 2 files changed, 10 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a239fd3479..8d5f31bcc4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -109,6 +109,7 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rd_rm_rn ........ esz:2 . rn:5 ... ... rm:5 rd:5 &rrr_esz @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx @@ -1180,3 +1181,9 @@ SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract interleaved long + +SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rm_rn # SSUBLTB diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ee8a6fd912..accb74537b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6051,3 +6051,6 @@ DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) + +DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) +DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) From patchwork Thu Mar 26 23:08:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262443 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=hhI8h5Pi; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLVF3qf2z9sSH for ; Fri, 27 Mar 2020 10:17:09 +1100 (AEDT) Received: from localhost ([::1]:34688 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbkJ-0006E0-Ds for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:17:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58430) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcR-0002O1-UX for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcQ-0001VK-Fu for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:33612) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcQ-0001UG-9C for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:58 -0400 Received: by mail-pl1-x643.google.com with SMTP id g18so2752886plq.0 for ; Thu, 26 Mar 2020 16:08:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mMKWgjpLavmbt56//ygtC6ZwR2NAPw0rAnQXI9rkvA8=; b=hhI8h5PirpzeUyHGeWw5TyAz80KK5mgR5DlmLdJ2ACyl0WdBE2lwA+HxEqrAlstXvQ 2LTJSsfkSCDt+nDl10GmcP6aJijm8dzs3keJexTrj9J8Ef5kMdW7UAJTD7R0Dw4BOj9G YV/B/qxQhh7bRnEdENfUNlRdzMaYHKJ+SE30yGk4dsILc5RQUt1XRsR7i1l5yoYmEEb9 fPhxknTMAKFCA0WN59UwLJCC0wcblfgSPM39Oc0DI7UnYgw9xKZYhqeoGiun44p+PzxO uYzl1afbw/Z3tvQjbT2Z8BpXl/n3GWZUViXp0BqN/gJKfFDyEJMHxSh/s9602LKOZaWc 49UQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mMKWgjpLavmbt56//ygtC6ZwR2NAPw0rAnQXI9rkvA8=; b=rynQJnmg3ol28yMEWnrHz31bwu4GK5iZwL1vEBNw7RuaASCDZLQMlhWt1o1YpH3dxT TGgYvvm/ciKcxamJzJCLQCuLgmFCxeorYHfpYjGUmowP2Kn6AKQGmdEK/mD1+RFE/+Gy UWoFvKwtf2f30WWkyUS136yoRmJTvdg+M+tua/2colwa4gyHqPpsFehh6kCAXQhSDiD/ LVJtkQVWiv2u15HBLu9hMKL7cFRNQ0oq3SD6DCpAX4Gb4dajOaCHUEBHfDZK3V0X2DQM rOCMoI/ylsNS94L474gdoa5bnWKmDjXMOh9hLaWD5SYZ0n/bRORq1of3wigUSd35NFfC LuPA== X-Gm-Message-State: ANhLgQ0EFle4NjvudgJEyFl3O/3tQ3DUewSPX8KTn5az12b+kA1NF0q4 f8Szgu3UufpmAbsi7OyiaiYdkkInX+k= X-Google-Smtp-Source: ADFU+vuTJCzL3UguRr3noNGwssKrn1MkPNkM161nKOVoQx6WOwrfpkdrg9WF/QHzM3ihivyWqOMcDw== X-Received: by 2002:a17:902:a517:: with SMTP id s23mr10766321plq.125.1585264136950; Thu, 26 Mar 2020 16:08:56 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 13/31] target/arm: Implement SVE2 integer add/subtract wide Date: Thu, 26 Mar 2020 16:08:20 -0700 Message-Id: <20200326230838.31112-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::643 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 475fce7f3a..6a95c6085c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1396,6 +1396,22 @@ DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int) DEF_HELPER_FLAGS_4(sve_str, TCG_CALL_NO_WG, void, env, ptr, tl, int) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8d5f31bcc4..9994e1eb71 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1187,3 +1187,15 @@ UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm SSUBLBT 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rm_rn # SSUBLTB + +## SVE2 integer add/subtract wide + +SADDWB 01000101 .. 0 ..... 010 000 ..... ..... @rd_rn_rm +SADDWT 01000101 .. 0 ..... 010 001 ..... ..... @rd_rn_rm +UADDWB 01000101 .. 0 ..... 010 010 ..... ..... @rd_rn_rm +UADDWT 01000101 .. 0 ..... 010 011 ..... ..... @rd_rn_rm + +SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm +SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm +USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm +USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7d7a59f620..44503626e4 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1131,6 +1131,36 @@ DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD) #undef DO_ZZZ_TB +#define DO_ZZZ_WTB(NAME, TYPE, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel2 = (simd_data(desc) & 1) * sizeof(TYPE); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \ + *(TYPE *)(vd + i) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_WTB(sve2_saddw_h, int16_t, int8_t, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_s, int32_t, int16_t, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_d, int64_t, int32_t, DO_ADD) + +DO_ZZZ_WTB(sve2_ssubw_h, int16_t, int8_t, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_s, int32_t, int16_t, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_d, int64_t, int32_t, DO_SUB) + +DO_ZZZ_WTB(sve2_uaddw_h, uint16_t, uint8_t, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_s, uint32_t, uint16_t, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_d, uint64_t, uint32_t, DO_ADD) + +DO_ZZZ_WTB(sve2_usubw_h, uint16_t, uint8_t, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_s, uint32_t, uint16_t, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB) + +#undef DO_ZZZ_WTB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index accb74537b..fb214360bf 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6054,3 +6054,23 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) + +#define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], SEL2); \ +} + +DO_SVE2_ZZZ_WTB(SADDWB, saddw, false) +DO_SVE2_ZZZ_WTB(SADDWT, saddw, true) +DO_SVE2_ZZZ_WTB(SSUBWB, ssubw, false) +DO_SVE2_ZZZ_WTB(SSUBWT, ssubw, true) + +DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) +DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) +DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) +DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) From patchwork Thu Mar 26 23:08:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262447 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SaqlsuX5; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLXt0lj0z9sSL for ; Fri, 27 Mar 2020 10:19:26 +1100 (AEDT) Received: from localhost ([::1]:34748 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbmV-0001HU-WB for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:19:24 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58509) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcT-0002Rg-8Q for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcR-0001XV-Q2 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:01 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:39644) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcR-0001WS-Jq for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: by mail-pj1-x1044.google.com with SMTP id z3so2516103pjr.4 for ; Thu, 26 Mar 2020 16:08:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=j4qfax2U5xHz9yKRRnJ0t4mZCcpGgQcdv9vkX7WKX8I=; b=SaqlsuX5Nr9AGr9lWM0Bg3tTu5mQu0H/31bW/Ih5/q/4+oWIN93BxUBpAJ0DQHdTiT TG/DhHGc0PDhvQcZUzv2xGay40RDtb9dtbcIcdUd0WIEppJqYh+LAPW7fMZ+Ass/3lQ1 uu4Ua4cK+Wpsa8RPUDovOMNqP4EXmj8+mf1DpskvrIfQTpYU2IynMH4Ylr6xlTRiFYdm C8aXto4AhQgDlUVriiUvp5HSr0AzhFL2YvQIA7sUdRUSKMg/K068e/ghBBQoj9rnpaZz T3yR12eCQzDZBazdejtfDe65SS/dS5dnhlvt9l2mgz1cD9dMs5sXw3RG5KSGuyu/DUUB yTvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=j4qfax2U5xHz9yKRRnJ0t4mZCcpGgQcdv9vkX7WKX8I=; b=hdgwSzxujnm5fD0C5Emvu1BMuqOaAQep4TjJRkihtLSmjIY8ed+4/NHKUQMeDS4Ape Fk+jATv2AGPUazfqWpgJ2lLpSPpJ1UVh45ECezFUEmjQIv+/avyIdVHjVD2c/On7AeR+ Ovox9xeBswYbDsBT9IugitDESLdeW+k5xCTCUW2JbEaQabCfrk4kBTzANwJwq5o+hkzA xOW6Kds3sMruh4/ISpmLI8rbC2OQbh7ulwm5wfqwuOqB+/FSjpOsAsxUimEVOHPwUYMg tyBHIwM8xAd0jR92bkrgLzv2jpCBx+z1YsShU5gjjvaEhUJwL6/uBKHHeJeB2C80LUJE Qqow== X-Gm-Message-State: ANhLgQ1NeEsaj7UOYGJlH/weq1QGlQs4QuclNrnzkcyhSy69WmcH+vcb P9PTAGDRPK2DEg6AjmuMwdXWp6zx6hg= X-Google-Smtp-Source: ADFU+vvhASiOUvGHuI+0G2bd6PbKgavD7nKIk4ULVRitjmbdYTXe9uTRgYDUCi9ya6B2136WtEuL5A== X-Received: by 2002:a17:902:a9cc:: with SMTP id b12mr10337563plr.177.1585264138249; Thu, 26 Mar 2020 16:08:58 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 14/31] target/arm: Implement SVE2 integer multiply long Date: Thu, 26 Mar 2020 16:08:21 -0700 Message-Id: <20200326230838.31112-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1044 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Exclude PMULL from this category for the moment. Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 15 +++++++++++++++ target/arm/sve.decode | 9 +++++++++ target/arm/sve_helper.c | 31 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 9 +++++++++ 4 files changed, 64 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6a95c6085c..c4784919d2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2355,4 +2355,19 @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd_mte, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_stdd_be_zd_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_smull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_umull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9994e1eb71..2410dd85a1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1199,3 +1199,12 @@ SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm + +## SVE2 integer multiply long + +SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm +SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm +SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm +UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm +UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 44503626e4..130697f3d9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1129,6 +1129,37 @@ DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, DO_ABD) DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, DO_ABD) DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD) +DO_ZZZ_TB(sve2_smull_zzz_h, int16_t, int8_t, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_s, int32_t, int16_t, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_d, int64_t, int32_t, DO_MUL) + +DO_ZZZ_TB(sve2_umull_zzz_h, uint16_t, uint8_t, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_s, uint32_t, uint16_t, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_d, uint64_t, uint32_t, DO_MUL) + +/* Note that the multiply cannot overflow, but the doubling can. */ +static inline int16_t do_sqdmull_h(int16_t n, int16_t m) +{ + int16_t val = n * m; + return DO_SQADD_H(val, val); +} + +static inline int32_t do_sqdmull_s(int32_t n, int32_t m) +{ + int32_t val = n * m; + return DO_SQADD_S(val, val); +} + +static inline int64_t do_sqdmull_d(int64_t n, int64_t m) +{ + int64_t val = n * m; + return do_sqadd_d(val, val); +} + +DO_ZZZ_TB(sve2_sqdmull_zzz_h, int16_t, int8_t, do_sqdmull_h) +DO_ZZZ_TB(sve2_sqdmull_zzz_s, int32_t, int16_t, do_sqdmull_s) +DO_ZZZ_TB(sve2_sqdmull_zzz_d, int64_t, int32_t, do_sqdmull_d) + #undef DO_ZZZ_TB #define DO_ZZZ_WTB(NAME, TYPE, TYPEN, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fb214360bf..c66ec9eb83 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6055,6 +6055,15 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) +DO_SVE2_ZZZ_TB(SQDMULLB_zzz, sqdmull_zzz, false, false) +DO_SVE2_ZZZ_TB(SQDMULLT_zzz, sqdmull_zzz, true, true) + +DO_SVE2_ZZZ_TB(SMULLB_zzz, smull_zzz, false, false) +DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) + +DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) +DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ From patchwork Thu Mar 26 23:08:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262453 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=RJ64crcL; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLbp4jLzz9sSH for ; Fri, 27 Mar 2020 10:21:58 +1100 (AEDT) Received: from localhost ([::1]:34826 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHboy-0004wa-GS for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:21:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58587) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcU-0002V7-G1 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcT-0001Zq-3G for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:02 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:33179) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcS-0001Yc-T8 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:01 -0400 Received: by mail-pf1-x441.google.com with SMTP id j1so3565816pfe.0 for ; Thu, 26 Mar 2020 16:09:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rI8qejJNvyJMtMzOFbbZO4TRq+KeIHi8lkQl/Kjod0g=; b=RJ64crcLI/IiapbqkQWiQAhcVnS+EC2JxfMAlP2pfuq6LQaG4KuMeOlLm0j2ym8b1J WBDNEQeWBAFCsdRRKryCr/J32tu53LGaGCZAspzLYWODv6Fj/ftlPAJXTWHlhviJ1o4V tYSnn9owYLxT9Qjbj8rcgMP2jwIBVKNx2jbcVfcj3Vjoy93ehTz/QATOJhW48JcdiujE t2XuWILZcr49oH6Hgc7fNzXWHPHKZV74IYeuavftKeu7xawIursnI8kWZI5n6kij7Acy 62i4vIGyxaWCifq5a8xXwhnTX/FXM4xPos/yBikiF0Xhu8sxWTTyMdCHbLnCea9cQrYU M53g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rI8qejJNvyJMtMzOFbbZO4TRq+KeIHi8lkQl/Kjod0g=; b=rw6m/RYDqKnEHXemhgH1yJMLHN3oCGBYsZzKa0TUCaq3LQgSl6NekNC/PEbTOuRpHI n8AkeXBirJm0XD87dAniwxvkRnnFO0UaIyj6la3jGjz/HGfpGf0n+Y/cpRJNRnqYeWCs ELBUpgd88E2/fMPGp3UNSvRdXKcPgvNleUVLdI4VM+VBFyr01xW5kNv1opC9l0e6qqjN HF05+73pJdP69NZPpklQJkBmj5ltZIKOjrDzvum+0pzPztYzAOhZdvc4GNKAtMrmpuPx zkA5+B6ql9z/mIxMamESNmn/wp05VFZIb9oOrLGZLtnJvdJyIcJNfCIXlnh9yHyIpkb0 5rKg== X-Gm-Message-State: ANhLgQ31mKZqOmzy0ym7H3Rh38YEeBprZXwpuySSC1mNki5gRPtFD6fj px749Dm+7yS22IRST0DpDtYNbj19SZM= X-Google-Smtp-Source: ADFU+vvHyh/Krk3BSeqstT31AiS82tf0RqBBBAmJzXGEvwCDnt1DkM/oPkiuHfp3N7T2wvOR0FwIWw== X-Received: by 2002:aa7:9f94:: with SMTP id z20mr9893160pfr.261.1585264139443; Thu, 26 Mar 2020 16:08:59 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 15/31] target/arm: Implement PMULLB and PMULLT Date: Thu, 26 Mar 2020 16:08:22 -0700 Message-Id: <20200326230838.31112-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 10 ++++++++++ target/arm/helper-sve.h | 1 + target/arm/sve.decode | 2 ++ target/arm/translate-sve.c | 22 ++++++++++++++++++++++ target/arm/vec_helper.c | 26 ++++++++++++++++++++++++++ 5 files changed, 61 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 2314e3c18c..2e9d9f046d 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3855,6 +3855,16 @@ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0; } +static inline bool isar_feature_aa64_sve2_aes(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) != 0; +} + +static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c4784919d2..943839e2dc 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2371,3 +2371,4 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2410dd85a1..04bf9e5ce8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1204,6 +1204,8 @@ USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +PMULLB 01000101 .. 0 ..... 011 010 ..... ..... @rd_rn_rm +PMULLT 01000101 .. 0 ..... 011 011 ..... ..... @rd_rn_rm SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c66ec9eb83..67416a25ce 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6064,6 +6064,28 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_pmull_q, gen_helper_sve2_pmull_h, + NULL, gen_helper_sve2_pmull_d, + }; + if (a->esz == 0 && !dc_isar_feature(aa64_sve2_pmull128, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], sel); +} + +static bool trans_PMULLB(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, false); +} + +static bool trans_PMULLT(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, true); +} + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 00dc38c9db..154d32518a 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1256,6 +1256,32 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = pmull_h(nn, mm); } } + +static uint64_t pmull_d(uint64_t op1, uint64_t op2) +{ + uint64_t result = 0; + int i; + + for (i = 0; i < 32; ++i) { + uint64_t mask = -((op1 >> i) & 1); + result ^= (op2 << i) & mask; + } + return result; +} + +void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + int shift = simd_data(desc) * 32; + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + uint64_t nn = (uint32_t)(n[i] >> shift); + uint64_t mm = (uint32_t)(m[i] >> shift); + + d[i] = pmull_d(nn, mm); + } +} #endif /* From patchwork Thu Mar 26 23:08:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262457 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=uIqBnWL8; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLds4FPpz9sSH for ; Fri, 27 Mar 2020 10:23:45 +1100 (AEDT) Received: from localhost ([::1]:34884 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbqh-0008Nn-Fz for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:23:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58659) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcV-0002YU-Ls for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:04 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcU-0001cl-GL for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:03 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:36573) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcU-0001b3-A1 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:02 -0400 Received: by mail-pf1-x442.google.com with SMTP id i13so3553596pfe.3 for ; Thu, 26 Mar 2020 16:09:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cv1t9pmvTM2or2y2OW8+tePDhkWPmqSM7xnDl7FKLbM=; b=uIqBnWL8T4v191boSmUZwB494xE/PGFRifYhV43jeTNSJ3Cj5Mm43BNKjSeUAl55sw vZpHh6nj8aEaZFog2zlzPCeedO3UXD8V9+nLMzCuqI9iircm7N7FIdUre1kskonwCdgr KBHtgE1vg8AQRYnz74dth6UCBR9+I222bWevRB9nK1VYtmgZ4rIND3oCiHe1/TrhKiRj IGGoyLmmMUQPujB2Rf9iBo5fch/Cxcbj0DMTZuam8fRBKKxY0j1JMG/s5Fbf0dobOhFe 8SVvRMM1qrpW8VkmErq1AKHlE96DpetQ8FrjlFbH2hHMWpY8JvoWgRElvU/Kt2HJGreR 6S+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cv1t9pmvTM2or2y2OW8+tePDhkWPmqSM7xnDl7FKLbM=; b=cD5NE3WDJ7J4ulLw8t5WPRLYs8iMeZVZX2+kbvHesJmg9WZhFDyFOb4XQLyMPwHp7K psUq+VSe18iO9KlrPG5TIqKYFGZdwG0C6LhanICRclVTSj/X4xPp/Js47I+akgsjvkz7 fIVEGxA0LT7uneqpwvE4x3EON3/Wtgp6F2MfTPzNb7RmTrj6Iew+pvXAFyO57OxuE8yx 8KUSLxTpfOgxPwnSWWRzkrRvpY9gy73N9oI2Qxkq3u7uOiYuY/kF0Bxl18hAJv/gEohV czJqYK2GjIEKRtut/xlcpdTSs+rGF4sJRhmJ4dO0rjNCjWYlWGIaWFCm/m49y+SmKAcV r5QA== X-Gm-Message-State: ANhLgQ0x0JEoLA+FQi7YU4tit+wViFT2Mm7Xc30eOrCxv+zm6PSmhNHa q5FJLDhvoZkmsgfdkYMva+u59GPGnjE= X-Google-Smtp-Source: ADFU+vt42cA3UfdHbYZGsk8oAxIRBhVuTro4FBG3J6f9YDSKvCyVbgXzqvmyUd8xI8so8TUdQnDSag== X-Received: by 2002:a63:e544:: with SMTP id z4mr10927219pgj.174.1585264140810; Thu, 26 Mar 2020 16:09:00 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 16/31] target/arm: Tidy SVE tszimm shift formats Date: Thu, 26 Mar 2020 16:08:23 -0700 Message-Id: <20200326230838.31112-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::442 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Rather than require the user to fill in the immediate (shl or shr), create full formats that include the immediate. Signed-off-by: Richard Henderson --- target/arm/sve.decode | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 04bf9e5ce8..440cff4597 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -151,13 +151,17 @@ @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri # Two register operand, one immediate operand, with predicate, -# element size encoded as TSZHL. User must fill in imm. -@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ - &rpri_esz rn=%reg_movprfx esz=%tszimm_esz +# element size encoded as TSZHL. +@rdn_pg_tszimm_shl ........ .. ... ... ... pg:3 ..... rd:5 \ + &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shl +@rdn_pg_tszimm_shr ........ .. ... ... ... pg:3 ..... rd:5 \ + &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shr # Similarly without predicate. -@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ - &rri_esz esz=%tszimm16_esz +@rd_rn_tszimm_shl ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=%tszimm16_esz imm=%tszimm16_shl +@rd_rn_tszimm_shr ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=%tszimm16_esz imm=%tszimm16_shr # Two register operand, one immediate operand, with 4-bit predicate. # User must fill in imm. @@ -290,14 +294,10 @@ UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn ### SVE Shift by Immediate - Predicated Group # SVE bitwise shift by immediate (predicated) -ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr -LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr -LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shl -ASRD 00000100 .. 000 100 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr +ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm_shr +LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm_shr +LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm_shl +ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm_shr # SVE bitwise shift by vector (predicated) ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm @@ -401,12 +401,9 @@ RDVL 00000100 101 11111 01010 imm:s6 rd:5 ### SVE Bitwise Shift - Unpredicated Group # SVE bitwise shift by immediate (unpredicated) -ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shr -LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shr -LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shl +ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... @rd_rn_tszimm_shr +LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... @rd_rn_tszimm_shr +LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... @rd_rn_tszimm_shl # SVE bitwise shift by wide elements (unpredicated) # Note esz != 3 From patchwork Thu Mar 26 23:08:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262440 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=UC7RaQjB; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLRq104Kz9sSH for ; Fri, 27 Mar 2020 10:15:03 +1100 (AEDT) Received: from localhost ([::1]:34610 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbiH-0002s3-2z for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:15:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58747) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcX-0002cq-8O for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcV-0001ee-Mr for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:05 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:36574) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcV-0001dX-H3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:03 -0400 Received: by mail-pf1-x443.google.com with SMTP id i13so3553621pfe.3 for ; Thu, 26 Mar 2020 16:09:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6gFXa7U00jC32XzVhB/bZg7gt2oypaZr80DxmsTe9GI=; b=UC7RaQjBnyI9Jsd1QHYb4JunreupUSW8ZCngkDpMBSwvQOxyLaqftlmlP5XFuK+nze QyGHtHWGh//fU+Qaf5L7e0rY3w0oZ8q+XycCkbTXNsxv0UaDXibcCf2oIaZ5sn5aNUjF OO0jKglu9zcYUtZ/AwGqb2IUpR/+V90dtejgWmLQb+ZqRPQaRA+LmBICmqdXJQD9fGvw ku4AL7MNAh65IjqrF5lphjGQY/VHQLwHWtCGJ1+etq4UTRmG5jl6LmhneasoxctYHHSI e6CEzWtynYSVEOccw4jB9AbUUzZCSGXLBUfWhtcGQ0AbXZOISjPsqrR2M1S3SVyo+hN+ Qgyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6gFXa7U00jC32XzVhB/bZg7gt2oypaZr80DxmsTe9GI=; b=WwygLfA/3gaauGV3xEOSn1sWaTUdLjEtSaWMEhEMy3TIpbw5nXMaCalJKFqemL5CjO tpiS2i3vAI03cmK4etjasokA8cReCG4KmUhiAzJ+Z8qIF+yaWiYEpxN+FLfB6y2Ir+eh Be+ob7Zmpq+i9EWbb9r559wZ19H3r0rs8fVp+CJ6zb1wRmq6qgbAP/zjaMftLZc6fhCq 21oiqEiTAboOEjli72pW37xX/7ao+UDfMzDeS8LXDjcJTbezK/fEOPCreRzrEZ25GmfB NMsiont1iGeeIgk6qNxFiPdNBkCq8kDAGgJ1O2Nzc1hh/AtyW+g8JScs/N1cW5or2mJf Njww== X-Gm-Message-State: ANhLgQ3XEK3QB737IfH40NC2sg1q1qN80uI9jEZ4RVDXlz+TMuBYXuYf 0WfVlCcTB1FBTHTpF1OxJNLridgH8Hs= X-Google-Smtp-Source: ADFU+vt56MvLnhu81r+GJWVg0oIEDFWEYDoC9gjAoBVja7kBc+4HtJxVTFlbTqgidp/n1V+XPiBPpQ== X-Received: by 2002:aa7:9790:: with SMTP id o16mr11614378pfp.322.1585264142088; Thu, 26 Mar 2020 16:09:02 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 17/31] target/arm: Implement SVE2 bitwise shift left long Date: Thu, 26 Mar 2020 16:08:24 -0700 Message-Id: <20200326230838.31112-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 +++++++ target/arm/sve.decode | 8 +++++++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 49 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 99 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 943839e2dc..9c0c41ba80 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2372,3 +2372,11 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sshll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 440cff4597..36ef9de563 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1207,3 +1207,11 @@ SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm + +## SVE2 bitwise shift left long + +# Note bit23 == 0 is handled by esz > 0 in do_sve2_shll_tb. +SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl +SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl +USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl +USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 130697f3d9..e0a701c446 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -625,6 +625,8 @@ DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) +#undef do_sqrshl_d + #define do_uqrshl_b(n, m) \ ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, true, &discard); }) #define do_uqrshl_h(n, m) \ @@ -639,6 +641,8 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) +#undef do_uqrshl_d + #define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) #define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) @@ -1192,6 +1196,36 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel = (simd_data(desc) & 1) * sizeof(TYPE); \ + int shift = simd_data(desc) >> 1; \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel); \ + *(TYPE *)(vd + i) = OP(nn, shift); \ + } \ +} + +#define DO_SSHLL_H(val, sh) do_sqrshl_bhs(val, sh, 16, false, NULL) +#define DO_SSHLL_S(val, sh) do_sqrshl_bhs(val, sh, 32, false, NULL) +#define DO_SSHLL_D(val, sh) do_sqrshl_d(val, sh, false, NULL) + +DO_ZZI_SHLL(sve2_sshll_h, int16_t, int8_t, DO_SSHLL_H) +DO_ZZI_SHLL(sve2_sshll_s, int32_t, int16_t, DO_SSHLL_S) +DO_ZZI_SHLL(sve2_sshll_d, int64_t, int32_t, DO_SSHLL_D) + +#define DO_USHLL_H(val, sh) do_uqrshl_bhs(val, sh, 16, false, NULL) +#define DO_USHLL_S(val, sh) do_uqrshl_bhs(val, sh, 32, false, NULL) +#define DO_USHLL_D(val, sh) do_uqrshl_d(val, sh, false, NULL) + +DO_ZZI_SHLL(sve2_ushll_h, uint16_t, uint8_t, DO_USHLL_H) +DO_ZZI_SHLL(sve2_ushll_s, uint32_t, uint16_t, DO_USHLL_S) +DO_ZZI_SHLL(sve2_ushll_d, uint64_t, uint32_t, DO_USHLL_D) + +#undef DO_ZZI_SHLL + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 67416a25ce..9873b83feb 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6105,3 +6105,52 @@ DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) + +static bool do_sve2_shll_tb(DisasContext *s, arg_rri_esz *a, + bool sel, bool uns) +{ + static gen_helper_gvec_2 * const fns[2][3] = { + { gen_helper_sve2_sshll_h, + gen_helper_sve2_sshll_s, + gen_helper_sve2_sshll_d }, + { gen_helper_sve2_ushll_h, + gen_helper_sve2_ushll_s, + gen_helper_sve2_ushll_d }, + }; + + if (a->esz <= 0 || !dc_isar_feature(aa64_sve2, s)) { + /* + * For < 0, invalid tsz encoding -- see tszimm_esz. + * For = 0, not a widening operation; note this implies bit23 = 0. + */ + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, (a->imm << 1) | sel, + fns[uns][a->esz - 1]); + } + return true; +} + +static bool trans_SSHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, false); +} + +static bool trans_SSHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, false); +} + +static bool trans_USHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, true); +} + +static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, true); +} From patchwork Thu Mar 26 23:08:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262464 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BrEbMv9x; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLjh2N3Wz9sTC for ; Fri, 27 Mar 2020 10:27:04 +1100 (AEDT) Received: from localhost ([::1]:34978 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbtu-0005C3-6u for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:27:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58819) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcY-0002gA-HA for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcW-0001gc-Uf for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:06 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:33342) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcW-0001f9-OI for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:04 -0400 Received: by mail-pf1-x42b.google.com with SMTP id j1so3565885pfe.0 for ; Thu, 26 Mar 2020 16:09:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=wrW/Tw+fzY4lh5Sk9yEJSGX1ZVgOSoYBH6s+EdHCLNI=; b=BrEbMv9x6I4UYjjt578PfVTp5NTnY4KulJP7eMueszsSXLHs4JDW+q4ZAyAKMqjTvF puiyrdvzPlBAiwdmNIIW1JgZLQmwqpf1HQKpYe4XIxCo3h9yH+fI7WwaV+24LQbbvXKC Cf5caroStFOxthkkauY/DLdJ8IMtUoSB5P6dBHYvMcLzDBtEiNN2lmQLmLRgNHnZDPF4 pUrvZXk3D9HSnOqtg2j8GE6Tay3hCT9rwg++eGWXhJvY+mPa8diiJOPxO7ndFZY8E4HX FOT4oD8TMq52Ue57ktHpOuzDAYRTLU1prnFWQ80j78ucP0Sz1Anr6S18LF5vQSXNtL/2 Qgng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wrW/Tw+fzY4lh5Sk9yEJSGX1ZVgOSoYBH6s+EdHCLNI=; b=rSdPbr7VZ6IQJmG+Y5Vvr6cKzBxYxtx1gV7GG+H1/z1owLV7EJNt5mFXwW8oNN6p09 X0UWFdh308weFUqGW6pAVRo5DfvVEXI/AOKRJENwZ+cTaAS/km0WRGvKwRm8HvIiJXat jkinAPpRW8olzqOxMutjo6flmvBpmyA+ga0qGurI6vwy4gbALL1oKbw34oY4Bp/emHNo RQp+7XD8lftfmLiiNer+lvhBHxQxfmdVixojoIph8mVKDDk5KQKflh90yDcSXiD1krpV Fnyvh/AKG/wVCwzaD81wN52tyE1DMJukxOkB0p+85oFp0gwoVb1f55cNauEDvZnKeS0P w9sQ== X-Gm-Message-State: ANhLgQ0xh8miLRqarlJKmc3m3IdJZP7VengMUQE3tRkmjOd/1p90+8uc Jmq92qy4RxZnapSnB4Kw4UZ1jJAFTv4= X-Google-Smtp-Source: ADFU+vuhWlYBnjC1nZMAKFmHHvViUNV5mVwls0R2Fyf4gizmiEGDRcS4Xt3SLPXzti6BOUxTGMxlLg== X-Received: by 2002:a62:3305:: with SMTP id z5mr11968336pfz.297.1585264143340; Thu, 26 Mar 2020 16:09:03 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 18/31] target/arm: Implement SVE2 bitwise exclusive-or interleaved Date: Thu, 26 Mar 2020 16:08:25 -0700 Message-Id: <20200326230838.31112-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::42b X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ 4 files changed, 49 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9c0c41ba80..9e894a2b55 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2380,3 +2380,8 @@ DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 36ef9de563..8af35e48a5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1215,3 +1215,8 @@ SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl + +## SVE2 bitwise exclusive-or interleaved + +EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm +EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e0a701c446..15ea1fd524 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1196,6 +1196,26 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZZ_NTB(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPE); \ + intptr_t sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPE); \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + H(i + sel1)); \ + TYPE mm = *(TYPE *)(vm + H(i + sel2)); \ + *(TYPE *)(vd + H(i + sel1)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_NTB(sve2_eoril_b, uint8_t, H1, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_h, uint16_t, H1_2, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_s, uint32_t, H1_4, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) + +#undef DO_ZZZ_NTB + #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9873b83feb..3eaf9cbe51 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6064,6 +6064,25 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_eor_tb(DisasContext *s, arg_rrr_esz *a, bool sel1) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_eoril_b, gen_helper_sve2_eoril_h, + gen_helper_sve2_eoril_s, gen_helper_sve2_eoril_d, + }; + return do_sve2_zzw_ool(s, a, fns[a->esz], (!sel1 << 1) | sel1); +} + +static bool trans_EORBT(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, false); +} + +static bool trans_EORTB(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, true); +} + static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) { static gen_helper_gvec_3 * const fns[4] = { From patchwork Thu Mar 26 23:08:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262448 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=IGICM+am; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLYH0W00z9sSH for ; Fri, 27 Mar 2020 10:19:45 +1100 (AEDT) Received: from localhost ([::1]:34762 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbmo-0001fm-4L for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:19:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58899) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcZ-0002jY-RV for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcY-0001io-7O for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:07 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]:39120) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcY-0001hL-0l for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:06 -0400 Received: by mail-pf1-x434.google.com with SMTP id d25so3539181pfn.6 for ; Thu, 26 Mar 2020 16:09:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/qWmMRJyiGABbx/G2niY2yMmtnq+8Op3LMQTaXLdldI=; b=IGICM+amvQtPCB/7f1CwynEjtdKE7uXrktOg20btDhVHcMbqYwqCizDEjeWASh2N7A u6AcwMa/1X97tyi7ypnwLjU1j7eMQEXh1Nc2hu88+Xv2DHIwbS1ie8sygjfZg/URRjo7 63RJYn/gQ868f2rPc9bbAViA2Z11qbwx7YykmUrxqVZ90O5mHpA/Tcx3RqFlHl8g59q3 7DuCuY1/KSuTsf3MD/s1pbY5yWCfF5xHVeW9+oSyGzfw///4nBhSNTMB6uqu/EE0vsgS fFdZggHdsfigx+BaezIcc8nUAkOdz18BMXOWJFLcYpEgXgyKbZD8iWP873zB8c8nDeH7 n6bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/qWmMRJyiGABbx/G2niY2yMmtnq+8Op3LMQTaXLdldI=; b=KYRPvQLd8Vstce0ogo/Vc/IaS8i9oFFPmhlReYumWWtATMi0gFKHJyanMJA5yBpRgm uuWuebKlRHcSMYfDtXjEx0pUBLzMmzBzifzdag02vAHMiG8qBelHn6ATvNs/vp4C8mSU p/EUW6UvjbK/bSX9VvZ8ZBZZhcZ9O1sBWXa5DkhvD1nq+KivgBxqIvGhvilXS0VdI0/N 03/HDWBQuRjTG6bN7ezY5+JTOiU7hZGFmekBS47o9qq+CVYyeTQ8np1ZZvCrygBEH7gw 12dwqsnWUQlrIPAQOiHp5eKXX53F7FHrasaWeH/N5CPEJaYW2fEZGESRYSsWo5FORLaK Hlpw== X-Gm-Message-State: ANhLgQ2EZxwwCyHzOE+LckkM98i8zbAasq4Rvv0b52S5356pYSf8dHDy InQZU9hGs7Cx5rGddWH6CFFLjHwLlJo= X-Google-Smtp-Source: ADFU+vucrOX7fdVivHu37Z+LLmkFPV6+phIKoJIUv3w7eoJURFJtO1MouxAsuq8gt2qYY8Hi9J6TuQ== X-Received: by 2002:a62:4e57:: with SMTP id c84mr12144654pfb.156.1585264144578; Thu, 26 Mar 2020 16:09:04 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 19/31] target/arm: Implement SVE2 bitwise permute Date: Thu, 26 Mar 2020 16:08:26 -0700 Message-Id: <20200326230838.31112-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::434 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++ target/arm/helper-sve.h | 15 ++++++++ target/arm/sve.decode | 6 ++++ target/arm/sve_helper.c | 73 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 36 +++++++++++++++++++ 5 files changed, 135 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 2e9d9f046d..b7c7946771 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3865,6 +3865,11 @@ static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; } +static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9e894a2b55..466b01986f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2385,3 +2385,18 @@ DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bext_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bdep_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8af35e48a5..ca60e9f2ce 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1220,3 +1220,9 @@ USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm + +## SVE2 bitwise permute + +BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm +BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm +BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 15ea1fd524..b5afa34efe 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1216,6 +1216,79 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_BITPERM(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + TYPE mm = *(TYPE *)(vm + i); \ + *(TYPE *)(vd + i) = OP(nn, mm, sizeof(TYPE) * 8); \ + } \ +} + +static uint64_t bitextract(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int db, rb = 0; + + for (db = 0; db < n; ++db) { + if ((mask >> db) & 1) { + res |= ((data >> db) & 1) << rb; + ++rb; + } + } + return res; +} + +DO_BITPERM(sve2_bext_b, uint8_t, bitextract) +DO_BITPERM(sve2_bext_h, uint16_t, bitextract) +DO_BITPERM(sve2_bext_s, uint32_t, bitextract) +DO_BITPERM(sve2_bext_d, uint64_t, bitextract) + +static uint64_t bitdeposit(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int rb, db = 0; + + for (rb = 0; rb < n; ++rb) { + if ((mask >> rb) & 1) { + res |= ((data >> db) & 1) << rb; + ++db; + } + } + return res; +} + +DO_BITPERM(sve2_bdep_b, uint8_t, bitdeposit) +DO_BITPERM(sve2_bdep_h, uint16_t, bitdeposit) +DO_BITPERM(sve2_bdep_s, uint32_t, bitdeposit) +DO_BITPERM(sve2_bdep_d, uint64_t, bitdeposit) + +static uint64_t bitgroup(uint64_t data, uint64_t mask, int n) +{ + uint64_t resm = 0, resu = 0; + int db, rbm = 0, rbu = 0; + + for (db = 0; db < n; ++db) { + uint64_t val = (data >> db) & 1; + if ((mask >> db) & 1) { + resm |= val << rbm++; + } else { + resu |= val << rbu++; + } + } + + return resm | (resu << rbm); +} + +DO_BITPERM(sve2_bgrp_b, uint8_t, bitgroup) +DO_BITPERM(sve2_bgrp_h, uint16_t, bitgroup) +DO_BITPERM(sve2_bgrp_s, uint32_t, bitgroup) +DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) + +#undef DO_BITPERM + #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3eaf9cbe51..375b9dc983 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6173,3 +6173,39 @@ static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) { return do_sve2_shll_tb(s, a, true, true); } + +static bool trans_BEXT(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bext_b, gen_helper_sve2_bext_h, + gen_helper_sve2_bext_s, gen_helper_sve2_bext_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BDEP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bdep_b, gen_helper_sve2_bdep_h, + gen_helper_sve2_bdep_s, gen_helper_sve2_bdep_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bgrp_b, gen_helper_sve2_bgrp_h, + gen_helper_sve2_bgrp_s, gen_helper_sve2_bgrp_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} From patchwork Thu Mar 26 23:08:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262454 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=C3NYIsWk; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLc14QFGz9sSH for ; Fri, 27 Mar 2020 10:22:09 +1100 (AEDT) Received: from localhost ([::1]:34838 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbp9-0005Nv-Gp for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:22:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59048) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcc-0002pA-Oa for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcZ-0001kl-Au for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:08 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:52092) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcZ-0001jY-4Z for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:07 -0400 Received: by mail-pj1-x1033.google.com with SMTP id w9so3092755pjh.1 for ; Thu, 26 Mar 2020 16:09:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=z4NXK1cCVKPNkZrtCQBn5TawdBNm+oJul6QWfWmZysk=; b=C3NYIsWkK2jC2OBvctWMG4R9cQGXUIRJxoFtLMVH3vbR+h+iV5c2khBUZL/Kn/eJ6d Uy+SzmOgWiiZrmJVmlYfwN7jhg39GQNwq7r1ACRu12d57mQsJYLlctLt78shDvFEDkJr RjyvS23MCbQuegedqaSsTE2WB1VJDwV+Zk0oUjDYk/ZhnJ0uyQtvtCLun/H49Vtzm1Xn ZS2LIloP67Loxau11L8MLCg0o8LnIr6wm4ZZuLLHsEo3cidqSQsC5eTepvlWf/tXEoKT iZgsdkhIpgo15QUSUcKUEqywr1A2qUE1GOzRJhufvAQtGQ1Q02j37gVZGAUz6B1AD2nj iCaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=z4NXK1cCVKPNkZrtCQBn5TawdBNm+oJul6QWfWmZysk=; b=sngz6V/NzSWwzLK3DWNiz0octd9sN5vhjSyvjUh7HQyiwKdAx1cvQ4EPoF6/L2pYwt xRSUxRXnMHkDEiUVpyNlImIFiAtb0ua1RMksLqtcd+aMC6jsy/cNln+BCS27WmAgIOaH /H5bNZJPkajrAErTE0UaBIqLlUEC7jrCPC962BxA0CNigehNZ91JS+mfDoHrISNPPXCp nAOktQH6k8ViBsd2HOoRZp0lf0iY8csGKZjCAUvGc/RTNkdznREcZr/ApUaZgP3YgTW6 pesjseTgnyyftQ5z16rnHmnj/ucjNoJtlnGYcjub0cuInV0WjKP37YdQighQJ2b0ZgxT rruw== X-Gm-Message-State: ANhLgQ2wMkk3i845Ax60DbEBw94R1CGHlNoNUPV8VPTMSv0EJrxElTVD plLwC/Fk8EwJmeTJtn17BhRKKV+Rqos= X-Google-Smtp-Source: ADFU+vuVVmjTXCH8Hx0kieKCP17idb53WPWuFFGB2EmWSk2zXBGfmSiHOqDWRB8KNxpI9EaUZL0GaQ== X-Received: by 2002:a17:90a:cc14:: with SMTP id b20mr2714276pju.75.1585264145760; Thu, 26 Mar 2020 16:09:05 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 20/31] target/arm: Implement SVE2 complex integer add Date: Thu, 26 Mar 2020 16:08:27 -0700 Message-Id: <20200326230838.31112-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1033 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 10 +++++++++ target/arm/sve.decode | 9 ++++++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 31 ++++++++++++++++++++++++++++ 4 files changed, 92 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 466b01986f..0e4b4c48da 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2400,3 +2400,13 @@ DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_cadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ca60e9f2ce..5fb4b5f977 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1226,3 +1226,12 @@ EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm + +#### SVE2 Accumulate + +## SVE2 complex integer add + +CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm +CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm +SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm +SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b5afa34efe..a3653007ac 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1289,6 +1289,48 @@ DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) #undef DO_BITPERM +#define DO_CADD(NAME, TYPE, H, ADD_OP, SUB_OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sub_r = simd_data(desc); \ + if (sub_r) { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = SUB_OP(acc_r, el2_i); \ + acc_i = ADD_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } else { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = ADD_OP(acc_r, el2_i); \ + acc_i = SUB_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } \ +} + +DO_CADD(sve2_cadd_b, int8_t, H1, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_h, int16_t, H1_2, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_s, int32_t, H1_4, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_d, int64_t, , DO_ADD, DO_SUB) + +DO_CADD(sve2_sqcadd_b, int8_t, H1, DO_SQADD_B, DO_SQSUB_B) +DO_CADD(sve2_sqcadd_h, int16_t, H1_2, DO_SQADD_H, DO_SQSUB_H) +DO_CADD(sve2_sqcadd_s, int32_t, H1_4, DO_SQADD_S, DO_SQSUB_S) +DO_CADD(sve2_sqcadd_d, int64_t, , do_sqadd_d, do_sqsub_d) + +#undef DO_CADD + #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 375b9dc983..3b0aa86e79 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6209,3 +6209,34 @@ static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) } return do_sve2_zzw_ool(s, a, fns[a->esz], 0); } + +static bool do_cadd(DisasContext *s, arg_rrr_esz *a, bool sq, bool rot) +{ + static gen_helper_gvec_3 * const fns[2][4] = { + { gen_helper_sve2_cadd_b, gen_helper_sve2_cadd_h, + gen_helper_sve2_cadd_s, gen_helper_sve2_cadd_d }, + { gen_helper_sve2_sqcadd_b, gen_helper_sve2_sqcadd_h, + gen_helper_sve2_sqcadd_s, gen_helper_sve2_sqcadd_d }, + }; + return do_sve2_zzw_ool(s, a, fns[sq][a->esz], rot); +} + +static bool trans_CADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, false); +} + +static bool trans_CADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, true); +} + +static bool trans_SQCADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, false); +} + +static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, true); +} From patchwork Thu Mar 26 23:08:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262458 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=yw8hGYah; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLf54Y88z9sSH for ; Fri, 27 Mar 2020 10:23:57 +1100 (AEDT) Received: from localhost ([::1]:34892 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbqt-0000S3-Im for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:23:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59089) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcd-0002r2-C1 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcb-0001oV-Q1 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:11 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:37022) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbca-0001m4-KA for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:09 -0400 Received: by mail-pj1-x1044.google.com with SMTP id o12so3028580pjs.2 for ; Thu, 26 Mar 2020 16:09:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LQixDJy63XBgzN/Out1zz4zHKJbz7GOqSodi9xHGXHY=; b=yw8hGYahmmK5OWDdb+aXE7qdpyChrE6WKdynmHRx10QCfxGN12pxCrUM3xOHoJHocf IT4y1kaUnMJ9dew4pZAMz2N4TM1EZtNZzbQVsqnmOWd/YcZlllzjUD0u79NNuflhLka4 GAJJvIIgOp+2EXGxCpzPDpXU1YG9hDIQf1zIXrSjtWXgUr3s/KJUfzv62KdgIaWgIxbB AXueFc4VMn5nFMMqiMwRGgxJMcvTaqdzZop4M+jQm8b0SQPw9BbDU9rqA7NVmQ66FViq Ozrw0/LOXg4KNlMd3HPnfgK0x3cyWmHMvdYdjDaljHPuWo1y3DP1GMbFmFF4rqR29OO9 sNVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LQixDJy63XBgzN/Out1zz4zHKJbz7GOqSodi9xHGXHY=; b=ZaurbdBcyCBupDO2b1C/Vn7mOify+cZ8M1UktXhUlRmCaAxGzYiJHG1Yarrzoo9mWm KHla0VKhmfI6lo6LfBk7OBPtfkAH2vyMhrugXYvYhs0UImPV4iw7rStNkbqJDDwpW2CV fcWGomZHLbuCaFaMKN9hgsM5Oqa6xHx6WVBJoV2hBfEGdbJgmK6CRL5gozJbP8UrICIT 9rYCkTiNlsZajHayaXUJ+6XG/gas7Q6T+qO94IqBI+X4zvhXVrU8glyyQ+CNnpqarIsU m94zxeyujBdFe3Djd7bh+zOgNkw3Y73A7t6qUb2iLqalMF5qZUDzO7EObWZMoDKxOsI7 B4tQ== X-Gm-Message-State: ANhLgQ0ToKAYNiIX5d6yyNZyaYrCP8k+hwrxK2NctxJ0zuMlka31sAYE fkIoHxTJO0DwhkDg8BIZdHMDV+hCvuU= X-Google-Smtp-Source: ADFU+vtaEmi+DNjWVYqLEceeThdr1eqV9TPqbXXT/gEy6i4PJ5vGbQ1bliHzNezxovdd9tFvE/YSWw== X-Received: by 2002:a17:90a:2103:: with SMTP id a3mr2604118pje.181.1585264147049; Thu, 26 Mar 2020 16:09:07 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 21/31] target/arm: Implement SVE2 integer absolute difference and accumulate long Date: Thu, 26 Mar 2020 16:08:28 -0700 Message-Id: <20200326230838.31112-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1044 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++ target/arm/sve.decode | 12 +++++++++ target/arm/sve_helper.c | 24 +++++++++++++++++ target/arm/translate-sve.c | 54 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 104 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0e4b4c48da..b48a88135f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2410,3 +2410,17 @@ DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5fb4b5f977..f66a6c242f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -70,6 +70,7 @@ &rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz +&rrrr_esz rd ra rn rm esz &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s @@ -120,6 +121,10 @@ @rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \ &rri_esz rn=%reg_movprfx +# Four operand, vector element size +@rda_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 \ + &rrrr_esz ra=%reg_movprfx + # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -1235,3 +1240,10 @@ CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm + +## SVE2 integer absolute difference and accumulate long + +SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm +SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm +UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm +UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a3653007ac..a0995d95c7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1216,6 +1216,30 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_ABAL(NAME, TYPE, TYPEN) \ +void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = (simd_data(desc) & 1) * sizeof(TYPE); \ + int sel2 = (simd_data(desc) & 2) * (sizeof(TYPE) / 2); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel1); \ + TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \ + TYPE aa = *(TYPE *)(va + i); \ + *(TYPE *)(vd + i) = DO_ABD(nn, mm) + aa; \ + } \ +} + +DO_ABAL(sve2_sabal_h, int16_t, int8_t) +DO_ABAL(sve2_sabal_s, int32_t, int16_t) +DO_ABAL(sve2_sabal_d, int64_t, int32_t) + +DO_ABAL(sve2_uabal_h, uint16_t, uint8_t) +DO_ABAL(sve2_uabal_s, uint32_t, uint16_t) +DO_ABAL(sve2_uabal_d, uint64_t, uint32_t) + +#undef DO_ABAL + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3b0aa86e79..c6161d2ce2 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6240,3 +6240,57 @@ static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) { return do_cadd(s, a, true, true); } + +static bool do_sve2_zzzz_ool(DisasContext *s, arg_rrrr_esz *a, + gen_helper_gvec_4 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->ra), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +static bool do_abal(DisasContext *s, arg_rrrr_esz *a, bool uns, bool sel) +{ + static gen_helper_gvec_4 * const fns[2][3] = { + { gen_helper_sve2_sabal_h, + gen_helper_sve2_sabal_s, + gen_helper_sve2_sabal_d }, + { gen_helper_sve2_uabal_h, + gen_helper_sve2_uabal_s, + gen_helper_sve2_uabal_d }, + }; + + if (a->esz == 0) { + return false; + } + return do_sve2_zzzz_ool(s, a, fns[uns][a->esz - 1], sel); +} + +static bool trans_SABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, false); +} + +static bool trans_SABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, true); +} + +static bool trans_UABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, false); +} + +static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, true); +} From patchwork Thu Mar 26 23:08:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262445 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=uZP/lhXQ; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLVj5LX1z9sSH for ; Fri, 27 Mar 2020 10:17:33 +1100 (AEDT) Received: from localhost ([::1]:34704 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbkh-0006aS-LG for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:17:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59043) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcc-0002p0-NG for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcb-0001o9-Ff for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:10 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:39645) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcb-0001nC-9n for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:09 -0400 Received: by mail-pj1-x1044.google.com with SMTP id z3so2516243pjr.4 for ; Thu, 26 Mar 2020 16:09:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=shWQ4UzPuX9NwbrJUFloHvE+a/223KVCdbky39wDCzw=; b=uZP/lhXQ9+6tllKEfEN0KkQUWibVsIqtbsoZx4jNPeMpoCPd23g11WKmPnIa24KHvk Z8auWRPUzDsWDFnhCnEDBhqnR4kbtFkKd9AHAO2D3CAjoPijiF54ECmCyfg1C9PfRIh8 upojrFhm9RyZD2WkX2yWbtqOBS2SDeOBjI9RxSd6zitwCvctGM4alAv4Cjf+6wZn5bnS 3r+cvru6wsrE8ht/2U/gR6IsiTRnmCRT7Oh0SyQG2tQaES7HFfV2fFkUmKK2ZOIu+O40 jwI0prwSLnx2LeKppVkYjRRWYj5B9mtLbdm6pBkOpr1fh+eyewl7MFbXpphdoKXBB+lU 7hOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=shWQ4UzPuX9NwbrJUFloHvE+a/223KVCdbky39wDCzw=; b=XLXfp7MOZgIRLnTH45yw3lI5M1L/YUHL4Hnm1H7gXfQLZfrP5iwW+86wFr2XWeKKiO Rv9xlE/hv0dJ5oRFwgFhY5LVBVKsIYy5VXb+q/U+Sq3LXQZiJ2FQYmDrR7TXyOrQIo5g 5hm+D7hVlFfWv8KieKNLgt/lqZ+Np1Idhn7cq2ahqQ/E4FoQhf1Vu1PQZmLbQttOawZe 4ANdH6AFwMeTIVPP6BKEdpT64NvjdzG3/1ZYtsNSioOJId2AXHAlvuIMmbxFrxCYolpf Lv088/llDo//NOVERnUB39hULMeuMLPOJBQ+Ca4FMGvYgjAZq+XKjnGq5vu5POJugxD1 4d2A== X-Gm-Message-State: ANhLgQ3VTeK96CmeeTOC9l6kxVbNkIRuu11Q71uOA32uuEJMSVuGPq61 xepr2J9fGTLCLJu1qy/+Pv5qFzIu2Vs= X-Google-Smtp-Source: ADFU+vvPoE0JokHDMqWCBQi1uly8mCuS57fbi/6wXSF5yB1P3rnU44iiQAiMm15VeaFMe4Rlu6fLjQ== X-Received: by 2002:a17:90a:9501:: with SMTP id t1mr2659209pjo.108.1585264147996; Thu, 26 Mar 2020 16:09:07 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 22/31] target/arm: Implement SVE2 integer add/subtract long with carry Date: Thu, 26 Mar 2020 16:08:29 -0700 Message-Id: <20200326230838.31112-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1044 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 +++ target/arm/sve.decode | 6 ++++++ target/arm/sve_helper.c | 33 +++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 23 +++++++++++++++++++++++ 4 files changed, 65 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b48a88135f..cfc1357613 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2424,3 +2424,6 @@ DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f66a6c242f..5d46e3ab45 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1247,3 +1247,9 @@ SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm + +## SVE2 integer add/subtract long with carry + +# ADC and SBC decoded via size in helper dispatch. +ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm +ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a0995d95c7..aa330f75c3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1240,6 +1240,39 @@ DO_ABAL(sve2_uabal_d, uint64_t, uint32_t) #undef DO_ABAL +void HELPER(sve2_adcl_s)(void *vd, void *va, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1) * 32; + uint32_t inv = -extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + uint32_t e1 = (uint32_t)a[i]; + uint32_t e2 = (uint32_t)(n[i] >> sel) ^ inv; + uint64_t c = extract64(m[i], 32, 1); + /* Compute and store the entire 33-bit result at once. */ + d[i] = c + e1 + e2; + } +} + +void HELPER(sve2_adcl_d)(void *vd, void *va, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1) * 32; + uint64_t inv = -(uint64_t)extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; i += 2) { + Int128 e1 = int128_make64(a[i]); + Int128 e2 = int128_make64(n[i + sel] ^ inv); + Int128 c = int128_make64(m[i + 1] & 1); + Int128 r = int128_add(int128_add(e1, e2), c); + d[i + 0] = int128_getlo(r); + d[i + 1] = int128_gethi(r); + } +} + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c6161d2ce2..a80765a978 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6294,3 +6294,26 @@ static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) { return do_abal(s, a, true, true); } + +static bool do_adcl(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_adcl_s, + gen_helper_sve2_adcl_d, + }; + /* + * Note that in this case the ESZ field encodes both size and sign. + * Split out 'subtract' into bit 1 of the data field for the helper. + */ + return do_sve2_zzzz_ool(s, a, fns[a->esz & 1], (a->esz & 2) | sel); +} + +static bool trans_ADCLB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, false); +} + +static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, true); +} From patchwork Thu Mar 26 23:08:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262451 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=VXIvD9E/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLZq4FPcz9sSL for ; Fri, 27 Mar 2020 10:21:07 +1100 (AEDT) Received: from localhost ([::1]:34804 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbo9-0003cw-E3 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:21:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59207) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcf-0002w3-AH for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:15 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcd-0001qr-5u for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:13 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:44776) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcc-0001pE-U5 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:11 -0400 Received: by mail-pg1-x544.google.com with SMTP id 142so3639909pgf.11 for ; Thu, 26 Mar 2020 16:09:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=iY5JtAHPmUUP1F/+dH6oTIOXqKx5ZoOANRCDeiNVP7k=; b=VXIvD9E/nYGp0Qf6bb0kKd04aMAuAnzrY8CC6BqEIqDfq6wAMytN6pBNBeuVQezlxq 44pH3y18nnFRAjwmkQEV3/mFBZrf6noH1Q9VcKOiCNa3jFyGVJVZtqcr0DaOGZtO83Zv pkecPkO7mJoLTsQMqgLp5YoP9bCVyY0FFg4EVa52nCf8jEzUmDoj8T2Yc+TPShkMAXP3 m42Yil/NdH1kTIUs7VWe1aY85iISzpjYLV0BuUJfOz1VU1ml3YevOLoUxUXrEpMlmGIP cZ5WYYCjlp5QQriO/WKpS/xOOAmuiIiR1cpfp3AMhoSDBWAu4c/bQUEb+YiHyWdNhNZf +KSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=iY5JtAHPmUUP1F/+dH6oTIOXqKx5ZoOANRCDeiNVP7k=; b=TsEZaWluVrkyV8Au7ySd2o6paox4iJbXcCtDj/t87X+rhI0PVoKfgz6ZGlntE2itu2 k32TB3+cGkPSDLfb999IHmMVx6lbiaOFbnHq/SrentMKSPcWqvbCveTFstg0f5hlwrel Hxo7zSbw4+xq637U3ef1ATL8AycahgXiDY9V0+7VIwQ6qnpkhLOjY2swgz6XjsGyp79d PcwifV/LrZDgL5PGjaMiifPdOWywrZ4aBIPvjnBGf9aG6eAJhjZzr2b9oyfTKfDVEiGo h1r4NJEIwpvlGrvodBNHE1bGiWUFEL16HF07ZUgEzK97BdLR7Fwg8vxqZBRfqt5omzf0 BIIA== X-Gm-Message-State: ANhLgQ2wKABaJNe6yL5eIzWjWNELd1pZ7rPXWsaEUYYn9RmjGnOTueUo MkcqPPFFWRDg26uUiYs+mP2i/vOYhPA= X-Google-Smtp-Source: ADFU+vsln7U5paVI74wLyJFOC0xw37eaw52geJZFB9EZ1DiVCsQma+BmJ6q9OdLkdPysE6cIDYO/cw== X-Received: by 2002:a62:1894:: with SMTP id 142mr11802128pfy.27.1585264149138; Thu, 26 Mar 2020 16:09:09 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 23/31] target/arm: Create arm_gen_gvec_[us]sra Date: Thu, 26 Mar 2020 16:08:30 -0700 Message-Id: <20200326230838.31112-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 15 +--- target/arm/translate.c | 161 ++++++++++++++++++++++--------------- target/arm/vec_helper.c | 25 ++++++ 5 files changed, 139 insertions(+), 79 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 938fdbc362..dc6a43dbd8 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -708,6 +708,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 5552ee5a94..1c5cdf13e3 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -291,8 +291,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -305,6 +303,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2bcf643069..d50207fcfb 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10682,19 +10682,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. */ - if (shift == 8 << size) { - shift -= 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. */ diff --git a/target/arm/translate.c b/target/arm/translate.c index cba84987db..f5768014d1 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3947,33 +3947,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_ssra[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; -const GVecGen2i ssra_op[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_ssra, - .load_dest = true, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4005,33 +4023,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_usra[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; -const GVecGen2i usra_op[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5396,19 +5436,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift = -shift; - /* Shifts larger than the element size are architecturally - * valid. Unsigned results in all zeros; signed results - * in all sign bits. - */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >= 8 << size) { - /* rd += 0 */ + if (u) { + arm_gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + arm_gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 154d32518a..aaaccc0a2d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] += n[i] >> shift; \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRA(gvec_ssra_b, int8_t) +DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Thu Mar 26 23:08:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262449 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ZzrtufDM; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLYT3p0Jz9sSM for ; Fri, 27 Mar 2020 10:19:57 +1100 (AEDT) Received: from localhost ([::1]:34768 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbn1-00023Q-CZ for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:19:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59322) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbch-00031H-2t for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbce-0001si-C2 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:14 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:36237) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbce-0001rZ-2I for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:12 -0400 Received: by mail-pg1-x542.google.com with SMTP id j29so3657723pgl.3 for ; Thu, 26 Mar 2020 16:09:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=I6ankYmNHyYeNrJfSSg8aj/RPi8jQqHs2aTejgbW4Ig=; b=ZzrtufDME0y+UJkq2FPDh6+9jlGlN/tr0Pd6WEgBMrlaseWdFI3acjL9qpda9yEysN q2AxjuPLIXfu5O0i/5bxmcNiLhc85OHlh0ezSju9DtpuWvwW4tDP4M4iQQ6GezZPydnX v6mn/e/NadYwKNe5LmAXsJ1oxEsChdNBtIxeew/R1SYKDmJNhd7YEjYaVPQGYyRBlJ24 D8AeAxrrojx6uQwCds18/Whd1bqAQoXbpYpl2K+UkcaJea+yfClQIwH6ons/36JENfvy 7VRA0Dd8wrh6uYGcS5ypIAJQP4uljGKZAzdyVBLimu9GDM+GBSSj+PawlvSUTj2wN6xM doMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=I6ankYmNHyYeNrJfSSg8aj/RPi8jQqHs2aTejgbW4Ig=; b=DcnGLkq/AwSCAVr2LDZ4OxHYQdaSj6QC8Z9w8qGjbljAr7gER259yCo/l1RJRtm4y/ BSsSXRtKOoIOG4UP5tD8rcdGJ/tK0mLCKGAdjPEptw2IWJuMylKce4HAd853AokcvCsF bR/6dpW+6/cgOmSyCfcy3GMLtv7eo7MLU59yfF3bDb95cNseAHb0in/uBUKVQqP5Aup3 Hn+s1WJWit5OQx+oGxLApP9fCDwh547/fvobLcjxuzqF9RJf4KoX5SDrchpsTFiRLVfE NWuDoJJUvwekNB6d4GskrLpuTi6sTHfmx5PZH3ZoyLO7x70N3rTcRn/AbXW1hHydvBuz anBw== X-Gm-Message-State: ANhLgQ1kwdrSdiHhcQDP3gttFCN/FcK+lR4yxt9y6HKkG63X7YK44evZ kFuqn1xaeHQnYyfFi9v31PoUJM19X80= X-Google-Smtp-Source: ADFU+vtOVygUkutX1dEAtenD5j2XE9k7y+XDdDJ43QAyJXNsMD6IOq0HGY1RtMyT+P/X9WsO/9m1Rg== X-Received: by 2002:a63:e607:: with SMTP id g7mr10814971pgh.303.1585264150252; Thu, 26 Mar 2020 16:09:10 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 24/31] target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra} Date: Thu, 26 Mar 2020 16:08:31 -0700 Message-Id: <20200326230838.31112-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::542 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE. Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate-a64.h | 9 + target/arm/translate-a64.c | 458 ++++++++++++++++++++++++++++++++++++- target/arm/vec_helper.c | 50 ++++ 4 files changed, 534 insertions(+), 3 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index dc6a43dbd8..1ffd840f1d 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -718,6 +718,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 65c0280498..7846e91e51 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -129,4 +129,13 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); +void arm_gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index d50207fcfb..37ee85f867 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -8561,6 +8561,453 @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src, } } +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + TCGv_i64 ones = tcg_const_i64(dup_const(MO_8, 1)); + + /* Shift one less than the requested amount. */ + if (shift > 1) { + tcg_gen_vec_sar8i_i64(a, a, shift - 1); + } + + /* The low bit is the rounding bit. Mask it off. */ + tcg_gen_and_i64(t, a, ones); + + /* Finish the shift. */ + tcg_gen_vec_sar8i_i64(d, a, 1); + + /* Round. */ + tcg_gen_vec_add8_i64(d, d, t); + + tcg_temp_free_i64(t); + tcg_temp_free_i64(ones); +} + +static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + TCGv_i64 ones = tcg_const_i64(dup_const(MO_16, 1)); + + if (shift > 1) { + tcg_gen_vec_sar16i_i64(a, a, shift - 1); + } + tcg_gen_and_i64(t, a, ones); + tcg_gen_vec_sar16i_i64(d, a, 1); + tcg_gen_vec_add16_i64(d, d, t); + + tcg_temp_free_i64(t); + tcg_temp_free_i64(ones); +} + +static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sari_i32(a, a, shift - 1); + tcg_gen_andi_i32(t, a, 1); + tcg_gen_sari_i32(d, a, 1); + tcg_gen_add_i32(d, d, t); + + tcg_temp_free_i32(t); +} + +static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sari_i64(a, a, shift - 1); + tcg_gen_andi_i64(t, a, 1); + tcg_gen_sari_i64(d, a, 1); + tcg_gen_add_i64(d, d, t); + + tcg_temp_free_i64(t); +} + +static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_sari_vec(vece, a, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, a, ones); + tcg_gen_sari_vec(vece, d, a, 1); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void arm_gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srshr8_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_srshr16_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_srshr32_i32, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_srshr64_i64, + .fniv = gen_srshr_vec, + .fno = gen_helper_gvec_srshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. + */ + tcg_gen_gvec_dup8i(rd_ofs, opr_sz, max_sz, 0); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_srshr8_i64(t, a, shift); + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_srshr16_i64(t, a, shift); + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_srshr32_i32(t, a, shift); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_srshr64_i64(t, a, shift); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_srshr_vec(vece, t, a, shift); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_srsra8_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_srsra16_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_srsra32_i32, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_srsra64_i64, + .fniv = gen_srsra_vec, + .fno = gen_helper_gvec_srsra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. With rounding, this produces + * (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0. + * I.e. always zero. With accumulation, this leaves D unchanged. + */ + if (shift != (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} + +static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + TCGv_i64 ones = tcg_const_i64(dup_const(MO_8, 1)); + + /* Shift one less than the requested amount. */ + if (shift > 1) { + tcg_gen_vec_shr8i_i64(a, a, shift - 1); + } + + /* The low bit is the rounding bit. Mask it off. */ + tcg_gen_and_i64(t, a, ones); + + /* Finish the shift. */ + tcg_gen_vec_shr8i_i64(d, a, 1); + + /* Round. */ + tcg_gen_vec_add8_i64(d, d, t); + + tcg_temp_free_i64(t); + tcg_temp_free_i64(ones); +} + +static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + TCGv_i64 ones = tcg_const_i64(dup_const(MO_16, 1)); + + if (shift > 1) { + tcg_gen_vec_shr16i_i64(a, a, shift - 1); + } + tcg_gen_and_i64(t, a, ones); + tcg_gen_vec_shr16i_i64(d, a, 1); + tcg_gen_vec_add16_i64(d, d, t); + + tcg_temp_free_i64(t); + tcg_temp_free_i64(ones); +} + +static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_shri_i32(a, a, shift - 1); + tcg_gen_andi_i32(t, a, 1); + tcg_gen_shri_i32(d, a, 1); + tcg_gen_add_i32(d, d, t); + + tcg_temp_free_i32(t); +} + +static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_shri_i64(a, a, shift - 1); + tcg_gen_andi_i64(t, a, 1); + tcg_gen_shri_i64(d, a, 1); + tcg_gen_add_i64(d, d, t); + + tcg_temp_free_i64(t); +} + +static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec ones = tcg_temp_new_vec_matching(d); + + tcg_gen_shri_vec(vece, a, a, shift - 1); + tcg_gen_dupi_vec(vece, ones, 1); + tcg_gen_and_vec(vece, t, a, ones); + tcg_gen_shri_vec(vece, d, a, 1); + tcg_gen_add_vec(vece, d, d, t); + + tcg_temp_free_vec(t); + tcg_temp_free_vec(ones); +} + +void arm_gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_urshr8_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_urshr16_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_urshr32_i32, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_urshr64_i64, + .fniv = gen_urshr_vec, + .fno = gen_helper_gvec_urshr_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + if (shift == (8 << vece)) { + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in zero. With rounding, this produces a + * copy of the most significant bit. + */ + tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} + +static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (shift == 8) { + tcg_gen_vec_shr8i_i64(t, a, 7); + } else { + gen_urshr8_i64(t, a, shift); + } + tcg_gen_vec_add8_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (shift == 16) { + tcg_gen_vec_shr16i_i64(t, a, 15); + } else { + gen_urshr16_i64(t, a, shift); + } + tcg_gen_vec_add16_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + if (shift == 32) { + tcg_gen_shri_i32(t, a, 31); + } else { + gen_urshr32_i32(t, a, shift); + } + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + if (shift == 64) { + tcg_gen_shri_i64(t, a, 63); + } else { + gen_urshr64_i64(t, a, shift); + } + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + if (shift == (8 << vece)) { + tcg_gen_shri_vec(vece, t, a, (8 << vece) - 1); + } else { + gen_urshr_vec(vece, t, a, shift); + } + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ursra8_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fni8 = gen_ursra16_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_ursra32_i32, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_ursra64_i64, + .fniv = gen_ursra_vec, + .fno = gen_helper_gvec_ursra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + + /* tszimm encoding produces immediates in the range [1..esize] */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} + /* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */ static void handle_scalar_simd_shri(DisasContext *s, bool is_u, int immh, int immb, @@ -10712,10 +11159,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, return; case 0x04: /* SRSHR / URSHR (rounding) */ - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr, size); + return; + case 0x06: /* SRSRA / URSRA (accum + rounding) */ - accumulate = true; - break; + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra, size); + return; + default: g_assert_not_reached(); } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index aaaccc0a2d..c6a39c188e 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t) #undef DO_SRA +#define DO_RSHR(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] = (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSHR(gvec_srshr_b, int8_t) +DO_RSHR(gvec_srshr_h, int16_t) +DO_RSHR(gvec_srshr_s, int32_t) +DO_RSHR(gvec_srshr_d, int64_t) + +DO_RSHR(gvec_urshr_b, uint8_t) +DO_RSHR(gvec_urshr_h, uint16_t) +DO_RSHR(gvec_urshr_s, uint32_t) +DO_RSHR(gvec_urshr_d, uint64_t) + +#undef DO_RSHR + +#define DO_RSRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + TYPE tmp = n[i] >> (shift - 1); \ + d[i] += (tmp >> 1) + (tmp & 1); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_RSRA(gvec_srsra_b, int8_t) +DO_RSRA(gvec_srsra_h, int16_t) +DO_RSRA(gvec_srsra_s, int32_t) +DO_RSRA(gvec_srsra_d, int64_t) + +DO_RSRA(gvec_ursra_b, uint8_t) +DO_RSRA(gvec_ursra_h, uint16_t) +DO_RSRA(gvec_ursra_s, uint32_t) +DO_RSRA(gvec_ursra_d, uint64_t) + +#undef DO_RSRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Thu Mar 26 23:08:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262461 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=aD2zgnrz; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLj76tCCz9sT8 for ; Fri, 27 Mar 2020 10:26:35 +1100 (AEDT) Received: from localhost ([::1]:34958 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbtR-00044a-MN for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:26:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59253) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcf-0002xc-TU for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbce-0001tV-Oo for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:13 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:44774) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbce-0001sI-Iu for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:12 -0400 Received: by mail-pg1-x541.google.com with SMTP id 142so3639946pgf.11 for ; Thu, 26 Mar 2020 16:09:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=OdjaC+2trQ9ceTWGUWGj5oLcK8x+CzwmCibMzIBPms4=; b=aD2zgnrzX/IkgZWK1ZnF0cSKeiEAY8SMYX1u7WTSWJknIg6OOX4PVlZUgaR9pG6qy0 ZyNsuIILpcGidy1iqB9znjowkFRlTsfrtbdhGh89GR3i6GYamkAbiwTcpzkLXE6xgjH6 rqZXAq1FtARKFrJJCEnb7fJj667c3OWzmk1DpklnL5hbGvD2UitAmC6dMKTtDO5v0WKQ dauE7kIRDOJbpqXICicQuE4vJq8D3Io5DA1LGYWP2H6f8fIDtXbSot69QxJ7+Bdap39S z4+qhQj92FpCiV+YtMEILq/0xLm7s2omjTpRDQwGNn+7B4sxmzkMbPUJPtcUaPAdac+v wWCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=OdjaC+2trQ9ceTWGUWGj5oLcK8x+CzwmCibMzIBPms4=; b=dR5DzU2qZj/RTIcNyknibI4jP7ZP5tWxrzfelz6P4dkPv5R83TMw9+IhKT4pMom9Rg 6WljhppKIMDZozQnEvDqrivhK/Riu2NOqMfJ/G9XwGrOEMgODdE/c65V48Dvx9Pj0GyT XqT831VjLbE7W+X56p/qWh980iZmF7gd4F+jS46VbgajRgOjaevSyPS8Obqz+Oj3Lfi3 bHvVQruTC6U4KG42OIRIvmbnBUm92jf5sVTZB0IJ2reBjvReYhZaopfRaRjdhhVQfKBx sNp4Z3o96CVXOhJwrHjsvVG5g+C0MF2v3ywxA0a4zMIEOj5O2WBqykcpAmOj63J6CvQL FTbQ== X-Gm-Message-State: ANhLgQ10YtX0jYjE2PfBa8nuM5JEbbxghdcohPjNuir0qj2l9fKsGuu0 1/4YpEu8xGEfi7OOiNlDN8DPnLPWD8M= X-Google-Smtp-Source: ADFU+vsZxtqInyt3k/ezZfGWtaC0yUw6b+8rk1+iLd5P2dZhyLJn/yg5uNyB9OPW8kM2u6SVHIzk4g== X-Received: by 2002:a63:2cc3:: with SMTP id s186mr10720238pgs.71.1585264151297; Thu, 26 Mar 2020 16:09:11 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 25/31] target/arm: Implement SVE2 bitwise shift right and accumulate Date: Thu, 26 Mar 2020 16:08:32 -0700 Message-Id: <20200326230838.31112-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 8 ++++++++ target/arm/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5d46e3ab45..756f939df1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1253,3 +1253,11 @@ UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm # ADC and SBC decoded via size in helper dispatch. ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm + +## SVE2 bitwise shift right and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr +USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr +SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr +URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a80765a978..1d1f55dfdd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6317,3 +6317,37 @@ static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) { return do_adcl(s, a, true); } + +static bool do_sve2_fn2i(DisasContext *s, arg_rri_esz *a, GVecGen2iFn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned rd_ofs = vec_full_reg_offset(s, a->rd); + unsigned rn_ofs = vec_full_reg_offset(s, a->rn); + fn(a->esz, rd_ofs, rn_ofs, a->imm, vsz, vsz); + } + return true; +} + +static bool trans_SSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_ssra); +} + +static bool trans_USRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_usra); +} + +static bool trans_SRSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_srsra); +} + +static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_ursra); +} From patchwork Thu Mar 26 23:08:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262460 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=JjgIBZIl; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLgy6Pk4z9sSH for ; Fri, 27 Mar 2020 10:25:34 +1100 (AEDT) Received: from localhost ([::1]:34928 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbsS-0002m9-La for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:25:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59488) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbck-00036k-0z for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcg-0001wd-JA for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:16 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:37108) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcg-0001vE-BO for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:14 -0400 Received: by mail-pf1-x441.google.com with SMTP id h72so3555250pfe.4 for ; Thu, 26 Mar 2020 16:09:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cxkAq7ERmXI5Oy2mbhzOoAS+SoesHDepdGEbK1/bQ9Y=; b=JjgIBZIlijmglWEbZ7lrd8BT5g+5KVANuj7duhBiwhmyHPRb5QOPpuVBtopUwnklAF Ze1bX9ffvz9JKLzBaroqlN06xXjfmo8W1lK7evBO8bLItA1btrH/0fxYdg2iGCcC6pP5 wRTUCZ4yboS5M92auL2ttn0WYkokzR/TyMnDnyj70YBDMPGJNv/YxRJEzBamLrS/mhIP mBS6sdxw/4RjVw8mk5hugNZKjYmJRbh2hqV8jcp5RH7yZwgsKBrNagRG9q6/a+cwCm6d 8ukBXc6h+eMXiILNf2vwn+OfWA3jBMmm5KTGGukx3nePDJ6dFs/ksZhyYl0JjlzNdOsK CCnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cxkAq7ERmXI5Oy2mbhzOoAS+SoesHDepdGEbK1/bQ9Y=; b=KsmN8GRTHWCaKzdQzhvgKHoMMeGbv5D7GcooJ1VHcmj029EhkV0w1/RPOahoZ+pBYQ tZ8lXT7emCqQLItOCJwl6qqJNXgLizskRX83ZBk0RitYeCUumTkb9WEPG6+KqViGPN5P LuU/OxsDKxRpLbb97Vaw4MgNIp+0bCui93w1+iGP2zubXDAbkudvSS4PKQ91VaLdnv0w lTUJmsFZQG5t+36Bo1s3cgUS2MYskg4PhaaJImuoawqvBI50YiWXwht/q0rS8I15bG5N 2+UD8k3Z80hyt71raC6NeJkUuVkcaf/YcEFQmEog4aUIj7pN6PPf+T1DFvZkOSQkhY04 igCg== X-Gm-Message-State: ANhLgQ2Zp8aShONhFhBnkORNhaQyJl71jSchwG9Iorjbz0VKGT9ipi3J OeAcjwOEvE7v4Ftcbem1BhxYLTJibYY= X-Google-Smtp-Source: ADFU+vsi4yr5jP9LqsfKWeZ7N/C/U/IrI/Bc/fTTSwdtLL+IdKN3/+f4w7eLioOgok5H/3PEv73sdA== X-Received: by 2002:a63:f95c:: with SMTP id q28mr3719550pgk.321.1585264152438; Thu, 26 Mar 2020 16:09:12 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 26/31] target/arm: Create arm_gen_gvec_{sri,sli} Date: Thu, 26 Mar 2020 16:08:33 -0700 Message-Id: <20200326230838.31112-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 20 +--- target/arm/translate.c | 186 +++++++++++++++++++++---------------- target/arm/vec_helper.c | 38 ++++++++ 5 files changed, 160 insertions(+), 101 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 1ffd840f1d..5ef7bb158f 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -738,6 +738,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 1c5cdf13e3..843ecc1472 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -291,8 +291,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i sri_op[4]; -extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; @@ -308,6 +306,11 @@ void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 37ee85f867..f7d492cce4 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -680,16 +680,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm, is_q ? 16 : 8, vec_full_reg_size(s)); } -/* Expand a 2-operand + immediate AdvSIMD vector operation using - * an op descriptor. - */ -static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd, - int rn, int64_t imm, const GVecGen2i *gvec_op) -{ - tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op); -} - /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */ static void gen_gvec_op3(DisasContext *s, bool is_q, int rd, int rn, int rm, const GVecGen3 *gvec_op) @@ -11132,12 +11122,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, gen_gvec_fn2i(s, is_q, rd, rn, shift, is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size); return; + case 0x08: /* SRI */ - /* Shift count same as element size is valid but does nothing. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sri, size); return; case 0x00: /* SSHR / USHR */ @@ -11188,7 +11175,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, } tcg_temp_free_i64(tcg_round); - done: clear_vec_high(s, is_q, rd); } @@ -11213,7 +11199,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert, } if (insert) { - gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]); + gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sli, size); } else { gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size); } diff --git a/target/arm/translate.c b/target/arm/translate.c index f5768014d1..bb6db53598 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4111,47 +4111,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); - tcg_gen_shri_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh)); + tcg_gen_shri_vec(vece, t, a, sh); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 }; +void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shr8_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shr16_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shr32_ins_i32, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shr64_ins_i64, + .fniv = gen_shr_ins_vec, + .fno = gen_helper_gvec_sri_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sri_op[4] = { - { .fni8 = gen_shr8_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_8 }, - { .fni8 = gen_shr16_ins_i64, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_16 }, - { .fni4 = gen_shr32_ins_i32, - .fniv = gen_shr_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_32 }, - { .fni8 = gen_shr64_ins_i64, - .fniv = gen_shr_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sri, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* Shift of esize leaves destination unchanged. */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4189,47 +4204,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) { - if (sh == 0) { - tcg_gen_mov_vec(d, a); - } else { - TCGv_vec t = tcg_temp_new_vec_matching(d); - TCGv_vec m = tcg_temp_new_vec_matching(d); + TCGv_vec t = tcg_temp_new_vec_matching(d); + TCGv_vec m = tcg_temp_new_vec_matching(d); - tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); - tcg_gen_shli_vec(vece, t, a, sh); - tcg_gen_and_vec(vece, d, d, m); - tcg_gen_or_vec(vece, d, d, t); + tcg_gen_shli_vec(vece, t, a, sh); + tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh)); + tcg_gen_and_vec(vece, d, d, m); + tcg_gen_or_vec(vece, d, d, t); - tcg_temp_free_vec(t); - tcg_temp_free_vec(m); - } + tcg_temp_free_vec(t); + tcg_temp_free_vec(m); } -static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 }; +void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i ops[4] = { + { .fni8 = gen_shl8_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_shl16_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_shl32_ins_i32, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_shl64_ins_i64, + .fniv = gen_shl_ins_vec, + .fno = gen_helper_gvec_sli_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; -const GVecGen2i sli_op[4] = { - { .fni8 = gen_shl8_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_8 }, - { .fni8 = gen_shl16_ins_i64, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_16 }, - { .fni4 = gen_shl32_ins_i32, - .fniv = gen_shl_ins_vec, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_32 }, - { .fni8 = gen_shl64_ins_i64, - .fniv = gen_shl_ins_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .load_dest = true, - .opt_opc = vecop_list_sli, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [0..esize-1]. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift < (8 << vece)); + + if (shift == 0) { + tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } +} static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) { @@ -5451,20 +5479,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } /* Right shift comes here negative. */ shift = -shift; - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &sri_op[size]); - } + arm_gen_gvec_sri(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); return 0; case 5: /* VSHL, VSLI */ if (u) { /* VSLI */ - /* Shift out of range leaves destination unchanged. */ - if (shift < 8 << size) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, - vec_size, shift, &sli_op[size]); - } + arm_gen_gvec_sli(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { /* VSHL */ /* Shifts larger than the element size are * architecturally valid and results in zero. diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c6a39c188e..27035a8a42 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t) #undef DO_RSRA +#define DO_SRI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRI(gvec_sri_b, uint8_t) +DO_SRI(gvec_sri_h, uint16_t) +DO_SRI(gvec_sri_s, uint32_t) +DO_SRI(gvec_sri_d, uint64_t) + +#undef DO_SRI + +#define DO_SLI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SLI(gvec_sli_b, uint8_t) +DO_SLI(gvec_sli_h, uint16_t) +DO_SLI(gvec_sli_s, uint32_t) +DO_SLI(gvec_sli_d, uint64_t) + +#undef DO_SLI + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. From patchwork Thu Mar 26 23:08:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262456 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=T4spmkhx; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLdT2lYmz9sSH for ; Fri, 27 Mar 2020 10:23:25 +1100 (AEDT) Received: from localhost ([::1]:34870 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbqN-0007WZ-48 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:23:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59427) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbci-00035U-Nw for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbch-0001xl-7z for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:16 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:41257) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbch-0001wZ-1c for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:15 -0400 Received: by mail-pf1-x441.google.com with SMTP id z65so3545055pfz.8 for ; Thu, 26 Mar 2020 16:09:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gruaHJOeD8WvnhMIfKkze3XKHGHvv2/+9IJLAI7Js3I=; b=T4spmkhxY+XY0VoCM6cAeio6xDP/bnRE+S2bIyjiIN6IYxOPv8exi560GTi6kAUcpa hBnOBO5v3dTcj4NFoyWvTsPXW1uekeOEIunYMZjwaInR5jTKkNqMEXYLMniEMsXCl+ow 6M2TwsX5S4MklHqCSZkpDBx+P2v6tTMJhOk+5eZGOAdFH863SIRwai13pMHeC/BGqvPv iEz3E3u/tRsBJdwT/8ixdmUtNJ0qrYf7Ns4/MmT/K9BlsJok8Vm6tas1oUT5FBczpHLv iXcxkPKScIo8Jhmsgf+/9smNbjaJSeRSsZx8zAcPHXCnupl75DAxTMYtD+Gm/ZPSsZlQ oXvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gruaHJOeD8WvnhMIfKkze3XKHGHvv2/+9IJLAI7Js3I=; b=I05k7XVlskPStGMVpycqCrWV9AQsWA26hgHHGziBhBzAcwHPIdnFMVaRFWArGbgzzL Kxaxn2AfzyBUJO11/IJDdwiGxq72q0/T2cliln4uhBN4Qmtm8puZ8DAeAd7TV9Dl0Yk0 s0YRNnyqKDEDFsxc4QarCHoPtUqP7lQkhrNWJp9hzixVVdh6hQhJZJ2DbbclqAYb8NIW 8m3BwVoYO32oU7kV+OHLgJ707M1CZAaHSEGRenmiSHFdkhpoFlA/uHZl7/wrIRvPd3aR Lpq+Wj6+Ox8MHM+WivVLT0LzTwQDqY6xorAZbXnXv8PuZHUyTG9ELecmFgsaQUrvHyQf q3Iw== X-Gm-Message-State: ANhLgQ1xRaFs4ZooEb6yORTuh3kPbuKL9E8yIFZn4RbnunmS11AdMj5X LQ5+yF753Fg1jOZJgPUNweJF4lWwvuw= X-Google-Smtp-Source: ADFU+vs3L1Sl/5tE2cuoMtF8BNkjYIlc26nth96PD2LmJTnnhwR02hfCEuj2EdMm8qWVWxs+NBz5Gw== X-Received: by 2002:a63:7159:: with SMTP id b25mr10696807pgn.72.1585264153691; Thu, 26 Mar 2020 16:09:13 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 27/31] target/arm: Tidy handle_vec_simd_shri Date: Thu, 26 Mar 2020 16:08:34 -0700 Message-Id: <20200326230838.31112-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Now that we've converted all cases to gvec, there is quite a bit of dead code at the end of the function. Remove it. Sink the call to gen_gvec_fn2i to the end, loading a function pointer within the switch statement. Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 56 ++++++++++---------------------------- 1 file changed, 14 insertions(+), 42 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index f7d492cce4..fc156a217a 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -11096,16 +11096,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, int size = 32 - clz32(immh) - 1; int immhb = immh << 3 | immb; int shift = 2 * (8 << size) - immhb; - bool accumulate = false; - int dsize = is_q ? 128 : 64; - int esize = 8 << size; - int elements = dsize/esize; - MemOp memop = size | (is_u ? 0 : MO_SIGN); - TCGv_i64 tcg_rn = new_tmp_a64(s); - TCGv_i64 tcg_rd = new_tmp_a64(s); - TCGv_i64 tcg_round; - uint64_t round_const; - int i; + GVecGen2iFn *gvec_fn; if (extract32(immh, 3, 1) && !is_q) { unallocated_encoding(s); @@ -11119,13 +11110,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size); - return; + gvec_fn = is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra; + break; case 0x08: /* SRI */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sri, size); - return; + gvec_fn = arm_gen_gvec_sri; + break; case 0x00: /* SSHR / USHR */ if (is_u) { @@ -11133,49 +11123,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, /* Shift count the same size as element size produces zero. */ tcg_gen_gvec_dup8i(vec_full_reg_offset(s, rd), is_q ? 16 : 8, vec_full_reg_size(s), 0); - } else { - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size); + return; } + gvec_fn = tcg_gen_gvec_shri; } else { /* Shift count the same size as element size produces all sign. */ if (shift == 8 << size) { shift -= 1; } - gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size); + gvec_fn = tcg_gen_gvec_sari; } - return; + break; case 0x04: /* SRSHR / URSHR (rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr, size); - return; + gvec_fn = is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr; + break; case 0x06: /* SRSRA / URSRA (accum + rounding) */ - gen_gvec_fn2i(s, is_q, rd, rn, shift, - is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra, size); - return; + gvec_fn = is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra; + break; default: g_assert_not_reached(); } - round_const = 1ULL << (shift - 1); - tcg_round = tcg_const_i64(round_const); - - for (i = 0; i < elements; i++) { - read_vec_element(s, tcg_rn, rn, i, memop); - if (accumulate) { - read_vec_element(s, tcg_rd, rd, i, memop); - } - - handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round, - accumulate, is_u, size, shift); - - write_vec_element(s, tcg_rd, rd, i, size); - } - tcg_temp_free_i64(tcg_round); - - clear_vec_high(s, is_q, rd); + gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size); } /* SHL/SLI - Vector shift left */ From patchwork Thu Mar 26 23:08:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262455 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=icKxNZgI; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLcc1jrsz9sSL for ; Fri, 27 Mar 2020 10:22:40 +1100 (AEDT) Received: from localhost ([::1]:34850 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbpe-000673-3c for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:22:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59538) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbck-00038S-JS for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbci-0001zl-Co for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:18 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:39259) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbci-0001yT-6j for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:16 -0400 Received: by mail-pf1-x444.google.com with SMTP id d25so3539381pfn.6 for ; Thu, 26 Mar 2020 16:09:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XWCeE/5E1HZf0ja/1RRJESpNCJLpd5hNPRFBKO5ZIL8=; b=icKxNZgINscLHQzuXJde8x0qqoeXNgrjWKtOxFEHuJVwQZzSoa8bQCLD44dWlVwp+9 BMfNG++TtWZ71up4vdx5vki2E1b08GzPXuPscp4JWsnA4bQQeZZYeT9ymHWWnfDHmqyW Pe8u5LgMyX/xLK4fbYM/+4Pt+BwY7A7glNoM6xzEa3/L8ou1HWORqMdV1ho3zkjmuAXr TayBcIb6G9kr/TFuiWU+MddOdi/9BLdy5TYLsQLRPcihZmyc7QKM2b136J/QLSIgEzOh XRdZP3f940aoT4Y2HwsPI+Uju0+0XJP5Q2HpQl/9renCWjG0v1KFTEjVspZYORJ7gvRg qdXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XWCeE/5E1HZf0ja/1RRJESpNCJLpd5hNPRFBKO5ZIL8=; b=HpIcjYOFsAoX/VyamSK+ITQT6/4M03S/UHFpyMDWCO4vB7M0OA5/Yl312Wg53QsbKN /xzIfE/3qqXJggDH5rDE9pjyG0l0gtvlPwqN1IELzkCOUwOfJiNoWT3kqxtvVHS82oLG D5phb4cC+tt0StgzTMZ4PSRw+ON3omkcnV2n+UhbuyBsR8pauunRbNXr3sDb+PCBXJtp UyZLerqyWoVfQpXBIntNfaq/xdCnvnccCD/ilot5HOWSVe37SOSf1RAl0HCNVJ+4krQI mXPFFEJiixKUpe6jky5RbJ8g7pHkEUFRjvVbzAPAJMVmITivxmykiq2w41DmWzMyb8ni CMiQ== X-Gm-Message-State: ANhLgQ0+WOd7g4+AgunnLFV/v1GdY6a4BfRjAwKh2uKHvv+2KuGdqD6J tpF3AvDya2esRdKge1TBXLAAV2OdKH8= X-Google-Smtp-Source: ADFU+vs+7QumS3vKSXnZJs/GYFazcjo3QHpyyw+CYnc79vfVh/t9HAM3cvMO3+3kMIQP9pjf7TxrXw== X-Received: by 2002:aa7:8108:: with SMTP id b8mr11659763pfi.212.1585264154959; Thu, 26 Mar 2020 16:09:14 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 28/31] target/arm: Implement SVE2 bitwise shift and insert Date: Thu, 26 Mar 2020 16:08:35 -0700 Message-Id: <20200326230838.31112-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::444 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 5 +++++ target/arm/translate-sve.c | 10 ++++++++++ 2 files changed, 15 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 756f939df1..9bf66e8ad4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1261,3 +1261,8 @@ SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr + +## SVE2 bitwise shift and insert + +SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr +SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1d1f55dfdd..7556cecfb3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6351,3 +6351,13 @@ static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, arm_gen_gvec_ursra); } + +static bool trans_SRI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_sri); +} + +static bool trans_SLI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, arm_gen_gvec_sli); +} From patchwork Thu Mar 26 23:08:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262467 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=pJoNT6xt; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLm7065Cz9sTd for ; Fri, 27 Mar 2020 10:29:11 +1100 (AEDT) Received: from localhost ([::1]:35050 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbvw-0000LN-Un for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:29:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59627) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcm-0003Bo-61 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcj-00022F-W6 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:20 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:34597) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcj-000211-Nb for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:17 -0400 Received: by mail-pl1-x641.google.com with SMTP id a23so2746288plm.1 for ; Thu, 26 Mar 2020 16:09:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TUH/iWxYaarqAhHTxewl7Cdd7qJ//bZUOWIU1IWx5z8=; b=pJoNT6xtsQOq3gFBSADOXLKKBy6htgXEVA+0bvS7tCgOo+fJQXfFi5yypP9l38HFU4 dg8A6Y6z/vkysMm/HaR4HNqbcSHLJU+pCRblmnfyOqbiPsldQE0HvKYAhaLNBMiD7/+5 YgFOaOhRXqKZSISylnOoYMa0CAkZZ9RTanP83HNGPMhkOsAbWd4cQKZl+gzeeyzI1o7F V1wjnhIdiy+n8YmOgJAL1A2pnMvEiTtgWSzz+Uj0RWqUKcmrg8PbN2/tSmzShlpYmkP0 MKlo9NrZQ7n3oyMrSbs+qSmg7pTfHTWVLzsdMF83ZHpNfzyDQVg3MH4UfTdBq0Lftuqm 8D2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TUH/iWxYaarqAhHTxewl7Cdd7qJ//bZUOWIU1IWx5z8=; b=LG/P2pa4uxncZlcy9ktYdwNwoswMTN3dn+Q8Ba7aBmIY/pdkVpY3/tU3mYpPFCX4g/ +GPp0z3hyU/8A9aS7Qt8IEFiC+yo6ndmuT9tZ1ODfO4dGtJLCLDzkn0+o5WLR0JM0g3/ Si5smnNygaSF9MHPq+wIIpKoovzjWJNYBJyoRPg8KXlruHKnz7w3CMifcwtJjlMS0daj w1w9+3gIxKGcEQnKyFlZqfiNrVGY3/POTs8k2Z35yDsJO2F3VKpIEJ/0x/UKa4Tzb2Vb MQnv+EtjZ60Xt453fWWFmNkneX125WrkagvRe3KFo5qipfgq6RZ+TxdJVEXY+CsQuO46 PaRQ== X-Gm-Message-State: ANhLgQ3kooVCAbYvIwimgQTP1TaWtwNfqeqArPJ013NXqnaw2dJ4DVDW H87jNJNsmHocxH7PuAJEjHiUDqfFgoI= X-Google-Smtp-Source: ADFU+vvRRKr7xFN/9pZOh13o0k7AlKZ/eqU0Xw7Ym7EyUa7gC4zac11T6W9A6B/DsdAU09qHckPmNA== X-Received: by 2002:a17:90a:7105:: with SMTP id h5mr2615902pjk.54.1585264155988; Thu, 26 Mar 2020 16:09:15 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 29/31] target/arm: Vectorize SABD/UABD Date: Thu, 26 Mar 2020 16:08:36 -0700 Message-Id: <20200326230838.31112-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::641 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 5 ++ target/arm/translate-a64.c | 8 ++- target/arm/translate.c | 133 ++++++++++++++++++++++++++++++++++++- target/arm/vec_helper.c | 88 ++++++++++++++++++++++++ 5 files changed, 240 insertions(+), 4 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 5ef7bb158f..97ccbd70c6 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -748,6 +748,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 843ecc1472..c453aa1c47 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -311,6 +311,11 @@ void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index fc156a217a..1791c26a39 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12159,6 +12159,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size); } return; + case 0xe: /* SABD, UABD */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_uabd, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_sabd, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -12291,7 +12298,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xe: /* SABD, UABD */ case 0xf: /* SABA, UABA */ { static NeonGenTwoOpFn * const fns[3][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index bb6db53598..a29868976a 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4849,6 +4849,126 @@ const GVecGen4 sqsub_op[4] = { .vece = MO_64 }, }; +static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_smin_vec(vece, t, a, b); + tcg_gen_smax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_sabd_i32, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_sabd_i64, + .fniv = gen_sabd_vec, + .fno = gen_helper_gvec_sabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sub_i32(t, a, b); + tcg_gen_sub_i32(d, b, a); + tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sub_i64(t, a, b); + tcg_gen_sub_i64(d, b, a); + tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + + tcg_gen_umin_vec(vece, t, a, b); + tcg_gen_umax_vec(vece, d, a, b); + tcg_gen_sub_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_b, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_h, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_uabd_i32, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_uabd_i64, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5107,6 +5227,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size, u ? &ushl_op[size] : &sshl_op[size]); return 0; + + case NEON_3R_VABD: + if (u) { + arm_gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + arm_gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; } if (size == 3) { @@ -5237,9 +5367,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABD: - GEN_NEON_INTEGER_OP(abd); - break; case NEON_3R_VABA: GEN_NEON_INTEGER_OP(abd); tcg_temp_free_i32(tmp2); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 27035a8a42..e0694c16f4 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1492,3 +1492,91 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_sabd_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Thu Mar 26 23:08:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262465 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=kpXqVdEj; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLl31P7wz9sTQ for ; Fri, 27 Mar 2020 10:28:15 +1100 (AEDT) Received: from localhost ([::1]:35008 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbv3-0007AW-5C for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:28:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59696) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcn-0003F8-DH for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcl-00024j-8T for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:21 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:53741) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbck-000235-W4 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:19 -0400 Received: by mail-pj1-x1042.google.com with SMTP id l36so3088799pjb.3 for ; Thu, 26 Mar 2020 16:09:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=59fkymWDzWb5TOs7OFoXchMT/KmijGyH3f2e6vHC1Kk=; b=kpXqVdEj3qFqq7pDlrWworCzh0b+aZbdn51q8tVyXxCLo1KagMK8IwfOgtnBmbM3iG 8P0NhzKUaT5s7wTyDO2uNBY2t0Hy/ANoDpPSq9X5bsvOvDCNXkmi60KFpW9ymXSVjRj/ fIjJ0aLa4ytIZv5jqPVME2Arvyji4NiuFbzToxCEG04ZxcDTgA/jArhy9Cs5bgG3HOym FTN7iCEkU63Pu4CjqwmLV4Vf0765W2lVsk2QauuEVmturluVOqjXs+JnFKLfwwqngxnr GaVFbECe7EDJ6VBWzL+iL5M/5MMDc2w6PUlIbi7QeI3e5WuhOyevzOcyd9vYbAs+Uku6 1fcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=59fkymWDzWb5TOs7OFoXchMT/KmijGyH3f2e6vHC1Kk=; b=QL5y8VnGWXJXi8pYntri/s50c8A8f4xsgrB8/NtkHZ+Q7YTPsepRrqDjYRGmeyp4BE pFb0x72HJXlRn/rvnA3DseVjOTKIrUrXpDlvbivOT9RgcyiYvoRH6URIdjbPHZ1xxiOj WPEb/jSLwMZ06EIuEzLprPyZ8OcPI9GjQE74Xg6Av/xVP0Jqy9/ZwWJDC4pLRuJGn4nG Fra0Znd9tSTpsgyD5cp7ukzfLMa0EsvILUi79yi5enmj3O1elwIcfunZ1DY4Zv0TSYrk amxvXfE0KNhP38vP4IyOYcjisRt0iH0Hl66mvHB6Hy6xFhEwkP13YJbTtECOYuO4ckmB cKpw== X-Gm-Message-State: ANhLgQ3e50YLqngzHpUfiL137jHdb4BACZHKXR4R1hN3n+CyUHSkpTUw 9mmG1QSbLSY9e5ufdjLsqxJgUKDgkkg= X-Google-Smtp-Source: ADFU+vt/jGF7wbyvNw8mttItdolN7MkJE8RAsUO5CdzwUYj8S+ReD5k+zvGcRqPwR6xGPYjj+KrcMQ== X-Received: by 2002:a17:90a:cb14:: with SMTP id z20mr2496268pjt.170.1585264157065; Thu, 26 Mar 2020 16:09:17 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:16 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 30/31] target/arm: Vectorize SABA/UABA Date: Thu, 26 Mar 2020 16:08:37 -0700 Message-Id: <20200326230838.31112-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1042 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include 64-bit element size in preparation for SVE. Signed-off-by: Richard Henderson --- target/arm/helper.h | 17 +++-- target/arm/translate.h | 5 ++ target/arm/neon_helper.c | 10 --- target/arm/translate-a64.c | 17 ++--- target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 88 ++++++++++++++++++++++++ 6 files changed, 238 insertions(+), 33 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 97ccbd70c6..5cf6a5b4a0 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -299,13 +299,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32) DEF_HELPER_2(neon_pmax_u16, i32, i32, i32) DEF_HELPER_2(neon_pmax_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u8, i32, i32, i32) -DEF_HELPER_2(neon_abd_s8, i32, i32, i32) -DEF_HELPER_2(neon_abd_u16, i32, i32, i32) -DEF_HELPER_2(neon_abd_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u32, i32, i32, i32) -DEF_HELPER_2(neon_abd_s32, i32, i32, i32) - DEF_HELPER_2(neon_shl_u16, i32, i32, i32) DEF_HELPER_2(neon_shl_s16, i32, i32, i32) DEF_HELPER_2(neon_rshl_u8, i32, i32, i32) @@ -758,6 +751,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index c453aa1c47..0df7ce51b2 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -316,6 +316,11 @@ void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index e6481a5764..4c1cf1e031 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -595,16 +595,6 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) \ - dest = (src1 > src2) ? (src1 - src2) : (src2 - src1) -NEON_VOP(abd_s8, neon_s8, 4) -NEON_VOP(abd_u8, neon_u8, 4) -NEON_VOP(abd_s16, neon_s16, 2) -NEON_VOP(abd_u16, neon_u16, 2) -NEON_VOP(abd_s32, neon_s32, 1) -NEON_VOP(abd_u32, neon_u32, 1) -#undef NEON_FN - #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, src2, 16, false, NULL)) NEON_VOP(shl_u16, neon_u16, 2) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 1791c26a39..d830a58c3f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12166,6 +12166,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_sabd, size); } return; + case 0xf: /* SABA, UABA */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_uaba, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_saba, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -12298,16 +12305,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xf: /* SABA, UABA */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 }, - { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 }, - { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x16: /* SQDMULH, SQRDMULH */ { static NeonGenTwoOpEnvFn * const fns[2][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index a29868976a..4491ab0eb0 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4969,6 +4969,124 @@ void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_sabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_sabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_sabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_saba_i32, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_saba_i64, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_uabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_uabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_uabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_uaba_i32, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_uaba_i64, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5237,6 +5355,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) vec_size, vec_size); } return 0; + + case NEON_3R_VABA: + if (u) { + arm_gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + arm_gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; } if (size == 3) { @@ -5367,12 +5495,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABA: - GEN_NEON_INTEGER_OP(abd); - tcg_temp_free_i32(tmp2); - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - break; case NEON_3R_VPMAX: GEN_NEON_INTEGER_OP(pmax); break; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index e0694c16f4..cbd0382c71 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1580,3 +1580,91 @@ void HELPER(gvec_uabd_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_saba_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Thu Mar 26 23:08:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1262459 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=khvwLdZa; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48pLg430JBz9sSN for ; Fri, 27 Mar 2020 10:24:46 +1100 (AEDT) Received: from localhost ([::1]:34910 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbrf-0001cM-U4 for incoming@patchwork.ozlabs.org; Thu, 26 Mar 2020 19:24:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:59665) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcm-0003DX-RU for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcl-00025T-O3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:20 -0400 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]:38921) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcl-00024Q-H3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:19 -0400 Received: by mail-pg1-x52f.google.com with SMTP id b22so3651273pgb.6 for ; Thu, 26 Mar 2020 16:09:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=y9AW+fkUrXuwDSItggnbypCfN/Kt8Ehf1WvwBsOiyX4=; b=khvwLdZab3bGspZGQIaKRdwlLVP1QxMoQh23OPqJ9bvQ0EaDbAH8hdQF9ns2qOh9+Y 5xDTzqwa9KB5x2bpqlMI0WYts9IWOfgKn/M6T/EvrVYh39mDqEO9DGodcUjyjqATD5L1 Z+adzhVsC1oe7IEciDLGylE+8OEhrgV9KIZOR1L0fKSTzVzkVnssBVTi7mF2oGOmFIIX wDhMMPndHWwPVxMkGspGsevtspvHPhqsxvjLmcRi+vYJ+uvl5K8evme7HZ7Iaq+hH6IO gf2YUtR2yWM6ZPiYulrCzlEoBrtEqUj1W/tZSKgKai/Bo2yVtjyKY67CpngbpsbA6YcV P/6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=y9AW+fkUrXuwDSItggnbypCfN/Kt8Ehf1WvwBsOiyX4=; b=onLZePLaTvqFBOz4Be9P87sO7FSyNL+HoF6s+88tJGzaif4FGS+6LshSXNZpy5F2JX M/T3aAa3XgL9kNTw+D08c2uFP+oqpJngJsjgAoz515XLNm5aSUJoHOzkY5FiMdiSyk+X tV+wOS2B7HfB84i2wAPgaS1u8kBO/yFySqG4BEM+/RQ8nRQWSYuJLwNEuIUylhLeKdwy qD+7zTsBBfYst2cmJfimtwiBV9ziZygl5JW5J4iOTINyM5QMucJQr57XRKGgjw9Jbjk4 1atCUSGb/UFhk8TYeTBfkz4GsONb6lhSs45obR91hYrclKe3WarZG+XWcSqlMc23G6iw h5Jw== X-Gm-Message-State: ANhLgQ1qmvY/yZHty+bKau7JxoLB2A/oQLH7uu7CDjMN1HVzerrrkqwk Z/EewFObKtIxMChkSPSX9+VJy0/Ctpk= X-Google-Smtp-Source: ADFU+vt2KQ7hy70StOZQwy3l8qK2rwGMuNtgStNN+sDz9xiUxITSahfKKJUxkP4MjyciFoRYJD6vbQ== X-Received: by 2002:a62:7a82:: with SMTP id v124mr10983254pfc.10.1585264158220; Thu, 26 Mar 2020 16:09:18 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 31/31] target/arm: Implement SVE2 integer absolute difference and accumulate Date: Thu, 26 Mar 2020 16:08:38 -0700 Message-Id: <20200326230838.31112-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::52f X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 25 +++++++++++++++++++++++++ 2 files changed, 31 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9bf66e8ad4..6d565912e3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1266,3 +1266,9 @@ URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl + +## SVE2 integer absolute difference and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm +UABA 01000101 .. 0 ..... 11111 1 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7556cecfb3..42ef031b77 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6361,3 +6361,28 @@ static bool trans_SLI(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, arm_gen_gvec_sli); } + +static bool do_sve2_fn3(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned rd_ofs = vec_full_reg_offset(s, a->rd); + unsigned rn_ofs = vec_full_reg_offset(s, a->rn); + unsigned rm_ofs = vec_full_reg_offset(s, a->rm); + fn(a->esz, rd_ofs, rn_ofs, rm_ofs, vsz, vsz); + } + return true; +} + +static bool trans_SABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn3(s, a, arm_gen_gvec_saba); +} + +static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn3(s, a, arm_gen_gvec_uaba); +}