From patchwork Thu Jul 29 11:14:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511158 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=hDH0Ozfz; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Mk6Lw7z9sSs for ; Thu, 29 Jul 2021 21:19:14 +1000 (AEST) Received: from localhost ([::1]:55482 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m944G-0006St-JB for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:19:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39634) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940a-0008SV-Ez for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:24 -0400 Received: from mail-wm1-x332.google.com ([2a00:1450:4864:20::332]:35446) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940T-0000tN-58 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:24 -0400 Received: by mail-wm1-x332.google.com with SMTP id u15-20020a05600c19cfb02902501bdb23cdso6595443wmq.0 for ; Thu, 29 Jul 2021 04:15:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; 
bh=znndHHEoxoWMeYC4H5uF/7NDy7y5s/e+PfCtHxh/c/4=; b=hDH0OzfzKq/VoVFYYOvBUXnPy++Usa1W6fy+pMV0Imgoj78x9wg2A2Fi4GCoCSwipp 8Vjmvpo+ph2j9j48N02yAKldrw+BsCtikUM5F4Y4oascaUMAvEmfUQgMvRXraRAVZgpw NQHcWbs7+EpSIs3g65sS9IpsYMypzmiBcnssFE/tBKZPEE6LhjGsWpkg2OSdZVC0SiJr 6rhWV7psFd8iPy4/YxUwTmwtGc+mN1HnOthjWoinJk9Wvn1TfakKpO1oU5O7ptD6Pd+E 4asIti0LvSVtlTzWDowu5Eqhcg6eR6r/uONnm46bxiiqTK3gK4RfhizhVRc3j0BRjiiZ NUIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=znndHHEoxoWMeYC4H5uF/7NDy7y5s/e+PfCtHxh/c/4=; b=m4GK0iXrh3zgqsaywr8nD2y8u/iviyj4uylU5wvEQ7pow/XQ/EpxGwot7ljC/3OFRe RSEEjcokLABQyWMsdLsE0Su33PGz44NGaxEDnLIY4WCJ0YhIBPJYG1aVd7Ui1y4p16go E/7Wj7+jI+ItlRs1Mr0MhLd0pHxjCZdcQgApy9PoVgCo42jMb4sAqJOuSC5O8GNNHw2W IltxIbSo1Ghks8oxOd7883CrHkYlh5Bd2MjxlkXtknzKo+385g61zOnN0K69iLl5UEU2 ObQ9yZpQfiHs8KajMf0+B8hPQ1Hzt/G1LBNSR85p5BIznIzRifBtNgliXcV3DoujySLL vBhw== X-Gm-Message-State: AOAM531594dENFQ1NvDywsuiJlhCSz0Oy9s3vlPCi96QwRPnKAQfPmDv gMXK1UsJMawQWnP57iWRq8WCmsXPg60vlA== X-Google-Smtp-Source: ABdhPJz/vy1xTQWIEBDo9w2q4/zTk2DBF77C/C8lRt2suIMfUD97d+WLeTa81SRk+yiyaSiuPqKGYw== X-Received: by 2002:a1c:7402:: with SMTP id p2mr14015548wmc.111.1627557315734; Thu, 29 Jul 2021 04:15:15 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:15 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 01/53] target/arm: Note that we handle VMOVL as a special case of VSHLL Date: Thu, 29 Jul 2021 12:14:20 +0100 Message-Id: <20210729111512.16541-2-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::332; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x332.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Although the architecture doesn't define it as an alias, VMOVL (vector move long) is encoded as a VSHLL with a zero shift. Add a comment in the decode file noting that we handle VMOVL as part of VSHLL. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve.decode | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 595d97568eb..fa9d921f933 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -364,6 +364,8 @@ VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 
0 @2_shr_w # VSHLL T1 encoding; the T2 VSHLL encoding is elsewhere in this file +# Note that VMOVL is encoded as "VSHLL with a zero shift count"; we +# implement it that way rather than special-casing it in the decode. VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h From patchwork Thu Jul 29 11:14:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511157 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=WH01ZJuH; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Lj1S8gz9sSs for ; Thu, 29 Jul 2021 21:18:21 +1000 (AEST) Received: from localhost ([::1]:51628 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m943O-0003uE-UL for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:18:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39700) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940d-0000CE-Er for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:27 -0400 Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430]:45892) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940T-0000uk-PO for 
qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:27 -0400 Received: by mail-wr1-x430.google.com with SMTP id m12so1648139wru.12 for ; Thu, 29 Jul 2021 04:15:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=X46IYmwDa0Vu+Tiq/N7j9UCGpxEaqTWhGbb7xdBU5TU=; b=WH01ZJuHu69DDGuF8DmKOAcY/rbpiR1Ksf0wH8yhWQHmjwpRwY9RyBw7UG/BodlJ/Y PuFHIPvHP6dqe1NuxefNXoEWdfJrbMpxM7nmE/ZJR4UexcKt14y+bQ5h/z9oOsh3Phdw rHxejHJ3TLGEnAylLlT50UklLO71qM+f7ZMIFcieTuMxOS/Rkdz0JEvhjSrEz2Q4zFRY MfMxyOSw184ZvMjDo/GnmZrHEyju1ij48jJ53yf43Wbt9gEWS4f9LcAq+13VS+0K6GlO 3ktVNpFlIqIDyEqyP0JO3iDQNgqsqXO9g9+7xI5kZmReMfmwM5z5YJsmsFy9BVC/zuJd rsvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=X46IYmwDa0Vu+Tiq/N7j9UCGpxEaqTWhGbb7xdBU5TU=; b=KOiq3jKAVqhUgVid8GW5dLudV5D451rMfrwRKPixq1j9CzlzpoLN8Y+BLc+tY7YQ5r JpFu7M9LGkZDwKPOWV3Rqq3RAqsNTMimwN9uGLj9GHdCYKC3SwaUlG278fW8aGJu9x6y DKPCAhaJKUcGMFh/6Yl6j2Gm5s5sYFzB/E3juG6eUoVjo9Z0EUkvVUdFeSKMM3sy6TJi 6pjNvXBstpt90vlmPbz7WsrnmLVAb5MLx1727vsNWUcEXPP+p9DUEZBby8Nc0FHL/JV8 1Dywx1JF8+mIQ9JjWVxNADPl/nBuHBYIgMortEWNbzKLg5/qIDa6o502Aj8Wd5CIzuzo 0ZKw== X-Gm-Message-State: AOAM532zzjYlvXlTVe1fCGFpP1V/JEYuPLAvGJ/97quYyWw8rZ8l67RN WIT049pIuXSVmTi6qJ9EgdMA2Gqy84fWMQ== X-Google-Smtp-Source: ABdhPJzp+W1Lxf/Bf7nITU879y1qefqvOP6XXRPzMxUCT5RyZK96zJEtRQMvRL1I8qgbHtSnSPYEaQ== X-Received: by 2002:a5d:6608:: with SMTP id n8mr4115122wru.427.1627557316473; Thu, 29 Jul 2021 04:15:16 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:16 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 02/53] target/arm: Print MVE VPR in CPU dumps Date: Thu, 29 Jul 2021 12:14:21 +0100 Message-Id: <20210729111512.16541-3-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::430; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x430.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include the MVE VPR register value in the CPU dumps produced by arm_cpu_dump_state() if we are printing FPU information. This makes it easier to interpret debug logs when predication is active. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/cpu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 2866dd76588..a82e39dd97f 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1017,6 +1017,9 @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags) i, v); } qemu_fprintf(f, "FPSCR: %08x\n", vfp_get_fpscr(env)); + if (cpu_isar_feature(aa32_mve, cpu)) { + qemu_fprintf(f, "VPR: %08x\n", env->v7m.vpr); + } } } From patchwork Thu Jul 29 11:14:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511155 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=CDDI2YZ3; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7KH1ZWgz9sSs for ; Thu, 29 Jul 2021 21:17:06 +1000 (AEST) Received: from localhost ([::1]:46548 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m942C-00004y-H8 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:17:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39632) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940a-0008SQ-EY for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:24 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]:44552) 
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940U-0000vR-Im for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:22 -0400 Received: by mail-wm1-x329.google.com with SMTP id d131-20020a1c1d890000b02902516717f562so3766473wmd.3 for ; Thu, 29 Jul 2021 04:15:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=GGF4FybEKy96gYcYMu+JcP8KcGdVIAGNRtBK4g3nQ2Y=; b=CDDI2YZ3ES72TH9b5xS0B5e8W9Cg1RLQ98gNketTfJN72pYLOY/giiTD+xjj2kQnG6 d3r/u1uQJRmY0HbVH0iIGLTt2mfXsa1aIeUUSTQ2Ub17R2Cgr/i71CQYIRK5I0SK2kXN YMhzkUujolcy8/uaLcpl3YxA2vkghvHqX9JVfYXQ7MRp/Un5Rq0R6IuBR/V0Z8Yr5gpN mYGEZL7W7Er9zHDMjXpNMyf8ADZapEfiCpwuIrXTR5did2uP4s8pL2vhauU64OqHVgXS zZldVBvs0mOiRcAXzNhirIIWK+jh4QEa65jS49e3PUg/zNvGjJSKlhcvDucbSD6lXMIr q23A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GGF4FybEKy96gYcYMu+JcP8KcGdVIAGNRtBK4g3nQ2Y=; b=uXYFA1pOQ7LfGSb5npTKK+aPeEJeoTMX6Ku0FPBTJ5EeHUnKsjJmPPMJtX7OyCD+Yl MDakN793V55HHg9NQgTTgVGJu2qWSiVJl+cCvpTfOnSmMP6txtTe0yo0VF+xlGFsEwyv 1Rg1RIE6iTZ6UGnge6kaIqNC0v/ig6xxpqP2N6BYz1ZGA3atD5JQAnR/YUDMooszebBW 8pWbikEhx/CBf04W49XkaplIGyB3KVSVPSOZr2CV4FsayfKMmVCrNA3DLt7RlJTpZenG r29vbelqge39JJ4hScReHaDf6KiuB+qTQ/tJLhvkphIZoL5mElVhf2Wyx+vXycSbvOCR jPtA== X-Gm-Message-State: AOAM533eoB8GmrOmoXz7IN/rdN7LFdf+RriTGQxfzb4ECqSMPWMii/S0 bndUSatfPKAczpV8fptw7+a0MA== X-Google-Smtp-Source: ABdhPJxua5/SAwlyDsfJZxaG65MFP2B14TaoJug1kIBchCfdBBn/ywYZU5mziOrqXnLbxD5SKq403w== X-Received: by 2002:a05:600c:4c96:: with SMTP id g22mr13543761wmp.70.1627557317253; Thu, 29 Jul 2021 04:15:17 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:16 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 03/53] target/arm: Fix MVE VSLI by 0 and VSRI by
<dt>
Date: Thu, 29 Jul 2021 12:14:22 +0100 Message-Id: <20210729111512.16541-4-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x329.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In the MVE shift-and-insert insns, we special case VSLI by 0 and VSRI by
<dt>. VSRI by <dt> means "don't update the destination", which is what we've implemented. However VSLI by 0 is "set destination to the input", so we don't want to use the same special-casing that we do for VSRI by <dt>
. Since the generic logic gives the right answer for a shift by 0, just use that. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index db5d6220854..f14fa914b68 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1279,11 +1279,12 @@ DO_2SHIFT_S(vrshli_s, DO_VRSHLS) uint16_t mask; \ uint64_t shiftmask; \ unsigned e; \ - if (shift == 0 || shift == ESIZE * 8) { \ + if (shift == ESIZE * 8) { \ /* \ - * Only VSLI can shift by 0; only VSRI can shift by
<dt>. \ - * The generic logic would give the right answer for 0 but \ - * fails for <dt>. \ + * Only VSRI can shift by <dt>
; it should mean "don't \ + * update the destination". The generic logic can't handle \ + * this because it would try to shift by an out-of-range \ + * amount, so special case it here. \ */ \ goto done; \ } \ From patchwork Thu Jul 29 11:14:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511162 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=m/8jrRsI; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Qq1dJVz9sSs for ; Thu, 29 Jul 2021 21:21:54 +1000 (AEST) Received: from localhost ([::1]:35850 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m946q-0003rJ-Lv for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:21:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39694) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940d-0000BV-72 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:27 -0400 Received: from mail-wr1-x433.google.com ([2a00:1450:4864:20::433]:47098) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940V-0000wc-CU for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:26 -0400 Received: by mail-wr1-x433.google.com with SMTP id c16so6427389wrp.13 for ; Thu, 29 Jul 2021 04:15:18 -0700 
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=2AKLx+FAZSEHTDgotUySXrnfQs5xm57wonDC26WVk28=; b=m/8jrRsIdWQaEvtiM0ZzsiP7w+nuqXtKnLnbQL/QIreJnepWWyOIrMQxJsrQzy7L3f wtBAzFj04ma9Fyo97gjRcLkfAE8qoOETf5rcQTGy3h1KKM2Zgv5rNjKxlZebZGiJEA7Z h9xDfd0/Pxr0A3d+OcwI6o4NIjxdsnbWFFCKEt9hYm5irFPO8UrO/Ton4ezT03Bwz6ce 5ejtZc907GIR7G5aQVJqbX2+KHSvuZiFoTCn+1L86rvt8n/PmUPT9iL2mt8kLL6zJ71B r0OMFJHpxQLXvOiVuuSVDwXDlEh8BTynQiqQSpqFVuKmeSL1Tj1Dc8m4fjNf10mkREnz AZ0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2AKLx+FAZSEHTDgotUySXrnfQs5xm57wonDC26WVk28=; b=rZDoHt7+IaxOuRrFSCi8EBHI5qRN/wimBQVYRGqfgnNFkvUQTR116dqJhiHct6a4HI qEC2eFiAkrZnPnbM/aKDw+HWamX1v+yX002yyZe5TpT156745937wYP0XOdtsVzVTPJv qzd/P0ErHwgwRpBVesqmxOt3QcSmvBvX93dqUUU1KGXykU5ZvgczEM9gPgKZu3aQ12pf 0vMMeWnMNsT3AvM/lDRprFRY0mFvny5om2BNHZ/83/9m1MxgjcoBlskHiW3YjGSjm3/o MGhNidz5nfETrNLqbFsCn5begsPif0RAo9jC96fFX6+JB8DCKAxMEhNCuHWFEMF60Drx cv8A== X-Gm-Message-State: AOAM533TnNQJo4XYYFTWGXjZjIKQs5xusTUDoCtCvNgN0Bi//zSRvqlg aDjAearlA7BI9pt3jZ4d2m6CDg== X-Google-Smtp-Source: ABdhPJwOEoMbNAGJklqLss3qcb4JE36wA0LIoXcI4b3t8f5eqTrABeViERrInEBVi1N23FkCHOOMFA== X-Received: by 2002:adf:de8a:: with SMTP id w10mr4293021wrl.61.1627557318090; Thu, 29 Jul 2021 04:15:18 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:17 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 04/53] target/arm: Fix signed VADDV Date: Thu, 29 Jul 2021 12:14:23 +0100 Message-Id: <20210729111512.16541-5-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::433; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" A cut-and-paste error meant we handled signed VADDV like unsigned VADDV; fix the type used. 
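As a rough illustration of why the element type matters (a minimal sketch only, not the DO_VADDV macro itself; the function names are hypothetical and predication is ignored), the following adds the bytes of a vector to an accumulator, once reading each byte as unsigned and once as signed:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the helpers; predication is ignored here. */

/* Sum n bytes into ra, treating each byte as unsigned (zero-extended). */
static uint32_t sketch_vaddv_u8(const uint8_t *m, size_t n, uint32_t ra)
{
    assert(m != NULL);
    for (size_t e = 0; e < n; e++) {
        ra += m[e];
    }
    return ra;
}

/* Sum n bytes into ra, treating each byte as signed (sign-extended). */
static uint32_t sketch_vaddv_s8(const uint8_t *m, size_t n, uint32_t ra)
{
    assert(m != NULL);
    for (size_t e = 0; e < n; e++) {
        ra += (uint32_t)(int32_t)(int8_t)m[e];
    }
    return ra;
}
```

With the bytes 0xff, 0xff, 0x01, 0x02 and a zero accumulator, the unsigned reading gives 513 and the signed reading gives 1, which is the kind of difference the uint8_t/int8_t mix-up produces.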
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index f14fa914b68..82151b06200 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1182,9 +1182,9 @@ DO_LDAVH(vrmlsldavhxsw, int32_t, int64_t, true, true) return ra; \ } \ -DO_VADDV(vaddvsb, 1, uint8_t) -DO_VADDV(vaddvsh, 2, uint16_t) -DO_VADDV(vaddvsw, 4, uint32_t) +DO_VADDV(vaddvsb, 1, int8_t) +DO_VADDV(vaddvsh, 2, int16_t) +DO_VADDV(vaddvsw, 4, int32_t) DO_VADDV(vaddvub, 1, uint8_t) DO_VADDV(vaddvuh, 2, uint16_t) DO_VADDV(vaddvuw, 4, uint32_t) From patchwork Thu Jul 29 11:14:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511166 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=leis0gb/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7VM63kcz9sW8 for ; Thu, 29 Jul 2021 21:24:59 +1000 (AEST) Received: from localhost ([::1]:45580 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m949p-0001vg-IG for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:24:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39704) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1m940d-0000D1-Mk for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:27 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:39881) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940W-0000y3-95 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:27 -0400 Received: by mail-wr1-x42f.google.com with SMTP id b11so1107200wrx.6 for ; Thu, 29 Jul 2021 04:15:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=AdQbpZh4Nsn4ZEWXMN8/5MXfv9l4tGOGQ949NSeXfXg=; b=leis0gb/ZagYqBhoF55uD2lsIcnPdessnkG1zNgwKzmkH+5NePet1CdV0o8JWIkN/+ m2B7xWgpsjV2DurZkQCsNA7Bp0nqWhS4JS01xZXTqZChLxGA4xwGhijOD75x/fGigxkA yOqw8BmqA8hT/rotU7L2IBbtP4ognFNezVRT+UUOeW5Zpql2tuIiavjryyckkvCseSN2 Xkv8l6yx1IyE+fNWN652Bdpf6dwpuJ8aAd1NT8hZWNS5JVvdkANZw0oWdUUD71yAHXg1 hjtAtg4Uj3ivDoOrMcNabeI0XfLjZYlF2+Tnbx40yQs0ZE26xNIudcCFiTPN3DsmXayQ eQdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AdQbpZh4Nsn4ZEWXMN8/5MXfv9l4tGOGQ949NSeXfXg=; b=CzGgwHKwzXXB1XbYdcAb58Ad+apcS1/3fmqnl881Ifytp4aW22PMVyABVU8ndqCkvy osax3jWZk8UMp7qJ3BIN92WuSqTkm7JwVKC+Yk2zfuzMv/SmRW6Fm3t4xyiApuNuwVnV LH9bO2ONmzwMpEjcHD6RpINFM3iwV8/FKPu5+K3CDcYI47y8I55PLESrHvWFXNOhj4Vm dvdqQsMcOABHtVuCPW5imv9/dmReI3DoxfvERAt6v9iCT8H0c36K9ACG/GWQ4MxxP+uf FkzTFm20/2fVUt2nmyu3xxJQS5YodUe1OUvbnp21imM54G16NbnhaHwH8wINKG/ZOL1k HXxA== X-Gm-Message-State: AOAM532lzXngNHBjAf/sDrmbqSIXjP/4KkNO/Umx1cabbI7b4hZ4g3J+ iPb8Ynow8rbgMf5qf9cDE42jqA== X-Google-Smtp-Source: ABdhPJxxQE+HWqL6XKvd1+d1FHZjVC7sKN03FkZYpAUJ9IGA74nzYXzW9IqngL5PbA6b5BoJ+Mzc4A== X-Received: by 2002:a5d:61c8:: with SMTP id q8mr4188401wrv.151.1627557318849; Thu, 29 Jul 2021 04:15:18 -0700 (PDT) Received: from 
orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:18 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 05/53] target/arm: Fix mask handling for MVE narrowing operations Date: Thu, 29 Jul 2021 12:14:24 +0100 Message-Id: <20210729111512.16541-6-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In the MVE helpers for the narrowing operations (DO_VSHRN and DO_VSHRN_SAT) we were using the wrong bits of the predicate mask for the 'top' versions of the insn. This is because the loop works over the double-sized input elements and shifts the predicate mask by that many bits each time, but when we write out the half-sized output we must look at the mask bits for whichever half of the element we are writing to. Correct this by shifting the whole mask right by ESIZE bits for the 'top' insns. This allows us also to simplify the saturation bit checking (where we had noticed that we needed to look at a different mask bit for the 'top' insn.) 
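The mask alignment described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the helper macro itself: sketch_narrow is a hypothetical name, the element sizes are fixed at 2-byte inputs and 1-byte outputs, plain truncation stands in for the real shift-and-narrow operation, and host-endianness handling is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: narrow eight 16-bit inputs to bytes and write them
 * to the even (top == 0) or odd (top == 1) bytes of d, honouring a
 * one-bit-per-byte predicate mask.
 */
static void sketch_narrow(uint8_t d[16], const uint16_t m[8],
                          uint16_t mask, int top)
{
    assert(top == 0 || top == 1);
    const unsigned esize = 1;   /* output element size in bytes */
    const unsigned lesize = 2;  /* input element size in bytes */

    /* Line bit 0 of the mask up with the output byte we will write. */
    mask >>= esize * top;
    for (unsigned le = 0; le < 16 / lesize; le++, mask >>= lesize) {
        if (mask & 1) {
            d[le * 2 + top] = (uint8_t)m[le];
        }
    }
}
```

Without the initial shift, a 'top' operation would test the predicate bit of byte 0 when deciding whether to write byte 1. With it, bit 0 of the running mask always belongs to the byte being written, so a saturation check can test the same bit.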
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 82151b06200..847ef5156ad 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1358,6 +1358,7 @@ DO_VSHLL_ALL(vshllt, true) TYPE *d = vd; \ uint16_t mask = mve_element_mask(env); \ unsigned le; \ + mask >>= ESIZE * TOP; \ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ TYPE r = FN(m[H##LESIZE(le)], shift); \ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ @@ -1419,11 +1420,12 @@ static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max, uint16_t mask = mve_element_mask(env); \ bool qc = false; \ unsigned le; \ + mask >>= ESIZE * TOP; \ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ bool sat = false; \ TYPE r = FN(m[H##LESIZE(le)], shift, &sat); \ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ - qc |= sat && (mask & 1 << (TOP * ESIZE)); \ + qc |= sat & mask & 1; \ } \ if (qc) { \ env->vfp.qc[0] = qc; \ From patchwork Thu Jul 29 11:14:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511161 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BU/lZOxZ; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS 
id 4Gb7PL2q14z9sSs for ; Thu, 29 Jul 2021 21:20:38 +1000 (AEST) Received: from localhost ([::1]:60204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m945c-0001Dh-1S for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:20:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39754) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940e-0000FY-TI for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:29 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:41634) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940W-0000ze-U9 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:28 -0400 Received: by mail-wr1-x42f.google.com with SMTP id b7so6454411wri.8 for ; Thu, 29 Jul 2021 04:15:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=SngGo1vVY/dtcXAMZ+FWnV5QnI/yJF6vZ4XQ33WiT+w=; b=BU/lZOxZuQ/YyalcmqPglVnC9D2ypjEIpi2DwBz16NjDeHsYMUSODtB9HJPxQ1u/91 VDrSsEBBBv/BbhiRO0i8+CHB9wV8PDGre4St7kn8+8kvEFKYIbqB8H3Vj9VNOi8z9cDM 4jERJ1CuAcvYN1ntH+3jcBGC+M2zcDJOe7QPhjbosu9lvjngnb7/2BLK2nJpkBZB5TsX sgOLNYJHS4Mwdfb2rUkChbRgZ+Dc9hUj3qfUDZr5oRdRHhHES2PcaY/raLYMNJMJL+Ua Q/VJoKW6++wf3g1ykqgG7aSu8aTXjSEO7tDoDEr2kx9ddLRVbwpgL0JwTZSzkiC5RWNW rJTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SngGo1vVY/dtcXAMZ+FWnV5QnI/yJF6vZ4XQ33WiT+w=; b=DZOm3ag+y8nPoumv3ALiTSqxJlQhieDTwTeJW3SQRYnaXq5JpEhHXnSWO5u5aC9lgV NmIMcPF8GPvyhIYo+1UjSP8m4vT+BJKBi0Qp7aDpIQnPCn529XwPCRh945XHB11OZ9MU rGuq9vPOg8daLgpMo0WXPallor9yvhHMxatKKgRF8L82o9d5RDzGEcmMrZ6Gs5yTceiy 0CKAh1ch6KaA8Z8+smTljWXfTXPMI/ou4+Xl0CFjRV7aJ7Ak6HBvexIm70gFL7RwOvsy 
2kJEzcUnSMncrLg8tN+MarHUwnovAm6H7qMLI5q4CsOlGTAI/ahTOx0WH7JypRbCOLhn 94/Q== X-Gm-Message-State: AOAM5332akkmHkGUBTm7+zvZ2WsDK0cwOtYhmfdVItjGLzKLq24Qp4u1 R6oFVfXVHud6ydNaNs52Mb3gNg== X-Google-Smtp-Source: ABdhPJzlYQlXsopXEiAnO6slvFlXfa9APv40jm1pHY3rAcFZVapN2QhIpFMFqjoHnJNsh5QkDV8pVw== X-Received: by 2002:adf:a409:: with SMTP id d9mr4284457wra.237.1627557319627; Thu, 29 Jul 2021 04:15:19 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:19 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 06/53] target/arm: Fix 48-bit saturating shifts Date: Thu, 29 Jul 2021 12:14:25 +0100 Message-Id: <20210729111512.16541-7-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In do_sqrshl48_d() and do_uqrshl48_d() we got some of the edge cases wrong and failed to saturate correctly: (1) In do_sqrshl48_d() we used the same code that do_sqrshl_bhs() does to obtain the saturated most-negative and most-positive 48-bit signed values for the 
large-shift-left case. This gives (1 << 47) for saturate-to-most-negative, but we weren't sign-extending this value to the 64-bit output as the pseudocode requires. (2) For left shifts by less than 48, we copied the "8/16 bit" code from do_sqrshl_bhs() and do_uqrshl_bhs(). This doesn't do the right thing because it assumes the C type we're working with is at least twice the number of bits we're saturating to (so that a shift left by bits-1 can't shift anything off the top of the value). This isn't true for bits == 48, so we would incorrectly return 0 rather than the most-positive value for situations like "shift (1 << 44) left by 20". Instead check for saturation by doing the shift and sign-extend and then testing whether shifting back right again gives the original value. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 847ef5156ad..5730b48f35e 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1576,9 +1576,8 @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, } return src >> -shift; } else if (shift < 48) { - int64_t val = src << shift; - int64_t extval = sextract64(val, 0, 48); - if (!sat || val == extval) { + int64_t extval = sextract64(src << shift, 0, 48); + if (!sat || src == (extval >> shift)) { return extval; } } else if (!sat || src == 0) { @@ -1586,7 +1585,7 @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, } *sat = 1; - return (1ULL << 47) - (src >= 0); + return src >= 0 ? 
MAKE_64BIT_MASK(0, 47) : MAKE_64BIT_MASK(47, 17); } /* Operate on 64-bit values, but saturate at 48 bits */ @@ -1609,9 +1608,8 @@ static inline uint64_t do_uqrshl48_d(uint64_t src, int64_t shift, return extval; } } else if (shift < 48) { - uint64_t val = src << shift; - uint64_t extval = extract64(val, 0, 48); - if (!sat || val == extval) { + uint64_t extval = extract64(src << shift, 0, 48); + if (!sat || src == (extval >> shift)) { return extval; } } else if (!sat || src == 0) { From patchwork Thu Jul 29 11:14:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511170 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=dzA/zddL; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Ym4hRpz9sW8 for ; Thu, 29 Jul 2021 21:27:56 +1000 (AEST) Received: from localhost ([::1]:56360 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Cg-0000jF-3i for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:27:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39860) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940k-0000QI-0q for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:37 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:47095) by eggs.gnu.org 
with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940a-00010B-5f for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:33 -0400 Received: by mail-wr1-x42f.google.com with SMTP id c16so6427554wrp.13 for ; Thu, 29 Jul 2021 04:15:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=KHg6Srrabt2i/P1wMabf21oSUvedE7e5kSg7WXSq3yY=; b=dzA/zddL6DXQVfgUPv4PH0cqozt2cdld1eme+ZTUV6sOfT+OLDJVEaCgSDY+FzSnxR BZsgX8GOUHy0eeDz8G/vStNLHH17b9AZhuPrljAAEAIg6sCuyYmB+oVzr/YjWL+VHjOB 124p6XzEe0lc71UAEknrAQN1I5r0UzKSMAWQ+Z2p6WoxCJ08U5wRHUDE6SmKOoXJ7OGN nrmzxz1fYHRorntH3bomR17oui9sPG7AGTYm7wJYFc00tnyzN3euiwu6/lLsHlayApsh lnSoFGzrhL8MAXOjRd5h6GzFk4GJl5YbfhVdGI5HHVSbHf/cBTZEjGACK2LXSdW4qew5 9vyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KHg6Srrabt2i/P1wMabf21oSUvedE7e5kSg7WXSq3yY=; b=m+vEYZw+XxX3gtsf5yd6qnzSr3mioPnBZgWyavqtmf5UrQDLsZY8OvKVZqR21godO2 SwlYL4rWfkSB1Y1j9ZUkRA/Q91Y009tvK+LO6dVckannyUC9bVM2uOQJ0oDayOMMzcdw xpbWXsXWcEUt5NFPFQ3Zfefv2470ivp1JihrhDI8Cc28Qgy3bYU/7CqvaRU3TLIokFOz /+NVVsyaHMszEi0iz+rggyEV3eJZYKGNOAMTx/p9NRzSWMVWPG5bAkeG0rn9GenphF8/ 940XHvCJutlRNY/iZKKrmLBTQMZQ1QJx1V3/jpUptXAgA02Y1AN+6glSrlLb5/o2H6rk WoBQ== X-Gm-Message-State: AOAM5312Hm0XArDffjw4oleMJPQeG6TkjNITpwpuCedeVtHhhSyJ9sI1 FxSK1fvSLnxUEgqbzq0DSKeI5w== X-Google-Smtp-Source: ABdhPJx9IRd7dXOldI4aiu1IO/1iEsQAd7scPCTDBCk09B1b25Lzv98SnpvRQyY0cJd/X6unryuITA== X-Received: by 2002:a5d:5412:: with SMTP id g18mr4320538wrv.301.1627557320596; Thu, 29 Jul 2021 04:15:20 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:20 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 07/53] target/arm: Fix MVE 48-bit SQRSHRL for small right shifts Date: Thu, 29 Jul 2021 12:14:26 +0100 Message-Id: <20210729111512.16541-8-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We got an edge case wrong in the 48-bit SQRSHRL implementation: a shift to the right never increases the magnitude of the value, but the result can still fall outside the 48-bit range it is supposed to be in, if the input had some bits in [63..48] set and the shift didn't bring all of them within the [47..0] range. Handle this similarly to the way we already do for this case in do_uqrshl48_d(): sign-extend the calculated result from 48 bits, and return that if not saturating or if the extension doesn't change the value; otherwise fall through to return a saturated value. 
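For illustration only (not part of this patch): a minimal standalone sketch of the small-right-shift path described above, assuming arithmetic right shift of negative values, as QEMU itself does. sextract48() and sqrshr48_sketch() are hypothetical names; sextract48() stands in for sextract64(val, 0, 48), and the saturation constants are written out directly rather than built with MAKE_64BIT_MASK().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sextract64(val, 0, 48): sign-extend from bit 47. */
static int64_t sextract48(int64_t val)
{
    return (int64_t)((uint64_t)val << 16) >> 16;
}

/*
 * Sketch of the "0 < rshift < 48" case only. 'sat' may be NULL, meaning
 * "do not saturate", mirroring the !sat test in the patch.
 */
static int64_t sqrshr48_sketch(int64_t src, unsigned rshift, bool round,
                               uint32_t *sat)
{
    int64_t val, extval;

    assert(rshift > 0 && rshift < 48);
    if (round) {
        src >>= rshift - 1;
        val = (src >> 1) + (src & 1);
    } else {
        val = src >> rshift;
    }
    /* If sign-extending from 48 bits changes the value, it was out of range. */
    extval = sextract48(val);
    if (!sat || val == extval) {
        return extval;
    }
    *sat = 1;
    /* Most-positive or most-negative 48-bit value, sign-extended to 64 bits. */
    return src >= 0 ? INT64_C(0x7fffffffffff) : -(INT64_C(1) << 47);
}
```

With src = 1 << 60 and a right shift of 4, the shifted value 1 << 56 still has bits above bit 47 set, so the sketch saturates to the most-positive 48-bit value and sets the flag.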
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- Not squashed into the previous patch because that one has already been reviewed, so as this fixes a different edge case I thought it clearer kept separate. --- target/arm/mve_helper.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 5730b48f35e..1a4b2ef8075 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1563,6 +1563,8 @@ uint64_t HELPER(mve_uqrshll)(CPUARMState *env, uint64_t n, uint32_t shift) static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, bool round, uint32_t *sat) { + int64_t val, extval; + if (shift <= -48) { /* Rounding the sign bit always produces 0. */ if (round) { @@ -1572,9 +1574,14 @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, } else if (shift < 0) { if (round) { src >>= -shift - 1; - return (src >> 1) + (src & 1); + val = (src >> 1) + (src & 1); + } else { + val = src >> -shift; + } + extval = sextract64(val, 0, 48); + if (!sat || val == extval) { + return extval; } - return src >> -shift; } else if (shift < 48) { int64_t extval = sextract64(src << shift, 0, 48); if (!sat || src == (extval >> shift)) { From patchwork Thu Jul 29 11:14:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511159 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=S4g7euqE; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7N03R49z9sSs for ; Thu, 29 Jul 2021 21:19:28 +1000 (AEST) Received: from localhost ([::1]:56926 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m944U-0007QI-81 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:19:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39846) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940j-0000Pv-UA for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:37 -0400 Received: from mail-wr1-x433.google.com ([2a00:1450:4864:20::433]:41638) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940a-000113-67 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:31 -0400 Received: by mail-wr1-x433.google.com with SMTP id b7so6454517wri.8 for ; Thu, 29 Jul 2021 04:15:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=lutAo5Z1ewFuThzfMPVLqlhd9Ml/H60A8yYUE67fuT0=; b=S4g7euqEWzT7bP4rdD2sYvFbBbwSftKuNMyQE+CZ1s+ARt7R9TS3+C6p/2rOJT2HJg fEYHxyHOEqppsnGzSzfi6kwZ+0tSKYqQdnQPRfA+0ITUNmrRQmLfmKLAzRSgF3FnaPJf rKhPjpt6l+TA72FhplyEOLwY1x8abyO0hdMG1jNnSWFuvIuUgsuNj6rWphl71XXmJnrV CSX+UBH6SwsI2o1Oog/UqklRbIHVd28P/8v6MPRqZsNn+F3SZ8mgNx4/k0f7C9BZzEQJ RSlZJOFqtShsAJuSXlp3146YPX2up/EAjLeV6fF9/Pg07V8v44hRn+v5Bo882sMevKaw dXzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lutAo5Z1ewFuThzfMPVLqlhd9Ml/H60A8yYUE67fuT0=; b=g8mf7njnT41GCC8zqwe4yciHUEMbMA3Z7Y1ye3ZM1418z5MFH+YJU4522fhq5zTh9W KAitZzRLug5KFJ6aICL3Z+CqmLB9GD5FmvWWZ07w+5YeQPP9Y/1ICnFKS6D7zlKqun1K 
2F71v0vkmX3OZ3mpplsaF6cVMe9MhR+0wDKULF7tCE7jf4oZwRXnAKGEkw53tsmsWnD+ yjkXFcHVmGB3P4ZxoldLQa0TxvySRoIW4LgVlFAHp4vrxQiBApBfa8OAXXnuUCJEWY8B AVqnVZuvqLrTvSjcGcPKKBcjnMjST8mXM/ym1xhCrs/FdBlI1OXJ0lAMFDf5o+xxb4AA vOeg== X-Gm-Message-State: AOAM531pG+Kqz2XpZXi2ajOqRoPoL9qi5PKdnLM0jByKyybgkQLSZMMR NlqYpfPLLBldUb0LpmFoCh4uLsbS+jRTKQ== X-Google-Smtp-Source: ABdhPJzk77pGfWEVLyH1Tqx247JuOFLjPe6wUnNFyhlz6FKsK9xMzbAEy5x6Vep7yRIGLWhNebxy+w== X-Received: by 2002:adf:e10c:: with SMTP id t12mr4264501wrz.36.1627557321312; Thu, 29 Jul 2021 04:15:21 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:20 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 08/53] target/arm: Fix calculation of LTP mask when LR is 0 Date: Thu, 29 Jul 2021 12:14:27 +0100 Message-Id: <20210729111512.16541-9-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::433; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In mve_element_mask(), we calculate a mask for tail predication which should have a number of 1 bits 
based on the value of LR. However, our MAKE_64BIT_MASK() macro has undefined behaviour when passed a zero length. Special case this to give the all-zeroes mask we require. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 1a4b2ef8075..bc67b86e700 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -64,7 +64,8 @@ static uint16_t mve_element_mask(CPUARMState *env) */ int masklen = env->regs[14] << env->v7m.ltpsize; assert(masklen <= 16); - mask &= MAKE_64BIT_MASK(0, masklen); + uint16_t ltpmask = masklen ? MAKE_64BIT_MASK(0, masklen) : 0; + mask &= ltpmask; } if ((env->condexec_bits & 0xf) == 0) { From patchwork Thu Jul 29 11:14:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511164 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=PZZBvpgo; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7SP6r0Yz9sSs for ; Thu, 29 Jul 2021 21:23:17 +1000 (AEST) Received: from localhost ([::1]:40574 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m948B-0006zZ-NT for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:23:15 -0400 Received: from eggs.gnu.org 
([2001:470:142:3::10]:39796) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940g-0000Iy-3X for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:30 -0400 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:34722) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940a-00012B-5S for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:29 -0400 Received: by mail-wr1-x432.google.com with SMTP id r2so6483503wrl.1 for ; Thu, 29 Jul 2021 04:15:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=NumtafSnEiF4kjNIFtDKTqX8sKZowsVsCRz2Ds8yzEc=; b=PZZBvpgo2retsmAoruoibwBJfAkNQthPHUnd1T4sSwr5naD0f8YbvQBpcc/Yobx2M8 WIL5YrLdK7/nDei+MRptkvqUf63mYpgsyJzitgcvoheFb7ZIwmyOZuIbNP4IiqtY8MCc xrRj896WxnTCMtxIoBrfor9jCGN+ecAefzX5fT373mpNB3kiycL+OkHXnGrYBIhVTBjY E8N8YH+RaHnSt8uGM601PzHoKQZqJNaP/wuyg7paUp0AADFjcy5IUQvkxUPwWrmCi9l0 yuw8SiS7dgZjT3LKPuJJWmkywpmjPKxGgwe6F8pMlNx4se7Pb9znp7o8DrxP4szUaRVr FRaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NumtafSnEiF4kjNIFtDKTqX8sKZowsVsCRz2Ds8yzEc=; b=cFpGB6A5NgV8EkwLwXCCz6PYBuNIubiTk+i0sGTL7iCFuCnqFO/Rk7BxNU93VFW/Jn zTf6q521fdIaKH03WntlqnzYEspuiZ70H6CqJvBwA9aUKoFYkIkHg4FcMPgu4aN0YYwX uWvE8YG3pBPbvZiHZkuaZp8FkbZthuUT/CsssdKgaWzjE/G9QZNSoseyNEGX28S6Qwyk VEM1setzPeOi+RxWSz+KEMNRYmiWSWTaHvwlqsmwFietZJwGXXwAYw92RSB94kPx8GGR nf3sauqkGLpYTKkfAemQaD4VauWyNSlK8i4J8k4dgpQTSUugDdqXdzDhyfRs1yJELeTg MahA== X-Gm-Message-State: AOAM5330Bch2JpT4FJ/vg43yG01H7TybEVnVma8R6S/a9sKAxcCUNIbH /g881PMoUr7jq73AYhStXokOE3wwrSzqmg== X-Google-Smtp-Source: ABdhPJwSR9KzEvQ7/CSivSaebU9cOIok2bVMYN81PLDbGp0gUMkeBmL2vmHm7iqMVeB4zJ+gYcYcvw== X-Received: by 
2002:adf:dcd1:: with SMTP id x17mr4207436wrm.59.1627557322075; Thu, 29 Jul 2021 04:15:22 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:21 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 09/53] target/arm: Factor out mve_eci_mask() Date: Thu, 29 Jul 2021 12:14:28 +0100 Message-Id: <20210729111512.16541-10-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::432; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In some situations we need a mask telling us which parts of the vector correspond to beats that are not being executed because of ECI, separately from the combined "which bytes are predicated away" mask. Factor this mask calculation out of mve_element_mask() into its own function. 
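For illustration only (not part of this patch): a standalone sketch of the ECI-to-mask mapping, taking the raw condexec_bits value instead of a CPUARMState. eci_mask_sketch() is a hypothetical name, and the numeric ECI_* encodings below are assumptions that should be checked against QEMU's headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encodings of the ECI field; verify against the QEMU sources. */
enum {
    ECI_NONE = 0,
    ECI_A0 = 1,
    ECI_A0A1 = 2,
    ECI_A0A1A2 = 4,
    ECI_A0A1A2B0 = 5,
};

/* One mask bit per byte of the 16-byte vector; 1 means "beat executes". */
static uint16_t eci_mask_sketch(uint32_t condexec_bits)
{
    if ((condexec_bits & 0xf) != 0) {
        /* As in the patch: a nonzero low nibble means no ECI state here. */
        return 0xffff;
    }
    switch (condexec_bits >> 4) {
    case ECI_NONE:
        return 0xffff;
    case ECI_A0:
        return 0xfff0;      /* beat 0 already executed */
    case ECI_A0A1:
        return 0xff00;      /* beats 0 and 1 already executed */
    case ECI_A0A1A2:
    case ECI_A0A1A2B0:
        return 0xf000;      /* only beat 3 still executes */
    default:
        assert(!"reserved ECI value");
        return 0;
    }
}
```

Each beat covers four bytes of the vector, which is why the masks clear one nibble per beat already executed.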
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 58 ++++++++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 24 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index bc67b86e700..ffff280726d 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -26,6 +26,35 @@ #include "exec/exec-all.h" #include "tcg/tcg.h" +static uint16_t mve_eci_mask(CPUARMState *env) +{ + /* + * Return the mask of which elements in the MVE vector correspond + * to beats being executed. The mask has 1 bits for executed lanes + * and 0 bits where ECI says this beat was already executed. + */ + int eci; + + if ((env->condexec_bits & 0xf) != 0) { + return 0xffff; + } + + eci = env->condexec_bits >> 4; + switch (eci) { + case ECI_NONE: + return 0xffff; + case ECI_A0: + return 0xfff0; + case ECI_A0A1: + return 0xff00; + case ECI_A0A1A2: + case ECI_A0A1A2B0: + return 0xf000; + default: + g_assert_not_reached(); + } +} + static uint16_t mve_element_mask(CPUARMState *env) { /* @@ -68,30 +97,11 @@ static uint16_t mve_element_mask(CPUARMState *env) mask &= ltpmask; } - if ((env->condexec_bits & 0xf) == 0) { - /* - * ECI bits indicate which beats are already executed; - * we handle this by effectively predicating them out. - */ - int eci = env->condexec_bits >> 4; - switch (eci) { - case ECI_NONE: - break; - case ECI_A0: - mask &= 0xfff0; - break; - case ECI_A0A1: - mask &= 0xff00; - break; - case ECI_A0A1A2: - case ECI_A0A1A2B0: - mask &= 0xf000; - break; - default: - g_assert_not_reached(); - } - } - + /* + * ECI bits indicate which beats are already executed; + * we handle this by effectively predicating them out. 
+ */ + mask &= mve_eci_mask(env); return mask; } From patchwork Thu Jul 29 11:14:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511167 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Kiwvbyvn; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Ws33KKz9sSs for ; Thu, 29 Jul 2021 21:26:17 +1000 (AEST) Received: from localhost ([::1]:49198 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94B5-0004Jj-5t for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:26:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39888) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940k-0000R8-HU for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:37 -0400 Received: from mail-wr1-x435.google.com ([2a00:1450:4864:20::435]:38588) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940a-00012Z-6I for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:34 -0400 Received: by mail-wr1-x435.google.com with SMTP id l18so6461616wrv.5 for ; Thu, 29 Jul 2021 04:15:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=qmYdn/b+7uslGkmiRTc9E6zb61LbVOfXICGmdFgy+bQ=; b=KiwvbyvnDCKLQ8U1e4tmBErUZ9PzM20Q0BvEHUH28UfyxKXBmxOdRcoC5PTRkoQwNh iDK2GRKFeIh7mlXu/+BiaWXTXUNxofZWVqbctvz7iGrx1nEAlOZcc8crEWKn8tj6pmEe LpfmJ6FGFv8VW6WlXJUSMAQHJJcyUMO8x4TSBq/keE05KLOfpYKoF04UFrAUQVu3xrb6 V1WvVVtdSikmAIicqHSs8XqrpvKqtISo1y77SvmvSJp2oyIiBE5rlz4WN0Ahxf/IELt0 JJvJuWEr8ORdQqH8EbosJyYpBzzOGQfWWQfOVz1P7lxUsTvDukphGE2H2R76eh/xr7mv Dh7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qmYdn/b+7uslGkmiRTc9E6zb61LbVOfXICGmdFgy+bQ=; b=esFKXNEVKYg+SDCHNrlgfhHFZJAao5BH1s2n4BOTT69RfmRH/+i4djbxtf7XAl2F95 IFAfbeaxqvsDymEKAV+wA/WCQr2B2wQ1ZhXu20ECf43unDIyTMxUiC/7O5POXPDngF7A Ipa/VLPSLroDozmTC94zjfeDmoBKun4UEbNXJ3stUNvHP8L1zkmdYzNcrj7ZdjXIy1p5 5cYiBdueAYCF5xdAY2WvsnoYbDLdOGqBXw/kHkh8N0WOeBiTzJ/izzBylUC64Ufm0Lsv x/thvZI2U5zkMpWjJozvquYvbLr/HBrVXS5splgCTfEwjfso/JvZB7NrBi0Y8mnPaqZU 92xA== X-Gm-Message-State: AOAM530k7Hr2l6SxVQxiSWufzb2dAqI4xG93SMT8vL+TLtxAlisBh/C9 jDUcJhgDeGm9eHxVzMIAXBQ1cONEYhO7FQ== X-Google-Smtp-Source: ABdhPJyEIuxAFGBcoxeu396kyoyXES3d9WB3AanQQF6ZDJxvLiq53pd/lhjf4XIHMoYe23mWnH7cVQ== X-Received: by 2002:a5d:54c7:: with SMTP id x7mr4291482wrv.77.1627557322794; Thu, 29 Jul 2021 04:15:22 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:22 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 10/53] target/arm: Fix VPT advance when ECI is non-zero Date: Thu, 29 Jul 2021 12:14:29 +0100 Message-Id: <20210729111512.16541-11-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::435; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We were not paying attention to the ECI state when advancing the VPT state. Architecturally, VPT state advance happens for every beat (see the pseudocode VPTAdvance()), so on every beat the 4 bits of VPR.P0 corresponding to the current beat are inverted if required, and at the end of beats 1 and 3 the VPR MASK fields are updated. This means that if the ECI state says we should not be executing all 4 beats then we need to skip some of the updating of the VPR that we currently do in mve_advance_vpt(). 
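For illustration only (not part of this patch): a standalone sketch of the per-beat VPR update described above, operating on a plain 32-bit value. vpt_advance_sketch() is a hypothetical name, and the field layout (P0 in bits [15:0], MASK01 in [19:16], MASK23 in [23:20]) is an assumption standing in for the FIELD_EX32()/FIELD_DP32() accessors.

```c
#include <stdint.h>

/*
 * eci_mask has one bit per vector byte, set for beats that actually
 * execute (0xffff when ECI is not in effect).
 */
static uint32_t vpt_advance_sketch(uint32_t vpr, uint16_t eci_mask)
{
    uint32_t mask01 = (vpr >> 16) & 0xf;   /* assumed MASK01 position */
    uint32_t mask23 = (vpr >> 20) & 0xf;   /* assumed MASK23 position */
    uint16_t inv_mask = eci_mask;

    /* Only invert P0 bits for executed beats, and only where the mask asks. */
    if (mask01 <= 8) {
        inv_mask &= ~0xff;
    }
    if (mask23 <= 8) {
        inv_mask &= ~0xff00;
    }
    vpr ^= inv_mask;

    /* MASK01 is updated at the end of beat 1, so only if beat 1 executed. */
    if (eci_mask & 0xf0) {
        vpr = (vpr & ~(UINT32_C(0xf) << 16)) | (((mask01 << 1) & 0xf) << 16);
    }
    /* Beat 3 always executes, so MASK23 is always updated. */
    vpr = (vpr & ~(UINT32_C(0xf) << 20)) | (((mask23 << 1) & 0xf) << 20);
    return vpr;
}
```

For example, with MASK01 = 0b1100 and P0 = 0x00ff, a fully executed instruction inverts the low half of P0 and shifts MASK01, while an ECI state in which beats 0 and 1 were already executed leaves both untouched.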
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 24 +++++++++++++++++------- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index ffff280726d..bc89ce94d5a 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -110,6 +110,8 @@ static void mve_advance_vpt(CPUARMState *env) /* Advance the VPT and ECI state if necessary */ uint32_t vpr = env->v7m.vpr; unsigned mask01, mask23; + uint16_t inv_mask; + uint16_t eci_mask = mve_eci_mask(env); if ((env->condexec_bits & 0xf) == 0) { env->condexec_bits = (env->condexec_bits == (ECI_A0A1A2B0 << 4)) ? @@ -121,17 +123,25 @@ static void mve_advance_vpt(CPUARMState *env) return; } + /* Invert P0 bits if needed, but only for beats we actually executed */ mask01 = FIELD_EX32(vpr, V7M_VPR, MASK01); mask23 = FIELD_EX32(vpr, V7M_VPR, MASK23); - if (mask01 > 8) { - /* high bit set, but not 0b1000: invert the relevant half of P0 */ - vpr ^= 0xff; + /* Start by assuming we invert all bits corresponding to executed beats */ + inv_mask = eci_mask; + if (mask01 <= 8) { + /* MASK01 says don't invert low half of P0 */ + inv_mask &= ~0xff; } - if (mask23 > 8) { - /* high bit set, but not 0b1000: invert the relevant half of P0 */ - vpr ^= 0xff00; + if (mask23 <= 8) { + /* MASK23 says don't invert high half of P0 */ + inv_mask &= ~0xff00; } - vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1); + vpr ^= inv_mask; + /* Only update MASK01 if beat 1 executed */ + if (eci_mask & 0xf0) { + vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1); + } + /* Beat 3 always executes, so update MASK23 */ vpr = FIELD_DP32(vpr, V7M_VPR, MASK23, mask23 << 1); env->v7m.vpr = vpr; } From patchwork Thu Jul 29 11:14:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511156 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=p71GY64S; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Ld3cXYz9sSs for ; Thu, 29 Jul 2021 21:18:17 +1000 (AEST) Received: from localhost ([::1]:51220 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m943L-0003d5-5k for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:18:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:39972) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940o-0000UD-1x for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:38 -0400 Received: from mail-wm1-x333.google.com ([2a00:1450:4864:20::333]:37493) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940a-000147-SS for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:37 -0400 Received: by mail-wm1-x333.google.com with SMTP id l34-20020a05600c1d22b02902573c214807so1130496wms.2 for ; Thu, 29 Jul 2021 04:15:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=9E2l5jElzoIn6FznnRlfJdbmRFg2UKvriErGukFqqJw=; b=p71GY64S4bkwnzaHIPsCIIXm2YhNf/9nXFFqcyR1DUFSBqemp5Mdvx/eh3hm0bTdm/ NGr5DNhmXLqXBDNuqReLCVMM9p/SS7aPzxfoHJdpX1+nq0nYjNllSyDcLApwax7pQ8rD rWPDWg8+SuVAMUUYrLI4InDE+epjnaY3/4ZIp31RUc5qK9xCk4izkmA+ca4wHLXMOziV 
8RrgP1+OUAAbWK6SsYeRvfPIkLj1vNRshK3PzVWgje3BTPqAFb41yW1j7vxgF3mTQ6ue Xtklzd8t5p6zKMqyDQKMNFyQ50Wv5LHs5tMjshXy4T663COXk9HvK8WDy67z8+OBjBNT bdUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9E2l5jElzoIn6FznnRlfJdbmRFg2UKvriErGukFqqJw=; b=dsI13S3vADhtG5T9mC6FYBEYjH4E1Y1L7/QyHYo4DTppAVkXN62hYV3ONInU4wP6tZ nM8GLwwGk1Ct9vpg658jjEzH368ekKR6Vo1+QoEEgjUaRMuTfpJc8DXaSu5brNVWLIwJ 9sPVJZVaKKtB5JFLJ2Ijs9Cdcf10Hr/H4XNTtTJa6ppyGw4bV0owdUMiB+8GPuF+auAr c3bxN0+oWki/vt268rA2bMpfkw+JS+ZWG0XRAGSYfkDvokDby3ONCJdLQIP1g3tTxzB9 noqNC5HO+7G2qMau034E4bjRmclFKWzU3AxA9dzsQeWnPK9Pucwt+Kx9XlohVH1K6HaN Qnlg== X-Gm-Message-State: AOAM5327agIMud3j3VKCNU2H0BRs/tzjYE6grrcSs53DrCkHQCPwdFb+ AgxwzcSkShd8uumH2JanhuAr6gznm3YjqA== X-Google-Smtp-Source: ABdhPJxXZU9VXPHVeUJS7afG27GLM8616ssKqQLvn+nyjY3Lo9pZEhu1dnFI/qUH4kfKGqgCOx+ikg== X-Received: by 2002:a05:600c:21d8:: with SMTP id x24mr4153217wmj.59.1627557323542; Thu, 29 Jul 2021 04:15:23 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:23 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 11/53] target/arm: Fix VLDRB/H/W for predicated elements Date: Thu, 29 Jul 2021 12:14:30 +0100 Message-Id: <20210729111512.16541-12-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::333; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x333.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" For vector loads, predicated elements are zeroed, instead of retaining their previous values (as happens for most data processing operations). This means we need to distinguish "beat not executed due to ECI" (don't touch destination element) from "beat executed but predicated out" (zero destination element). 
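For illustration only (not part of this patch): a standalone sketch of the distinction for byte-sized elements, with memory modelled as a plain array. vldrb_sketch() is a hypothetical name; the real helper loads through cpu_*_data_ra() and handles wider element sizes and host endianness.

```c
#include <stdint.h>

/*
 * mask:     1 bits for bytes that are executed and not predicated out
 * eci_mask: 1 bits for bytes whose beat executes at all
 */
static void vldrb_sketch(uint8_t d[16], const uint8_t mem[16],
                         uint16_t mask, uint16_t eci_mask)
{
    for (unsigned b = 0; b < 16; b++) {
        if (eci_mask & (1u << b)) {
            /* Beat executed: load the element, or zero it if predicated. */
            d[b] = (mask & (1u << b)) ? mem[b] : 0;
        }
        /* Beat not executed due to ECI: leave the destination untouched. */
    }
}
```

With eci_mask = 0xff00 and mask = 0x0f00, bytes 0..7 keep their old contents, bytes 8..11 are loaded from memory, and bytes 12..15 are zeroed.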
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index bc89ce94d5a..be8b9545317 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -146,12 +146,13 @@ static void mve_advance_vpt(CPUARMState *env) env->v7m.vpr = vpr; } - +/* For loads, predicated lanes are zeroed instead of keeping their old values */ #define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \ { \ TYPE *d = vd; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned b, e; \ /* \ * R_SXTM allows the dest reg to become UNKNOWN for abandoned \ @@ -159,8 +160,9 @@ static void mve_advance_vpt(CPUARMState *env) * then take an exception. \ */ \ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \ - if (mask & (1 << b)) { \ - d[H##ESIZE(e)] = cpu_##LDTYPE##_data_ra(env, addr, GETPC()); \ + if (eci_mask & (1 << b)) { \ + d[H##ESIZE(e)] = (mask & (1 << b)) ? 
\ + cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ } \ addr += MSIZE; \ } \ From patchwork Thu Jul 29 11:14:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511160 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=B+TWFPf/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7PC4wz8z9sSs for ; Thu, 29 Jul 2021 21:20:31 +1000 (AEST) Received: from localhost ([::1]:59834 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m945V-0000yD-6y for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:20:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40032) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940q-0000XK-2d for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:41 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]:42504) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940b-00014t-R1 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:39 -0400 Received: by mail-wr1-x42b.google.com with SMTP id j2so6433887wrx.9 for ; Thu, 29 Jul 2021 04:15:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=r9xp+d/e9bbwwMEOsVgj3Bb2KxRODXZsL3alV4Ml4i0=; b=B+TWFPf/sXmFFYbk8j2XEZ6ppHEBfZECucTLz8puc6LwMFlZQF1Er94LzjWpXetyP5 66zt4dC8xCD7rb2OGxPrZ1QeC3+1phDJMrBv+ylh7wdDVkP1p+e9eCO18CsQnGhTno+w uZZ7wGlQHuUbQHss7av/ERZS/dShbAI7TTL3p5ga4ybFQGFbN0LXIC1eKIkO7hAlowai QK/tqOtbFiAyqk+Y6sulmDnSA7tto6Fdkpw1/Cz7XPzDdSnfK+CSOxjYpzSYf/CAZvbE Zx5hvMwESid1gxuHQqm3KXBftVHUT9SHHwB3zYhI1TtFUzwjbzPzylpd7gRf0s2oX16W S8ZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=r9xp+d/e9bbwwMEOsVgj3Bb2KxRODXZsL3alV4Ml4i0=; b=nI+Kjf82DlechWgVlEVX9oK9pqlU5BW52f57h5+ukk0fYFjQtqlxKlCv4WejtrCWl6 j2aZZXfUe/gKwer/oiIwFiKAz/WB9YfKutSHv60cTXEaR8D98QPc41fVC0IgDLoBTrNR 0icrMYx6gmHhO6hMrflT3SJekpPbs27hhOoD207T1Qz3LCzo13cm+zoabQql7JgeE9AG Cqx3eZU88ZHOqRG+ZEmQ+J4GUgFPNvAwZeXyAFcVoomGRpMVSHYiXNYFKfHsTpE2oBnC xls4dkpQ1lk5P/JixTe6ToDyZvABsrezEqMqxBWQnSuPcWF5CPt7+kcsE3wI9WcLczdI w2fA== X-Gm-Message-State: AOAM531cMk98o45pl3CcgZEyf74NJwWKigo9V+bC7gWogYL1VWrlqzTN 5VjAhgRnzCWNzlKeVx/YRQ1uTtgo6zVtKA== X-Google-Smtp-Source: ABdhPJw5uzAvFCuXxLAqmFtfRkqwzCbSRj94oOmX0WJGI2IBxi28Y6ENyxrhwzgQh4vnwslRnDdPXg== X-Received: by 2002:adf:82e6:: with SMTP id 93mr4204432wrc.47.1627557324476; Thu, 29 Jul 2021 04:15:24 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:24 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 12/53] target/arm: Implement MVE VMULL (polynomial) Date: Thu, 29 Jul 2021 12:14:31 +0100 Message-Id: <20210729111512.16541-13-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMULL (polynomial) insn. Unlike Neon, this comes in two flavours: 8x8->16 and 16x16->32. Also unlike Neon, the inputs are in either the low or the high half of each double-width element. The assembler for this insn indicates the size with "P8" or "P16", encoded into bit 28 as size = 0 or 1. We choose to follow the same encoding as VQDMULL and decode this into a->size as MO_16 or MO_32 indicating the size of the result elements. This then carries through to the helper function names, where it matches up with the existing pmull_h() which does an 8x8->16 operation and a new pmull_w() which does the 16x16->32 operation. 
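For reference, a lane-at-a-time model of the 8x8->16 case might look like the sketch below. This is not the code in this patch (which processes 64 bits at once via pmull_h()); clmul8() and model_vmullp8() are hypothetical names used only to show the arithmetic and the bottom/top input selection.

```c
#include <stdint.h>

/* Carry-less (polynomial over GF(2)) 8x8->16 multiply of one lane */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if (a & (1u << i)) {
            /* XOR instead of add: partial products never carry */
            result ^= (uint16_t)((uint16_t)b << i);
        }
    }
    return result;
}

/*
 * Reference model of VMULL.P8 on one 64-bit chunk: 'top' selects the
 * high byte of each 16-bit element as the input, otherwise the low byte.
 */
static uint64_t model_vmullp8(uint64_t n, uint64_t m, int top)
{
    uint64_t result = 0;
    int lane;

    for (lane = 0; lane < 4; lane++) {
        unsigned shift = lane * 16 + (top ? 8 : 0);
        uint8_t a = (uint8_t)(n >> shift);
        uint8_t b = (uint8_t)(m >> shift);

        result |= (uint64_t)clmul8(a, b) << (lane * 16);
    }
    return result;
}
```

For example, clmul8(3, 3) is 5, since (x + 1)^2 = x^2 + 1 over GF(2). The 16x16->32 flavour follows the same pattern with 16 iterations and two lanes per 64-bit chunk.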
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 5 +++++ target/arm/vec_internal.h | 11 +++++++++++ target/arm/mve.decode | 14 ++++++++++---- target/arm/mve_helper.c | 16 ++++++++++++++++ target/arm/translate-mve.c | 28 ++++++++++++++++++++++++++++ target/arm/vec_helper.c | 14 +++++++++++++- 6 files changed, 83 insertions(+), 5 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 56e40844ad9..84adfb21517 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -145,6 +145,11 @@ DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpbh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpth, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpbw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullptw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vqdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vqdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vqdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 865d2139447..2a335582906 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -206,4 +206,15 @@ int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); +/* + * 8 x 8 -> 16 vector polynomial multiply where the inputs are + * in the low 8 bits of each 16-bit element +*/ +uint64_t pmull_h(uint64_t op1, uint64_t op2); +/* + * 16 x 16 -> 32 vector polynomial multiply where the inputs are + * in the low 16 bits of each 32-bit 
element + */ +uint64_t pmull_w(uint64_t op1, uint64_t op2); + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/mve.decode b/target/arm/mve.decode index fa9d921f933..de079ec517d 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -173,10 +173,16 @@ VHADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op VHSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op VHSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op -VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op -VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op -VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op -VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op +{ + VMULLP_B 111 . 1110 0 . 11 ... 1 ... 0 1110 . 0 . 0 ... 0 @2op_sz28 + VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op + VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op +} +{ + VMULLP_T 111 . 1110 0 . 11 ... 1 ... 1 1110 . 0 . 0 ... 0 @2op_sz28 + VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op + VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op +} VQDMULH 1110 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op VQRDMULH 1111 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index be8b9545317..91fb346d7e5 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -481,6 +481,22 @@ DO_2OP_L(vmulltub, 1, 1, uint8_t, 2, uint16_t, DO_MUL) DO_2OP_L(vmulltuh, 1, 2, uint16_t, 4, uint32_t, DO_MUL) DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL) +/* + * Polynomial multiply. We can always do this generating 64 bits + * of the result at a time, so we don't need to use DO_2OP_L. 
+ */ +#define VMULLPH_MASK 0x00ff00ff00ff00ffULL +#define VMULLPW_MASK 0x0000ffff0000ffffULL +#define DO_VMULLPBH(N, M) pmull_h((N) & VMULLPH_MASK, (M) & VMULLPH_MASK) +#define DO_VMULLPTH(N, M) DO_VMULLPBH((N) >> 8, (M) >> 8) +#define DO_VMULLPBW(N, M) pmull_w((N) & VMULLPW_MASK, (M) & VMULLPW_MASK) +#define DO_VMULLPTW(N, M) DO_VMULLPBW((N) >> 16, (M) >> 16) + +DO_2OP(vmullpbh, 8, uint64_t, DO_VMULLPBH) +DO_2OP(vmullpth, 8, uint64_t, DO_VMULLPTH) +DO_2OP(vmullpbw, 8, uint64_t, DO_VMULLPBW) +DO_2OP(vmullptw, 8, uint64_t, DO_VMULLPTW) + /* * Because the computation type is at least twice as large as required, * these work for both signed and unsigned source types. diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index a2a45036a0b..d318f34b2bc 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -464,6 +464,34 @@ static bool trans_VQDMULLT(DisasContext *s, arg_2op *a) return do_2op(s, a, fns[a->size]); } +static bool trans_VMULLP_B(DisasContext *s, arg_2op *a) +{ + /* + * Note that a->size indicates the output size, ie VMULL.P8 + * is the 8x8->16 operation and a->size is MO_16; VMULL.P16 + * is the 16x16->32 operation and a->size is MO_32. 
+ */ + static MVEGenTwoOpFn * const fns[] = { + NULL, + gen_helper_mve_vmullpbh, + gen_helper_mve_vmullpbw, + NULL, + }; + return do_2op(s, a, fns[a->size]); +} + +static bool trans_VMULLP_T(DisasContext *s, arg_2op *a) +{ + /* a->size is as for trans_VMULLP_B */ + static MVEGenTwoOpFn * const fns[] = { + NULL, + gen_helper_mve_vmullpth, + gen_helper_mve_vmullptw, + NULL, + }; + return do_2op(s, a, fns[a->size]); +} + /* * VADC and VSBC: these perform an add-with-carry or subtract-with-carry * of the 32-bit elements in each lane of the input vectors, where the diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 034f6b84f78..17fb1583622 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2028,11 +2028,23 @@ static uint64_t expand_byte_to_half(uint64_t x) | ((x & 0xff000000) << 24); } -static uint64_t pmull_h(uint64_t op1, uint64_t op2) +uint64_t pmull_w(uint64_t op1, uint64_t op2) { uint64_t result = 0; int i; + for (i = 0; i < 16; ++i) { + uint64_t mask = (op1 & 0x0000000100000001ull) * 0xffffffff; + result ^= op2 & mask; + op1 >>= 1; + op2 <<= 1; + } + return result; +} +uint64_t pmull_h(uint64_t op1, uint64_t op2) +{ + uint64_t result = 0; + int i; for (i = 0; i < 8; ++i) { uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff; result ^= op2 & mask; From patchwork Thu Jul 29 11:14:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511172 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google 
header.b=AcMOi2C5; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7bM0rFHz9sSs for ; Thu, 29 Jul 2021 21:29:19 +1000 (AEST) Received: from localhost ([::1]:60052 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94E0-0003Cq-Pf for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:29:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40048) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940q-0000XM-Cj for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:41 -0400 Received: from mail-wr1-x431.google.com ([2a00:1450:4864:20::431]:37758) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940c-000162-NL for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:40 -0400 Received: by mail-wr1-x431.google.com with SMTP id d8so6455673wrm.4 for ; Thu, 29 Jul 2021 04:15:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=sBt+Er2ORY6pTVXoQlImPIKKZV4cWhSFl3FktQ7/5BI=; b=AcMOi2C5CfADuaZnCMB9BcZR/jt4w/D0ZvfONy3AryOpPQoQEeKAzY1HFjP8d3gQQx skwthkb/ObIbKc0nSu1x6JFCapZwTHuiaHXRb8zojl5Hu/RNnm0Fioi7cpWvxmSxpfSQ 1DTaEETLI3/9/iEThLrfqeJif0K071cLcszfXFgMla3shPKTjsroHTKvVVZdqoJy9C3/ fGhk3um3ckDLZLo0XlLHYQlWgRVrbBX4Ct1oXxY3X/I/pbNgeADDqIhRdMLFXMNHBFPp Ip+Ba+kS7x6LlQrshHp7bJYmgbl7R0Dh9fb8IKAfCdfWHQ9sIxelgwy6pCAas7xXk7rO WR3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sBt+Er2ORY6pTVXoQlImPIKKZV4cWhSFl3FktQ7/5BI=; 
b=jMZCEsyJUMeYogXR3ryJnhDP1AmOTZxl+E12ORsL9s9gTGIIF3XH4Qn8RZt2+LhbRj gKIXIjonb4unek8gWgzbRYNQyVPtnKipHHdblwN9ZEJJxmq8qKY+NPhv+T1ucqqR2S/X OypLau4J+3xjYyvQNDXt4a29y0Myh+wLhTTOtgJEVTQHgRVA5pToxWvGRRvn21tI4QGp Yioi/tVt/huk4eFD5mzGkNL8gKnT3tzvO8E6H014H8J2mMwS6nWF/galO04/hXBUjFKR Aepeh8JqCJk4K3zJRv0fAeL6DFBfiHXTfBPCuNRA+4/IcBDi1tm9pSfI3VaXyBClLlnL 9QPA== X-Gm-Message-State: AOAM531ta9nm4cFv4t9okXeCAd4ummDFkSk4zl6vPNqjhSKTRq9GnsZQ mhYCO/jxiK/hJ6KW4xRE4gmuEg+JuK0Ptw== X-Google-Smtp-Source: ABdhPJwMLYVnloh0Ik6lnqPjroCuwX0icmgkVBDr4lLUbitSOERx6HO6Reg2rjXilQpD4cjUsDHbZA== X-Received: by 2002:a05:6000:548:: with SMTP id b8mr4373825wrf.159.1627557325386; Thu, 29 Jul 2021 04:15:25 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:25 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 13/53] target/arm: Implement MVE incrementing/decrementing dup insns Date: Thu, 29 Jul 2021 12:14:32 +0100 Message-Id: <20210729111512.16541-14-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::431; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE incrementing/decrementing dup insns VIDUP, VDDUP, VIWDUP and VDWDUP. These fill the elements of a vector with successively incrementing values, starting at the offset specified in a general purpose register. The final value of the offset is written back to this register. The wrapping variants take a second general purpose register which specifies the point where the count should wrap back to 0. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 12 ++++ target/arm/mve.decode | 25 ++++++++ target/arm/mve_helper.c | 63 +++++++++++++++++++ target/arm/translate-mve.c | 120 +++++++++++++++++++++++++++++++++++++ 4 files changed, 220 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 84adfb21517..b9af03cc03b 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -35,6 +35,18 @@ DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_viduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_vidupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) + +DEF_HELPER_FLAGS_5(mve_viwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_viwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_viwdupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) + +DEF_HELPER_FLAGS_5(mve_vdwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_vdwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_vdwdupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) + DEF_HELPER_FLAGS_3(mve_vclsb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vclsh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, 
ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index de079ec517d..88c9c18ebf1 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -35,6 +35,8 @@ &2scalar qd qn rm size &1imm qd imm cmode op &2shift qd qm shift size +&vidup qd rn size imm +&viwdup qd rn rm size imm @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -259,6 +261,29 @@ VDUP 1110 1110 1 1 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=0 VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 1 1 0000 @vdup size=1 VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 +# Incrementing and decrementing dup + +# VIDUP, VDDUP format immediate: 1 << (immh:imml) +%imm_vidup 7:1 0:1 !function=vidup_imm + +# VIDUP, VDDUP registers: Rm bits [3:1] from insn, bit 0 is 1; +# Rn bits [3:1] from insn, bit 0 is 0 +%vidup_rm 1:3 !function=times_2_plus_1 +%vidup_rn 17:3 !function=times_2 + +@vidup .... .... . . size:2 .... .... .... .... .... \ + qd=%qd imm=%imm_vidup rn=%vidup_rn &vidup +@viwdup .... .... . . size:2 .... .... .... .... .... \ + qd=%qd imm=%imm_vidup rm=%vidup_rm rn=%vidup_rn &viwdup +{ + VIDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 111 . @vidup + VIWDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 ... . @viwdup +} +{ + VDDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 111 . @vidup + VDWDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 ... . 
@viwdup +} + # multiply-add long dual accumulate # rdahi: bits [3:1] from insn, bit 0 is 1 # rdalo: bits [3:1] from insn, bit 0 is 0 diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 91fb346d7e5..38b4181db2a 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1695,3 +1695,66 @@ uint32_t HELPER(mve_sqrshr)(CPUARMState *env, uint32_t n, uint32_t shift) { return do_sqrshl_bhs(n, -(int8_t)shift, 32, true, &env->QF); } + +#define DO_VIDUP(OP, ESIZE, TYPE, FN) \ + uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \ + uint32_t offset, uint32_t imm) \ + { \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], offset, mask); \ + offset = FN(offset, imm); \ + } \ + mve_advance_vpt(env); \ + return offset; \ + } + +#define DO_VIWDUP(OP, ESIZE, TYPE, FN) \ + uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \ + uint32_t offset, uint32_t wrap, \ + uint32_t imm) \ + { \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], offset, mask); \ + offset = FN(offset, wrap, imm); \ + } \ + mve_advance_vpt(env); \ + return offset; \ + } + +#define DO_VIDUP_ALL(OP, FN) \ + DO_VIDUP(OP##b, 1, int8_t, FN) \ + DO_VIDUP(OP##h, 2, int16_t, FN) \ + DO_VIDUP(OP##w, 4, int32_t, FN) + +#define DO_VIWDUP_ALL(OP, FN) \ + DO_VIWDUP(OP##b, 1, int8_t, FN) \ + DO_VIWDUP(OP##h, 2, int16_t, FN) \ + DO_VIWDUP(OP##w, 4, int32_t, FN) + +static uint32_t do_add_wrap(uint32_t offset, uint32_t wrap, uint32_t imm) +{ + offset += imm; + if (offset == wrap) { + offset = 0; + } + return offset; +} + +static uint32_t do_sub_wrap(uint32_t offset, uint32_t wrap, uint32_t imm) +{ + if (offset == 0) { + offset = wrap; + } + offset -= imm; + return offset; +} + +DO_VIDUP_ALL(vidup, DO_ADD) +DO_VIWDUP_ALL(viwdup, do_add_wrap) +DO_VIWDUP_ALL(vdwdup, do_sub_wrap) diff 
--git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d318f34b2bc..a220521c00b 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -25,6 +25,11 @@ #include "translate.h" #include "translate-a32.h" +static inline int vidup_imm(DisasContext *s, int x) +{ + return 1 << x; +} + /* Include the generated decoder */ #include "decode-mve.c.inc" @@ -36,6 +41,8 @@ typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); +typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); +typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1059,3 +1066,116 @@ static bool trans_VSHLC(DisasContext *s, arg_VSHLC *a) mve_update_eci(s); return true; } + +static bool do_vidup(DisasContext *s, arg_vidup *a, MVEGenVIDUPFn *fn) +{ + TCGv_ptr qd; + TCGv_i32 rn; + + /* + * Vector increment/decrement and duplicate (VIDUP, VDDUP). + * This fills the vector with elements of successively increasing + * or decreasing values, starting from Rn. 
+ */ + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) { + return false; + } + if (a->size == MO_64) { + /* size 0b11 is another encoding */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qd = mve_qreg_ptr(a->qd); + rn = load_reg(s, a->rn); + fn(rn, cpu_env, qd, rn, tcg_constant_i32(a->imm)); + store_reg(s, a->rn, rn); + tcg_temp_free_ptr(qd); + mve_update_eci(s); + return true; +} + +static bool do_viwdup(DisasContext *s, arg_viwdup *a, MVEGenVIWDUPFn *fn) +{ + TCGv_ptr qd; + TCGv_i32 rn, rm; + + /* + * Vector increment/decrement with wrap and duplicate (VIWDUP, VDWDUP). + * This fills the vector with elements of successively increasing + * or decreasing values, starting from Rn. Rm specifies a point where + * the count wraps back around to 0. The updated offset is written back + * to Rn. + */ + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) { + return false; + } + if (!fn || a->rm == 13 || a->rm == 15) { + /* + * size 0b11 is another encoding; Rm == 13 is UNPREDICTABLE; 
+ */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qd = mve_qreg_ptr(a->qd); + rn = load_reg(s, a->rn); + rm = load_reg(s, a->rm); + fn(rn, cpu_env, qd, rn, rm, tcg_constant_i32(a->imm)); + store_reg(s, a->rn, rn); + tcg_temp_free_ptr(qd); + tcg_temp_free_i32(rm); + mve_update_eci(s); + return true; +} + +static bool trans_VIDUP(DisasContext *s, arg_vidup *a) +{ + static MVEGenVIDUPFn * const fns[] = { + gen_helper_mve_vidupb, + gen_helper_mve_viduph, + gen_helper_mve_vidupw, + NULL, + }; + return do_vidup(s, a, fns[a->size]); +} + +static bool trans_VDDUP(DisasContext *s, arg_vidup *a) +{ + static MVEGenVIDUPFn * const fns[] = { + gen_helper_mve_vidupb, + gen_helper_mve_viduph, + gen_helper_mve_vidupw, + NULL, + }; + /* VDDUP is just like VIDUP but with a negative immediate */ + a->imm = -a->imm; + return do_vidup(s, a, fns[a->size]); +} + +static bool trans_VIWDUP(DisasContext *s, arg_viwdup *a) +{ + static MVEGenVIWDUPFn * const fns[] = { + gen_helper_mve_viwdupb, + gen_helper_mve_viwduph, + gen_helper_mve_viwdupw, + NULL, + }; + return do_viwdup(s, a, fns[a->size]); +} + +static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a) +{ + static MVEGenVIWDUPFn * const fns[] = { + gen_helper_mve_vdwdupb, + gen_helper_mve_vdwduph, + gen_helper_mve_vdwdupw, + NULL, + }; + return do_viwdup(s, a, fns[a->size]); +} From patchwork Thu Jul 29 11:14:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511175 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org 
header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=a+RT+BAb; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7g72fgpz9sSs for ; Thu, 29 Jul 2021 21:32:33 +1000 (AEST) Received: from localhost ([::1]:40540 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94H8-0000lq-8L for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:32:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40114) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940t-0000af-7w for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:44 -0400 Received: from mail-wm1-x32b.google.com ([2a00:1450:4864:20::32b]:36614) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940d-00016s-U9 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:42 -0400 Received: by mail-wm1-x32b.google.com with SMTP id o7-20020a05600c5107b0290257f956e02dso140050wms.1 for ; Thu, 29 Jul 2021 04:15:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=lONbbID3VHlgzJU0nIMD9RpteXWbXJ5QRJzmueidsYM=; b=a+RT+BAb0W6VuQZT6ipbN7pZscaDSnOzRGZEuwz/qU6E12J2ooS0nTkeU/as3jDccL afVxbW61VyaaRUxISM8Fd0e5LiG+NxjrZoY4TM9jGOBnN3KaYEOulr9Rcch3CPeOM14U 6PnH5B82s2fCSv88lS2n4P8LUqdoH2UfhEphtAOv0XrGe1PqNMDUi0TxvJoWJ6LpAmcI 7TUQNv9Mx4cIYxRt8Kra83l57c3bjF/ZV84tLuhnDM6VSs3N6AeaFY9kIKOFhDY1z2RN EjeSyEosBoJKhtR6I9bH5Wv0pKIGxocwoN1cgX3/finTtCXhpljnFLIOKenS3iKWR8Gi Hznw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=lONbbID3VHlgzJU0nIMD9RpteXWbXJ5QRJzmueidsYM=; b=eQ/PodWBAoltMHWNBigC5e+aGFzS8DGwdYgTUsPxGqe76oNK4UrYABBbIq1jf9rO+f +sWdU+qyNStmt64xv6+g4vu9a7+BQr29G5qksFRM4B1v7zhIIkRJJRUQoBGoQnhVxUZ5 fHZS60ta1tQK/h29PfXLlSpXJ3cNf87YCgzZIA+ivGXSbNRp2FWbGJM4PeAWiu22ppce pNTmUlHPpDzCODCCJmWFxQOaCd/ha5+NXDzdgupU3YjFu8wrN0bkDpvYU8PLPXolfFML pU4TALKq5sbAYaj88FV9sBXAnFoeisVjuLwGljrsyL6cLKht7Z/2vuaHWC+6AjISI8uu jC6A== X-Gm-Message-State: AOAM531K3WEjwa+LPK/NSrfafGJsUD6nAd9YUUzPA5vIC6ivpm3pG+T/ 6lkR/dTA67aGYha+5pWIL82UHXnzfVcMLg== X-Google-Smtp-Source: ABdhPJzhgKJkXCjOI3PrzY/m70iddc/49rGmwi4oamNYzxoyEumpiUWzokq0TnHXDOK9yH2xi4VnnA== X-Received: by 2002:a05:600c:190e:: with SMTP id j14mr4026276wmq.19.1627557326439; Thu, 29 Jul 2021 04:15:26 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:25 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 14/53] target/arm: Factor out gen_vpst() Date: Thu, 29 Jul 2021 12:14:33 +0100 Message-Id: <20210729111512.16541-15-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32b; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Factor out the "generate code to update VPR.MASK01/MASK23" part of trans_VPST(); we are going to want to reuse it for the VPT insns. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-mve.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index a220521c00b..6d8da361469 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -737,33 +737,24 @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) return do_long_dual_acc(s, a, fns[a->x]); } -static bool trans_VPST(DisasContext *s, arg_VPST *a) +static void gen_vpst(DisasContext *s, uint32_t mask) { - TCGv_i32 vpr; - - /* mask == 0 is a "related encoding" */ - if (!dc_isar_feature(aa32_mve, s) || !a->mask) { - return false; - } - if (!mve_eci_check(s) || !vfp_access_check(s)) { - return true; - } /* * Set the VPR mask fields. We take advantage of MASK01 and MASK23 * being adjacent fields in the register. * - * This insn is not predicated, but it is subject to beat-wise + * Updating the masks is not predicated, but it is subject to beat-wise * execution, and the mask is updated on the odd-numbered beats. * So if PSR.ECI says we should skip beat 1, we mustn't update the * 01 mask field. 
*/ - vpr = load_cpu_field(v7m.vpr); + TCGv_i32 vpr = load_cpu_field(v7m.vpr); switch (s->eci) { case ECI_NONE: case ECI_A0: /* Update both 01 and 23 fields */ tcg_gen_deposit_i32(vpr, vpr, - tcg_constant_i32(a->mask | (a->mask << 4)), + tcg_constant_i32(mask | (mask << 4)), R_V7M_VPR_MASK01_SHIFT, R_V7M_VPR_MASK01_LENGTH + R_V7M_VPR_MASK23_LENGTH); break; @@ -772,13 +763,25 @@ static bool trans_VPST(DisasContext *s, arg_VPST *a) case ECI_A0A1A2B0: /* Update only the 23 mask field */ tcg_gen_deposit_i32(vpr, vpr, - tcg_constant_i32(a->mask), + tcg_constant_i32(mask), R_V7M_VPR_MASK23_SHIFT, R_V7M_VPR_MASK23_LENGTH); break; default: g_assert_not_reached(); } store_cpu_field(vpr, v7m.vpr); +} + +static bool trans_VPST(DisasContext *s, arg_VPST *a) +{ + /* mask == 0 is a "related encoding" */ + if (!dc_isar_feature(aa32_mve, s) || !a->mask) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + gen_vpst(s, a->mask); mve_update_and_store_eci(s); return true; } From patchwork Thu Jul 29 11:14:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511163 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=BELPXcfa; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7Rb611Bz9sSs for ; Thu, 29 Jul 2021 21:22:35 +1000 (AEST) Received: 
from localhost ([::1]:38374 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m947V-0005Xi-JC for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:22:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40126) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940v-0000d1-40 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:47 -0400 Received: from mail-wm1-x331.google.com ([2a00:1450:4864:20::331]:40659) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940e-00017w-Mn for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:44 -0400 Received: by mail-wm1-x331.google.com with SMTP id f18-20020a05600c4e92b0290253c32620e7so6334765wmq.5 for ; Thu, 29 Jul 2021 04:15:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=ByVPvCjeaXnuNCmNSTqYK/dafUjEZDazMD4qmT4NIMQ=; b=BELPXcfazIFo32Q5v5xVEhvFb6jb+0JWMkAKNiiZChKhJXlH9kODFd2XcHCXmSk4wk sTMKAFzzw5XNUeq/dX4rBXizyC4u6OsgV/+sMS7UzHKA37uNRpNghoIO1Qii9JxjwCX9 Bg4t2QfjgaIUoJmxH0IQgdp9X+srENE4WIxnvxltkuj7qjd/bhoZJhXfHLE48MTXYEQ1 V77/QH/4BFBm8uVc/9xe2gHl4DCc2vu+6k8WOOkFs5sSJOeOdo1I574yGsbGPOfu8T+e R8ioSCZ6CyneXQqoTKkaUZCjF3JWShjmZBQkOp8VRM8tSAaRihezGi4B+iP8nU0DhIsU Nl1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ByVPvCjeaXnuNCmNSTqYK/dafUjEZDazMD4qmT4NIMQ=; b=nvNOJxUxBxF2e849pxY3h1tfUFp7KEMGA2LXQluXtl7LjVvqX5zM6+fsQ3LtsDbxg2 egVqCr8c9sCeeaLM6IkcYSkr+7c9kTSJGLLJuRsXuq1iSzXuM5U7RWLEKDj9YU2kDH4p eIUBPAHwRDdpyCtwp2vP4TvlDwjI57ClCW3L/mTF7l6zQ+lClA28mt3MkocVjHJi75s9 miSHjkUifNyB5DcRbVfkML+cnXQXgdZ39Vilptb4EO/IhMzBd2hExUBKuGp99KsKGBBw 
YVHLsyp8rLlbcAKH7tBJ6oigQyre+J4UrxYktGjDhZDPEQdjAWRAmUsQsqQZb/o7NIa3 7qpA== X-Gm-Message-State: AOAM531G3D42aWw7L5qSELxcDfYF5EoNQI+rZ0SY8Fv84tzYddy7XQC/ Xc7tYtrwYkqwOfDZEVmTXOk7091C5fSwfQ== X-Google-Smtp-Source: ABdhPJxbc30YLAGh8nlXHpMpbp0YiwZ1NoDipXOgOEcCaBwiHtB7MXKhBXEbFT+eoSTe9DOluA1ycQ== X-Received: by 2002:a05:600c:4141:: with SMTP id h1mr4131651wmm.83.1627557327315; Thu, 29 Jul 2021 04:15:27 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:26 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 15/53] target/arm: Implement MVE integer vector comparisons Date: Thu, 29 Jul 2021 12:14:34 +0100 Message-Id: <20210729111512.16541-16-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::331; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x331.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer vector comparison instructions. These are "VCMP (vector)" encodings T1, T2 and T3, and "VPT (vector)" encodings T1, T2 and T3. 
These insns compare corresponding elements in each vector, and update the VPR.P0 predicate bits with the results of the comparison. VPT also sets the VPR.MASK01 and VPR.MASK23 fields -- it is effectively "VCMP then VPST". Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 32 ++++++++++++++++++++++ target/arm/mve.decode | 18 +++++++++++- target/arm/mve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 47 ++++++++++++++++++++++++++++++++ 4 files changed, 152 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index b9af03cc03b..ca5a6ab51cc 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -480,3 +480,35 @@ DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_uqrshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_sqrshr, TCG_CALL_NO_RWG, i32, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpeqb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpeqh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpeqw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpneb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpneh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpnew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpcsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpcsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpcsw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmphib, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmphih, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmphiw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpgeb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgeh, TCG_CALL_NO_WG, void, env, ptr, 
ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpltb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmplth, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpltw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpgtb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgth, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 88c9c18ebf1..76bbf9a6136 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -37,6 +37,7 @@ &2shift qd qm shift size &vidup qd rn size imm &viwdup qd rn rm size imm +&vcmp qm qn size mask @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -86,6 +87,10 @@ @2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \ size=2 shift=%rshift_i5 +# Vector comparison; 4-bit Qm but 3-bit Qn +%mask_22_13 22:1 13:3 +@vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 + # Vector loads and stores # Widening loads and narrowing stores: @@ -345,7 +350,6 @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar } # Predicate operations -%mask_22_13 22:1 13:3 VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 # Logical immediate operations (1 reg and modified-immediate) @@ -458,3 +462,15 @@ VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd + +# Comparisons. 
We expand out the conditions which are split across +# encodings T1, T2, T3 and the fc bits. These include VPT, which is +# effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero. +VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp +VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp +VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp +VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp +VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp +VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp +VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 38b4181db2a..b0b380b94b0 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1758,3 +1758,59 @@ static uint32_t do_sub_wrap(uint32_t offset, uint32_t wrap, uint32_t imm) DO_VIDUP_ALL(vidup, DO_ADD) DO_VIWDUP_ALL(viwdup, do_add_wrap) DO_VIWDUP_ALL(vdwdup, do_sub_wrap) + +/* + * Vector comparison. + * P0 bits for non-executed beats (where eci_mask is 0) are unchanged. + * P0 bits for predicated lanes in executed beats (where mask is 0) are 0. + * P0 bits otherwise are updated with the results of the comparisons. + * We must also keep unchanged the MASK fields at the top of v7m.vpr. 
+ */ +#define DO_VCMP(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, void *vm) \ + { \ + TYPE *n = vn, *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + uint16_t beatpred = 0; \ + uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++) { \ + bool r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)]); \ + /* Comparison sets 0/1 bits for each byte in the element */ \ + beatpred |= r * emask; \ + emask <<= ESIZE; \ + } \ + beatpred &= mask; \ + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \ + (beatpred & eci_mask); \ + mve_advance_vpt(env); \ + } + +#define DO_VCMP_S(OP, FN) \ + DO_VCMP(OP##b, 1, int8_t, FN) \ + DO_VCMP(OP##h, 2, int16_t, FN) \ + DO_VCMP(OP##w, 4, int32_t, FN) + +#define DO_VCMP_U(OP, FN) \ + DO_VCMP(OP##b, 1, uint8_t, FN) \ + DO_VCMP(OP##h, 2, uint16_t, FN) \ + DO_VCMP(OP##w, 4, uint32_t, FN) + +#define DO_EQ(N, M) ((N) == (M)) +#define DO_NE(N, M) ((N) != (M)) +#define DO_EQ(N, M) ((N) == (M)) +#define DO_EQ(N, M) ((N) == (M)) +#define DO_GE(N, M) ((N) >= (M)) +#define DO_LT(N, M) ((N) < (M)) +#define DO_GT(N, M) ((N) > (M)) +#define DO_LE(N, M) ((N) <= (M)) + +DO_VCMP_U(vcmpeq, DO_EQ) +DO_VCMP_U(vcmpne, DO_NE) +DO_VCMP_U(vcmpcs, DO_GE) +DO_VCMP_U(vcmphi, DO_GT) +DO_VCMP_S(vcmpge, DO_GE) +DO_VCMP_S(vcmplt, DO_LT) +DO_VCMP_S(vcmpgt, DO_GT) +DO_VCMP_S(vcmple, DO_LE) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 6d8da361469..2d7211b5271 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -43,6 +43,7 @@ typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); +typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); /* Return the offset of a Qn 
register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1182,3 +1183,49 @@ static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a) }; return do_viwdup(s, a, fns[a->size]); } + +static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) +{ + TCGv_ptr qn, qm; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) || + !fn) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qn, qm); + tcg_temp_free_ptr(qn); + tcg_temp_free_ptr(qm); + if (a->mask) { + /* VPT */ + gen_vpst(s, a->mask); + } + mve_update_eci(s); + return true; +} + +#define DO_VCMP(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \ + { \ + static MVEGenCmpFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vcmp(s, a, fns[a->size]); \ + } + +DO_VCMP(VCMPEQ, vcmpeq) +DO_VCMP(VCMPNE, vcmpne) +DO_VCMP(VCMPCS, vcmpcs) +DO_VCMP(VCMPHI, vcmphi) +DO_VCMP(VCMPGE, vcmpge) +DO_VCMP(VCMPLT, vcmplt) +DO_VCMP(VCMPGT, vcmpgt) +DO_VCMP(VCMPLE, vcmple) From patchwork Thu Jul 29 11:14:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511180 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=puRrpj8+; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7kk6k5mz9sSs for ; Thu, 29 Jul 2021 21:35:42 +1000 (AEST) Received: from localhost ([::1]:51452 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94KC-00084l-L6 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:35:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40236) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940y-0000fx-VP for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:48 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]:55132) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940f-00018n-Ne for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:48 -0400 Received: by mail-wm1-x329.google.com with SMTP id b128so3465223wmb.4 for ; Thu, 29 Jul 2021 04:15:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=iZpgClRSEyhYOR+Os3hp/BipliKUZ2Q4cpKVQjciGAc=; b=puRrpj8+LWlGrdyULw2nyGEKw2eR29UmKr6UCI6ZunTUJWUsw/w+t+VJbGcChkbsXu VPbc83hsoQAIpOuArKLQUNfFe+rKp5I6T8wBN0KMGJ6A6FpB9FhdJt+MKPAYYiaVCwGH DdksCqVxm5n4YTCocDkcou7arq37doRH6FjwSUeAoTL58Cn823KGSho9nvWHbkoQly6y SAoYEyJZ8ZvRKT/P93SxQqrFgc/JgzWq+VjM7LkmJvAzAX2gNAEtdEnzqBoILeuhnImv cDGh9hRpNCRLvbgzjIJVC9rAUjwq7fdGT17UWlfqjUBQpRJvgHlUC2sWhLJc2GyEba8e nvlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=iZpgClRSEyhYOR+Os3hp/BipliKUZ2Q4cpKVQjciGAc=; b=GKgOHtAUbOU6yQIAJ0+4blW+8AzuUGe33wbYFXpHbKMN7OC16+pW1cnldPYZQXO7+m 2+GRUkI4AQgjuMl01z5OdNQbFmzIBb8AR3XgY1Q9G17YcT0asfnGmHxf+KGAwgt1i2h2 
AoYZ7a9rV751IB/UvHuQwnKCN1HDKvjOsRKLMSY918SrisLX49NVaAYMslo5J53FKNyW VLCYBzP3wvw71G3BWMKTqaFPemf3lymLa4FcxXutawkc67ZnFyE0bC/vL1tn2b2NWJ0d d7kqzTtgKERqksPy6eHMaxeXpxvIYKYV2lH/PhGRY6UNJsElsSmCrRQSMwyCVnqopd/q b72w== X-Gm-Message-State: AOAM5316CAN0iGIh/0rbPRg7lqWFX4VJXTOYh7e13oS/hnizOsuGlYo0 B5dPE8aIoi4SDZwBUCK76FlB8zF2YBqHHA== X-Google-Smtp-Source: ABdhPJx/WH/7QXT1u+thuHwThH581bSYbap4eYowOIL65Hhd+ElTwPFDWhFwoGmxDGiZ4DYvmS+aYg== X-Received: by 2002:a1c:238e:: with SMTP id j136mr4271650wmj.91.1627557328109; Thu, 29 Jul 2021 04:15:28 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:27 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 16/53] target/arm: Implement MVE integer vector-vs-scalar comparisons Date: Thu, 29 Jul 2021 12:14:35 +0100 Message-Id: <20210729111512.16541-17-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x329.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer vector comparison instructions that compare each element against a scalar 
from a general purpose register. These are "VCMP (vector)" encodings T4, T5 and T6 and "VPT (vector)" encodings T4, T5 and T6. We have to move the decodetree pattern for VPST, because it overlaps with VCMP T4 with size = 0b11. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 32 +++++++++++++++++++++++++++ target/arm/mve.decode | 18 +++++++++++++--- target/arm/mve_helper.c | 44 +++++++++++++++++++++++++++++++------- target/arm/translate-mve.c | 43 +++++++++++++++++++++++++++++++++++++ 4 files changed, 126 insertions(+), 11 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index ca5a6ab51cc..4f9903e66ef 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -512,3 +512,35 @@ DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + 
+DEF_HELPER_FLAGS_3(mve_vcmpge_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpge_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpge_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmple_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmple_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 76bbf9a6136..ef708ba80ff 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -38,6 +38,7 @@ &vidup qd rn size imm &viwdup qd rn rm size imm &vcmp qm qn size mask +&vcmp_scalar qn rm size mask @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -90,6 +91,8 @@ # Vector comparison; 4-bit Qm but 3-bit Qn %mask_22_13 22:1 13:3 @vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 +@vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \ + mask=%mask_22_13 # Vector loads and stores @@ -349,9 +352,6 @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar rdahi=%rdahi rdalo=%rdalo } -# Predicate operations -VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 - # Logical immediate operations (1 reg and modified-immediate) # The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but @@ -474,3 +474,15 @@ VCMPGE 1111 1110 0 . .. ... 
1 ... 1 1111 0 0 . 0 ... 0 @vcmp VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp + +{ + VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 + VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar +} +VCMPNE_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 0 0 .... @vcmp_scalar +VCMPCS_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 1 0 .... @vcmp_scalar +VCMPHI_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 1 0 .... @vcmp_scalar +VCMPGE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 0 0 .... @vcmp_scalar +VCMPLT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 0 0 .... @vcmp_scalar +VCMPGT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 1 0 .... @vcmp_scalar +VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index b0b380b94b0..1a021a9a817 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1787,15 +1787,43 @@ DO_VIWDUP_ALL(vdwdup, do_sub_wrap) mve_advance_vpt(env); \ } -#define DO_VCMP_S(OP, FN) \ - DO_VCMP(OP##b, 1, int8_t, FN) \ - DO_VCMP(OP##h, 2, int16_t, FN) \ - DO_VCMP(OP##w, 4, int32_t, FN) +#define DO_VCMP_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + uint32_t rm) \ + { \ + TYPE *n = vn; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + uint16_t beatpred = 0; \ + uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++) { \ + bool r = FN(n[H##ESIZE(e)], (TYPE)rm); \ + /* Comparison sets 0/1 bits for each byte in the element */ \ + beatpred |= r * emask; \ + emask <<= ESIZE; \ + } \ + beatpred &= mask; \ + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \ + (beatpred & eci_mask); \ + mve_advance_vpt(env); \ + } -#define DO_VCMP_U(OP, FN) \ - DO_VCMP(OP##b, 1, uint8_t, FN) \ - 
DO_VCMP(OP##h, 2, uint16_t, FN) \ - DO_VCMP(OP##w, 4, uint32_t, FN) +#define DO_VCMP_S(OP, FN) \ + DO_VCMP(OP##b, 1, int8_t, FN) \ + DO_VCMP(OP##h, 2, int16_t, FN) \ + DO_VCMP(OP##w, 4, int32_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarb, 1, int8_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarh, 2, int16_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarw, 4, int32_t, FN) + +#define DO_VCMP_U(OP, FN) \ + DO_VCMP(OP##b, 1, uint8_t, FN) \ + DO_VCMP(OP##h, 2, uint16_t, FN) \ + DO_VCMP(OP##w, 4, uint32_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarb, 1, uint8_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarh, 2, uint16_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarw, 4, uint32_t, FN) #define DO_EQ(N, M) ((N) == (M)) #define DO_NE(N, M) ((N) != (M)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 2d7211b5271..6c6f159aa3e 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -44,6 +44,7 @@ typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); +typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1209,6 +1210,37 @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) return true; } +static bool do_vcmp_scalar(DisasContext *s, arg_vcmp_scalar *a, + MVEGenScalarCmpFn *fn) +{ + TCGv_ptr qn; + TCGv_i32 rm; + + if (!dc_isar_feature(aa32_mve, s) || !fn || a->rm == 13) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + if (a->rm == 15) { + /* Encoding Rm=0b1111 means "constant zero" */ + rm = tcg_constant_i32(0); + } else { + rm = load_reg(s, a->rm); + } + fn(cpu_env, qn, rm); + tcg_temp_free_ptr(qn); + tcg_temp_free_i32(rm); + if (a->mask) 
{ + /* VPT */ + gen_vpst(s, a->mask); + } + mve_update_eci(s); + return true; +} + #define DO_VCMP(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \ { \ @@ -1219,6 +1251,17 @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) NULL, \ }; \ return do_vcmp(s, a, fns[a->size]); \ + } \ + static bool trans_##INSN##_scalar(DisasContext *s, \ + arg_vcmp_scalar *a) \ + { \ + static MVEGenScalarCmpFn * const fns[] = { \ + gen_helper_mve_##FN##_scalarb, \ + gen_helper_mve_##FN##_scalarh, \ + gen_helper_mve_##FN##_scalarw, \ + NULL, \ + }; \ + return do_vcmp_scalar(s, a, fns[a->size]); \ } DO_VCMP(VCMPEQ, vcmpeq) From patchwork Thu Jul 29 11:14:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511178 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=wK6yb4kG; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7k61SgHz9sSs for ; Thu, 29 Jul 2021 21:35:10 +1000 (AEST) Received: from localhost ([::1]:49866 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Jf-000722-VE for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:35:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40264) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1m940z-0000iu-R4 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:49 -0400 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]:38903) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940j-00019o-M4 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:49 -0400 Received: by mail-wm1-x32d.google.com with SMTP id o5-20020a1c4d050000b02901fc3a62af78so6568837wmh.3 for ; Thu, 29 Jul 2021 04:15:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=H9TZmaWqhZmwH6AhFQ01QX7XQokaEmxCohWlyYnItss=; b=wK6yb4kGx51zJ2lJQyDXkel7EjQqfiMi9z+xK9u1dstCRvhouHXI0YL37JCxadYHcl FmZdXxJ82X2z1V0iBpsPX9GY7DxDu6hkz6B6c6vCKHMkKhH2OVnp09NnajPEC3bD5gKw j0vZt2N2rEOrLUZ5ha4bnrsKfDvSjTTRE/fWmmTBgPby9lEABdij6DBHmCcYn7nU9jqQ DQgGKMpxy3Pslq5+MXvIis/3QUTVoysxM3AI+cM7T9j5uFCEDp7zpzAQ9KMPJwpKJ8F5 usa8i2tHfVU5BaDt8SccpqcOfrJsE1Tgx9NDD1y6HwYrs0JdvVI2G0ayu0vAsM03n6zX 96zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=H9TZmaWqhZmwH6AhFQ01QX7XQokaEmxCohWlyYnItss=; b=SMnMGKKbJptF1S7Qtc9hFgQC/IqmiYSxKMnCvDnMLpOpsZT60IxUIdUIVLM1DJu313 C2wt6lp7EcuyOuSvTNYnoc2ksYmXgbqURyJJltMjdkfklEHtGZ/zVQPMRXpBRsaMXvJZ 9l3Mfz/aUApJiUMy9nZ/9WcBM9t36NgxOShYZcVGnFmZmu+SPtvzMVUmksyq2djLpxX0 6oOeREAkkFy9dq7vtOSOr05W3z5nD/LjUHeJJkyH6NcwCMHqZ/Dq8xL+lFpFCI2fuo+B cn+PLwtDVTbFtP8/1d7GAUhij83IRyp3B/RxlcJxSX0ku1Y0qCQkwbLFh0eaGrD+ADu+ pjsw== X-Gm-Message-State: AOAM533CDMunk76CYjtp/fcIxELXjxGVhv5b2KlX/gcaZxa5oF4yalJN QFBBxLfiFxx5MjFzrnNBFhNxr0VubC67JQ== X-Google-Smtp-Source: ABdhPJyl3O0gaMW7j4M0nJaclQW460L0TTuarWOH2KJSuqgsMM+SqIy4BrJogu7eApV+dBbHac45WQ== X-Received: by 2002:a7b:c0cc:: with SMTP id s12mr725597wmh.0.1627557328943; Thu, 29 Jul 2021 04:15:28 -0700 (PDT) Received: 
from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:28 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 17/53] target/arm: Implement MVE VPSEL Date: Thu, 29 Jul 2021 12:14:36 +0100 Message-Id: <20210729111512.16541-18-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32d; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VPSEL insn, which sets each byte of the destination vector Qd to the byte from either Qn or Qm depending on the value of the corresponding bit in VPR.P0. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 2 ++ target/arm/mve.decode | 7 +++++-- target/arm/mve_helper.c | 19 +++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 28 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 4f9903e66ef..16c4c3b8f61 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -82,6 +82,8 @@ DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index ef708ba80ff..4bd20a9a319 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -468,8 +468,11 @@ VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd # effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero. VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp -VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp -VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +{ + VPSEL 1111 1110 0 . 11 ... 1 ... 0 1111 . 0 . 0 ... 1 @2op_nosz + VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp + VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +} VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 
1 @vcmp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 1a021a9a817..03171766b57 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1842,3 +1842,22 @@ DO_VCMP_S(vcmpge, DO_GE) DO_VCMP_S(vcmplt, DO_LT) DO_VCMP_S(vcmpgt, DO_GT) DO_VCMP_S(vcmple, DO_LE) + +void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) +{ + /* + * Qd[n] = VPR.P0[n] ? Qn[n] : Qm[n] + * but note that whether bytes are written to Qd is still subject + * to (all forms of) predication in the usual way. + */ + uint64_t *d = vd, *n = vn, *m = vm; + uint16_t mask = mve_element_mask(env); + uint16_t p0 = FIELD_EX32(env->v7m.vpr, V7M_VPR, P0); + unsigned e; + for (e = 0; e < 16 / 8; e++, mask >>= 8, p0 >>= 8) { + uint64_t r = m[H8(e)]; + mergemask(&r, n[H8(e)], p0); + mergemask(&d[H8(e)], r, mask); + } + mve_advance_vpt(env); +} diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 6c6f159aa3e..aa38218e08f 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -376,6 +376,8 @@ DO_LOGIC(VORR, gen_helper_mve_vorr) DO_LOGIC(VORN, gen_helper_mve_vorn) DO_LOGIC(VEOR, gen_helper_mve_veor) +DO_LOGIC(VPSEL, gen_helper_mve_vpsel) + #define DO_2OP(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_2op *a) \ { \ From patchwork Thu Jul 29 11:14:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511189 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=cguWcA8k; 
dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7nC6WmFz9sSs for ; Thu, 29 Jul 2021 21:37:51 +1000 (AEST) Received: from localhost ([::1]:60418 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94MH-0005mY-Ja for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:37:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40268) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9410-0000kG-6Q for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:50 -0400 Received: from mail-wm1-x32a.google.com ([2a00:1450:4864:20::32a]:39822) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940j-0001AC-Mi for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:49 -0400 Received: by mail-wm1-x32a.google.com with SMTP id f14-20020a05600c154eb02902519e4abe10so6545529wmg.4 for ; Thu, 29 Jul 2021 04:15:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=sMTDRQniGICDQscZGG0hawEpXGBLFhziwkAtwxxcYNs=; b=cguWcA8kPh9Nq8axqquWdaJg8ic3M7eY5vIW4f7PkVSfjeNZALmbRZFfI9YZ10T0f0 /o1JtrFjgslxome1FR4yw4ug9UY8sY6wC49boXhxJXSynPo/KjVRBoeZgqgaIljzQLLe CAqCuW6XgdZLsoXpbhlJmzu2VTzu9tDf8Q+G8Ji1CFRJSMpypF2ard9ee2BrUVKGeVrC XiSV2sPDfNurBKGUS4f4Lpt6lF0Bcg1wcRkarl5e478Y20ACqMGSkDxO5V97gBQlIttd usH1KQKW0UmZcuVd+Ef3b55bQ/tQi8332Hz5xQwZUK7EVtq6+kmy28pC5t+OyujF9hXp lEUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sMTDRQniGICDQscZGG0hawEpXGBLFhziwkAtwxxcYNs=; 
b=EdOoCWaXQajDOXKWr8D1yeWMbBetjNLOhe6siX60DWmYUsH6xVDgYe1ILhqz30gmJj /t43K1a+OiRE8aQDUCswtBliatI5XmviGfICxLWmBKKovlxvV6kFTevvU7g0t9wPFre8 ac6K0arPBdxOuP8yBAU/mk/lJlOFUM9rzWXHt6vhrzJADc403FK+D+yep8g+uNXGC2MS tuQqHTmHLtZhsyoI+S4jPYgMn7EE6SRDA+ajOqa9pxzGcyoKiLDc0m0vJjM4J59DVbdn wuwtgtMGpoeQf42fKMAPjL2xH96ccRMGSXf2nrGxNlwJMH61tS4fUEuoEO6CgP+S3vhw CdIg== X-Gm-Message-State: AOAM532gu0PerPjPwoKoj4k8N4WBYsCb458QdnnNSGH2oHjtSu6A1pel cvSSGovoD1qizTsCFtNgD86jTQ== X-Google-Smtp-Source: ABdhPJzuOpLxYkxlMDH6YYuyGWlKuhZM39Nkvj9Bkv+2vqdWuK+MmR/NEisRFq2fLL2IwUNyuyd5vQ== X-Received: by 2002:a1c:4b0a:: with SMTP id y10mr725694wma.1.1627557329737; Thu, 29 Jul 2021 04:15:29 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:29 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 18/53] target/arm: Implement MVE VMLAS Date: Thu, 29 Jul 2021 12:14:37 +0100 Message-Id: <20210729111512.16541-19-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32a; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32a.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement 
the MVE VMLAS insn, which multiplies a vector by a vector and adds a scalar. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- changes: don't decode U bit as it does not affect output values --- target/arm/helper-mve.h | 4 ++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++ target/arm/translate-mve.c | 1 + 4 files changed, 34 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 16c4c3b8f61..715b1bbd012 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -347,6 +347,10 @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3 DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 4bd20a9a319..226b74790b3 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -345,6 +345,9 @@ VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar +# The U bit (28) is don't-care because it does not affect the result +VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar + # Vector add across vector { VADDV 111 u:1 1110 1111 size:2 01 ... 
0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 03171766b57..ab02a1e60f4 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -948,6 +948,22 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) mve_advance_vpt(env); \ } +/* "accumulating" version where FN takes d as well as n and m */ +#define DO_2OP_ACC_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \ + uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], \ + FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m), mask); \ + } \ + mve_advance_vpt(env); \ + } + /* provide unsigned 2-op scalar helpers for all sizes */ #define DO_2OP_SCALAR_U(OP, FN) \ DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \ @@ -958,6 +974,11 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) DO_2OP_SCALAR(OP##h, 2, int16_t, FN) \ DO_2OP_SCALAR(OP##w, 4, int32_t, FN) +#define DO_2OP_ACC_SCALAR_U(OP, FN) \ + DO_2OP_ACC_SCALAR(OP##b, 1, uint8_t, FN) \ + DO_2OP_ACC_SCALAR(OP##h, 2, uint16_t, FN) \ + DO_2OP_ACC_SCALAR(OP##w, 4, uint32_t, FN) + DO_2OP_SCALAR_U(vadd_scalar, DO_ADD) DO_2OP_SCALAR_U(vsub_scalar, DO_SUB) DO_2OP_SCALAR_U(vmul_scalar, DO_MUL) @@ -987,6 +1008,11 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +/* Vector by vector plus scalar */ +#define DO_VMLAS(D, N, M) ((N) * (D) + (M)) + +DO_2OP_ACC_SCALAR_U(vmlas, DO_VMLAS) + /* * Long saturating scalar ops. As with DO_2OP_L, TYPE and H are for the * input (smaller) type and LESIZE, LTYPE, LH for the output (long) type. 
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index aa38218e08f..b56c91db2ab 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -596,6 +596,7 @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar) DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar) DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar) DO_2OP_SCALAR(VBRSR, vbrsr) +DO_2OP_SCALAR(VMLAS, vmlas) static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a) { From patchwork Thu Jul 29 11:14:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511184 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=vRa7t17n; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7mX17Jgz9sSs for ; Thu, 29 Jul 2021 21:37:16 +1000 (AEST) Received: from localhost ([::1]:58436 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Lh-0004TA-Qb for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:37:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40396) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9415-0000sS-8c for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:55 -0400 Received: from mail-wm1-x32e.google.com ([2a00:1450:4864:20::32e]:33747) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940j-0001Bs-Ne for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:51 -0400 Received: by mail-wm1-x32e.google.com with SMTP id a192-20020a1c7fc90000b0290253b32e8796so5028522wmd.0 for ; Thu, 29 Jul 2021 04:15:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=TWbXHhQWATq+c4AfBX4ejJAoMYSLeEhYRoWPRhuFdZs=; b=vRa7t17n2lMHnGbTTomig70fxVnpv8ZwKvsm+vYye6ZuoiRSHe/hDlCqecO3lGgV1t yW2HZFUnvMcsDM6LPYOCGuPF5r/uq8yD10DPRloQ8mmNLKxi+Fn1innJQP9GbICKEtRu /a+iJPFIlrbvkwCr13axrncnfHLppZbmqBr511Lq2ONzSVsGYiML+AG1yH5tji30SvgG C2Nja8ZQwJhS76dRthFTFLpL2RVGDKr1XtSK8JSGl9rhrJ94ST5i2l9aoJM71rphQnV0 oGspbynXkErD4hormwKqWWnIAFFGve4hEzwx6qmNlOPqiMjuOaGoeHlHAZrF+F/Xb1Ri nD6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TWbXHhQWATq+c4AfBX4ejJAoMYSLeEhYRoWPRhuFdZs=; b=Z59UAun9BMFmCbwFrQ7sKy754sWd16Vuum7B4XdgQveyKtCOZJrq7jZ/e39T9Yw4XW ipNApA1+IMwuuvHWvCjlsPI0Nfgra3EzWmbsitiBKK1V9+z26mreRWpWsVC6gAqG/QBD LvfZ+F12dHoz68xsN1Jc+L9H9f8OgRqzFh2nYT8fuotEKkFjg0TqW4wE7a68v8uBGXeE RdfHASFt736WzirGpIbnx68WrimxNiJyB7gJyhHZkAz9kvvQrMA2gxGF8YVb7BCCidsz GiBnGROpVQiiJuhhgULwDo6JnQXupjaxIAn2dHDD5DupcewUnsvRKHHI9Ws+WPRy3C30 QNXg== X-Gm-Message-State: AOAM532AWkT7TujE+6HppNSBD/cOukRCV8RaeAFMV8Q/OtFG8GUrIxsQ wr/WuTe4zLhXuD0pkQtEmrMNI/oNWS9fIA== X-Google-Smtp-Source: ABdhPJwsQ88x0AC3frqw3HXl5uYZE84tMzH5K4Ss3YDViy8O6OG492eQsGty3wwyxDzp/zf7lhSBMw== X-Received: by 2002:a05:600c:1c09:: with SMTP id j9mr8753804wms.183.1627557330559; Thu, 29 Jul 2021 04:15:30 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:30 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 19/53] target/arm: Implement MVE shift-by-scalar Date: Thu, 29 Jul 2021 12:14:38 +0100 Message-Id: <20210729111512.16541-20-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32e; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32e.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE instructions which perform shifts by a scalar. These are VSHL T2, VRSHL T2, VQSHL T1 and VQRSHL T2. They take the shift amount in a general purpose register and shift every element in the vector by that amount. Mostly we can reuse the helper functions for shift-by-immediate; we do need two new helpers for VQRSHL. 
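For illustration only (not part of the patch): a rough sketch of a plain shift-by-scalar on 32-bit unsigned lanes. The function names are hypothetical, and the sketch assumes that the shift count is the signed bottom byte of Rm with negative counts shifting right; it leaves out predication, rounding and saturation entirely.

```c
#include <stdint.h>

/* Hypothetical per-lane shift: positive counts shift left, negative right. */
static uint32_t shl_u32_sketch(uint32_t x, int8_t sh)
{
    if (sh >= 32 || sh <= -32) {
        return 0;               /* everything is shifted out */
    }
    return sh >= 0 ? x << sh : x >> -sh;
}

/* Apply the same scalar shift count to every 32-bit lane of Qda. */
static void vshl_scalar_sketch(uint32_t qda[4], uint32_t rm)
{
    /* Assumption: only the low byte of Rm matters, read as signed. */
    int8_t sh = (int8_t)(rm & 0xff);
    for (unsigned e = 0; e < 4; e++) {
        qda[e] = shl_u32_sketch(qda[e], sh);
    }
}
```

Because the vector is both source and destination here, the translate code in the patch can pass the same Qda pointer twice to the existing shift-by-immediate helpers.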
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 +++++++ target/arm/mve.decode | 23 ++++++++++++++++--- target/arm/mve_helper.c | 2 ++ target/arm/translate-mve.c | 46 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 76 insertions(+), 3 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 715b1bbd012..0ee5ea3cabd 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -414,6 +414,14 @@ DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vshllbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vshllbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vshllbub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 226b74790b3..eb26b103d12 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -39,6 +39,7 @@ &viwdup qd rn rm size imm &vcmp qm qn size mask &vcmp_scalar qn rm size mask +&shl_scalar qda rm size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -88,6 +89,8 @@ @2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \ size=2 shift=%rshift_i5 +@shl_scalar .... .... .... size:2 .. .... .... .... 
rm:4 &shl_scalar qda=%qd + # Vector comparison; 4-bit Qm but 3-bit Qn %mask_22_13 22:1 13:3 @vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 @@ -320,7 +323,23 @@ VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_no VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar -VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar + +{ + VSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar + VRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar + VQSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar + VQRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar + VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar +} + +{ + VSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar + VRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar + VQSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar + VQRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar + VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar +} + VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar @@ -340,8 +359,6 @@ VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar size=%size_28 } -VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar - VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index ab02a1e60f4..ac608fc524b 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1334,6 +1334,8 @@ DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP) DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP) DO_2SHIFT_U(vrshli_u, DO_VRSHLU) DO_2SHIFT_S(vrshli_s, DO_VRSHLS) +DO_2SHIFT_SAT_U(vqrshli_u, DO_UQRSHL_OP) +DO_2SHIFT_SAT_S(vqrshli_s, DO_SQRSHL_OP) /* Shift-and-insert; we always work with 64 bits at a time */ #define DO_2SHIFT_INSERT(OP, ESIZE, SHIFTFN, MASKFN) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index b56c91db2ab..44731fc4eb7 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1003,6 +1003,52 @@ DO_2SHIFT(VRSHRI_U, vrshli_u, true) DO_2SHIFT(VSRI, vsri, false) DO_2SHIFT(VSLI, vsli, false) +static bool do_2shift_scalar(DisasContext *s, arg_shl_scalar *a, + MVEGenTwoOpShiftFn *fn) +{ + TCGv_ptr qda; + TCGv_i32 rm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qda) || + a->rm == 13 || a->rm == 15 || !fn) { + /* Rm cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qda = mve_qreg_ptr(a->qda); + rm = load_reg(s, a->rm); + fn(cpu_env, qda, qda, rm); + tcg_temp_free_ptr(qda); + tcg_temp_free_i32(rm); + mve_update_eci(s); + return true; +} + +#define DO_2SHIFT_SCALAR(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_shl_scalar *a) \ + { \ + static MVEGenTwoOpShiftFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_2shift_scalar(s, a, fns[a->size]); \ + } + +DO_2SHIFT_SCALAR(VSHL_S_scalar, vshli_s) +DO_2SHIFT_SCALAR(VSHL_U_scalar, vshli_u) +DO_2SHIFT_SCALAR(VRSHL_S_scalar, vrshli_s) +DO_2SHIFT_SCALAR(VRSHL_U_scalar, vrshli_u) +DO_2SHIFT_SCALAR(VQSHL_S_scalar, vqshli_s) +DO_2SHIFT_SCALAR(VQSHL_U_scalar, vqshli_u) +DO_2SHIFT_SCALAR(VQRSHL_S_scalar, vqrshli_s) 
+DO_2SHIFT_SCALAR(VQRSHL_U_scalar, vqrshli_u) + #define DO_VSHLL(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \ { \ From patchwork Thu Jul 29 11:14:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511193 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=x+wWvPZV; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7tt0f5dz9sSs for ; Thu, 29 Jul 2021 21:42:44 +1000 (AEST) Received: from localhost ([::1]:41946 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Qz-0004DX-1Z for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:42:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40398) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9415-0000sT-9O for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:55 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:35331) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940j-0001CG-Rx for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:51 -0400 Received: by mail-wr1-x42e.google.com with SMTP id n12so6480687wrr.2 for ; Thu, 29 Jul 2021 04:15:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; 
s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=87OSQfI+t8J/2NRNU5SXGP9uuTxhBzV2bH7+CgjkZBI=; b=x+wWvPZVN085cj9+J/UYXuN7EbyetXEivfUIiM/yMEz4Yw2t5oLGgNxqygeKA+k7g9 v0nHSpuvKXlBkvsl8ovFXogx1WwUXPcOPkyp4C/2ThQ/zVlEBjPJ8MTM5P46iCaCpc3v 0czUwcfw8edILVPfvluUhSgp/guPZiKJ6dvpcJNKWD1zhnizlbTgRetszmnHDl4S4HPS q84Dii64JKzDHTK42iBQIrbqwMynCsxuH+vDUoxgW0BTYTw9EDGIVbvI2qcyDyutHAp8 I+4oswP+io9Ju96dufQLv+HaZlT2ij8A1isB5ZIm8/+2hD6tB7E+mgf3t0lOfrpnguhp OqYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=87OSQfI+t8J/2NRNU5SXGP9uuTxhBzV2bH7+CgjkZBI=; b=mJ0NksR/u+XkzIqssKaLfq5IMGiKe2/TsxjIROJRTaYLxVcu8e+lJZ1ENx4rGUZrBE juFMLWdfsQvSDaUeKtQD+ucflMdDDQTl3jdZw27scBDKayxkRxWsWoGLAn+0oV9Osjjh ovQIVsoIjXyRnxLXJy59hyEji/WMv+yuhG+Dj+mgKQcGG5nm6iLTwSo9ZBxshMZWYzS9 DOrWhJDrl4sYOu2QNvDceZ9HKsh8iA7M+tjqc9KThqEE8lLUvKgMuX9lzbkXPaqcCiSl snUwfbTMN9ubIMnDI70qByvZdfpBZTkCysRuXEgVy3dIFutvmIwx5ERxjkBp66SbhFhS H7uA== X-Gm-Message-State: AOAM530T9jJuzT3eAAXujf1LOgINcKDG3k4wBAfE3rToL0A6LNvMr6pP jTXI8b92GMjgwYEJln2f5XuLuUVHEZ/sMw== X-Google-Smtp-Source: ABdhPJyC+3aCGillYoXTUEMRkikFdopSJ5BnvR4V73wDJc9LkCH5k3dUOiW9LciHD2GONu0OHPzq/g== X-Received: by 2002:a5d:4f8b:: with SMTP id d11mr4208559wru.351.1627557331276; Thu, 29 Jul 2021 04:15:31 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:30 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 20/53] target/arm: Move 'x' and 'a' bit definitions into vmlaldav formats Date: Thu, 29 Jul 2021 12:14:39 +0100 Message-Id: <20210729111512.16541-21-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" All the users of the vmlaldav formats have an 'x' bit in bit 12 and an 'a' bit in bit 5; move these to the format rather than specifying them in each insn pattern. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve.decode | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index eb26b103d12..bdcd660aaf4 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -305,19 +305,19 @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 &vmlaldav rdahi rdalo size qn qm x a -@vmlaldav .... .... . ... ... . ... . .... .... qm:3 . \ +@vmlaldav .... .... . ... ... . ... 
x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav -@vmlaldav_nosz .... .... . ... ... . ... . .... .... qm:3 . \ +@vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav -VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav -VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav +VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav -VMLSLDAV 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav +VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav -VRMLALDAVH_S 1110 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz -VRMLALDAVH_U 1111 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz +VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz -VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_nosz +VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 
1 @vmlaldav_nosz # Scalar operations From patchwork Thu Jul 29 11:14:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511176 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Zq0HSg9x; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7gF4w3Vz9sSs for ; Thu, 29 Jul 2021 21:32:41 +1000 (AEST) Received: from localhost ([::1]:41218 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94HH-0001EG-D4 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:32:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40238) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m940y-0000fy-WA for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:49 -0400 Received: from mail-wm1-x333.google.com ([2a00:1450:4864:20::333]:38909) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940j-0001CX-Lg for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:48 -0400 Received: by mail-wm1-x333.google.com with SMTP id o5-20020a1c4d050000b02901fc3a62af78so6568975wmh.3 for ; Thu, 29 Jul 2021 04:15:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=wKy0dzwUb/umCJpYC/xTp1e1IUqKJL1s4VW/DPW/sww=; b=Zq0HSg9xfVC9uUGymhEm0+zrU9l2afdmPdU1lg3PkCDk0mbpn4fEHZ3ia9L6wTMqxN pPVZGnZp8wL7U05z7aVY/zqN7EToqHH/zPvTzPt3PbfIRKJOD3udfbQJT0vvIv0Z/hbS vQmxyl8fyOM0S2a6xPtP5/w+sOf8OJttmHrjOXsF6nNEt+1ZX9JDzhzRLZu9JtXsUZii H+Qotk2rPjMSAYKqwJRJXfbQWPcbsDhfwIxbhRWLOTaSUv0lEm5lhNGs78/NEqwleyo4 4H+aO4cpZhWYsKRLdSega1R747io7/d71AwUH2XT2DnAc4lDhHY4AXPtLo1cmjKb8Qz4 VLCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wKy0dzwUb/umCJpYC/xTp1e1IUqKJL1s4VW/DPW/sww=; b=NfFYJVY67fiSIBJ9CB0MkP2SbYtuY1gPsv1XF4c0L8JUO0CjB5QngxQF7VtqGHzta9 Itx/r63xjJZpd61DKPCA/REP11nYsBRuiKt+nC6McRArJ+QLtSFT0xTb8E9Bxn/aXmFj za3T42e2WZHA2ZDyadrpXK8qde7bOjAjqn4b6aJDHgi65OK/vy9ePF9xKna3TapXlAGj KmuBotbcyBlVunrp8HGR36vuBVheYBWpfIF2av1PEK9UDPHPV4hOL0M9P2xCSMQFBvcD 7xrIozIEaFWPyeKZMcaxuG0QWTtdaZ/7JEER9nUs0Fk/DuNA9kqmErvWT+uLzcCyxzv9 CjAg== X-Gm-Message-State: AOAM531crYYqKOsxSy7S3LhKqAwK+vEwc+zMhnIM8Fr26C562axalzST AHyk7xrzyxkwVJXqritYwkj9QQ== X-Google-Smtp-Source: ABdhPJym6X/QUDXFX36F+j2PSRb6dq+T+8ZOSo3Xmu2uA0dslpu2epLnNjJnaOE/OBMWtgv/d5gw6A== X-Received: by 2002:a7b:c5d8:: with SMTP id n24mr4224738wmk.51.1627557332183; Thu, 29 Jul 2021 04:15:32 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:31 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 21/53] target/arm: Implement MVE integer min/max across vector Date: Thu, 29 Jul 2021 12:14:40 +0100 Message-Id: <20210729111512.16541-22-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::333; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x333.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer min/max across vector insns VMAXV, VMINV, VMAXAV and VMINAV, which find the maximum (or, for the VMIN* forms, the minimum) of the vector elements and a general purpose register, and store the result back into the general purpose register. These insns overlap with VRMLALDAVH (they use what would be RdaHi=0b110).
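For illustration only, here is a minimal standalone sketch of the byte-sized VMAXAV semantics. It assumes a plain 16-element array and a one-bit-per-byte predicate mask in place of QEMU's register file, H macros, ECI checks and VPT advance. The names vmaxav_b and pred are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the byte-sized VMAXAV helper:
 * m holds 16 signed byte elements, pred has one predicate bit
 * per byte, and ra_in is the incoming Rda value.
 */
static uint32_t vmaxav_b(const int8_t m[16], uint16_t pred, uint32_t ra_in)
{
    /* Rda is read at element size (here as an unsigned byte), not full width */
    int64_t ra = (uint8_t)ra_in;
    unsigned e;

    for (e = 0; e < 16; e++, pred >>= 1) {
        if (pred & 1) {
            int64_t v = m[e];
            if (v < 0) {
                v = -v;        /* absolute value of the vector element only */
            }
            if (v > ra) {
                ra = v;
            }
        }
    }
    return (uint32_t)ra;
}
```

Working in int64_t means that negating the most negative element (-128 for bytes) cannot overflow, and elements whose predicate bit is clear simply do not take part in the reduction.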
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- Changes v1->v2: Drop the harmless but unnecessary "take abs value of 'n'" part of do_maxa() and do_mina() --- target/arm/helper-mve.h | 20 ++++++++++++ target/arm/mve.decode | 18 +++++++++-- target/arm/mve_helper.c | 66 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 48 +++++++++++++++++++++++++++ 4 files changed, 150 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 0ee5ea3cabd..2c66fcba792 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -379,6 +379,26 @@ DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvub, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavw, TCG_CALL_NO_WG, i32, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vminvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvub, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminavb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminavh, TCG_CALL_NO_WG, i32, env, ptr, i32) 
+DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index bdcd660aaf4..83dc0300d69 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -40,6 +40,7 @@ &vcmp qm qn size mask &vcmp_scalar qn rm size mask &shl_scalar qda rm size +&vmaxv qm rda size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -97,6 +98,8 @@ @vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \ mask=%mask_22_13 +@vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm + # Vector loads and stores # Widening loads and narrowing stores: @@ -314,8 +317,19 @@ VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav -VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz -VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +{ + VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv + VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +} + +{ + VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +} VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 
1 @vmlaldav_nosz diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index ac608fc524b..924ad7f2bdc 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1254,6 +1254,72 @@ DO_VADDV(vaddvub, 1, uint8_t) DO_VADDV(vaddvuh, 2, uint16_t) DO_VADDV(vaddvuw, 4, uint32_t) +/* + * Vector max/min across vector. Unlike VADDV, we must + * read ra as the element size, not its full width. + * We work with int64_t internally for simplicity. + */ +#define DO_VMAXMINV(OP, ESIZE, TYPE, RATYPE, FN) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ + uint32_t ra_in) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *m = vm; \ + int64_t ra = (RATYPE)ra_in; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + ra = FN(ra, m[H##ESIZE(e)]); \ + } \ + } \ + mve_advance_vpt(env); \ + return ra; \ + } \ + +#define DO_VMAXMINV_U(INSN, FN) \ + DO_VMAXMINV(INSN##b, 1, uint8_t, uint8_t, FN) \ + DO_VMAXMINV(INSN##h, 2, uint16_t, uint16_t, FN) \ + DO_VMAXMINV(INSN##w, 4, uint32_t, uint32_t, FN) +#define DO_VMAXMINV_S(INSN, FN) \ + DO_VMAXMINV(INSN##b, 1, int8_t, int8_t, FN) \ + DO_VMAXMINV(INSN##h, 2, int16_t, int16_t, FN) \ + DO_VMAXMINV(INSN##w, 4, int32_t, int32_t, FN) + +/* + * Helpers for max and min of absolute values across vector: + * note that we only take the absolute value of 'm', not 'n' + */ +static int64_t do_maxa(int64_t n, int64_t m) +{ + if (m < 0) { + m = -m; + } + return MAX(n, m); +} + +static int64_t do_mina(int64_t n, int64_t m) +{ + if (m < 0) { + m = -m; + } + return MIN(n, m); +} + +DO_VMAXMINV_S(vmaxvs, DO_MAX) +DO_VMAXMINV_U(vmaxvu, DO_MAX) +DO_VMAXMINV_S(vminvs, DO_MIN) +DO_VMAXMINV_U(vminvu, DO_MIN) +/* + * VMAXAV, VMINAV treat the general purpose input as unsigned + * and the vector elements as signed. 
+ */ +DO_VMAXMINV(vmaxavb, 1, int8_t, uint8_t, do_maxa) +DO_VMAXMINV(vmaxavh, 2, int16_t, uint16_t, do_maxa) +DO_VMAXMINV(vmaxavw, 4, int32_t, uint32_t, do_maxa) +DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina) +DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina) +DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina) + #define DO_VADDLV(OP, TYPE, LTYPE) \ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ uint64_t ra) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 44731fc4eb7..2fce74f86ab 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1321,3 +1321,51 @@ DO_VCMP(VCMPGE, vcmpge) DO_VCMP(VCMPLT, vcmplt) DO_VCMP(VCMPGT, vcmpgt) DO_VCMP(VCMPLE, vcmple) + +static bool do_vmaxv(DisasContext *s, arg_vmaxv *a, MVEGenVADDVFn fn) +{ + /* + * MIN/MAX operations across a vector: compute the min or + * max of the initial value in a general purpose register + * and all the elements in the vector, and store it back + * into the general purpose register. 
+ */ + TCGv_ptr qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) || + !fn || a->rda == 13 || a->rda == 15) { + /* Rda cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qm = mve_qreg_ptr(a->qm); + rda = load_reg(s, a->rda); + fn(rda, cpu_env, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qm); + mve_update_eci(s); + return true; +} + +#define DO_VMAXV(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vmaxv *a) \ + { \ + static MVEGenVADDVFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vmaxv(s, a, fns[a->size]); \ + } + +DO_VMAXV(VMAXV_S, vmaxvs) +DO_VMAXV(VMAXV_U, vmaxvu) +DO_VMAXV(VMAXAV, vmaxav) +DO_VMAXV(VMINV_S, vminvs) +DO_VMAXV(VMINV_U, vminvu) +DO_VMAXV(VMINAV, vminav) From patchwork Thu Jul 29 11:14:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511165 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Cz8OiGqN; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7T9425Kz9sSs for ; Thu, 29 Jul 2021 21:23:57 +1000 (AEST) Received: from localhost ([::1]:42488 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1m948p-0008GA-AQ for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:23:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40392) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9415-0000sP-9N for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:55 -0400 Received: from mail-wm1-x32f.google.com ([2a00:1450:4864:20::32f]:40658) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940k-0001DF-IE for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:54 -0400 Received: by mail-wm1-x32f.google.com with SMTP id f18-20020a05600c4e92b0290253c32620e7so6334972wmq.5 for ; Thu, 29 Jul 2021 04:15:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=wLBDTVo8uPxsFMze+DTZAhzyPrmDpywm7k5ga1QcofA=; b=Cz8OiGqNYdKuO89rG3X4DMgkrWt+6eQa1VhzC1z3eZw+ifWJkcL90dliz07Gy8gWES HsRTEBONsSvtXn+iUCiwbs2oiX4q4VXfYS7AgxdECut2onxqcfS6ljXFyTzsIJjeJO9h tNNYvnJ6K3dksiM+mPE/MqlRALxixDoYJ4UBR7bpUXuNRCfYLPGmXekJb1dHxB5O0EQz WlBTzfrKk03Gap4kqeZbTAj83xrci1McizG1Xkt/Q1B+3chVqrT13abNi+oV62wMXNB1 42V9AZDvZg4U+IKP4IlcJSfobt9P6EniuXJh9M2dJJoSpKWuDxvfptru6B/U5fVOlcXz uD5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wLBDTVo8uPxsFMze+DTZAhzyPrmDpywm7k5ga1QcofA=; b=a7mi4s6dT4FxTemS0mjXKbxDREagW4Wva79m/Y1//ErJuDr0+6RWRv6lun/IorZXVm BDLXct/x/EICM+DoczjrnnYUDgftMO6ULn+xICbOv/wvspBgHEUx3Lq+TN52SBHxHTVW WTnCTkCpZWWbTK18oaYAv1RJwiY1p5y4Ym4iBHS5W6Q7vS7AInk+zdZR+MSwDdxVmLAl QMylq8oXP3ARYDUl3hsi36/8BxZRyx8mEBs2p0x0mNcCM2L7Y2s479l/AD6bpWoOmXLZ xQZlPqRht45iLco1pi6szWDX6H9KUVVnxOR+3DXijgXT4IEzl7FWrbxVsbxQj5xePyaR Hm2Q== X-Gm-Message-State: 
AOAM5333BoWGu2UtIuS4bTb0xGEr+Vd6lDIyBI3Wv0N5q9iqqVZPmvyV zL7c5dgE1DxQmzSHC2qndCddUQ== X-Google-Smtp-Source: ABdhPJyDNg3DBAIGotuOpfOtgKDJ2im4K+5Q1dO0Vi7qYEQQGaMq0nW+UQrx4s37N2+inFKa2nPOPQ== X-Received: by 2002:a05:600c:33a6:: with SMTP id o38mr7760134wmp.131.1627557333007; Thu, 29 Jul 2021 04:15:33 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:32 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 22/53] target/arm: Implement MVE VABAV Date: Thu, 29 Jul 2021 12:14:41 +0100 Message-Id: <20210729111512.16541-23-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32f; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VABAV insn, which computes absolute differences between elements of two vectors and accumulates the result into a general purpose register. 
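For illustration only, the accumulation can be sketched in standalone C for unsigned byte elements. The sketch assumes plain arrays and a one-bit-per-byte predicate mask instead of QEMU's register file, H macros and VPT handling. The names vabav_ub and pred are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the unsigned byte VABAV helper:
 * accumulate |n[e] - m[e]| into ra for every element whose
 * predicate bit is set.
 */
static uint32_t vabav_ub(const uint8_t n[16], const uint8_t m[16],
                         uint16_t pred, uint32_t ra)
{
    unsigned e;

    for (e = 0; e < 16; e++, pred >>= 1) {
        if (pred & 1) {
            /* widen first so the subtraction cannot wrap */
            int64_t n0 = n[e];
            int64_t m0 = m[e];
            uint32_t r = n0 >= m0 ? (n0 - m0) : (m0 - n0);
            ra += r;   /* the 32-bit accumulator wraps modulo 2^32 */
        }
    }
    return ra;
}
```

Comparing before subtracting keeps the difference non-negative, so the same pattern should work for both the signed and unsigned element types.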
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 7 +++++++ target/arm/mve.decode | 6 ++++++ target/arm/mve_helper.c | 26 +++++++++++++++++++++++ target/arm/translate-mve.c | 43 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 82 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 2c66fcba792..c7e7aab2cbb 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -402,6 +402,13 @@ DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64) +DEF_HELPER_FLAGS_4(mve_vabavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 83dc0300d69..c8a06edca78 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -41,6 +41,7 @@ &vcmp_scalar qn rm size mask &shl_scalar qda rm size &vmaxv qm rda size +&vabav qn qm rda size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -386,6 +387,11 @@ VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar rdahi=%rdahi rdalo=%rdalo } +@vabav .... .... .. size:2 .... rda:4 .... .... .... &vabav qn=%qn qm=%qm + +VABAV_S 111 0 1110 10 .. ... 0 .... 1111 . 0 . 0 ... 1 @vabav +VABAV_U 111 1 1110 10 .. ... 
0 .... 1111 . 0 . 0 ... 1 @vabav + # Logical immediate operations (1 reg and modified-immediate) # The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 924ad7f2bdc..fed0f3cd610 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1320,6 +1320,32 @@ DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina) DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina) DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina) +#define DO_VABAV(OP, ESIZE, TYPE) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + void *vm, uint32_t ra) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *m = vm, *n = vn; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + int64_t n0 = n[H##ESIZE(e)]; \ + int64_t m0 = m[H##ESIZE(e)]; \ + uint32_t r = n0 >= m0 ? (n0 - m0) : (m0 - n0); \ + ra += r; \ + } \ + } \ + mve_advance_vpt(env); \ + return ra; \ + } + +DO_VABAV(vabavsb, 1, int8_t) +DO_VABAV(vabavsh, 2, int16_t) +DO_VABAV(vabavsw, 4, int32_t) +DO_VABAV(vabavub, 1, uint8_t) +DO_VABAV(vabavuh, 2, uint16_t) +DO_VABAV(vabavuw, 4, uint32_t) + #define DO_VADDLV(OP, TYPE, LTYPE) \ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ uint64_t ra) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 2fce74f86ab..247f6719e6f 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -45,6 +45,7 @@ typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1369,3 +1370,45 @@ DO_VMAXV(VMAXAV, 
vmaxav) DO_VMAXV(VMINV_S, vminvs) DO_VMAXV(VMINV_U, vminvu) DO_VMAXV(VMINAV, vminav) + +static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn) +{ + /* Absolute difference accumulated across vector */ + TCGv_ptr qn, qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qm | a->qn) || + !fn || a->rda == 13 || a->rda == 15) { + /* Rda cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qm = mve_qreg_ptr(a->qm); + qn = mve_qreg_ptr(a->qn); + rda = load_reg(s, a->rda); + fn(rda, cpu_env, qn, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qm); + tcg_temp_free_ptr(qn); + mve_update_eci(s); + return true; +} + +#define DO_VABAV(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vabav *a) \ + { \ + static MVEGenVABAVFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vabav(s, a, fns[a->size]); \ + } + +DO_VABAV(VABAV_S, vabavs) +DO_VABAV(VABAV_U, vabavu) From patchwork Thu Jul 29 11:14:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511168 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ET1RyGqH; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with 
ESMTPS id 4Gb7Wz2B8Pz9sSs for ; Thu, 29 Jul 2021 21:26:23 +1000 (AEST) Received: from localhost ([::1]:49460 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94BA-0004Ts-U1 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:26:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40494) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9419-00010D-DH for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:59 -0400 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]:46072) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940l-0001Da-LO for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:55 -0400 Received: by mail-wm1-x335.google.com with SMTP id l11-20020a7bcf0b0000b0290253545c2997so3760672wmg.4 for ; Thu, 29 Jul 2021 04:15:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=rmaXRFhzA84467Map0exaJ0g7V5QOi2vgBWwnWDrh20=; b=ET1RyGqHvbbn5Y9JrAZhxQAf3wmwWc6sTU1ffrWSiIfw/xp7MtwgQApTehwyBJACU4 Qg9RNuEJ23QMH3wGOrtNYoqhghEpp21meymE2128vrVTcQz3rzHD+AgkMBH5HADULBUq Xf3XFW8+GTz6PeyMlEPkfcMH6a14NzvkEDTO6BEZRp0eK0GrhWpH+Y5e2Th4WQSpG+31 7PmhVlexBfesusWX6drOTIezN+npzmVxrX3jmpiJciqyqz39bqb2MjQdBVtDgVnlugjb /HVugMECWRJFn+wBKa3VoZkVxM177Vo82m/3txgIUOY/ya55a0ZifvYhFisNL7AObxs4 cR+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rmaXRFhzA84467Map0exaJ0g7V5QOi2vgBWwnWDrh20=; b=Kgqw1AzzTZFAiuq1BgosjH/hq6jXyp9LR3E1GFMs1WoapOmSyasqqUVmcO2u3p9xSD /Zcd3pJ+F/4UogCgTehSni0icqkcBVctIn8olKWeZd/R+9deHfhvBkGclAYyeHj5m+rH 0oaTaYzJnU0JDwN/HwqhxXKjc5RtXNf8f8F16kwXCH93K0w4/+IDx6ATa7yzi4yZpUat 
si+cy2jkpZwCtsQwCPFM9lQgvcruw09qNrU4jiddvWIFA2DtV4250bufRC7amipfFyGl JzMPbUdFy1wmCfsFwq4KFpis1deG93BkG66WTyCj04rci7VIlYn8PMARE38cGbwGzL4q n7nQ== X-Gm-Message-State: AOAM531Wuh5GfFu3cy8pmkeVAorvV4x6NSigZwVcrEqO9OfYB1zlslkj Jz0s9kKBIJP2eStL6C+ivwIo8bX0Ag2y5A== X-Google-Smtp-Source: ABdhPJz2px11uNMa7hlB4h167yOfbBmClBlcS+Ia1APmP6lzJO+wIgrJ0xqw9RMVW/ADRo1XXoPTuQ== X-Received: by 2002:a1c:a94f:: with SMTP id s76mr11588221wme.17.1627557333828; Thu, 29 Jul 2021 04:15:33 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:33 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 23/53] target/arm: Implement MVE narrowing moves Date: Thu, 29 Jul 2021 12:14:42 +0100 Message-Id: <20210729111512.16541-24-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x335.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE narrowing move insns VMOVN, VQMOVN and VQMOVUN. 
These take a double-width input, narrow it (possibly saturating) and store the result to either the top or bottom half of the output element. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 20 ++++++++++ target/arm/mve.decode | 12 ++++++ target/arm/mve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 22 +++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index c7e7aab2cbb..17484f74323 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -76,6 +76,26 @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnth, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovunbb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovunbh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovuntb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovunth, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovnbsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovnbsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntsh, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovnbub, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovnbuh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntub, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntuh, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vand, 
TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vbic, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index c8a06edca78..d295a693b18 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -153,6 +153,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h + VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op + VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -160,6 +163,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h + VMOVNB 111 1 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op + VQMOVN_BU 111 1 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -167,6 +173,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h + VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op + VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op } @@ -174,6 +183,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h + VMOVNT 111 1 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op + VQMOVN_TU 111 1 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VRMULH_U 111 1 1110 0 . .. ...1 ... 
1 1110 . 0 . 0 ... 1 @2op } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index fed0f3cd610..72c30f360ac 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1650,6 +1650,84 @@ DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH) DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B) DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H) +#define DO_VMOVN(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + LTYPE *m = vm; \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned le; \ + mask >>= ESIZE * TOP; \ + for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ + mergemask(&d[H##ESIZE(le * 2 + TOP)], \ + m[H##LESIZE(le)], mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VMOVN(vmovnbb, false, 1, uint8_t, 2, uint16_t) +DO_VMOVN(vmovnbh, false, 2, uint16_t, 4, uint32_t) +DO_VMOVN(vmovntb, true, 1, uint8_t, 2, uint16_t) +DO_VMOVN(vmovnth, true, 2, uint16_t, 4, uint32_t) + +#define DO_VMOVN_SAT(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + LTYPE *m = vm; \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + bool qc = false; \ + unsigned le; \ + mask >>= ESIZE * TOP; \ + for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ + bool sat = false; \ + TYPE r = FN(m[H##LESIZE(le)], &sat); \ + mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VMOVN_SAT_UB(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 1, uint8_t, 2, uint16_t, FN) \ + DO_VMOVN_SAT(TOP, true, 1, uint8_t, 2, uint16_t, FN) + +#define DO_VMOVN_SAT_UH(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 2, uint16_t, 4, uint32_t, FN) \ + DO_VMOVN_SAT(TOP, true, 2, uint16_t, 4, uint32_t, FN) + +#define DO_VMOVN_SAT_SB(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 1, int8_t, 2, int16_t, FN) \ + DO_VMOVN_SAT(TOP, true, 
1, int8_t, 2, int16_t, FN) + +#define DO_VMOVN_SAT_SH(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 2, int16_t, 4, int32_t, FN) \ + DO_VMOVN_SAT(TOP, true, 2, int16_t, 4, int32_t, FN) + +#define DO_VQMOVN_SB(N, SATP) \ + do_sat_bhs((int64_t)(N), INT8_MIN, INT8_MAX, SATP) +#define DO_VQMOVN_UB(N, SATP) \ + do_sat_bhs((uint64_t)(N), 0, UINT8_MAX, SATP) +#define DO_VQMOVUN_B(N, SATP) \ + do_sat_bhs((int64_t)(N), 0, UINT8_MAX, SATP) + +#define DO_VQMOVN_SH(N, SATP) \ + do_sat_bhs((int64_t)(N), INT16_MIN, INT16_MAX, SATP) +#define DO_VQMOVN_UH(N, SATP) \ + do_sat_bhs((uint64_t)(N), 0, UINT16_MAX, SATP) +#define DO_VQMOVUN_H(N, SATP) \ + do_sat_bhs((int64_t)(N), 0, UINT16_MAX, SATP) + +DO_VMOVN_SAT_SB(vqmovnbsb, vqmovntsb, DO_VQMOVN_SB) +DO_VMOVN_SAT_SH(vqmovnbsh, vqmovntsh, DO_VQMOVN_SH) +DO_VMOVN_SAT_UB(vqmovnbub, vqmovntub, DO_VQMOVN_UB) +DO_VMOVN_SAT_UH(vqmovnbuh, vqmovntuh, DO_VQMOVN_UH) +DO_VMOVN_SAT_SB(vqmovunbb, vqmovuntb, DO_VQMOVUN_B) +DO_VMOVN_SAT_SH(vqmovunbh, vqmovunth, DO_VQMOVUN_H) + uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm, uint32_t shift) { diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 247f6719e6f..5c3655efc3c 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -275,6 +275,28 @@ DO_1OP(VCLS, vcls) DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) +/* Narrowing moves: only size 0 and 1 are valid */ +#define DO_VMOVN(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + static MVEGenOneOpFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + NULL, \ + NULL, \ + }; \ + return do_1op(s, a, fns[a->size]); \ + } + +DO_VMOVN(VMOVNB, vmovnb) +DO_VMOVN(VMOVNT, vmovnt) +DO_VMOVN(VQMOVUNB, vqmovunb) +DO_VMOVN(VQMOVUNT, vqmovunt) +DO_VMOVN(VQMOVN_BS, vqmovnbs) +DO_VMOVN(VQMOVN_TS, vqmovnts) +DO_VMOVN(VQMOVN_BU, vqmovnbu) +DO_VMOVN(VQMOVN_TU, vqmovntu) + static bool trans_VREV16(DisasContext *s, arg_1op *a) { static MVEGenOneOpFn * const fns[] = { From 
patchwork Thu Jul 29 11:14:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511169 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=xwx3JCKF; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7XN2v2gz9sSs for ; Thu, 29 Jul 2021 21:26:44 +1000 (AEST) Received: from localhost ([::1]:51234 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94BW-0005lH-4d for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:26:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40496) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9419-00010J-Ep for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:59 -0400 Received: from mail-wm1-x334.google.com ([2a00:1450:4864:20::334]:35450) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940n-0001Eu-8M for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:56 -0400 Received: by mail-wm1-x334.google.com with SMTP id u15-20020a05600c19cfb02902501bdb23cdso6596287wmq.0 for ; Thu, 29 Jul 2021 04:15:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; 
bh=bjNMiUKUCYi1YMIhagOS3VyZtwvdg9vDgXthAdg4/AU=; b=xwx3JCKFzT1nIRQEPENaVY9u3DmcBFj5Pnmh8Ahjo3p81a4sgGLs1KigrevQQWR689 ZV1XwYcT8HxrWWr1pogrYP05g/CxNqq6CdZKrMl2O4KzALREP89kb3o35H0c7pBb06ai 252qfpySUjKuQbT0lNpD4tnhBqlkkJdUw2tEm18b7r1YFTdKuoL6eT7ajQlXlIodgV3b fZvf6bATNWBWcudIbHerL0AckgWLQQDUTeVpaJpUYL1Jvn2YHoL7ZULHzKJNxYYDSmDq MjV3F4ec9tHOPtfA3Y2dIBlchtOGaApBPz+Jq1MDLnvCE5t9a1PXhh8qXdeGfv6/Hc0a 6M3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bjNMiUKUCYi1YMIhagOS3VyZtwvdg9vDgXthAdg4/AU=; b=VqLvdEbxLkeAooZ8BtqysxSruLqaNe87BnHAIF2rAiJ08ZGAzsTH+N+x0AOE+0ZDNa B/8fBwAq/D2MtXemlT9o9OkMZpeksjAsbor0PUM4CYhv76ClV5K56XHr6VFxx18pmt9f dMPSdXP+orG/0JHn7HAoRC0dG0KFbubGhhAQHV+BZ+Ty6QWRQhY+I9khazEPou4yZCSX v/1RWXaFSruzX3QUcrylTLp1kbDb8PLvAcU5P81E+QIqpVFM4Dl6sPip6y4JEWAoJIdo rnIopEslai5ACNFDyyhrftnEdkFUNbfo63+akWvIV0C6oxiyyEQcVuNyeDBu2XQRhPGY tdQw== X-Gm-Message-State: AOAM530v3bcMVX4DsQ3I2omAicBEwx/wMdwZanD49v+DhW5TD58GZIjB hl6aLj+bZPyCkM1M1raAaPo9Bo/p+o8AgA== X-Google-Smtp-Source: ABdhPJywkt57HvKZb3WF0r5DN65lqeODZkFLviJo9aqgVUHni1w9syR3v/uvavtN4JPH+OpvpO0E8g== X-Received: by 2002:a05:600c:3b98:: with SMTP id n24mr13998430wms.182.1627557334673; Thu, 29 Jul 2021 04:15:34 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:34 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 24/53] target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn Date: Thu, 29 Jul 2021 12:14:43 +0100 Message-Id: <20210729111512.16541-25-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::334; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x334.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The MVEGenDualAccOpFn is a bit misnamed, since it is used for the "long dual accumulate" operations that use a 64-bit accumulator. Rename it to MVEGenLongDualAccOpFn so we can use the former name for the 32-bit accumulator insns. 
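As a rough illustration of why the two accumulator widths need separate function types, here is a minimal plain-C sketch. The names LongDualAccFn, DualAccFn, long_dual_acc_h and dual_acc_h are hypothetical stand-ins invented for this example; the real typedefs take TCGv_* handles and the real helpers also handle predication and the exchange variants, which are omitted here.

```c
#include <stdint.h>

/* Hypothetical stand-ins: 8 x 16-bit lanes model one 128-bit Q register. */
#define LANES_H 8

/* "Long" flavour: 64-bit accumulator in, 64-bit accumulator out. */
typedef uint64_t LongDualAccFn(const int16_t *n, const int16_t *m, uint64_t acc);
/* Non-long flavour: 32-bit accumulator in, 32-bit accumulator out. */
typedef uint32_t DualAccFn(const int16_t *n, const int16_t *m, uint32_t acc);

/* Sum of lane products, kept at full 64-bit width. */
static uint64_t long_dual_acc_h(const int16_t *n, const int16_t *m, uint64_t acc)
{
    for (int e = 0; e < LANES_H; e++) {
        acc += (uint64_t)((int64_t)n[e] * m[e]);
    }
    return acc;
}

/* Same sum, but the accumulator wraps modulo 2^32. */
static uint32_t dual_acc_h(const int16_t *n, const int16_t *m, uint32_t acc)
{
    for (int e = 0; e < LANES_H; e++) {
        acc += (uint32_t)((int32_t)n[e] * m[e]);
    }
    return acc;
}
```

With all lanes set to 0x7fff the two functions return different values, because the 32-bit accumulator drops the carry out of bit 31; a table of one function type cannot hold pointers to the other.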
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-mve.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 5c3655efc3c..676411e05cb 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -38,7 +38,7 @@ typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); -typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); +typedef void MVEGenLongDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); @@ -652,7 +652,7 @@ static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a) } static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, - MVEGenDualAccOpFn *fn) + MVEGenLongDualAccOpFn *fn) { TCGv_ptr qn, qm; TCGv_i64 rda; @@ -710,7 +710,7 @@ static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlaldavsh, gen_helper_mve_vmlaldavxsh }, { gen_helper_mve_vmlaldavsw, gen_helper_mve_vmlaldavxsw }, @@ -721,7 +721,7 @@ static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a) static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlaldavuh, NULL }, { gen_helper_mve_vmlaldavuw, NULL }, @@ -732,7 +732,7 @@ static bool 
trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a) static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlsldavsh, gen_helper_mve_vmlsldavxsh }, { gen_helper_mve_vmlsldavsw, gen_helper_mve_vmlsldavxsw }, @@ -743,7 +743,7 @@ static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlaldavhsw, gen_helper_mve_vrmlaldavhxsw, }; return do_long_dual_acc(s, a, fns[a->x]); @@ -751,7 +751,7 @@ static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlaldavhuw, NULL, }; return do_long_dual_acc(s, a, fns[a->x]); @@ -759,7 +759,7 @@ static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlsldavhsw, gen_helper_mve_vrmlsldavhxsw, }; return do_long_dual_acc(s, a, fns[a->x]); From patchwork Thu Jul 29 11:14:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511191 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org 
header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=w7bUrh/+; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7sr4ZFWz9sV8 for ; Thu, 29 Jul 2021 21:41:52 +1000 (AEST) Received: from localhost ([::1]:39806 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94QA-0002lE-B1 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:41:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40546) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941A-00012n-By for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:00 -0400 Received: from mail-wm1-x330.google.com ([2a00:1450:4864:20::330]:37491) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940n-0001FH-8r for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:59 -0400 Received: by mail-wm1-x330.google.com with SMTP id l34-20020a05600c1d22b02902573c214807so1130935wms.2 for ; Thu, 29 Jul 2021 04:15:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=OIN56S8ygwPfGNHYDNt+mnV7Jb27Ma6YNdL6z4Tr/Uk=; b=w7bUrh/+LkNO6XW+pQ8u5fz0Eh8s9x4vnJ/OKZbbKwVAK80NJO2lG/hPRSrBp/mxD7 gsvCbs3NjG7RM4n04IEWA8PNS9jvPA5U8ORMPbjhTcfxGDy1dh20nk2TwCz7bxHFBud7 CHDJq7AZOS9dgTUKGyWWSiOHtNHVwO8y0Jy+bZUdxCW1ngCdqoKJvG1BqnQOnJ9s5iKP dundHp9ViAHqkpu+i6MMvAJ2oQP5DPwprSaPf39ThNRzuFB+3cbbjvE2ILB7Mxjh2llX dnmf/g3Jmae7Aq019MvWnft8QiHirv3GMQNL/AZX8wiOahXEHDIYmTMd9XqhbImvXedW DGUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=OIN56S8ygwPfGNHYDNt+mnV7Jb27Ma6YNdL6z4Tr/Uk=; b=THb4NOHg0BOhN8e5rexA/cXCoe8lTy17UUt6yY4yAzfLD2tiapcaeiL/I3T9vg9Ckx HQlx7vtU/FdjtmPKGWA8bpIIo1asPl9lYLn7vtq40I14wlF/MDv8/zlXvuuIM/l6f+Be DCA4nNx9d8xi0qeySHrLo+hQMptsH8CFeixyoJ43P1W2Byc0vUy9qC7ujwZrwArZH8yi q9GxYOf4A7lGuOz7yhFpkLJknk02ZIUF3xPbMd5HtBvHbjBRjYqRWic1hXzUtz61JpPx lMYaIyxHwLMlqSC9IUi1QNJFeO05PEKHWEyXWLMpboZ/WERJeDkZA9i+kpDZ+RY88u0Q 0Hxw== X-Gm-Message-State: AOAM531/ERlEXRosF7E5xVOrTtLrh8ymWwFgohH8otTMtkGnC+lfbVQy MBwGkf1HqNXg0OmskzjAWiSQbwtssj3/dg== X-Google-Smtp-Source: ABdhPJyeXkCIBWwG1gQ1uT2oXc5OyxXECwGJDfbXgqczCazuHzNHuFU1Qsw1wEuobpS2Fw+Ph+8DwQ== X-Received: by 2002:a05:600c:4e86:: with SMTP id f6mr14204477wmq.14.1627557335551; Thu, 29 Jul 2021 04:15:35 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:35 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 25/53] target/arm: Implement MVE VMLADAV and VMLSLDAV Date: Thu, 29 Jul 2021 12:14:44 +0100 Message-Id: <20210729111512.16541-26-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::330; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x330.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMLADAV and VMLSDAV insns. Like the VMLALDAV and VMLSLDAV insns already implemented, these accumulate multiplied vector elements; but they accumulate a 32-bit result rather than a 64-bit one. Note that these encodings overlap with what would be RdaHi=0b111 for VMLALDAV, VMLSLDAV, VRMLALDAVH and VRMLSLDAVH. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 17 ++++++++++ target/arm/mve.decode | 33 +++++++++++++++++--- target/arm/mve_helper.c | 41 ++++++++++++++++++++++++ target/arm/translate-mve.c | 64 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 150 insertions(+), 5 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 17484f74323..34d644a519c 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -392,6 +392,23 @@ DEF_HELPER_FLAGS_4(mve_vrmlaldavhuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vrmlsldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vrmlsldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) +DEF_HELPER_FLAGS_4(mve_vmladavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vmladavsxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(mve_vmladavsxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vaddvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvub, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index d295a693b18..cec5a51b0ee 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -320,32 +320,55 @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 %size_16 16:1 !function=plus_1 &vmlaldav rdahi rdalo size qn qm x a +&vmladav rda size qn qm x a @vmlaldav .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav @vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav -VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav -VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +@vmladav .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \ + qn=%qn rda=%rdalo size=%size_16 &vmladav +@vmladav_nosz .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \ + qn=%qn rda=%rdalo size=0 &vmladav -VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav +{ + VMLADAV_S 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav + VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +} +{ + VMLADAV_U 1111 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav + VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +} + +{ + VMLSDAV 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 1 @vmladav + VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav +} + +{ + VMLSDAV 1111 1110 1111 ... 0 ... . 1110 . 0 . 0 ... 
1 @vmladav_nosz + VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz +} + +VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz +VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz { VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv + VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz } { VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz } -VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz - # Scalar operations VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 72c30f360ac..ea206c932bc 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1189,6 +1189,47 @@ DO_LDAV(vmlsldavxsh, 2, int16_t, true, +=, -=) DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=) DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=) +/* + * Multiply add dual accumulate ops + */ +#define DO_DAV(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + void *vm, uint32_t a) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *n = vn, *m = vm; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + if (e & 1) { \ + a ODDACC \ + n[H##ESIZE(e - 1 * XCHG)] * m[H##ESIZE(e)]; \ + } else { \ + a EVENACC \ + n[H##ESIZE(e + 1 * XCHG)] * m[H##ESIZE(e)]; \ + } \ + } \ + } \ + mve_advance_vpt(env); \ + return a; \ + } + +#define DO_DAV_S(INSN, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##b, 1, int8_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##h, 2, int16_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##w, 4, int32_t, XCHG, EVENACC, ODDACC) + +#define DO_DAV_U(INSN, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##b, 1, uint8_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##h, 2, uint16_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##w, 4, uint32_t, XCHG, EVENACC, ODDACC) + +DO_DAV_S(vmladavs, false, +=, +=) +DO_DAV_U(vmladavu, false, +=, +=) +DO_DAV_S(vmlsdav, false, +=, -=) +DO_DAV_S(vmladavsx, true, +=, +=) +DO_DAV_S(vmlsdavx, true, +=, -=) + /* * Rounding multiply add long dual accumulate high. 
In the pseudocode * this is implemented with a 72-bit internal accumulator value of which diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 676411e05cb..92ed1be83e7 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -46,6 +46,7 @@ typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TC typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenDualAccOpFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -765,6 +766,69 @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) return do_long_dual_acc(s, a, fns[a->x]); } +static bool do_dual_acc(DisasContext *s, arg_vmladav *a, MVEGenDualAccOpFn *fn) +{ + TCGv_ptr qn, qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qn) || + !fn) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + qm = mve_qreg_ptr(a->qm); + + /* + * This insn is subject to beat-wise execution. Partial execution + * of an A=0 (no-accumulate) insn which does not execute the first + * beat must start with the current rda value, not 0. 
+ */ + if (a->a || mve_skip_first_beat(s)) { + rda = load_reg(s, a->rda); + } else { + rda = tcg_const_i32(0); + } + + fn(rda, cpu_env, qn, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qn); + tcg_temp_free_ptr(qm); + + mve_update_eci(s); + return true; +} + +#define DO_DUAL_ACC(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vmladav *a) \ + { \ + static MVEGenDualAccOpFn * const fns[4][2] = { \ + { gen_helper_mve_##FN##b, gen_helper_mve_##FN##xb }, \ + { gen_helper_mve_##FN##h, gen_helper_mve_##FN##xh }, \ + { gen_helper_mve_##FN##w, gen_helper_mve_##FN##xw }, \ + { NULL, NULL }, \ + }; \ + return do_dual_acc(s, a, fns[a->size][a->x]); \ + } + +DO_DUAL_ACC(VMLADAV_S, vmladavs) +DO_DUAL_ACC(VMLSDAV, vmlsdav) + +static bool trans_VMLADAV_U(DisasContext *s, arg_vmladav *a) +{ + static MVEGenDualAccOpFn * const fns[4][2] = { + { gen_helper_mve_vmladavub, NULL }, + { gen_helper_mve_vmladavuh, NULL }, + { gen_helper_mve_vmladavuw, NULL }, + { NULL, NULL }, + }; + return do_dual_acc(s, a, fns[a->size][a->x]); +} + static void gen_vpst(DisasContext *s, uint32_t mask) { /* From patchwork Thu Jul 29 11:14:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511171 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=iIqpT8aV; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
ozlabs.org (Postfix) with ESMTPS id 4Gb7ZP6T5lz9sW8 for ; Thu, 29 Jul 2021 21:28:29 +1000 (AEST) Received: from localhost ([::1]:58258 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94DD-0001zr-LI for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:28:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40510) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m9419-00011B-Mm for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:59 -0400 Received: from mail-wm1-x32c.google.com ([2a00:1450:4864:20::32c]:38903) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940n-0001FS-KI for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:15:59 -0400 Received: by mail-wm1-x32c.google.com with SMTP id o5-20020a1c4d050000b02901fc3a62af78so6569133wmh.3 for ; Thu, 29 Jul 2021 04:15:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=YspGX6RXJjhn0x59Tstxyc/gxMQjYsvkbC/RAiXTbf4=; b=iIqpT8aV6IL4OOmzIIaz6zYeMdeg/CUj9dYe/IqI/TWZOSWhSKCkhY+xscH5CV3Ij9 r8L255tLm3+CrGvmE40EkMeIn7hzJYkb6wlyGb+2+z2sthnDEHTbP9uh1ZkZUkBb7PZu I2o5br++FQRj23bKYx5y10lkUT6pfB++NIwWR2Dvzi9uWwPTOKMdFYPtHGh64hWTRdJH KLqkojyeG5KPf+ihH6LpKuNoS10bzhrxYTS66ObvaW+nvmECGQWLsuWBEulAQeaRqsGS 0DCtK8MSWXATBuqOrTDynerbX8i7RC0BHPMnxZuvaQqTKbab9raRzK/PyL50W58/Ty7Z h8Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YspGX6RXJjhn0x59Tstxyc/gxMQjYsvkbC/RAiXTbf4=; b=So9BVdeybynR8NjMNTBZrOPg0ae9rtas8kzXx35+wQQBXrNFIZrMHghRQaS2E+nC7d qJbSfGTsanrvyxn9IYMagVKDvhdolXJ4tv9gs/Oa++7dhPrvbPOn0Lu69hTZn/oC+svv egO1o1R0T0CmNbMJthaD4+zjS4MyMVv5g4f2J6F5OwsYaSJyIZ/DoEk7UysDdowUV5o0 
N1eJwdJzULjSS5tCE64KSeMayd4dlOnICLAoPIqmsxg/bLiOdia6WO5sH6TT4NbRqftT LMuT37DwBxlDWTb8bOiNMlgDyX3PwS2xWSia2UlEvp1xZWS9OVpwqze+r205Qh2PzXlZ o+EQ== X-Gm-Message-State: AOAM5304Fw6WKrXVIAgjjNRXA3w/yXuBDZTuBcTku3pOaj9WYkITtOk3 SNbkRdMoMUxjrnE4s1+NntnIHA== X-Google-Smtp-Source: ABdhPJwL/W1kA2znBs6cHB3ATOJrxS5Jox1w73yT+94Wp1kyvAmSGCnqtqgaXM7AGQcQd8FiNakUuQ== X-Received: by 2002:a7b:c301:: with SMTP id k1mr4197646wmj.165.1627557336306; Thu, 29 Jul 2021 04:15:36 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:35 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 26/53] target/arm: Implement MVE VMLA Date: Thu, 29 Jul 2021 12:14:45 +0100 Message-Id: <20210729111512.16541-27-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32c; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMLA insn, which multiplies a vector by a scalar and accumulates into another vector. 
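For readers unfamiliar with the operation, the per-element behaviour can be sketched in plain C roughly as follows. This is only an illustration under simplifying assumptions: vmla_elem_h and vmla_h are hypothetical names, the vector is modelled as eight 16-bit lanes, and predication (which the real helper applies per byte) is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in: 8 x 16-bit lanes model one 128-bit Q register. */
#define LANES_H 8

/*
 * One lane of "vector by scalar plus vector": n * m + d, truncated to the
 * element width. The multiply is done in uint32_t so that integer promotion
 * cannot overflow a signed int.
 */
static uint16_t vmla_elem_h(uint16_t d, uint16_t n, uint16_t m)
{
    return (uint16_t)((uint32_t)n * m + d);
}

/*
 * Apply the lane operation across the whole vector. The scalar arrives as a
 * 32-bit general-purpose register value and only its low 16 bits are used.
 * The destination doubles as the accumulator input.
 */
static void vmla_h(uint16_t *d, const uint16_t *n, uint32_t rm)
{
    uint16_t m = (uint16_t)rm;

    for (int e = 0; e < LANES_H; e++) {
        d[e] = vmla_elem_h(d[e], n[e], m);
    }
}
```

Because each lane keeps only the low bits of the product and sum, interpreting the inputs as signed or unsigned gives the same stored result.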
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- Changes v1->v2: don't decode U bit --- target/arm/helper-mve.h | 4 ++++ target/arm/mve.decode | 1 + target/arm/mve_helper.c | 5 +++++ target/arm/translate-mve.c | 1 + 4 files changed, 11 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 34d644a519c..328e31e2665 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -367,6 +367,10 @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3 DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlab, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlah, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlaw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index cec5a51b0ee..cd9c806a11c 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -413,6 +413,7 @@ VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar # The U bit (28) is don't-care because it does not affect the result +VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... 
@2scalar # Vector add across vector diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index ea206c932bc..8004b9bb728 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1008,6 +1008,11 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +/* Vector by scalar plus vector */ +#define DO_VMLA(D, N, M) ((N) * (M) + (D)) + +DO_2OP_ACC_SCALAR_U(vmla, DO_VMLA) + /* Vector by vector plus scalar */ #define DO_VMLAS(D, N, M) ((N) * (D) + (M)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 92ed1be83e7..f8899af352d 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -620,6 +620,7 @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar) DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar) DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar) DO_2OP_SCALAR(VBRSR, vbrsr) +DO_2OP_SCALAR(VMLA, vmla) DO_2OP_SCALAR(VMLAS, vmlas) static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a) From patchwork Thu Jul 29 11:14:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511197 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=wUoMfZQE; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with 
ESMTPS id 4Gb7zq6DYzz9sSs for ; Thu, 29 Jul 2021 21:47:03 +1000 (AEST) Received: from localhost ([::1]:52572 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94VB-0003Jc-Gp for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:47:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40620) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941C-00014c-Pl for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:02 -0400 Received: from mail-wm1-x32a.google.com ([2a00:1450:4864:20::32a]:39823) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940o-0001Fp-Ct for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:02 -0400 Received: by mail-wm1-x32a.google.com with SMTP id f14-20020a05600c154eb02902519e4abe10so6545801wmg.4 for ; Thu, 29 Jul 2021 04:15:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=TU5VLZPJYGx6Qh3NO9t/DQaQygrYYqxLMGPPbmN9qvY=; b=wUoMfZQE8zQRSzgfnhWZ7X0dprVFyfuPzzD+AlSplBKPKaEXKHOXIPzkuliYQqlgSy wCVTIGvLtpofckeMqpLRsKX187UqfQnDLWcfDjGSzVUxiIdlh0sYMxe+5I9CvRWrKkXN 4ZkhcE7bYccOSxQZ8rBf1GU7Q6F4Wde0zUMm+zDyE1LUt4DKgWB+D2gJMyiRxEh2HVcI m3qC2ggkQOJuo2TSg02bpUruUBc5tADkYOd3qQsIsG+EFcfCRXCFeg7H1q0zn8LX/239 fQEYA9trvoVI8FsVh9JACB6tjRDMFKY4pFih1eKEmr2WvYbAbXzDWTorPfZjzoo1M0BP +vwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TU5VLZPJYGx6Qh3NO9t/DQaQygrYYqxLMGPPbmN9qvY=; b=mhDtRY95pMTOW5uuZXCFxX0pOrE2cp5QQxtJrt+Ikeidy+TcOORmLroNTRQg30OSCq TwZ01xR55VKREBdCLnNmH/cJEOokJGhbgHJnICQOFrQiWecwn0U0My7b4zM0eo5E2bwp dHjS/TCWM1MaTnJzW9YywuosnY+VFSqxSULEfpMaLrEUcOazHBV5aqjugnoM8qU+GLf9 
YBZb/NzG/3Mabio/QT7P0Z8Au55HdPt+km1OPuvCyg258DbkOpnO9bzZQFEGDzHZo+OK d9qEafmiBGfiQMK7vpIS8E/kbfbJlIESatYmeFYd/wBSCC1XEmiuydvcDcnmSFbT5nEu KMpw== X-Gm-Message-State: AOAM53396kx5h3MKQbw21CkVdySrjdXEQHn1j+Tjx1CtO+fZAxTpIOPe +A7lYC21pNSxKASMb3DwS5jK7zBVxp2vKA== X-Google-Smtp-Source: ABdhPJyVImG7Il3Xclx9112Pno0PcFr5lLC8PO9pFWpO0eP/hRQnw7fh9W7lmjINoAxMTRPaknAagg== X-Received: by 2002:a1c:4409:: with SMTP id r9mr4240521wma.150.1627557337195; Thu, 29 Jul 2021 04:15:37 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:36 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 27/53] target/arm: Implement MVE saturating doubling multiply accumulates Date: Thu, 29 Jul 2021 12:14:46 +0100 Message-Id: <20210729111512.16541-28-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32a; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE saturating doubling multiply accumulate insns VQDMLAH, VQRDMLAH, VQDMLASH and VQRDMLASH. 
These perform a multiply, double, add the accumulator shifted by the element size, possibly round, saturate to twice the element size, then take the high half of the result. The *MLAH insns do vector * scalar + vector, and the *MLASH insns do vector * vector + scalar. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 16 +++++++ target/arm/mve.decode | 5 ++ target/arm/mve_helper.c | 95 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 4 ++ 4 files changed, 120 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 328e31e2665..2f54396b2df 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -375,6 +375,22 @@ DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, 
i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index cd9c806a11c..7a6de3991b6 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -416,6 +416,11 @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar +VQRDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 100 .... @2scalar +VQRDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 100 .... @2scalar +VQDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 110 .... @2scalar +VQDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 110 .... @2scalar + # Vector add across vector { VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 8004b9bb728..a69fcd2243c 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -964,6 +964,28 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) mve_advance_vpt(env); \ } +#define DO_2OP_SAT_ACC_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \ + uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + bool qc = false; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + bool sat = false; \ + mergemask(&d[H##ESIZE(e)], \ + FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m, &sat), \ + mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + /* provide unsigned 2-op scalar helpers for all sizes */ #define DO_2OP_SCALAR_U(OP, FN) \ DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \ @@ -1008,6 +1030,79 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +static int8_t 
do_vqdmlah_b(int8_t a, int8_t b, int8_t c, int round, bool *sat) +{ + int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 8) + (round << 7); + return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8; +} + +static int16_t do_vqdmlah_h(int16_t a, int16_t b, int16_t c, + int round, bool *sat) +{ + int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 16) + (round << 15); + return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16; +} + +static int32_t do_vqdmlah_w(int32_t a, int32_t b, int32_t c, + int round, bool *sat) +{ + /* + * Architecturally we should do the entire add, double, round + * and then check for saturation. We do three saturating adds, + * but we need to be careful about the order. If the first + * m1 + m2 saturates then it's impossible for the *2+rc to + * bring it back into the non-saturated range. However, if + * m1 + m2 is negative then it's possible that doing the doubling + * would take the intermediate result below INT64_MIN and the + * addition of the rounding constant then brings it back in range. + * So we add half the rounding constant and half the "c << esize" + * before doubling rather than adding the rounding constant after + * the doubling. + */ + int64_t m1 = (int64_t)a * b; + int64_t m2 = (int64_t)c << 31; + int64_t r; + if (sadd64_overflow(m1, m2, &r) || + sadd64_overflow(r, (round << 30), &r) || + sadd64_overflow(r, r, &r)) { + *sat = true; + return r < 0 ? 
INT32_MAX : INT32_MIN; + } + return r >> 32; +} + +/* + * The *MLAH insns are vector * scalar + vector; + * the *MLASH insns are vector * vector + scalar + */ +#define DO_VQDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 0, S) +#define DO_VQDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 0, S) +#define DO_VQDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 0, S) +#define DO_VQRDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 1, S) +#define DO_VQRDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 1, S) +#define DO_VQRDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 1, S) + +#define DO_VQDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 0, S) +#define DO_VQDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 0, S) +#define DO_VQDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 0, S) +#define DO_VQRDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 1, S) +#define DO_VQRDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 1, S) +#define DO_VQRDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 1, S) + +DO_2OP_SAT_ACC_SCALAR(vqdmlahb, 1, int8_t, DO_VQDMLAH_B) +DO_2OP_SAT_ACC_SCALAR(vqdmlahh, 2, int16_t, DO_VQDMLAH_H) +DO_2OP_SAT_ACC_SCALAR(vqdmlahw, 4, int32_t, DO_VQDMLAH_W) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahb, 1, int8_t, DO_VQRDMLAH_B) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahh, 2, int16_t, DO_VQRDMLAH_H) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahw, 4, int32_t, DO_VQRDMLAH_W) + +DO_2OP_SAT_ACC_SCALAR(vqdmlashb, 1, int8_t, DO_VQDMLASH_B) +DO_2OP_SAT_ACC_SCALAR(vqdmlashh, 2, int16_t, DO_VQDMLASH_H) +DO_2OP_SAT_ACC_SCALAR(vqdmlashw, 4, int32_t, DO_VQDMLASH_W) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashb, 1, int8_t, DO_VQRDMLASH_B) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashh, 2, int16_t, DO_VQRDMLASH_H) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashw, 4, int32_t, DO_VQRDMLASH_W) + /* Vector by scalar plus vector */ #define DO_VMLA(D, N, M) ((N) * (M) + (D)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index f8899af352d..e3e115c1aa9 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -622,6 +622,10 @@ DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar) DO_2OP_SCALAR(VBRSR, vbrsr) 
DO_2OP_SCALAR(VMLA, vmla) DO_2OP_SCALAR(VMLAS, vmlas) +DO_2OP_SCALAR(VQDMLAH, vqdmlah) +DO_2OP_SCALAR(VQRDMLAH, vqrdmlah) +DO_2OP_SCALAR(VQDMLASH, vqdmlash) +DO_2OP_SCALAR(VQRDMLASH, vqrdmlash) static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a) { From patchwork Thu Jul 29 11:14:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511173 From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 28/53] target/arm: Implement MVE VQABS, VQNEG Date: Thu, 29 Jul 2021 12:14:47 +0100 Message-Id: <20210729111512.16541-29-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> Implement the MVE 1-operand saturating operations VQABS and VQNEG. 
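For readers unfamiliar with the saturating forms, the per-element behaviour can be sketched in standalone C roughly as follows. This is only an illustration for the byte-sized case: sat_clamp(), sketch_vqabs_b() and sketch_vqneg_b() are hypothetical names that do not exist in QEMU, and predication and the QC flag update done by the real helpers are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: clamp val to [min, max]; set *sat if clamping occurred. */
static int64_t sat_clamp(int64_t val, int64_t min, int64_t max, bool *sat)
{
    if (val > max) {
        *sat = true;
        return max;
    }
    if (val < min) {
        *sat = true;
        return min;
    }
    return val;
}

/* Saturating absolute value of one signed byte element. */
static int8_t sketch_vqabs_b(int8_t n, bool *sat)
{
    /* Widen before negating so that -128 becomes +128 without overflow. */
    int64_t wide = n;
    return (int8_t)sat_clamp(wide < 0 ? -wide : wide, INT8_MIN, INT8_MAX, sat);
}

/* Saturating negation of one signed byte element. */
static int8_t sketch_vqneg_b(int8_t n, bool *sat)
{
    return (int8_t)sat_clamp(-(int64_t)n, INT8_MIN, INT8_MAX, sat);
}
```

The only input that saturates in either operation is the most negative value of the element type, since its negation is not representable in that type.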
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 50 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 2f54396b2df..f9345bfafc7 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -76,6 +76,14 @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 7a6de3991b6..a05b882f9d9 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -279,6 +279,9 @@ VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op VNEG 1111 1111 1 . 11 .. 01 ... 0 0011 11 . 0 ... 0 @1op VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op +VQABS 1111 1111 1 . 11 .. 00 ... 0 0111 01 . 0 ... 0 @1op +VQNEG 1111 1111 1 . 11 .. 00 ... 0 0111 11 . 0 ... 0 @1op + &vdup qd rt size # Qd is in the fields usually named Qn @vdup .... .... . . .. ... . rt:4 .... . . . . .... 
qd=%qn &vdup diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index a69fcd2243c..6539012ddd8 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2200,3 +2200,40 @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) } mve_advance_vpt(env); } + +#define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + TYPE *d = vd, *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + bool qc = false; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + bool sat = false; \ + mergemask(&d[H##ESIZE(e)], FN(m[H##ESIZE(e)], &sat), mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VQABS_B(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT8_MIN, INT8_MAX, SATP) +#define DO_VQABS_H(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT16_MIN, INT16_MAX, SATP) +#define DO_VQABS_W(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT32_MIN, INT32_MAX, SATP) + +#define DO_VQNEG_B(N, SATP) do_sat_bhs(-(int64_t)N, INT8_MIN, INT8_MAX, SATP) +#define DO_VQNEG_H(N, SATP) do_sat_bhs(-(int64_t)N, INT16_MIN, INT16_MAX, SATP) +#define DO_VQNEG_W(N, SATP) do_sat_bhs(-(int64_t)N, INT32_MIN, INT32_MAX, SATP) + +DO_1OP_SAT(vqabsb, 1, int8_t, DO_VQABS_B) +DO_1OP_SAT(vqabsh, 2, int16_t, DO_VQABS_H) +DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W) + +DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B) +DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H) +DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index e3e115c1aa9..f2213ec8cde 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -275,6 +275,8 @@ DO_1OP(VCLZ, vclz) DO_1OP(VCLS, vcls) DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) +DO_1OP(VQABS, vqabs) +DO_1OP(VQNEG, vqneg) /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ From patchwork Thu Jul 29 11:14:48 2021 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511174 From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 29/53] target/arm: Implement MVE VMAXA, VMINA Date: Thu, 29 Jul 2021 12:14:48 +0100 Message-Id: <20210729111512.16541-30-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> Implement the MVE VMAXA and VMINA insns, which take the absolute value of the signed elements in the input vector and then accumulate the unsigned max or min into the destination vector. 
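As a rough standalone illustration of the per-element operation just described (byte-sized elements only, predication ignored), one might write something like the sketch below. The sketch_* names are made up for this example and are not part of QEMU.

```c
#include <stdint.h>

/* Absolute value of a signed byte, returned as unsigned so that -128 maps to 128. */
static uint8_t sketch_abs_u8(int8_t m)
{
    int v = m;                      /* widen first to avoid overflow on -128 */
    return (uint8_t)(v < 0 ? -v : v);
}

/* VMAXA-style step: the destination element is treated as unsigned. */
static uint8_t sketch_vmaxa_b(uint8_t d, int8_t m)
{
    uint8_t r = sketch_abs_u8(m);
    return d > r ? d : r;
}

/* VMINA-style step: the destination element is treated as unsigned. */
static uint8_t sketch_vmina_b(uint8_t d, int8_t m)
{
    uint8_t r = sketch_abs_u8(m);
    return d < r ? d : r;
}
```

Because the comparison is unsigned, the absolute value of the most negative input (128 for bytes) is representable and no saturation is needed.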
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 4 ++++ target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 40 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index f9345bfafc7..651020aaad8 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -84,6 +84,14 @@ DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxab, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxah, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxaw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vminab, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vminah, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vminaw, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index a05b882f9d9..0955ed0cc22 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -156,6 +156,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMAXA 111 0 1110 0 . 11 .. 11 ... 0 1110 1 0 . 0 ... 1 @1op + VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -176,6 +178,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VMINA 111 0 1110 0 . 11 .. 11 ... 
1 1110 1 0 . 0 ... 1 @1op + VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 6539012ddd8..d326205cbf0 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2237,3 +2237,29 @@ DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W) DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B) DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H) DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W) + +/* + * VMAXA, VMINA: vd is unsigned; vm is signed, and we take its + * absolute value; we then do an unsigned comparison. + */ +#define DO_VMAXMINA(OP, ESIZE, STYPE, UTYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + UTYPE *d = vd; \ + STYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + UTYPE r = DO_ABS(m[H##ESIZE(e)]); \ + r = FN(d[H##ESIZE(e)], r); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VMAXMINA(vmaxab, 1, int8_t, uint8_t, DO_MAX) +DO_VMAXMINA(vmaxah, 2, int16_t, uint16_t, DO_MAX) +DO_VMAXMINA(vmaxaw, 4, int32_t, uint32_t, DO_MAX) +DO_VMAXMINA(vminab, 1, int8_t, uint8_t, DO_MIN) +DO_VMAXMINA(vminah, 2, int16_t, uint16_t, DO_MIN) +DO_VMAXMINA(vminaw, 4, int32_t, uint32_t, DO_MIN) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index f2213ec8cde..02c26987a2d 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -277,6 +277,8 @@ DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) DO_1OP(VQABS, vqabs) DO_1OP(VQNEG, vqneg) +DO_1OP(VMAXA, vmaxa) +DO_1OP(VMINA, vmina) /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ From patchwork Thu Jul 29 11:14:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511177 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org 
From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 30/53] target/arm: Implement MVE VMOV to/from 2 general-purpose registers Date: Thu, 29 Jul 2021 12:14:49 +0100 Message-Id: <20210729111512.16541-31-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Implement the MVE VMOV forms that move data between 2 general-purpose registers and 2 32-bit lanes in a vector register. 
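The lane selection can be pictured with the following standalone sketch. It assumes, based on the Dreg indexing used in the patch below (vd = qd * 2, then element idx of vd and of vd + 1), that the two lanes touched are 32-bit lanes idx and idx + 2 of the Q register. SketchQReg and the sketch_vmov_* functions are hypothetical names for illustration; the ECI beat-skipping logic and the UNPREDICTABLE register checks are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit Q register as four 32-bit lanes. */
typedef struct {
    uint32_t lane[4];
} SketchQReg;

/* Vector lanes to two general-purpose registers; idx must be 0 or 1. */
static void sketch_vmov_to_2gp(const SketchQReg *q, int idx,
                               uint32_t *rt, uint32_t *rt2)
{
    *rt = q->lane[idx];          /* element idx of the low 64-bit half */
    *rt2 = q->lane[idx + 2];     /* element idx of the high 64-bit half */
}

/* Two general-purpose registers to vector lanes; idx must be 0 or 1. */
static void sketch_vmov_from_2gp(SketchQReg *q, int idx,
                                 uint32_t rt, uint32_t rt2)
{
    q->lane[idx] = rt;
    q->lane[idx + 2] = rt2;
}
```

The other two lanes of the Q register are left untouched by the "from" form.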
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-a32.h | 1 + target/arm/mve.decode | 4 ++ target/arm/translate-mve.c | 85 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-vfp.c | 2 +- 4 files changed, 91 insertions(+), 1 deletion(-) diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h index 6dfcafe1796..6f4d65ddb00 100644 --- a/target/arm/translate-a32.h +++ b/target/arm/translate-a32.h @@ -49,6 +49,7 @@ void gen_rev16(TCGv_i32 dest, TCGv_i32 var); void clear_eci_state(DisasContext *s); bool mve_eci_check(DisasContext *s); void mve_update_and_store_eci(DisasContext *s); +bool mve_skip_vmov(DisasContext *s, int vn, int index, int size); static inline TCGv_i32 load_cpu_offset(int offset) { diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 0955ed0cc22..774ee2a1a5b 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -136,6 +136,10 @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \ size=2 p=1 +# Moves between 2 32-bit vector lanes and 2 general purpose registers +VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd +VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd + # Vector 2-op VAND 1110 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz VBIC 1110 1111 0 . 01 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 02c26987a2d..93707fdd681 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1507,3 +1507,88 @@ static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn) DO_VABAV(VABAV_S, vabavs) DO_VABAV(VABAV_U, vabavu) + +static bool trans_VMOV_to_2gp(DisasContext *s, arg_VMOV_to_2gp *a) +{ + /* + * VMOV two 32-bit vector lanes to two general-purpose registers. 
+ * This insn is not predicated but it is subject to beat-wise + * execution if it is not in an IT block. For us this means + * only that if PSR.ECI says we should not be executing the beat + * corresponding to the lane of the vector register being accessed + * then we should skip performing the move, and that we need to do + * the usual check for bad ECI state and advance of ECI state. + * (If PSR.ECI is non-zero then we cannot be in an IT block.) + */ + TCGv_i32 tmp; + int vd; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) || + a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15 || + a->rt == a->rt2) { + /* Rt/Rt2 cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* Convert Qreg index to Dreg for read_neon_element32() etc */ + vd = a->qd * 2; + + if (!mve_skip_vmov(s, vd, a->idx, MO_32)) { + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, vd, a->idx, MO_32); + store_reg(s, a->rt, tmp); + } + if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) { + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, vd + 1, a->idx, MO_32); + store_reg(s, a->rt2, tmp); + } + + mve_update_and_store_eci(s); + return true; +} + +static bool trans_VMOV_from_2gp(DisasContext *s, arg_VMOV_to_2gp *a) +{ + /* + * VMOV two general-purpose registers to two 32-bit vector lanes. + * This insn is not predicated but it is subject to beat-wise + * execution if it is not in an IT block. For us this means + * only that if PSR.ECI says we should not be executing the beat + * corresponding to the lane of the vector register being accessed + * then we should skip performing the move, and that we need to do + * the usual check for bad ECI state and advance of ECI state. + * (If PSR.ECI is non-zero then we cannot be in an IT block.) 
+ */ + TCGv_i32 tmp; + int vd; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) || + a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15) { + /* Rt/Rt2 cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* Convert Qreg idx to Dreg for read_neon_element32() etc */ + vd = a->qd * 2; + + if (!mve_skip_vmov(s, vd, a->idx, MO_32)) { + tmp = load_reg(s, a->rt); + write_neon_element32(tmp, vd, a->idx, MO_32); + tcg_temp_free_i32(tmp); + } + if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) { + tmp = load_reg(s, a->rt2); + write_neon_element32(tmp, vd + 1, a->idx, MO_32); + tcg_temp_free_i32(tmp); + } + + mve_update_and_store_eci(s); + return true; +} diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c index b2991e21ec7..e2eb797c829 100644 --- a/target/arm/translate-vfp.c +++ b/target/arm/translate-vfp.c @@ -581,7 +581,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) return true; } -static bool mve_skip_vmov(DisasContext *s, int vn, int index, int size) +bool mve_skip_vmov(DisasContext *s, int vn, int index, int size) { /* * In a CPU with MVE, the VMOV (vector lane to general-purpose register) From patchwork Thu Jul 29 11:14:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511198 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SOZtKpR6; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb81N4PRsz9sSs for ; Thu, 29 Jul 2021 21:48:24 +1000 (AEST) Received: from localhost ([::1]:56504 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94WU-0006Ly-8R for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:48:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40704) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941G-00019K-Ij for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:06 -0400 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]:52898) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940s-0001Ir-VN for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:06 -0400 Received: by mail-wm1-x335.google.com with SMTP id n11so3469446wmd.2 for ; Thu, 29 Jul 2021 04:15:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=ZsgiYwh7qNLAcSkC7LxnVfqa/pmBeTfF8iAebdcRKSo=; b=SOZtKpR6e95OuMwwg5Vj4v3FwmrUCpQVqO8qaGID1yX7o7SYO9cVmMhoa/Dc4rS5Eh PKxOOg46CuNQvjyoZ5WNsC1TmKxuao3hPYdHtNSdgxjjNh7zQAe9bH5XvmjBH5JzXexX WoCbBD6ZKfh0Rn147v3pj8F+m6O7IVtEIK80kGYmzW5a9t1c+ImG2wOE81OhIZvHY5E2 N9WmJqKE++EViMSydP8sJBPkZ5yryXQQCcZPC77xdh4cyFQDTkjPA9mXckCF4KRQEU1U GSRW0N3sUVa2rJeviUpF8Oroxd2Bhvp7aoFvpszgF5LMrCMKFXQKuMe3QTvaBYpdsAPB pxpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZsgiYwh7qNLAcSkC7LxnVfqa/pmBeTfF8iAebdcRKSo=; b=erdG1BPmTgxwiAuEWajmS8hz2y341NocZoHJy5tYo6T+/hZeXYjpgrpoAeKqknSdBu 
zz2jk3SebU4rFIQEmENob/v/OUEju2lhB8YfGnzoXZilJfS3BS3sV0sqM5CLNldKdPZX 3pQFlcDXLYoEWH6vVhpnZkcwwMtHsYKPLiePkd7piczZ/OY/dibTToIpcw4Z4VvH+Soz rODSG0V2b0hLr9thEw1t4imGH7A/BLWerxe0IuUyHUyu07m4l1V1/HQObmtdafSY/QG9 69n5oLrqf9NGj/SaWmUeRw/F/M6v9dpsR/gtrNE27unBW+Ed38SdPNljsskb1tyquSpN xvTw== X-Gm-Message-State: AOAM531WCfzyljRjaWApz8yVTeuWzshK8/TlX1ltsObuwAPrDrbtj0mr tI3lCDDDUSOZq/sEmr/+mwa0QXd6ycMFqA== X-Google-Smtp-Source: ABdhPJyBRfp3B2BwtgVEImoLug4hjAHdMuy57UrWnmxDpT1vZovS2+CEZlZccaB5fM9rw4CTBz+sqQ== X-Received: by 2002:a1c:238e:: with SMTP id j136mr4272553wmj.91.1627557340414; Thu, 29 Jul 2021 04:15:40 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:39 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 31/53] target/arm: Implement MVE VPNOT Date: Thu, 29 Jul 2021 12:14:50 +0100 Message-Id: <20210729111512.16541-32-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x335.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VPNOT insn, which inverts 
the bits in VPR.P0 (subject to both predication and to beatwise execution). Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 1 + target/arm/mve.decode | 1 + target/arm/mve_helper.c | 17 +++++++++++++++++ target/arm/translate-mve.c | 19 +++++++++++++++++++ 4 files changed, 38 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 651020aaad8..8cb941912fc 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -119,6 +119,7 @@ DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env) DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 774ee2a1a5b..40bd0c04b59 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -571,6 +571,7 @@ VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp { + VPNOT 1111 1110 0 0 11 000 1 000 0 1111 0100 1101 VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index d326205cbf0..c22a00c5ed6 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2201,6 +2201,23 @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) mve_advance_vpt(env); } +void HELPER(mve_vpnot)(CPUARMState *env) +{ + /* + * P0 bits for unexecuted beats (where eci_mask is 0) are unchanged. + * P0 bits for predicated lanes in executed bits (where mask is 0) are 0. + * P0 bits otherwise are inverted. + * (This is the same logic as VCMP.) 
+ * This insn is itself subject to predication and to beat-wise execution, + * and after it executes VPT state advances in the usual way. + */ + uint16_t mask = mve_element_mask(env); + uint16_t eci_mask = mve_eci_mask(env); + uint16_t beatpred = ~env->v7m.vpr & mask; + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (beatpred & eci_mask); + mve_advance_vpt(env); +} + #define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ { \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 93707fdd681..cc2e58cfe2f 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -887,6 +887,25 @@ static bool trans_VPST(DisasContext *s, arg_VPST *a) return true; } +static bool trans_VPNOT(DisasContext *s, arg_VPNOT *a) +{ + /* + * Invert the predicate in VPR.P0. We have to call out to + * a helper because this insn itself is beatwise and can + * be predicated. + */ + if (!dc_isar_feature(aa32_mve, s)) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + gen_helper_mve_vpnot(cpu_env); + mve_update_eci(s); + return true; +} + static bool trans_VADDV(DisasContext *s, arg_VADDV *a) { /* VADDV: vector add across vector */ From patchwork Thu Jul 29 11:14:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511195 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=JfXN5HLS; dkim-atps=neutral Received: from lists.gnu.org
(lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7xw2g3jz9sSs for ; Thu, 29 Jul 2021 21:45:24 +1000 (AEST) Received: from localhost ([::1]:48210 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Ta-0000D0-4o for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:45:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40670) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941E-00017F-MZ for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:05 -0400 Received: from mail-wm1-x330.google.com ([2a00:1450:4864:20::330]:54097) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940s-0001JF-V0 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:04 -0400 Received: by mail-wm1-x330.google.com with SMTP id k4so3469506wms.3 for ; Thu, 29 Jul 2021 04:15:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=m3w3Ow7urHFvFZW+mQsIR7ooKiNtRVZse42MclMGtaw=; b=JfXN5HLS4gnwP1VUt8S00CCACP4/OUQ/rD4o+cbQkJCXWocxvBKosc5lOUPwj/V2t3 3WKTrbL6YZRlBKiLj7QLeKl6UVW9q13s18ccwpZccuXlk+3yNmsfxV9Fx0Mbu9LYxkoQ FWsHAox2Gg7uVlqqwmPjJ15IbDglDpG2WlNcpiWO+qam3uo6Tdi+ydcJog9aOppnsKQa qjkf2qTJfraKubdgxWyfvatRqrRcdne72op1XnXG5xBK0V01Lpb5xk3gZ/Ihpl7eNi9x onRjtLs1SG+F8DFSaFj7yN1cichAJ8A/0aBGhwFCw7k2pdGQhRZin/IY9kXKsWZU/5Uk UViQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=m3w3Ow7urHFvFZW+mQsIR7ooKiNtRVZse42MclMGtaw=; b=R6GS8BQ/EHg90taVJRbUk8K/yBvuBJ868mJeN53bI5Q3WFh9q8bDj28pIOCDlYignM 
DI5b4/usqaWrAV7y4tZGcIQtbnNdZMgBWvaIA8AazlJmvdE+B9dQFYNMrARYcUs+CflN 2vgMGPO6kehhXy8SqZy4FyiidzL5Ul+tDiXZsFkW42q73XN2CXRVBVcUcorRTqlEIo91 +S3um6J4XU3As+nDl2nnNu2Bi43F6BaluAdKQextg1fKWHmvk1ZmZJd5+9C1zAkRQBiw mGaJySMvkccd5hpWqv83MPzfXZVqiY78fPdJQS6EyuYhHF5n8i8rAJut9o993mSWJmQx pFpQ== X-Gm-Message-State: AOAM533D5lF8D/vgjq55wDmXen08DtsBnB3PhBYnv/7O218rtbmdVM4F 8i5Wqxagx7BfMG7Optw4QBrdFeGfYq1iOQ== X-Google-Smtp-Source: ABdhPJxJzXOzwGTVX3woE7qOmAizW5fqnBrFjrHNjH3cwnlVc5bNGh48UmeG5JPIOwiM7iqXir2jzA== X-Received: by 2002:a05:600c:4c96:: with SMTP id g22mr13545369wmp.70.1627557341199; Thu, 29 Jul 2021 04:15:41 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:40 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 32/53] target/arm: Implement MVE VCTP Date: Thu, 29 Jul 2021 12:14:51 +0100 Message-Id: <20210729111512.16541-33-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::330; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x330.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCTP insn, which sets the 
VPR.P0 predicate bits so that any element at index Rn or greater is predicated. As with VPNOT, this insn itself is predicable and subject to beatwise execution. The calculation of the mask is the same as is used to determine ltpmask in mve_element_mask(), but we precalculate masklen in generated code to avoid having to have 4 helpers specialized by size. We put the decode line in with the low-overhead-loop insns in t32.decode because it's logically part of that collection of insn patterns, even though it is an MVE only insn. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 2 ++ target/arm/translate-a32.h | 1 + target/arm/t32.decode | 1 + target/arm/mve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-mve.c | 2 +- target/arm/translate.c | 33 +++++++++++++++++++++++++++++++++ 6 files changed, 58 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 8cb941912fc..b6cf3f0c94d 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -121,6 +121,8 @@ DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env) +DEF_HELPER_FLAGS_2(mve_vctp, TCG_CALL_NO_WG, void, env, i32) + DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h index 6f4d65ddb00..88f15df60e8 100644 --- a/target/arm/translate-a32.h +++ b/target/arm/translate-a32.h @@ -48,6 +48,7 @@ long neon_element_offset(int reg, int element, MemOp memop); void gen_rev16(TCGv_i32 dest, TCGv_i32 var); void clear_eci_state(DisasContext *s); bool mve_eci_check(DisasContext *s); +void mve_update_eci(DisasContext *s); void
mve_update_and_store_eci(DisasContext *s); bool mve_skip_vmov(DisasContext *s, int vn, int index, int size); diff --git a/target/arm/t32.decode b/target/arm/t32.decode index 2d47f31f143..78fadef9d62 100644 --- a/target/arm/t32.decode +++ b/target/arm/t32.decode @@ -748,5 +748,6 @@ BL 1111 0. .......... 11.1 ............ @branch24 # This is DLSTP DLS 1111 0 0000 0 size:2 rn:4 1110 0000 0000 0001 } + VCTP 1111 0 0000 0 size:2 rn:4 1110 1000 0000 0001 ] } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index c22a00c5ed6..1752555a218 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2218,6 +2218,26 @@ void HELPER(mve_vpnot)(CPUARMState *env) mve_advance_vpt(env); } +/* + * VCTP: P0 unexecuted bits unchanged, predicated bits zeroed, + * otherwise set according to value of Rn. The calculation of + * newmask here works in the same way as the calculation of the + * ltpmask in mve_element_mask(), but we have pre-calculated + * the masklen in the generated code. + */ +void HELPER(mve_vctp)(CPUARMState *env, uint32_t masklen) +{ + uint16_t mask = mve_element_mask(env); + uint16_t eci_mask = mve_eci_mask(env); + uint16_t newmask; + + assert(masklen <= 16); + newmask = masklen ? 
MAKE_64BIT_MASK(0, masklen) : 0; + newmask &= mask; + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (newmask & eci_mask); + mve_advance_vpt(env); +} + #define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ { \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index cc2e58cfe2f..865d5acbe76 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -93,7 +93,7 @@ bool mve_eci_check(DisasContext *s) } } -static void mve_update_eci(DisasContext *s) +void mve_update_eci(DisasContext *s) { /* * The helper function will always update the CPUState field, diff --git a/target/arm/translate.c b/target/arm/translate.c index 80c282669f0..804a53279bd 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -8669,6 +8669,39 @@ static bool trans_LCTP(DisasContext *s, arg_LCTP *a) return true; } +static bool trans_VCTP(DisasContext *s, arg_VCTP *a) +{ + /* + * M-profile Create Vector Tail Predicate. This insn is itself + * predicated and is subject to beatwise execution. + */ + TCGv_i32 rn_shifted, masklen; + + if (!dc_isar_feature(aa32_mve, s) || a->rn == 13 || a->rn == 15) { + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* + * We pre-calculate the mask length here to avoid having + * to have multiple helpers specialized for size. + * We pass the helper "rn <= (1 << (4 - size)) ? (rn << size) : 16". 
+ */ + rn_shifted = tcg_temp_new_i32(); + masklen = load_reg(s, a->rn); + tcg_gen_shli_i32(rn_shifted, masklen, a->size); + tcg_gen_movcond_i32(TCG_COND_LEU, masklen, + masklen, tcg_constant_i32(1 << (4 - a->size)), + rn_shifted, tcg_constant_i32(16)); + gen_helper_mve_vctp(cpu_env, masklen); + tcg_temp_free_i32(masklen); + tcg_temp_free_i32(rn_shifted); + mve_update_eci(s); + return true; +} static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half) { From patchwork Thu Jul 29 11:14:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511181 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=PsNy8A4Q; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7lb3Xg3z9sW8 for ; Thu, 29 Jul 2021 21:36:27 +1000 (AEST) Received: from localhost ([::1]:53758 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Kv-0001Jd-56 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:36:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40748) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941I-0001B1-2D for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:09 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:43800) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940u-0001JR-RG for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:07 -0400 Received: by mail-wr1-x429.google.com with SMTP id h14so6423897wrx.10 for ; Thu, 29 Jul 2021 04:15:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=jbw4NGPAyad+yhK6Ke60e03iznnGnDNGm1xkKalyQQY=; b=PsNy8A4Qb9QgEn0QuJojgw35E033d7i8vmmWoAaRwwoxndEYRWINqfM+ffxBmTVavE lVEneYeT+eCr6WCvv3gGA5qf5y6eaMsffIlrNCtOOtGncu2Z5wY/BqoeF5Qs8yJZa7+d laevbLSaa192p0CYS9VwzkppezzRPlq93Dx+cpUKg6uyl2wWbBBY/LmyAfBrGZlzzHwQ DjMKZ74u0WpoNu33xOqfzqtBpbqi+xDVYQIlmZNr8cqea8+JCK1R4oj1e7iDqKOOLMtC +pRVpHMStV4v+iJBfbyklamRUfNOSiJl2YHJeKMcyhggCejfR0nr8hu9S3pLIsUOdA3Q DhzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jbw4NGPAyad+yhK6Ke60e03iznnGnDNGm1xkKalyQQY=; b=FpvpGfy0flhv/6c8Aueu6CmHiMef3ixsxPCm2bAn1+nQnlgwuY+EWqmYRSXVnXFMwJ 17xa7JGzHb9kZVAwUOAIWuC/O/1atkHptTopJdFwz8FYO6IH08TkA+LxoOoE6LMq7JsP 85RNPKTfGMpJuBl9qJ4RpiZbYxvyngBdoAVV0daxarjxRJTKT4nuyaJz9sVgsYstaoFB 7lzD47For2foVHdgGWYfgQkXjXTKDjA+KMtaK2FKORh4wwyf9gw0ry39Y0dodGNAY1ak 361nNRZwxaZ+dY1KiIEUqAAIOqJvqVuttaQlEQ1ancwl1OK517vf/K7AtCTgk3Jf3MSM NtUQ== X-Gm-Message-State: AOAM531ODTiZRCb7GJi/INWFonx8TXxwq02v7nfHLt4dAEJZScbOAs64 zNXxQWsCUAgabGdwtFcb8BAgWA== X-Google-Smtp-Source: ABdhPJykWTlpFE0tSrGGf1OM1ij+k8rOFwVV6sX0j+9ifyzbvsFVgKS2jD/5PdpzSUbww5OZqHCuxw== X-Received: by 2002:adf:f282:: with SMTP id k2mr4283611wro.183.1627557342157; Thu, 29 Jul 2021 04:15:42 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:41 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 33/53] target/arm: Implement MVE scatter-gather insns Date: Thu, 29 Jul 2021 12:14:52 +0100 Message-Id: <20210729111512.16541-34-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE gather-loads and scatter-stores which form the address by adding a base value from a scalar register to an offset in each element of a vector. 
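For readers new to these insns, the addressing described above can be modelled roughly as follows. This is only an illustrative standalone sketch, not the helper code in this patch: gather_load_w(), load_le32(), the flat 'mem' array and the one-bit-per-lane 'lane_mask' are all hypothetical simplifications (the real predicate mask has one bit per byte, and ECI/beatwise handling is ignored here). The 'scale' argument stands in for the offset-scaled forms of the insn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4   /* four 32-bit elements in a 128-bit vector */

/* Little-endian 32-bit load from a flat byte array (stand-in for guest memory) */
static uint32_t load_le32(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr] |
           ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}

/*
 * Hypothetical model of a word gather load: each lane's address is the
 * scalar base plus that lane's offset element. scale is 0 for unscaled
 * offsets or 2 to scale offsets by the word size. Lanes whose bit is
 * clear in lane_mask are predicated off and read as zero.
 */
static void gather_load_w(uint32_t dst[LANES], const uint8_t *mem,
                          size_t mem_size, uint32_t base,
                          const uint32_t offsets[LANES],
                          unsigned scale, unsigned lane_mask)
{
    for (unsigned e = 0; e < LANES; e++) {
        uint32_t addr = base + (offsets[e] << scale);

        if (lane_mask & (1u << e)) {
            /* Sketch only: a real implementation would fault, not assert */
            assert((size_t)addr + 4 <= mem_size);
            dst[e] = load_le32(mem, addr);
        } else {
            dst[e] = 0;
        }
    }
}
```

A scatter store would be the mirror image, except that predicated-off lanes simply perform no store.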
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- v2: UNDEF the UNPREDICTABLE Qd==Qm case for loads --- target/arm/helper-mve.h | 32 +++++++++ target/arm/mve.decode | 12 ++++ target/arm/mve_helper.c | 129 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 97 ++++++++++++++++++++++++++++ 4 files changed, 270 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index b6cf3f0c94d..ba842b97c17 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -33,6 +33,38 @@ DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vstrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, 
ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 40bd0c04b59..6c3f45c7195 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -42,11 +42,18 @@ &shl_scalar qda rm size &vmaxv qm rda size &vabav qn qm rda size +&vldst_sg qd qm rn size msize os + +# scatter-gather memory size is in bits 6:4 +%sg_msize 6:1 4:1 @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr +@vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \ + qd=%qd qm=%qm msize=%sg_msize + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn @@ -136,6 +143,11 @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... 
@vldr_vstr \ size=2 p=1 +# gather loads/scatter stores +VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg +VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg +VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 1752555a218..2b882db1c3d 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -206,6 +206,135 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) #undef DO_VLDR #undef DO_VSTR +/* + * Gather loads/scatter stores. Here each element of Qm specifies + * an offset to use from the base register Rm. In the _os_ versions + * that offset is scaled by the element size. + * For loads, predicated lanes are zeroed instead of retaining + * their previous values. + */ +#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + TYPE *d = vd; \ + OFFTYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ + addr = ADDRFN(base, m[H##ESIZE(e)]); \ + d[H##ESIZE(e)] = (mask & 1) ? 
\ + cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ + } \ + mve_advance_vpt(env); \ + } + +/* We know here TYPE is unsigned so always the same as the offset type */ +#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + TYPE *d = vd; \ + TYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + addr = ADDRFN(base, m[H##ESIZE(e)]); \ + if (mask & 1) { \ + cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \ + } \ + } \ + mve_advance_vpt(env); \ + } + +/* + * 64-bit accesses are slightly different: they are done as two 32-bit + * accesses, controlled by the predicate mask for the relevant beat, + * and with a single 32-bit offset in the first of the two Qm elements. + * Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little). + */ +#define DO_VLDR64_SG(OP, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + uint32_t *d = vd; \ + uint32_t *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ + addr = ADDRFN(base, m[H4(e & ~1)]); \ + addr += 4 * (e & 1); \ + d[H4(e)] = (mask & 1) ? 
cpu_ldl_data_ra(env, addr, GETPC()) : 0; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VSTR64_SG(OP, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + uint32_t *d = vd; \ + uint32_t *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + addr = ADDRFN(base, m[H4(e & ~1)]); \ + addr += 4 * (e & 1); \ + if (mask & 1) { \ + cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \ + } \ + } \ + mve_advance_vpt(env); \ + } + +#define ADDR_ADD(BASE, OFFSET) ((BASE) + (OFFSET)) +#define ADDR_ADD_OSH(BASE, OFFSET) ((BASE) + ((OFFSET) << 1)) +#define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2)) +#define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3)) + +DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD) + +DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD) + +DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW) +DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD) + +DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD) +DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD) +DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, 
ADDR_ADD) +DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD) +DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD) + +DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH) +DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH) +DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW) +DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD) + /* * The mergemask(D, R, M) macro performs the operation "*D = R" but * storing only the bytes which correspond to 1 bits in M, diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 865d5acbe76..24d4e57ead4 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -34,6 +34,7 @@ static inline int vidup_imm(DisasContext *s, int x) #include "decode-mve.c.inc" typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -209,6 +210,102 @@ DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8) DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8) DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16) +static bool do_ldst_sg(DisasContext *s, arg_vldst_sg *a, MVEGenLdStSGFn fn) +{ + TCGv_i32 addr; + TCGv_ptr qd, qm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qd | a->qm) || + !fn || a->rn == 15) { + /* Rn case is UNPREDICTABLE */ + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + addr = load_reg(s, a->rn); + + qd = mve_qreg_ptr(a->qd); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qd, qm, addr); + tcg_temp_free_ptr(qd); + tcg_temp_free_ptr(qm); + tcg_temp_free_i32(addr); + mve_update_eci(s); + return true; +} + +/* + * The naming scheme here is "vldrb_sg_sh == in-memory byte loads + * signextended to halfword elements in register". 
_os_ indicates that + * the offsets in Qm should be scaled by the element size. + */ +/* This macro is just to make the arrays more compact in these functions */ +#define F(N) gen_helper_mve_##N + +/* VLDRB/VSTRB (ie msize 1) with OS=1 is UNPREDICTABLE; we UNDEF */ +static bool trans_VLDR_S_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { NULL, F(vldrb_sg_sh), F(vldrb_sg_sw), NULL }, + { NULL, NULL, F(vldrh_sg_sw), NULL }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, NULL, F(vldrh_sg_os_sw), NULL }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL } + } + }; + if (a->qd == a->qm) { + return false; /* UNPREDICTABLE */ + } + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +static bool trans_VLDR_U_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { F(vldrb_sg_ub), F(vldrb_sg_uh), F(vldrb_sg_uw), NULL }, + { NULL, F(vldrh_sg_uh), F(vldrh_sg_uw), NULL }, + { NULL, NULL, F(vldrw_sg_uw), NULL }, + { NULL, NULL, NULL, F(vldrd_sg_ud) } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, F(vldrh_sg_os_uh), F(vldrh_sg_os_uw), NULL }, + { NULL, NULL, F(vldrw_sg_os_uw), NULL }, + { NULL, NULL, NULL, F(vldrd_sg_os_ud) } + } + }; + if (a->qd == a->qm) { + return false; /* UNPREDICTABLE */ + } + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { F(vstrb_sg_ub), F(vstrb_sg_uh), F(vstrb_sg_uw), NULL }, + { NULL, F(vstrh_sg_uh), F(vstrh_sg_uw), NULL }, + { NULL, NULL, F(vstrw_sg_uw), NULL }, + { NULL, NULL, NULL, F(vstrd_sg_ud) } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, F(vstrh_sg_os_uh), F(vstrh_sg_os_uw), NULL }, + { NULL, NULL, F(vstrw_sg_os_uw), NULL }, + { NULL, NULL, NULL, F(vstrd_sg_os_ud) } + } + }; + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +#undef F + static 
bool trans_VDUP(DisasContext *s, arg_VDUP *a) { TCGv_ptr qd; From patchwork Thu Jul 29 11:14:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511190 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=sHKHO1F4; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7p60qNBz9sSs for ; Thu, 29 Jul 2021 21:38:38 +1000 (AEST) Received: from localhost ([::1]:34374 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94N1-0007JH-Nl for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:38:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40782) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941K-0001CZ-8b for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:11 -0400 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]:33375) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940v-0001Jo-PU for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:08 -0400 Received: by mail-wr1-x42c.google.com with SMTP id q3so6513844wrx.0 for ; Thu, 29 Jul 2021 04:15:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=3JMHCoeAFjJzqhwHh0xa5sRHoOMhf1KEp8ztWJXpo1w=; b=sHKHO1F4UbRre3pCsFuhFdXh6CvfpExmj/TEUDK4FD2Jn/3ZAud17aaSAvoXdl6eSZ KwZUTL7yNuqnPnhzpu/wV5hNjBbo+6L+m0l3hFULsChIprLErxMHYtbwV6KsoVv38S+Y CVyvT8TP5RqnbP2nDoAG5uaPG3ufZSb83gcWyfgfYDL7Yk0laIgQ1km/Hi24BRXoq+vb uzkxmOczeaBeKLziyX0zybK8mZdnOyfmkEicS1CRZ0YuIoy6vWP402g14fKAPpj1ejTV 9Fjj3jXsaU4Dn9F5SRmnz7yYIj+cjEKXsywhtfAvvYM3LkhR64F2r/58EGbE/kCmjeP+ ctbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3JMHCoeAFjJzqhwHh0xa5sRHoOMhf1KEp8ztWJXpo1w=; b=Ncljp9/Pm1DljstOQMzhP6FLT37aIPpTYZqRZeqxccuJIKJlA6Bbx8RHipNT3sBYAP ssXE1zpvso9Te0kaG5fJA/cJfTm8kdYyJbqi2MAlDi0KMSTfLrWZcNQMIyY3D8pA+hlK gdblAkUiGrJ5zwivGGOUx0aHIjfk+cauZxonCXp1U6N1uK7Oy3Um82M8QnFBD7nYS1tt LKq3EouAIsIJFeg2CRsYiRVArkZGKDvRAn0EBlpGzVoWEdIg2jWxddh66df8xidIgUmn UhQV1wuahdTgreuNabz9u0YZAzp6C4t1pW9eI4bnvfOHjI7DBRyCVFN9mknh1lbyxKNu 3YIg== X-Gm-Message-State: AOAM532aB75WLo+73dNwVrrdWUDd5LiyGMQ0sctpd8u3+X5AKIUhMSB6 M1ICuscv02FDQTqpWje8oGp7nGTJhJRNiQ== X-Google-Smtp-Source: ABdhPJzvc+l/+YLJlla3RJrjnRN4JhikeiPcSkxVXR8wJVCDqkHaIgYJD7/cnIkgtrGapwcG9t8iZg== X-Received: by 2002:a5d:6789:: with SMTP id v9mr3578865wru.254.1627557343107; Thu, 29 Jul 2021 04:15:43 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:42 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 34/53] target/arm: Implement MVE scatter-gather immediate forms Date: Thu, 29 Jul 2021 12:14:53 +0100 Message-Id: <20210729111512.16541-35-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42c; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VLDR/VSTR insns which do scatter-gather using base addresses from Qm plus or minus an immediate offset (possibly with writeback). Note that writeback is not predicated but it does have to honour ECI state, so we have to add an eci_mask check to the VSTR_SG macros (the VLDR_SG macros already needed this to be able to distinguish "skip beat" from "set predicated element to 0"). 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- v2: UNDEF the UNPREDICTABLE Qd==Qm loads --- target/arm/helper-mve.h | 5 +++ target/arm/mve.decode | 10 +++++ target/arm/mve_helper.c | 91 ++++++++++++++++++++++++-------------- target/arm/translate-mve.c | 72 ++++++++++++++++++++++++++++++ 4 files changed, 146 insertions(+), 32 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index ba842b97c17..a85a7e1b75d 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -65,6 +65,11 @@ DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 6c3f45c7195..48882dd7f38 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -43,6 +43,7 @@ &vmaxv qm rda size &vabav qn qm rda size &vldst_sg qd qm rn size msize os +&vldst_sg_imm qd qm a w imm # scatter-gather memory size is in bits 6:4 %sg_msize 6:1 4:1 @@ -54,6 +55,10 @@ @vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \ qd=%qd qm=%qm msize=%sg_msize +# Qm is in the fields usually labeled Qn +@vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \ + qd=%qd qm=%qn + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... 
.. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn @@ -148,6 +153,11 @@ VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg +VLDRW_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1110 .... .... @vldst_sg_imm +VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm +VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm +VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 2b882db1c3d..bbbaa538074 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -213,7 +213,7 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) * For loads, predicated lanes are zeroed instead of retaining * their previous values. */ -#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \ +#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ @@ -230,25 +230,35 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) addr = ADDRFN(base, m[H##ESIZE(e)]); \ d[H##ESIZE(e)] = (mask & 1) ? 
\ cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ + if (WB) { \ + m[H##ESIZE(e)] = addr; \ + } \ } \ mve_advance_vpt(env); \ } /* We know here TYPE is unsigned so always the same as the offset type */ -#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \ +#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ TYPE *d = vd; \ TYPE *m = vm; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned e; \ uint32_t addr; \ - for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ addr = ADDRFN(base, m[H##ESIZE(e)]); \ if (mask & 1) { \ cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \ } \ + if (WB) { \ + m[H##ESIZE(e)] = addr; \ + } \ } \ mve_advance_vpt(env); \ } @@ -258,8 +268,10 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) * accesses, controlled by the predicate mask for the relevant beat, * and with a single 32-bit offset in the first of the two Qm elements. * Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little). + * Address writeback happens on the odd beats and updates the address + * stored in the even-beat element. */ -#define DO_VLDR64_SG(OP, ADDRFN) \ +#define DO_VLDR64_SG(OP, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ @@ -276,25 +288,35 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) addr = ADDRFN(base, m[H4(e & ~1)]); \ addr += 4 * (e & 1); \ d[H4(e)] = (mask & 1) ? 
cpu_ldl_data_ra(env, addr, GETPC()) : 0; \ + if (WB && (e & 1)) { \ + m[H4(e & ~1)] = addr - 4; \ + } \ } \ mve_advance_vpt(env); \ } -#define DO_VSTR64_SG(OP, ADDRFN) \ +#define DO_VSTR64_SG(OP, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ uint32_t *d = vd; \ uint32_t *m = vm; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned e; \ uint32_t addr; \ - for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ addr = ADDRFN(base, m[H4(e & ~1)]); \ addr += 4 * (e & 1); \ if (mask & 1) { \ cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \ } \ + if (WB && (e & 1)) { \ + m[H4(e & ~1)] = addr - 4; \ + } \ } \ mve_advance_vpt(env); \ } @@ -304,36 +326,41 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) #define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2)) #define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3)) -DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD, false) -DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_uw, ldub, 
4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD, false) -DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW) -DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD) +DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false) +DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD, false) -DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD) -DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD) -DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD) -DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD) -DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD) -DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD) -DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD, false) +DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD, false) +DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD, false) +DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD, false) +DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD, false) +DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD, false) +DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD, false) -DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH) -DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH) -DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW) -DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD) +DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, 
uint16_t, ADDR_ADD_OSH, false) +DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH, false) +DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW, false) +DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD, false) + +DO_VLDR_SG(vldrw_sg_wb_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, true) +DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true) +DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true) +DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true) /* * The mergemask(D, R, M) macro performs the operation "*D = R" but diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 24d4e57ead4..d3cb3396863 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -306,6 +306,78 @@ static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a) #undef F +static bool do_ldst_sg_imm(DisasContext *s, arg_vldst_sg_imm *a, + MVEGenLdStSGFn *fn, unsigned msize) +{ + uint32_t offset; + TCGv_ptr qd, qm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qd | a->qm) || + !fn) { + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + offset = a->imm << msize; + if (!a->a) { + offset = -offset; + } + + qd = mve_qreg_ptr(a->qd); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qd, qm, tcg_constant_i32(offset)); + tcg_temp_free_ptr(qd); + tcg_temp_free_ptr(qm); + mve_update_eci(s); + return true; +} + +static bool trans_VLDRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vldrw_sg_uw, + gen_helper_mve_vldrw_sg_wb_uw, + }; + if (a->qd == a->qm) { + return false; /* UNPREDICTABLE */ + } + return do_ldst_sg_imm(s, a, fns[a->w], MO_32); +} + +static bool trans_VLDRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vldrd_sg_ud, + gen_helper_mve_vldrd_sg_wb_ud, + }; + if (a->qd == a->qm) { + return false; /* UNPREDICTABLE */ + } + return do_ldst_sg_imm(s, a, fns[a->w], MO_64); +} + +static bool 
trans_VSTRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vstrw_sg_uw, + gen_helper_mve_vstrw_sg_wb_uw, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_32); +} + +static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vstrd_sg_ud, + gen_helper_mve_vstrd_sg_wb_ud, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_64); +} + static bool trans_VDUP(DisasContext *s, arg_VDUP *a) { TCGv_ptr qd; From patchwork Thu Jul 29 11:14:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511192 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=h2gNEgjx; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7ss6j6pz9sSs for ; Thu, 29 Jul 2021 21:41:53 +1000 (AEST) Received: from localhost ([::1]:39874 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94QB-0002nT-MN for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:41:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40862) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941P-0001IS-Sv for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:18 -0400 Received: from mail-wr1-x42f.google.com 
([2a00:1450:4864:20::42f]:36807) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940x-0001KS-0E for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:13 -0400 Received: by mail-wr1-x42f.google.com with SMTP id g15so6471764wrd.3 for ; Thu, 29 Jul 2021 04:15:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=18c+niiJ2qq4FpsY7Q8PYylK5MV+mVE5cLUI6DQNoOs=; b=h2gNEgjx7rLVKscBQ79c85/cpjkTRX4cn4HBuvi2HOL9BCxvewpO1F0/FIGGP9+V77 v/sSLP0guVWrwQDc1u/iJd8wZPdWzC3kkgWcwGa8971dvrsgkvEZ7Q4incL55J9tRCrY Lf4OKDkJWmdCYV76DpNIfTcc/mlCrpArcyZvTIoE7nlQa8Oas+wrHW2pgf4nbxBgPa1P Mc/BYLE4lH4gnt95m4Gx6tWzf8d9EWAzq0xQ1l8k+03zuk69FLyeBITOYYWSCE78da2k r/MoJYdH1FPhQ9H6GRdW4jMkXTinBqWL8gui+zsS5LH/Y0BAPXI+x5K4ZmOpsVzt1qAb iygg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=18c+niiJ2qq4FpsY7Q8PYylK5MV+mVE5cLUI6DQNoOs=; b=ejOKyvDdQuVEfGrjHQ3crKUkO+QvnZW6CRGfC9qCsDlk56qDZEL4sGNO/N5EqvxLom CiB3wBCp4gQ2uFYfL0F9tcF0s09eLGuY6syqsMSATIEAA0XPVTyk3ZWpQj5ejV2VF+ia 5BmHWXDlqYp1USgfWWFDRKpQdVQbpT8wNtKMB/2NRl+PPD8J5d2zjQHPmoYh12x2L1Fn QHjmtQMALIvV5y07fAdxl9XWUpHF6vorXSc7+/ejJiYIARFMEi5yFIkAaRB3v+oNVmpz jcXSVvQ5Se+rvnV7KPDWkorYBXWGmcfk7JqHkw+Mw+zN9m9D9VinSVtf/Neinqtc4dO9 YCGw== X-Gm-Message-State: AOAM532GMnbxuB2jRo8SklT1Qd/Y3UB2LRWZvlU1yBa+wCrjPoDBpvlD XLybHrWsFFHoBGNPWEvnncz9fbO/qyRH4g== X-Google-Smtp-Source: ABdhPJwdYPgCO91Skk8SkRGVsgwH9sZ8bWRWXhsG3E9FeV1rRRAq5WzefA++oLYqsNZMsLlQ6EWQZQ== X-Received: by 2002:a5d:6c63:: with SMTP id r3mr2821671wrz.405.1627557344275; Thu, 29 Jul 2021 04:15:44 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:43 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 35/53] target/arm: Implement MVE interleaving loads/stores Date: Thu, 29 Jul 2021 12:14:54 +0100 Message-Id: <20210729111512.16541-36-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE interleaving load/store functions VLD2, VLD4, VST2 and VST4. VLD2 loads 16 bytes of data from memory and writes to 2 consecutive Qregs; VLD4 loads 16 bytes of data from memory and writes to 4 consecutive Qregs. The 'pattern' field in the encoding determines the offset into memory which is accessed and also which elements in the Qregs are written to. (The intention is that a sequence of four consecutive VLD4 with different pattern values performs a complete de-interleaving load of 64 bytes into all elements of the 4 Qregs.) VST2 and VST4 do the same, but for stores. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- I found the pseudocode description of these instructions pretty hard to follow, because (1) it is written to be generic over all sizes and pattern values and beat counts and (2) it accesses the vector elements by (Qreg number, beat within Qreg, element within beat). I ended up writing a little program to print out the various intermediate numbers and also calculate "index of element within the whole Qreg", which is what QEMU wants to access elements by. You can find that here: https://people.linaro.org/~peter.maydell/ldinter.c I then just stared at the numbers for each (pattern, esize) specialization and tried to come up with something that does less gluing together of random bits from curBeat, pattern and e than the pseudocode... --- target/arm/helper-mve.h | 48 ++++++ target/arm/mve.decode | 11 ++ target/arm/mve_helper.c | 342 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 94 ++++++++++ 4 files changed, 495 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index a85a7e1b75d..3db9b15f121 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -70,6 +70,54 @@ DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vld20b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld20h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld20w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld21b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld21h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld21w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld40b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld40h, TCG_CALL_NO_WG, void, env, i32, i32) 
+DEF_HELPER_FLAGS_3(mve_vld40w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld41b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld41h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld41w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld42b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld42h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld42w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld43b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld43h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld43w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst20b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst20h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst20w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst21b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst21h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst21w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst40b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst40h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst40w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst41b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst41h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst41w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst42b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst42h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst42w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst43b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst43h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst43w, TCG_CALL_NO_WG, void, env, i32, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, 
void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 48882dd7f38..87446816293 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -44,6 +44,7 @@ &vabav qn qm rda size &vldst_sg qd qm rn size msize os &vldst_sg_imm qd qm a w imm +&vldst_il qd rn size pat w # scatter-gather memory size is in bits 6:4 %sg_msize 6:1 4:1 @@ -59,6 +60,10 @@ @vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \ qd=%qd qm=%qn +# Deinterleaving load/interleaving store +@vldst_il .... .... .. w:1 . rn:4 .... ... size:2 pat:2 ..... &vldst_il \ + qd=%qd + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn @@ -158,6 +163,12 @@ VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm +# deinterleaving loads/interleaving stores +VLD2 1111 1100 1 .. 1 .... ... 1 111 .. .. 00000 @vldst_il +VLD4 1111 1100 1 .. 1 .... ... 1 111 .. .. 00001 @vldst_il +VST2 1111 1100 1 .. 0 .... ... 1 111 .. .. 00000 @vldst_il +VST4 1111 1100 1 .. 0 .... ... 1 111 .. .. 00001 @vldst_il + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 
0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index bbbaa538074..c2826eb5f9f 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -362,6 +362,348 @@ DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true) DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true) DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true) +/* + * Deinterleaving loads/interleaving stores. + * + * For these helpers we are passed the index of the first Qreg + * (VLD2/VST2 will also access Qn+1, VLD4/VST4 access Qn .. Qn+3) + * and the value of the base address register Rn. + * The helpers are specialized for pattern and element size, so + * for instance vld42h is VLD4 with pattern 2, element size MO_16. + * + * These insns are beatwise but not predicated, so we must honour ECI, + * but need not look at mve_element_mask(). + * + * The pseudocode implements these insns with multiple memory accesses + * of the element size, but rules R_VVVG and R_FXDM permit us to make + * one 32-bit memory access per beat. 
+ */ +#define DO_VLD4B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 4; e++, data >>= 8) { \ + uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \ + qd[H1(off[beat])] = data; \ + } \ + } \ + } + +#define DO_VLD4H(OP, O1, O2) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O1, O2, O2 }; \ + uint32_t addr, data; \ + int y; /* y counts 0 2 0 2 */ \ + uint16_t *qd; \ + for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 8 + (beat & 1) * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \ + qd[H2(off[beat])] = data; \ + data >>= 16; \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \ + qd[H2(off[beat])] = data; \ + } \ + } + +#define DO_VLD4W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + int y; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + y = (beat + (O1 & 2)) & 3; \ + qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \ + qd[H4(off[beat] >> 2)] = data; \ + 
} \ + } + +DO_VLD4B(vld40b, 0, 1, 10, 11) +DO_VLD4B(vld41b, 2, 3, 12, 13) +DO_VLD4B(vld42b, 4, 5, 14, 15) +DO_VLD4B(vld43b, 6, 7, 8, 9) + +DO_VLD4H(vld40h, 0, 5) +DO_VLD4H(vld41h, 1, 6) +DO_VLD4H(vld42h, 2, 7) +DO_VLD4H(vld43h, 3, 4) + +DO_VLD4W(vld40w, 0, 1, 10, 11) +DO_VLD4W(vld41w, 2, 3, 12, 13) +DO_VLD4W(vld42w, 4, 5, 14, 15) +DO_VLD4W(vld43w, 6, 7, 8, 9) + +#define DO_VLD2B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint8_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 2; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 4; e++, data >>= 8) { \ + qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \ + qd[H1(off[beat] + (e >> 1))] = data; \ + } \ + } \ + } + +#define DO_VLD2H(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + int e; \ + uint16_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 2; e++, data >>= 16) { \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \ + qd[H2(off[beat])] = data; \ + } \ + } \ + } + +#define DO_VLD2W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 
1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat]; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \ + qd[H4(off[beat] >> 3)] = data; \ + } \ + } + +DO_VLD2B(vld20b, 0, 2, 12, 14) +DO_VLD2B(vld21b, 4, 6, 8, 10) + +DO_VLD2H(vld20h, 0, 1, 6, 7) +DO_VLD2H(vld21h, 2, 3, 4, 5) + +DO_VLD2W(vld20w, 0, 4, 24, 28) +DO_VLD2W(vld21w, 8, 12, 16, 20) + +#define DO_VST4B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = 0; \ + for (e = 3; e >= 0; e--) { \ + uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \ + data = (data << 8) | qd[H1(off[beat])]; \ + } \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +#define DO_VST4H(OP, O1, O2) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O1, O2, O2 }; \ + uint32_t addr, data; \ + int y; /* y counts 0 2 0 2 */ \ + uint16_t *qd; \ + for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 8 + (beat & 1) * 4; \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \ + data = qd[H2(off[beat])]; \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \ + data |= qd[H2(off[beat])] << 16; \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +#define DO_VST4W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = 
mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + int y; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + y = (beat + (O1 & 2)) & 3; \ + qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \ + data = qd[H4(off[beat] >> 2)]; \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +DO_VST4B(vst40b, 0, 1, 10, 11) +DO_VST4B(vst41b, 2, 3, 12, 13) +DO_VST4B(vst42b, 4, 5, 14, 15) +DO_VST4B(vst43b, 6, 7, 8, 9) + +DO_VST4H(vst40h, 0, 5) +DO_VST4H(vst41h, 1, 6) +DO_VST4H(vst42h, 2, 7) +DO_VST4H(vst43h, 3, 4) + +DO_VST4W(vst40w, 0, 1, 10, 11) +DO_VST4W(vst41w, 2, 3, 12, 13) +DO_VST4W(vst42w, 4, 5, 14, 15) +DO_VST4W(vst43w, 6, 7, 8, 9) + +#define DO_VST2B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint8_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 2; \ + data = 0; \ + for (e = 3; e >= 0; e--) { \ + qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \ + data = (data << 8) | qd[H1(off[beat] + (e >> 1))]; \ + } \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +#define DO_VST2H(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + int e; \ + uint16_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = 0; \ + for (e = 1; e >= 0; e--) { \ + qd = 
(uint16_t *)aa32_vfp_qreg(env, qnidx + e); \ + data = (data << 16) | qd[H2(off[beat])]; \ + } \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +#define DO_VST2W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + static const uint8_t off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat]; \ + qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \ + data = qd[H4(off[beat] >> 3)]; \ + cpu_stl_le_data_ra(env, addr, data, GETPC()); \ + } \ + } + +DO_VST2B(vst20b, 0, 2, 12, 14) +DO_VST2B(vst21b, 4, 6, 8, 10) + +DO_VST2H(vst20h, 0, 1, 6, 7) +DO_VST2H(vst21h, 2, 3, 4, 5) + +DO_VST2W(vst20w, 0, 4, 24, 28) +DO_VST2W(vst21w, 8, 12, 16, 20) + /* * The mergemask(D, R, M) macro performs the operation "*D = R" but * storing only the bytes which correspond to 1 bits in M, diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d3cb3396863..78229c44c68 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -35,6 +35,7 @@ static inline int vidup_imm(DisasContext *s, int x) typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenLdStIlFn(TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -378,6 +379,99 @@ static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) return do_ldst_sg_imm(s, a, fns[a->w], MO_64); } +static bool do_vldst_il(DisasContext *s, arg_vldst_il *a, MVEGenLdStIlFn *fn, + int addrinc) +{ + TCGv_i32 rn; + + if (!dc_isar_feature(aa32_mve, s) || + 
!mve_check_qreg_bank(s, a->qd) || + !fn || (a->rn == 13 && a->w) || a->rn == 15) { + /* Variously UNPREDICTABLE or UNDEF or related-encoding */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + rn = load_reg(s, a->rn); + /* + * We pass the index of Qd, not a pointer, because the helper must + * access multiple Q registers starting at Qd and working up. + */ + fn(cpu_env, tcg_constant_i32(a->qd), rn); + + if (a->w) { + tcg_gen_addi_i32(rn, rn, addrinc); + store_reg(s, a->rn, rn); + } else { + tcg_temp_free_i32(rn); + } + mve_update_and_store_eci(s); + return true; +} + +/* This macro is just to make the arrays more compact in these functions */ +#define F(N) gen_helper_mve_##N + +static bool trans_VLD2(DisasContext *s, arg_vldst_il *a) +{ + static MVEGenLdStIlFn * const fns[4][4] = { + { F(vld20b), F(vld20h), F(vld20w), NULL, }, + { F(vld21b), F(vld21h), F(vld21w), NULL, }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL }, + }; + if (a->qd > 6) { + return false; + } + return do_vldst_il(s, a, fns[a->pat][a->size], 32); +} + +static bool trans_VLD4(DisasContext *s, arg_vldst_il *a) +{ + static MVEGenLdStIlFn * const fns[4][4] = { + { F(vld40b), F(vld40h), F(vld40w), NULL, }, + { F(vld41b), F(vld41h), F(vld41w), NULL, }, + { F(vld42b), F(vld42h), F(vld42w), NULL, }, + { F(vld43b), F(vld43h), F(vld43w), NULL, }, + }; + if (a->qd > 4) { + return false; + } + return do_vldst_il(s, a, fns[a->pat][a->size], 64); +} + +static bool trans_VST2(DisasContext *s, arg_vldst_il *a) +{ + static MVEGenLdStIlFn * const fns[4][4] = { + { F(vst20b), F(vst20h), F(vst20w), NULL, }, + { F(vst21b), F(vst21h), F(vst21w), NULL, }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL }, + }; + if (a->qd > 6) { + return false; + } + return do_vldst_il(s, a, fns[a->pat][a->size], 32); +} + +static bool trans_VST4(DisasContext *s, arg_vldst_il *a) +{ + static MVEGenLdStIlFn * const fns[4][4] = { + { F(vst40b), F(vst40h), F(vst40w), NULL, 
}, + { F(vst41b), F(vst41h), F(vst41w), NULL, }, + { F(vst42b), F(vst42h), F(vst42w), NULL, }, + { F(vst43b), F(vst43h), F(vst43w), NULL, }, + }; + if (a->qd > 4) { + return false; + } + return do_vldst_il(s, a, fns[a->pat][a->size], 64); +} + +#undef F + static bool trans_VDUP(DisasContext *s, arg_VDUP *a) { TCGv_ptr qd;
From patchwork Thu Jul 29 11:14:55 2021
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 1511179
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 36/53] target/arm: Implement MVE VADD (floating-point)
Date: Thu, 29 Jul 2021 12:14:55 +0100
Message-Id: <20210729111512.16541-37-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

Implement the MVE VADD (floating-point) insn. Handling of this is similar to the 2-operand integer insns, except that we must take care to only update the floating point exception status if the least significant bit of the predicate mask for each element is active.
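
The exception-status rule can be illustrated with a small standalone sketch. This is not the QEMU helper: fp_status, toy_add(), merge_bytes() and vadd32() are hypothetical stand-ins (for float_status, float32_add, mergemask and the generated helper), and the "FP" operation is an integer add that raises a toy overflow flag. It assumes four 32-bit elements with one predicate bit per byte.

```c
#include <stdint.h>

/* Hypothetical stand-in for float_status: only sticky exception flags. */
typedef struct {
    unsigned flags;
} fp_status;

#define TOY_FLAG_OVERFLOW 1u

/* Toy "FP" operation: integer add that records wraparound as an exception. */
static uint32_t toy_add(uint32_t a, uint32_t b, fp_status *st)
{
    uint32_t r = a + b;
    if (r < a) {
        st->flags |= TOY_FLAG_OVERFLOW;
    }
    return r;
}

/* Write only the bytes of r whose predicate bit (one per byte) is set. */
static void merge_bytes(uint32_t *d, uint32_t r, unsigned mask)
{
    uint32_t bmask = 0;
    for (int i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            bmask |= 0xffu << (8 * i);
        }
    }
    *d = (*d & ~bmask) | (r & bmask);
}

/* Four 32-bit elements; mask carries 4 predicate bits per element. */
void vadd32(uint32_t d[4], const uint32_t n[4], const uint32_t m[4],
            uint16_t mask, fp_status *real)
{
    for (int e = 0; e < 4; e++, mask >>= 4) {
        fp_status scratch;
        fp_status *st = real;

        if ((mask & 0xf) == 0) {
            continue; /* element fully predicated out: nothing to do */
        }
        if (!(mask & 1)) {
            /* Byte 0 inactive: compute the result, but discard any flags. */
            scratch = *real;
            st = &scratch;
        }
        merge_bytes(&d[e], toy_add(n[e], m[e], st), mask & 0xf);
    }
}
```

In this sketch a partially predicated element still has its active bytes updated, but any exception it raises lands in the scratch copy and is dropped unless byte 0 of that element is active.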
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 3 +++ target/arm/translate.h | 6 ++++++ target/arm/mve.decode | 10 ++++++++++ target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 17 +++++++++++++++++ target/arm/translate-neon.c | 6 ------ 6 files changed, 73 insertions(+), 6 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 3db9b15f121..32fd2e1f9be 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -410,6 +410,9 @@ DEF_HELPER_FLAGS_4(mve_vhcadd270b, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vhcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vhcadd270w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfadds, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/translate.h b/target/arm/translate.h index 241596c5bda..8636c20c3b4 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -181,6 +181,12 @@ static inline int rsub_8(DisasContext *s, int x) return 8 - x; } +static inline int neon_3same_fp_size(DisasContext *s, int x) +{ + /* Convert 0==fp32, 1==fp16 into a MO_* value */ + return MO_32 - x; +} + static inline int arm_dc_feature(DisasContext *dc, int feature) { return (dc->features & (1ULL << feature)) != 0; diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 87446816293..e211cb016c6 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -26,6 +26,10 @@ # VQDMULL has size in bit 28: 0 for 16 bit, 1 for 32 bit %size_28 28:1 !function=plus_1 +# 2 operand fp insns have size in bit 20: 1 for 16 bit, 0 for 32 
bit, +# like Neon FP insns. +%2op_fp_size 20:1 !function=neon_3same_fp_size + # 1imm format immediate %imm_28_16_0 28:1 16:3 0:4 @@ -118,6 +122,9 @@ @vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm +@2op_fp .... .... .... .... .... .... .... .... &2op \ + qd=%qd qn=%qn qm=%qm size=%2op_fp_size + # Vector loads and stores # Widening loads and narrowing stores: @@ -615,3 +622,6 @@ VCMPGE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 0 0 .... @vcmp_scalar VCMPLT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 0 0 .... @vcmp_scalar VCMPGT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 1 0 .... @vcmp_scalar VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar + +# 2-operand FP +VADD_fp 1110 1111 0 . 0 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index c2826eb5f9f..ff087e9d3a4 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -25,6 +25,7 @@ #include "exec/cpu_ldst.h" #include "exec/exec-all.h" #include "tcg/tcg.h" +#include "fpu/softfloat.h" static uint16_t mve_eci_mask(CPUARMState *env) { @@ -2798,3 +2799,39 @@ DO_VMAXMINA(vmaxaw, 4, int32_t, uint32_t, DO_MAX) DO_VMAXMINA(vminab, 1, int8_t, uint8_t, DO_MIN) DO_VMAXMINA(vminah, 2, int16_t, uint16_t, DO_MIN) DO_VMAXMINA(vminaw, 4, int32_t, uint32_t, DO_MIN) + +/* + * 2-operand floating point. Note that if an element is partially + * predicated we must do the FP operation to update the non-predicated + * bytes, but we must be careful to avoid updating the FP exception + * state unless byte 0 of the element was unpredicated. 
+ */ +#define DO_2OP_FP(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, void *vm) \ + { \ + TYPE *d = vd, *n = vn, *m = vm; \ + TYPE r; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)], fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_2OP_FP(vfaddh, 2, uint16_t, float16_add) +DO_2OP_FP(vfadds, 4, uint32_t, float32_add) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 78229c44c68..d2c40ede564 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -831,6 +831,23 @@ static bool trans_VSBCI(DisasContext *s, arg_2op *a) return do_2op(s, a, gen_helper_mve_vsbci); } +#define DO_2OP_FP(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_2op *a) \ + { \ + static MVEGenTwoOpFn * const fns[] = { \ + NULL, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##s, \ + NULL, \ + }; \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_2op(s, a, fns[a->size]); \ + } + +DO_2OP_FP(VADD_fp, vfadd) + static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, MVEGenTwoOpScalarFn fn) { diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c index c53ab20fa48..dd43de558e4 100644 --- a/target/arm/translate-neon.c +++ b/target/arm/translate-neon.c @@ -28,12 +28,6 @@ #include "translate.h" #include "translate-a32.h" -static inline int neon_3same_fp_size(DisasContext *s, int x) -{ - /* Convert 0==fp32, 1==fp16 into a MO_* value */ - return MO_32 - x; 
-} + /* Include the generated Neon decoder */ #include "decode-neon-dp.c.inc" #include "decode-neon-ls.c.inc"
From patchwork Thu Jul 29 11:14:56 2021
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 1511199
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 37/53] target/arm: Implement MVE VSUB, VMUL, VABD, VMAXNM, VMINNM
Date: Thu, 29 Jul 2021 12:14:56 +0100
Message-Id: <20210729111512.16541-38-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

Implement more simple 2-operand floating point MVE insns.
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 15 +++++++++++++++ target/arm/mve.decode | 6 ++++++ target/arm/mve_helper.c | 24 ++++++++++++++++++++++++ target/arm/translate-mve.c | 5 +++++ 4 files changed, 50 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 32fd2e1f9be..370876d7934 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -413,6 +413,21 @@ DEF_HELPER_FLAGS_4(mve_vhcadd270w, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfadds, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfsubh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfsubs, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vfmulh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfmuls, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vfabdh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfabds, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vmaxnmh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmaxnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vminnmh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vminnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index e211cb016c6..cdbfaa4245b 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -625,3 +625,9 @@ VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar # 2-operand FP VADD_fp 1110 1111 0 . 0 . ... 0 ... 0 1101 . 1 . 0 ... 
0 @2op_fp +VSUB_fp 1110 1111 0 . 1 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp +VMUL_fp 1111 1111 0 . 0 . ... 0 ... 0 1101 . 1 . 1 ... 0 @2op_fp +VABD_fp 1111 1111 0 . 1 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp + +VMAXNM 1111 1111 0 . 0 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp +VMINNM 1111 1111 0 . 1 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index ff087e9d3a4..e0e3e30de68 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2835,3 +2835,27 @@ DO_VMAXMINA(vminaw, 4, int32_t, uint32_t, DO_MIN) DO_2OP_FP(vfaddh, 2, uint16_t, float16_add) DO_2OP_FP(vfadds, 4, uint32_t, float32_add) + +DO_2OP_FP(vfsubh, 2, uint16_t, float16_sub) +DO_2OP_FP(vfsubs, 4, uint32_t, float32_sub) + +DO_2OP_FP(vfmulh, 2, uint16_t, float16_mul) +DO_2OP_FP(vfmuls, 4, uint32_t, float32_mul) + +static inline float16 float16_abd(float16 a, float16 b, float_status *s) +{ + return float16_abs(float16_sub(a, b, s)); +} + +static inline float32 float32_abd(float32 a, float32 b, float_status *s) +{ + return float32_abs(float32_sub(a, b, s)); +} + +DO_2OP_FP(vfabdh, 2, uint16_t, float16_abd) +DO_2OP_FP(vfabds, 4, uint32_t, float32_abd) + +DO_2OP_FP(vmaxnmh, 2, uint16_t, float16_maxnum) +DO_2OP_FP(vmaxnms, 4, uint32_t, float32_maxnum) +DO_2OP_FP(vminnmh, 2, uint16_t, float16_minnum) +DO_2OP_FP(vminnms, 4, uint32_t, float32_minnum) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d2c40ede564..98282335820 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -847,6 +847,11 @@ static bool trans_VSBCI(DisasContext *s, arg_2op *a) } DO_2OP_FP(VADD_fp, vfadd) +DO_2OP_FP(VSUB_fp, vfsub) +DO_2OP_FP(VMUL_fp, vfmul) +DO_2OP_FP(VABD_fp, vfabd) +DO_2OP_FP(VMAXNM, vmaxnm) +DO_2OP_FP(VMINNM, vminnm) static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, MVEGenTwoOpScalarFn fn) From patchwork Thu Jul 29 11:14:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511188 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=OLIv2jd2; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7n15Svyz9sSs for ; Thu, 29 Jul 2021 21:37:40 +1000 (AEST) Received: from localhost ([::1]:59708 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94M4-0005Jp-LR for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:37:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40888) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941R-0001If-7r for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:18 -0400 Received: from mail-wm1-x330.google.com ([2a00:1450:4864:20::330]:40661) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940y-0001MX-Mf for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:16 -0400 Received: by mail-wm1-x330.google.com with SMTP id f18-20020a05600c4e92b0290253c32620e7so6335441wmq.5 for ; Thu, 29 Jul 2021 04:15:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=+4FNPh8wWx4RXyOj6ISacCnC2M23ncHaene1C3j/YaM=; 
b=OLIv2jd2M+H/HHSmFzJHUZDbUlbGmLk/F5MFwmVvPoLSpSfFRVZerT34z86zGyRqIs KkkmKGdtdgLOvEdh8WK7PgE5HEEbq3kaRw92Ii2PY0ESHyqOvY/HPGPpziY9bt2eiYP5 gVsHWhCWEDbLKRmKs9GLDIlQjVjL/96VpK9B+VDozYpaMWUXphpPPT7zR+Jq0jTYWDD1 p+V4auKPmnx9WjxRNIy25RRhhoLI6UFJF4X+o2O9EroIJbMc4QcwLKdpkuuWIpybC8D8 S2T9PEZzbaM+8mPhK+pE+oZV2NydLXLRgj7h4qg1SX/Y/GAYD8B98cYRkWqKiQ+6UnXO rokQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+4FNPh8wWx4RXyOj6ISacCnC2M23ncHaene1C3j/YaM=; b=pwbMtiCSWMI1c1Lhx3jJUavVkHj+vSVuyxGIAyCsisVThbHx7b7/8cdP6KnZ/Ws6p4 BL8vJlp2//a/9CcSafbe01UB1OuKakCNejsRhXhkW7svEci3GhzG/ihxjWHZ1HIVf6a9 j4vM/0RSwGNfYr9hQ9nJT8CGFO0poSwd5D0A6eSYW6cMs8cfBIwsooyOFiQIanMb0e9i hc6bGnQjZRULXdgTOrO53pdyZNY7JnoycT2ZbciWXTdLmi+eXMZMMWZSTAR14j7DJj2C fHuZZHoFTZ9OvqLY1VrVTG/Soqe92B/IH50oCO/By+fUOuoHB/o8DB11yn9xQ2m5Vk/M bdDA== X-Gm-Message-State: AOAM531g0VMuwl8iUgqJVujWky0Nk5xhUP8blV4+gxuJM+iVjRbPwZcD rV+y3HxR7jLR9JexpH06G/Z+qECZWTJxFQ== X-Google-Smtp-Source: ABdhPJyY4KcfwvKiiWmkc5Y17bIu4dt0r3JZT2TtlS1gtJhhOQuJLTagBDCGYz+AqhqL7sn0MixK0g== X-Received: by 2002:a7b:c5d8:: with SMTP id n24mr4225734wmk.51.1627557346740; Thu, 29 Jul 2021 04:15:46 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:46 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 38/53] target/arm: Implement MVE VCADD Date: Thu, 29 Jul 2021 12:14:57 +0100 Message-Id: <20210729111512.16541-39-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::330; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x330.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCADD insn. Note that here the size bit is the opposite sense to the other 2-operand fp insns. We don't check for the sz == 1 && Qd == Qm UNPREDICTABLE case, because that would mean we can't use the DO_2OP_FP macro in translate-mve.c. 
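For readers following the helper in mve_helper.c, here is a minimal sketch of the lane arithmetic that DO_VCADD_FP performs. It is not the QEMU code. It assumes host "float" instead of softfloat float32, assumes four 32-bit lanes, and ignores predication, float_status handling and the H4() host-endianness macro. The names vcadd_sketch, LANES and the rot270 parameter are hypothetical. As in the helper, all results are computed into a temporary first, so the destination may alias either input.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* assumption: four 32-bit lanes in a 128-bit Q register */

/*
 * Sketch of VCADD on interleaved (real, imaginary) pairs.
 * rot270 == 0 selects the 90 degree form, nonzero the 270 degree form.
 * Even lanes pair with the next lane of m, odd lanes with the previous one,
 * mirroring the FN0/FN1 split in DO_VCADD_FP.
 */
static void vcadd_sketch(float d[LANES], const float n[LANES],
                         const float m[LANES], int rot270)
{
    float r[LANES];
    size_t e;

    assert(d != NULL && n != NULL && m != NULL);

    /* Calculate all results first so that d may overlap n or m. */
    for (e = 0; e < LANES; e++) {
        if (!(e & 1)) {
            /* rot 90 uses sub here (FN0), rot 270 uses add */
            r[e] = rot270 ? n[e] + m[e + 1] : n[e] - m[e + 1];
        } else {
            /* rot 90 uses add here (FN1), rot 270 uses sub */
            r[e] = rot270 ? n[e] - m[e - 1] : n[e] + m[e - 1];
        }
    }
    for (e = 0; e < LANES; e++) {
        d[e] = r[e];
    }
}
```

Reading each lane pair as a complex number, the 90 degree form computes n + i*m and the 270 degree form computes n - i*m.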
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 6 ++++++ target/arm/mve.decode | 8 ++++++++ target/arm/mve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 4 +++- 4 files changed, 57 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 370876d7934..42eba8ea96d 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -428,6 +428,12 @@ DEF_HELPER_FLAGS_4(mve_vmaxnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vminnmh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vminnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfcadd90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfcadd90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vfcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfcadd270s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index cdbfaa4245b..c728c7089ac 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -29,6 +29,8 @@ # 2 operand fp insns have size in bit 20: 1 for 16 bit, 0 for 32 bit, # like Neon FP insns. %2op_fp_size 20:1 !function=neon_3same_fp_size +# VCADD is an exception, where bit 20 is 0 for 16 bit and 1 for 32 bit +%2op_fp_size_rev 20:1 !function=plus_1 # 1imm format immediate %imm_28_16_0 28:1 16:3 0:4 @@ -125,6 +127,9 @@ @2op_fp .... .... .... .... .... .... .... .... &2op \ qd=%qd qn=%qn qm=%qm size=%2op_fp_size +@2op_fp_size_rev .... .... .... .... .... .... .... .... 
&2op \ + qd=%qd qn=%qn qm=%qm size=%2op_fp_size_rev + # Vector loads and stores # Widening loads and narrowing stores: @@ -631,3 +636,6 @@ VABD_fp 1111 1111 0 . 1 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp VMAXNM 1111 1111 0 . 0 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp VMINNM 1111 1111 0 . 1 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp + +VCADD90_fp 1111 1100 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev +VCADD270_fp 1111 1101 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index e0e3e30de68..fd6ff167849 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2859,3 +2859,43 @@ DO_2OP_FP(vmaxnmh, 2, uint16_t, float16_maxnum) DO_2OP_FP(vmaxnms, 4, uint32_t, float32_maxnum) DO_2OP_FP(vminnmh, 2, uint16_t, float16_minnum) DO_2OP_FP(vminnms, 4, uint32_t, float32_minnum) + +#define DO_VCADD_FP(OP, ESIZE, TYPE, FN0, FN1) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, void *vm) \ + { \ + TYPE *d = vd, *n = vn, *m = vm; \ + TYPE r[16 / ESIZE]; \ + uint16_t tm, mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + /* Calculate all results first to avoid overwriting inputs */ \ + for (e = 0, tm = mask; e < 16 / ESIZE; e++, tm >>= ESIZE) { \ + if ((tm & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + r[e] = 0; \ + continue; \ + } \ + fpst = (ESIZE == 2) ? 
&env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(tm & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + if (!(e & 1)) { \ + r[e] = FN0(n[H##ESIZE(e)], m[H##ESIZE(e + 1)], fpst); \ + } else { \ + r[e] = FN1(n[H##ESIZE(e)], m[H##ESIZE(e - 1)], fpst); \ + } \ + } \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], r[e], mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VCADD_FP(vfcadd90h, 2, uint16_t, float16_sub, float16_add) +DO_VCADD_FP(vfcadd90s, 4, uint32_t, float32_sub, float32_add) +DO_VCADD_FP(vfcadd270h, 2, uint16_t, float16_add, float16_sub) +DO_VCADD_FP(vfcadd270s, 4, uint32_t, float32_add, float32_sub) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 98282335820..6203e3ff916 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -852,6 +852,8 @@ DO_2OP_FP(VMUL_fp, vfmul) DO_2OP_FP(VABD_fp, vfabd) DO_2OP_FP(VMAXNM, vmaxnm) DO_2OP_FP(VMINNM, vminnm) +DO_2OP_FP(VCADD90_fp, vfcadd90) +DO_2OP_FP(VCADD270_fp, vfcadd270) static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, MVEGenTwoOpScalarFn fn) @@ -883,7 +885,7 @@ static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, return true; } -#define DO_2OP_SCALAR(INSN, FN) \ +#define DO_2OP_SCALAR(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_2scalar *a) \ { \ static MVEGenTwoOpScalarFn * const fns[] = { \ From patchwork Thu Jul 29 11:14:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511203 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; 
receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ykrcNS8A; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb85h3bH2z9sSs for ; Thu, 29 Jul 2021 21:52:08 +1000 (AEST) Received: from localhost ([::1]:40300 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94a6-0006Jc-5A for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:52:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40878) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941Q-0001Id-KI for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:18 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:40883) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940y-0001Mj-P7 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:16 -0400 Received: by mail-wr1-x429.google.com with SMTP id p5so6464102wro.7 for ; Thu, 29 Jul 2021 04:15:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=lyRTX/hDBkPMXx5zFtZ53SkkChNk+0nncZrvsfyNOxs=; b=ykrcNS8AkfuW5Gjcf2OaG91KgJ226pCFkE87yKjw/GQN57b1bA91JbswfdAW57OW7s fOJc9bIAjaPYVi33ckxPcA0d0U3RCoWnV1V62zpoLlOJkc/taMXLELwc9yud2GJvNx5q zil3OrT1BiiAMGTRXady6RjkqqT3yPndddfmoeQIMQgVGAp4C31D6KIjsn1B1q+ppwXZ YfjTI42VpXa+C6Ijq44CqMP8cZSVKXGCr3H25xFKN794Wpl/bRhWGoY5ByhlN816YYHr PCePOUTl76Q+SIlI/eELCFDgw8jRTWVUwyk7z+tqjE+JIHZU15BoMS1vJCFLs7SW9x98 SL4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lyRTX/hDBkPMXx5zFtZ53SkkChNk+0nncZrvsfyNOxs=; b=O6YXkrlOYsn2OcqaE8v+tCKeDpaNHfP2TXMxDzgSR2nKqlVQvSi7kIICx0ESyWirpG ivThKhtDQFiqBDj8ailIoFGgkuNTjlJjHkbSk0rNO8cLYFyFoTBTMrX/Uy7ZYmmc3t7q UQR4FXbhPtxiXx2AKo2ngFhmGD96SWy2nCbqlY+MxsRPca1DE591++dVPP+SnO9yXKhB agpakEZgwtHpBKb/fgzkWoNCfP40GLxFXYiyXzzCSsiAgvYnxMXo8CVSso1eMWrUk+qb p/NoT+C0lki/4aQNzWZMD13OJ57pzfVHH6uTgznP/Zl21Ge3Ivx4UkxICKf8dTMcOYDl NWIw== X-Gm-Message-State: AOAM532UnvLoncUQdmSwvuVE4gXx5EkQTlivLISznqRDEEaGSQguQ0pM GhhY6UJuRg6DCaPXqhCk8FMj2Cyklfb1sQ== X-Google-Smtp-Source: ABdhPJxLeK3Loh42b6uGRRHQVDtC+Oeu3gxPOF3m0iAO4OXHypM7CtU8DidowCpXrKlPC+h1NHITXg== X-Received: by 2002:adf:f302:: with SMTP id i2mr4208007wro.186.1627557347454; Thu, 29 Jul 2021 04:15:47 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:47 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 39/53] target/arm: Implement MVE VFMA and VFMS Date: Thu, 29 Jul 2021 12:14:58 +0100 Message-Id: <20210729111512.16541-40-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 
Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VFMA and VFMS insns. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 6 ++++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 47 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 42eba8ea96d..c230610d25c 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -434,6 +434,12 @@ DEF_HELPER_FLAGS_4(mve_vfcadd90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfcadd270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfcadd270s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfmah, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfmas, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vfmsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vfmss, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index c728c7089ac..3a2056f6b34 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -639,3 +639,6 @@ VMINNM 1111 1111 0 . 1 . ... 0 ... 0 1111 . 1 . 1 ... 0 @2op_fp VCADD90_fp 1111 1100 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev VCADD270_fp 1111 1101 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev + +VFMA 1110 1111 0 . 0 . ... 0 ... 0 1100 . 1 . 1 ... 0 @2op_fp +VFMS 1110 1111 0 . 1 . ... 0 ... 0 1100 . 1 . 1 ... 
0 @2op_fp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index fd6ff167849..0146137d18f 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2899,3 +2899,39 @@ DO_VCADD_FP(vfcadd90h, 2, uint16_t, float16_sub, float16_add) DO_VCADD_FP(vfcadd90s, 4, uint32_t, float32_sub, float32_add) DO_VCADD_FP(vfcadd270h, 2, uint16_t, float16_add, float16_sub) DO_VCADD_FP(vfcadd270s, 4, uint32_t, float32_add, float32_sub) + +#define DO_VFMA(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, void *vm) \ + { \ + TYPE *d = vd, *n = vn, *m = vm; \ + TYPE r; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)], d[H##ESIZE(e)], \ + 0, fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VFMS16(N, M, D, F, S) float16_muladd(float16_chs(N), M, D, F, S) +#define DO_VFMS32(N, M, D, F, S) float32_muladd(float32_chs(N), M, D, F, S) + +DO_VFMA(vfmah, 2, uint16_t, float16_muladd) +DO_VFMA(vfmas, 4, uint32_t, float32_muladd) +DO_VFMA(vfmsh, 2, uint16_t, DO_VFMS16) +DO_VFMA(vfmss, 4, uint32_t, DO_VFMS32) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 6203e3ff916..d61abc6d46f 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -854,6 +854,8 @@ DO_2OP_FP(VMAXNM, vmaxnm) DO_2OP_FP(VMINNM, vminnm) DO_2OP_FP(VCADD90_fp, vfcadd90) DO_2OP_FP(VCADD270_fp, vfcadd270) +DO_2OP_FP(VFMA, vfma) +DO_2OP_FP(VFMS, vfms) static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, 
MVEGenTwoOpScalarFn fn) From patchwork Thu Jul 29 11:14:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511196 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=P1odjIMT; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7y32dbFz9sSs for ; Thu, 29 Jul 2021 21:45:31 +1000 (AEST) Received: from localhost ([::1]:48566 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94Th-0000Ta-1d for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:45:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40912) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941R-0001Ig-OS for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:18 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]:43002) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m940z-0001My-KA for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:17 -0400 Received: by mail-wm1-x329.google.com with SMTP id e25-20020a05600c4b99b0290253418ba0fbso3781142wmp.1 for ; Thu, 29 Jul 2021 04:15:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=cK3D9jr6c7rzAtNnX4xQNMbIJISTN5Ct4CVgZ3O8Kso=; b=P1odjIMTRYjYSFFBtb1yZpj1h5TcF/+2wj7oDXOHhgMpqdRbUOhFtg2UZn/iw1ZYK+ Axn+6eXfo3HI532e98hy6iysWIQ5NKgYkvtsjBbatMXV6DAvgTZUAHVaDLBt3o44uRvv /loKG09RLI6UwNmpuxiMLAhngZA84gUyK3uuU/+GWXc4lK+PWnuyK8GbofgV3MATnHAx ZXqx5yo+pqRGRfLdegW3gPV9egkDk99zL7wuAlsskPYeSTp1LVrxqoamCt0plqqQz/tQ VtFhmZPL3B4rJIHvAR/A19vTLoNQOeBgXwnPMSfNkVarkmdY4Mfyh6WZmPhPUtubPdxs vGjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cK3D9jr6c7rzAtNnX4xQNMbIJISTN5Ct4CVgZ3O8Kso=; b=YQy6z4tsvYr7FsMnpyb/dEbs49q9zJOTnH5fzLu4fDGzjmSEbhMF/0gvc42MtF0k4l cKv7v78MSM5+prCFH+EKMqOY7mTU/8CqgCFwv4s3um2a0I3jbbsMyhfnhTOscm3p8zIx jzn2osTTmAxnA91H/H/m9DoRtoePaKnxA9qfarvt5Ysk5SFSIvk0NVVxISNZTHgAmn3e 4EWyNJrVi2+hLRXZN7kSxVE67nsPFAhdl0/e4CoMDoMmcZyrYDSBcfNXNGRtbUCeFTnS WAFEhXZHmV76yHmm96NzN7S3ShEWKKzWSg2MFwMgJ1q/wj9RhineXkUAASoTRG79OwQq BuiA== X-Gm-Message-State: AOAM530UB8aIrY0xi1rp2+nkr+5SlVz8VZSEYDgw9T7TQq3+HHy1T32m YNWqUKnPyKLOU5NbAI99/o4kNw== X-Google-Smtp-Source: ABdhPJxx2cU0lEnt8KG2fYZkFzdrjm1PgSALqJQpakh8odznNYWHQgOKHqoa0n6iqbuUS26bUg+opw== X-Received: by 2002:a1c:4409:: with SMTP id r9mr4241289wma.150.1627557348260; Thu, 29 Jul 2021 04:15:48 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:47 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 40/53] target/arm: Implement MVE VCMUL and VCMLA Date: Thu, 29 Jul 2021 12:14:59 +0100 Message-Id: <20210729111512.16541-41-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x329.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCMUL and VCMLA insns. 
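As an aid to reading the ROT switch in DO_VCMLA, the following is a rough sketch of the accumulate form on plain host floats. It is an illustration under stated assumptions, not the QEMU implementation: the real helper uses softfloat float16/float32 fused muladd, honours predication through mergemask(), and tracks float_status per lane. Here an unfused multiply and add stands in for muladd, and vcmla_sketch and LANES are hypothetical names. The rot argument uses the same 0..3 encoding as the ROT macro parameter (0, 90, 180, 270 degrees).

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* assumption: four 32-bit lanes in a 128-bit Q register */

/*
 * Sketch of VCMLA: d += partial complex product of n and m, selected by rot.
 * Lanes are interleaved (real, imaginary) pairs, processed two at a time.
 */
static void vcmla_sketch(float d[LANES], const float n[LANES],
                         const float m[LANES], int rot)
{
    size_t e;

    assert(rot >= 0 && rot <= 3);

    for (e = 0; e < LANES; e += 2) {
        float e1, e2, e3, e4, r0, r1;

        switch (rot) {
        case 0:
            e1 = m[e];      e2 = n[e];     e3 = m[e + 1];  e4 = n[e];
            break;
        case 1:
            e1 = -m[e + 1]; e2 = n[e + 1]; e3 = m[e];      e4 = n[e + 1];
            break;
        case 2:
            e1 = -m[e];     e2 = n[e];     e3 = -m[e + 1]; e4 = n[e];
            break;
        default: /* rot == 3 */
            e1 = m[e + 1];  e2 = n[e + 1]; e3 = -m[e];     e4 = n[e + 1];
            break;
        }
        /* Compute both results before writing, as the helper does. */
        r0 = e2 * e1 + d[e];
        r1 = e4 * e3 + d[e + 1];
        d[e] = r0;
        d[e + 1] = r1;
    }
}
```

Applying rot 0 and then rot 1 to a zeroed accumulator yields the full complex product n * m for each lane pair: for example (1 + 2i) * (3 + 4i) accumulates to (3, 4) after rot 0 and to (-5, 10) after rot 1.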
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 18 ++++++++ target/arm/mve.decode | 35 ++++++++++++---- target/arm/mve_helper.c | 86 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 8 ++++ 4 files changed, 139 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index c230610d25c..73950403bc3 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -440,6 +440,24 @@ DEF_HELPER_FLAGS_4(mve_vfmas, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfmsh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfmss, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul0h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul0s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul180h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul180s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmul270s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vcmla0h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla0s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla180h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla180s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla270h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcmla270s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vadd_scalarb, TCG_CALL_NO_WG, void, env, ptr, ptr, 
i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vadd_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 3a2056f6b34..403381eef61 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -286,15 +286,29 @@ VQSHL_U 111 1 1111 0 . .. ... 0 ... 0 0100 . 1 . 1 ... 0 @2op_rev VQRSHL_S 111 0 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev VQRSHL_U 111 1 1111 0 . .. ... 0 ... 0 0101 . 1 . 1 ... 0 @2op_rev -VQDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op -VQDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op -VQRDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op -VQRDMLADHX 1110 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op +{ + VCMUL0 111 . 1110 0 . 11 ... 0 ... 0 1110 . 0 . 0 ... 0 @2op_sz28 + VQDMLADH 1110 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op + VQDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op +} -VQDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 0 @2op -VQDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op -VQRDMLSDH 1111 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op -VQRDMLSDHX 1111 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op +{ + VCMUL180 111 . 1110 0 . 11 ... 0 ... 1 1110 . 0 . 0 ... 0 @2op_sz28 + VQDMLADHX 111 0 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op + VQDMLSDHX 111 1 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 0 @2op +} + +{ + VCMUL90 111 . 1110 0 . 11 ... 0 ... 0 1110 . 0 . 0 ... 1 @2op_sz28 + VQRDMLADH 111 0 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op + VQRDMLSDH 111 1 1110 0 . .. ... 0 ... 0 1110 . 0 . 0 ... 1 @2op +} + +{ + VCMUL270 111 . 1110 0 . 11 ... 0 ... 1 1110 . 0 . 0 ... 1 @2op_sz28 + VQRDMLADHX 111 0 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op + VQRDMLSDHX 111 1 1110 0 . .. ... 0 ... 1 1110 . 0 . 0 ... 1 @2op +} VQDMULLB 111 . 1110 0 . 11 ... 0 ... 0 1111 . 0 . 0 ... 1 @2op_sz28 VQDMULLT 111 . 1110 0 . 11 ... 0 ... 1 1111 . 0 . 
0 ... 1 @2op_sz28 @@ -642,3 +656,8 @@ VCADD270_fp 1111 1101 1 . 0 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_ VFMA 1110 1111 0 . 0 . ... 0 ... 0 1100 . 1 . 1 ... 0 @2op_fp VFMS 1110 1111 0 . 1 . ... 0 ... 0 1100 . 1 . 1 ... 0 @2op_fp + +VCMLA0 1111 110 00 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev +VCMLA90 1111 110 01 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev +VCMLA180 1111 110 10 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev +VCMLA270 1111 110 11 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 0146137d18f..489892344b4 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2935,3 +2935,89 @@ DO_VFMA(vfmah, 2, uint16_t, float16_muladd) DO_VFMA(vfmas, 4, uint32_t, float32_muladd) DO_VFMA(vfmsh, 2, uint16_t, DO_VFMS16) DO_VFMA(vfmss, 4, uint32_t, DO_VFMS32) + +#define DO_VCMLA(OP, ESIZE, TYPE, ROT, CHS, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, void *vm) \ + { \ + TYPE *d = vd, *n = vn, *m = vm; \ + TYPE r0, r1, e1, e2, e3, e4; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst0, *fpst1; \ + float_status scratch_fpst; \ + /* We loop through pairs of elements at a time */ \ + for (e = 0; e < 16 / ESIZE; e += 2, mask >>= ESIZE * 2) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE * 2)) == 0) { \ + continue; \ + } \ + fpst0 = (ESIZE == 2) ? 
&env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + fpst1 = fpst0; \ + if (!(mask & 1)) { \ + scratch_fpst = *fpst0; \ + fpst0 = &scratch_fpst; \ + } \ + if (!(mask & (1 << ESIZE))) { \ + scratch_fpst = *fpst1; \ + fpst1 = &scratch_fpst; \ + } \ + switch (ROT) { \ + case 0: \ + e1 = m[H##ESIZE(e)]; \ + e2 = n[H##ESIZE(e)]; \ + e3 = m[H##ESIZE(e + 1)]; \ + e4 = n[H##ESIZE(e)]; \ + break; \ + case 1: \ + e1 = CHS(m[H##ESIZE(e + 1)]); \ + e2 = n[H##ESIZE(e + 1)]; \ + e3 = m[H##ESIZE(e)]; \ + e4 = n[H##ESIZE(e + 1)]; \ + break; \ + case 2: \ + e1 = CHS(m[H##ESIZE(e)]); \ + e2 = n[H##ESIZE(e)]; \ + e3 = CHS(m[H##ESIZE(e + 1)]); \ + e4 = n[H##ESIZE(e)]; \ + break; \ + case 3: \ + e1 = m[H##ESIZE(e + 1)]; \ + e2 = n[H##ESIZE(e + 1)]; \ + e3 = CHS(m[H##ESIZE(e)]); \ + e4 = n[H##ESIZE(e + 1)]; \ + break; \ + default: \ + g_assert_not_reached(); \ + } \ + r0 = FN(e2, e1, d[H##ESIZE(e)], fpst0); \ + r1 = FN(e4, e3, d[H##ESIZE(e + 1)], fpst1); \ + mergemask(&d[H##ESIZE(e)], r0, mask); \ + mergemask(&d[H##ESIZE(e + 1)], r1, mask >> ESIZE); \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VCMULH(N, M, D, S) float16_mul(N, M, S) +#define DO_VCMULS(N, M, D, S) float32_mul(N, M, S) + +#define DO_VCMLAH(N, M, D, S) float16_muladd(N, M, D, 0, S) +#define DO_VCMLAS(N, M, D, S) float32_muladd(N, M, D, 0, S) + +DO_VCMLA(vcmul0h, 2, uint16_t, 0, float16_chs, DO_VCMULH) +DO_VCMLA(vcmul0s, 4, uint32_t, 0, float32_chs, DO_VCMULS) +DO_VCMLA(vcmul90h, 2, uint16_t, 1, float16_chs, DO_VCMULH) +DO_VCMLA(vcmul90s, 4, uint32_t, 1, float32_chs, DO_VCMULS) +DO_VCMLA(vcmul180h, 2, uint16_t, 2, float16_chs, DO_VCMULH) +DO_VCMLA(vcmul180s, 4, uint32_t, 2, float32_chs, DO_VCMULS) +DO_VCMLA(vcmul270h, 2, uint16_t, 3, float16_chs, DO_VCMULH) +DO_VCMLA(vcmul270s, 4, uint32_t, 3, float32_chs, DO_VCMULS) + +DO_VCMLA(vcmla0h, 2, uint16_t, 0, float16_chs, DO_VCMLAH) +DO_VCMLA(vcmla0s, 4, uint32_t, 0, float32_chs, DO_VCMLAS) +DO_VCMLA(vcmla90h, 2, uint16_t, 1, float16_chs, DO_VCMLAH) 
+DO_VCMLA(vcmla90s, 4, uint32_t, 1, float32_chs, DO_VCMLAS) +DO_VCMLA(vcmla180h, 2, uint16_t, 2, float16_chs, DO_VCMLAH) +DO_VCMLA(vcmla180s, 4, uint32_t, 2, float32_chs, DO_VCMLAS) +DO_VCMLA(vcmla270h, 2, uint16_t, 3, float16_chs, DO_VCMLAH) +DO_VCMLA(vcmla270s, 4, uint32_t, 3, float32_chs, DO_VCMLAS) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d61abc6d46f..d62ed1fc295 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -856,6 +856,14 @@ DO_2OP_FP(VCADD90_fp, vfcadd90) DO_2OP_FP(VCADD270_fp, vfcadd270) DO_2OP_FP(VFMA, vfma) DO_2OP_FP(VFMS, vfms) +DO_2OP_FP(VCMUL0, vcmul0) +DO_2OP_FP(VCMUL90, vcmul90) +DO_2OP_FP(VCMUL180, vcmul180) +DO_2OP_FP(VCMUL270, vcmul270) +DO_2OP_FP(VCMLA0, vcmla0) +DO_2OP_FP(VCMLA90, vcmla90) +DO_2OP_FP(VCMLA180, vcmla180) +DO_2OP_FP(VCMLA270, vcmla270) static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, MVEGenTwoOpScalarFn fn) From patchwork Thu Jul 29 11:15:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511201 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=urv6983r; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb84c2Lw3z9sSs for ; Thu, 29 Jul 2021 21:51:12 +1000 (AEST) Received: from localhost ([::1]:36570 helo=lists1p.gnu.org) by lists.gnu.org with esmtp 
(Exim 4.90_1) (envelope-from ) id 1m94ZB-0003er-VB for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:51:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40936) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941T-0001Jr-Px for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:20 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:47097) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9410-0001PB-Kj for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:18 -0400 Received: by mail-wr1-x42e.google.com with SMTP id c16so6429435wrp.13 for ; Thu, 29 Jul 2021 04:15:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Vs7FOGHKNBHJAr5iI+u8DlBQ7avOBil7+a3EJ4GTK7A=; b=urv6983rsVGMK8660734Y52LxNNBcqb122uhO5d61JrdfKVttoPvBfkk22TqmiFRRx ZZObWjDy6uyI05loD0aPS1IMnRTqUA66nw+lrYSusXlvGJrh0BjLeWwhx2zx/dVu/LF0 jfTKucasakmJvlISZCXJeN8BwFnhpK7joq8a0sNpUyVFBYOBSuFU6WDnNQC+itAHdHba qTUgc2TVbxJ7gbkSrtnFhhZ7syjaB2ox89Dzhk9FChN+PTopGdM/Ii20M29GGdO6urgH ViYAAB6Z0P+6TJkkS68WB8YjnSS9dkhZ27mG99WCYBdTBqvJCINZTYwtQJjL04qODK9A VjfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Vs7FOGHKNBHJAr5iI+u8DlBQ7avOBil7+a3EJ4GTK7A=; b=ZGnknzsLMS8321BhjvBVTSFXTGT2iPeavRdij9hqkGwUOaLo4U5k0BALdO2TUVZw1n MqA0lJOCjyqVJNFuO79SjHNGA19v14TZyRhyPAzRdUdVNs+Pn1Xvq22Ys76xfZMTKvYe 2qHvtI0DRQADmrXEpDnX735ubfm366g5+ztQ3B83qvxfI+0AkTcaM65xpVYcnWqTrkTy 8lrtTE3DDdnuQhy5tQyP2sC+ZOaZpE9yxGQhZl/IFAq0F93lRxN2Z0q6wtWKLs7qF/nr 2PBtZzS6O6Mo5jFEbYRwIJm1tFopChJhc13dET7MpsWZJSgXHM16tSJyIvmSkAphAtl9 hfJg== X-Gm-Message-State: AOAM532cW2iPc4/f2gP6wryGFiM7T/RF1Cmh4GNKbLb2COBl6pgwHLAD 
I2nOqKfRt+kiviw8W8i508XX6an+iDuyqw== X-Google-Smtp-Source: ABdhPJzNeK0GMZ6luY/LkTfl0a8r/zo+64X3HDr2vGLLuoIGk1sjueI60712syDGR1gKyppkHJ/T2w== X-Received: by 2002:adf:fc12:: with SMTP id i18mr4214481wrr.138.1627557349053; Thu, 29 Jul 2021 04:15:49 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:48 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 41/53] target/arm: Implement MVE VMAXNMA and VMINNMA Date: Thu, 29 Jul 2021 12:15:00 +0100 Message-Id: <20210729111512.16541-42-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMAXNMA and VMINNMA insns; these are 2-operand, but the destination register must be the same as one of the source registers. We defer the decode of the size in bit 28 to the individual insn patterns rather than doing it in the format, because otherwise we would have a single insn pattern that overlapped with two groups (eg VMAXNMA with the VMULH_S and VMULH_U groups). 
Having two insn patterns per insn seems clearer than a complex multilevel nesting of overlapping and non-overlapping groups. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 6 ++++++ target/arm/mve.decode | 11 +++++++++++ target/arm/mve_helper.c | 25 +++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 44 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 73950403bc3..57ab3f7b59f 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -428,6 +428,12 @@ DEF_HELPER_FLAGS_4(mve_vmaxnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vminnmh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vminnms, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmaxnmah, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmaxnmas, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + +DEF_HELPER_FLAGS_4(mve_vminnmah, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vminnmas, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vfcadd90h, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vfcadd90s, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 403381eef61..b0622e1f62c 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -130,6 +130,11 @@ @2op_fp_size_rev .... .... .... .... .... .... .... .... &2op \ qd=%qd qn=%qn qm=%qm size=%2op_fp_size_rev +# 2-operand, but Qd and Qn share a field. Size is in bit 28, but we +# don't decode it in this format +@vmaxnma .... .... .... .... .... .... .... .... &2op \ + qd=%qd qn=%qd qm=%qm + # Vector loads and stores # Widening loads and narrowing stores: @@ -199,6 +204,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 
0 @2op # The VSHLL T2 encoding is not a @2op pattern, but is here because it # overlaps what would be size=0b11 VMULH/VRMULH { + VMAXNMA 111 0 1110 0 . 11 1111 ... 0 1110 1 0 . 0 ... 1 @vmaxnma size=2 + VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h @@ -211,6 +218,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VMAXNMA 111 1 1110 0 . 11 1111 ... 0 1110 1 0 . 0 ... 1 @vmaxnma size=1 + VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h @@ -221,6 +230,7 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VMINNMA 111 0 1110 0 . 11 1111 ... 1 1110 1 0 . 0 ... 1 @vmaxnma size=2 VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h @@ -233,6 +243,7 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VMINNMA 111 1 1110 0 . 11 1111 ... 1 1110 1 0 . 0 ... 1 @vmaxnma size=1 VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 
1 @2_shll_esize_h diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 489892344b4..d44369c15e2 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2860,6 +2860,31 @@ DO_2OP_FP(vmaxnms, 4, uint32_t, float32_maxnum) DO_2OP_FP(vminnmh, 2, uint16_t, float16_minnum) DO_2OP_FP(vminnms, 4, uint32_t, float32_minnum) +static inline float16 float16_maxnuma(float16 a, float16 b, float_status *s) +{ + return float16_maxnum(float16_abs(a), float16_abs(b), s); +} + +static inline float32 float32_maxnuma(float32 a, float32 b, float_status *s) +{ + return float32_maxnum(float32_abs(a), float32_abs(b), s); +} + +static inline float16 float16_minnuma(float16 a, float16 b, float_status *s) +{ + return float16_minnum(float16_abs(a), float16_abs(b), s); +} + +static inline float32 float32_minnuma(float32 a, float32 b, float_status *s) +{ + return float32_minnum(float32_abs(a), float32_abs(b), s); +} + +DO_2OP_FP(vmaxnmah, 2, uint16_t, float16_maxnuma) +DO_2OP_FP(vmaxnmas, 4, uint32_t, float32_maxnuma) +DO_2OP_FP(vminnmah, 2, uint16_t, float16_minnuma) +DO_2OP_FP(vminnmas, 4, uint32_t, float32_minnuma) + #define DO_VCADD_FP(OP, ESIZE, TYPE, FN0, FN1) \ void HELPER(glue(mve_, OP))(CPUARMState *env, \ void *vd, void *vn, void *vm) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d62ed1fc295..4d702da808d 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -864,6 +864,8 @@ DO_2OP_FP(VCMLA0, vcmla0) DO_2OP_FP(VCMLA90, vcmla90) DO_2OP_FP(VCMLA180, vcmla180) DO_2OP_FP(VCMLA270, vcmla270) +DO_2OP_FP(VMAXNMA, vmaxnma) +DO_2OP_FP(VMINNMA, vminnma) static bool do_2op_scalar(DisasContext *s, arg_2scalar *a, MVEGenTwoOpScalarFn fn) From patchwork Thu Jul 29 11:15:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511207 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=gv7A213i; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb89W41Wnz9sRR for ; Thu, 29 Jul 2021 21:55:27 +1000 (AEST) Received: from localhost ([::1]:54440 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94dJ-0007Zw-Ae for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:55:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40952) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941V-0001KX-U1 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:22 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]:35331) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9411-0001PT-FC for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:21 -0400 Received: by mail-wr1-x42b.google.com with SMTP id n12so6481685wrr.2 for ; Thu, 29 Jul 2021 04:15:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=SmFi3d8a8x239T2bKc1646ktKfQt5s6Sv+V0vxSAgpw=; b=gv7A213iGovDwz7W6KSaCkTMfGWEExH81+R76vgJpmqMz8muSlbm3JxiiekQJPbw8i DSsm6he15/3KODxE4OznU0tBM40Ij4yr8zAzxl8GqTmO+c2EPg0tujgZyFgVm3vqWPnv 7133UPhfN9Xs6E7RXeIIZugOWI0R4rzd3Z/2iYsfo7BP7auC+uPdZm+Pv32oqvccS4cx 
Z4KTyVRYvRiqM+BGNG1glsw9k6PZFZ3It2fOz82m1ZnQE82RaETodltKq76xFwNTYNJZ 3vgn5q87bdrPGP4dm6PKW0OeOnhV/35eOyQVOkyH9w9BGu48F3eIcSRCVqvQc3aPyJlc RhqA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SmFi3d8a8x239T2bKc1646ktKfQt5s6Sv+V0vxSAgpw=; b=COlkTEF4SgXfIsy/NkK1EcxXv3EBh3iTNjdwu0CK7P4HgTNXDj9VhvUngjLjstEBiw eTy105mHP7+iRxv27z0Hbwt2eDafFlSArj8qbg70XFDBNiYFs7TmDr1tMouotgFxqPaQ NboZcbbjDncXIsNkvBhd5QNVXJJFbuNrJcQxKM5mEd/RJwnL5Z9LAnSDEflJDLsyKXQ0 aUUSc8lnLv6P4xA+BNGlsylKuKmVqUPTST5biwOp/euYDoy8UUSPaF/g2J0bu0wViAmB feIX1fzifkfiA0Eo+iVt/Uz7Q5olCeQzRnDncdcyTZjYjvSlYJSLGQjhobKWPzmBD4bS YeQg== X-Gm-Message-State: AOAM5323BHZ0GROseNNRhWPQttoFGTHUMR1yy+W/1QY4SNudxRluW7kE 6ybjGNYbT1+APtDQrUdIlumLeHP0qQ98zw== X-Google-Smtp-Source: ABdhPJyudSYYLaJnchXrFP/vvwBWftBETz5nAgUBQI+6KEmxgFicWXzkKEajNrJ9td62V7YNy5K5xw== X-Received: by 2002:a5d:680e:: with SMTP id w14mr4371387wru.71.1627557349837; Thu, 29 Jul 2021 04:15:49 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:49 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 42/53] target/arm: Implement MVE scalar fp insns Date: Thu, 29 Jul 2021 12:15:01 +0100 Message-Id: <20210729111512.16541-43-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE scalar floating point insns VADD, VSUB and VMUL. 
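For readers unfamiliar with the "vector op scalar" form, here is a rough C model of the behaviour the helpers in this patch implement: every lane of Qn is combined with the same scalar taken from Rm, and lanes that are predicated off keep their previous destination value. This is only a sketch under simplifying assumptions, not the QEMU code: the names `vadd_scalar_model` and `LANES` are hypothetical, plain `float` stands in for softfloat, only the 32-bit lane case is shown, and the per-byte predicate mask is reduced to one bit per lane.

```c
#define LANES 4  /* assumption: four 32-bit lanes in a 128-bit vector */

/*
 * Hypothetical model of a predicated vector-plus-scalar add:
 * d[e] = n[e] + m for every lane e whose predicate bit is set.
 */
static void vadd_scalar_model(float *d, const float *n, float m,
                              unsigned pred)
{
    for (int e = 0; e < LANES; e++) {
        if (pred & (1u << e)) {
            d[e] = n[e] + m;
        }
        /* lanes with a clear predicate bit are left unchanged */
    }
}
```

Calling it with n = {1, 2, 3, 4}, m = 10 and pred = 0xb on a zeroed destination updates lanes 0, 1 and 3 and leaves lane 2 alone. VSUB and VMUL follow the same shape with the operator changed.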
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 9 +++++++++ target/arm/mve.decode | 27 +++++++++++++++++++++------ target/arm/mve_helper.c | 34 ++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 20 ++++++++++++++++++++ 4 files changed, 84 insertions(+), 6 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 57ab3f7b59f..091ec4b4270 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -800,3 +800,12 @@ DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vcmple_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vcmple_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vfadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vfadd_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vfsub_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vfsub_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vfmul_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vfmul_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index b0622e1f62c..5ba8b6deeaa 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -31,6 +31,8 @@ %2op_fp_size 20:1 !function=neon_3same_fp_size # VCADD is an exception, where bit 20 is 0 for 16 bit and 1 for 32 bit %2op_fp_size_rev 20:1 !function=plus_1 +# FP scalars have size in bit 28, 1 for 16 bit, 0 for 32 bit +%2op_fp_scalar_size 28:1 !function=neon_3same_fp_size # 1imm format immediate %imm_28_16_0 28:1 16:3 0:4 @@ -135,6 +137,9 @@ @vmaxnma .... .... .... .... .... .... .... .... &2op \ qd=%qd qn=%qd qm=%qm +@2op_fp_scalar .... .... .... .... .... .... .... 
rm:4 &2scalar \ + qd=%qd qn=%qn size=%2op_fp_scalar_size + # Vector loads and stores # Widening loads and narrowing stores: @@ -471,10 +476,17 @@ VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar } -VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar -VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar -VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar -VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar +{ + VADD_fp_scalar 111 . 1110 0 . 11 ... 0 ... 0 1111 . 100 .... @2op_fp_scalar + VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar + VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar +} + +{ + VSUB_fp_scalar 111 . 1110 0 . 11 ... 0 ... 1 1111 . 100 .... @2op_fp_scalar + VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar + VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar +} { VQADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 110 .... @2scalar @@ -490,8 +502,11 @@ VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar size=%size_28 } -VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar -VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar +{ + VMUL_fp_scalar 111 . 1110 0 . 11 ... 1 ... 0 1110 . 110 .... @2op_fp_scalar + VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar + VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar +} # The U bit (28) is don't-care because it does not affect the result VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index d44369c15e2..4175bacfaa4 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3046,3 +3046,37 @@ DO_VCMLA(vcmla180h, 2, uint16_t, 2, float16_chs, DO_VCMLAH) DO_VCMLA(vcmla180s, 4, uint32_t, 2, float32_chs, DO_VCMLAS) DO_VCMLA(vcmla270h, 2, uint16_t, 3, float16_chs, DO_VCMLAH) DO_VCMLA(vcmla270s, 4, uint32_t, 3, float32_chs, DO_VCMLAS) + +#define DO_2OP_FP_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE r, m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(n[H##ESIZE(e)], m, fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_2OP_FP_SCALAR(vfadd_scalarh, 2, uint16_t, float16_add) +DO_2OP_FP_SCALAR(vfadd_scalars, 4, uint32_t, float32_add) +DO_2OP_FP_SCALAR(vfsub_scalarh, 2, uint16_t, float16_sub) +DO_2OP_FP_SCALAR(vfsub_scalars, 4, uint32_t, float32_sub) +DO_2OP_FP_SCALAR(vfmul_scalarh, 2, uint16_t, float16_mul) +DO_2OP_FP_SCALAR(vfmul_scalars, 4, uint32_t, float32_mul) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 4d702da808d..bc4b3f840a0 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -960,6 +960,26 @@ static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a) return do_2op_scalar(s, a, fns[a->size]); } + +#define DO_2OP_FP_SCALAR(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_2scalar *a) \ + { \ + static 
MVEGenTwoOpScalarFn * const fns[] = { \ + NULL, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##s, \ + NULL, \ + }; \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_2op_scalar(s, a, fns[a->size]); \ + } + +DO_2OP_FP_SCALAR(VADD_fp_scalar, vfadd_scalar) +DO_2OP_FP_SCALAR(VSUB_fp_scalar, vfsub_scalar) +DO_2OP_FP_SCALAR(VMUL_fp_scalar, vfmul_scalar) + static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, MVEGenLongDualAccOpFn *fn) { From patchwork Thu Jul 29 11:15:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511205 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=XYZD9OiF; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb87Z3ngrz9sSs for ; Thu, 29 Jul 2021 21:53:46 +1000 (AEST) Received: from localhost ([::1]:46844 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94bg-0002OI-72 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:53:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40956) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941V-0001KZ-V4 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:22 -0400 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:45900) by eggs.gnu.org with 
esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9414-0001Pi-Tn for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:21 -0400 Received: by mail-wr1-x432.google.com with SMTP id m12so1650400wru.12 for ; Thu, 29 Jul 2021 04:15:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=p/PVosmr3NDaaW48KULGfEDE4gWAQBCxNNHpd5+4wvQ=; b=XYZD9OiFWNXRZANimC6KdeeBOd9/IP8LQwcic3/Rr3WwYgFrxe3bJjVz8xSG0yaAOH rAWPsLKED3IpwtUl6dHRiZgAmaVlGkhV3JHkH3ZCEUAxEdKe9l1uMkZo2xYexpNiHkac y45+iqJsCP8zqra12ipRDT7YCi5XUbl5P48p4TlvJr7Q+PaEJXDu/SE0l39t4Xcduorf N4xpd95zg3pq2UYf3enw+F5Ewe3ulOAscw5WkFnffbG+4DjALaOo5nfV4fPr9RaE0Lua gcVlTfGtuOOVSp4869uIJzd3ug2niSENL9OGhjOJ99xG2HaqOoFzPEq56YlHuZNxP0w3 +PmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=p/PVosmr3NDaaW48KULGfEDE4gWAQBCxNNHpd5+4wvQ=; b=a9fc9kgQJO8wuYSG0gkYttX7JxkBPo8MMpmszB2ovG4zgsIBmp5uHZV1dJOjQZB166 UH8+KP3FyvB7K8Hp18ZMs/lsvSuCIha+RcpMZs9yFKdDjKC775bTCN1AOnHeDY2eev5Z +hDS3OnP+dwEDdW+ec8mVvwByas5PMeaaNkJYZ/7sQxezJhkeIKbD8h6+Z4Z7TGsjz/k r/dskKBfpmGEuJ/03fwVdvGj8dgTI5xWPCG+bjSZspOZ+TeUhGpWq38AD2vrAkyJW9gD QZa8D8HCZNa7J93hgR1RXlji7ZnEqE7SEJbY3VFBt7tVUQZdExykBlDm4Bqh9oXcbG1c /Etw== X-Gm-Message-State: AOAM533zATKTy1Ie0FUZakfYkiFg5040oxl6fGA6/hvdO0OLYouxtRss a8nF53jyoW7PTxl6IbKOxZz/Kw== X-Google-Smtp-Source: ABdhPJy1qwc662Ht16KS5dSUtGRIOMHqMZ5OdO8sObWAjHTFgF9+/3kBDOYH2rkmfPUZumhbLw3Hnw== X-Received: by 2002:adf:dcd1:: with SMTP id x17mr4209511wrm.59.1627557350653; Thu, 29 Jul 2021 04:15:50 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:50 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 43/53] target/arm: Implement MVE fp-with-scalar VFMA, VFMAS Date: Thu, 29 Jul 2021 12:15:02 +0100 Message-Id: <20210729111512.16541-44-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::432; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE fp-with-scalar VFMA and VFMAS insns. 
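The two insns differ only in which operand is the addend, which is easy to get backwards. A minimal C model of the operand roles used by the helper macros in this patch is sketched here. The function names and `LANES` are hypothetical, plain `float` arithmetic stands in for the softfloat muladd call (so the single rounding of a true fused operation is not modelled), and predication is omitted.

```c
#define LANES 4  /* assumption: four 32-bit lanes in a 128-bit vector */

/* Hypothetical model of VFMA (scalar): vector * scalar + vector. */
static void vfma_scalar_model(float *d, const float *n, float m)
{
    for (int e = 0; e < LANES; e++) {
        d[e] = n[e] * m + d[e];   /* the destination is the addend */
    }
}

/* Hypothetical model of VFMAS (scalar): vector * vector + scalar. */
static void vfmas_scalar_model(float *d, const float *n, float m)
{
    for (int e = 0; e < LANES; e++) {
        d[e] = n[e] * d[e] + m;   /* the scalar is the addend */
    }
}
```

With n = {1, 2, 3, 4}, m = 2 and every destination lane starting at 10, the VFMA model gives {12, 14, 16, 18} while the VFMAS model gives {12, 22, 32, 42}.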
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 6 ++++++ target/arm/mve.decode | 14 +++++++++++--- target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 56 insertions(+), 3 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 091ec4b4270..cb7b6423239 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -809,3 +809,9 @@ DEF_HELPER_FLAGS_4(mve_vfsub_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vfmul_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vfmul_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vfma_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vfma_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vfmas_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vfmas_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 5ba8b6deeaa..d2bd6815bc3 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -508,9 +508,17 @@ VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar } -# The U bit (28) is don't-care because it does not affect the result -VMLA 111- 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar -VMLAS 111- 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar +{ + VFMA_scalar 111 . 1110 0 . 11 ... 1 ... 0 1110 . 100 .... @2op_fp_scalar + # The U bit (28) is don't-care because it does not affect the result + VMLA 111 - 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar +} + +{ + VFMAS_scalar 111 . 1110 0 . 11 ... 1 ... 1 1110 . 100 .... @2op_fp_scalar + # The U bit (28) is don't-care because it does not affect the result + VMLAS 111 - 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar +} VQRDMLAH 1110 1110 0 . .. 
... 0 ... 0 1110 . 100 .... @2scalar VQRDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 100 .... @2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 4175bacfaa4..3b243aaefa2 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3080,3 +3080,40 @@ DO_2OP_FP_SCALAR(vfsub_scalarh, 2, uint16_t, float16_sub) DO_2OP_FP_SCALAR(vfsub_scalars, 4, uint32_t, float32_sub) DO_2OP_FP_SCALAR(vfmul_scalarh, 2, uint16_t, float16_mul) DO_2OP_FP_SCALAR(vfmul_scalars, 4, uint32_t, float32_mul) + +#define DO_2OP_FP_ACC_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vn, uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE r, m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(n[H##ESIZE(e)], m, d[H##ESIZE(e)], 0, fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +/* VFMAS is vector * vector + scalar, so swap op2 and op3 */ +#define DO_VFMAS_SCALARH(N, M, D, F, S) float16_muladd(N, D, M, F, S) +#define DO_VFMAS_SCALARS(N, M, D, F, S) float32_muladd(N, D, M, F, S) + +/* VFMA is vector * scalar + vector */ +DO_2OP_FP_ACC_SCALAR(vfma_scalarh, 2, uint16_t, float16_muladd) +DO_2OP_FP_ACC_SCALAR(vfma_scalars, 4, uint32_t, float32_muladd) +DO_2OP_FP_ACC_SCALAR(vfmas_scalarh, 2, uint16_t, DO_VFMAS_SCALARH) +DO_2OP_FP_ACC_SCALAR(vfmas_scalars, 4, uint32_t, DO_VFMAS_SCALARS) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index bc4b3f840a0..3627ba227f2 100644 --- a/target/arm/translate-mve.c +++ 
b/target/arm/translate-mve.c @@ -979,6 +979,8 @@ static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a) DO_2OP_FP_SCALAR(VADD_fp_scalar, vfadd_scalar) DO_2OP_FP_SCALAR(VSUB_fp_scalar, vfsub_scalar) DO_2OP_FP_SCALAR(VMUL_fp_scalar, vfmul_scalar) +DO_2OP_FP_SCALAR(VFMA_scalar, vfma_scalar) +DO_2OP_FP_SCALAR(VFMAS_scalar, vfmas_scalar) static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, MVEGenLongDualAccOpFn *fn) From patchwork Thu Jul 29 11:15:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511194 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=nhY4KIem; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb7xK5Dkjz9sSs for ; Thu, 29 Jul 2021 21:44:53 +1000 (AEST) Received: from localhost ([::1]:47300 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94T5-0007we-Do for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:44:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41014) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941X-0001LQ-UY for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:24 -0400 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]:56222) by eggs.gnu.org with esmtps 
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 44/53] softfloat: Remove assertion preventing silencing of NaN in default-NaN mode
Date: Thu, 29 Jul 2021 12:15:03 +0100
Message-Id: <20210729111512.16541-45-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

In commit a777d6033447a we added an assertion to parts_silence_nan() that
prohibits calling float*_silence_nan() when in default-NaN mode.
This ties together a property of the output ("do we generate a default
NaN when the result is a NaN?") with an operation on an input ("silence
this input NaN").

It's true that most of the time when in default-NaN mode you won't
need to silence an input NaN, because you can just produce the
default NaN as the result instead. But some functions like
float*_maxnum() are defined to be able to work with quiet NaNs, so
silencing an input SNaN is still reasonable. In particular, the
upcoming implementation of MVE VMAXNMV would fall over this assertion
if we didn't delete it.

Delete the assertion.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 fpu/softfloat-specialize.c.inc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc
index 12467bb9bba..f2ad0f335e6 100644
--- a/fpu/softfloat-specialize.c.inc
+++ b/fpu/softfloat-specialize.c.inc
@@ -198,7 +198,6 @@ static void parts128_default_nan(FloatParts128 *p, float_status *status)
 static uint64_t parts_silence_nan_frac(uint64_t frac, float_status *status)
 {
     g_assert(!no_signaling_nans(status));
-    g_assert(!status->default_nan_mode);
 
     /* The only snan_bit_is_one target without default_nan_mode is HPPA. */
     if (snan_bit_is_one(status)) {

From patchwork Thu Jul 29 11:15:04 2021
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 1511202
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 45/53] target/arm: Implement MVE FP max/min across vector
Date: Thu, 29 Jul 2021 12:15:04 +0100
Message-Id: <20210729111512.16541-46-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

Implement the MVE VMAXNMV, VMINNMV, VMAXNMAV, VMINNMAV insns. These
calculate the maximum or minimum of floating point elements across a
vector, starting with a value in a general purpose register and
returning the result there.

The pseudocode silences a possible SNaN in the accumulating result on
every iteration (by calling FPConvertNaN), but we do it only on the
input ra, because if none of the inputs to float*_maxnum or
float*_minnum are SNaNs then the result can't be an SNaN.
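
As a rough illustration of the argument above (not the helper added by
this patch), the following standalone C sketch folds a maxNum-style
maximum across binary32 lanes after silencing any SNaN on the way in.
All names here (f32_is_snan, f32_silence, f32_maxnum, max_across) are
hypothetical stand-ins rather than QEMU softfloat functions, and the
sketch assumes the Arm convention that a set top fraction bit marks a
quiet NaN. Since every value reaching f32_maxnum() has already been
quieted, the accumulator can never become an SNaN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative binary32 helpers; hypothetical names, not QEMU's API. */
static bool f32_is_snan(uint32_t bits)
{
    /* NaN: exponent all ones, fraction non-zero. Signaling: top fraction bit clear. */
    return (bits & 0x7f800000u) == 0x7f800000u &&
           (bits & 0x007fffffu) != 0 &&
           (bits & 0x00400000u) == 0;
}

static uint32_t f32_silence(uint32_t bits)
{
    return bits | 0x00400000u; /* set the quiet bit */
}

static float f32_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof(f));
    return f;
}

static uint32_t f32_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/*
 * maxNum-style maximum: a quiet NaN operand is treated as missing data.
 * Signed-zero ordering is deliberately not handled in this sketch.
 */
static float f32_maxnum(float a, float b)
{
    if (a != a) {
        return b;
    }
    if (b != b) {
        return a;
    }
    return a > b ? a : b;
}

/* Fold the maximum across n lanes, starting from ra; flag any SNaN input. */
static uint32_t max_across(uint32_t ra, const uint32_t *lanes, size_t n,
                           bool *invalid)
{
    assert(invalid != NULL);
    if (f32_is_snan(ra)) {
        ra = f32_silence(ra);
        *invalid = true;
    }
    for (size_t i = 0; i < n; i++) {
        uint32_t v = lanes[i];
        if (f32_is_snan(v)) {
            v = f32_silence(v);
            *invalid = true;
        }
        ra = f32_to_bits(f32_maxnum(f32_from_bits(ra), f32_from_bits(v)));
    }
    return ra;
}
```

Calling max_across() with an SNaN starting value and ordinary numbers
in the lanes yields the largest lane value and sets the invalid flag,
which mirrors the "silence once on the way in" reasoning.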

Note that we can't use the float*_maxnuma() etc functions we defined
earlier for VMAXNMA and VMINNMA, because we mustn't take the absolute
value of the starting general-purpose register value, which could be
negative.

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/helper-mve.h    | 12 +++++++++++
 target/arm/mve.decode      | 12 +++++++++++
 target/arm/mve_helper.c    | 44 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-mve.c | 20 +++++++++++++++++
 4 files changed, 88 insertions(+)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index cb7b6423239..47fd18dddbf 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -614,6 +614,18 @@ DEF_HELPER_FLAGS_3(mve_vminavb, TCG_CALL_NO_WG, i32, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vminavh, TCG_CALL_NO_WG, i32, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32)
 
+DEF_HELPER_FLAGS_3(mve_vmaxnmvh, TCG_CALL_NO_WG, i32, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vmaxnmvs, TCG_CALL_NO_WG, i32, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vminnmvh, TCG_CALL_NO_WG, i32, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vminnmvs, TCG_CALL_NO_WG, i32, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vmaxnmavh, TCG_CALL_NO_WG, i32, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vmaxnmavs, TCG_CALL_NO_WG, i32, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vminnmavh, TCG_CALL_NO_WG, i32, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vminnmavs, TCG_CALL_NO_WG, i32, env, ptr, i32)
+
 DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64)
 DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64)
 
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index d2bd6815bc3..1a18c3b8eeb 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -137,6 +137,10 @@
 @vmaxnma .... .... .... .... .... .... .... .... &2op \
          qd=%qd qn=%qd qm=%qm
 
+# Here also we don't decode the bit 28 size in the format to avoid
+# awkward nested overlap groups
+@vmaxnmv .... .... .... .... rda:4 .... .... .... &vmaxv qm=%qm
+
 @2op_fp_scalar .... .... .... .... .... .... .... rm:4 &2scalar \
                qd=%qd qn=%qn size=%2op_fp_scalar_size
 
@@ -440,6 +444,10 @@ VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
 VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
 
 {
+  VMAXNMAV  1110 1110 1110 11 00 .... 1111 0 0 . 0 ... 0 @vmaxnmv size=2
+  VMINNMAV  1110 1110 1110 11 00 .... 1111 1 0 . 0 ... 0 @vmaxnmv size=2
+  VMAXNMV   1110 1110 1110 11 10 .... 1111 0 0 . 0 ... 0 @vmaxnmv size=2
+  VMINNMV   1110 1110 1110 11 10 .... 1111 1 0 . 0 ... 0 @vmaxnmv size=2
   VMAXV_S   1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
   VMINV_S   1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
   VMAXAV    1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv
@@ -449,6 +457,10 @@ VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
 }
 
 {
+  VMAXNMAV  1111 1110 1110 11 00 .... 1111 0 0 . 0 ... 0 @vmaxnmv size=1
+  VMINNMAV  1111 1110 1110 11 00 .... 1111 1 0 . 0 ... 0 @vmaxnmv size=1
+  VMAXNMV   1111 1110 1110 11 10 .... 1111 0 0 . 0 ... 0 @vmaxnmv size=1
+  VMINNMV   1111 1110 1110 11 10 .... 1111 1 0 . 0 ... 0 @vmaxnmv size=1
   VMAXV_U   1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv
   VMINV_U   1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv
   VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index 3b243aaefa2..6a73134c74a 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -3117,3 +3117,47 @@ DO_2OP_FP_ACC_SCALAR(vfma_scalarh, 2, uint16_t, float16_muladd)
 DO_2OP_FP_ACC_SCALAR(vfma_scalars, 4, uint32_t, float32_muladd)
 DO_2OP_FP_ACC_SCALAR(vfmas_scalarh, 2, uint16_t, DO_VFMAS_SCALARH)
 DO_2OP_FP_ACC_SCALAR(vfmas_scalars, 4, uint32_t, DO_VFMAS_SCALARS)
+
+/* Floating point max/min across vector. */
+#define DO_FP_VMAXMINV(OP, ESIZE, TYPE, FTYPE, ABS, FN) \
+    uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \
+                                    uint32_t ra_in) \
+    { \
+        uint16_t mask = mve_element_mask(env); \
+        unsigned e; \
+        TYPE *m = vm; \
+        TYPE ra = (TYPE)ra_in; \
+        float_status *fpst = (ESIZE == 2) ? \
+            &env->vfp.standard_fp_status_f16 : \
+            &env->vfp.standard_fp_status; \
+        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \
+            if (mask & 1) { \
+                TYPE v = m[H##ESIZE(e)]; \
+                if (FTYPE##_is_signaling_nan(ra, fpst)) { \
+                    ra = FTYPE##_silence_nan(ra, fpst); \
+                    float_raise(float_flag_invalid, fpst); \
+                } \
+                if (FTYPE##_is_signaling_nan(v, fpst)) { \
+                    v = FTYPE##_silence_nan(v, fpst); \
+                    float_raise(float_flag_invalid, fpst); \
+                } \
+                if (ABS) { \
+                    v = FTYPE##_abs(v); \
+                } \
+                ra = FN(ra, v, fpst); \
+            } \
+        } \
+        mve_advance_vpt(env); \
+        return ra; \
+    } \
+
+#define NOP(X) (X)
+
+DO_FP_VMAXMINV(vmaxnmvh, 2, uint16_t, float16, false, float16_maxnum)
+DO_FP_VMAXMINV(vmaxnmvs, 4, uint32_t, float32, false, float32_maxnum)
+DO_FP_VMAXMINV(vminnmvh, 2, uint16_t, float16, false, float16_minnum)
+DO_FP_VMAXMINV(vminnmvs, 4, uint32_t, float32, false, float32_minnum)
+DO_FP_VMAXMINV(vmaxnmavh, 2, uint16_t, float16, true, float16_maxnum)
+DO_FP_VMAXMINV(vmaxnmavs, 4, uint32_t, float32, true, float32_maxnum)
+DO_FP_VMAXMINV(vminnmavh, 2, uint16_t, float16, true, float16_minnum)
+DO_FP_VMAXMINV(vminnmavs, 4, uint32_t, float32, true, float32_minnum)
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index 3627ba227f2..4e2aa2cae2d 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -1806,6 +1806,26 @@ DO_VMAXV(VMINV_S, vminvs)
 DO_VMAXV(VMINV_U, vminvu)
 DO_VMAXV(VMINAV, vminav)
 
+#define DO_VMAXV_FP(INSN, FN) \
+    static bool trans_##INSN(DisasContext *s, arg_vmaxv *a) \
+    { \
+        static MVEGenVADDVFn * const fns[] = { \
+            NULL, \
+            gen_helper_mve_##FN##h, \
+            gen_helper_mve_##FN##s, \
+            NULL, \
+        }; \
+        if (!dc_isar_feature(aa32_mve_fp, s)) { \
+            return false; \
+        } \
+        return do_vmaxv(s, a, fns[a->size]); \
+    }
+
+DO_VMAXV_FP(VMAXNMV, vmaxnmv)
+DO_VMAXV_FP(VMINNMV, vminnmv)
+DO_VMAXV_FP(VMAXNMAV, vmaxnmav)
+DO_VMAXV_FP(VMINNMAV, vminnmav)
+
 static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn)
 {
     /* Absolute difference accumulated across vector */

From patchwork Thu Jul 29 11:15:05 2021
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 1511218
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 46/53] target/arm: Implement MVE fp vector comparisons
Date: Thu, 29 Jul 2021 12:15:05 +0100
Message-Id: <20210729111512.16541-47-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

Implement the MVE fp vector comparisons VCMP and VPT.
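
The comment in the mve_helper.c part of this patch notes that EQ, GE
and GT are false for unordered operands while NE, LT and LE are their
logical inverses and so are true for unordered. A small standalone C
sketch of that rule, using native float compares in place of the
softfloat calls and hypothetical fp_* names that do not appear in the
patch, might look like this:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* The three "positive" predicates are false when either operand is a NaN. */
static bool fp_eq(float a, float b) { return a == b; }
static bool fp_ge(float a, float b) { return b <= a; }
static bool fp_gt(float a, float b) { return b < a; }

/* The other three are plain logical inverses, so they are true for unordered. */
static bool fp_ne(float a, float b) { return !fp_eq(a, b); }
static bool fp_lt(float a, float b) { return !fp_ge(a, b); }
static bool fp_le(float a, float b) { return !fp_gt(a, b); }

/* Small self-check showing where this differs from the C '<' operator. */
static void check_unordered(void)
{
    assert(!(NAN < 1.0f));     /* the C operator is false for unordered */
    assert(fp_lt(NAN, 1.0f));  /* the inverse-of-GE definition is true */
    assert(fp_ne(NAN, NAN));
    assert(fp_le(1.0f, NAN));
}
```

The point of the sketch is only that LT cannot be implemented as a
direct less-than compare here: for a NaN operand a direct compare
gives false, whereas the inverse-of-GE definition gives true.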

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/helper-mve.h    | 18 +++++++++++
 target/arm/mve.decode      | 39 +++++++++++++++++++----
 target/arm/mve_helper.c    | 64 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-mve.c | 22 +++++++++++++
 4 files changed, 137 insertions(+), 6 deletions(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index 47fd18dddbf..0c15c531641 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -813,6 +813,24 @@ DEF_HELPER_FLAGS_3(mve_vcmple_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(mve_vcmple_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32)
 
+DEF_HELPER_FLAGS_3(mve_vfcmpeqh, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmpeqs, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpneh, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmpnes, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpgeh, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmpges, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
+DEF_HELPER_FLAGS_3(mve_vfcmplth, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmplts, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpgth, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmpgts, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr)
+DEF_HELPER_FLAGS_3(mve_vfcmples, TCG_CALL_NO_WG, void, env, ptr, ptr)
+
 DEF_HELPER_FLAGS_4(mve_vfadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vfadd_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index 1a18c3b8eeb..7767ecae2ac 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -124,6 +124,9 @@
 @vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \
              mask=%mask_22_13
 
+@vcmp_fp .... .... .... qn:3 . .... .... .... .... &vcmp \
+         qm=%qm size=%2op_fp_scalar_size mask=%mask_22_13
+
 @vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm
 
 @2op_fp .... .... .... .... .... .... .... .... &2op \
@@ -663,17 +666,41 @@ VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
 # Comparisons. We expand out the conditions which are split across
 # encodings T1, T2, T3 and the fc bits. These include VPT, which is
 # effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero.
-VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp
-VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp
+{
+  VCMPEQ_fp 111 . 1110 0 . 11 ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp_fp
+  VCMPEQ    111 1 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp
+}
+
+{
+  VCMPNE_fp 111 . 1110 0 . 11 ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp_fp
+  VCMPNE    111 1 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp
+}
+
+{
+  VCMPGE_fp 111 . 1110 0 . 11 ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp_fp
+  VCMPGE    111 1 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp
+}
+
+{
+  VCMPLT_fp 111 . 1110 0 . 11 ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp_fp
+  VCMPLT    111 1 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp
+}
+
+{
+  VCMPGT_fp 111 . 1110 0 . 11 ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp_fp
+  VCMPGT    111 1 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
+}
+
+{
+  VCMPLE_fp 111 . 1110 0 . 11 ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp_fp
+  VCMPLE    1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp
+}
+
 {
   VPSEL  1111 1110 0 . 11 ... 1 ... 0 1111 . 0 . 0 ... 1 @2op_nosz
   VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp
   VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp
 }
-VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp
-VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp
-VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp
-VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp
 
 {
   VPNOT 1111 1110 0 0 11 000 1 000 0 1111 0100 1101
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index 6a73134c74a..ebfd5746b13 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -3161,3 +3161,67 @@ DO_FP_VMAXMINV(vmaxnmavh, 2, uint16_t, float16, true, float16_maxnum)
 DO_FP_VMAXMINV(vmaxnmavs, 4, uint32_t, float32, true, float32_maxnum)
 DO_FP_VMAXMINV(vminnmavh, 2, uint16_t, float16, true, float16_minnum)
 DO_FP_VMAXMINV(vminnmavs, 4, uint32_t, float32, true, float32_minnum)
+
+/* FP compares; note that all comparisons signal InvalidOp for QNaNs */
+#define DO_VCMP_FP(OP, ESIZE, TYPE, FN) \
+    void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, void *vm) \
+    { \
+        TYPE *n = vn, *m = vm; \
+        uint16_t mask = mve_element_mask(env); \
+        uint16_t eci_mask = mve_eci_mask(env); \
+        uint16_t beatpred = 0; \
+        uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \
+        unsigned e; \
+        float_status *fpst; \
+        float_status scratch_fpst; \
+        bool r; \
+        for (e = 0; e < 16 / ESIZE; e++, emask <<= ESIZE) { \
+            if ((mask & emask) == 0) { \
+                continue; \
+            } \
+            fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \
+                &env->vfp.standard_fp_status; \
+            if (!(mask & (1 << (e * ESIZE)))) { \
+                /* We need the result but without updating flags */ \
+                scratch_fpst = *fpst; \
+                fpst = &scratch_fpst; \
+            } \
+            r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)], fpst); \
+            /* Comparison sets 0/1 bits for each byte in the element */ \
+            beatpred |= r * emask; \
+        } \
+        beatpred &= mask; \
+        env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \
+            (beatpred & eci_mask); \
+        mve_advance_vpt(env); \
+    }
+
+/*
+ * Some care is needed here to get the correct result for the unordered case.
+ * Architecturally EQ, GE and GT are defined to be false for unordered, but
+ * the NE, LT and LE comparisons are defined as simple logical inverses of
+ * EQ, GE and GT and so they must return true for unordered. The softfloat
+ * comparison functions float*_{eq,le,lt} all return false for unordered.
+ */
+#define DO_GE16(X, Y, S) float16_le(Y, X, S)
+#define DO_GE32(X, Y, S) float32_le(Y, X, S)
+#define DO_GT16(X, Y, S) float16_lt(Y, X, S)
+#define DO_GT32(X, Y, S) float32_lt(Y, X, S)
+
+DO_VCMP_FP(vfcmpeqh, 2, uint16_t, float16_eq)
+DO_VCMP_FP(vfcmpeqs, 4, uint32_t, float32_eq)
+
+DO_VCMP_FP(vfcmpneh, 2, uint16_t, !float16_eq)
+DO_VCMP_FP(vfcmpnes, 4, uint32_t, !float32_eq)
+
+DO_VCMP_FP(vfcmpgeh, 2, uint16_t, DO_GE16)
+DO_VCMP_FP(vfcmpges, 4, uint32_t, DO_GE32)
+
+DO_VCMP_FP(vfcmplth, 2, uint16_t, !DO_GE16)
+DO_VCMP_FP(vfcmplts, 4, uint32_t, !DO_GE32)
+
+DO_VCMP_FP(vfcmpgth, 2, uint16_t, DO_GT16)
+DO_VCMP_FP(vfcmpgts, 4, uint32_t, DO_GT32)
+
+DO_VCMP_FP(vfcmpleh, 2, uint16_t, !DO_GT16)
+DO_VCMP_FP(vfcmples, 4, uint32_t, !DO_GT32)
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index 4e2aa2cae2d..da14a6f790e 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -1758,6 +1758,28 @@ DO_VCMP(VCMPLT, vcmplt)
 DO_VCMP(VCMPGT, vcmpgt)
 DO_VCMP(VCMPLE, vcmple)
 
+#define DO_VCMP_FP(INSN, FN) \
+    static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \
+    { \
+        static MVEGenCmpFn * const fns[] = { \
+            NULL, \
+            gen_helper_mve_##FN##h, \
+            gen_helper_mve_##FN##s, \
+            NULL, \
+        }; \
+        if (!dc_isar_feature(aa32_mve_fp, s)) { \
+            return false; \
+        } \
+        return do_vcmp(s, a, fns[a->size]); \
+    }
+
+DO_VCMP_FP(VCMPEQ_fp, vfcmpeq)
+DO_VCMP_FP(VCMPNE_fp, vfcmpne)
+DO_VCMP_FP(VCMPGE_fp, vfcmpge)
+DO_VCMP_FP(VCMPLT_fp, vfcmplt)
+DO_VCMP_FP(VCMPGT_fp, vfcmpgt)
+DO_VCMP_FP(VCMPLE_fp, vfcmple)
+
 static bool do_vmaxv(DisasContext *s, arg_vmaxv *a, MVEGenVADDVFn fn)
 {
     /*

From patchwork Thu Jul 29 11:15:06 2021
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 1511200
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH for-6.2 47/53] target/arm: Implement MVE fp scalar comparisons
Date: Thu, 29 Jul 2021 12:15:06 +0100
Message-Id: <20210729111512.16541-48-peter.maydell@linaro.org>
In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org>
References: <20210729111512.16541-1-peter.maydell@linaro.org>

Implement the MVE fp scalar comparisons VCMP and VPT.
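
For readers unfamiliar with how a scalar compare turns into predicate
bits, here is a simplified standalone C sketch of the loop shape used
by the helper in this patch: each 4-byte lane owns four bits of a
16-bit predicate, a lane is skipped when all of its mask bits are
clear, and the result is masked at the end. The function name
cmp_ge_scalar is hypothetical, and the sketch assumes 32-bit lanes and
leaves out the ECI mask, the VPR update and the floating point
exception flags that the real helper handles.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-lane, 32-bit-element "GE against scalar". */
static uint16_t cmp_ge_scalar(const float n[4], float rm, uint16_t mask)
{
    uint16_t beatpred = 0;
    uint16_t emask = 0xf; /* predicate bits for lane 0: one bit per byte */

    assert(n != NULL);
    for (unsigned e = 0; e < 4; e++, emask <<= 4) {
        if ((mask & emask) == 0) {
            continue; /* lane entirely predicated out */
        }
        bool r = rm <= n[e]; /* GE; false when either operand is a NaN */
        beatpred |= r ? emask : 0;
    }
    /* Only bytes that were active in the incoming mask keep their result. */
    return beatpred & mask;
}
```

With lanes {1.0, 2.0, 3.0, NaN}, a scalar of 2.0 and a fully active
mask, lanes 1 and 2 compare true and the NaN lane compares false.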

Signed-off-by: Peter Maydell
Reviewed-by: Richard Henderson
---
 target/arm/helper-mve.h    | 18 +++++++++++
 target/arm/mve.decode      | 61 +++++++++++++++++++++++++++++--------
 target/arm/mve_helper.c    | 62 ++++++++++++++++++++++++++++++--------
 target/arm/translate-mve.c | 14 +++++++++
 4 files changed, 131 insertions(+), 24 deletions(-)

diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h
index 0c15c531641..9ee841cdf01 100644
--- a/target/arm/helper-mve.h
+++ b/target/arm/helper-mve.h
@@ -831,6 +831,24 @@ DEF_HELPER_FLAGS_3(mve_vfcmpgts, TCG_CALL_NO_WG, void, env, ptr, ptr)
 DEF_HELPER_FLAGS_3(mve_vfcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr)
 DEF_HELPER_FLAGS_3(mve_vfcmples, TCG_CALL_NO_WG, void, env, ptr, ptr)
 
+DEF_HELPER_FLAGS_3(mve_vfcmpeq_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmpeq_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpne_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmpne_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpge_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmpge_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vfcmplt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmplt_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vfcmpgt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmpgt_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
+DEF_HELPER_FLAGS_3(mve_vfcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32)
+DEF_HELPER_FLAGS_3(mve_vfcmple_scalars, TCG_CALL_NO_WG, void, env, ptr, i32)
+
 DEF_HELPER_FLAGS_4(mve_vfadd_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(mve_vfadd_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32)
 
diff --git a/target/arm/mve.decode b/target/arm/mve.decode
index 7767ecae2ac..aa113279dc5 100644
--- a/target/arm/mve.decode
+++ b/target/arm/mve.decode
@@ -127,6 +127,11 @@
 @vcmp_fp .... .... .... qn:3 . .... .... .... .... &vcmp \
          qm=%qm size=%2op_fp_scalar_size mask=%mask_22_13
 
+# Bit 28 is a 2op_fp_scalar_size bit, but we do not decode it in this
+# format to avoid complicated overlapping-instruction-groups
+@vcmp_fp_scalar .... .... .... qn:3 . .... .... .... rm:4 &vcmp_scalar \
+                mask=%mask_22_13
+
 @vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm
 
 @2op_fp .... .... .... .... .... .... .... .... &2op \
@@ -400,8 +405,10 @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2
   VIWDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 ... . @viwdup
 }
 {
-  VDDUP  1110 1110 0 . .. ... 1 ... 1 1111 . 110 111 . @vidup
-  VDWDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 ... . @viwdup
+  VCMPGT_fp_scalar 1110 1110 0 . 11 ... 1 ... 1 1111 0110 .... @vcmp_fp_scalar size=2
+  VCMPLE_fp_scalar 1110 1110 0 . 11 ... 1 ... 1 1111 1110 .... @vcmp_fp_scalar size=2
+  VDDUP            1110 1110 0 . .. ... 1 ... 1 1111 . 110 111 . @vidup
+  VDWDUP           1110 1110 0 . .. ... 1 ... 1 1111 . 110 ... . @viwdup
 }
 
 # multiply-add long dual accumulate
@@ -472,8 +479,17 @@ VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz
 
 # Scalar operations
 
-VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
-VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar
+{
+  VCMPEQ_fp_scalar 1110 1110 0 . 11 ... 1 ... 0 1111 0100 .... @vcmp_fp_scalar size=2
+  VCMPNE_fp_scalar 1110 1110 0 . 11 ... 1 ... 0 1111 1100 .... @vcmp_fp_scalar size=2
+  VADD_scalar      1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar
+}
+
+{
+  VCMPLT_fp_scalar 1110 1110 0 . 11 ... 1 ... 1 1111 1100 .... @vcmp_fp_scalar size=2
+  VCMPGE_fp_scalar 1110 1110 0 . 11 ... 1 ... 1 1111 0100 .... @vcmp_fp_scalar size=2
+  VSUB_scalar      1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar
+}
 
 {
   VSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar
@@ -703,17 +719,38 @@ VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd
 }
 
 {
-  VPNOT         1111 1110 0 0 11 000 1 000 0 1111 0100 1101
-  VPST          1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
-  VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar
+  VPNOT            1111 1110 0 0 11 000 1 000 0 1111 0100 1101
+  VPST             1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13
+  VCMPEQ_fp_scalar 1111 1110 0 . 11 ... 1 ... 0 1111 0100 .... @vcmp_fp_scalar size=1
+  VCMPEQ_scalar    1111 1110 0 . .. ... 1 ... 0 1111 0100 .... @vcmp_scalar
 }
-VCMPNE_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 0 0 .... @vcmp_scalar
+
+{
+  VCMPNE_fp_scalar 1111 1110 0 . 11 ... 1 ... 0 1111 1100 .... @vcmp_fp_scalar size=1
+  VCMPNE_scalar    1111 1110 0 . .. ... 1 ... 0 1111 1100 .... @vcmp_scalar
+}
+
+{
+  VCMPGT_fp_scalar 1111 1110 0 . 11 ... 1 ... 1 1111 0110 .... @vcmp_fp_scalar size=1
+  VCMPGT_scalar    1111 1110 0 . .. ... 1 ... 1 1111 0110 .... @vcmp_scalar
+}
+
+{
+  VCMPLE_fp_scalar 1111 1110 0 . 11 ... 1 ... 1 1111 1110 .... @vcmp_fp_scalar size=1
+  VCMPLE_scalar    1111 1110 0 . .. ... 1 ... 1 1111 1110 .... @vcmp_scalar
+}
+
+{
+  VCMPGE_fp_scalar 1111 1110 0 . 11 ... 1 ... 1 1111 0100 .... @vcmp_fp_scalar size=1
+  VCMPGE_scalar    1111 1110 0 . .. ... 1 ... 1 1111 0100 .... @vcmp_scalar
+}
+{
+  VCMPLT_fp_scalar 1111 1110 0 . 11 ... 1 ... 1 1111 1100 .... @vcmp_fp_scalar size=1
+  VCMPLT_scalar    1111 1110 0 . .. ... 1 ... 1 1111 1100 .... @vcmp_scalar
+}
+
 VCMPCS_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 1 0 .... @vcmp_scalar
 VCMPHI_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 1 0 .... @vcmp_scalar
-VCMPGE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 0 0 .... @vcmp_scalar
-VCMPLT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 0 0 .... @vcmp_scalar
-VCMPGT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 1 0 .... @vcmp_scalar
-VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar
 
 # 2-operand FP
 VADD_fp 1110 1111 0 . 0 . ... 0 ... 0 1101 . 1 . 0 ... 0 @2op_fp
diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c
index ebfd5746b13..0aeccc12d69 100644
--- a/target/arm/mve_helper.c
+++ b/target/arm/mve_helper.c
@@ -3196,6 +3196,44 @@ DO_FP_VMAXMINV(vminnmavs, 4, uint32_t, float32, true, float32_minnum)
         mve_advance_vpt(env); \
     }
 
+#define DO_VCMP_FP_SCALAR(OP, ESIZE, TYPE, FN) \
+    void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \
+                                uint32_t rm) \
+    { \
+        TYPE *n = vn; \
+        uint16_t mask = mve_element_mask(env); \
+        uint16_t eci_mask = mve_eci_mask(env); \
+        uint16_t beatpred = 0; \
+        uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \
+        unsigned e; \
+        float_status *fpst; \
+        float_status scratch_fpst; \
+        bool r; \
+        for (e = 0; e < 16 / ESIZE; e++, emask <<= ESIZE) { \
+            if ((mask & emask) == 0) { \
+                continue; \
+            } \
+            fpst = (ESIZE == 2) ? &env->vfp.standard_fp_status_f16 : \
+                &env->vfp.standard_fp_status; \
+            if (!(mask & (1 << (e * ESIZE)))) { \
+                /* We need the result but without updating flags */ \
+                scratch_fpst = *fpst; \
+                fpst = &scratch_fpst; \
+            } \
+            r = FN(n[H##ESIZE(e)], (TYPE)rm, fpst); \
+            /* Comparison sets 0/1 bits for each byte in the element */ \
+            beatpred |= r * emask; \
+        } \
+        beatpred &= mask; \
+        env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \
+            (beatpred & eci_mask); \
+        mve_advance_vpt(env); \
+    }
+
+#define DO_VCMP_FP_BOTH(VOP, SOP, ESIZE, TYPE, FN) \
+    DO_VCMP_FP(VOP, ESIZE, TYPE, FN) \
+    DO_VCMP_FP_SCALAR(SOP, ESIZE, TYPE, FN)
+
 /*
  * Some care is needed here to get the correct result for the unordered case.
* Architecturally EQ, GE and GT are defined to be false for unordered, but @@ -3208,20 +3246,20 @@ DO_FP_VMAXMINV(vminnmavs, 4, uint32_t, float32, true, float32_minnum) #define DO_GT16(X, Y, S) float16_lt(Y, X, S) #define DO_GT32(X, Y, S) float32_lt(Y, X, S) -DO_VCMP_FP(vfcmpeqh, 2, uint16_t, float16_eq) -DO_VCMP_FP(vfcmpeqs, 4, uint32_t, float32_eq) +DO_VCMP_FP_BOTH(vfcmpeqh, vfcmpeq_scalarh, 2, uint16_t, float16_eq) +DO_VCMP_FP_BOTH(vfcmpeqs, vfcmpeq_scalars, 4, uint32_t, float32_eq) -DO_VCMP_FP(vfcmpneh, 2, uint16_t, !float16_eq) -DO_VCMP_FP(vfcmpnes, 4, uint32_t, !float32_eq) +DO_VCMP_FP_BOTH(vfcmpneh, vfcmpne_scalarh, 2, uint16_t, !float16_eq) +DO_VCMP_FP_BOTH(vfcmpnes, vfcmpne_scalars, 4, uint32_t, !float32_eq) -DO_VCMP_FP(vfcmpgeh, 2, uint16_t, DO_GE16) -DO_VCMP_FP(vfcmpges, 4, uint32_t, DO_GE32) +DO_VCMP_FP_BOTH(vfcmpgeh, vfcmpge_scalarh, 2, uint16_t, DO_GE16) +DO_VCMP_FP_BOTH(vfcmpges, vfcmpge_scalars, 4, uint32_t, DO_GE32) -DO_VCMP_FP(vfcmplth, 2, uint16_t, !DO_GE16) -DO_VCMP_FP(vfcmplts, 4, uint32_t, !DO_GE32) +DO_VCMP_FP_BOTH(vfcmplth, vfcmplt_scalarh, 2, uint16_t, !DO_GE16) +DO_VCMP_FP_BOTH(vfcmplts, vfcmplt_scalars, 4, uint32_t, !DO_GE32) -DO_VCMP_FP(vfcmpgth, 2, uint16_t, DO_GT16) -DO_VCMP_FP(vfcmpgts, 4, uint32_t, DO_GT32) +DO_VCMP_FP_BOTH(vfcmpgth, vfcmpgt_scalarh, 2, uint16_t, DO_GT16) +DO_VCMP_FP_BOTH(vfcmpgts, vfcmpgt_scalars, 4, uint32_t, DO_GT32) -DO_VCMP_FP(vfcmpleh, 2, uint16_t, !DO_GT16) -DO_VCMP_FP(vfcmples, 4, uint32_t, !DO_GT32) +DO_VCMP_FP_BOTH(vfcmpleh, vfcmple_scalarh, 2, uint16_t, !DO_GT16) +DO_VCMP_FP_BOTH(vfcmples, vfcmple_scalars, 4, uint32_t, !DO_GT32) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index da14a6f790e..e8a3dec6683 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1771,6 +1771,20 @@ DO_VCMP(VCMPLE, vcmple) return false; \ } \ return do_vcmp(s, a, fns[a->size]); \ + } \ + static bool trans_##INSN##_scalar(DisasContext *s, \ + arg_vcmp_scalar *a) \ + { \ + static 
MVEGenScalarCmpFn * const fns[] = { \ + NULL, \ + gen_helper_mve_##FN##_scalarh, \ + gen_helper_mve_##FN##_scalars, \ + NULL, \ + }; \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_vcmp_scalar(s, a, fns[a->size]); \ } DO_VCMP_FP(VCMPEQ_fp, vfcmpeq) From patchwork Thu Jul 29 11:15:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511206 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=rtHNOSok; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb88b5Lcrz9sSs for ; Thu, 29 Jul 2021 21:54:39 +1000 (AEST) Received: from localhost ([::1]:50342 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94cX-0004pM-H3 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:54:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41098) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941b-0001QY-GJ for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:27 -0400 Received: from mail-wr1-x435.google.com ([2a00:1450:4864:20::435]:41645) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9419-0001TM-61 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:27 -0400 Received: by mail-wr1-x435.google.com 
with SMTP id b7so6456518wri.8 for ; Thu, 29 Jul 2021 04:15:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=XkZ/M1G+THrDlmqrCgDLQ+Zh1oSvNMXSv94gr49fYtc=; b=rtHNOSokLA8O5zBaYcAJ706IaGOPtRtKB8qEQCm/lbzm8zRHqwjRnsi+Cv9v0E/hnd jvBCK5WZGhpvohQQJe43cZRDSHPBOIZCzSinso+S50uJg9erueW2hVoV+BIP/HG9JLQt yZxdqLnoxpk947RKLtsVRmmZLnsEV2HcNjE+U1mgWxlbP7XnS71mKUF1xOm1ClxVpx1x GArQ+e6DNSrbbgWGko5ap5RbPJ8MST3Cm81nflBb3OlitRTaLiW4einbijV9E8SAk7g+ U4mueMPlBW1dLxjJlF4mFimJ+hXaGH63g/jcTCNUcZsGxuS76HC+7BVKCSbMdwCyRxhW wD+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XkZ/M1G+THrDlmqrCgDLQ+Zh1oSvNMXSv94gr49fYtc=; b=MJBTt9790ms5+05uzgr4eoRjzWlIJWj+NYaWdLF171U3LfweCWH5ErZda1JUFf16jj gQJxYov4VpkJgdaegV53Ly1ZaON6lvr9KYQJ6h4G/Tol8GlVj82oR52eMQYeyq5SgWa4 hCoxltr0vI7QU5bClSz+K04SbU3upsn16qvaHm4+iOzkDwjw9ZRgN0BLD4SvAj44QG4t bhQX37os3RveUoCp84iN2gb/saytUWnMIUDRWYRNob4Hb35VD6+Y4H0fLPaIYR7zJu0d 7NNaJ+SXKw8zZoDcr5vv8T2JHfCIRuJxCzWO5E6m8pi2UMtR4P26So5s9JTuyyAEe8lW A0mw== X-Gm-Message-State: AOAM532F6kRWaViXtmWYhUPrxMcKvb1x0leMYHLz58eyfdICxmBL6aIg e72si7AHAItrlQuqEu2d/Ukr41emHaUeIw== X-Google-Smtp-Source: ABdhPJzbvOYKDux75XycHwaqng4bUYXR2jlXhuTP2s0hjlpBS5xlcrz3TypTMFpLEW2xeGGCxeJ8iw== X-Received: by 2002:a5d:6c63:: with SMTP id r3mr2822447wrz.405.1627557354932; Thu, 29 Jul 2021 04:15:54 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:54 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 48/53] target/arm: Implement MVE VCVT between floating and fixed point Date: Thu, 29 Jul 2021 12:15:07 +0100 Message-Id: <20210729111512.16541-49-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::435; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCVT insns which convert between floating and fixed point. As with the Neon equivalents, these use essentially the same constant encoding as right-shift-by-immediate. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 9 +++++++++ target/arm/mve.decode | 19 +++++++++++++++++++ target/arm/mve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 18 ++++++++++++++++++ 4 files changed, 82 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 9ee841cdf01..f3c2b43bf43 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -863,3 +863,12 @@ DEF_HELPER_FLAGS_4(mve_vfma_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vfmas_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vfmas_scalars, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vcvt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_hs, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_hu, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_sf, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_uf, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_fs, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_fu, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index aa113279dc5..d9fcc42d36d 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -771,3 +771,22 @@ VCMLA0 1111 110 00 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_ VCMLA90 1111 110 01 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev VCMLA180 1111 110 10 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev VCMLA270 1111 110 11 . 1 . ... 0 ... 0 1000 . 1 . 0 ... 0 @2op_fp_size_rev + +# floating-point <-> fixed-point conversions. Naming convention: +# VCVT_<from><to>, S = signed int, U = unsigned int, H = halfprec, F = singleprec +@vcvt .... .... .. 1 ..... .... .. 1 . .... .... 
&2shift \ + qd=%qd qm=%qm shift=%rshift_i5 size=2 +@vcvt_f16 .... .... .. 11 .... .... .. 0 . .... .... &2shift \ + qd=%qd qm=%qm shift=%rshift_i4 size=1 + +VCVT_SH_fixed 1110 1111 1 . ...... ... 0 11 . 0 01 . 1 ... 0 @vcvt_f16 +VCVT_UH_fixed 1111 1111 1 . ...... ... 0 11 . 0 01 . 1 ... 0 @vcvt_f16 + +VCVT_HS_fixed 1110 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 0 @vcvt_f16 +VCVT_HU_fixed 1111 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 0 @vcvt_f16 + +VCVT_SF_fixed 1110 1111 1 . ...... ... 0 11 . 0 01 . 1 ... 0 @vcvt +VCVT_UF_fixed 1111 1111 1 . ...... ... 0 11 . 0 01 . 1 ... 0 @vcvt + +VCVT_FS_fixed 1110 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 0 @vcvt +VCVT_FU_fixed 1111 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 0 @vcvt diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 0aeccc12d69..8e1184db3b4 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3263,3 +3263,39 @@ DO_VCMP_FP_BOTH(vfcmpgts, vfcmpgt_scalars, 4, uint32_t, DO_GT32) DO_VCMP_FP_BOTH(vfcmpleh, vfcmple_scalarh, 2, uint16_t, !DO_GT16) DO_VCMP_FP_BOTH(vfcmples, vfcmple_scalars, 4, uint32_t, !DO_GT32) + +#define DO_VCVT_FIXED(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vm, \ + uint32_t shift) \ + { \ + TYPE *d = vd, *m = vm; \ + TYPE r; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? 
&env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(m[H##ESIZE(e)], shift, fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VCVT_FIXED(vcvt_sh, 2, int16_t, helper_vfp_shtoh) +DO_VCVT_FIXED(vcvt_uh, 2, uint16_t, helper_vfp_uhtoh) +DO_VCVT_FIXED(vcvt_hs, 2, int16_t, helper_vfp_toshh_round_to_zero) +DO_VCVT_FIXED(vcvt_hu, 2, uint16_t, helper_vfp_touhh_round_to_zero) +DO_VCVT_FIXED(vcvt_sf, 4, int32_t, helper_vfp_sltos) +DO_VCVT_FIXED(vcvt_uf, 4, uint32_t, helper_vfp_ultos) +DO_VCVT_FIXED(vcvt_fs, 4, int32_t, helper_vfp_tosls_round_to_zero) +DO_VCVT_FIXED(vcvt_fu, 4, uint32_t, helper_vfp_touls_round_to_zero) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index e8a3dec6683..9269dbc3324 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1439,6 +1439,24 @@ DO_2SHIFT(VRSHRI_U, vrshli_u, true) DO_2SHIFT(VSRI, vsri, false) DO_2SHIFT(VSLI, vsli, false) +#define DO_2SHIFT_FP(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_2shift *a) \ + { \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_2shift(s, a, gen_helper_mve_##FN, false); \ + } + +DO_2SHIFT_FP(VCVT_SH_fixed, vcvt_sh) +DO_2SHIFT_FP(VCVT_UH_fixed, vcvt_uh) +DO_2SHIFT_FP(VCVT_HS_fixed, vcvt_hs) +DO_2SHIFT_FP(VCVT_HU_fixed, vcvt_hu) +DO_2SHIFT_FP(VCVT_SF_fixed, vcvt_sf) +DO_2SHIFT_FP(VCVT_UF_fixed, vcvt_uf) +DO_2SHIFT_FP(VCVT_FS_fixed, vcvt_fs) +DO_2SHIFT_FP(VCVT_FU_fixed, vcvt_fu) + static bool do_2shift_scalar(DisasContext *s, arg_shl_scalar *a, MVEGenTwoOpShiftFn *fn) { From patchwork Thu Jul 29 11:15:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511223 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org 
Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=akpBwRVP; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb8GC2173z9sV8 for ; Thu, 29 Jul 2021 21:59:31 +1000 (AEST) Received: from localhost ([::1]:40360 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94hF-0000K6-0W for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:59:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41086) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941b-0001Oz-2X for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:27 -0400 Received: from mail-wm1-x336.google.com ([2a00:1450:4864:20::336]:41747) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9419-0001U9-6s for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:26 -0400 Received: by mail-wm1-x336.google.com with SMTP id n28-20020a05600c3b9cb02902552e60df56so3744906wms.0 for ; Thu, 29 Jul 2021 04:15:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=1vyQ0Oi0tEEGDv6NXEcKRepnjPjEniiaL2/Am2D+bJo=; b=akpBwRVPr3gGW5KI3ngUEkfu4smdkEL3E56ASLfC4ANOOyWmmv+6IZvd5xqz7ZDi1u knLbCE/eqgYHwpYk1vQibjDPeiTbcFq3OB8aAwO01q3PEs1qiemIwh+tuDJQWJUdYco7 
QKIEqta69hcIGoLy6TgIWZnUs5oDWES/yyXr1B7mcZldHyW66ydkZDYJP888FDBrYoHE RfUuOpz66YFRfhM30/1jMCH6l/R33YmTEslDUStaNKOdaQ2tnHl8iT9Jc/lfN2Qyq4Vk +eXJzRa72jmXD04xUje1FQo9vpzSSTVNGwi6NPijFKRhziQT4gF3YkEw/Gb0l8Qibdye 64YA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1vyQ0Oi0tEEGDv6NXEcKRepnjPjEniiaL2/Am2D+bJo=; b=P7pXuR54d0XrIlxU2299d9tXjC0mmafNlmh+Eo9qmXTR4//TaWcaIxGWydN4ZG8XR+ uROqB3KFgc6y0JsJJ0AQYF1p6VQ+CLEOBQ94N48KXwjEmBYm2dfpykhcQoLyQ3EIaaRo EqVgaUuZeEujYR0yobzq8jMzcbmiysNDxbVEG8bugr0TLUZTzPKjVTZCQLFN3lZ1Fe9u 1VWvdEY3ILrIYK76t7zWPzoVIL483uCEEVU1ND6L+SHvOzEUfL+01MXNOx1Xwhx/WE6X PxiLtrFa9d6D1u1V0ZCvJPZZRUEcUzvbS0x54Xfps9W+WbSDt24uefzY84Vo7XTZPOQ3 3wxQ== X-Gm-Message-State: AOAM533tdoYVRHM26twK9TX0hIy3aC7tFI/czSCxK2SGEGzJ80idP1zo wKVd5G8JS0lXS2CW6mZjd5KgoUopzVUatg== X-Google-Smtp-Source: ABdhPJw02pAAMb4OSUZ7drK//EFV5/af67FRBEL/3O/kDOLEL8cERk1xgRIDw0zANzGCKDunCsHMUg== X-Received: by 2002:a1c:238e:: with SMTP id j136mr4273639wmj.91.1627557355681; Thu, 29 Jul 2021 04:15:55 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:55 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 49/53] target/arm: Implement MVE VCVT between fp and integer Date: Thu, 29 Jul 2021 12:15:08 +0100 Message-Id: <20210729111512.16541-50-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::336; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x336.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE "VCVT (between floating-point and integer)" insn. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve.decode | 7 +++++++ target/arm/translate-mve.c | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index d9fcc42d36d..9a40ff9f43c 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -790,3 +790,10 @@ VCVT_UF_fixed 1111 1111 1 . ...... ... 0 11 . 0 01 . 1 ... 0 @vcvt VCVT_FS_fixed 1110 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 0 @vcvt VCVT_FU_fixed 1111 1111 1 . ...... ... 0 11 . 1 01 . 1 ... 
0 @vcvt + +# VCVT between floating point and integer (halfprec and single); +# VCVT_<from><to>, S = signed int, U = unsigned int, F = float +VCVT_SF 1111 1111 1 . 11 .. 11 ... 0 011 00 1 . 0 ... 0 @1op +VCVT_UF 1111 1111 1 . 11 .. 11 ... 0 011 01 1 . 0 ... 0 @1op +VCVT_FS 1111 1111 1 . 11 .. 11 ... 0 011 10 1 . 0 ... 0 @1op +VCVT_FU 1111 1111 1 . 11 .. 11 ... 0 011 11 1 . 0 ... 0 @1op diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 9269dbc3324..351033af1ec 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -543,6 +543,38 @@ DO_1OP(VQNEG, vqneg) DO_1OP(VMAXA, vmaxa) DO_1OP(VMINA, vmina) +/* + * For simple float/int conversions we use the fixed-point + * conversion helpers with a zero shift count + */ +#define DO_VCVT(INSN, HFN, SFN) \ + static void gen_##INSN##h(TCGv_ptr env, TCGv_ptr qd, TCGv_ptr qm) \ + { \ + gen_helper_mve_##HFN(env, qd, qm, tcg_constant_i32(0)); \ + } \ + static void gen_##INSN##s(TCGv_ptr env, TCGv_ptr qd, TCGv_ptr qm) \ + { \ + gen_helper_mve_##SFN(env, qd, qm, tcg_constant_i32(0)); \ + } \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + static MVEGenOneOpFn * const fns[] = { \ + NULL, \ + gen_##INSN##h, \ + gen_##INSN##s, \ + NULL, \ + }; \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_1op(s, a, fns[a->size]); \ + } + +DO_VCVT(VCVT_SF, vcvt_sh, vcvt_sf) +DO_VCVT(VCVT_UF, vcvt_uh, vcvt_uf) +DO_VCVT(VCVT_FS, vcvt_hs, vcvt_fs) +DO_VCVT(VCVT_FU, vcvt_hu, vcvt_fu) + /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_1op *a) \ From patchwork Thu Jul 29 11:15:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511204 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF 
authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=e/jn0Y4t; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb87N5FCjz9sSs for ; Thu, 29 Jul 2021 21:53:35 +1000 (AEST) Received: from localhost ([::1]:45632 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94bT-0001V6-OO for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:53:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41064) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941Z-0001MV-Lf for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:25 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:37760) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9419-0001Ug-5z for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:25 -0400 Received: by mail-wr1-x42f.google.com with SMTP id d8so6457495wrm.4 for ; Thu, 29 Jul 2021 04:15:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=/oEQZ2222tqM0a4cMKBeNUf6j4gK7g7+B5zZA/0EG9I=; b=e/jn0Y4tVLANZeLl8hOFxumPzKGBUvJO6WOhn8R4yPqKs2sBWaeOZH30WLmM5DCsCl w2pIL5sftu5U1xRUy0I0vTn47giJFxFXNRe0PaLAvtpJCeDSJKPVNBcDTJc/2UKEo/V5 widh6y/YDMggu+zsJeysNiam0v9Kw/8/AfnLl8TzdRKG967c+Wxj7ilJnk3OqzMv3jL7 ZvXtU9ZOoDZidhPTapOvAygD5eErBEiNXDTrptX4lehEqUFv/METKsJo+4LImb64U1O7 
nguNigAVU619i8btG0rIwWtAXHO1uxaabDQCC0MIsov58HqLBoCsNT/K2JOQuWERUxjs xPmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/oEQZ2222tqM0a4cMKBeNUf6j4gK7g7+B5zZA/0EG9I=; b=NpCZf7DIw01K9JP0Eh2QlT02hiTqUbOoawsHtHO7dYstnE5nBOcuCRBCoNotnSlzoo tI8YH/AjFUEcokZcFzx2XlXq3masfXbmI+4Ua9dltN2vEKDqeeJg+Bh2p1Jv/J8AaMWo Ak7U4E3FvhpsRlZ2jyCFz/sGaSl/ki5coq/2HNxjYs8DG7MCx0uEjgJdoO+zEgjtZGTV 8AkPhtOXIRPYMNgN3OPgQmlJmlP7xv7QYTfXZB/bOJnkMetG5XjrG6QW1k0Bj/cVK4Cj BJfrWZkUN5w8HWuMa9H4z5aIuOrvUBHxuGbR5/dmrZKcqivGfsxXrQz6DVg3vYlwQQYB yOtQ== X-Gm-Message-State: AOAM533ddSUuTknR51HnkUq91gnzyZjcA1gf07nvbxbivEKkn02gS8Pb yGxQ82M0iu42hldanhmwz8K1Mg== X-Google-Smtp-Source: ABdhPJws55HduM6WOFvByUKcydlN+t9N4B0xIEWz1wgFRgbAXKGvf7polBxLJwSYZs2pNI53JJ1FRA== X-Received: by 2002:adf:8169:: with SMTP id 96mr4213795wrm.424.1627557356482; Thu, 29 Jul 2021 04:15:56 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:56 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 50/53] target/arm: Implement MVE VCVT with specified rounding mode Date: Thu, 29 Jul 2021 12:15:09 +0100 Message-Id: <20210729111512.16541-51-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCVT which converts from floating-point to integer using a rounding mode specified by the instruction. We implement this similarly to the Neon equivalents, by passing the required rounding mode as an extra integer parameter to the helper functions. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 5 ++++ target/arm/mve.decode | 10 ++++++++ target/arm/mve_helper.c | 38 ++++++++++++++++++++++++++++ target/arm/translate-mve.c | 52 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 105 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index f3c2b43bf43..6d4052a5269 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -177,6 +177,11 @@ DEF_HELPER_FLAGS_3(mve_vminab, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vminah, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vminaw, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vcvt_rm_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_rm_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_rm_ss, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vcvt_rm_us, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 9a40ff9f43c..410ea746fcf 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -797,3 +797,13 @@ VCVT_SF 1111 1111 1 . 11 .. 11 ... 0 011 00 1 . 0 ... 0 @1op VCVT_UF 1111 1111 1 . 11 .. 11 ... 0 011 01 1 . 0 ... 0 @1op VCVT_FS 1111 1111 1 . 11 .. 11 ... 0 011 10 1 . 0 ... 0 @1op VCVT_FU 1111 1111 1 . 11 .. 11 ... 0 011 11 1 . 0 ... 0 @1op + +# VCVT from floating point to integer with specified rounding mode +VCVTAS 1111 1111 1 . 11 .. 11 ... 000 00 0 1 . 0 ... 0 @1op +VCVTAU 1111 1111 1 . 11 .. 11 ... 000 00 1 1 . 0 ... 0 @1op +VCVTNS 1111 1111 1 . 11 .. 11 ... 000 01 0 1 . 0 ... 0 @1op +VCVTNU 1111 1111 1 . 11 .. 11 ... 000 01 1 1 . 0 ... 0 @1op +VCVTPS 1111 1111 1 . 11 .. 11 ... 000 10 0 1 . 0 ... 
0 @1op +VCVTPU 1111 1111 1 . 11 .. 11 ... 000 10 1 1 . 0 ... 0 @1op +VCVTMS 1111 1111 1 . 11 .. 11 ... 000 11 0 1 . 0 ... 0 @1op +VCVTMU 1111 1111 1 . 11 .. 11 ... 000 11 1 1 . 0 ... 0 @1op diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 8e1184db3b4..4e0d979e643 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3299,3 +3299,41 @@ DO_VCVT_FIXED(vcvt_sf, 4, int32_t, helper_vfp_sltos) DO_VCVT_FIXED(vcvt_uf, 4, uint32_t, helper_vfp_ultos) DO_VCVT_FIXED(vcvt_fs, 4, int32_t, helper_vfp_tosls_round_to_zero) DO_VCVT_FIXED(vcvt_fu, 4, uint32_t, helper_vfp_touls_round_to_zero) + +/* VCVT with specified rmode */ +#define DO_VCVT_RMODE(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, \ + void *vd, void *vm, uint32_t rmode) \ + { \ + TYPE *d = vd, *m = vm; \ + TYPE r; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + float_status *base_fpst = (ESIZE == 2) ? \ + &env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + uint32_t prev_rmode = get_float_rounding_mode(base_fpst); \ + set_float_rounding_mode(rmode, base_fpst); \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = base_fpst; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(m[H##ESIZE(e)], 0, fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + set_float_rounding_mode(prev_rmode, base_fpst); \ + mve_advance_vpt(env); \ + } + +DO_VCVT_RMODE(vcvt_rm_sh, 2, uint16_t, helper_vfp_toshh) +DO_VCVT_RMODE(vcvt_rm_uh, 2, uint16_t, helper_vfp_touhh) +DO_VCVT_RMODE(vcvt_rm_ss, 4, uint32_t, helper_vfp_tosls) +DO_VCVT_RMODE(vcvt_rm_us, 4, uint32_t, helper_vfp_touls) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 351033af1ec..e80a55eb62e 100644 --- a/target/arm/translate-mve.c 
+++ b/target/arm/translate-mve.c @@ -49,6 +49,7 @@ typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenDualAccOpFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenVCVTRmodeFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -575,6 +576,57 @@ DO_VCVT(VCVT_UF, vcvt_uh, vcvt_uf) DO_VCVT(VCVT_FS, vcvt_hs, vcvt_fs) DO_VCVT(VCVT_FU, vcvt_hu, vcvt_fu) +static bool do_vcvt_rmode(DisasContext *s, arg_1op *a, + enum arm_fprounding rmode, bool u) +{ + /* + * Handle VCVT fp to int with specified rounding mode. + * This is a 1op fn but we must pass the rounding mode as + * an immediate to the helper. + */ + TCGv_ptr qd, qm; + static MVEGenVCVTRmodeFn * const fns[4][2] = { + { NULL, NULL }, + { gen_helper_mve_vcvt_rm_sh, gen_helper_mve_vcvt_rm_uh }, + { gen_helper_mve_vcvt_rm_ss, gen_helper_mve_vcvt_rm_us }, + { NULL, NULL }, + }; + MVEGenVCVTRmodeFn *fn = fns[a->size][u]; + + if (!dc_isar_feature(aa32_mve_fp, s) || + !mve_check_qreg_bank(s, a->qd | a->qm) || + !fn) { + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qd = mve_qreg_ptr(a->qd); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qd, qm, tcg_constant_i32(arm_rmode_to_sf(rmode))); + tcg_temp_free_ptr(qd); + tcg_temp_free_ptr(qm); + mve_update_eci(s); + return true; +} + +#define DO_VCVT_RMODE(INSN, RMODE, U) \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + return do_vcvt_rmode(s, a, RMODE, U); \ + } \ + +DO_VCVT_RMODE(VCVTAS, FPROUNDING_TIEAWAY, false) +DO_VCVT_RMODE(VCVTAU, FPROUNDING_TIEAWAY, true) +DO_VCVT_RMODE(VCVTNS, FPROUNDING_TIEEVEN, false) +DO_VCVT_RMODE(VCVTNU, FPROUNDING_TIEEVEN, true) +DO_VCVT_RMODE(VCVTPS, FPROUNDING_POSINF, false) 
+DO_VCVT_RMODE(VCVTPU, FPROUNDING_POSINF, true) +DO_VCVT_RMODE(VCVTMS, FPROUNDING_NEGINF, false) +DO_VCVT_RMODE(VCVTMU, FPROUNDING_NEGINF, true) + /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_1op *a) \ From patchwork Thu Jul 29 11:15:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511208 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=EG1jpE6J; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb89X6kJtz9sV8 for ; Thu, 29 Jul 2021 21:55:28 +1000 (AEST) Received: from localhost ([::1]:54538 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94dK-0007dd-LD for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:55:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41118) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941c-0001Sm-27 for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:28 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:39887) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9419-0001V0-Fb for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:27 -0400 Received: by mail-wr1-x42f.google.com 
with SMTP id b11so1109598wrx.6 for ; Thu, 29 Jul 2021 04:15:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=zAZRxjiyb+93dblmUX5ptmQBplEyBfFmOxvexmaeevQ=; b=EG1jpE6JM6ZujeqJQjfa5t5l0rm1Dx47gpiMhUt+G63knT+tTgbCLZbVM+gYDyH2T4 Bdgemr/LcLq/KxOtsB8xyTSdtzX/wrdBBt9+197cmQSsFaWEddEatrcoJZXlHEJgKzaa WDUYcdLmNpUt4mBJE8gQb3bOahFKrMSYA9Xq1mDwzfGBN/hnmSipwPfvNnylGXYoxaus XTOjeRY+cvCg6V6uZUKpcrIQLWOZYOwHjg5vnuk7Ia0WaL8cycbq+ObTSNnxOQQcNQhZ wrckQl2oK15nBSaSFRSnlyV4OqhoVeYfn+dEEoihm5qOP/SJGLOeIKSvpgMkjm3NlkkJ nx9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=zAZRxjiyb+93dblmUX5ptmQBplEyBfFmOxvexmaeevQ=; b=jNcRqtV/9w9AErhNndG13wi85YHGK3VmZdcChkk5dWH3oAdYnZn906JXdQb2GDqtKU JxQP3/8OVALV93TzzsfFw5I/fhgOmYpa8aWFUdQrY+iGxm46Kac019FQ327FM/3OfNsm tbU/p95NUBTA+0pjOiaeTw33Uw4+mMQDg2sEw8YzpSIkhmEfkyVhO81HvWUdWrNdT7Re 32je+N018fQfKXpirYbCilDMhcJVzFF/fhkHWOXDXxR4iec6j3R6TkZJGaaJFRFfXdTe d+7zL4iH3Fs2PPBW26fwvSEdEa3WcNPkWVoWkwp1TK9hhfqWyyt4zUYFhlXttYg8tiOv y6yg== X-Gm-Message-State: AOAM533KEx52O0Fz3gvl8/ssgvXngTV57k3hsrek0RftLU3kK4yvvs0t eUqqu5zGO/LZxbas9/1QtwJdX8dAeEEHvQ== X-Google-Smtp-Source: ABdhPJz87zW+mMUYWJ0UeWE2FSw8aG5frQBeWYzrm8+IMrAppgvmhG+WNalJVL4kjXxPxYTzGm2iCg== X-Received: by 2002:adf:82e6:: with SMTP id 93mr4206843wrc.47.1627557357412; Thu, 29 Jul 2021 04:15:57 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:57 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 51/53] target/arm: Implement MVE VCVT between single and half precision Date: Thu, 29 Jul 2021 12:15:10 +0100 Message-Id: <20210729111512.16541-52-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VCVT instruction which converts between single and half precision floating point. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 5 +++ target/arm/mve.decode | 8 +++++ target/arm/mve_helper.c | 71 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 14 ++++++++ 4 files changed, 98 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 6d4052a5269..f6345c7abbe 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -182,6 +182,11 @@ DEF_HELPER_FLAGS_4(mve_vcvt_rm_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vcvt_rm_ss, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vcvt_rm_us, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcvtb_sh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcvtt_sh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcvtb_hs, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcvtt_hs, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 410ea746fcf..32de4af3170 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -221,6 +221,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op # The VSHLL T2 encoding is not a @2op pattern, but is here because it # overlaps what would be size=0b11 VMULH/VRMULH { + VCVTB_SH 111 0 1110 0 . 11 1111 ... 0 1110 0 0 . 0 ... 1 @1op_nosz + VMAXNMA 111 0 1110 0 . 11 1111 ... 0 1110 1 0 . 0 ... 1 @vmaxnma size=2 VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b @@ -235,6 +237,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VCVTB_HS 111 1 1110 0 . 11 1111 ... 0 1110 0 0 . 0 ... 1 @1op_nosz + VMAXNMA 111 1 1110 0 . 11 1111 ... 0 1110 1 0 . 0 ... 1 @vmaxnma size=1 VSHLL_BU 111 1 1110 0 . 11 .. 
01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b @@ -247,6 +251,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VCVTT_SH 111 0 1110 0 . 11 1111 ... 1 1110 0 0 . 0 ... 1 @1op_nosz + VMINNMA 111 0 1110 0 . 11 1111 ... 1 1110 1 0 . 0 ... 1 @vmaxnma size=2 VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h @@ -260,6 +266,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op } { + VCVTT_HS 111 1 1110 0 . 11 1111 ... 1 1110 0 0 . 0 ... 1 @1op_nosz + VMINNMA 111 1 1110 0 . 11 1111 ... 1 1110 1 0 . 0 ... 1 @vmaxnma size=1 VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 4e0d979e643..7a5143ba6f3 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3337,3 +3337,74 @@ DO_VCVT_RMODE(vcvt_rm_sh, 2, uint16_t, helper_vfp_toshh) DO_VCVT_RMODE(vcvt_rm_uh, 2, uint16_t, helper_vfp_touhh) DO_VCVT_RMODE(vcvt_rm_ss, 4, uint32_t, helper_vfp_tosls) DO_VCVT_RMODE(vcvt_rm_us, 4, uint32_t, helper_vfp_touls) + +/* + * VCVT between halfprec and singleprec. As usual for halfprec + * conversions, FZ16 is ignored and AHP is observed. 
+ */ +#define DO_VCVT_SH(OP, TOP) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vm) \ + { \ + uint16_t *d = vd; \ + uint32_t *m = vm; \ + uint16_t r; \ + uint16_t mask = mve_element_mask(env); \ + bool ieee = !(env->vfp.xregs[ARM_VFP_FPSCR] & FPCR_AHP); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + float_status *base_fpst = &env->vfp.standard_fp_status; \ + bool old_fz = get_flush_to_zero(base_fpst); \ + set_flush_to_zero(false, base_fpst); \ + for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + if ((mask & MAKE_64BIT_MASK(0, 4)) == 0) { \ + continue; \ + } \ + fpst = base_fpst; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = float32_to_float16(m[H4(e)], ieee, fpst); \ + mergemask(&d[H2(e * 2 + TOP)], r, mask >> (TOP * 2)); \ + } \ + set_flush_to_zero(old_fz, base_fpst); \ + mve_advance_vpt(env); \ + } + +#define DO_VCVT_HS(OP, TOP) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vm) \ + { \ + uint32_t *d = vd; \ + uint16_t *m = vm; \ + uint32_t r; \ + uint16_t mask = mve_element_mask(env); \ + bool ieee = !(env->vfp.xregs[ARM_VFP_FPSCR] & FPCR_AHP); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + float_status *base_fpst = &env->vfp.standard_fp_status; \ + bool old_fiz = get_flush_inputs_to_zero(base_fpst); \ + set_flush_inputs_to_zero(false, base_fpst); \ + for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + if ((mask & MAKE_64BIT_MASK(0, 4)) == 0) { \ + continue; \ + } \ + fpst = base_fpst; \ + if (!(mask & (1 << (TOP * 2)))) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = float16_to_float32(m[H2(e * 2 + TOP)], ieee, fpst); \ + mergemask(&d[H4(e)], r, mask); \ + } \ + set_flush_inputs_to_zero(old_fiz, base_fpst); \ + mve_advance_vpt(env); \ + } + +DO_VCVT_SH(vcvtb_sh, 0) +DO_VCVT_SH(vcvtt_sh, 1) 
+DO_VCVT_HS(vcvtb_hs, 0) +DO_VCVT_HS(vcvtt_hs, 1) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index e80a55eb62e..194ef99cc74 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -627,6 +627,20 @@ DO_VCVT_RMODE(VCVTPU, FPROUNDING_POSINF, true) DO_VCVT_RMODE(VCVTMS, FPROUNDING_NEGINF, false) DO_VCVT_RMODE(VCVTMU, FPROUNDING_NEGINF, true) +#define DO_VCVT_SH(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_1op(s, a, gen_helper_mve_##FN); \ + } \ + +DO_VCVT_SH(VCVTB_SH, vcvtb_sh) +DO_VCVT_SH(VCVTT_SH, vcvtt_sh) +DO_VCVT_SH(VCVTB_HS, vcvtb_hs) +DO_VCVT_SH(VCVTT_HS, vcvtt_hs) + /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_1op *a) \ From patchwork Thu Jul 29 11:15:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511233 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=OZ/IjqkG; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb8LB5QQ6z9sT6 for ; Thu, 29 Jul 2021 22:02:58 +1000 (AEST) Received: from localhost ([::1]:47214 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94ka-0005BL-FX for 
incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 08:02:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41148) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941d-0001WS-Rc for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:30 -0400 Received: from mail-wr1-x431.google.com ([2a00:1450:4864:20::431]:41642) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m9419-0001Vf-Fi for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:29 -0400 Received: by mail-wr1-x431.google.com with SMTP id b7so6456677wri.8 for ; Thu, 29 Jul 2021 04:15:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=j0VNEds2e4uUMVkkbv3wCuyg9G/hLSB47JPGz5ftqSI=; b=OZ/IjqkGod40w6VxYsTfg4W7gES942iRVVySX5bexpEAMYswZZwRDp1c6UyCnweLqU utGSIQiglNeyPMk7BWaP9hcJjzg2BMVhqFDi1jpg03XDF+SC3mTHOWtoyJ3e7Q9aZjva rqX6Qxfo+utpMjbTHYreuGjvAavs86jO3yr1BDoO/EfW7IO73SAJOxnypPl708HhRr4P r3Ow9josV4/R4OAXsemkPD6SrMUfqscjTRg0XR9YYPCb1ZuHR/9LjT8jSo7073C9I9/z QEdWtOcpi3URj3hIi2ffvBdcj5KwKr6LuDhePW7LdnsqFLBIHsA3fqOy5PyEE3anXy/J 2uGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=j0VNEds2e4uUMVkkbv3wCuyg9G/hLSB47JPGz5ftqSI=; b=iKDZDKDs84vd345IwBoGkAXcdZMH9iY06mE81aNVD76BetYihEe4cleTLYKSrUqT6j qpiS4HKBV6afBhUaQ4sHFU92TyV+jPIpNqnT8kaxjLk5MnKNTE4yE5Os4DV6mgSosENd gYgPxfO6DXg8WeVRBW//GtVkUysvtR4GivgTPuCRZQgG33ftyqmzA742fy8TE3Dg2+c1 GGHopTjTpFWUogv5+fuAZCyB1ADNqwFFzB0qzIyXDC1MSbOdw0281r/3k4Us9Yjt5n4Y V7rKelOtnEk9cVqJFbno/nXbDdJ+QnO4KaS03wxMtDtfQkiYZh3AtaPlQtjjLWeqpqtI ht0Q== X-Gm-Message-State: AOAM5328KpMqEcGpq4Ej8nhzCb0eOh1n9oN4x/tUpSgOYf8se7ikzzuS ciG+lWOUGMLWUn1Wsz4GAHmJfCkwHkPjUA== X-Google-Smtp-Source: 
ABdhPJwXYFWKJklrVjpPsIPYQ8IlYPdhwmivaxZ53LZBn9MK88bx2QdpoKxdcpXgztmwBpcGC/b4/A== X-Received: by 2002:a05:6000:548:: with SMTP id b8mr4376317wrf.159.1627557358175; Thu, 29 Jul 2021 04:15:58 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:57 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 52/53] target/arm: Implement MVE VRINT insns Date: Thu, 29 Jul 2021 12:15:11 +0100 Message-Id: <20210729111512.16541-53-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::431; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VRINT insns, which round floating point inputs to integer values, leaving them in floating point format. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 6 +++++ target/arm/mve.decode | 7 ++++++ target/arm/mve_helper.c | 35 +++++++++++++++++++++++++++++ target/arm/translate-mve.c | 45 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 93 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index f6345c7abbe..76bd25006d8 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -882,3 +882,9 @@ DEF_HELPER_FLAGS_4(mve_vcvt_sf, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vcvt_uf, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vcvt_fs, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vcvt_fu, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vrint_rm_h, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vrint_rm_s, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vrintx_h, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vrintx_s, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 32de4af3170..72b93bfcaa3 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -815,3 +815,10 @@ VCVTPS 1111 1111 1 . 11 .. 11 ... 000 10 0 1 . 0 ... 0 @1op VCVTPU 1111 1111 1 . 11 .. 11 ... 000 10 1 1 . 0 ... 0 @1op VCVTMS 1111 1111 1 . 11 .. 11 ... 000 11 0 1 . 0 ... 0 @1op VCVTMU 1111 1111 1 . 11 .. 11 ... 000 11 1 1 . 0 ... 0 @1op + +VRINTN 1111 1111 1 . 11 .. 10 ... 001 000 1 . 0 ... 0 @1op +VRINTX 1111 1111 1 . 11 .. 10 ... 001 001 1 . 0 ... 0 @1op +VRINTA 1111 1111 1 . 11 .. 10 ... 001 010 1 . 0 ... 0 @1op +VRINTZ 1111 1111 1 . 11 .. 10 ... 001 011 1 . 0 ... 0 @1op +VRINTM 1111 1111 1 . 11 .. 10 ... 001 101 1 . 0 ... 0 @1op +VRINTP 1111 1111 1 . 11 .. 10 ... 001 111 1 . 0 ... 
0 @1op diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 7a5143ba6f3..015f25cffce 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -3338,6 +3338,12 @@ DO_VCVT_RMODE(vcvt_rm_uh, 2, uint16_t, helper_vfp_touhh) DO_VCVT_RMODE(vcvt_rm_ss, 4, uint32_t, helper_vfp_tosls) DO_VCVT_RMODE(vcvt_rm_us, 4, uint32_t, helper_vfp_touls) +#define DO_VRINT_RM_H(M, F, S) helper_rinth(M, S) +#define DO_VRINT_RM_S(M, F, S) helper_rints(M, S) + +DO_VCVT_RMODE(vrint_rm_h, 2, uint16_t, DO_VRINT_RM_H) +DO_VCVT_RMODE(vrint_rm_s, 4, uint32_t, DO_VRINT_RM_S) + /* * VCVT between halfprec and singleprec. As usual for halfprec * conversions, FZ16 is ignored and AHP is observed. @@ -3408,3 +3414,32 @@ DO_VCVT_SH(vcvtb_sh, 0) DO_VCVT_SH(vcvtt_sh, 1) DO_VCVT_HS(vcvtb_hs, 0) DO_VCVT_HS(vcvtt_hs, 1) + +#define DO_1OP_FP(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vm) \ + { \ + TYPE *d = vd, *m = vm; \ + TYPE r; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + float_status *fpst; \ + float_status scratch_fpst; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if ((mask & MAKE_64BIT_MASK(0, ESIZE)) == 0) { \ + continue; \ + } \ + fpst = (ESIZE == 2) ? 
&env->vfp.standard_fp_status_f16 : \ + &env->vfp.standard_fp_status; \ + if (!(mask & 1)) { \ + /* We need the result but without updating flags */ \ + scratch_fpst = *fpst; \ + fpst = &scratch_fpst; \ + } \ + r = FN(m[H##ESIZE(e)], fpst); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_1OP_FP(vrintx_h, 2, uint16_t, float16_round_to_int) +DO_1OP_FP(vrintx_s, 4, uint32_t, float32_round_to_int) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 194ef99cc74..2ed91577ec8 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -641,6 +641,51 @@ DO_VCVT_SH(VCVTT_SH, vcvtt_sh) DO_VCVT_SH(VCVTB_HS, vcvtb_hs) DO_VCVT_SH(VCVTT_HS, vcvtt_hs) +#define DO_VRINT(INSN, RMODE) \ + static void gen_##INSN##h(TCGv_ptr env, TCGv_ptr qd, TCGv_ptr qm) \ + { \ + gen_helper_mve_vrint_rm_h(env, qd, qm, \ + tcg_constant_i32(arm_rmode_to_sf(RMODE))); \ + } \ + static void gen_##INSN##s(TCGv_ptr env, TCGv_ptr qd, TCGv_ptr qm) \ + { \ + gen_helper_mve_vrint_rm_s(env, qd, qm, \ + tcg_constant_i32(arm_rmode_to_sf(RMODE))); \ + } \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + static MVEGenOneOpFn * const fns[] = { \ + NULL, \ + gen_##INSN##h, \ + gen_##INSN##s, \ + NULL, \ + }; \ + if (!dc_isar_feature(aa32_mve_fp, s)) { \ + return false; \ + } \ + return do_1op(s, a, fns[a->size]); \ + } + +DO_VRINT(VRINTN, FPROUNDING_TIEEVEN) +DO_VRINT(VRINTA, FPROUNDING_TIEAWAY) +DO_VRINT(VRINTZ, FPROUNDING_ZERO) +DO_VRINT(VRINTM, FPROUNDING_NEGINF) +DO_VRINT(VRINTP, FPROUNDING_POSINF) + +static bool trans_VRINTX(DisasContext *s, arg_1op *a) +{ + static MVEGenOneOpFn * const fns[] = { + NULL, + gen_helper_mve_vrintx_h, + gen_helper_mve_vrintx_s, + NULL, + }; + if (!dc_isar_feature(aa32_mve_fp, s)) { + return false; + } + return do_1op(s, a, fns[a->size]); +} + /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_1op *a) \ From patchwork 
Thu Jul 29 11:15:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1511220 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=eo5Xakfm; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Gb8Dj4XVxz9sRR for ; Thu, 29 Jul 2021 21:58:13 +1000 (AEST) Received: from localhost ([::1]:36204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m94fz-0005vl-C8 for incoming@patchwork.ozlabs.org; Thu, 29 Jul 2021 07:58:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:41192) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m941f-0001ac-8S for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:31 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:45898) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m941A-0001W6-9O for qemu-devel@nongnu.org; Thu, 29 Jul 2021 07:16:30 -0400 Received: by mail-wr1-x42e.google.com with SMTP id m12so1651011wru.12 for ; Thu, 29 Jul 2021 04:15:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; 
bh=1nPPw7sf6+ACZS1KSJaNNWd+RZu2j3hEbtkOKyLj6Kw=; b=eo5Xakfmv7I1HlifNM/udWOgc6jt+ZL52gj3uUt84STRxjLsqBC83kcqq2qRwdpZaQ +/PObFMo41KBDHpcch5MLaiGvv7S5Z/Jy9pv3OD2+5wRIPUKYk3D9wX2iiqoKXqy79xt 26lxIE8Ni1Vn9BrX6XzxTxDdZQ4HDbagHF8Zy+hfMCOzl7px4h1jlvWMgoDMl+crjdu0 hCr0HCWA7cO0MLupQ61o8jH5vE4F7pqJa2D2NWEcwjIjaIVjlp+mk4n+qhYFhzTRgFUX 0lY0cvZQF1XhnlkyzzLEWUtaPJVNoRKal05x+PGFnM/GVBox/iqkrk//o9guC5eihDPv vuwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1nPPw7sf6+ACZS1KSJaNNWd+RZu2j3hEbtkOKyLj6Kw=; b=OqdiS7MDgwJfmpzBh8+eilBzqSgSAq5OdBGJWGpF2UfeMns8TTUor2nLW+vFRzJMeD ll5JgxY/RZIBE1bxa0lBrFvCT2PGr6y+3A0ZpyRuBmTokfwOtsfpEiydUaSCtZ9MCDMX XF6t1PJVXTqtHCzRaszcoHzNvg1XORikb20/l+IAcSt1Sr4Vd0dGT/Rn/7PC6OW0XDRF l1bBUUcB0bhZZsjTMtdDveXAhleDFBwbOPEc4pOhp4GAM6Nj9WNg7vxCp5HXVzVD5T5V VkuoITGibCwsRxcFlDKkx96vQsE161B54d6zF9x0xQK3oxybS8zrc8q3X9MG58X0NB1r sSJg== X-Gm-Message-State: AOAM532sHrfuUzh1QyzImd2h8cm2/3wKcTMw9UrLs7KxpdExTPSIsuvH cJP6cczr+SPv9qtTa5UfnVS0og== X-Google-Smtp-Source: ABdhPJwpRMycSu+uqxFm+tdQ5QMu/DncPMrjyANSA5GlxOr9K7m/HDmZBS4MkuJa3Gegs9B/p7INuw== X-Received: by 2002:a5d:6608:: with SMTP id n8mr4118152wru.427.1627557359030; Thu, 29 Jul 2021 04:15:59 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j140sm3037829wmj.37.2021.07.29.04.15.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Jul 2021 04:15:58 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 53/53] target/arm: Enable MVE in Cortex-M55 Date: Thu, 29 Jul 2021 12:15:12 +0100 Message-Id: <20210729111512.16541-54-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210729111512.16541-1-peter.maydell@linaro.org> References: <20210729111512.16541-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We now have a complete MVE emulation, so we can enable it in our Cortex-M55 model by setting the ID registers to match those of a Cortex-M55 with full MVE support. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- docs/system/arm/emulation.rst | 1 + target/arm/cpu_tcg.c | 7 ++----- 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst index 144dc491d95..89310e4842f 100644 --- a/docs/system/arm/emulation.rst +++ b/docs/system/arm/emulation.rst @@ -87,6 +87,7 @@ for the following architecture extensions: - LOB (Low Overhead loops and Branch future) - M (Main Extension) - MPU (Memory Protection Unit Extension) +- MVE (M-Profile Vector Extension) - PXN (Privileged Execute Never) - RAS (Reliability, Serviceability and Availability): "minimum RAS Extension" only - S (Security Extension) diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c index ed444bf436a..33cc75af57d 100644 --- a/target/arm/cpu_tcg.c +++ b/target/arm/cpu_tcg.c @@ -654,12 +654,9 @@ static void cortex_m55_initfn(Object *obj) cpu->revidr = 0; cpu->pmsav7_dregion = 16; cpu->sau_sregion = 8; - /* - * These are the MVFR* values for the FPU, no MVE configuration; - * we will update them later when we implement MVE - */ + /* These are the MVFR* values for the FPU + full MVE configuration */ cpu->isar.mvfr0 = 0x10110221; - cpu->isar.mvfr1 = 0x12100011; + cpu->isar.mvfr1 = 0x12100211; cpu->isar.mvfr2 = 0x00000040; cpu->isar.id_pfr0 = 0x20000030; cpu->isar.id_pfr1 = 0x00000230;
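
For readers checking the MVFR1 change above (0x12100011 to 0x12100211), here is a minimal sketch of how the differing nibble can be decoded. It assumes the MVE field occupies bits [11:8] of MVFR1, with 0 meaning no MVE, 1 meaning integer-only MVE and 2 meaning integer plus floating-point MVE. The helper names mvfr1_mve_field() and mve_level_name() are hypothetical and are not part of QEMU, whose translator tests the feature through dc_isar_feature(aa32_mve_fp, s) as seen earlier in this series.

```c
#include <stdint.h>

/*
 * Illustration only, not QEMU code.
 * Assumption: MVFR1.MVE is the 4-bit field at bits [11:8].
 */
static unsigned mvfr1_mve_field(uint32_t mvfr1)
{
    /* Shift the field down to bit 0 and mask off the other fields. */
    return (mvfr1 >> 8) & 0xfu;
}

/* Map the assumed field encoding to a human-readable description. */
static const char *mve_level_name(unsigned mve)
{
    switch (mve) {
    case 0:
        return "no MVE";
    case 1:
        return "integer MVE";
    case 2:
        return "integer and floating-point MVE";
    default:
        return "reserved";
    }
}
```

Under that assumption, mvfr1_mve_field(0x12100011) yields 0 and mvfr1_mve_field(0x12100211) yields 2, which is consistent with the updated comment describing the new values as the "FPU + full MVE configuration".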