From patchwork Tue Jul 13 13:36:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504616 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=mUyuD/V2; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMDF4mDDz9sWd for ; Tue, 13 Jul 2021 23:38:53 +1000 (AEST) Received: from localhost ([::1]:53356 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Icd-0000tH-CU for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:38:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53912) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbR-0008HP-5i for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:37 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:45597) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbL-0003Zn-4w for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:36 -0400 Received: by mail-wr1-x42f.google.com with SMTP id t5so16679962wrw.12 for ; Tue, 13 Jul 2021 06:37:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; 
bh=IO7/29GWk0k3evo7LsylmUNRHzg1sxgFY9MGve3JcNo=; b=mUyuD/V2jkicKo1/LxEiH+iZ4awbNitPRzx1VbRdP3WxlPU8Ql8fNjHuJ4U73UbOoE 9Y+UYrCVkuV/FpKOV2UVjO/ETyF+WVq7S8MZLGuMukm1V+1soZr//bQOieoKjc0USBI9 ydO2zkUZ5SekPVgC7lcOsPNxM5gcRXnk7m4TPuemAIfHuimPa9YEmyCbir2deUSWH8jS SZcFDBby+zpWlH+cir5QH9SP9hH9e2xoRIPaoiJSiVm/368uH3l1ChkDNdx9zhX9xvN9 2goY3uL0v128mDtJVlDFYRJW9qC/3YtXi/JG2Z0zTpv/O2paBq7Y8BZugB8b1yOq3Sqa xkDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IO7/29GWk0k3evo7LsylmUNRHzg1sxgFY9MGve3JcNo=; b=CTGbvkKntw/wyOsa9fTneU9bPxZgtErB+27qQXCwXl/aa4OYlfitZ/G7bmWZNSO33y D1k0s+ua2LLFdONugUR4vRBJP//LDLVPdS5OicoyeDPJJX55pqGw+JC6jXCoC9JWYIFY pxOoC8qfYR8spXPmJ0lJeN07U/49Ja12UnjS0xSB0mT2FlqLzbXROLKpDWeQ5OskmStG 1EyL9ZROFFFkJExGPgy4ive+6vXws0DpWnftKNdmGzEipGbLytUEAfzpVjIQ0MJELT4o DjYS/Kg8OwGoeWdIZiHCB4HlLlIPtyjMCavTHLsEm+f8p0tvTsa2MDRM9idYfZdKfV0Y XzUg== X-Gm-Message-State: AOAM532Yss4ZbGBjNb9Bh51wNKKpxnr7KH27P/5Y9S/bgEQQWLAuy6rF 2Islys2AHPTlvz04uMnlLSEq8A== X-Google-Smtp-Source: ABdhPJz2KCPLv4yRj4oHgb0R7v1SWBXZzFx0LcZcbZ20CIWERvyqsp00rn6q9ZcTq2K/waH0Z/jtcw== X-Received: by 2002:adf:ce8d:: with SMTP id r13mr5836732wrn.304.1626183449775; Tue, 13 Jul 2021 06:37:29 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:29 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 01/34] target/arm: Note that we handle VMOVL as a special case of VSHLL Date: Tue, 13 Jul 2021 14:36:53 +0100 Message-Id: <20210713133726.26842-2-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Although the architecture doesn't define it as an alias, VMOVL (vector move long) is encoded as a VSHLL with a zero shift. Add a comment in the decode file noting that we handle VMOVL as part of VSHLL. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve.decode | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 595d97568eb..fa9d921f933 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -364,6 +364,8 @@ VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 0 @2_shr_h VRSHRI_U 111 1 1111 1 . ... ... ... 0 0010 0 1 . 1 ... 
0 @2_shr_w # VSHLL T1 encoding; the T2 VSHLL encoding is elsewhere in this file +# Note that VMOVL is encoded as "VSHLL with a zero shift count"; we +# implement it that way rather than special-casing it in the decode. VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_b VSHLL_BS 111 0 1110 1 . 1 .. ... ... 0 1111 0 1 . 0 ... 0 @2_shll_h From patchwork Tue Jul 13 13:36:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504615 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=XJhNEyiu; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMDD5pnfz9sWd for ; Tue, 13 Jul 2021 23:38:52 +1000 (AEST) Received: from localhost ([::1]:53216 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Icc-0000mn-J8 for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:38:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53896) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbQ-0008Fr-Mx for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:36 -0400 Received: from mail-wr1-x436.google.com ([2a00:1450:4864:20::436]:37703) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbL-0003aY-MO for 
qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:36 -0400 Received: by mail-wr1-x436.google.com with SMTP id i94so30500421wri.4 for ; Tue, 13 Jul 2021 06:37:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=5FuhIcPQ8dqLKfYvxbInQRrJHkLhIPar3p6mkfICIW0=; b=XJhNEyiucku7u8UM6V38+9h7Q5Hyad6ta8uLbCnbwn62ovCSz7KmKa7g095ws9Z8HP phq8j57Ms7ZEiSPy5i3GoP2UJwiYYizZywJxZ4iTt/ijI5edjMbpV5HUY5uGmzqCbDK9 Xdf7rqAXaKpY4+Nvw8yWCmT1qhCxvneinW7Qe2W50JKcMLET3mJJELcfSq5aOAHA3jBr TtmZnoK7Qs+JgylXrPEKn90AUeLVAPfXCobXXoAP64C3wp2nyVnr1mFwsevyGjA/AA52 3VzhodbApH7eC8WME7j+BD+jZShRai0UTaMlm1q0f4wU/eyjWfxhFqrbyF4jh6+8CpGc n2vA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5FuhIcPQ8dqLKfYvxbInQRrJHkLhIPar3p6mkfICIW0=; b=goYv+jDqcE3g5dx1WKif3RZVbnkNMcMR5qZnrkdbtEBv6tsOSzPOViYbZLB+m2ZxGa FaOhbGV6lTMS661fipl5nT48pKiKjupKLmgonTGK5kjBAYXNFtYFWerB4FFvJd10MREm 7pd3fQGllpXeljzN56QuU3qjoLhcS1gY9N2bj9C94C1/o2hHI1B8Oi31eeKoqPWOUdUh jdc97FQSEp+daggdSc+QaxTtC7QuDWGQm4l+933vdDwDWwvE5Iuoam2lKaPh4JlUs6xv /PkcAsFJ0AxiWXXPQ7pmvmlfl5k8wB/FDaoPWMO5J+Skj9BTNiqS478Qaa31bdxmpzVm 4jBQ== X-Gm-Message-State: AOAM5316zDQuN11BajxXPG0bzVmvip0zePDCpmiveEOMWV6B1Ujrw0Qu uVOOiI9qULAObNEniF5Y/nuG2A== X-Google-Smtp-Source: ABdhPJz23L23OwC/MuDEhoRjIEa1gu6Svy1wz+oUeGWWV2GUnvfX1XdFEgrwHa9l5G1qOtfIUptvvQ== X-Received: by 2002:adf:de84:: with SMTP id w4mr5876809wrl.104.1626183450436; Tue, 13 Jul 2021 06:37:30 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:30 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 02/34] target/arm: Print MVE VPR in CPU dumps Date: Tue, 13 Jul 2021 14:36:54 +0100 Message-Id: <20210713133726.26842-3-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::436; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x436.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Include the MVE VPR register value in the CPU dumps produced by arm_cpu_dump_state() if we are printing FPU information. This makes it easier to interpret debug logs when predication is active. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/cpu.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 9cddfd6a442..6d6b8888037 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1016,6 +1016,9 @@ static void arm_cpu_dump_state(CPUState *cs, FILE *f, int flags) i, v); } qemu_fprintf(f, "FPSCR: %08x\n", vfp_get_fpscr(env)); + if (cpu_isar_feature(aa32_mve, cpu)) { + qemu_fprintf(f, "VPR: %08x\n", env->v7m.vpr); + } } } From patchwork Tue Jul 13 13:36:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504617 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=IMFRlO/2; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMHm6N6vz9sWd for ; Tue, 13 Jul 2021 23:41:56 +1000 (AEST) Received: from localhost ([::1]:33498 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ifa-0006Z2-Ko for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:41:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53916) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbR-0008HU-7t for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:37 -0400 Received: from mail-wr1-x436.google.com ([2a00:1450:4864:20::436]:45604) 
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbM-0003bK-LH for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:36 -0400 Received: by mail-wr1-x436.google.com with SMTP id t5so16680061wrw.12 for ; Tue, 13 Jul 2021 06:37:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Bon4R+GO/Kh6lXfctlExnNFuhwRkWgeWXsUtfA3/MIY=; b=IMFRlO/2qfyAWzfeywhLG8VbndFXEw4ttLF85sFynSGtY5DjuwCw2a1iX44y7udzyH mUNO/YHeLf9Sk61jdG9dvaOZXivbzq7bFFuL0pATMJZw0poeiF7Trcc1486K51dSshms khqqabq0ykM8VioFckReDqhRhavCAG6USC8n5/fhHIW5/xY1oavQnj84sfcrSnkW7lLm T6CQc55/niaISZ8ybmGfxKblBCj60hV3EugtOt1YcyqfAAxYNFZFLsln0brBTJ+G+72P qz7JQtidTIr036mZVDrY7g3ZrXN/UEFIax9jdzVHIsSE/0TCUE5moEAnmxMZzorIq+1o bTGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Bon4R+GO/Kh6lXfctlExnNFuhwRkWgeWXsUtfA3/MIY=; b=Pue67OGjM1Vj7W/6ZbXjpret3s/Ah98ShzRV1s2txHHKHlpt2kVlgucEvkwDZ3CR/x inn8zwlij7auW1XpTHmiQlDGlvV2YMENmtRaDpIzT09V2Nvx+9xnRrJK0s6hcvlKyyXE /S05ZOvBZJplXICJLQcnunATLifZxSKfnR6jEsjVutXpyjdzMpkdkzfGCd0SmsUWPJ78 nTMqRRVthE+s1DvwKPN3x8mgJMkwm1Ho84dET/NHzrkPF6Crbn1VXc2PjniK+VJlCq8G jxwDioWT+QOzHPZCgqTVAQuvkRrdIRUsa+Y4/yxK7oDtBUEqfOn+huYkSKGoTik49jeT 2Z3A== X-Gm-Message-State: AOAM531X5qGq6lzLRDTYWzwdLgefH/W/rsXnz6vDJ6UElW5Gzz/+ZzOd n/tMpM+jz9v/e1UXhDjh7Lga9A== X-Google-Smtp-Source: ABdhPJyWXGoBZgEw4iYFEjiNITrcYPYyIGWI6P4Fyfs7LjZD0ryso50Ryq54UuyyG0qYXleCAWgktQ== X-Received: by 2002:adf:f90d:: with SMTP id b13mr5917402wrr.336.1626183451487; Tue, 13 Jul 2021 06:37:31 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:30 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 03/34] target/arm: Fix MVE VSLI by 0 and VSRI by <dt>
Date: Tue, 13 Jul 2021 14:36:55 +0100 Message-Id: <20210713133726.26842-4-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::436; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x436.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In the MVE shift-and-insert insns, we special case VSLI by 0 and VSRI by <dt>
, both of which mean "no shift". However we incorrectly implemented these as "don't update the destination", which works only if Qd == Qm. When Qd != Qm this kind of shift must update Qd, honouring the predicate mask. Signed-off-by: Peter Maydell --- target/arm/mve_helper.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index db5d6220854..16a701933b8 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1276,19 +1276,23 @@ DO_2SHIFT_S(vrshli_s, DO_VRSHLS) void *vm, uint32_t shift) \ { \ uint64_t *d = vd, *m = vm; \ - uint16_t mask; \ + uint16_t mask = mve_element_mask(env); \ uint64_t shiftmask; \ unsigned e; \ if (shift == 0 || shift == ESIZE * 8) { \ /* \ * Only VSLI can shift by 0; only VSRI can shift by
<dt>. \ * The generic logic would give the right answer for 0 but \ - * fails for <dt>
. \ + * fails for <dt>
. In both cases, we must not shift the \ + * input but just copy it to the destination, honouring \ + * the predicate mask. \ */ \ + for (e = 0; e < 16 / 8; e++, mask >>= 8) { \ + mergemask(&d[H8(e)], m[H8(e)], mask); \ + } \ goto done; \ } \ assert(shift < ESIZE * 8); \ - mask = mve_element_mask(env); \ /* ESIZE / 2 gives the MO_* value if ESIZE is in [1,2,4] */ \ shiftmask = dup_const(ESIZE / 2, MASKFN(ESIZE * 8, shift)); \ for (e = 0; e < 16 / 8; e++, mask >>= 8) { \ From patchwork Tue Jul 13 13:36:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504614 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=wJakB/KA; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMD51BGMz9sWd for ; Tue, 13 Jul 2021 23:38:45 +1000 (AEST) Received: from localhost ([::1]:52000 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IcQ-0008O1-Qm for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:38:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53952) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbR-0008Jj-Uq for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:37 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]:39718) by eggs.gnu.org with 
esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbN-0003bi-U7 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:37 -0400 Received: by mail-wr1-x42b.google.com with SMTP id f17so30522349wrt.6 for ; Tue, 13 Jul 2021 06:37:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=gTQCjbESgq6v50RkvtDQjOcLLtpOe7evSTv/m/L93cE=; b=wJakB/KA4AAMG2JcYfFKddaQdZRO/bjaomKY49sXHkpRlm4aN/mqldbPE4SPgP91A2 TZypm0cBXzeM4q21JGUNyYRWWIANt60LoSnsJvmxSePVPwmVAg518XCP+6eaa0QngKap wDhJ3VkP/SwoPyFYXYLTe2Wlg8/EFlBAFEm4Fhpq9XMGFiFo7gNfKZpmMLqtcqjn6yKi VQj90AKaycFtnze3rDVXPDUk6e5n5Zwl3+zy3RhyyaoGs2sRT2UUeoa3TgzrdIUHTtZO 35yV490h80Eh+bSbTwSQraLZLB630zU1FlPVrY8p6MOwJ1Frw6fbWJGMg5TdaMb2AJHx qaeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gTQCjbESgq6v50RkvtDQjOcLLtpOe7evSTv/m/L93cE=; b=g7NEE73gCfT+WB3mt+DHbQhUxtlX1c+zHn1gMC8ah9tCpT91yC57UONXdrqaoFTXkn M1gCTf/inIQpkyF/TxeA6s8s6rGNYcBHLjiBwHfuB3huQZycKPIw1Fabosik77OklTA5 Res+W78cZiLy6LNzy+ksJQyWHZJsa8sRcAe7xPZbjCvkO2EsWYbDaD1S9rUI9NL8b+jJ 0jVfOpoSwiYWh3kwsoRCL8nxIMDq9XO5JtdHTPmd4A0RuVQXbTJ2PWM3hT68IR+e3vW3 glZ8Yt5wQ3GiLPb/v65oaOnmNDuxJgleq6/tszzL21RSVc0nuPMdwkoIbXEley1gSqR/ 7Whw== X-Gm-Message-State: AOAM532JbsmZ8R+0jgL4TSqwrS4Tf0OhDCtYdM3zP8PfbPjYg21nRXBk ZE5/4pgqupyoApJnlE1WhzNM6A== X-Google-Smtp-Source: ABdhPJwKJk1l8NvIW/ZWNanq7ivJXlU09bIdfhRyevouo6E5awtH3+j8d6ppXkHceUIFeEQ5adG7mg== X-Received: by 2002:a5d:420b:: with SMTP id n11mr5814193wrq.395.1626183452586; Tue, 13 Jul 2021 06:37:32 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:31 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 04/34] target/arm: Fix signed VADDV Date: Tue, 13 Jul 2021 14:36:56 +0100 Message-Id: <20210713133726.26842-5-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" A cut-and-paste error meant we handled signed VADDV like unsigned VADDV; fix the type used. 
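To illustrate why the element type matters here, the following is a minimal stand-alone sketch, not the QEMU helper itself. The function names `sum_lanes_u8` and `sum_lanes_s8` are hypothetical, and the model assumes 8-bit lanes with one predicate bit per lane and a 32-bit accumulator. It shows that a lane holding 0xff contributes 255 when read as unsigned but -1 when read as signed:

```c
#include <stdint.h>

/*
 * Hypothetical model of a VADDV-style reduction over sixteen 8-bit lanes.
 * Lane e is added to the accumulator 'ra' only if bit e of 'mask' is set.
 * These names are illustrative and do not exist in QEMU.
 */
uint32_t sum_lanes_u8(const uint8_t lanes[16], uint16_t mask, uint32_t ra)
{
    for (unsigned e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            ra += lanes[e];                     /* zero-extended: 0xff adds 255 */
        }
    }
    return ra;
}

uint32_t sum_lanes_s8(const uint8_t lanes[16], uint16_t mask, uint32_t ra)
{
    for (unsigned e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            /*
             * Reinterpret the lane as int8_t so it is sign-extended before
             * the add (assumes the usual two's-complement conversion).
             */
            ra += (uint32_t)(int32_t)(int8_t)lanes[e];
        }
    }
    return ra;
}
```

With a single active lane of 0xff and an initial accumulator of 10, the unsigned variant yields 265 and the signed variant yields 9, which is the difference the type change in this patch is about.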
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 16a701933b8..99b4801088f 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1182,9 +1182,9 @@ DO_LDAVH(vrmlsldavhxsw, int32_t, int64_t, true, true) return ra; \ } \ -DO_VADDV(vaddvsb, 1, uint8_t) -DO_VADDV(vaddvsh, 2, uint16_t) -DO_VADDV(vaddvsw, 4, uint32_t) +DO_VADDV(vaddvsb, 1, int8_t) +DO_VADDV(vaddvsh, 2, int16_t) +DO_VADDV(vaddvsw, 4, int32_t) DO_VADDV(vaddvub, 1, uint8_t) DO_VADDV(vaddvuh, 2, uint16_t) DO_VADDV(vaddvuw, 4, uint32_t) From patchwork Tue Jul 13 13:36:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504620 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=OaqEhGMx; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMJv1XqQz9sWl for ; Tue, 13 Jul 2021 23:42:55 +1000 (AEST) Received: from localhost ([::1]:36164 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IgV-0008Mn-UV for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:42:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:53982) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1m3IbS-0008M0-JO for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:38 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]:34541) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbQ-0003cH-Bt for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:38 -0400 Received: by mail-wr1-x42f.google.com with SMTP id p8so30507724wrr.1 for ; Tue, 13 Jul 2021 06:37:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=l4h+O9UQxvrDocVeKKCEGu5i8zNpjbzWnTmmgtbEZ2s=; b=OaqEhGMxaVz33bTdKgnek+RTDw5dZ+clhFCdU18BF4QC3taqqoLqIlR0/KybUvROX6 9EVjY6/GgiKAP+hmL9VjJGu6Adxcic0R0zZrNJWD8kxH3hS5JkAMWYdZcAXkzYhFBbdO ChJK7xO21Bsx6XH6DK2TB4NRM4h2zrvJ4s6erbbC/newnPJYPNh9P4TK0E1uvS1O/bG6 xY5DAejwTotZTyIrOtH6SYQtsIhToRTedJ+N+itcD3BBt1P/CstQvpy5ad8r6+Bhk2/Q DLb4ehh+zB7CJUY8hGXmedX3dfbSCG+huPdQH26P+r0xrKhkcy1BgTz7M4TIYGtgxspB l0wA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l4h+O9UQxvrDocVeKKCEGu5i8zNpjbzWnTmmgtbEZ2s=; b=GtXyMFTeJ4t/WGoJZgd2JMmsbXvuFydreazDLF9f8E3POxgZOb0ObCSXOSXld0UE57 u84WbmSLm1FbcZQOcT3cxxxi6Jchu10pa7tSLXZ/pcg4ztdvACHEI7NLygkiqqG5fBWa 0SdeE2gOhOdOkQgXJuk9axQIYpLb4q+gqElLe8gD7EdMSIgmbi3AMNinWzoDk9tR4rkH Fztk8woJKW7tF4zUi3Iy4ySVaZ8XKhVHKo1Tm6A21zoawdApMfpKelFREn/exu8ZtiSt VFklbk+RUmJpOe8kOy1fhYOVIYGiwqw7VPV4FHHW1Im5xrzktF8qWMbh/hnIuTGiG9rT yxqA== X-Gm-Message-State: AOAM530SYJGAazmySifObYivgKbR58ZM3Mgo01VZY6Bjcee0cHiAj4jP CpCICNlsI7gdT3rf2FUOcpYrQ5mQ+u8VxG+J X-Google-Smtp-Source: ABdhPJyMrKd+dSTcjSQIwTRmL6BMPOd91vKcRq/hZD8YC4XmXNBEPeE/HyvwHpAWTIMAB9M8qF0qIA== X-Received: by 2002:a5d:5989:: with SMTP id n9mr5672702wri.8.1626183453289; Tue, 13 Jul 2021 06:37:33 -0700 (PDT) Received: 
from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:32 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 05/34] target/arm: Fix mask handling for MVE narrowing operations Date: Tue, 13 Jul 2021 14:36:57 +0100 Message-Id: <20210713133726.26842-6-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In the MVE helpers for the narrowing operations (DO_VSHRN and DO_VSHRN_SAT) we were using the wrong bits of the predicate mask for the 'top' versions of the insn. This is because the loop works over the double-sized input elements and shifts the predicate mask by that many bits each time, but when we write out the half-sized output we must look at the mask bits for whichever half of the element we are writing to. Correct this by shifting the whole mask right by ESIZE bits for the 'top' insns. This allows us also to simplify the saturation bit checking (where we had noticed that we needed to look at a different mask bit for the 'top' insn.) 
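The mask adjustment described above can be sketched in isolation. This is a simplified model under stated assumptions, not the DO_VSHRN macro: the function name `narrow_shr_h_to_b` is hypothetical, it handles only 16-bit inputs narrowed to 8-bit outputs (so ESIZE is 1 and LESIZE is 2), it assumes one predicate bit per output byte, and it ignores host-endianness handling:

```c
#include <stdint.h>

/*
 * Hypothetical model of a narrowing shift right: eight 16-bit inputs are
 * shifted and written to either the even ('top' == 0) or odd ('top' == 1)
 * bytes of a 16-byte destination. 'mask' has one predicate bit per
 * destination byte; bytes whose bit is clear are left unchanged.
 */
void narrow_shr_h_to_b(uint8_t d[16], const uint16_t m[8],
                       unsigned shift, uint16_t mask, int top)
{
    unsigned le;

    /* Line bit 0 of the mask up with the half of the element we write. */
    mask >>= 1 * top;                    /* ESIZE * TOP, with ESIZE == 1 */

    /* Step through the double-sized inputs, two mask bits at a time. */
    for (le = 0; le < 8; le++, mask >>= 2) {
        if (mask & 1) {
            d[le * 2 + top] = (uint8_t)(m[le] >> shift);
        }
    }
}
```

Without the initial shift, the 'top' form would test the predicate bit of the even byte while writing the odd byte. In this model, a mask of 0x0002 with 'top' set writes only byte 1 of the destination.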
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 99b4801088f..8cbfd3a8c53 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1361,6 +1361,7 @@ DO_VSHLL_ALL(vshllt, true) TYPE *d = vd; \ uint16_t mask = mve_element_mask(env); \ unsigned le; \ + mask >>= ESIZE * TOP; \ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ TYPE r = FN(m[H##LESIZE(le)], shift); \ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ @@ -1422,11 +1423,12 @@ static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max, uint16_t mask = mve_element_mask(env); \ bool qc = false; \ unsigned le; \ + mask >>= ESIZE * TOP; \ for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ bool sat = false; \ TYPE r = FN(m[H##LESIZE(le)], shift, &sat); \ mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ - qc |= sat && (mask & 1 << (TOP * ESIZE)); \ + qc |= sat & mask & 1; \ } \ if (qc) { \ env->vfp.qc[0] = qc; \ From patchwork Tue Jul 13 13:36:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504626 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=rR55kHlQ; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS 
id 4GPMQZ4dlsz9sWd for ; Tue, 13 Jul 2021 23:47:50 +1000 (AEST) Received: from localhost ([::1]:52160 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IlI-0002EF-Bj for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:47:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54082) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbW-0008SZ-1a for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:42 -0400 Received: from mail-wr1-x42a.google.com ([2a00:1450:4864:20::42a]:40490) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbQ-0003dI-DR for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:41 -0400 Received: by mail-wr1-x42a.google.com with SMTP id l7so29611864wrv.7 for ; Tue, 13 Jul 2021 06:37:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=hzoBamT9L0is33IwFMKLrmnDteA/k33y/OWd2P4v5ng=; b=rR55kHlQIl1wQrwUm7Xj4eNLUchExUoJuFxbGK2hPb/tq6itxRee7QxMeQXHH18+m5 dEuOfmI2C12ZOblK0FJgPYD37OBJ3lUMDj0A7dVJ5JcXxfSue26SQhPVQWKHMYmQU4Db irwul/UrxmiUeUo3SF0XtgJ1lHY3tCBTcjemaaydQ71Jtq6fNowUWZ35chPKdteooQ9n 8cGS9lX575VTDi1wnwhl872p3bNlF7JIJhbGYS4HpG1dBHpnIwu1YYbvcdYkYMliBZS5 40teNAimv/aDyVTaXQfw4E6Dv5CxhfBkUtwJ/bnPy+BPPPYexpZhmaX5PPckBBrGKIvx mSuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hzoBamT9L0is33IwFMKLrmnDteA/k33y/OWd2P4v5ng=; b=ceNHWq3wRJCyQV1XCASwavulXdKbZ92PAAu/ryT6agzgnuksj+VFqFBP1XudZSarX0 AxeH8Tg1uNv+W6MI8vRI7WKGTMFuQbjItKOQokyzNQnOsKJRIxFl/F31OkxYDdCr8llI O6t5+atgpLaFabmYrFOBxP9PuVxCdi3AzFVTIe57PjDsZcgZL83dZaFXzOSlreFmUaGR czg24MCpKTRyQltQYPdQe/Q7QsPH8yReriym0N8kvWKfj6f/AuiAo2TxZZuSbT3LSvZv 
Bjpe+ZBuoQrwYwr4tW2JDthARhZgSyNQBY3T0LPNku6QONnaK9gpPJKdgtoBO4ovzT/+ f38w== X-Gm-Message-State: AOAM532bsXdOcWnqpKM1NAjr3V0EuSna04+md5arzANuSx0tKmUKBId5 3rY0Yfhg1Lm9GwWGtKOKfN85vGU4jRfUGuBI X-Google-Smtp-Source: ABdhPJw2rxp2JLV20GQ2+1j7o80ISN4qJMoLbml6hXzdka1OlyyhsR19sZlkwh8aFnav9D+2tOPmxQ== X-Received: by 2002:adf:f9cb:: with SMTP id w11mr5862352wrr.57.1626183453988; Tue, 13 Jul 2021 06:37:33 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:33 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 06/34] target/arm: Fix 48-bit saturating shifts Date: Tue, 13 Jul 2021 14:36:58 +0100 Message-Id: <20210713133726.26842-7-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42a; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In do_sqrshl48_d() and do_uqrshl48_d() we got some of the edge cases wrong and failed to saturate correctly: (1) In do_sqrshl48_d() we used the same code that do_sqrshl_bhs() does to obtain the saturated most-negative and most-positive 48-bit
signed values for the large-shift-left case. This gives (1 << 47) for saturate-to-most-negative, but we weren't sign-extending this value to the 64-bit output as the pseudocode requires. (2) For left shifts by less than 48, we copied the "8/16 bit" code from do_sqrshl_bhs() and do_uqrshl_bhs(). This doesn't do the right thing because it assumes the C type we're working with is at least twice the number of bits we're saturating to (so that a shift left by bits-1 can't shift anything off the top of the value). This isn't true for bits == 48, so we would incorrectly return 0 rather than the most-positive value for situations like "shift (1 << 44) left by 20". Instead check for saturation by doing the shift and sign-extend and then testing whether shifting back right again gives the original value. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 8cbfd3a8c53..f17e5a413fd 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1579,9 +1579,8 @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, } return src >> -shift; } else if (shift < 48) { - int64_t val = src << shift; - int64_t extval = sextract64(val, 0, 48); - if (!sat || val == extval) { + int64_t extval = sextract64(src << shift, 0, 48); + if (!sat || src == (extval >> shift)) { return extval; } } else if (!sat || src == 0) { @@ -1589,7 +1588,7 @@ static inline int64_t do_sqrshl48_d(int64_t src, int64_t shift, } *sat = 1; - return (1ULL << 47) - (src >= 0); + return sextract64((1ULL << 47) - (src >= 0), 0, 48); } /* Operate on 64-bit values, but saturate at 48 bits */ @@ -1612,9 +1611,8 @@ static inline uint64_t do_uqrshl48_d(uint64_t src, int64_t shift, return extval; } } else if (shift < 48) { - uint64_t val = src << shift; - uint64_t extval = extract64(val, 0, 48); - if (!sat || val == extval) { + uint64_t 
extval = extract64(src << shift, 0, 48); + if (!sat || src == (extval >> shift)) { return extval; } } else if (!sat || src == 0) { From patchwork Tue Jul 13 13:36:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504618 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=jN1Sbx1V; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMJP3wmvz9sWd for ; Tue, 13 Jul 2021 23:42:29 +1000 (AEST) Received: from localhost ([::1]:34828 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ig7-0007VL-AN for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:42:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54040) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbV-0008Q4-5U for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:41 -0400 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]:36506) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbQ-0003dQ-D2 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:40 -0400 Received: by mail-wr1-x42c.google.com with SMTP id v5so30525221wrt.3 for ; Tue, 13 Jul 2021 06:37:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=k6grFFqHPB3tF2ojR0KuBO/NFzUsEvLnrBSRggUNVfs=; b=jN1Sbx1VY9KoDtcdpRFKvgdmKxbzlK32P+heZlXMenNUzJaIIYmthJkFo2AqBd27xa mlbsFs/Wice4pEIlCSNEc3CT7GTRxOu/CchTGAOlfeaGi7cYTQGSE/AAmEXFsNJeTOPs zpDda8v1mvRdElHuEvayjttRDcDPfczPrF2Gm/sp4NovMv2ob9pHKOBdJkm5Znwi+47y TDv2mXhq//qnX9WC4+MOlfxPP/NZOdmyVuCMIOwKU+56yOewEqUGCZZh2y2LrR3EePVn JQe1bKbOrA+hJHMwLHpbiuraagoQM5LdutRoneYzVo8fw/liP6Re6hr05lsWx25dqboo 7HvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=k6grFFqHPB3tF2ojR0KuBO/NFzUsEvLnrBSRggUNVfs=; b=KK535vjwzyP0uvUDB9PCJxVIIi9Om62Opxyz9wk3uwXI1WWv5c671KRNCDpC3if/1E SMvyBwap+y97YWOwG4bEGBWOaX7m5HuoYUYM7HrS9nZXe38D5WURXtjJDIU36D51Weye Ttx9y47TnhGMm+zK7svxXbHIp2i6cGjT3HWaU65//qei+VYtAMKPT8s6cSyWlzFGSwYI aHA7MSp1NPpfIW2HV37EkS6yYI0wSOMZ89MFJSTlFyQQaqWu9OPCQnc1wlAOIXsiCO6p loYER/dqIgGRr2QGOLHaZwd2H2St7wjkafQ9Vve1Wil+11QT31TTKjpiTGhsvbIooJDX DlOA== X-Gm-Message-State: AOAM531lsjv+4cROu+iPLAfibXZg/8pKf8X649qdWwTxKyfBRzxop8TJ w5HeR99YpqNsopjqv2Ta/BcFeg== X-Google-Smtp-Source: ABdhPJzAnvVu6/SF3bfB9J5GrcKSmVepVqxP+gwQ0e6uk5SsUa0desiOHBw26qJT668icl0omjrV0w== X-Received: by 2002:a5d:4a0b:: with SMTP id m11mr5853166wrq.210.1626183454587; Tue, 13 Jul 2021 06:37:34 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:34 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 07/34] target/arm: Fix calculation of LTP mask when LR is 0 Date: Tue, 13 Jul 2021 14:36:59 +0100 Message-Id: <20210713133726.26842-8-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42c; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42c.google.com X-Spam_score_int: -1 X-Spam_score: -0.2 X-Spam_bar: / X-Spam_report: (-0.2 / 5.0 requ) DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In mve_element_mask(), we calculate a mask for tail predication which should have a number of 1 bits based on the value of LR. However, our MAKE_64BIT_MASK() macro has undefined behaviour when passed a zero length. Special case this to give the all-zeroes mask we require. 
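For illustration only (not part of the patch): a minimal standalone sketch of the special case. It assumes MAKE_64BIT_MASK() has the definition shown below, which is reproduced from memory of include/qemu/bitops.h and should be treated as an assumption. The helper name ltp_mask() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed to mirror QEMU's macro. With length == 0 the right shift
 * would be by 64 bits, which is undefined behaviour in C.
 */
#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Hypothetical helper: tail-predication mask covering masklen bytes. */
static uint16_t ltp_mask(int masklen)
{
    assert(masklen >= 0 && masklen <= 16);
    /* Handle 0 first so the macro never sees a zero length. */
    return masklen ? (uint16_t)MAKE_64BIT_MASK(0, masklen) : 0;
}
```

The zero case has to be tested before the macro is evaluated; masking the result afterwards would not help, because the undefined shift would already have happened.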
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index f17e5a413fd..c75432c5fef 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -64,7 +64,8 @@ static uint16_t mve_element_mask(CPUARMState *env) */ int masklen = env->regs[14] << env->v7m.ltpsize; assert(masklen <= 16); - mask &= MAKE_64BIT_MASK(0, masklen); + uint16_t ltpmask = masklen ? MAKE_64BIT_MASK(0, masklen) : 0; + mask &= ltpmask; } if ((env->condexec_bits & 0xf) == 0) { From patchwork Tue Jul 13 13:37:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504621 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ZSQfSA14; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMMv2TDvz9sXM for ; Tue, 13 Jul 2021 23:45:31 +1000 (AEST) Received: from localhost ([::1]:43486 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ij2-0004sE-He for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:45:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54050) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbV-0008Q6-BG for 
qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:41 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:36508) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbQ-0003db-FQ for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:41 -0400 Received: by mail-wr1-x42e.google.com with SMTP id v5so30525282wrt.3 for ; Tue, 13 Jul 2021 06:37:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=b4VQqETNkuWXD+kQmUy6tGq0WaqyAHDpekMK+m6bk54=; b=ZSQfSA14kv+/1OFQDQD+YjYnebZt37mbZi19Ck7a+WPp4L08vUjtLxBtWjtrSnAOep WemdCHnj/TTpbejaCOB3u80/Xklj9V8La6w22oROFbRMAPtuUKVOaEIpIXDvzX0RUgbH tL7BBZGX3AUFVAnpYlk2vvEw++MHYlb/vY+npvVOLAiKRYw7hFkeApjeKmVE57vr8JFv PugNmc+vo8sWJT04sS5/1DpxWKYQrh/G/oGS/TbVhsywDDaURmwFn/0FdW40svnystbH B3ptJ0bUYRQt7vz+OMDlD9MNxvzh6jJ5Rd7Ro6kdkaUEhfW0x+Z8JGxvmYg80kC09CKV jYaA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b4VQqETNkuWXD+kQmUy6tGq0WaqyAHDpekMK+m6bk54=; b=fbFaRR66yoJ4TPPmhSzkBVPhN0c68L2C7ZLmmcSN30BAx+ZE2HiK6S8fRtpyaq7NG2 SCh+YgbEcU11S0duFfYUlXuZIOH6bDEGFKbMAjyytJZpaN9bqkXnf8JZYyhACxFphuKG eU3dpVTIvjNrPK5bKJJ6Dm8rpeKeVDo1CMA3vMpuuOsOadRx8pjI2XjIIKi7MXLWWjnw kh2sX694DqhRb2aFpQPBlfZPQ7IATGqDS88ZduqGZZNAFiPZ1dgE15F3sYz40D8g3GsA BXiELaNuu/BlZZLqwN6af3z4L5Q3kfOeE56Hy9nE3MsPcIy4RvaCjRwh9RIjKW1axri5 71cw== X-Gm-Message-State: AOAM532FEa3u4ZT/qLSJ0RuprVnB6G0PcigCfAXAzVzwOfD1n/egOk1G 1dBQppW7iacPgeyksOV1hDbEjRn08lE48oLq X-Google-Smtp-Source: ABdhPJyiKb9ZrW5qt9E0PN1OHy7UGQvATnu5VNL2W/Yflh3HHZDoXFwCwP2iU1Z4huDa4WIpw0gjoQ== X-Received: by 2002:adf:ba13:: with SMTP id o19mr5756299wrg.7.1626183455268; Tue, 13 Jul 2021 06:37:35 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:34 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 08/34] target/arm: Fix VPT advance when ECI is non-zero Date: Tue, 13 Jul 2021 14:37:00 +0100 Message-Id: <20210713133726.26842-9-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We were not paying attention to the ECI state when advancing the VPT state. Architecturally, VPT state advance happens for every beat (see the pseudocode VPTAdvance()), so on every beat the 4 bits of VPR.P0 corresponding to the current beat are inverted if required, and at the end of beats 1 and 3 the VPR MASK fields are updated. This means that if the ECI state says we should not be executing all 4 beats then we need to skip some of the updating of the VPR that we currently do in mve_advance_vpt(). 
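For illustration only (not part of the patch): a standalone sketch of the per-beat update described above. It assumes the VPR layout P0 = bits [15:0], MASK01 = bits [19:16], MASK23 = bits [23:20]. The function advance_vpr() and the ECI_* values are stand-ins for this sketch, and eci is the ECI state the insn started with.

```c
#include <stdint.h>

/* Stand-in ECI encodings for this sketch (assumed, not authoritative). */
enum {
    ECI_NONE = 0, ECI_A0 = 1, ECI_A0A1 = 2,
    ECI_A0A1A2 = 4, ECI_A0A1A2B0 = 5,
};

/* Hypothetical model: return the VPR value after one insn's beats. */
static uint32_t advance_vpr(uint32_t vpr, int eci)
{
    unsigned mask01 = (vpr >> 16) & 0xf;
    unsigned mask23 = (vpr >> 20) & 0xf;

    if (!mask01 && !mask23) {
        return vpr; /* not inside a VPT block */
    }
    if (mask01 > 8) {
        if (eci == ECI_NONE) {
            vpr ^= 0xff;        /* beats 0 and 1 both executed */
        } else if (eci == ECI_A0) {
            vpr ^= 0xf0;        /* only beat 1 executed */
        }                       /* otherwise neither executed */
    }
    if (mask23 > 8) {
        if (eci != ECI_A0A1A2 && eci != ECI_A0A1A2B0) {
            vpr ^= 0xff00;      /* beats 2 and 3 both executed */
        } else {
            vpr ^= 0xf000;      /* only beat 3 executed */
        }
    }
    /* MASK01 advances at the end of beat 1, so only if beat 1 ran. */
    if (eci == ECI_NONE || eci == ECI_A0) {
        vpr = (vpr & ~(0xfu << 16)) | (((mask01 << 1) & 0xf) << 16);
    }
    /* Beat 3 always runs, so MASK23 always advances. */
    vpr = (vpr & ~(0xfu << 20)) | (((mask23 << 1) & 0xf) << 20);
    return vpr;
}
```

For example, with both mask fields at 0b1100 and P0 clear, ECI_NONE inverts all 16 P0 bits, while ECI_A0A1 leaves the low half of P0 and MASK01 alone.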
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index c75432c5fef..b111ba3b106 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -100,9 +100,11 @@ static void mve_advance_vpt(CPUARMState *env) /* Advance the VPT and ECI state if necessary */ uint32_t vpr = env->v7m.vpr; unsigned mask01, mask23; + int eci = ECI_NONE; if ((env->condexec_bits & 0xf) == 0) { - env->condexec_bits = (env->condexec_bits == (ECI_A0A1A2B0 << 4)) ? + eci = env->condexec_bits >> 4; + env->condexec_bits = (eci == ECI_A0A1A2B0) ? (ECI_A0 << 4) : (ECI_NONE << 4); } @@ -111,17 +113,32 @@ static void mve_advance_vpt(CPUARMState *env) return; } + /* Invert P0 bits if needed, but only for beats we actually executed */ mask01 = FIELD_EX32(vpr, V7M_VPR, MASK01); mask23 = FIELD_EX32(vpr, V7M_VPR, MASK23); if (mask01 > 8) { - /* high bit set, but not 0b1000: invert the relevant half of P0 */ - vpr ^= 0xff; + if (eci == ECI_NONE) { + /* high bit set, but not 0b1000: invert the relevant half of P0 */ + vpr ^= 0xff; + } else if (eci == ECI_A0) { + /* Invert only the beat 1 P0 bits, as we didn't execute beat 0 */ + vpr ^= 0xf0; + } /* otherwise we didn't execute either beat 0 or beat 1 */ } if (mask23 > 8) { - /* high bit set, but not 0b1000: invert the relevant half of P0 */ - vpr ^= 0xff00; + if (eci != ECI_A0A1A2 && eci != ECI_A0A1A2B0) { + /* high bit set, but not 0b1000: invert the relevant half of P0 */ + vpr ^= 0xff00; + } else { + /* We didn't execute beat 2, only invert the beat 3 P0 bits */ + vpr ^= 0xf000; + } } - vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1); + /* Only update MASK01 if beat 1 executed */ + if (eci == ECI_NONE || eci == ECI_A0) { + vpr = FIELD_DP32(vpr, V7M_VPR, MASK01, mask01 << 1); + } + /* Beat 3 always executes, so update MASK23 */ vpr = FIELD_DP32(vpr, V7M_VPR, 
MASK23, mask23 << 1); env->v7m.vpr = vpr; } From patchwork Tue Jul 13 13:37:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504623 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SPIBGNyW; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMNV4xWpz9sWl for ; Tue, 13 Jul 2021 23:46:02 +1000 (AEST) Received: from localhost ([::1]:44802 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IjY-0005jZ-Cz for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:46:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54210) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbZ-0008W3-SI for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:49 -0400 Received: from mail-wm1-x32e.google.com ([2a00:1450:4864:20::32e]:51933) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbR-0003dm-Ca for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:45 -0400 Received: by mail-wm1-x32e.google.com with SMTP id n4so3255586wms.1 for ; Tue, 13 Jul 2021 06:37:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=KPIYZ3zIq5bH2bgXU9pNq5i9SxEKiNByuloEeu6bbIM=; b=SPIBGNyWKDEeVV0cl9XGjDkfUKZJHstzBPR9I54/HUkKedoXMECvpGWCiGnKLmXhA6 We9/KzJ1HODr/XckHCD5Sm8Pl0c0BDfAx9X6wTStwiLT6CNbEFPTnEidbPrcEKECZj2q sX/yPrv/j3h8dRU0xxNW0xx5/w3q/hsSZ3OlafJqJBhuvk5bBZRXoT+2O4MCpCqs7/jD ZJc6EcqD303gxuAhfnMZm1gOfD3ypvDVX+f0ByuiS9jDPNEPHuicoybHWdMZkbwdciY1 i8fcKI5t8I1pKAcjmTXSDDedD+BbV1aK8D2/GW/kYzhKEPRDh1YZBLMUDfRkWVZJ15AW ZgGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KPIYZ3zIq5bH2bgXU9pNq5i9SxEKiNByuloEeu6bbIM=; b=IGKDjbe7NnmJ3riy7fodpPMjWMDcZq/v5i4GxND8bjq5EM8J74K9qiryIr0G8Zyjm0 obl+NOmdOuhvEIttdP4hhVFTzeVCsYm70oJdRooTZBukWPWcbqp0K9r/YGq7TAj9V46X ywB1oSDKKu0l1GtwLnQKnJo+yN79Gnxht4pHZWQVLpzw6QJgcBHqZR08DZjBUkxnvb61 PfGWURGn/Klj/ebGW+8jI9RDHj9nBpkXa4PGggWa0v5Qnnl/uG+dk0EexSsbS84iWOth hk7gIUdRuEw65rvSqZUiAcdBdTvzSE6nIWn0VZN7c8WkE+9Ur9sHcItfaR0e40XrSSCd lW0g== X-Gm-Message-State: AOAM530ufhPWNFnTNi6czYxYpOKzw8ykPRl+EQAhjg9s3lDQnpyovN/8 xoz2qruiqJwDKTSu+nygjFcnO5tCAiUxs2nl X-Google-Smtp-Source: ABdhPJxv+gaGHK5fr8Y2H3uasw0A/MzgBv6pZXfDllLmhV1Bb6njTpQ6Gpk7siXztpJKvvdPjcKgKg== X-Received: by 2002:a05:600c:2197:: with SMTP id e23mr5174184wme.101.1626183455932; Tue, 13 Jul 2021 06:37:35 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:35 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 09/34] target/arm: Factor out mve_eci_mask() Date: Tue, 13 Jul 2021 14:37:01 +0100 Message-Id: <20210713133726.26842-10-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32e; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" In some situations we need a mask telling us which parts of the vector correspond to beats that are not being executed because of ECI, separately from the combined "which bytes are predicated away" mask. Factor this mask calculation out of mve_element_mask() into its own function. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 58 ++++++++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 24 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index b111ba3b106..b0cbfda3cce 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -26,6 +26,35 @@ #include "exec/exec-all.h" #include "tcg/tcg.h" +static uint16_t mve_eci_mask(CPUARMState *env) +{ + /* + * Return the mask of which elements in the MVE vector correspond + * to beats being executed. The mask has 1 bits for executed lanes + * and 0 bits where ECI says this beat was already executed. + */ + int eci; + + if ((env->condexec_bits & 0xf) != 0) { + return 0xffff; + } + + eci = env->condexec_bits >> 4; + switch (eci) { + case ECI_NONE: + return 0xffff; + case ECI_A0: + return 0xfff0; + case ECI_A0A1: + return 0xff00; + case ECI_A0A1A2: + case ECI_A0A1A2B0: + return 0xf000; + default: + g_assert_not_reached(); + } +} + static uint16_t mve_element_mask(CPUARMState *env) { /* @@ -68,30 +97,11 @@ static uint16_t mve_element_mask(CPUARMState *env) mask &= ltpmask; } - if ((env->condexec_bits & 0xf) == 0) { - /* - * ECI bits indicate which beats are already executed; - * we handle this by effectively predicating them out. - */ - int eci = env->condexec_bits >> 4; - switch (eci) { - case ECI_NONE: - break; - case ECI_A0: - mask &= 0xfff0; - break; - case ECI_A0A1: - mask &= 0xff00; - break; - case ECI_A0A1A2: - case ECI_A0A1A2B0: - mask &= 0xf000; - break; - default: - g_assert_not_reached(); - } - } - + /* + * ECI bits indicate which beats are already executed; + * we handle this by effectively predicating them out. 
+ */ + mask &= mve_eci_mask(env); return mask; } From patchwork Tue Jul 13 13:37:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504619 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=uF7x+HyA; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMJg6gqrz9sWl for ; Tue, 13 Jul 2021 23:42:43 +1000 (AEST) Received: from localhost ([::1]:35646 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IgL-00081h-Kc for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:42:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54118) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbX-0008UB-10 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:44 -0400 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]:34646) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbR-0003e0-Te for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:42 -0400 Received: by mail-wm1-x32d.google.com with SMTP id u5-20020a7bc0450000b02901480e40338bso1661778wmc.1 for ; Tue, 13 Jul 2021 06:37:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=YEUxhatg18yYkk6DxfXs5vGtqzQve7WvoaIOOBTMxb4=; b=uF7x+HyAWZqUjVXr3ZVV4UL8sKwCdvSPBhUzU5f3hW0o/QtHeeTV1G45D21K501ZG4 F4eKkKOfOy+/TYU7TIvHG82nFUMmMj90gqnR2+ADW+B2/LlUEJ8yQoa9o+9nHUM7vm3y WOIFqrhU1Z5D8U6R52+Oj1DVxey0cmqh9DWOhk7oftypf7JW/gobSp6YAqiNQnuXvyNh Xl13mZ6gtoxgsiOJSMHTXGuREck2u9Havtt28fX7cNXPWmMOG2DQhXIcOhUDYGOcj/ns uqf66HMeFO6oH30Ic9v3tckVmsXDa78JI1/mt08X3GknOeYyarg4axVZhzfnAHxFnhY5 v1Vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YEUxhatg18yYkk6DxfXs5vGtqzQve7WvoaIOOBTMxb4=; b=SyyhPWBFIh34BmZm/CrMBT91Tcaf4q7c3NNBOba8G9/m63m0sXHMvbRpYUAGE4oiAT Xb1PIDc/ELuLdAIyU62WaGhv0q2aBzGZRhO4Ccf2fUMfdlHtxT1fufiwQpCjLbRHYLMK d0/r+rZ6wIee0+TvhsXpX/Lt4wnRH08JpGUX4ar1KHlKLOm5kNNvBhyBQgKaDgwoNAHo iVCQmyAn+8bBkh/x0tlSIWg0USsOSH4Z7JkOr1lPg+iwJlxdyqoHu+c1n9EUFyHvXYcD qox6RaOR2d/TZ1FQqtT4heHiQzhnMICZcMMSlJlP/+eXq7Q1jyPrWMOR3AuvQZ7JAYYT FuYQ== X-Gm-Message-State: AOAM532uOxcU9qvLixVFM1nezEVtWxpdRGSo/uaZpODdxlBCvoorPeUK LOdraVYVpN8AC2uLrZES4/gWKZTpFDfYNeb9 X-Google-Smtp-Source: ABdhPJyI6d95G7H/Lk46izsooBeEFo4Q7DkkkDDdkgb5eheW1iOKgztNMoKJoeYkoB7kdVvrAW5NZg== X-Received: by 2002:a1c:f616:: with SMTP id w22mr5044718wmc.131.1626183456535; Tue, 13 Jul 2021 06:37:36 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:36 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 10/34] target/arm: Fix VLDRB/H/W for predicated elements Date: Tue, 13 Jul 2021 14:37:02 +0100 Message-Id: <20210713133726.26842-11-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32d; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" For vector loads, predicated elements are zeroed, instead of retaining their previous values (as happens for most data processing operations). This means we need to distinguish "beat not executed due to ECI" (don't touch destination element) from "beat executed but predicated out" (zero destination element). 
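For illustration only (not part of the patch): a standalone model of the three cases for a byte-element load. The function vldrb_model() is hypothetical, and a plain array stands in for guest memory. As in the helper, mask is assumed to be the combined predicate (which already has the ECI bits cleared) and eci_mask covers only the beats being executed.

```c
#include <stdint.h>

/* Hypothetical model of a 16 x 8-bit vector load with predication. */
static void vldrb_model(uint8_t d[16], const uint8_t mem[16],
                        uint16_t mask, uint16_t eci_mask)
{
    for (unsigned b = 0; b < 16; b++) {
        if (eci_mask & (1u << b)) {
            /* Beat executed: load active lanes, zero predicated ones. */
            d[b] = (mask & (1u << b)) ? mem[b] : 0;
        }
        /* Beat skipped due to ECI: d[b] keeps its previous value. */
    }
}
```

With eci_mask = 0xfff0 and mask = 0x0f00, bytes 0 to 3 are left untouched, bytes 8 to 11 are loaded, and the remaining bytes are zeroed.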
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/mve_helper.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index b0cbfda3cce..f78228f70c1 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -153,12 +153,13 @@ static void mve_advance_vpt(CPUARMState *env) env->v7m.vpr = vpr; } - +/* For loads, predicated lanes are zeroed instead of keeping their old values */ #define DO_VLDR(OP, MSIZE, LDTYPE, ESIZE, TYPE) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, uint32_t addr) \ { \ TYPE *d = vd; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned b, e; \ /* \ * R_SXTM allows the dest reg to become UNKNOWN for abandoned \ @@ -166,8 +167,9 @@ static void mve_advance_vpt(CPUARMState *env) * then take an exception. \ */ \ for (b = 0, e = 0; b < 16; b += ESIZE, e++) { \ - if (mask & (1 << b)) { \ - d[H##ESIZE(e)] = cpu_##LDTYPE##_data_ra(env, addr, GETPC()); \ + if (eci_mask & (1 << b)) { \ + d[H##ESIZE(e)] = (mask & (1 << b)) ? 
\ + cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ } \ addr += MSIZE; \ } \ From patchwork Tue Jul 13 13:37:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504624 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=hOIsDVWu; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMNW1NRlz9sX3 for ; Tue, 13 Jul 2021 23:46:03 +1000 (AEST) Received: from localhost ([::1]:44754 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IjY-0005hD-R0 for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:46:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54178) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbY-0008Vm-Pe for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:49 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:37692) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbS-0003f3-Kv for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:44 -0400 Received: by mail-wr1-x429.google.com with SMTP id i94so30500949wri.4 for ; Tue, 13 Jul 2021 06:37:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Y6eRE0yqDQuLemB38skVeadaDqHqEErxb2Y9Ypfy/IQ=; b=hOIsDVWu027+lFYTYm8XvhP6hdesGR/s5tB9JxAf5SlF7Aq/eu9X0wWZVGb5s3RoOz hE3AOkBEz3HL7IFL0eNJvpM5gmipbRz5bnFYqareYcSwjtDSm3bnVreQJT/RpueZz32Q 5TxA7m2jiRYZRzyxUfWUXCKm1WQCkmUV6YwB2j1ONq8+m0WVFUnzExWS4LkA96sYarJC aGAzwq20U6TNlrYzHZCk6AHHOcxXWjlrUbSIMlFvLWubhw/iEUE5tzSL7gVoAt3xVSjE SI4WFw032YvpbCNZ7rb+JHNDnjIJiig69A9UoNIzXTM2wa4IsN7vB2jCWsn211xDyCOs 418A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Y6eRE0yqDQuLemB38skVeadaDqHqEErxb2Y9Ypfy/IQ=; b=YTbo35o1jHE+y2LzVJpgJ66D7gLmw7C98B4qroCKXV5jE05VqoU4gnXsYZyP3li6cx YnORpSdKiNgfYFIyUszYWB3kzPd3dinVpWNMKP+GMuOkR7OQd2ZM6x/2hp3pT3kBgQHN LA0XK/2Ut2BUGJRRA0/U8UKQ+f1+KRnUy9nab0QlaSij2Xq+FJiZYDF9iNmbAWhZDfSu KVDBpboHWUGYh97sJoGoOHttM0x8gXNVT9Bi+OdZvJyW47jADETuhkpdDAwk9ycdmIbp VcmGkN2fP+nbymEKI1ilFkf5/4gs7Ypk+qbARKpYAuYeof/Cs5zgCpMSqPLf0hGfdpxk 62EQ== X-Gm-Message-State: AOAM5322rt962fzXeLaxE0Ag1dpdm0DDKr0GAurt0f1WjpeiGIPEg0uV e/oDpzJfG0eLi1bOVEkyUbbtRg== X-Google-Smtp-Source: ABdhPJzn5BlqVXM2BeReYBxdysQ+f8Rc5dDoin/j45/SKsKrQyWJYL9ke9daZ3s1zak/k/4NVonz5Q== X-Received: by 2002:adf:de84:: with SMTP id w4mr5877473wrl.104.1626183457246; Tue, 13 Jul 2021 06:37:37 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:36 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 11/34] target/arm: Implement MVE VMULL (polynomial) Date: Tue, 13 Jul 2021 14:37:03 +0100 Message-Id: <20210713133726.26842-12-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMULL (polynomial) insn. Unlike Neon, this comes in two flavours: 8x8->16 and 16x16->32. Also unlike Neon, the inputs are in either the low or the high half of each double-width element. The assembler for this insn indicates the size with "P8" or "P16", encoded into bit 28 as size = 0 or 1. We choose to follow the same encoding as VQDMULL and decode this into a->size as MO_16 or MO_32 indicating the size of the result elements. This then carries through to the helper function names where it then matches up with the existing pmull_h() which does an 8x8->16 operation and a new pmull_w() which does the 16x16->32.
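For illustration only (not part of the patch): a scalar reference sketch of the 8x8->16 polynomial (carry-less) multiply for one 16-bit element, with the input taken from either the low or the high byte. The names clmul8() and vmullp8_lane() are hypothetical. The real helpers process 64 bits at a time, so this only shows the arithmetic for one lane.

```c
#include <stdint.h>

/* Carry-less multiply over GF(2): partial products are XORed, not added. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;

    for (int i = 0; i < 8; i++) {
        if (a & (1u << i)) {
            r ^= (uint16_t)((uint16_t)b << i);
        }
    }
    return r;
}

/*
 * Hypothetical per-lane model: pick the bottom (top == 0) or top
 * (top != 0) byte of each 16-bit input element and multiply them.
 */
static uint16_t vmullp8_lane(uint16_t n, uint16_t m, int top)
{
    unsigned sh = top ? 8 : 0;

    return clmul8((uint8_t)(n >> sh), (uint8_t)(m >> sh));
}
```

Because there are no carries, (x + 1) * (x + 1) is x^2 + 1, so clmul8(3, 3) gives 5 rather than 9.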
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 5 +++++ target/arm/vec_internal.h | 11 +++++++++++ target/arm/mve.decode | 14 ++++++++++---- target/arm/mve_helper.c | 16 ++++++++++++++++ target/arm/translate-mve.c | 28 ++++++++++++++++++++++++++++ target/arm/vec_helper.c | 14 +++++++++++++- 6 files changed, 83 insertions(+), 5 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 56e40844ad9..84adfb21517 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -145,6 +145,11 @@ DEF_HELPER_FLAGS_4(mve_vmulltub, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vmulltuh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vmulltuw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpbh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpth, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullpbw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vmullptw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vqdmulhb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vqdmulhh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vqdmulhw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 865d2139447..2a335582906 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -206,4 +206,15 @@ int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); +/* + * 8 x 8 -> 16 vector polynomial multiply where the inputs are + * in the low 8 bits of each 16-bit element +*/ +uint64_t pmull_h(uint64_t op1, uint64_t op2); +/* + * 16 x 16 -> 32 vector polynomial multiply where the inputs are + * in the low 16 bits of each 32-bit 
element + */ +uint64_t pmull_w(uint64_t op1, uint64_t op2); + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/mve.decode b/target/arm/mve.decode index fa9d921f933..de079ec517d 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -173,10 +173,16 @@ VHADD_U 111 1 1111 0 . .. ... 0 ... 0 0000 . 1 . 0 ... 0 @2op VHSUB_S 111 0 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op VHSUB_U 111 1 1111 0 . .. ... 0 ... 0 0010 . 1 . 0 ... 0 @2op -VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op -VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op -VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op -VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op +{ + VMULLP_B 111 . 1110 0 . 11 ... 1 ... 0 1110 . 0 . 0 ... 0 @2op_sz28 + VMULL_BS 111 0 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op + VMULL_BU 111 1 1110 0 . .. ... 1 ... 0 1110 . 0 . 0 ... 0 @2op +} +{ + VMULLP_T 111 . 1110 0 . 11 ... 1 ... 1 1110 . 0 . 0 ... 0 @2op_sz28 + VMULL_TS 111 0 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op + VMULL_TU 111 1 1110 0 . .. ... 1 ... 1 1110 . 0 . 0 ... 0 @2op +} VQDMULH 1110 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op VQRDMULH 1111 1111 0 . .. ... 0 ... 0 1011 . 1 . 0 ... 0 @2op diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index f78228f70c1..db5ec9266d1 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -488,6 +488,22 @@ DO_2OP_L(vmulltub, 1, 1, uint8_t, 2, uint16_t, DO_MUL) DO_2OP_L(vmulltuh, 1, 2, uint16_t, 4, uint32_t, DO_MUL) DO_2OP_L(vmulltuw, 1, 4, uint32_t, 8, uint64_t, DO_MUL) +/* + * Polynomial multiply. We can always do this generating 64 bits + * of the result at a time, so we don't need to use DO_2OP_L. 
+ */ +#define VMULLPH_MASK 0x00ff00ff00ff00ffULL +#define VMULLPW_MASK 0x0000ffff0000ffffULL +#define DO_VMULLPBH(N, M) pmull_h((N) & VMULLPH_MASK, (M) & VMULLPH_MASK) +#define DO_VMULLPTH(N, M) DO_VMULLPBH((N) >> 8, (M) >> 8) +#define DO_VMULLPBW(N, M) pmull_w((N) & VMULLPW_MASK, (M) & VMULLPW_MASK) +#define DO_VMULLPTW(N, M) DO_VMULLPBW((N) >> 16, (M) >> 16) + +DO_2OP(vmullpbh, 8, uint64_t, DO_VMULLPBH) +DO_2OP(vmullpth, 8, uint64_t, DO_VMULLPTH) +DO_2OP(vmullpbw, 8, uint64_t, DO_VMULLPBW) +DO_2OP(vmullptw, 8, uint64_t, DO_VMULLPTW) + /* * Because the computation type is at least twice as large as required, * these work for both signed and unsigned source types. diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index a2a45036a0b..d318f34b2bc 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -464,6 +464,34 @@ static bool trans_VQDMULLT(DisasContext *s, arg_2op *a) return do_2op(s, a, fns[a->size]); } +static bool trans_VMULLP_B(DisasContext *s, arg_2op *a) +{ + /* + * Note that a->size indicates the output size, ie VMULL.P8 + * is the 8x8->16 operation and a->size is MO_16; VMULL.P16 + * is the 16x16->32 operation and a->size is MO_32. 
+ */ + static MVEGenTwoOpFn * const fns[] = { + NULL, + gen_helper_mve_vmullpbh, + gen_helper_mve_vmullpbw, + NULL, + }; + return do_2op(s, a, fns[a->size]); +} + +static bool trans_VMULLP_T(DisasContext *s, arg_2op *a) +{ + /* a->size is as for trans_VMULLP_B */ + static MVEGenTwoOpFn * const fns[] = { + NULL, + gen_helper_mve_vmullpth, + gen_helper_mve_vmullptw, + NULL, + }; + return do_2op(s, a, fns[a->size]); +} + /* * VADC and VSBC: these perform an add-with-carry or subtract-with-carry * of the 32-bit elements in each lane of the input vectors, where the diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 034f6b84f78..17fb1583622 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2028,11 +2028,23 @@ static uint64_t expand_byte_to_half(uint64_t x) | ((x & 0xff000000) << 24); } -static uint64_t pmull_h(uint64_t op1, uint64_t op2) +uint64_t pmull_w(uint64_t op1, uint64_t op2) { uint64_t result = 0; int i; + for (i = 0; i < 16; ++i) { + uint64_t mask = (op1 & 0x0000000100000001ull) * 0xffffffff; + result ^= op2 & mask; + op1 >>= 1; + op2 <<= 1; + } + return result; +} +uint64_t pmull_h(uint64_t op1, uint64_t op2) +{ + uint64_t result = 0; + int i; for (i = 0; i < 8; ++i) { uint64_t mask = (op1 & 0x0001000100010001ull) * 0xffff; result ^= op2 & mask; From patchwork Tue Jul 13 13:37:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504630 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google 
header.b=oucua9Qn; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMW14nDpz9sXL for ; Tue, 13 Jul 2021 23:51:41 +1000 (AEST) Received: from localhost ([::1]:60850 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ip1-0007zO-Cx for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:51:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54194) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3IbZ-0008W1-Eo for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:49 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:33602) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbT-0003ff-J5 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:45 -0400 Received: by mail-wr1-x429.google.com with SMTP id d2so30571830wrn.0 for ; Tue, 13 Jul 2021 06:37:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=lF5a95h+76qUaqN85b2/xOD4AS8PIdYh9+rcTl/1F90=; b=oucua9Qn6TDUxkvokKcKOMGAZjyqS8NEWD9tNvIzQhDM8i+xQl7xGmP7Aj3eNscREs 98ejT5S4MsdhsBZeU2wc9ZVcgEXX0KjN/CxOr22iJ/s2hrJK36D0DzboJo3eO2UhFCk4 Zjw1wLQb57+D50FNECKK8TzObYgH0+mgDp5IKIaytP73fEBjgyeN8QOvlxTRWLrpjp7g pH5/D5KcM7QP00qg5g4Lnm1V7IRZus+EzMuFhj+DPa6i4TJFAr+b0d8WKliWO8t4gsXY 3QcWLfdKEGXj6OOnkbLAxcfK4HgU0Xfrs+/fjOe15r5mNc6Qsowr9kfz7XxB8/zmZZUs ri5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lF5a95h+76qUaqN85b2/xOD4AS8PIdYh9+rcTl/1F90=; 
b=bfbs+hAmfxSP5ESbmfsTJ8Frwsz+QQ4eMztpiWvuUbHs54807FdH45mmPmUSQ8ScUS oJavtHMAN4IT+pgXSD5SToWakHyo7OZjKeeV6oV8weBVT+gE0AkK1Un+reT7L/vk5ZHi AJXmn7YZPew+wKGVWEpuWIDxqA2aIA4t/+udjnNcTAgS/tb8F8YSAVZkJGbCTARUM5MW mSyDgiILeVPSnU0KEOkKBXP4Y8fZvwQhNYaWTe+OY6c7TInXyUS6R/0jBB/KIQQjUY0N 0Z3zYXQu+9vdqSjOw4RKr0ZlnugWRVoAnda6wP5p3VDERlxaWHVeJNvLgRaH8GW5TXk7 C+dg== X-Gm-Message-State: AOAM531W0r2lynlPgAtlMzZoJhDg3Pcw4/hRQyDMOYRWqMJJNtHftVDS V6sbCzBuVS9FD9/HdByY3wTZbQ== X-Google-Smtp-Source: ABdhPJzB1LiwuLZ9A2SiEBcEpJeiaBJ7NBqI1auxUBRHw0r5yVwFU/eu+K5wI90vHnSXKZs7Zur6gA== X-Received: by 2002:adf:e3cf:: with SMTP id k15mr5738030wrm.60.1626183458182; Tue, 13 Jul 2021 06:37:38 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:37 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 12/34] target/arm: Implement MVE incrementing/decrementing dup insns Date: Tue, 13 Jul 2021 14:37:04 +0100 Message-Id: <20210713133726.26842-13-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE incrementing/decrementing dup insns VIDUP, VDDUP, VIWDUP and VDWDUP. These fill the elements of a vector with successively incrementing values, starting at the offset specified in a general purpose register. The final value of the offset is written back to this register. The wrapping variants take a second general purpose register which specifies the point where the count should wrap back to 0. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 16 +++++ target/arm/mve.decode | 25 ++++++++ target/arm/mve_helper.c | 64 ++++++++++++++++++++ target/arm/translate-mve.c | 118 +++++++++++++++++++++++++++++++++++++ 4 files changed, 223 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 84adfb21517..54b252e98af 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -35,6 +35,22 @@ DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_viduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_vidupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) + +DEF_HELPER_FLAGS_4(mve_vddupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_vdduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) +DEF_HELPER_FLAGS_4(mve_vddupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) + +DEF_HELPER_FLAGS_5(mve_viwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_viwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_viwdupw, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) + +DEF_HELPER_FLAGS_5(mve_vdwdupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_vdwduph, TCG_CALL_NO_WG, i32, env, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_5(mve_vdwdupw, TCG_CALL_NO_WG, i32, 
env, ptr, i32, i32, i32) + DEF_HELPER_FLAGS_3(mve_vclsb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vclsh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vclsw, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index de079ec517d..88c9c18ebf1 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -35,6 +35,8 @@ &2scalar qd qn rm size &1imm qd imm cmode op &2shift qd qm shift size +&vidup qd rn size imm +&viwdup qd rn rm size imm @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -259,6 +261,29 @@ VDUP 1110 1110 1 1 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=0 VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 1 1 0000 @vdup size=1 VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 +# Incrementing and decrementing dup + +# VIDUP, VDDUP format immediate: 1 << (immh:imml) +%imm_vidup 7:1 0:1 !function=vidup_imm + +# VIDUP, VDDUP registers: Rm bits [3:1] from insn, bit 0 is 1; +# Rn bits [3:1] from insn, bit 0 is 0 +%vidup_rm 1:3 !function=times_2_plus_1 +%vidup_rn 17:3 !function=times_2 + +@vidup .... .... . . size:2 .... .... .... .... .... \ + qd=%qd imm=%imm_vidup rn=%vidup_rn &vidup +@viwdup .... .... . . size:2 .... .... .... .... .... \ + qd=%qd imm=%imm_vidup rm=%vidup_rm rn=%vidup_rn &viwdup +{ + VIDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 111 . @vidup + VIWDUP 1110 1110 0 . .. ... 1 ... 0 1111 . 110 ... . @viwdup +} +{ + VDDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 111 . @vidup + VDWDUP 1110 1110 0 . .. ... 1 ... 1 1111 . 110 ... . 
@viwdup +} + # multiply-add long dual accumulate # rdahi: bits [3:1] from insn, bit 0 is 1 # rdalo: bits [3:1] from insn, bit 0 is 0 diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index db5ec9266d1..0ef5f5d8871 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1698,3 +1698,67 @@ uint32_t HELPER(mve_sqrshr)(CPUARMState *env, uint32_t n, uint32_t shift) { return do_sqrshl_bhs(n, -(int8_t)shift, 32, true, &env->QF); } + +#define DO_VIDUP(OP, ESIZE, TYPE, FN) \ + uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \ + uint32_t offset, uint32_t imm) \ + { \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], offset, mask); \ + offset = FN(offset, imm); \ + } \ + mve_advance_vpt(env); \ + return offset; \ + } + +#define DO_VIWDUP(OP, ESIZE, TYPE, FN) \ + uint32_t HELPER(mve_##OP)(CPUARMState *env, void *vd, \ + uint32_t offset, uint32_t wrap, \ + uint32_t imm) \ + { \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], offset, mask); \ + offset = FN(offset, wrap, imm); \ + } \ + mve_advance_vpt(env); \ + return offset; \ + } + +#define DO_VIDUP_ALL(OP, FN) \ + DO_VIDUP(OP##b, 1, int8_t, FN) \ + DO_VIDUP(OP##h, 2, int16_t, FN) \ + DO_VIDUP(OP##w, 4, int32_t, FN) + +#define DO_VIWDUP_ALL(OP, FN) \ + DO_VIWDUP(OP##b, 1, int8_t, FN) \ + DO_VIWDUP(OP##h, 2, int16_t, FN) \ + DO_VIWDUP(OP##w, 4, int32_t, FN) + +static uint32_t do_add_wrap(uint32_t offset, uint32_t wrap, uint32_t imm) +{ + offset += imm; + if (offset == wrap) { + offset = 0; + } + return offset; +} + +static uint32_t do_sub_wrap(uint32_t offset, uint32_t wrap, uint32_t imm) +{ + if (offset == 0) { + offset = wrap; + } + offset -= imm; + return offset; +} + +DO_VIDUP_ALL(vidup, DO_ADD) +DO_VIDUP_ALL(vddup, DO_SUB) +DO_VIWDUP_ALL(viwdup, do_add_wrap) 
+DO_VIWDUP_ALL(vdwdup, do_sub_wrap) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index d318f34b2bc..52400864692 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -25,6 +25,11 @@ #include "translate.h" #include "translate-a32.h" +static inline int vidup_imm(DisasContext *s, int x) +{ + return 1 << x; +} + /* Include the generated decoder */ #include "decode-mve.c.inc" @@ -36,6 +41,8 @@ typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); +typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); +typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1059,3 +1066,114 @@ static bool trans_VSHLC(DisasContext *s, arg_VSHLC *a) mve_update_eci(s); return true; } + +static bool do_vidup(DisasContext *s, arg_vidup *a, MVEGenVIDUPFn *fn) +{ + TCGv_ptr qd; + TCGv_i32 rn; + + /* + * Vector increment/decrement and duplicate (VIDUP, VDDUP). + * This fills the vector with elements of successively increasing + * or decreasing values, starting from Rn. 
+ */ + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) { + return false; + } + if (a->size == MO_64) { + /* size 0b11 is another encoding */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qd = mve_qreg_ptr(a->qd); + rn = load_reg(s, a->rn); + fn(rn, cpu_env, qd, rn, tcg_constant_i32(a->imm)); + store_reg(s, a->rn, rn); + tcg_temp_free_ptr(qd); + mve_update_eci(s); + return true; +} + +static bool do_viwdup(DisasContext *s, arg_viwdup *a, MVEGenVIWDUPFn *fn) +{ + TCGv_ptr qd; + TCGv_i32 rn, rm; + + /* + * Vector increment/decrement with wrap and duplicate (VIWDUP, VDWDUP). + * This fills the vector with elements of successively increasing + * or decreasing values, starting from Rn. Rm specifies a point where + * the count wraps back around to 0. The updated offset is written back + * to Rn. + */ + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd)) { + return false; + } + if (!fn || a->rm == 13 || a->rm == 15) { + /* + * size 0b11 is another encoding; Rm == 13 is UNPREDICTABLE; + * Rm == 15 is VIDUP, VDDUP. 
+ */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qd = mve_qreg_ptr(a->qd); + rn = load_reg(s, a->rn); + rm = load_reg(s, a->rm); + fn(rn, cpu_env, qd, rn, rm, tcg_constant_i32(a->imm)); + store_reg(s, a->rn, rn); + tcg_temp_free_ptr(qd); + tcg_temp_free_i32(rm); + mve_update_eci(s); + return true; +} + +static bool trans_VIDUP(DisasContext *s, arg_vidup *a) +{ + static MVEGenVIDUPFn * const fns[] = { + gen_helper_mve_vidupb, + gen_helper_mve_viduph, + gen_helper_mve_vidupw, + NULL, + }; + return do_vidup(s, a, fns[a->size]); +} + +static bool trans_VDDUP(DisasContext *s, arg_vidup *a) +{ + static MVEGenVIDUPFn * const fns[] = { + gen_helper_mve_vddupb, + gen_helper_mve_vdduph, + gen_helper_mve_vddupw, + NULL, + }; + return do_vidup(s, a, fns[a->size]); +} + +static bool trans_VIWDUP(DisasContext *s, arg_viwdup *a) +{ + static MVEGenVIWDUPFn * const fns[] = { + gen_helper_mve_viwdupb, + gen_helper_mve_viwduph, + gen_helper_mve_viwdupw, + NULL, + }; + return do_viwdup(s, a, fns[a->size]); +} + +static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a) +{ + static MVEGenVIWDUPFn * const fns[] = { + gen_helper_mve_vdwdupb, + gen_helper_mve_vdwduph, + gen_helper_mve_vdwdupw, + NULL, + }; + return do_viwdup(s, a, fns[a->size]); +} From patchwork Tue Jul 13 13:37:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504625 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=ZeOcnWQd; 
dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMNy0m7nz9sWd for ; Tue, 13 Jul 2021 23:46:26 +1000 (AEST) Received: from localhost ([::1]:46672 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Iju-0006wr-PM for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:46:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54240) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Iba-0008W8-ND for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:50 -0400 Received: from mail-wm1-x32c.google.com ([2a00:1450:4864:20::32c]:39775) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbU-0003fx-Ol for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:46 -0400 Received: by mail-wm1-x32c.google.com with SMTP id l18-20020a1ced120000b029014c1adff1edso1652400wmh.4 for ; Tue, 13 Jul 2021 06:37:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=WAuWkxdCynAW+qf4yNf2j7XCCuMuasPntN4NpfrGcTs=; b=ZeOcnWQdlj1Uc9k89hRhxBVW0WJXX6a2Lqhbw729fruUwaoWRSYdkgWEcLOB6jaWbK xVgCNSs3DKaXh3BMmNTTIrEh24GT1FeVZZ8x3yhAFe8++XHp+EiqJyfFOwuW/nGhc+pj qYQ7itUbcCe2H4kChuCWH4NuT21s6LfTGqLrIfBgGyXBTL5rw3AA62BPqYBP6drNuBKX e16NGhuw6q8f43t4ZA8UjPsyjunu3yAypqOEIjiaBk7UdIHl+NjdBHy9XnmSQkIstcIZ cJtyVzsz7cofvSm209ofjIRjAWLfAa8F3kWvCNRFamFfVljj3g5TbaqcAfaKDaT79GCc BTRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WAuWkxdCynAW+qf4yNf2j7XCCuMuasPntN4NpfrGcTs=; 
b=oi4/w5/G5DCrJtpjRB6GHEzM4KmfHomQyLaDOpWAfQ3Pfk8e2yH8ch3k+E0OpL8pyf I6tYmlkWEg1i/JcwXqm60sFMEfBXcCJTa8C01bTAcwrNcGaxMu+4fIQZvJTuYT+xFqbR uT6PYICCPsrJOR2mjeKKo0SrHy8BnB32+AabC6kk4GzH1P7U7ql8une/vHBp73XmhTEQ Na59kcyRbnpv6M5R3J5+crrH6f8AuAPXbWFkfKFRjOpx8Veatu3QcPfbHjVV+CqCOqVv 2vWE62h4SMMFFkAZUNXIHSbuCx8j4CuBnulMwdRnBtPd8lPDMe3gwodY+Yh31Od0ZFUJ CMwg== X-Gm-Message-State: AOAM530a86N/h5gFbG7Jx9+AqpAdbcavxsDbNg8TeDuB1SN8ouDsPA2W Fmese8GQsIbQrINFEcvaSI6RJ7eg0rkCWVOR X-Google-Smtp-Source: ABdhPJwOjwLa8bgx4/U+ei2S83H0wcF5kU4XUjr87I6pa7BZpFX0anpmDKUpiAWvXuUCwwZBnh0Z+Q== X-Received: by 2002:a1c:1bc3:: with SMTP id b186mr70971wmb.27.1626183459137; Tue, 13 Jul 2021 06:37:39 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:38 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 13/34] target/arm: Factor out gen_vpst() Date: Tue, 13 Jul 2021 14:37:05 +0100 Message-Id: <20210713133726.26842-14-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32c; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: 
"Qemu-devel" Factor out the "generate code to update VPR.MASK01/MASK23" part of trans_VPST(); we are going to want to reuse it for the VPT insns. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-mve.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 52400864692..de65a1c3cf1 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -737,33 +737,24 @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) return do_long_dual_acc(s, a, fns[a->x]); } -static bool trans_VPST(DisasContext *s, arg_VPST *a) +static void gen_vpst(DisasContext *s, uint32_t mask) { - TCGv_i32 vpr; - - /* mask == 0 is a "related encoding" */ - if (!dc_isar_feature(aa32_mve, s) || !a->mask) { - return false; - } - if (!mve_eci_check(s) || !vfp_access_check(s)) { - return true; - } /* * Set the VPR mask fields. We take advantage of MASK01 and MASK23 * being adjacent fields in the register. * - * This insn is not predicated, but it is subject to beat-wise + * Updating the masks is not predicated, but it is subject to beat-wise * execution, and the mask is updated on the odd-numbered beats. * So if PSR.ECI says we should skip beat 1, we mustn't update the * 01 mask field. 
*/ - vpr = load_cpu_field(v7m.vpr); + TCGv_i32 vpr = load_cpu_field(v7m.vpr); switch (s->eci) { case ECI_NONE: case ECI_A0: /* Update both 01 and 23 fields */ tcg_gen_deposit_i32(vpr, vpr, - tcg_constant_i32(a->mask | (a->mask << 4)), + tcg_constant_i32(mask | (mask << 4)), R_V7M_VPR_MASK01_SHIFT, R_V7M_VPR_MASK01_LENGTH + R_V7M_VPR_MASK23_LENGTH); break; @@ -772,13 +763,25 @@ static bool trans_VPST(DisasContext *s, arg_VPST *a) case ECI_A0A1A2B0: /* Update only the 23 mask field */ tcg_gen_deposit_i32(vpr, vpr, - tcg_constant_i32(a->mask), + tcg_constant_i32(mask), R_V7M_VPR_MASK23_SHIFT, R_V7M_VPR_MASK23_LENGTH); break; default: g_assert_not_reached(); } store_cpu_field(vpr, v7m.vpr); +} + +static bool trans_VPST(DisasContext *s, arg_VPST *a) +{ + /* mask == 0 is a "related encoding" */ + if (!dc_isar_feature(aa32_mve, s) || !a->mask) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + gen_vpst(s, a->mask); mve_update_and_store_eci(s); return true; } From patchwork Tue Jul 13 13:37:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504634 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=cPld11kj; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMYb2MqXz9sXL for ; Tue, 13 Jul 2021 23:53:55 +1000 (AEST) Received: 
from localhost ([::1]:42612 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IrA-0006Gl-Pq for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:53:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54320) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibd-0008WE-3g for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:50 -0400 Received: from mail-wr1-x434.google.com ([2a00:1450:4864:20::434]:34547) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbV-0003gd-9s for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:48 -0400 Received: by mail-wr1-x434.google.com with SMTP id p8so30508185wrr.1 for ; Tue, 13 Jul 2021 06:37:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=hZ3VuhusV6EaIsCvrlgReeFP12hFhuCL2InmlZcVOMk=; b=cPld11kjrtMLpKH7GFgjQy1CZmzdy+P3axgM0IwBwKhPW1y8/wgkWkHegr2iuGCKzZ ERqUhyRKQozBAp1q55LAvEHd1a8nowAanrt6ttOaMD0pDaIr4+1KnzNXvketmfLd221f o+V2CN+yALHL1HQnVwIfGF1cvIiNYYsiDOc1CJIcz9nECKFSxeFeMEpVZ6Xgf3SMb0B3 vCzYyqjva3YZ6AT5+1wV3mdQHQM5psl20+JLPoRhJQ+ECXEn4/dapCQ4a8mGO0A2506S wTlJtLN01rUlr5kv4dGwPNGTdrg3cnhT6hTyWBLxxryxu+tH4ftqSBYdmRaR5ZmZzFqS bfBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=hZ3VuhusV6EaIsCvrlgReeFP12hFhuCL2InmlZcVOMk=; b=gO5yOx9KlfjaofgBD8D/m7MBhPjL62kbdNCMhdypO7plQvylJV2bd/aLjImmLrDhIG JlISSk0fgqRdjRbVA8nM5X7H/dW9Pk3TyiPJlZctKcqAw/RzKNwW0qBjirqMYb8hDl+J wX8DXRjBPizzdS41FEzsCJtmmk1vs8fAlkLrMsM3KMcqwOa+Bp/ErKARSaY26KBoHHPv gpTX378nw7Cn9bKYDlsN4GsMGvFNMLyXHLY7A/v9HBWQBg3zd0yzTHjQ+mi7X8HLVoEc l1YtkKiPPDmIC5cFBlOmVpijPOkM9YpbW82AeR+2kHUUypSNoHOaTunYJM+5viwnysnY md+w== 
X-Gm-Message-State: AOAM533o/PED+qmqfI9KJ+F0STUX6z7ZKlk9EDrBB8s5d1a+qS5fxsvQ MA13JJxysjlQNV9B42+h+UVdwA== X-Google-Smtp-Source: ABdhPJzBEDxK1noV+rotuHVQ6QLitF16uqHQEP3hCr27GPcDvugVfn/dIwmWrtatkqwTY0Tqxej64A== X-Received: by 2002:adf:e5ce:: with SMTP id a14mr5655733wrn.226.1626183459949; Tue, 13 Jul 2021 06:37:39 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:39 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 14/34] target/arm: Implement MVE integer vector comparisons Date: Tue, 13 Jul 2021 14:37:06 +0100 Message-Id: <20210713133726.26842-15-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::434; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer vector comparison instructions. These are "VCMP (vector)" encodings T1, T2 and T3, and "VPT (vector)" encodings T1, T2 and T3. These insns compare corresponding elements in each vector, and update the VPR.P0 predicate bits with the results of the comparison. 
VPT also sets the VPR.MASK01 and VPR.MASK23 fields -- it is effectively "VCMP then VPST". Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 32 ++++++++++++++++++++++ target/arm/mve.decode | 18 +++++++++++- target/arm/mve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 47 ++++++++++++++++++++++++++++++++ 4 files changed, 152 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 54b252e98af..e89238eac9d 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -484,3 +484,35 @@ DEF_HELPER_FLAGS_3(mve_uqshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_sqshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_uqrshl, TCG_CALL_NO_RWG, i32, env, i32, i32) DEF_HELPER_FLAGS_3(mve_sqrshr, TCG_CALL_NO_RWG, i32, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpeqb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpeqh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpeqw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpneb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpneh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpnew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpcsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpcsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpcsw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmphib, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmphih, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmphiw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpgeb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgeh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpltb, TCG_CALL_NO_WG, void, 
env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmplth, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpltw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpgtb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgth, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 88c9c18ebf1..76bbf9a6136 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -37,6 +37,7 @@ &2shift qd qm shift size &vidup qd rn size imm &viwdup qd rn rm size imm +&vcmp qm qn size mask @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -86,6 +87,10 @@ @2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \ size=2 shift=%rshift_i5 +# Vector comparison; 4-bit Qm but 3-bit Qn +%mask_22_13 22:1 13:3 +@vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 + # Vector loads and stores # Widening loads and narrowing stores: @@ -345,7 +350,6 @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar } # Predicate operations -%mask_22_13 22:1 13:3 VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 # Logical immediate operations (1 reg and modified-immediate) @@ -458,3 +462,15 @@ VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_b VQRSHRUNT 111 1 1110 1 . ... ... ... 1 1111 1 1 . 0 ... 0 @2_shr_h VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd + +# Comparisons. We expand out the conditions which are split across +# encodings T1, T2, T3 and the fc bits. These include VPT, which is +# effectively "VCMP then VPST". 
A plain "VCMP" has a mask field of zero. +VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp +VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp +VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp +VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp +VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp +VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp +VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 0ef5f5d8871..23398e86f7d 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1762,3 +1762,59 @@ DO_VIDUP_ALL(vidup, DO_ADD) DO_VIDUP_ALL(vddup, DO_SUB) DO_VIWDUP_ALL(viwdup, do_add_wrap) DO_VIWDUP_ALL(vdwdup, do_sub_wrap) + +/* + * Vector comparison. + * P0 bits for non-executed beats (where eci_mask is 0) are unchanged. + * P0 bits for predicated lanes in executed beats (where mask is 0) are 0. + * P0 bits otherwise are updated with the results of the comparisons. + * We must also keep unchanged the MASK fields at the top of v7m.vpr. 
+ */ +#define DO_VCMP(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, void *vm) \ + { \ + TYPE *n = vn, *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + uint16_t beatpred = 0; \ + uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++) { \ + bool r = FN(n[H##ESIZE(e)], m[H##ESIZE(e)]); \ + /* Comparison sets 0/1 bits for each byte in the element */ \ + beatpred |= r * emask; \ + emask <<= ESIZE; \ + } \ + beatpred &= mask; \ + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \ + (beatpred & eci_mask); \ + mve_advance_vpt(env); \ + } + +#define DO_VCMP_S(OP, FN) \ + DO_VCMP(OP##b, 1, int8_t, FN) \ + DO_VCMP(OP##h, 2, int16_t, FN) \ + DO_VCMP(OP##w, 4, int32_t, FN) + +#define DO_VCMP_U(OP, FN) \ + DO_VCMP(OP##b, 1, uint8_t, FN) \ + DO_VCMP(OP##h, 2, uint16_t, FN) \ + DO_VCMP(OP##w, 4, uint32_t, FN) + +#define DO_EQ(N, M) ((N) == (M)) +#define DO_NE(N, M) ((N) != (M)) +#define DO_EQ(N, M) ((N) == (M)) +#define DO_EQ(N, M) ((N) == (M)) +#define DO_GE(N, M) ((N) >= (M)) +#define DO_LT(N, M) ((N) < (M)) +#define DO_GT(N, M) ((N) > (M)) +#define DO_LE(N, M) ((N) <= (M)) + +DO_VCMP_U(vcmpeq, DO_EQ) +DO_VCMP_U(vcmpne, DO_NE) +DO_VCMP_U(vcmpcs, DO_GE) +DO_VCMP_U(vcmphi, DO_GT) +DO_VCMP_S(vcmpge, DO_GE) +DO_VCMP_S(vcmplt, DO_LT) +DO_VCMP_S(vcmpgt, DO_GT) +DO_VCMP_S(vcmple, DO_LE) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index de65a1c3cf1..a7334609e29 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -43,6 +43,7 @@ typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); +typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); /* Return the offset of a Qn 
register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1180,3 +1181,49 @@ static bool trans_VDWDUP(DisasContext *s, arg_viwdup *a) }; return do_viwdup(s, a, fns[a->size]); } + +static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) +{ + TCGv_ptr qn, qm; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) || + !fn) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qn, qm); + tcg_temp_free_ptr(qn); + tcg_temp_free_ptr(qm); + if (a->mask) { + /* VPT */ + gen_vpst(s, a->mask); + } + mve_update_eci(s); + return true; +} + +#define DO_VCMP(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \ + { \ + static MVEGenCmpFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vcmp(s, a, fns[a->size]); \ + } + +DO_VCMP(VCMPEQ, vcmpeq) +DO_VCMP(VCMPNE, vcmpne) +DO_VCMP(VCMPCS, vcmpcs) +DO_VCMP(VCMPHI, vcmphi) +DO_VCMP(VCMPGE, vcmpge) +DO_VCMP(VCMPLT, vcmplt) +DO_VCMP(VCMPGT, vcmpgt) +DO_VCMP(VCMPLE, vcmple) From patchwork Tue Jul 13 13:37:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504628 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=O9kRsFXn; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher 
TVNlqmV+sSBNhdJVXNfediAai6vKXGTkRECcfqPW/uMtnrF6GLa10dQsWb+gbd18FJX6 RNyOc6DaNwSEOTD12ha7GNi+mpyatIUzRq8SAPY7QGIPcjCElymY79XT3rh9LbeDlkQ/ K9OdBtR1/foCnPTTYlnbMd+uxfLQvOE+ov0PQjHw+3dTSiMd2NRIFs9PFK38u9FlbZ89 HQQg== X-Gm-Message-State: AOAM533jil3Z3tLCDrIAKRSK8hl4Sk+lO5niOKMfn/u8mw1mKfoDWKNP T+tQi8ZpqdfblLU3fKY5kt0Yb3eDkVqujOf3 X-Google-Smtp-Source: ABdhPJzO4iXc9BsLBJtQGlhNTb0y4/ttrK+n1enGh29rOlav3r+c8RoyB6g4I0W8OQlv62q81Ni22Q== X-Received: by 2002:a7b:cbda:: with SMTP id n26mr5104371wmi.179.1626183460742; Tue, 13 Jul 2021 06:37:40 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:40 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 15/34] target/arm: Implement MVE integer vector-vs-scalar comparisons Date: Tue, 13 Jul 2021 14:37:07 +0100 Message-Id: <20210713133726.26842-16-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x335.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer vector comparison instructions that compare each element 
against a scalar from a general purpose register. These are "VCMP (vector)" encodings T4, T5 and T6 and "VPT (vector)" encodings T4, T5 and T6. We have to move the decodetree pattern for VPST, because it overlaps with VCMP T4 with size = 0b11. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 32 +++++++++++++++++++++++++++ target/arm/mve.decode | 18 +++++++++++++--- target/arm/mve_helper.c | 44 +++++++++++++++++++++++++++++++------- target/arm/translate-mve.c | 43 +++++++++++++++++++++++++++++++++++++ 4 files changed, 126 insertions(+), 11 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index e89238eac9d..035779b0576 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -516,3 +516,35 @@ DEF_HELPER_FLAGS_3(mve_vcmpgtw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmpleb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmpleh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vcmplew, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpeq_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpne_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpcs_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmphi_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + 
+DEF_HELPER_FLAGS_3(mve_vcmpge_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpge_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpge_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmplt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmpgt_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vcmple_scalarb, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmple_scalarh, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vcmple_scalarw, TCG_CALL_NO_WG, void, env, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 76bbf9a6136..ef708ba80ff 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -38,6 +38,7 @@ &vidup qd rn size imm &viwdup qd rn rm size imm &vcmp qm qn size mask +&vcmp_scalar qn rm size mask @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -90,6 +91,8 @@ # Vector comparison; 4-bit Qm but 3-bit Qn %mask_22_13 22:1 13:3 @vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 +@vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \ + mask=%mask_22_13 # Vector loads and stores @@ -349,9 +352,6 @@ VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar rdahi=%rdahi rdalo=%rdalo } -# Predicate operations -VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 - # Logical immediate operations (1 reg and modified-immediate) # The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but @@ -474,3 +474,15 @@ VCMPGE 1111 1110 0 . .. ... 
1 ... 1 1111 0 0 . 0 ... 0 @vcmp VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp + +{ + VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 + VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar +} +VCMPNE_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 0 0 .... @vcmp_scalar +VCMPCS_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 1 0 .... @vcmp_scalar +VCMPHI_scalar 1111 1110 0 . .. ... 1 ... 0 1111 1 1 1 0 .... @vcmp_scalar +VCMPGE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 0 0 .... @vcmp_scalar +VCMPLT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 0 0 .... @vcmp_scalar +VCMPGT_scalar 1111 1110 0 . .. ... 1 ... 1 1111 0 1 1 0 .... @vcmp_scalar +VCMPLE_scalar 1111 1110 0 . .. ... 1 ... 1 1111 1 1 1 0 .... @vcmp_scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 23398e86f7d..57a92bc6841 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1791,15 +1791,43 @@ DO_VIWDUP_ALL(vdwdup, do_sub_wrap) mve_advance_vpt(env); \ } -#define DO_VCMP_S(OP, FN) \ - DO_VCMP(OP##b, 1, int8_t, FN) \ - DO_VCMP(OP##h, 2, int16_t, FN) \ - DO_VCMP(OP##w, 4, int32_t, FN) +#define DO_VCMP_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + uint32_t rm) \ + { \ + TYPE *n = vn; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + uint16_t beatpred = 0; \ + uint16_t emask = MAKE_64BIT_MASK(0, ESIZE); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++) { \ + bool r = FN(n[H##ESIZE(e)], (TYPE)rm); \ + /* Comparison sets 0/1 bits for each byte in the element */ \ + beatpred |= r * emask; \ + emask <<= ESIZE; \ + } \ + beatpred &= mask; \ + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | \ + (beatpred & eci_mask); \ + mve_advance_vpt(env); \ + } -#define DO_VCMP_U(OP, FN) \ - DO_VCMP(OP##b, 1, uint8_t, FN) \ - 
DO_VCMP(OP##h, 2, uint16_t, FN) \ - DO_VCMP(OP##w, 4, uint32_t, FN) +#define DO_VCMP_S(OP, FN) \ + DO_VCMP(OP##b, 1, int8_t, FN) \ + DO_VCMP(OP##h, 2, int16_t, FN) \ + DO_VCMP(OP##w, 4, int32_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarb, 1, int8_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarh, 2, int16_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarw, 4, int32_t, FN) + +#define DO_VCMP_U(OP, FN) \ + DO_VCMP(OP##b, 1, uint8_t, FN) \ + DO_VCMP(OP##h, 2, uint16_t, FN) \ + DO_VCMP(OP##w, 4, uint32_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarb, 1, uint8_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarh, 2, uint16_t, FN) \ + DO_VCMP_SCALAR(OP##_scalarw, 4, uint32_t, FN) #define DO_EQ(N, M) ((N) == (M)) #define DO_NE(N, M) ((N) != (M)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index a7334609e29..f8b8639eab7 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -44,6 +44,7 @@ typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); +typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1207,6 +1208,37 @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) return true; } +static bool do_vcmp_scalar(DisasContext *s, arg_vcmp_scalar *a, + MVEGenScalarCmpFn *fn) +{ + TCGv_ptr qn; + TCGv_i32 rm; + + if (!dc_isar_feature(aa32_mve, s) || !fn || a->rm == 13) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + if (a->rm == 15) { + /* Encoding Rm=0b1111 means "constant zero" */ + rm = tcg_constant_i32(0); + } else { + rm = load_reg(s, a->rm); + } + fn(cpu_env, qn, rm); + tcg_temp_free_ptr(qn); + tcg_temp_free_i32(rm); + if (a->mask) 
{ + /* VPT */ + gen_vpst(s, a->mask); + } + mve_update_eci(s); + return true; +} + #define DO_VCMP(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_vcmp *a) \ { \ @@ -1217,6 +1249,17 @@ static bool do_vcmp(DisasContext *s, arg_vcmp *a, MVEGenCmpFn *fn) NULL, \ }; \ return do_vcmp(s, a, fns[a->size]); \ + } \ + static bool trans_##INSN##_scalar(DisasContext *s, \ + arg_vcmp_scalar *a) \ + { \ + static MVEGenScalarCmpFn * const fns[] = { \ + gen_helper_mve_##FN##_scalarb, \ + gen_helper_mve_##FN##_scalarh, \ + gen_helper_mve_##FN##_scalarw, \ + NULL, \ + }; \ + return do_vcmp_scalar(s, a, fns[a->size]); \ } DO_VCMP(VCMPEQ, vcmpeq) From patchwork Tue Jul 13 13:37:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504631 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=UCQ8lZvD; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMWh619Lz9sX3 for ; Tue, 13 Jul 2021 23:52:16 +1000 (AEST) Received: from localhost ([::1]:33900 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ipa-0000Tk-HR for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:52:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54364) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
(orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:41 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 16/34] target/arm: Implement MVE VPSEL Date: Tue, 13 Jul 2021 14:37:08 +0100 Message-Id: <20210713133726.26842-17-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::336; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x336.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VPSEL insn, which sets each byte of the destination vector Qd to the byte from either Qn or Qm depending on the value of the corresponding bit in VPR.P0. 
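As a rough illustration of the selection rule described above (not the QEMU helper itself), the per-byte choice can be modelled in plain C. The function name vpsel_bytes and its array-based signature are hypothetical, chosen only for this sketch, and the model deliberately ignores the usual predication of writes to Qd, which still applies to the real instruction.

```c
#include <stdint.h>

/*
 * Illustrative model only, not QEMU code: vpsel_bytes() and its
 * parameters are hypothetical names.
 *
 * A 128-bit MVE vector is treated as 16 bytes, and p0 as the 16-bit
 * VPR.P0 predicate with one bit per byte lane.
 */
static void vpsel_bytes(uint8_t qd[16], const uint8_t qn[16],
                        const uint8_t qm[16], uint16_t p0)
{
    for (int i = 0; i < 16; i++) {
        /* Bit i of p0 set: take byte i from qn, otherwise from qm. */
        qd[i] = ((p0 >> i) & 1) ? qn[i] : qm[i];
    }
}
```

With p0 = 0x00F0, for example, bytes 4 to 7 of the result come from Qn and all other bytes come from Qm.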
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 2 ++ target/arm/mve.decode | 7 +++++-- target/arm/mve_helper.c | 19 +++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 28 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 035779b0576..f1a54aba5d4 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -86,6 +86,8 @@ DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index ef708ba80ff..4bd20a9a319 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -468,8 +468,11 @@ VSHLC 111 0 1110 1 . 1 imm:5 ... 0 1111 1100 rdm:4 qd=%qd # effectively "VCMP then VPST". A plain "VCMP" has a mask field of zero. VCMPEQ 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 0 @vcmp VCMPNE 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 0 @vcmp -VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp -VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +{ + VPSEL 1111 1110 0 . 11 ... 1 ... 0 1111 . 0 . 0 ... 1 @2op_nosz + VCMPCS 1111 1110 0 . .. ... 1 ... 0 1111 0 0 . 0 ... 1 @vcmp + VCMPHI 1111 1110 0 . .. ... 1 ... 0 1111 1 0 . 0 ... 1 @vcmp +} VCMPGE 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 0 @vcmp VCMPLT 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 0 @vcmp VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 
1 @vcmp diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 57a92bc6841..be67e7cea26 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1846,3 +1846,22 @@ DO_VCMP_S(vcmpge, DO_GE) DO_VCMP_S(vcmplt, DO_LT) DO_VCMP_S(vcmpgt, DO_GT) DO_VCMP_S(vcmple, DO_LE) + +void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) +{ + /* + * Qd[n] = VPR.P0[n] ? Qn[n] : Qm[n] + * but note that whether bytes are written to Qd is still subject + * to (all forms of) predication in the usual way. + */ + uint64_t *d = vd, *n = vn, *m = vm; + uint16_t mask = mve_element_mask(env); + uint16_t p0 = FIELD_EX32(env->v7m.vpr, V7M_VPR, P0); + unsigned e; + for (e = 0; e < 16 / 8; e++, mask >>= 8, p0 >>= 8) { + uint64_t r = m[H8(e)]; + mergemask(&r, n[H8(e)], p0); + mergemask(&d[H8(e)], r, mask); + } + mve_advance_vpt(env); +} diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index f8b8639eab7..689e15c069b 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -376,6 +376,8 @@ DO_LOGIC(VORR, gen_helper_mve_vorr) DO_LOGIC(VORN, gen_helper_mve_vorn) DO_LOGIC(VEOR, gen_helper_mve_veor) +DO_LOGIC(VPSEL, gen_helper_mve_vpsel) + #define DO_2OP(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_2op *a) \ { \ From patchwork Tue Jul 13 13:37:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504635 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=q7qSAeuz; 
kLsW8k3A371QAxuMNBU3fJYfn4YfNLnSjsmHJrgPx1ZGalYZz5dmWI5Tb/nqQ48eqN+1 ZsD1O96763S1VVtcZWVSfhZsRv4hm8jTKEPxC88XL4K7TvSbx5bADgwTXUZkvzyqt+jy vCVN1xgZpP0QjlwtfPTN/Gj5MAedPw+aXce/KbDIKijvzrB3Y43q3N2mOX/ajqJFILWr K6z9ocpOtALwaYHlvBLdxJTiTSGdKtTMUJQI6QaNDKOGzSNsQB5+BpKI5oKpryDQOtlb fbnw== X-Gm-Message-State: AOAM53011Os0lGPv4PbaAN+zanvJNcTUuB07uUFA9cupwn4EoO1ApTBW m0eG3XVGlu3MXlrSYX8irJE/M00zK6VUGC01 X-Google-Smtp-Source: ABdhPJwBpBcfzdkER2ZjF8h8WG7P3GDku1dJ4mT55YobYVVyX3C4C61u5aksxQ5ycdkOb+elGZj7eQ== X-Received: by 2002:a5d:6b91:: with SMTP id n17mr5767682wrx.385.1626183462037; Tue, 13 Jul 2021 06:37:42 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:41 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 17/34] target/arm: Implement MVE VMLAS Date: Tue, 13 Jul 2021 14:37:09 +0100 Message-Id: <20210713133726.26842-18-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMLAS insn, which multiplies a vector 
by a vector and adds a scalar. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 31 +++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 44 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index f1a54aba5d4..6f2cc5c2929 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -351,6 +351,14 @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3 DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlassb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlassh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlassw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vmlasub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasuw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 4bd20a9a319..05c30735545 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -345,6 +345,9 @@ VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar +VMLAS_S 1110 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar +VMLAS_U 1111 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar + # Vector add across vector { VADDV 111 u:1 1110 1111 size:2 01 ... 
0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index be67e7cea26..98c3a418dcb 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -955,6 +955,22 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) mve_advance_vpt(env); \ } +/* "accumulating" version where FN takes d as well as n and m */ +#define DO_2OP_ACC_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \ + uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + mergemask(&d[H##ESIZE(e)], \ + FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m), mask); \ + } \ + mve_advance_vpt(env); \ + } + /* provide unsigned 2-op scalar helpers for all sizes */ #define DO_2OP_SCALAR_U(OP, FN) \ DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \ @@ -965,6 +981,15 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) DO_2OP_SCALAR(OP##h, 2, int16_t, FN) \ DO_2OP_SCALAR(OP##w, 4, int32_t, FN) +#define DO_2OP_ACC_SCALAR_U(OP, FN) \ + DO_2OP_ACC_SCALAR(OP##b, 1, uint8_t, FN) \ + DO_2OP_ACC_SCALAR(OP##h, 2, uint16_t, FN) \ + DO_2OP_ACC_SCALAR(OP##w, 4, uint32_t, FN) +#define DO_2OP_ACC_SCALAR_S(OP, FN) \ + DO_2OP_ACC_SCALAR(OP##b, 1, int8_t, FN) \ + DO_2OP_ACC_SCALAR(OP##h, 2, int16_t, FN) \ + DO_2OP_ACC_SCALAR(OP##w, 4, int32_t, FN) + DO_2OP_SCALAR_U(vadd_scalar, DO_ADD) DO_2OP_SCALAR_U(vsub_scalar, DO_SUB) DO_2OP_SCALAR_U(vmul_scalar, DO_MUL) @@ -994,6 +1019,12 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +/* Vector by vector plus scalar */ +#define DO_VMLAS(D, N, M) ((N) * (D) + (M)) + +DO_2OP_ACC_SCALAR_S(vmlass, DO_VMLAS) +DO_2OP_ACC_SCALAR_U(vmlasu, DO_VMLAS) + /* * Long saturating scalar ops. 
As with DO_2OP_L, TYPE and H are for the * input (smaller) type and LESIZE, LTYPE, LH for the output (long) type. diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 689e15c069b..011d1d6bcd9 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -596,6 +596,8 @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar) DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar) DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar) DO_2OP_SCALAR(VBRSR, vbrsr) +DO_2OP_SCALAR(VMLAS_S, vmlass) +DO_2OP_SCALAR(VMLAS_U, vmlasu) static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a) { From patchwork Tue Jul 13 13:37:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504629 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=r7wuhctb; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMTN5TsSz9sX3 for ; Tue, 13 Jul 2021 23:50:16 +1000 (AEST) Received: from localhost ([::1]:58014 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ine-00067J-F6 for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:50:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54414) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibf-00005w-7s for qemu-devel@nongnu.org; 
Tue, 13 Jul 2021 09:37:51 -0400 Received: from mail-wr1-x431.google.com ([2a00:1450:4864:20::431]:40498) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbY-0003iW-G6 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:50 -0400 Received: by mail-wr1-x431.google.com with SMTP id l7so29612517wrv.7 for ; Tue, 13 Jul 2021 06:37:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=kO+xQaMgkBnNM7uo4YEAhjAg0x7cuzywZZ9zDM0T84Q=; b=r7wuhctb0kwV3Uwd6TWwBuiBGBhopnA+gCsAgW+bELynpjeqVC07HLtc6MkhQYHqt+ U72HmEE/Ga86y2R8O5hnODygpNFqVDCcET/rFs7yuRbb9iJQczpbMpy1gUkBQEAvFEDe feQ7CmvzC5LcGqIW/MfI2cR+m7mtE1I0wMxMn1HtaZgO4Nx+T2BKfoCiTK3QDya5q/Xh mHm+NRm6GsUttTKGVWoJ9jnzJQImiKiyckCw932z43OWZrOM3jzHMuIfCG2ys21TJ+Qi GtG0PXjqz+OxdwCmQC/7cs7DM53vFfP3vR2o36jXFCwoFdbZu65mVPxDgbCvnXHe6GjT mlWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kO+xQaMgkBnNM7uo4YEAhjAg0x7cuzywZZ9zDM0T84Q=; b=mXpNbUOcyxEAT996EeQLdh69O8/qtC/7sjmgSVJA0oH1OHSwI4dNl97IInAGLebKU9 ky92nLDXzhIGcMtmlkh6ZqP8LKmUSLaZAo01oCu/zdzWPkayPzJRlTV8s3wAlnpWA2Kj i3CsRABFAuI7dJ9eRDfpqIJKrkCVBDsOOclar7OdffPM9LOfClkFTmKIGp2WqvvfgJsx NyDm1bwI7oI2CieIPhFlV7gt+zlIGJb1RDXE2GmLUW8KxD4iRyl3W1Ra//NgoSXM0G7m VySkGNyxFyw3CrMiEnsNQEZO2C+zj1PkhEvsCsm1IMLiAteyBpxmfg4kqYepSIx0jsnJ dbcA== X-Gm-Message-State: AOAM532tiV8o6Fn33oYCaZQvMkF//wAgWWwwworyxBxXlbBQMJ2g93hJ iMPWLrp1IJqDQf3f9NLOVJKyUw== X-Google-Smtp-Source: ABdhPJxaklezJniilTqqlYfqwF0isZQYkCJQFhmTVwzqjJKnROX/uPA35ZBjIG/pxI3iSAKk4YlktA== X-Received: by 2002:a5d:5989:: with SMTP id n9mr5673625wri.8.1626183462744; Tue, 13 Jul 2021 06:37:42 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:42 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 18/34] target/arm: Implement MVE shift-by-scalar Date: Tue, 13 Jul 2021 14:37:10 +0100 Message-Id: <20210713133726.26842-19-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::431; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x431.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE instructions which perform shifts by a scalar. These are VSHL T2, VRSHL T2, VQSHL T1 and VQRSHL T2. They take the shift amount in a general purpose register and shift every element in the vector by that amount. Mostly we can reuse the helper functions for shift-by-immediate; we do need two new helpers for VQRSHL. 
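As an illustration of the "shift every element in the vector by that amount" behaviour described above, here is a minimal standalone sketch in plain C. It is not QEMU code: the function name model_shl_scalar_u8 and the one-bit-per-byte mask are assumptions made for this example, and only in-range left shifts on byte elements are modelled (rounding, saturation, and the handling of negative or oversized shift counts are deliberately left out).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not the QEMU helper: shift each byte element of a
 * 16-byte vector left by the same scalar amount. Bit i of 'mask' enables
 * the write to byte i, as a simplified stand-in for MVE predication.
 */
static void model_shl_scalar_u8(uint8_t qda[16], unsigned shift, uint16_t mask)
{
    assert(shift < 8);  /* only in-range left shifts are modelled here */
    for (unsigned e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            /* the same scalar shift count is applied to every element */
            qda[e] = (uint8_t)(qda[e] << shift);
        }
    }
}
```

Because the per-element operation has the same shape as shift-by-immediate (one vector operand, one shift count shared by all elements), it seems natural that the patch can pass the register value to the existing immediate-shift helpers.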
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 +++++++ target/arm/mve.decode | 23 ++++++++++++++++--- target/arm/mve_helper.c | 2 ++ target/arm/translate-mve.c | 46 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 76 insertions(+), 3 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 6f2cc5c2929..c702db4c39a 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -422,6 +422,14 @@ DEF_HELPER_FLAGS_4(mve_vrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrshli_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrshli_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vshllbsb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vshllbsh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vshllbub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 05c30735545..1a788e438de 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -39,6 +39,7 @@ &viwdup qd rn rm size imm &vcmp qm qn size mask &vcmp_scalar qn rm size mask +&shl_scalar qda rm size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -88,6 +89,8 @@ @2_shr_w .... .... .. 1 ..... .... .... .... .... &2shift qd=%qd qm=%qm \ size=2 shift=%rshift_i5 +@shl_scalar .... .... .... size:2 .. .... .... .... 
rm:4 &shl_scalar qda=%qd + # Vector comparison; 4-bit Qm but 3-bit Qn %mask_22_13 22:1 13:3 @vcmp .... .... .. size:2 qn:3 . .... .... .... .... &vcmp qm=%qm mask=%mask_22_13 @@ -320,7 +323,23 @@ VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_no VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... @2scalar VSUB_scalar 1110 1110 0 . .. ... 1 ... 1 1111 . 100 .... @2scalar -VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar + +{ + VSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar + VRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar + VQSHL_S_scalar 1110 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar + VQRSHL_S_scalar 1110 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar + VMUL_scalar 1110 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar +} + +{ + VSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 0110 .... @shl_scalar + VRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 0110 .... @shl_scalar + VQSHL_U_scalar 1111 1110 0 . 11 .. 01 ... 1 1110 1110 .... @shl_scalar + VQRSHL_U_scalar 1111 1110 0 . 11 .. 11 ... 1 1110 1110 .... @shl_scalar + VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar +} + VHADD_S_scalar 1110 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar VHADD_U_scalar 1111 1110 0 . .. ... 0 ... 0 1111 . 100 .... @2scalar VHSUB_S_scalar 1110 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar @@ -340,8 +359,6 @@ VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar size=%size_28 } -VBRSR 1111 1110 0 . .. ... 1 ... 1 1110 . 110 .... @2scalar - VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 98c3a418dcb..d44cd80e18b 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1346,6 +1346,8 @@ DO_2SHIFT_SAT_S(vqshli_s, DO_SQSHL_OP) DO_2SHIFT_SAT_S(vqshlui_s, DO_SUQSHL_OP) DO_2SHIFT_U(vrshli_u, DO_VRSHLU) DO_2SHIFT_S(vrshli_s, DO_VRSHLS) +DO_2SHIFT_SAT_U(vqrshli_u, DO_UQRSHL_OP) +DO_2SHIFT_SAT_S(vqrshli_s, DO_SQRSHL_OP) /* Shift-and-insert; we always work with 64 bits at a time */ #define DO_2SHIFT_INSERT(OP, ESIZE, SHIFTFN, MASKFN) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 011d1d6bcd9..650d1470f08 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1004,6 +1004,52 @@ DO_2SHIFT(VRSHRI_U, vrshli_u, true) DO_2SHIFT(VSRI, vsri, false) DO_2SHIFT(VSLI, vsli, false) +static bool do_2shift_scalar(DisasContext *s, arg_shl_scalar *a, + MVEGenTwoOpShiftFn *fn) +{ + TCGv_ptr qda; + TCGv_i32 rm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qda) || + a->rm == 13 || a->rm == 15 || !fn) { + /* Rm cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qda = mve_qreg_ptr(a->qda); + rm = load_reg(s, a->rm); + fn(cpu_env, qda, qda, rm); + tcg_temp_free_ptr(qda); + tcg_temp_free_i32(rm); + mve_update_eci(s); + return true; +} + +#define DO_2SHIFT_SCALAR(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_shl_scalar *a) \ + { \ + static MVEGenTwoOpShiftFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_2shift_scalar(s, a, fns[a->size]); \ + } + +DO_2SHIFT_SCALAR(VSHL_S_scalar, vshli_s) +DO_2SHIFT_SCALAR(VSHL_U_scalar, vshli_u) +DO_2SHIFT_SCALAR(VRSHL_S_scalar, vrshli_s) +DO_2SHIFT_SCALAR(VRSHL_U_scalar, vrshli_u) +DO_2SHIFT_SCALAR(VQSHL_S_scalar, vqshli_s) +DO_2SHIFT_SCALAR(VQSHL_U_scalar, vqshli_u) +DO_2SHIFT_SCALAR(VQRSHL_S_scalar, vqrshli_s) 
+DO_2SHIFT_SCALAR(VQRSHL_U_scalar, vqrshli_u) + #define DO_VSHLL(INSN, FN) \ static bool trans_##INSN(DisasContext *s, arg_2shift *a) \ { \ From patchwork Tue Jul 13 13:37:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504627 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=KCvcGlFR; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMRm5Ktdz9sWd for ; Tue, 13 Jul 2021 23:48:51 +1000 (AEST) Received: from localhost ([::1]:53456 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3ImH-00035G-Gs for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:48:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54420) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibf-000063-El for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:51 -0400 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]:42511) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbY-0003jG-K1 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:51 -0400 Received: by mail-wr1-x429.google.com with SMTP id r11so25256366wro.9 for ; Tue, 13 Jul 2021 06:37:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; 
s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=+Lu7dJm1TE887g0x9UEBakZnKs1VKB2KmcyMjuqzDNs=; b=KCvcGlFRz8nVXWUDiOiiwluzhitAUgKzD+XXpntIwBfbofFxCkYpQBNkJU8L+udvnL OSX7R+iDIVQkreLrNMpraRtbnffko1QixSpENw524P2dpWEE+FSAyf+wBURO/B7GiIo3 t8tTlMXMBwQAuYSz6NcQ93x1Pdpdd4XAJiEjKLEU8k/pgNVSn/TmElW3UcrRkpTQe/fq RrXobLUtgMgLqRijqUcoQJnf9mPeuQCVb7q6i17F2d81Bz6SUwi4ff1NiBDWKoEDgDsV 6dhbwOm5AcVr+LDopDWTmo1KbsKyicE2Xu6L3m5c7nDYUuVzbskvnp2FgOgr+fWuyylF Dc7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+Lu7dJm1TE887g0x9UEBakZnKs1VKB2KmcyMjuqzDNs=; b=iqacDeezAH/QiBYcLr6cH3vvuwHYpb114PaDAgJmKagO6IPhBk8O7q0uCRl+29tbjG J98cZPxhcDuvJpOQWlOI1Ry0u2WK5pxKCOOLH1/ntir+htedBGClP0meXVKCmEqAqYgA 0bzOqUI0fm3VlWbRGGfbSLhZUi7F0aPxu/hGHLKKaS6AM4r0k6o7o7kyoCX6Xfu2ArvT j2c4UtbewQT53GPdQPbu8OKH84uVKeWqDZ7HdcgW04FWErOoWULRrIWZWOfL0bnaiv63 3eaHQmdVrLDxxxS+3FN9SB5io+9qWIAIQ/URzP8Z5njJguvOG3Z0EgkUNBCT7bTicr++ halA== X-Gm-Message-State: AOAM533QoRNdVMhnIOJBItAe8Qj51VFs2c2Ad76ILSgdn4qWkBIqWaoL t3vpDfBhzfd/ZqfPuMSA0cKuE0vIv5AMFTqA X-Google-Smtp-Source: ABdhPJwAtnQZK1tG/8roIHmMtmdGDr+tGil9wxCauhBM4fsjQPD7xC992bFrXmbqp+CwmgZ+bKgOuQ== X-Received: by 2002:adf:f9cb:: with SMTP id w11mr5863322wrr.57.1626183463369; Tue, 13 Jul 2021 06:37:43 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:43 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 19/34] target/arm: Move 'x' and 'a' bit definitions into vmlaldav formats Date: Tue, 13 Jul 2021 14:37:11 +0100 Message-Id: <20210713133726.26842-20-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::429; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x429.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" All the users of the vmlaldav formats have an 'x' bit in bit 12 and an 'a' bit in bit 5; move these to the format rather than specifying them in each insn pattern. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- Not sure why I didn't write it this way in the first place; when I came to implement VMLADAV I noticed the oddity and preferred to fix it rather than either copying it for VMLADAV or having VMLADAV different. 
--- target/arm/mve.decode | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 1a788e438de..67bd894daf1 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -305,19 +305,19 @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 &vmlaldav rdahi rdalo size qn qm x a -@vmlaldav .... .... . ... ... . ... . .... .... qm:3 . \ +@vmlaldav .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav -@vmlaldav_nosz .... .... . ... ... . ... . .... .... qm:3 . \ +@vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav -VMLALDAV_S 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav -VMLALDAV_U 1111 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 0 @vmlaldav +VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav -VMLSLDAV 1110 1110 1 ... ... . ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav +VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav -VRMLALDAVH_S 1110 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz -VRMLALDAVH_U 1111 1110 1 ... ... 0 ... x:1 1111 . 0 a:1 0 ... 0 @vmlaldav_nosz +VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz -VRMLSLDAVH 1111 1110 1 ... ... 0 ... x:1 1110 . 0 a:1 0 ... 1 @vmlaldav_nosz +VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 
1 @vmlaldav_nosz # Scalar operations From patchwork Tue Jul 13 13:37:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504646 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=S7RxtI6r; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMjP4Vcrz9sX3 for ; Wed, 14 Jul 2021 00:00:40 +1000 (AEST) Received: from localhost ([::1]:37588 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ixh-00053j-09 for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 10:00:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54482) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibh-00009d-Ml for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:53 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]:36510) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3IbZ-0003jd-Hj for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:53 -0400 Received: by mail-wr1-x42e.google.com with SMTP id v5so30525874wrt.3 for ; Tue, 13 Jul 2021 06:37:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version 
:content-transfer-encoding; bh=Ja0q62vN/Key46fFCtdomZn1IL1h348ETeYRWt5fgQE=; b=S7RxtI6rXRoVAAagOXqdsW8hhsuULLD52tIcR2e2gKmMy03F/filRbQqT5L3/1BfDs vs5J7dkkrsZTsBZr2PXW9ehjIkIMZoo3EVTgRvINAHXS0z5B3Nl1/jEc5qtC+ptb6Mh0 BCLkB8UTQrzHOqAZnnFNeZW6hBtX8bbyvqqyT9gc5w8b0Kkqzack0UtWopT4qDYX/bGG XsNKeLYuWxzfQ7b06LhQq2OzzmyNfqFgee15qDuEvh7/dkBQMfUt7ZqlUrvi4mizYsDj ZA/FvXOf/JCKSTcIl/SDPXHCrSI9OqOTRRRdIBXdkVBwdp4SZF4kl7tNf8e4GrCfWzzH h1mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ja0q62vN/Key46fFCtdomZn1IL1h348ETeYRWt5fgQE=; b=F+UYWEl9Be4l2ODumgJr0jyVUk5zGjs1rl9hBpSAM5qlS4yEXTqSnPOpKMQQ9eVCSb d2ip4+QOnt7aRH8co2GKyqG41QDbUN+6G1vc0cLiT/J2I8lv158/4i815WBRyXsn0IUs vvQu4sk8cYlu0bpmQNCkXtTGaDOKiPtb2dU+wH+ES+VaqDNfGRHb2gBGdQYtobb5kpCT UxGXTE8/NqU7hcWBtHTKLPX1hU0ncl4vdBwWUBpBt38Z/Qdaa5R//1I5lxE+vZ4qjbye xrxgzVeNp4XcL8Ub9iBgQDGecgp/BtfMMVhr60UqyD36u5Z9ZM+9K4/K9G1uC++fQV93 aVCw== X-Gm-Message-State: AOAM530A+lsrXRWVAB8OF4uUZotH/XtM1ALFJT+vi6FSJ6xZC9CA7c+p eI2XJEtXrk0uQ5KacIIUhQ3ppshAnCi2xfQf X-Google-Smtp-Source: ABdhPJxOW8UbOcUuv19B6RJ6MrDH/mB4SvLWaWE/FsdMbU6dFXgqsmCzJarU4o/3yhC3dDU1RYEkAg== X-Received: by 2002:a5d:408d:: with SMTP id o13mr1572605wrp.246.1626183464282; Tue, 13 Jul 2021 06:37:44 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:43 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 20/34] target/arm: Implement MVE integer min/max across vector Date: Tue, 13 Jul 2021 14:37:12 +0100 Message-Id: <20210713133726.26842-21-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE integer min/max across vector insns VMAXV, VMINV, VMAXAV and VMINAV, which find the maximum or minimum of the vector elements and a general purpose register, and store the result back into the general purpose register. These insns overlap with VRMLALDAVH (they use what would be RdaHi=0b110). 
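To make the across-vector semantics concrete, the following is a small standalone sketch in plain C for the byte-element case. It is not the QEMU helper: the names model_vmaxv_s8 and model_vmaxav_s8 and the one-bit-per-byte mask are assumptions made for this example. It follows the approach of the patch in reading the general purpose register at element width and folding only the active elements into the running value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a signed byte max-across-vector: start from the
 * low byte of 'rda' (sign-extended) and fold in each active element.
 * Bit i of 'mask' marks byte element i as active.
 */
static uint32_t model_vmaxv_s8(const int8_t qm[16], uint32_t rda, uint16_t mask)
{
    int64_t ra = (int8_t)(uint8_t)rda;  /* element-width read of the register */
    for (unsigned e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            ra = ra > qm[e] ? ra : qm[e];
        }
    }
    return (uint32_t)ra;
}

/*
 * Hypothetical model of the absolute-value variant: the register input is
 * treated as unsigned, the vector elements as signed, and the comparison
 * uses the absolute value of each active element.
 */
static uint32_t model_vmaxav_s8(const int8_t qm[16], uint32_t rda, uint16_t mask)
{
    int64_t ra = (uint8_t)rda;
    for (unsigned e = 0; e < 16; e++, mask >>= 1) {
        if (mask & 1) {
            int64_t m = qm[e] < 0 ? -(int64_t)qm[e] : qm[e];
            ra = ra > m ? ra : m;
        }
    }
    return (uint32_t)ra;
}
```

Working in int64_t, as the patch does, means the absolute value of the most negative element is representable without overflow; inactive elements simply leave the running value untouched.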
Signed-off-by: Peter Maydell --- target/arm/helper-mve.h | 20 +++++++++++ target/arm/mve.decode | 18 ++++++++-- target/arm/mve_helper.c | 69 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 48 ++++++++++++++++++++++++++ 4 files changed, 153 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index c702db4c39a..282bfe80942 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -387,6 +387,26 @@ DEF_HELPER_FLAGS_3(mve_vaddvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvub, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vmaxavw, TCG_CALL_NO_WG, i32, env, ptr, i32) + +DEF_HELPER_FLAGS_3(mve_vminvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvsw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvub, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvuh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminvuw, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminavb, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminavh, TCG_CALL_NO_WG, i32, env, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64) 
DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 67bd894daf1..9ae417b718a 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -40,6 +40,7 @@ &vcmp qm qn size mask &vcmp_scalar qn rm size mask &shl_scalar qda rm size +&vmaxv qm rda size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -97,6 +98,8 @@ @vcmp_scalar .... .... .. size:2 qn:3 . .... .... .... rm:4 &vcmp_scalar \ mask=%mask_22_13 +@vmaxv .... .... .... size:2 .. rda:4 .... .... .... &vmaxv qm=%qm + # Vector loads and stores # Widening loads and narrowing stores: @@ -314,8 +317,19 @@ VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav -VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz -VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +{ + VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv + VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +} + +{ + VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv + VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz +} VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index d44cd80e18b..5066ee3169a 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1266,6 +1266,75 @@ DO_VADDV(vaddvub, 1, uint8_t) DO_VADDV(vaddvuh, 2, uint16_t) DO_VADDV(vaddvuw, 4, uint32_t) +/* + * Vector max/min across vector. 
Unlike VADDV, we must + * read ra as the element size, not its full width. + * We work with int64_t internally for simplicity. + */ +#define DO_VMAXMINV(OP, ESIZE, TYPE, RATYPE, FN) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ + uint32_t ra_in) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *m = vm; \ + int64_t ra = (RATYPE)ra_in; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + ra = FN(ra, m[H##ESIZE(e)]); \ + } \ + } \ + mve_advance_vpt(env); \ + return ra; \ + } \ + +#define DO_VMAXMINV_U(INSN, FN) \ + DO_VMAXMINV(INSN##b, 1, uint8_t, uint8_t, FN) \ + DO_VMAXMINV(INSN##h, 2, uint16_t, uint16_t, FN) \ + DO_VMAXMINV(INSN##w, 4, uint32_t, uint32_t, FN) +#define DO_VMAXMINV_S(INSN, FN) \ + DO_VMAXMINV(INSN##b, 1, int8_t, int8_t, FN) \ + DO_VMAXMINV(INSN##h, 2, int16_t, int16_t, FN) \ + DO_VMAXMINV(INSN##w, 4, int32_t, int32_t, FN) + +/* Max and min of absolute values */ +static int64_t do_maxa(int64_t n, int64_t m) +{ + if (n < 0) { + n = -n; + } + if (m < 0) { + m = -m; + } + return MAX(n, m); +} + +static int64_t do_mina(int64_t n, int64_t m) +{ + if (n < 0) { + n = -n; + } + if (m < 0) { + m = -m; + } + return MIN(n, m); +} + +DO_VMAXMINV_S(vmaxvs, DO_MAX) +DO_VMAXMINV_U(vmaxvu, DO_MAX) +DO_VMAXMINV_S(vminvs, DO_MIN) +DO_VMAXMINV_U(vminvu, DO_MIN) +/* + * VMAXAV, VMINAV treat the general purpose input as unsigned + * and the vector elements as signed. 
+ */ +DO_VMAXMINV(vmaxavb, 1, int8_t, uint8_t, do_maxa) +DO_VMAXMINV(vmaxavh, 2, int16_t, uint16_t, do_maxa) +DO_VMAXMINV(vmaxavw, 4, int32_t, uint32_t, do_maxa) +DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina) +DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina) +DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina) + #define DO_VADDLV(OP, TYPE, LTYPE) \ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ uint64_t ra) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 650d1470f08..949c11344e3 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1320,3 +1320,51 @@ DO_VCMP(VCMPGE, vcmpge) DO_VCMP(VCMPLT, vcmplt) DO_VCMP(VCMPGT, vcmpgt) DO_VCMP(VCMPLE, vcmple) + +static bool do_vmaxv(DisasContext *s, arg_vmaxv *a, MVEGenVADDVFn fn) +{ + /* + * MIN/MAX operations across a vector: compute the min or + * max of the initial value in a general purpose register + * and all the elements in the vector, and store it back + * into the general purpose register. 
+ */ + TCGv_ptr qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qm) || + !fn || a->rda == 13 || a->rda == 15) { + /* Rda cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qm = mve_qreg_ptr(a->qm); + rda = load_reg(s, a->rda); + fn(rda, cpu_env, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qm); + mve_update_eci(s); + return true; +} + +#define DO_VMAXV(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vmaxv *a) \ + { \ + static MVEGenVADDVFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vmaxv(s, a, fns[a->size]); \ + } + +DO_VMAXV(VMAXV_S, vmaxvs) +DO_VMAXV(VMAXV_U, vmaxvu) +DO_VMAXV(VMAXAV, vmaxav) +DO_VMAXV(VMINV_S, vminvs) +DO_VMAXV(VMINV_U, vminvu) +DO_VMAXV(VMINAV, vminav) From patchwork Tue Jul 13 13:37:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504640 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=TPeGk2Jg; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMf04LS6z9sXL for ; Tue, 13 Jul 2021 23:57:44 +1000 (AEST) Received: from localhost ([::1]:55724 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) 
(envelope-from ) id 1m3Ius-0006mp-8U for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:57:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54494) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibi-0000BX-7m for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:54 -0400 Received: from mail-wr1-x434.google.com ([2a00:1450:4864:20::434]:37704) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Iba-0003jx-A4 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:53 -0400 Received: by mail-wr1-x434.google.com with SMTP id i94so30501455wri.4 for ; Tue, 13 Jul 2021 06:37:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=Yzz6rh//yD/VixxG+zqhbi9VBeG09WvN+qezZDNmpTU=; b=TPeGk2JgNaGx4LlK4oLljFJb0U2STMJ5SWQ8xQ23Anjulh39bQuWhTdvL60okrlRO7 2F0g3V/6sr9pU+sDAIs+Qh54hV4qieSDYMyzckEU4zP7H9rpbu0KaG2KwuQlTPHlnn6P RJAlrsayk/CSLWmPM1v8+aufSda+JQ0hwx+8jFm1CaGRdx+584/cTOoAtYrpmTXy7rdp SxVcMuv9YzCHS4l++7tF8xG+8KsZxHBdMOpMMD2MSqa4ZMXGSYOM/xEqS1BIevKuyoXA 4+ndb7EDcu5djmjH7MTf0BLn/vRGdiN22EZ8yeeemB9nbXXHKY0BLuXKLNRtQDorRLUG WOtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Yzz6rh//yD/VixxG+zqhbi9VBeG09WvN+qezZDNmpTU=; b=ooWiE/sePy5bzO0ad9BU6cTZI5Qpx0uTIFO3cBsgwi7EFhDjPgLW0e8CBDrScTJzfA tYX8NVe1p3Sz4xn1jVCM8Owdr2Ln0gNtD/pInmfMdo+MR02x++BGJ1CZhWqLovlJn6HK 1iFsCW1HrUaBFu0VXj3sk5q8TtCglNB5eIR0Ph515d8yehdOlwddFQhxywaGYINL2qGP wPv4efums/i4UEQJtOogWvRy6xv2xhtPyrWq8X8HVDFKgbzAyqFHsPfGTXj+67HldZe5 lfFl/mlHb4cpJZByZ0JgtY8fsWKVvy1x0TB+LBnIYYHeWi981oH/a3nMNOSqJT0KM/Kd I1wA== X-Gm-Message-State: AOAM531Doyf9hpNmeC990C/Dm6ZRdqWS8iOOSrJhyuBsTeiLWDLJiglj 
AkK5d2UuWS+pcnVJM0HMv4GLdGK0BWbXoe8q X-Google-Smtp-Source: ABdhPJzMmLbGmsKFdcyeFz+NloarhVJ00yPqR2s6isWnRGTSSaZoCTJXqRGDfrgazasEQb4iUlZk6Q== X-Received: by 2002:a5d:48ce:: with SMTP id p14mr5918291wrs.170.1626183464987; Tue, 13 Jul 2021 06:37:44 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:44 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 21/34] target/arm: Implement MVE VABAV Date: Tue, 13 Jul 2021 14:37:13 +0100 Message-Id: <20210713133726.26842-22-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::434; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VABAV insn, which computes absolute differences between elements of two vectors and accumulates the result into a general purpose register. 
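
For review purposes, the accumulate step described above can be sketched in plain C roughly as follows. This is a minimal standalone model and not the helper added by this patch: the name vabav_s8_model, the nelem parameter and the one-bit-per-element mask are assumptions made for illustration only (the helper in the patch takes its mask from mve_element_mask(), which carries one bit per byte and is shifted by the element size).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical standalone model of the signed-byte VABAV accumulate
 * step (illustration only, not QEMU code).
 *
 * For each element whose predicate bit is set, add |n[e] - m[e]| to
 * the 32-bit accumulator ra. Here 'mask' holds one bit per element,
 * which is a simplification assumed for this sketch.
 */
static uint32_t vabav_s8_model(const int8_t *n, const int8_t *m,
                               size_t nelem, uint16_t mask, uint32_t ra)
{
    for (size_t e = 0; e < nelem; e++, mask >>= 1) {
        if (mask & 1) {
            /* Widen first so the subtraction cannot overflow. */
            int32_t n0 = n[e];
            int32_t m0 = m[e];
            uint32_t r = (uint32_t)(n0 >= m0 ? n0 - m0 : m0 - n0);
            /* Unsigned addition wraps modulo 2^32. */
            ra += r;
        }
    }
    return ra;
}
```

Elements whose predicate bit is clear leave the accumulator untouched, and the accumulation wraps at 32 bits because ra is unsigned.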
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 7 +++++++ target/arm/mve.decode | 6 ++++++ target/arm/mve_helper.c | 26 +++++++++++++++++++++++ target/arm/translate-mve.c | 43 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 82 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 282bfe80942..5c3f8a26df0 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -410,6 +410,13 @@ DEF_HELPER_FLAGS_3(mve_vminavw, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddlv_s, TCG_CALL_NO_WG, i64, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vaddlv_u, TCG_CALL_NO_WG, i64, env, ptr, i64) +DEF_HELPER_FLAGS_4(mve_vabavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vabavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vmovi, TCG_CALL_NO_WG, void, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vandi, TCG_CALL_NO_WG, void, env, ptr, i64) DEF_HELPER_FLAGS_3(mve_vorri, TCG_CALL_NO_WG, void, env, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 9ae417b718a..bf6cf6f8383 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -41,6 +41,7 @@ &vcmp_scalar qn rm size mask &shl_scalar qda rm size &vmaxv qm rda size +&vabav qn qm rda size @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @@ -386,6 +387,11 @@ VMLAS_U 1111 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar rdahi=%rdahi rdalo=%rdalo } +@vabav .... .... .. size:2 .... rda:4 .... .... .... &vabav qn=%qn qm=%qm + +VABAV_S 111 0 1110 10 .. ... 0 .... 1111 . 0 . 0 ... 1 @vabav +VABAV_U 111 1 1110 10 .. 
... 0 .... 1111 . 0 . 0 ... 1 @vabav + # Logical immediate operations (1 reg and modified-immediate) # The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 5066ee3169a..4eb5dbce6d7 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1335,6 +1335,32 @@ DO_VMAXMINV(vminavb, 1, int8_t, uint8_t, do_mina) DO_VMAXMINV(vminavh, 2, int16_t, uint16_t, do_mina) DO_VMAXMINV(vminavw, 4, int32_t, uint32_t, do_mina) +#define DO_VABAV(OP, ESIZE, TYPE) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + void *vm, uint32_t ra) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *m = vm, *n = vn; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + int64_t n0 = n[H##ESIZE(e)]; \ + int64_t m0 = m[H##ESIZE(e)]; \ + uint32_t r = n0 >= m0 ? (n0 - m0) : (m0 - n0); \ + ra += r; \ + } \ + } \ + mve_advance_vpt(env); \ + return ra; \ + } + +DO_VABAV(vabavsb, 1, int8_t) +DO_VABAV(vabavsh, 2, int16_t) +DO_VABAV(vabavsw, 4, int32_t) +DO_VABAV(vabavub, 1, uint8_t) +DO_VABAV(vabavuh, 2, uint16_t) +DO_VABAV(vabavuw, 4, uint32_t) + #define DO_VADDLV(OP, TYPE, LTYPE) \ uint64_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vm, \ uint64_t ra) \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 949c11344e3..c304b8d6e41 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -45,6 +45,7 @@ typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -1368,3 +1369,45 @@ 
DO_VMAXV(VMAXAV, vmaxav) DO_VMAXV(VMINV_S, vminvs) DO_VMAXV(VMINV_U, vminvu) DO_VMAXV(VMINAV, vminav) + +static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn) +{ + /* Absolute difference accumulated across vector */ + TCGv_ptr qn, qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qm | a->qn) || + !fn || a->rda == 13 || a->rda == 15) { + /* Rda cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qm = mve_qreg_ptr(a->qm); + qn = mve_qreg_ptr(a->qn); + rda = load_reg(s, a->rda); + fn(rda, cpu_env, qn, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qm); + tcg_temp_free_ptr(qn); + mve_update_eci(s); + return true; +} + +#define DO_VABAV(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vabav *a) \ + { \ + static MVEGenVABAVFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + gen_helper_mve_##FN##w, \ + NULL, \ + }; \ + return do_vabav(s, a, fns[a->size]); \ + } + +DO_VABAV(VABAV_S, vabavs) +DO_VABAV(VABAV_U, vabavu) From patchwork Tue Jul 13 13:37:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504636 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=fyL9kYmL; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org 
(Postfix) with ESMTPS id 4GPMc91cCVz9sXN for ; Tue, 13 Jul 2021 23:56:09 +1000 (AEST) Received: from localhost ([::1]:48148 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3ItK-0001Tq-MT for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:56:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54522) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibi-0000Du-Rq for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:56 -0400 Received: from mail-wm1-x335.google.com ([2a00:1450:4864:20::335]:41673) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibb-0003ka-Mh for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:54 -0400 Received: by mail-wm1-x335.google.com with SMTP id a5-20020a7bc1c50000b02901e3bbe0939bso2405383wmj.0 for ; Tue, 13 Jul 2021 06:37:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=gzlJnyMrAHKkx+BzpZ7aFiX7hVIlIkOtXpndBsdqVDU=; b=fyL9kYmLHS0panODdQJ1HznzgaPj4CvG7lLCY19bQYtXRRACR9rmf2A8ekIP9RCsfa gjF3PHapA1fR4gnzGbI0gC/iNKa0L8lAfpIS3LS9nFZ/Q9CYUDBYoNFyMlcdsqTTy+sC xM209QduJaeuzil/hVx2ma+veMOshY9twm1Em3/KYkgtVVYlbaPN9bd/m9HIwkxZICCI RAh8ko4mGJbYGJmSAsCM8Y5fTE1PStMKqVDsRZDi2OgV5SLfJLqTE3moBOcJfnuT2slB JFKkVoY/cH/DY0VcAL6MA8W6/V4MEvubfogZslEY6U2FOr5bpCoeknauYx4YE7JSwssH N0/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gzlJnyMrAHKkx+BzpZ7aFiX7hVIlIkOtXpndBsdqVDU=; b=dYkUCvDdlg0JL5YFvnzVOj1Vd7GxC+20IA1uB7U+GMxyue+k0flrMrLy9xQwIJR1k1 rSHwVLMjxuQ2dof5Hr0o/AjEkNwSV1AxJTKYltmkLQZT5h+xmJUk2W/UNnZJUX2TS8Fh YeJ+4Xv+idskSkXtIx8Z0UirzqJfswy/G46UH/MBH/rpUdPuIIwXA9qGEYQxpIOJltGd 
TxMzamcrtaJPBPxryub9ZVFTzZoX4d/N/A9JDb1FtJAbw5vFK4HgeeAx1fXuFWhtyZZ2 h7D7BGv3MtCCC2KSmS4UpkbEymDkuzjWkZcgXsEm0jwJ+BUKs0dWS1u9ynGSp6odvtqh scRg== X-Gm-Message-State: AOAM530dW++eGkwFBrYhycH+mdUEfnKzhgle4tQCuXbI+bU7it8c7zT/ oyQa06WwL/EWOt3aCr5ZSN1rEQ== X-Google-Smtp-Source: ABdhPJxaw7QMCVu1IWGdNAZbPV7I/CECIddoXfhfX7lYWAjlBTacY3QR48EcOoYhcUGTGGzeTC1hhg== X-Received: by 2002:a05:600c:22d2:: with SMTP id 18mr5135904wmg.63.1626183465698; Tue, 13 Jul 2021 06:37:45 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:45 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 22/34] target/arm: Implement MVE narrowing moves Date: Tue, 13 Jul 2021 14:37:14 +0100 Message-Id: <20210713133726.26842-23-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x335.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE narrowing move insns VMOVN, VQMOVN and VQMOVUN. 
These take a double-width input, narrow it (possibly saturating) and store the result to either the top or bottom half of the output element. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 20 ++++++++++ target/arm/mve.decode | 12 ++++++ target/arm/mve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 22 +++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 5c3f8a26df0..84aa9de6e06 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -80,6 +80,26 @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmovnth, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovunbb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovunbh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovuntb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovunth, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovnbsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovnbsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntsh, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqmovnbub, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovnbuh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntub, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqmovntuh, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_4(mve_vand, 
TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vbic, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vorr, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index bf6cf6f8383..79c529e762f 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -153,6 +153,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BS 111 0 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h + VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op + VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -160,6 +163,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_BU 111 1 1110 0 . 11 .. 01 ... 0 1110 0 0 . 0 ... 1 @2_shll_esize_h + VMOVNB 111 1 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op + VQMOVN_BU 111 1 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMULH_U 111 1 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -167,6 +173,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TS 111 0 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h + VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op + VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op } @@ -174,6 +183,9 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_b VSHLL_TU 111 1 1110 0 . 11 .. 01 ... 1 1110 0 0 . 0 ... 1 @2_shll_esize_h + VMOVNT 111 1 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op + VQMOVN_TU 111 1 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VRMULH_U 111 1 1110 0 . .. ...1 ... 
1 1110 . 0 . 0 ... 1 @2op } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 4eb5dbce6d7..725fe64a348 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1668,6 +1668,84 @@ DO_VSHRN_SAT_UH(vqrshrnb_uh, vqrshrnt_uh, DO_RSHRN_UH) DO_VSHRN_SAT_SB(vqrshrunbb, vqrshruntb, DO_RSHRUN_B) DO_VSHRN_SAT_SH(vqrshrunbh, vqrshrunth, DO_RSHRUN_H) +#define DO_VMOVN(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + LTYPE *m = vm; \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + unsigned le; \ + mask >>= ESIZE * TOP; \ + for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ + mergemask(&d[H##ESIZE(le * 2 + TOP)], \ + m[H##LESIZE(le)], mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VMOVN(vmovnbb, false, 1, uint8_t, 2, uint16_t) +DO_VMOVN(vmovnbh, false, 2, uint16_t, 4, uint32_t) +DO_VMOVN(vmovntb, true, 1, uint8_t, 2, uint16_t) +DO_VMOVN(vmovnth, true, 2, uint16_t, 4, uint32_t) + +#define DO_VMOVN_SAT(OP, TOP, ESIZE, TYPE, LESIZE, LTYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + LTYPE *m = vm; \ + TYPE *d = vd; \ + uint16_t mask = mve_element_mask(env); \ + bool qc = false; \ + unsigned le; \ + mask >>= ESIZE * TOP; \ + for (le = 0; le < 16 / LESIZE; le++, mask >>= LESIZE) { \ + bool sat = false; \ + TYPE r = FN(m[H##LESIZE(le)], &sat); \ + mergemask(&d[H##ESIZE(le * 2 + TOP)], r, mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VMOVN_SAT_UB(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 1, uint8_t, 2, uint16_t, FN) \ + DO_VMOVN_SAT(TOP, true, 1, uint8_t, 2, uint16_t, FN) + +#define DO_VMOVN_SAT_UH(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 2, uint16_t, 4, uint32_t, FN) \ + DO_VMOVN_SAT(TOP, true, 2, uint16_t, 4, uint32_t, FN) + +#define DO_VMOVN_SAT_SB(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 1, int8_t, 2, int16_t, FN) \ + DO_VMOVN_SAT(TOP, true, 
1, int8_t, 2, int16_t, FN) + +#define DO_VMOVN_SAT_SH(BOP, TOP, FN) \ + DO_VMOVN_SAT(BOP, false, 2, int16_t, 4, int32_t, FN) \ + DO_VMOVN_SAT(TOP, true, 2, int16_t, 4, int32_t, FN) + +#define DO_VQMOVN_SB(N, SATP) \ + do_sat_bhs((int64_t)(N), INT8_MIN, INT8_MAX, SATP) +#define DO_VQMOVN_UB(N, SATP) \ + do_sat_bhs((uint64_t)(N), 0, UINT8_MAX, SATP) +#define DO_VQMOVUN_B(N, SATP) \ + do_sat_bhs((int64_t)(N), 0, UINT8_MAX, SATP) + +#define DO_VQMOVN_SH(N, SATP) \ + do_sat_bhs((int64_t)(N), INT16_MIN, INT16_MAX, SATP) +#define DO_VQMOVN_UH(N, SATP) \ + do_sat_bhs((uint64_t)(N), 0, UINT16_MAX, SATP) +#define DO_VQMOVUN_H(N, SATP) \ + do_sat_bhs((int64_t)(N), 0, UINT16_MAX, SATP) + +DO_VMOVN_SAT_SB(vqmovnbsb, vqmovntsb, DO_VQMOVN_SB) +DO_VMOVN_SAT_SH(vqmovnbsh, vqmovntsh, DO_VQMOVN_SH) +DO_VMOVN_SAT_UB(vqmovnbub, vqmovntub, DO_VQMOVN_UB) +DO_VMOVN_SAT_UH(vqmovnbuh, vqmovntuh, DO_VQMOVN_UH) +DO_VMOVN_SAT_SB(vqmovunbb, vqmovuntb, DO_VQMOVUN_B) +DO_VMOVN_SAT_SH(vqmovunbh, vqmovunth, DO_VQMOVUN_H) + uint32_t HELPER(mve_vshlc)(CPUARMState *env, void *vd, uint32_t rdm, uint32_t shift) { diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index c304b8d6e41..ba5b7809b09 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -275,6 +275,28 @@ DO_1OP(VCLS, vcls) DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) +/* Narrowing moves: only size 0 and 1 are valid */ +#define DO_VMOVN(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_1op *a) \ + { \ + static MVEGenOneOpFn * const fns[] = { \ + gen_helper_mve_##FN##b, \ + gen_helper_mve_##FN##h, \ + NULL, \ + NULL, \ + }; \ + return do_1op(s, a, fns[a->size]); \ + } + +DO_VMOVN(VMOVNB, vmovnb) +DO_VMOVN(VMOVNT, vmovnt) +DO_VMOVN(VQMOVUNB, vqmovunb) +DO_VMOVN(VQMOVUNT, vqmovunt) +DO_VMOVN(VQMOVN_BS, vqmovnbs) +DO_VMOVN(VQMOVN_TS, vqmovnts) +DO_VMOVN(VQMOVN_BU, vqmovnbu) +DO_VMOVN(VQMOVN_TU, vqmovntu) + static bool trans_VREV16(DisasContext *s, arg_1op *a) { static MVEGenOneOpFn * const fns[] = { From 
patchwork Tue Jul 13 13:37:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504632 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=SzkjwbeV; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMXy3nGmz9sX3 for ; Tue, 13 Jul 2021 23:53:22 +1000 (AEST) Received: from localhost ([::1]:39524 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Iqe-0004GL-6p for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:53:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54540) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibj-0000E3-8r for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:56 -0400 Received: from mail-wm1-x331.google.com ([2a00:1450:4864:20::331]:35662) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibb-0003lY-Ng for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:54 -0400 Received: by mail-wm1-x331.google.com with SMTP id m11-20020a05600c3b0bb0290228f19cb433so1673675wms.0 for ; Tue, 13 Jul 2021 06:37:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; 
bh=1gNwG3Zi1TKHMAG22ECIQQgeqEd5OSlOPRogN4sXV3U=; b=SzkjwbeV5NY2Czucw6R8O20ODa2S0ch9gQy7W+0wNORcS4GYP1hV+0bIUe8rQD002i GsauIVXKmJnItq3OAjMaiTGlw9bjL6vUP+Ghx2VBpN5ShH3E8gcNIRkh2IV+rn1sZrHH i/l8y47q+nVV/tOCJB4EnZPOJmj8H0rWEh6Eh7nJlYwJ99fF9HqFyvpwPOeTYBif2tWr AMPgkC+WA+iCa1GsvAOVRTQ1XIPBrsRGhl462sIXrAEIpiBffsR1d4wxHC5Pk//WotJn tWkF7xC/LyITCTlqy/rn66GyWLlhqrghsUGL88FcpCWKI0W7hUunkg0wA2VY/VVNa9kE cVug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1gNwG3Zi1TKHMAG22ECIQQgeqEd5OSlOPRogN4sXV3U=; b=LZ5jmJzS3iMMTOyTxh9LavkltYRuWWgVITtmrf9awPgpUMqEvDhhGlffyoUQLSbf2A cQ7Z1662Bkt8j//cAk4Jpu1hfe+1D3JeBtWhrgYTUZSMDM7EazvNBf0/PtmSk9Vvni0U dzOfo+Z8BL9OrtvoNrJMp0UJq3yGoQ6QO1ImZj7HFsopiq+HYKVXVXOmvqFHkKRJ/YQg QiugYAWPtSab7411W+8MQ1ygMy0/iKvcQN8gwX94b1Hi3d8CNudw8hmFQaZhtmtfTkeA 0X1xl/imRiyPBFiUpcn1xg10f/m6ue/Z3L7Ul4A5BoQQU2tK5/gSB8zB1M6EYZnvncnt 6AEg== X-Gm-Message-State: AOAM532QJs7YfKaJcmiSVQ5POnrJemhzLAAQ+zrQlPAfUObKXRtU3cRw xYWT1bHNuSd9+oAH5FsFmJbCjw== X-Google-Smtp-Source: ABdhPJyVaiSxCGlsNMhTxZ2jkNGAy4X+Ph+KBUGk5MPYoo+zhQVFjLi7uOMDNx3X/XGOjkCAaRaGBQ== X-Received: by 2002:a1c:7402:: with SMTP id p2mr98472wmc.88.1626183466445; Tue, 13 Jul 2021 06:37:46 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:46 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 23/34] target/arm: Rename MVEGenDualAccOpFn to MVEGenLongDualAccOpFn Date: Tue, 13 Jul 2021 14:37:15 +0100 Message-Id: <20210713133726.26842-24-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::331; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x331.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The MVEGenDualAccOpFn is a bit misnamed, since it is used for the "long dual accumulate" operations that use a 64-bit accumulator. Rename it to MVEGenLongDualAccOpFn so we can use the former name for the 32-bit accumulator insns. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-mve.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index ba5b7809b09..22b178296f4 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -38,7 +38,7 @@ typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenTwoOpShiftFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); -typedef void MVEGenDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); +typedef void MVEGenLongDualAccOpFn(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVADDVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpImmFn(TCGv_ptr, TCGv_ptr, TCGv_i64); typedef void MVEGenVIDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32); @@ -653,7 +653,7 @@ static bool trans_VQDMULLT_scalar(DisasContext *s, arg_2scalar *a) } static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, - MVEGenDualAccOpFn *fn) + MVEGenLongDualAccOpFn *fn) { TCGv_ptr qn, qm; TCGv_i64 rda; @@ -711,7 +711,7 @@ static bool do_long_dual_acc(DisasContext *s, arg_vmlaldav *a, static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlaldavsh, gen_helper_mve_vmlaldavxsh }, { gen_helper_mve_vmlaldavsw, gen_helper_mve_vmlaldavxsw }, @@ -722,7 +722,7 @@ static bool trans_VMLALDAV_S(DisasContext *s, arg_vmlaldav *a) static bool trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlaldavuh, NULL }, { gen_helper_mve_vmlaldavuw, NULL }, @@ -733,7 +733,7 @@ static bool 
trans_VMLALDAV_U(DisasContext *s, arg_vmlaldav *a) static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[4][2] = { + static MVEGenLongDualAccOpFn * const fns[4][2] = { { NULL, NULL }, { gen_helper_mve_vmlsldavsh, gen_helper_mve_vmlsldavxsh }, { gen_helper_mve_vmlsldavsw, gen_helper_mve_vmlsldavxsw }, @@ -744,7 +744,7 @@ static bool trans_VMLSLDAV(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlaldavhsw, gen_helper_mve_vrmlaldavhxsw, }; return do_long_dual_acc(s, a, fns[a->x]); @@ -752,7 +752,7 @@ static bool trans_VRMLALDAVH_S(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlaldavhuw, NULL, }; return do_long_dual_acc(s, a, fns[a->x]); @@ -760,7 +760,7 @@ static bool trans_VRMLALDAVH_U(DisasContext *s, arg_vmlaldav *a) static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) { - static MVEGenDualAccOpFn * const fns[] = { + static MVEGenLongDualAccOpFn * const fns[] = { gen_helper_mve_vrmlsldavhsw, gen_helper_mve_vrmlsldavhxsw, }; return do_long_dual_acc(s, a, fns[a->x]); From patchwork Tue Jul 13 13:37:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504633 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org 
header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=grWAjetm; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMYD6GQMz9sX3 for ; Tue, 13 Jul 2021 23:53:36 +1000 (AEST) Received: from localhost ([::1]:40838 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Iqs-000583-IH for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:53:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54598) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibk-0000EI-NA for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:01 -0400 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:42520) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibc-0003m4-HG for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:56 -0400 Received: by mail-wr1-x432.google.com with SMTP id r11so25256647wro.9 for ; Tue, 13 Jul 2021 06:37:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=N/+JKeJsZGjzhojHp2ciBdin+H8CvA1X0pN/mFvf1cM=; b=grWAjetmP49U5AJQgS7hMy1LQMbdfRknzzzeHKTPScOy9/IooAkJ1ANaZDGT7QnDyd bqSQ3bXdVZu/OQpREhwbM2SbNbTf5M7O0OaY1MaaeO2bfDrHFB0ujIipt5RVYoMw7rRn sa7xf0hcD/K+zmrjlBs635O3ADbZE7KUZN0qh78cy0SHCRjzqcWlLdDmUOK/435giQJz v0vvWleHWL6sYXVi6yw55/eNIXTRkeBxAxHki/hcLyMh8xxTGgg7fY9OTx/KwH29ihjb hdvJkkNhDYhhwUMIEN2/SwnDP5utFsVd5tNJnxR5o3/Ofds7v8JpyeC8QULilyM1uHzv g73A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=N/+JKeJsZGjzhojHp2ciBdin+H8CvA1X0pN/mFvf1cM=; b=dd5UADBUCtGE52SDtxF9l8oJJ2hDLp8zwH7bK7WK9Qt4qerQMbA/RsJyMxL1CeG94U rgUikVnJ7qIwNQpF9Hpbur+RElF5/Ksppr7x/doD7POHosLv0GD7cWgCRMGNk4ZwMW4L ibs+fbwJO+psBPTdq8nc0IWLFtRvFldJoXRjtK9CR6LsZrsVyaN4YLtbLDywEsDtTUX2 +guTaaAmMRgrh4UVSmtb4f7JeiM1tBS6Qg1bKImqk8DPNEOZzg/QmfRQk330f4TIDwyB Zi2ywIb6mJa8c5ypKZfksUFyqSurnHvpwmBueg9fu2ly9pVqoWET0vk2/QKQTq89kstn Rlvg== X-Gm-Message-State: AOAM5330I2G6htM20QS0rmOozXJ92DsfIXvbCXesVDtFO0s+hJp5r/ll cOX4VN4IPbr8H0bR2vFFWj9M+Tj0KPM59sfU X-Google-Smtp-Source: ABdhPJxiMBhrXWqepvvapgjvSo4EHSglJhgNvIxpljHiO/QIJMiQt+E+SWfw+ZrotdRppq7bTd2q3Q== X-Received: by 2002:adf:ce8d:: with SMTP id r13mr5838514wrn.304.1626183467247; Tue, 13 Jul 2021 06:37:47 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:46 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 24/34] target/arm: Implement MVE VMLADAV and VMLSLDAV Date: Tue, 13 Jul 2021 14:37:16 +0100 Message-Id: <20210713133726.26842-25-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::432; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMLADAV and VMLSDAV insns. Like the VMLALDAV and VMLSLDAV insns already implemented, these accumulate multiplied vector elements; but they accumulate a 32-bit result rather than a 64-bit one. Note that these encodings overlap with what would be RdaHi=0b111 for VMLALDAV, VMLSLDAV, VRMLALDAVH and VRMLSLDAVH. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 17 ++++++++++ target/arm/mve.decode | 33 +++++++++++++++++--- target/arm/mve_helper.c | 41 ++++++++++++++++++++++++ target/arm/translate-mve.c | 64 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 150 insertions(+), 5 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 84aa9de6e06..088bdd3ca50 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -400,6 +400,23 @@ DEF_HELPER_FLAGS_4(mve_vrmlaldavhuw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vrmlsldavhsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vrmlsldavhxsw, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) +DEF_HELPER_FLAGS_4(mve_vmladavsb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavub, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavuh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavuw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vmladavsxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmladavsxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(mve_vmladavsxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxb, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxh, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlsdavxw, TCG_CALL_NO_WG, i32, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vaddvsb, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvub, TCG_CALL_NO_WG, i32, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vaddvsh, TCG_CALL_NO_WG, i32, env, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 79c529e762f..0c4708ea988 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -320,32 +320,55 @@ VDUP 1110 1110 1 0 10 ... 0 .... 1011 . 0 0 1 0000 @vdup size=2 %size_16 16:1 !function=plus_1 &vmlaldav rdahi rdalo size qn qm x a +&vmladav rda size qn qm x a @vmlaldav .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=%size_16 &vmlaldav @vmlaldav_nosz .... .... . ... ... . ... x:1 .... .. a:1 . qm:3 . \ qn=%qn rdahi=%rdahi rdalo=%rdalo size=0 &vmlaldav -VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav -VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +@vmladav .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \ + qn=%qn rda=%rdalo size=%size_16 &vmladav +@vmladav_nosz .... .... .... ... . ... x:1 .... . . a:1 . qm:3 . \ + qn=%qn rda=%rdalo size=0 &vmladav -VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav +{ + VMLADAV_S 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav + VMLALDAV_S 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +} +{ + VMLADAV_U 1111 1110 1111 ... . ... . 1110 . 0 . 0 ... 0 @vmladav + VMLALDAV_U 1111 1110 1 ... ... . ... . 1110 . 0 . 0 ... 0 @vmlaldav +} + +{ + VMLSDAV 1110 1110 1111 ... . ... . 1110 . 0 . 0 ... 1 @vmladav + VMLSLDAV 1110 1110 1 ... ... . ... . 1110 . 0 . 0 ... 1 @vmlaldav +} + +{ + VMLSDAV 1111 1110 1111 ... 0 ... . 1110 . 0 . 0 ... 
1 @vmladav_nosz + VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz +} + +VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz +VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 1 @vmladav_nosz { VMAXV_S 1110 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv VMINV_S 1110 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv VMAXAV 1110 1110 1110 .. 00 .... 1111 0 0 . 0 ... 0 @vmaxv VMINAV 1110 1110 1110 .. 00 .... 1111 1 0 . 0 ... 0 @vmaxv + VMLADAV_S 1110 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz VRMLALDAVH_S 1110 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz } { VMAXV_U 1111 1110 1110 .. 10 .... 1111 0 0 . 0 ... 0 @vmaxv VMINV_U 1111 1110 1110 .. 10 .... 1111 1 0 . 0 ... 0 @vmaxv + VMLADAV_U 1111 1110 1111 ... 0 ... . 1111 . 0 . 0 ... 0 @vmladav_nosz VRMLALDAVH_U 1111 1110 1 ... ... 0 ... . 1111 . 0 . 0 ... 0 @vmlaldav_nosz } -VRMLSLDAVH 1111 1110 1 ... ... 0 ... . 1110 . 0 . 0 ... 1 @vmlaldav_nosz - # Scalar operations VADD_scalar 1110 1110 0 . .. ... 1 ... 0 1111 . 100 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 725fe64a348..8b70362f012 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1201,6 +1201,47 @@ DO_LDAV(vmlsldavxsh, 2, int16_t, true, +=, -=) DO_LDAV(vmlsldavsw, 4, int32_t, false, +=, -=) DO_LDAV(vmlsldavxsw, 4, int32_t, true, +=, -=) +/* + * Multiply add dual accumulate ops + */ +#define DO_DAV(OP, ESIZE, TYPE, XCHG, EVENACC, ODDACC) \ + uint32_t HELPER(glue(mve_, OP))(CPUARMState *env, void *vn, \ + void *vm, uint32_t a) \ + { \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + TYPE *n = vn, *m = vm; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + if (mask & 1) { \ + if (e & 1) { \ + a ODDACC \ + n[H##ESIZE(e - 1 * XCHG)] * m[H##ESIZE(e)]; \ + } else { \ + a EVENACC \ + n[H##ESIZE(e + 1 * XCHG)] * m[H##ESIZE(e)]; \ + } \ + } \ + } \ + mve_advance_vpt(env); \ + return a; \ + } + +#define DO_DAV_S(INSN, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##b, 1, int8_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##h, 2, int16_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##w, 4, int32_t, XCHG, EVENACC, ODDACC) + +#define DO_DAV_U(INSN, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##b, 1, uint8_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##h, 2, uint16_t, XCHG, EVENACC, ODDACC) \ + DO_DAV(INSN##w, 4, uint32_t, XCHG, EVENACC, ODDACC) + +DO_DAV_S(vmladavs, false, +=, +=) +DO_DAV_U(vmladavu, false, +=, +=) +DO_DAV_S(vmlsdav, false, +=, -=) +DO_DAV_S(vmladavsx, true, +=, +=) +DO_DAV_S(vmlsdavx, true, +=, -=) + /* * Rounding multiply add long dual accumulate high. 
In the pseudocode * this is implemented with a 72-bit internal accumulator value of which diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 22b178296f4..67b9c07447a 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -46,6 +46,7 @@ typedef void MVEGenVIWDUPFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_i32, TCGv_i32, TC typedef void MVEGenCmpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenScalarCmpFn(TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenVABAVFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenDualAccOpFn(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); /* Return the offset of a Qn register (same semantics as aa32_vfp_qreg()) */ static inline long mve_qreg_offset(unsigned reg) @@ -766,6 +767,69 @@ static bool trans_VRMLSLDAVH(DisasContext *s, arg_vmlaldav *a) return do_long_dual_acc(s, a, fns[a->x]); } +static bool do_dual_acc(DisasContext *s, arg_vmladav *a, MVEGenDualAccOpFn *fn) +{ + TCGv_ptr qn, qm; + TCGv_i32 rda; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qn) || + !fn) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + qn = mve_qreg_ptr(a->qn); + qm = mve_qreg_ptr(a->qm); + + /* + * This insn is subject to beat-wise execution. Partial execution + * of an A=0 (no-accumulate) insn which does not execute the first + * beat must start with the current rda value, not 0. 
+ */ + if (a->a || mve_skip_first_beat(s)) { + rda = load_reg(s, a->rda); + } else { + rda = tcg_const_i32(0); + } + + fn(rda, cpu_env, qn, qm, rda); + store_reg(s, a->rda, rda); + tcg_temp_free_ptr(qn); + tcg_temp_free_ptr(qm); + + mve_update_eci(s); + return true; +} + +#define DO_DUAL_ACC(INSN, FN) \ + static bool trans_##INSN(DisasContext *s, arg_vmladav *a) \ + { \ + static MVEGenDualAccOpFn * const fns[4][2] = { \ + { gen_helper_mve_##FN##b, gen_helper_mve_##FN##xb }, \ + { gen_helper_mve_##FN##h, gen_helper_mve_##FN##xh }, \ + { gen_helper_mve_##FN##w, gen_helper_mve_##FN##xw }, \ + { NULL, NULL }, \ + }; \ + return do_dual_acc(s, a, fns[a->size][a->x]); \ + } + +DO_DUAL_ACC(VMLADAV_S, vmladavs) +DO_DUAL_ACC(VMLSDAV, vmlsdav) + +static bool trans_VMLADAV_U(DisasContext *s, arg_vmladav *a) +{ + static MVEGenDualAccOpFn * const fns[4][2] = { + { gen_helper_mve_vmladavub, NULL }, + { gen_helper_mve_vmladavuh, NULL }, + { gen_helper_mve_vmladavuw, NULL }, + { NULL, NULL }, + }; + return do_dual_acc(s, a, fns[a->size][a->x]); +} + static void gen_vpst(DisasContext *s, uint32_t mask) { /* From patchwork Tue Jul 13 13:37:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504641 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=lsrA1VSs; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
ozlabs.org (Postfix) with ESMTPS id 4GPMfk5Lblz9sX3 for ; Tue, 13 Jul 2021 23:58:22 +1000 (AEST) Received: from localhost ([::1]:58114 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IvU-0008MG-Fu for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:58:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54600) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibk-0000ER-PP for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:01 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]:41804) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibd-0003mF-2Z for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:56 -0400 Received: by mail-wr1-x42b.google.com with SMTP id k4so24197685wrc.8 for ; Tue, 13 Jul 2021 06:37:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=KRGBoCnPAuu3QA59UbNot9EYsJQ7O/Xl42j0/zCUXDY=; b=lsrA1VSsTxlQqKCpgc9RyM1wllwMnNCiLcm5Iap8O0TOQ/OYsRdfFcMnvz8c/jnh7r CF7squtR2CHEKZPB6SgA2E1bEa+yqKjJvFhiQD9/uqM5nharmf6h/Zu9U7ij7o/kz3jh V7xRt+9f/pVE2+HXRCHgOm3CpXgIHxcmewRknjtKk5x7ef4KpkopznoIVGT6jqH47mWe h3w6vY7YJZjFvkXS6xKwfiBoJt0Y4rgIqUlM8hMT8rUDPe1mtOcWJc8YTLnRC7eub6PS UHWyiK0jPfAbTQOAMIzAjhHGRciF9lYQcYKWDhLt2xThmWGwPc88zw4se+qRcYUUmYfi bVCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KRGBoCnPAuu3QA59UbNot9EYsJQ7O/Xl42j0/zCUXDY=; b=FQ1OpiselKnoBG3B/f6+O86hzcD8XCV+O8CaWUFhNwX1tZpik1wiCWWKUYwb3yt8O1 Q3BKcCVoB/atQ7pQQNLHg1k93fL2ENermgFFefFyrewE/xDOsfs930Y1LaqgTr3uq4KE IQvBm9JkNJ1T2UFeqGxqTBoAwzhYX/LXOi1sWK+sUF3SLSS3KUB9NqdYJv+fA8O4GrQh 
Q57eWjXG+7ov/609+LWG/GXv1J1GrJKZGTLfP13rPiSSlGPU37LLBEymNXKnxwZ/DoZD wWT2RFyUiehivUiLhvqp1oMZQq94vhrTN+wc83st7Duapn32hdHAZa6uMelmgnkVy+aN 7X3Q== X-Gm-Message-State: AOAM532lvpSGPG2YRU/b1hLLjZpov7sRnOwVYqTdcBas39EOWk9qbR63 nop5NMglyuMHzRHn4K0nkO5213IyPpGlu+1V X-Google-Smtp-Source: ABdhPJxiJ6KddEWEti2BmCIG+yYs6djPDha//ffG6ZmdZr7Ihfyq2e3cHyxvY01OS8Qj9iHMHQMVIg== X-Received: by 2002:adf:de84:: with SMTP id w4mr5878528wrl.104.1626183467890; Tue, 13 Jul 2021 06:37:47 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:47 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 25/34] target/arm: Implement MVE VMLA Date: Tue, 13 Jul 2021 14:37:17 +0100 Message-Id: <20210713133726.26842-26-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMLA insn, which multiplies a vector by a scalar and accumulates into another vector. 
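For illustration only (not part of the patch): a minimal standalone sketch of the per-element arithmetic, contrasting VMLA with the existing VMLAS. It assumes signed 8-bit lanes and deliberately ignores predication (mergemask), the host-endian H1() lane indexing and VPT advance that the real helpers handle. The names vmla_s8, vmlas_s8 and LANES are hypothetical and do not appear in QEMU.

```c
#include <stdint.h>

#define LANES 16 /* a 128-bit Q register holds 16 byte-sized elements */

/* VMLA: vector by scalar plus vector, d[e] = n[e] * m + d[e] */
void vmla_s8(int8_t d[LANES], const int8_t n[LANES], int8_t m)
{
    for (int e = 0; e < LANES; e++) {
        /*
         * Operands promote to int; the cast back to int8_t keeps the
         * low 8 bits on the usual two's complement targets.
         */
        d[e] = (int8_t)(n[e] * m + d[e]);
    }
}

/* VMLAS: vector by vector plus scalar, d[e] = n[e] * d[e] + m */
void vmlas_s8(int8_t d[LANES], const int8_t n[LANES], int8_t m)
{
    for (int e = 0; e < LANES; e++) {
        d[e] = (int8_t)(n[e] * d[e] + m);
    }
}
```

The only difference between the two is which operand is the scalar: VMLA multiplies by the scalar and adds the old destination element, while VMLAS multiplies by the old destination element and adds the scalar.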
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 6 ++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 19 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 088bdd3ca50..50b34c601e1 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -371,6 +371,14 @@ DEF_HELPER_FLAGS_4(mve_vqdmullb_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i3 DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vqdmullt_scalarw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlash, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlasw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vmlaub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlauh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vmlauw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlassb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlassh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlassw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 0c4708ea988..2e2df61c860 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -412,6 +412,9 @@ VHSUB_U_scalar 1111 1110 0 . .. ... 0 ... 1 1111 . 100 .... @2scalar VQDMULH_scalar 1110 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar VQRDMULH_scalar 1111 1110 0 . .. ... 1 ... 0 1110 . 110 .... @2scalar +VMLA_S 1110 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar +VMLA_U 1111 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar + VMLAS_S 1110 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar VMLAS_U 1111 1110 0 . .. ... 1 ... 1 1110 . 100 .... 
@2scalar diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 8b70362f012..91c0add8da7 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -1019,6 +1019,12 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +/* Vector by scalar plus vector */ +#define DO_VMLA(D, N, M) ((N) * (M) + (D)) + +DO_2OP_ACC_SCALAR_S(vmlas, DO_VMLA) +DO_2OP_ACC_SCALAR_U(vmlau, DO_VMLA) + /* Vector by vector plus scalar */ #define DO_VMLAS(D, N, M) ((N) * (D) + (M)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 67b9c07447a..650f3b95edf 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -620,6 +620,8 @@ DO_2OP_SCALAR(VQSUB_U_scalar, vqsubu_scalar) DO_2OP_SCALAR(VQDMULH_scalar, vqdmulh_scalar) DO_2OP_SCALAR(VQRDMULH_scalar, vqrdmulh_scalar) DO_2OP_SCALAR(VBRSR, vbrsr) +DO_2OP_SCALAR(VMLA_S, vmlas) +DO_2OP_SCALAR(VMLA_U, vmlau) DO_2OP_SCALAR(VMLAS_S, vmlass) DO_2OP_SCALAR(VMLAS_U, vmlasu) From patchwork Tue Jul 13 13:37:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504647 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=v4lj85qe; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org 
(Postfix) with ESMTPS id 4GPMkT1Cpnz9sWd for ; Wed, 14 Jul 2021 00:01:37 +1000 (AEST) Received: from localhost ([::1]:38434 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Iyc-0005bJ-LG for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 10:01:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54638) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibm-0000EV-Ol for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:01 -0400 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:44799) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibe-0003me-0H for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:37:58 -0400 Received: by mail-wr1-x432.google.com with SMTP id f9so24854615wrq.11 for ; Tue, 13 Jul 2021 06:37:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=klXHEewFEEY6mYo55I2EIetx+SJvhkqMxzPxJGrf5V4=; b=v4lj85qeDSP+fBejQEkQYPYFeQHWlPSe57RJpwXzlfUihPSg3T4OOg39AfA8AY0DZT qH/7PN4qHnDfx4pXa+SxeDNRYS/pIhVM5lg4kvlK+mOv/zNA3yZ291gw9F+cpVDgQ553 YTLkFl8nI65OPS8xAAeJGpWpNyEaF+nGEr9RC0wSFmwyppTee0inl0dcnMtnTEhE7uE7 T2HGJrplVhuiOHZW/pz7cEYG5UxphrVAL9I9lVlx7wjbJz2bCUhwap5oh71Z4D00dazt 10+8HqYAfiuCuZBqF5ZW9u86jXhxLwLe3RQJoNzx2Ghc5Paf+mTXT/CrpKPckW/Knmwa OZNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=klXHEewFEEY6mYo55I2EIetx+SJvhkqMxzPxJGrf5V4=; b=PAnUjlnJ6aPMAm9SRN78G2QZV89otHxTwf99i0zwHiZ9JM9tk5dVYw6fGNuwoToDb+ tLnF39zfvOvVy1sahfyGcDSQMIrNhbcW/qtdPNb8ymtCdPUrUtAtoSz/hK9Svf1WQbVz QRbkB4jP4xeIN45jnXgtiIXFFBmNN4CSfD6RPqgm8tDaps5A5BRFgwgudDvmqpSsHYxc 
QamD7XPzWQAPtyU0J0C/ecKkDZl9oELwHsAKGlf5G8b0FRQnbZHM1K7X732+GTOICjSF 3NqvPSTEYkqrsXwG82UbHnD5i5Q50WsUFr2OhUr2F/KPSrYaY5e3+GOMQ8bDg0bXuPPS qWRg== X-Gm-Message-State: AOAM531TmrKRlH2JbaT8F4t6/s6V2zeltnBoKJbbv15/UgZuZrJwKIIk Jy9PpDnP2z/Bk0kxh7E+RiSeMA97di7ZMuYZ X-Google-Smtp-Source: ABdhPJxgSrGsF7Wtv+GJX+21Ag0GJ2MhV+el1xW1lc6Kxt7OZraHURhqEAUho2zDrfCOSTRj8VI3/A== X-Received: by 2002:a5d:64e4:: with SMTP id g4mr5603589wri.377.1626183468648; Tue, 13 Jul 2021 06:37:48 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:48 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 26/34] target/arm: Implement MVE saturating doubling multiply accumulates Date: Tue, 13 Jul 2021 14:37:18 +0100 Message-Id: <20210713133726.26842-27-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::432; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x432.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE saturating doubling multiply accumulate insns VQDMLAH, VQRDMLAH, VQDMLASH and VQRDMLASH. 
These perform a multiply, double, add the accumulator shifted by the element size, possibly round, saturate to twice the element size, then take the high half of the result. The *MLAH insns do vector * scalar + vector, and the *MLASH insns do vector * vector + scalar. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 16 +++++++ target/arm/mve.decode | 5 ++ target/arm/mve_helper.c | 95 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 4 ++ 4 files changed, 120 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 50b34c601e1..e61c5d56f41 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -387,6 +387,22 @@ DEF_HELPER_FLAGS_4(mve_vmlasub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlasuh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vmlasuw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrdmlahb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlahh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlahw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vqrdmlashb, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlashh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vqrdmlashw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(mve_vmlaldavsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavsw, TCG_CALL_NO_WG, 
i64, env, ptr, ptr, i64) DEF_HELPER_FLAGS_4(mve_vmlaldavxsh, TCG_CALL_NO_WG, i64, env, ptr, ptr, i64) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 2e2df61c860..99cea8d39b6 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -418,6 +418,11 @@ VMLA_U 1111 1110 0 . .. ... 1 ... 0 1110 . 100 .... @2scalar VMLAS_S 1110 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar VMLAS_U 1111 1110 0 . .. ... 1 ... 1 1110 . 100 .... @2scalar +VQRDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 100 .... @2scalar +VQRDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 100 .... @2scalar +VQDMLAH 1110 1110 0 . .. ... 0 ... 0 1110 . 110 .... @2scalar +VQDMLASH 1110 1110 0 . .. ... 0 ... 1 1110 . 110 .... @2scalar + # Vector add across vector { VADDV 111 u:1 1110 1111 size:2 01 ... 0 1111 0 0 a:1 0 qm:3 0 rda=%rdalo diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 91c0add8da7..1013060baeb 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -971,6 +971,28 @@ DO_VQDMLADH_OP(vqrdmlsdhxw, 4, int32_t, 1, 1, do_vqdmlsdh_w) mve_advance_vpt(env); \ } +#define DO_2OP_SAT_ACC_SCALAR(OP, ESIZE, TYPE, FN) \ + void HELPER(glue(mve_, OP))(CPUARMState *env, void *vd, void *vn, \ + uint32_t rm) \ + { \ + TYPE *d = vd, *n = vn; \ + TYPE m = rm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + bool qc = false; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + bool sat = false; \ + mergemask(&d[H##ESIZE(e)], \ + FN(d[H##ESIZE(e)], n[H##ESIZE(e)], m, &sat), \ + mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + /* provide unsigned 2-op scalar helpers for all sizes */ #define DO_2OP_SCALAR_U(OP, FN) \ DO_2OP_SCALAR(OP##b, 1, uint8_t, FN) \ @@ -1019,6 +1041,79 @@ DO_2OP_SAT_SCALAR(vqrdmulh_scalarb, 1, int8_t, DO_QRDMULH_B) DO_2OP_SAT_SCALAR(vqrdmulh_scalarh, 2, int16_t, DO_QRDMULH_H) DO_2OP_SAT_SCALAR(vqrdmulh_scalarw, 4, int32_t, DO_QRDMULH_W) +static int8_t 
do_vqdmlah_b(int8_t a, int8_t b, int8_t c, int round, bool *sat) +{ + int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 8) + (round << 7); + return do_sat_bhw(r, INT16_MIN, INT16_MAX, sat) >> 8; +} + +static int16_t do_vqdmlah_h(int16_t a, int16_t b, int16_t c, + int round, bool *sat) +{ + int64_t r = (int64_t)a * b * 2 + ((int64_t)c << 16) + (round << 15); + return do_sat_bhw(r, INT32_MIN, INT32_MAX, sat) >> 16; +} + +static int32_t do_vqdmlah_w(int32_t a, int32_t b, int32_t c, + int round, bool *sat) +{ + /* + * Architecturally we should do the entire add, double, round + * and then check for saturation. We do three saturating adds, + * but we need to be careful about the order. If the first + * m1 + m2 saturates then it's impossible for the *2+rc to + * bring it back into the non-saturated range. However, if + * m1 + m2 is negative then it's possible that doing the doubling + * would take the intermediate result below INT64_MIN and the + * addition of the rounding constant then brings it back in range. + * So we add half the rounding constant and half the "c << esize" + * before doubling rather than adding the rounding constant after + * the doubling. + */ + int64_t m1 = (int64_t)a * b; + int64_t m2 = (int64_t)c << 31; + int64_t r; + if (sadd64_overflow(m1, m2, &r) || + sadd64_overflow(r, (round << 30), &r) || + sadd64_overflow(r, r, &r)) { + *sat = true; + return r < 0 ? 
INT32_MAX : INT32_MIN; + } + return r >> 32; +} + +/* + * The *MLAH insns are vector * scalar + vector; + * the *MLASH insns are vector * vector + scalar + */ +#define DO_VQDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 0, S) +#define DO_VQDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 0, S) +#define DO_VQDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 0, S) +#define DO_VQRDMLAH_B(D, N, M, S) do_vqdmlah_b(N, M, D, 1, S) +#define DO_VQRDMLAH_H(D, N, M, S) do_vqdmlah_h(N, M, D, 1, S) +#define DO_VQRDMLAH_W(D, N, M, S) do_vqdmlah_w(N, M, D, 1, S) + +#define DO_VQDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 0, S) +#define DO_VQDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 0, S) +#define DO_VQDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 0, S) +#define DO_VQRDMLASH_B(D, N, M, S) do_vqdmlah_b(N, D, M, 1, S) +#define DO_VQRDMLASH_H(D, N, M, S) do_vqdmlah_h(N, D, M, 1, S) +#define DO_VQRDMLASH_W(D, N, M, S) do_vqdmlah_w(N, D, M, 1, S) + +DO_2OP_SAT_ACC_SCALAR(vqdmlahb, 1, int8_t, DO_VQDMLAH_B) +DO_2OP_SAT_ACC_SCALAR(vqdmlahh, 2, int16_t, DO_VQDMLAH_H) +DO_2OP_SAT_ACC_SCALAR(vqdmlahw, 4, int32_t, DO_VQDMLAH_W) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahb, 1, int8_t, DO_VQRDMLAH_B) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahh, 2, int16_t, DO_VQRDMLAH_H) +DO_2OP_SAT_ACC_SCALAR(vqrdmlahw, 4, int32_t, DO_VQRDMLAH_W) + +DO_2OP_SAT_ACC_SCALAR(vqdmlashb, 1, int8_t, DO_VQDMLASH_B) +DO_2OP_SAT_ACC_SCALAR(vqdmlashh, 2, int16_t, DO_VQDMLASH_H) +DO_2OP_SAT_ACC_SCALAR(vqdmlashw, 4, int32_t, DO_VQDMLASH_W) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashb, 1, int8_t, DO_VQRDMLASH_B) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashh, 2, int16_t, DO_VQRDMLASH_H) +DO_2OP_SAT_ACC_SCALAR(vqrdmlashw, 4, int32_t, DO_VQRDMLASH_W) + /* Vector by scalar plus vector */ #define DO_VMLA(D, N, M) ((N) * (M) + (D)) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 650f3b95edf..f8b34c9ef36 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -624,6 +624,10 @@ DO_2OP_SCALAR(VMLA_S, vmlas) DO_2OP_SCALAR(VMLA_U, vmlau) 
DO_2OP_SCALAR(VMLAS_S, vmlass) DO_2OP_SCALAR(VMLAS_U, vmlasu) +DO_2OP_SCALAR(VQDMLAH, vqdmlah) +DO_2OP_SCALAR(VQRDMLAH, vqrdmlah) +DO_2OP_SCALAR(VQDMLASH, vqdmlash) +DO_2OP_SCALAR(VQRDMLASH, vqrdmlash) static bool trans_VQDMULLB_scalar(DisasContext *s, arg_2scalar *a) { From patchwork Tue Jul 13 13:37:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504637 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=f/CshtQb; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMcT1ld9z9sX3 for ; Tue, 13 Jul 2021 23:56:24 +1000 (AEST) Received: from localhost ([::1]:49700 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ita-0002lL-Jo for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:56:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54696) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibq-0000KA-IR for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:02 -0400 Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430]:43999) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibe-0003nX-J3 for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:02 -0400 Received: by mail-wr1-x430.google.com with SMTP id 
a13so30499777wrf.10 for ; Tue, 13 Jul 2021 06:37:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=S3Nm5WPC1woG5qpBDWUMq5K5w6e5M1eP3KgH+21zZP4=; b=f/CshtQbdvfiYRqQ++ff2TFu+5bw6AxHu4Kk8v4G7/JY1OtiE5Meg6zFM5ljvyJWjE 4P7HwqKXPt+rtiTdEBJUmW2kny/t2Bxm/09ipmPp3cwFbgkW+8fZSWThmSJxz9bvlKXy VATbDZ4nr/17gBCcp/Y5sC412IaTA0c22s95h515pLgmAjKjWsgtt0IQ8BDpJk1/q0jV fAjq3POXju/Wj/U5Rt3EYs4Azh6Nox9/7NF6ys+6Tv7TYoNY2T682FlwLpXieeP6sfMr /io/jrS04Ru3Xsbcn5z29vNbzkqt/7MQ/dCfhjnlMk3vhCADlb4dAoWnaoPF5Ns6juYr 7Vsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=S3Nm5WPC1woG5qpBDWUMq5K5w6e5M1eP3KgH+21zZP4=; b=pWaY7YIPXpdvfUPZT1Ku96RW5VGl4Zly7tbcU7I0v3KkEUfil93r4VSOHBJ1a3sUYx u31ReMh6vAQIYhGXQNtgZ5L6k1LAYnTuYcLrBI0gsI4eEF23m29JwkvqxwMfIaWbeF8p mkKRKhdkX+Eu2mMPDthejheUbWYRM6kvXRbrI+qe2DWH7jBki5hPhqeM1tDLoYepiFOZ je27dcCT+2KpezyOtSIP1UvsynjPJH95MDL4LhubAEKNrJpPLhM3ap/HK+tDkfRPbt3P L+1HJwcY7aWPMBiB6fSuFRwCwyIgpuzb0uJUGUJyisr5sQXKdpNWZvhayNNtQI6PurUd saGw== X-Gm-Message-State: AOAM533do5UxHtg995+0xqaRnK3YVQmz7QiWYu1FUMVMZFTxUoZ+Nv4u 6YDev4HOsBdYubhKZp+eH4xEdRn+C5ahH+R0 X-Google-Smtp-Source: ABdhPJzMh3LN+PBFhwzCH55ZZ3ji96BFwjvA9Mh3L+f8i4REW/jwH8NXt1KZq262F9ynG6lsx8XgzQ== X-Received: by 2002:adf:e3cf:: with SMTP id k15mr5739115wrm.60.1626183469359; Tue, 13 Jul 2021 06:37:49 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:48 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 27/34] target/arm: Implement MVE VQABS, VQNEG Date: Tue, 13 Jul 2021 14:37:19 +0100 Message-Id: <20210713133726.26842-28-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::430; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x430.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE 1-operand saturating operations VQABS and VQNEG. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 3 +++ target/arm/mve_helper.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 50 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index e61c5d56f41..69f0474f6a3 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -80,6 +80,14 @@ DEF_HELPER_FLAGS_3(mve_vnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vfnegs, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqabsw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 99cea8d39b6..1d38dd8dba3 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -279,6 +279,9 @@ VABS_fp 1111 1111 1 . 11 .. 01 ... 0 0111 01 . 0 ... 0 @1op VNEG 1111 1111 1 . 11 .. 01 ... 0 0011 11 . 0 ... 0 @1op VNEG_fp 1111 1111 1 . 11 .. 01 ... 0 0111 11 . 0 ... 0 @1op +VQABS 1111 1111 1 . 11 .. 00 ... 0 0111 01 . 0 ... 0 @1op +VQNEG 1111 1111 1 . 11 .. 00 ... 0 0111 11 . 0 ... 0 @1op + &vdup qd rt size # Qd is in the fields usually named Qn @vdup .... .... . . .. ... . rt:4 .... . . . . .... 
qd=%qn &vdup diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 1013060baeb..3b3695885ef 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2213,3 +2213,40 @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) } mve_advance_vpt(env); } + +#define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + TYPE *d = vd, *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + bool qc = false; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + bool sat = false; \ + mergemask(&d[H##ESIZE(e)], FN(m[H##ESIZE(e)], &sat), mask); \ + qc |= sat & mask & 1; \ + } \ + if (qc) { \ + env->vfp.qc[0] = qc; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VQABS_B(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT8_MIN, INT8_MAX, SATP) +#define DO_VQABS_H(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT16_MIN, INT16_MAX, SATP) +#define DO_VQABS_W(N, SATP) \ + do_sat_bhs(DO_ABS((int64_t)N), INT32_MIN, INT32_MAX, SATP) + +#define DO_VQNEG_B(N, SATP) do_sat_bhs(-(int64_t)N, INT8_MIN, INT8_MAX, SATP) +#define DO_VQNEG_H(N, SATP) do_sat_bhs(-(int64_t)N, INT16_MIN, INT16_MAX, SATP) +#define DO_VQNEG_W(N, SATP) do_sat_bhs(-(int64_t)N, INT32_MIN, INT32_MAX, SATP) + +DO_1OP_SAT(vqabsb, 1, int8_t, DO_VQABS_B) +DO_1OP_SAT(vqabsh, 2, int16_t, DO_VQABS_H) +DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W) + +DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B) +DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H) +DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index f8b34c9ef36..59e09f58a8c 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -275,6 +275,8 @@ DO_1OP(VCLZ, vclz) DO_1OP(VCLS, vcls) DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) +DO_1OP(VQABS, vqabs) +DO_1OP(VQNEG, vqneg) /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ From patchwork Tue Jul 13 13:37:20 2021 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504638 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=UVvGwXc/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMdP39gZz9sXN for ; Tue, 13 Jul 2021 23:57:13 +1000 (AEST) Received: from localhost ([::1]:52760 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3IuM-0004rI-Rl for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:57:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54682) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibp-0000GK-Rf for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:01 -0400 Received: from mail-wm1-x330.google.com ([2a00:1450:4864:20::330]:50751) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibf-0003oW-FA for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:01 -0400 Received: by mail-wm1-x330.google.com with SMTP id l6so5380048wmq.0 for ; Tue, 13 Jul 2021 06:37:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=kbQLDgQV6VvX5+WpuSOCf9tnLATz5FQ5PTEJgS4fW8c=; 
b=UVvGwXc/5AQwBw21BRu1gheE56cHjVzohKUahmI028zSJfdE/kcU+EzZBlde1ab3bc YYDIJrDaq5lBxybVA49RjPNAkVrTywQbpcofpQlJyKPdWIU/5fJaYnGPHfSIpPOGNZhn 9ar6LoLx849OSnbhlshBaC9NUYYNUCzP5TuALeHpJDu8UJ0jHfGxT+aKSoJ4zj8qicpC HdBr7bd+/VSfH2f6BY3xSeb2+O5/LEuX8gxXbrVBB3JaHelEOyFpYCxkRCdJdkjcJY+u 4jQOgONdVYFxcXfv8cKKRI/w9gwiYyqvfRl4ugVln/1NA3yxkOeHDfxN8zu1H4X2mCYc nZnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=kbQLDgQV6VvX5+WpuSOCf9tnLATz5FQ5PTEJgS4fW8c=; b=tFppHYWI5PlbDolHC2jw0Uz+NU2vk4pHZqWcloUHBGKe/ef7+tEuM/WJofHa9VXgUh 5pEMk1hpiLvDc68C646FRvLj+RsZm5kmK3iFTLZQtNYBUuv4pCHqbhSgO9AhbzDr1Ztk MSmTGXEv07fXCEJMAkgrNZNjrYfc+lgvsaipgG7psG3J6h0XGKEjpbGUxMg8oRt/nLiY H3tZQdRz79s4LVQJR10T7M2g5Tcca6HoEkv3zm36ERaNSCA/nS7ohOQOrSr/3HJAmp7q Vi+t1yfMSTnqkuDgUBlLjMtbi1092cgUjEE5bvEw/jqWm0g5Vc32tv09nKTJunrlsjh5 4KEw== X-Gm-Message-State: AOAM5303gNoWIcq87sM+KiIU4jt5QWt5nHIMneKvtNcTy5FLpGCy9ePj 2kR81tDaeL/MlLkCnHB3iFAS0i/iF9v7QuKF X-Google-Smtp-Source: ABdhPJyG1aFLmNFnDHMFSr7LHFRsp94qynoQDuUIfflFJae6Uwuy9tGR1VySnIZHrrGvQEimXGzSkw== X-Received: by 2002:a05:600c:2248:: with SMTP id a8mr45755wmm.141.1626183470092; Tue, 13 Jul 2021 06:37:50 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:49 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 28/34] target/arm: Implement MVE VMAXA, VMINA Date: Tue, 13 Jul 2021 14:37:20 +0100 Message-Id: <20210713133726.26842-29-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::330; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x330.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMAXA and VMINA insns, which take the absolute value of the signed elements in the input vector and then accumulate the unsigned max or min into the destination vector. 
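A minimal sketch of the per-element semantics for byte elements follows. It is an illustration only: the function names are hypothetical, predication is ignored, and plain arrays stand in for the vector registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Absolute value of a signed byte, returned as unsigned.
 * Widening first means |-128| = 128 is computed without overflow.
 */
static uint8_t abs_u8(int8_t v)
{
    int32_t w = v;
    return (uint8_t)(w < 0 ? -w : w);
}

/* Hypothetical VMAXA model: d[i] = max(d[i], |m[i]|), compared as unsigned. */
static void vmaxa_u8(uint8_t *d, const int8_t *m, size_t n)
{
    assert(d != NULL && m != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t r = abs_u8(m[i]);
        if (r > d[i]) {
            d[i] = r;
        }
    }
}

/* Hypothetical VMINA model: d[i] = min(d[i], |m[i]|), compared as unsigned. */
static void vmina_u8(uint8_t *d, const int8_t *m, size_t n)
{
    assert(d != NULL && m != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t r = abs_u8(m[i]);
        if (r < d[i]) {
            d[i] = r;
        }
    }
}
```

Because the destination is treated as unsigned, an input of -128 contributes 128 rather than saturating to 127.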
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 8 ++++++++ target/arm/mve.decode | 4 ++++ target/arm/mve_helper.c | 26 ++++++++++++++++++++++++++ target/arm/translate-mve.c | 2 ++ 4 files changed, 40 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 69f0474f6a3..c36640e75e9 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -88,6 +88,14 @@ DEF_HELPER_FLAGS_3(mve_vqnegb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vqnegh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vqnegw, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxab, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxah, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vmaxaw, TCG_CALL_NO_WG, void, env, ptr, ptr) + +DEF_HELPER_FLAGS_3(mve_vminab, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vminah, TCG_CALL_NO_WG, void, env, ptr, ptr) +DEF_HELPER_FLAGS_3(mve_vminaw, TCG_CALL_NO_WG, void, env, ptr, ptr) + DEF_HELPER_FLAGS_3(mve_vmovnbb, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovnbh, TCG_CALL_NO_WG, void, env, ptr, ptr) DEF_HELPER_FLAGS_3(mve_vmovntb, TCG_CALL_NO_WG, void, env, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 1d38dd8dba3..3899937f033 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -156,6 +156,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VQMOVUNB 111 0 1110 0 . 11 .. 01 ... 0 1110 1 0 . 0 ... 1 @1op VQMOVN_BS 111 0 1110 0 . 11 .. 11 ... 0 1110 0 0 . 0 ... 1 @1op + VMAXA 111 0 1110 0 . 11 .. 11 ... 0 1110 1 0 . 0 ... 1 @1op + VMULH_S 111 0 1110 0 . .. ...1 ... 0 1110 . 0 . 0 ... 1 @2op } @@ -176,6 +178,8 @@ VMUL 1110 1111 0 . .. ... 0 ... 0 1001 . 1 . 1 ... 0 @2op VQMOVUNT 111 0 1110 0 . 11 .. 01 ... 1 1110 1 0 . 0 ... 1 @1op VQMOVN_TS 111 0 1110 0 . 11 .. 11 ... 1 1110 0 0 . 0 ... 1 @1op + VMINA 111 0 1110 0 . 11 .. 11 ... 
1 1110 1 0 . 0 ... 1 @1op + VRMULH_S 111 0 1110 0 . .. ...1 ... 1 1110 . 0 . 0 ... 1 @2op } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 3b3695885ef..40e652229d6 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2250,3 +2250,29 @@ DO_1OP_SAT(vqabsw, 4, int32_t, DO_VQABS_W) DO_1OP_SAT(vqnegb, 1, int8_t, DO_VQNEG_B) DO_1OP_SAT(vqnegh, 2, int16_t, DO_VQNEG_H) DO_1OP_SAT(vqnegw, 4, int32_t, DO_VQNEG_W) + +/* + * VMAXA, VMINA: vd is unsigned; vm is signed, and we take its + * absolute value; we then do an unsigned comparison. + */ +#define DO_VMAXMINA(OP, ESIZE, STYPE, UTYPE, FN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ + { \ + UTYPE *d = vd; \ + STYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + UTYPE r = DO_ABS(m[H##ESIZE(e)]); \ + r = FN(d[H##ESIZE(e)], r); \ + mergemask(&d[H##ESIZE(e)], r, mask); \ + } \ + mve_advance_vpt(env); \ + } + +DO_VMAXMINA(vmaxab, 1, int8_t, uint8_t, DO_MAX) +DO_VMAXMINA(vmaxah, 2, int16_t, uint16_t, DO_MAX) +DO_VMAXMINA(vmaxaw, 4, int32_t, uint32_t, DO_MAX) +DO_VMAXMINA(vminab, 1, int8_t, uint8_t, DO_MIN) +DO_VMAXMINA(vminah, 2, int16_t, uint16_t, DO_MIN) +DO_VMAXMINA(vminaw, 4, int32_t, uint32_t, DO_MIN) diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 59e09f58a8c..f243c34bd21 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -277,6 +277,8 @@ DO_1OP(VABS, vabs) DO_1OP(VNEG, vneg) DO_1OP(VQABS, vqabs) DO_1OP(VQNEG, vqneg) +DO_1OP(VMAXA, vmaxa) +DO_1OP(VMINA, vmina) /* Narrowing moves: only size 0 and 1 are valid */ #define DO_VMOVN(INSN, FN) \ From patchwork Tue Jul 13 13:37:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504642 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org 
Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=PLF+1z+c; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMfx6XDdz9sXL for ; Tue, 13 Jul 2021 23:58:33 +1000 (AEST) Received: from localhost ([::1]:58444 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3Ivf-00007e-L9 for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 09:58:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54738) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibr-0000Qd-VL for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:04 -0400 Received: from mail-wr1-x434.google.com ([2a00:1450:4864:20::434]:45606) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibg-0003p2-QD for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:03 -0400 Received: by mail-wr1-x434.google.com with SMTP id t5so16681437wrw.12 for ; Tue, 13 Jul 2021 06:37:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=g56+4yRtsX4HmaJSKKEITprEduln747BESSrSYF9XdU=; b=PLF+1z+cdAjyuUo7ZkDD1mPGpWrkKw/m9ZJ8paYDRbmpovr5qkkIR2HSDk+wMoQS7i XC5Lwt8FQG3BHiv+wOLhB2DVrHpbdYa66TlSc3giOsVgmxMiDUb14wl/odId2r4rRt3w V5Jfwqc2HGq2YmWHj3FkTLkSelrOGz4v3X2TUdXtDnf0voo15+WOSdXyu8d4aQLQg+QU WHQt8WrH1Ch3Qd1NffBAeJ7xnscWWLfqFy+ts4oPq6YPzz5GvAP93dUXwPmsYi1w+kOJ 
bJbrfGJ93y9kz5Q4S07CtduiouZk4fwDCdsRTDiBruqxJAUxcaMPVLuH6v0xT+GjfLah Q2aQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=g56+4yRtsX4HmaJSKKEITprEduln747BESSrSYF9XdU=; b=V/n7N/FvGkByG0FDpI7nUEgNSsQ+cZDtA0VUXaPchdl8Dp5cp2nTYoDbqES+MdPk67 M+Yl2BuLDn2IMEOw+KpKNg80OwzAFvQVHkkEgas539W4vxdk7whn8F2afeaAfe5g0oDQ gPxejeFSc0Td4KcgmB2Fqz1v+x/8D92qpCaKGwG/LM//TIkB/CKMO5mCd6F6IU4aTeE5 CfWeqBTPgNx6NScTaq8+vv2H3ABcHUc2HVfJv0q15mZnCqOh2COwVq6QjuQypbQvMwSB 1lZCr19nPuw0yhv3wgSFRdZ78CUM6Tyod5tjOykQyH0SqZW/7v4NHKoHBYsiUSBNK83W 1R1Q== X-Gm-Message-State: AOAM532AP+SK92Q6QySCTL2n/SQf0FHhf1ON/3l75JXSuJnFqyrgUUI5 3uBhpaECIZpi9AXifPjMhLZEosLAaxUSZYmM X-Google-Smtp-Source: ABdhPJzaaEuxsriXEl2zoSjAcZba+sSn4E8l0oIbI8xd4n9VcMC13fCb/tR3bugkZf4jqNVLxY1CNw== X-Received: by 2002:adf:ec07:: with SMTP id x7mr5932828wrn.262.1626183471130; Tue, 13 Jul 2021 06:37:51 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:50 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 29/34] target/arm: Implement MVE VMOV to/from 2 general-purpose registers Date: Tue, 13 Jul 2021 14:37:21 +0100 Message-Id: <20210713133726.26842-30-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::434; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VMOV forms that move data between 2 general-purpose registers and 2 32-bit lanes in a vector register. 
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/translate-a32.h | 1 + target/arm/mve.decode | 4 ++ target/arm/translate-mve.c | 85 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-vfp.c | 2 +- 4 files changed, 91 insertions(+), 1 deletion(-) diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h index 6dfcafe1796..6f4d65ddb00 100644 --- a/target/arm/translate-a32.h +++ b/target/arm/translate-a32.h @@ -49,6 +49,7 @@ void gen_rev16(TCGv_i32 dest, TCGv_i32 var); void clear_eci_state(DisasContext *s); bool mve_eci_check(DisasContext *s); void mve_update_and_store_eci(DisasContext *s); +bool mve_skip_vmov(DisasContext *s, int vn, int index, int size); static inline TCGv_i32 load_cpu_offset(int offset) { diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 3899937f033..6ac9cb8e4d4 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -136,6 +136,10 @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... @vldr_vstr \ size=2 p=1 +# Moves between 2 32-bit vector lanes and 2 general purpose registers +VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd +VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd + # Vector 2-op VAND 1110 1111 0 . 00 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz VBIC 1110 1111 0 . 01 ... 0 ... 0 0001 . 1 . 1 ... 0 @2op_nosz diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index f243c34bd21..43f917e609e 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -1507,3 +1507,88 @@ static bool do_vabav(DisasContext *s, arg_vabav *a, MVEGenVABAVFn *fn) DO_VABAV(VABAV_S, vabavs) DO_VABAV(VABAV_U, vabavu) + +static bool trans_VMOV_to_2gp(DisasContext *s, arg_VMOV_to_2gp *a) +{ + /* + * VMOV two 32-bit vector lanes to two general-purpose registers. 
+ * This insn is not predicated but it is subject to beat-wise + * execution if it is not in an IT block. For us this means + * only that if PSR.ECI says we should not be executing the beat + * corresponding to the lane of the vector register being accessed + * then we should skip performing the move, and that we need to do + * the usual check for bad ECI state and advance of ECI state. + * (If PSR.ECI is non-zero then we cannot be in an IT block.) + */ + TCGv_i32 tmp; + int vd; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) || + a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15 || + a->rt == a->rt2) { + /* Rt/Rt2 cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* Convert Qreg index to Dreg for read_neon_element32() etc */ + vd = a->qd * 2; + + if (!mve_skip_vmov(s, vd, a->idx, MO_32)) { + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, vd, a->idx, MO_32); + store_reg(s, a->rt, tmp); + } + if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) { + tmp = tcg_temp_new_i32(); + read_neon_element32(tmp, vd + 1, a->idx, MO_32); + store_reg(s, a->rt2, tmp); + } + + mve_update_and_store_eci(s); + return true; +} + +static bool trans_VMOV_from_2gp(DisasContext *s, arg_VMOV_to_2gp *a) +{ + /* + * VMOV two general-purpose registers to two 32-bit vector lanes. + * This insn is not predicated but it is subject to beat-wise + * execution if it is not in an IT block. For us this means + * only that if PSR.ECI says we should not be executing the beat + * corresponding to the lane of the vector register being accessed + * then we should skip performing the move, and that we need to do + * the usual check for bad ECI state and advance of ECI state. + * (If PSR.ECI is non-zero then we cannot be in an IT block.)
+ */ + TCGv_i32 tmp; + int vd; + + if (!dc_isar_feature(aa32_mve, s) || !mve_check_qreg_bank(s, a->qd) || + a->rt == 13 || a->rt == 15 || a->rt2 == 13 || a->rt2 == 15) { + /* Rt/Rt2 cases are UNPREDICTABLE */ + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* Convert Qreg idx to Dreg for read_neon_element32() etc */ + vd = a->qd * 2; + + if (!mve_skip_vmov(s, vd, a->idx, MO_32)) { + tmp = load_reg(s, a->rt); + write_neon_element32(tmp, vd, a->idx, MO_32); + tcg_temp_free_i32(tmp); + } + if (!mve_skip_vmov(s, vd + 1, a->idx, MO_32)) { + tmp = load_reg(s, a->rt2); + write_neon_element32(tmp, vd + 1, a->idx, MO_32); + tcg_temp_free_i32(tmp); + } + + mve_update_and_store_eci(s); + return true; +} diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c index b2991e21ec7..e2eb797c829 100644 --- a/target/arm/translate-vfp.c +++ b/target/arm/translate-vfp.c @@ -581,7 +581,7 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a) return true; } -static bool mve_skip_vmov(DisasContext *s, int vn, int index, int size) +bool mve_skip_vmov(DisasContext *s, int vn, int index, int size) { /* * In a CPU with MVE, the VMOV (vector lane to general-purpose register) From patchwork Tue Jul 13 13:37:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504650 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=AXeeUJfe; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org 
[209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMnT5rc5z9sWd for ; Wed, 14 Jul 2021 00:04:13 +1000 (AEST) Received: from localhost ([::1]:47204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3J19-0003Mn-9a for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 10:04:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54764) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibt-0000Wk-Db for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:06 -0400 Received: from mail-wr1-x435.google.com ([2a00:1450:4864:20::435]:35490) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1m3Ibi-0003q5-IE for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:05 -0400 Received: by mail-wr1-x435.google.com with SMTP id m2so19483547wrq.2 for ; Tue, 13 Jul 2021 06:37:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references:mime-version :content-transfer-encoding; bh=xdTiVrTGrjwqxBrVWc02ZyC5Zuxj4JEnIGJ5DSfl9U4=; b=AXeeUJfeCTahTk0Hs73SVfHTU3+XXQffQhq6aAS4UwsRCaCxnCXzeCQgCp8gZZ+Gkf bC55Qcx3c4lwH5PLeRJniY5IMp2RrANCfDv4Y08EmaUlSCw73xzTV3ga0gGyEdsWRtn5 /b1D0b7HKTYgZCOCMh0EjoWTeOgN1R8pl10X9jdwS6B/eHp1gTKn+8dzmA0axYfFI8A0 yXVtZJGsVYqyS2y+BztJ125ZUAejHKkSYrIbClWXpM93Bvxy5I6SBSvzkZc7gslhwGqk eYKc8Fn8FbQWG8tGKgoTZ15a7WAi1kUaidzG6trDZTL217/Wrm4akrGnX5mto2s4Fb95 vTsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xdTiVrTGrjwqxBrVWc02ZyC5Zuxj4JEnIGJ5DSfl9U4=; b=jvaaci+tEM/mBPSvqanxhZ/rcStlkuaOcKTmFq3Z9i85OmA90C6L7yUjcFxot4Fa0c 
c869uNlBDNB1goDRM2CWVqVP2u/HBrqKMBMi7IMjpNwqH4NmTjtILAoV/eQA9ib2h8D9 KJXfIBRISJveZb/4vzaPTsrvIdYe36+HrpBXfZa+URzfGJssVXff7gzjvEa5KeWkuvcu UMg8yBbbOeUFe57lUxzcQe2LOW0X84uxZt6ieBBxUI9MrfEpT9jWi1UdqKlIoUs8dh6T BrbLHRHZgfaBB1ruAevLu0FKz3zGOgEURvqCUldg+kk+IgFm+gstnGUeJ2ngzVteg2Yx SN4g== X-Gm-Message-State: AOAM530lDsIeN4LvKMSQZstfAGaDgBYHr0QnjPxWr8BZDZkEo1M7MWJN U+fQDJwXM/jielgvVclPC64Qhtord7K5N3ss X-Google-Smtp-Source: ABdhPJzySUwMdE76HsEfWvu8an/vcUOGr80w2pP6i8kmSmJvA0YWSWzv12hvNj2B3/JVKD90mTbw2w== X-Received: by 2002:a5d:420b:: with SMTP id n11mr5816072wrq.395.1626183471783; Tue, 13 Jul 2021 06:37:51 -0700 (PDT) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:51 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 30/34] target/arm: Implement MVE VPNOT Date: Tue, 13 Jul 2021 14:37:22 +0100 Message-Id: <20210713133726.26842-31-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::435; envelope-from=peter.maydell@linaro.org; helo=mail-wr1-x435.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VPNOT insn, which inverts the 
bits in VPR.P0 (subject to both predication and to beatwise execution). Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 1 + target/arm/mve.decode | 1 + target/arm/mve_helper.c | 17 +++++++++++++++++ target/arm/translate-mve.c | 19 +++++++++++++++++++ 4 files changed, 38 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index c36640e75e9..5844bb891ed 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -123,6 +123,7 @@ DEF_HELPER_FLAGS_4(mve_vorn, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) +DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env) DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 6ac9cb8e4d4..82dc07bc30e 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -573,6 +573,7 @@ VCMPGT 1111 1110 0 . .. ... 1 ... 1 1111 0 0 . 0 ... 1 @vcmp VCMPLE 1111 1110 0 . .. ... 1 ... 1 1111 1 0 . 0 ... 1 @vcmp { + VPNOT 1111 1110 0 0 11 000 1 000 0 1111 0100 1101 VPST 1111 1110 0 . 11 000 1 ... 0 1111 0100 1101 mask=%mask_22_13 VCMPEQ_scalar 1111 1110 0 . .. ... 1 ... 0 1111 0 1 0 0 .... @vcmp_scalar } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 40e652229d6..6efb3c69636 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2214,6 +2214,23 @@ void HELPER(mve_vpsel)(CPUARMState *env, void *vd, void *vn, void *vm) mve_advance_vpt(env); } +void HELPER(mve_vpnot)(CPUARMState *env) +{ + /* + * P0 bits for unexecuted beats (where eci_mask is 0) are unchanged. + * P0 bits for predicated lanes in executed beats (where mask is 0) are 0. + * P0 bits otherwise are inverted. + * (This is the same logic as VCMP.) 
+ * This insn is itself subject to predication and to beat-wise execution, + * and after it executes VPT state advances in the usual way. + */ + uint16_t mask = mve_element_mask(env); + uint16_t eci_mask = mve_eci_mask(env); + uint16_t beatpred = ~env->v7m.vpr & mask; + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (beatpred & eci_mask); + mve_advance_vpt(env); +} + #define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ { \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index 43f917e609e..be961864ada 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -889,6 +889,25 @@ static bool trans_VPST(DisasContext *s, arg_VPST *a) return true; } +static bool trans_VPNOT(DisasContext *s, arg_VPNOT *a) +{ + /* + * Invert the predicate in VPR.P0. We have to call out to + * a helper because this insn itself is beatwise and can + * be predicated. + */ + if (!dc_isar_feature(aa32_mve, s)) { + return false; + } + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + gen_helper_mve_vpnot(cpu_env); + mve_update_eci(s); + return true; +} + static bool trans_VADDV(DisasContext *s, arg_VADDV *a) { /* VADDV: vector add across vector */ From patchwork Tue Jul 13 13:37:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504648 From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 31/34] target/arm: Implement MVE VCTP Date: Tue, 13 Jul 2021 14:37:23 +0100 Message-Id: <20210713133726.26842-32-peter.maydell@linaro.org> In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> Implement the MVE VCTP insn, which sets the VPR.P0 predicate 
bits so that any element at index Rn or greater is predicated. As with VPNOT, this insn itself is predicable and subject to beatwise execution. The calculation of the mask is the same as is used to determine ltpmask in mve_element_mask(), but we precalculate masklen in generated code to avoid having to have 4 helpers specialized by size. We put the decode line in with the low-overhead-loop insns in t32.decode because it's logically part of that collection of insn patterns, even though it is an MVE-only insn. Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 2 ++ target/arm/translate-a32.h | 1 + target/arm/t32.decode | 1 + target/arm/mve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-mve.c | 2 +- target/arm/translate.c | 33 +++++++++++++++++++++++++++++++++ 6 files changed, 58 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 5844bb891ed..55f9151ccbf 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -125,6 +125,8 @@ DEF_HELPER_FLAGS_4(mve_veor, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vpsel, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_1(mve_vpnot, TCG_CALL_NO_WG, void, env) +DEF_HELPER_FLAGS_2(mve_vctp, TCG_CALL_NO_WG, void, env, i32) + DEF_HELPER_FLAGS_4(mve_vaddb, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddh, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) DEF_HELPER_FLAGS_4(mve_vaddw, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr) diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h index 6f4d65ddb00..88f15df60e8 100644 --- a/target/arm/translate-a32.h +++ b/target/arm/translate-a32.h @@ -48,6 +48,7 @@ long neon_element_offset(int reg, int element, MemOp memop); void gen_rev16(TCGv_i32 dest, TCGv_i32 var); void clear_eci_state(DisasContext *s); bool mve_eci_check(DisasContext *s); +void mve_update_eci(DisasContext *s); void mve_update_and_store_eci(DisasContext *s); 
bool mve_skip_vmov(DisasContext *s, int vn, int index, int size); diff --git a/target/arm/t32.decode b/target/arm/t32.decode index 2d47f31f143..78fadef9d62 100644 --- a/target/arm/t32.decode +++ b/target/arm/t32.decode @@ -748,5 +748,6 @@ BL 1111 0. .......... 11.1 ............ @branch24 # This is DLSTP DLS 1111 0 0000 0 size:2 rn:4 1110 0000 0000 0001 } + VCTP 1111 0 0000 0 size:2 rn:4 1110 1000 0000 0001 ] } diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 6efb3c69636..210e70d1727 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -2231,6 +2231,26 @@ void HELPER(mve_vpnot)(CPUARMState *env) mve_advance_vpt(env); } +/* + * VCTP: P0 unexecuted bits unchanged, predicated bits zeroed, + * otherwise set according to value of Rn. The calculation of + * newmask here works in the same way as the calculation of the + * ltpmask in mve_element_mask(), but we have pre-calculated + * the masklen in the generated code. + */ +void HELPER(mve_vctp)(CPUARMState *env, uint32_t masklen) +{ + uint16_t mask = mve_element_mask(env); + uint16_t eci_mask = mve_eci_mask(env); + uint16_t newmask; + + assert(masklen <= 16); + newmask = masklen ? 
MAKE_64BIT_MASK(0, masklen) : 0; + newmask &= mask; + env->v7m.vpr = (env->v7m.vpr & ~(uint32_t)eci_mask) | (newmask & eci_mask); + mve_advance_vpt(env); +} + #define DO_1OP_SAT(OP, ESIZE, TYPE, FN) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm) \ { \ diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index be961864ada..be5a3e1a1f5 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -93,7 +93,7 @@ bool mve_eci_check(DisasContext *s) } } -static void mve_update_eci(DisasContext *s) +void mve_update_eci(DisasContext *s) { /* * The helper function will always update the CPUState field, diff --git a/target/arm/translate.c b/target/arm/translate.c index 28e478927df..e0b0cabc39f 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -8677,6 +8677,39 @@ static bool trans_LCTP(DisasContext *s, arg_LCTP *a) return true; } +static bool trans_VCTP(DisasContext *s, arg_VCTP *a) +{ + /* + * M-profile Create Vector Tail Predicate. This insn is itself + * predicated and is subject to beatwise execution. + */ + TCGv_i32 rn_shifted, masklen; + + if (!dc_isar_feature(aa32_mve, s) || a->rn == 13 || a->rn == 15) { + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + /* + * We pre-calculate the mask length here to avoid having + * to have multiple helpers specialized for size. + * We pass the helper "rn <= (1 << (4 - size)) ? (rn << size) : 16". 
+ */ + rn_shifted = tcg_temp_new_i32(); + masklen = load_reg(s, a->rn); + tcg_gen_shli_i32(rn_shifted, masklen, a->size); + tcg_gen_movcond_i32(TCG_COND_LEU, masklen, + masklen, tcg_constant_i32(1 << (4 - a->size)), + rn_shifted, tcg_constant_i32(16)); + gen_helper_mve_vctp(cpu_env, masklen); + tcg_temp_free_i32(masklen); + tcg_temp_free_i32(rn_shifted); + mve_update_eci(s); + return true; +} static bool op_tbranch(DisasContext *s, arg_tbranch *a, bool half) { From patchwork Tue Jul 13 13:37:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504653 From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 32/34] target/arm: Implement MVE scatter-gather insns Date: Tue, 13 Jul 2021 14:37:24 +0100 Message-Id: <20210713133726.26842-33-peter.maydell@linaro.org> In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> Implement the MVE gather-loads and scatter-stores which form the address by adding a base value from a scalar register to an offset in each element of a vector. 
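The addressing rule described above can be sketched in plain C. This is a hypothetical standalone model, not the QEMU helper code: the names sg_addresses() and gather_words(), the one-bit-per-element predicate, and the word array standing in for guest memory are all assumptions made for illustration. It only shows the base-plus-offset arithmetic (optionally scaled, as in the _os_ helper variants) and the zeroing of predicated lanes on a load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model, not QEMU code: compute the four element addresses
 * for a word-sized gather. 'shift' is 0 for unscaled offsets, or log2 of
 * the access size when the offsets are scaled.
 */
static void sg_addresses(uint32_t base, const uint32_t offsets[4],
                         unsigned shift, uint32_t out[4])
{
    assert(shift <= 3);
    for (size_t e = 0; e < 4; e++) {
        /* uint32_t arithmetic wraps modulo 2^32 */
        out[e] = base + (offsets[e] << shift);
    }
}

/*
 * Toy gather load. Bit e of 'pred' set means element e is active;
 * predicated (inactive) elements are written as zero. Addresses are
 * assumed to be word-aligned and inside 'mem'.
 */
static void gather_words(const uint32_t *mem, uint32_t base,
                         const uint32_t offsets[4], unsigned shift,
                         unsigned pred, uint32_t dest[4])
{
    uint32_t addr[4];

    sg_addresses(base, offsets, shift, addr);
    for (size_t e = 0; e < 4; e++) {
        dest[e] = ((pred >> e) & 1) ? mem[addr[e] / 4] : 0;
    }
}
```

For instance, with base 4, offsets {0, 2, 1, 3} and shift 2, the model forms the addresses 4, 12, 8 and 16. The real helpers additionally skip beats that ECI state marks as already executed, which this sketch leaves out.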
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 32 +++++++++ target/arm/mve.decode | 12 ++++ target/arm/mve_helper.c | 129 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 91 ++++++++++++++++++++++++++ 4 files changed, 264 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 55f9151ccbf..9c570270c61 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -33,6 +33,38 @@ DEF_HELPER_FLAGS_3(mve_vstrb_h, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vstrb_w, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(mve_vstrh_w, TCG_CALL_NO_WG, void, env, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_sh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vstrb_sg_ub, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrb_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrb_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_ud, 
TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_sw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uh, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 82dc07bc30e..b0e39f36723 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -42,11 +42,18 @@ &shl_scalar qda rm size &vmaxv qm rda size &vabav qn qm rda size +&vldst_sg qd qm rn size msize os + +# scatter-gather memory size is in bits 6:4 +%sg_msize 6:1 4:1 @vldr_vstr ....... . . . . l:1 rn:4 ... ...... imm:7 &vldr_vstr qd=%qd u=0 # Note that both Rn and Qd are 3 bits only (no D bit) @vldst_wn ... u:1 ... . . . . l:1 . rn:3 qd:3 . ... .. imm:7 &vldr_vstr +@vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \ + qd=%qd qm=%qm msize=%sg_msize + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn @@ -136,6 +143,11 @@ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111101 ....... @vldr_vstr \ VLDR_VSTR 1110110 1 a:1 . w:1 . .... ... 111110 ....... 
@vldr_vstr \ size=2 p=1 +# gather loads/scatter stores +VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg +VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg +VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 210e70d1727..36592b88372 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -213,6 +213,135 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) #undef DO_VLDR #undef DO_VSTR +/* + * Gather loads/scatter stores. Here each element of Qm specifies + * an offset to use from the base register Rn. In the _os_ versions + * that offset is scaled by the element size. + * For loads, predicated lanes are zeroed instead of retaining + * their previous values. + */ +#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + TYPE *d = vd; \ + OFFTYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ + addr = ADDRFN(base, m[H##ESIZE(e)]); \ + d[H##ESIZE(e)] = (mask & 1) ? 
\ + cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ + } \ + mve_advance_vpt(env); \ + } + +/* We know here TYPE is unsigned so always the same as the offset type */ +#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + TYPE *d = vd; \ + TYPE *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + addr = ADDRFN(base, m[H##ESIZE(e)]); \ + if (mask & 1) { \ + cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \ + } \ + } \ + mve_advance_vpt(env); \ + } + +/* + * 64-bit accesses are slightly different: they are done as two 32-bit + * accesses, controlled by the predicate mask for the relevant beat, + * and with a single 32-bit offset in the first of the two Qm elements. + * Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little). + */ +#define DO_VLDR64_SG(OP, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + uint32_t *d = vd; \ + uint32_t *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ + addr = ADDRFN(base, m[H4(e & ~1)]); \ + addr += 4 * (e & 1); \ + d[H4(e)] = (mask & 1) ? 
cpu_ldl_data_ra(env, addr, GETPC()) : 0; \ + } \ + mve_advance_vpt(env); \ + } + +#define DO_VSTR64_SG(OP, ADDRFN) \ + void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ + uint32_t base) \ + { \ + uint32_t *d = vd; \ + uint32_t *m = vm; \ + uint16_t mask = mve_element_mask(env); \ + unsigned e; \ + uint32_t addr; \ + for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + addr = ADDRFN(base, m[H4(e & ~1)]); \ + addr += 4 * (e & 1); \ + if (mask & 1) { \ + cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \ + } \ + } \ + mve_advance_vpt(env); \ + } + +#define ADDR_ADD(BASE, OFFSET) ((BASE) + (OFFSET)) +#define ADDR_ADD_OSH(BASE, OFFSET) ((BASE) + ((OFFSET) << 1)) +#define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2)) +#define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3)) + +DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD) + +DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD) +DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD) +DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD) + +DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH) +DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW) +DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD) + +DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD) +DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD) +DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, 
ADDR_ADD) +DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD) +DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD) + +DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH) +DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH) +DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW) +DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD) + /* * The mergemask(D, R, M) macro performs the operation "*D = R" but * storing only the bytes which correspond to 1 bits in M, diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index be5a3e1a1f5..b0e4bdeb1c5 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -34,6 +34,7 @@ static inline int vidup_imm(DisasContext *s, int x) #include "decode-mve.c.inc" typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr); typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -209,6 +210,96 @@ DO_VLDST_WIDE_NARROW(VLDSTB_H, vldrb_sh, vldrb_uh, vstrb_h, MO_8) DO_VLDST_WIDE_NARROW(VLDSTB_W, vldrb_sw, vldrb_uw, vstrb_w, MO_8) DO_VLDST_WIDE_NARROW(VLDSTH_W, vldrh_sw, vldrh_uw, vstrh_w, MO_16) +static bool do_ldst_sg(DisasContext *s, arg_vldst_sg *a, MVEGenLdStSGFn fn) +{ + TCGv_i32 addr; + TCGv_ptr qd, qm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qd | a->qm) || + !fn || a->rn == 15) { + /* Rn case is UNPREDICTABLE */ + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + addr = load_reg(s, a->rn); + + qd = mve_qreg_ptr(a->qd); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qd, qm, addr); + tcg_temp_free_ptr(qd); + tcg_temp_free_ptr(qm); + tcg_temp_free_i32(addr); + mve_update_eci(s); + return true; +} + +/* + * The naming scheme here is "vldrb_sg_sh == in-memory byte loads + * signextended to halfword elements in register". 
_os_ indicates that + * the offsets in Qm should be scaled by the element size. + */ +/* This macro is just to make the arrays more compact in these functions */ +#define F(N) gen_helper_mve_##N + +/* VLDRB/VSTRB (ie msize 1) with OS=1 is UNPREDICTABLE; we UNDEF */ +static bool trans_VLDR_S_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { NULL, F(vldrb_sg_sh), F(vldrb_sg_sw), NULL }, + { NULL, NULL, F(vldrh_sg_sw), NULL }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, NULL, F(vldrh_sg_os_sw), NULL }, + { NULL, NULL, NULL, NULL }, + { NULL, NULL, NULL, NULL } + } + }; + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +static bool trans_VLDR_U_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { F(vldrb_sg_ub), F(vldrb_sg_uh), F(vldrb_sg_uw), NULL }, + { NULL, F(vldrh_sg_uh), F(vldrh_sg_uw), NULL }, + { NULL, NULL, F(vldrw_sg_uw), NULL }, + { NULL, NULL, NULL, F(vldrd_sg_ud) } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, F(vldrh_sg_os_uh), F(vldrh_sg_os_uw), NULL }, + { NULL, NULL, F(vldrw_sg_os_uw), NULL }, + { NULL, NULL, NULL, F(vldrd_sg_os_ud) } + } + }; + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a) +{ + static MVEGenLdStSGFn * const fns[2][4][4] = { { + { F(vstrb_sg_ub), F(vstrb_sg_uh), F(vstrb_sg_uw), NULL }, + { NULL, F(vstrh_sg_uh), F(vstrh_sg_uw), NULL }, + { NULL, NULL, F(vstrw_sg_uw), NULL }, + { NULL, NULL, NULL, F(vstrd_sg_ud) } + }, { + { NULL, NULL, NULL, NULL }, + { NULL, F(vstrh_sg_os_uh), F(vstrh_sg_os_uw), NULL }, + { NULL, NULL, F(vstrw_sg_os_uw), NULL }, + { NULL, NULL, NULL, F(vstrd_sg_os_ud) } + } + }; + return do_ldst_sg(s, a, fns[a->os][a->msize][a->size]); +} + +#undef F + static bool trans_VDUP(DisasContext *s, arg_VDUP *a) { TCGv_ptr qd; From patchwork Tue Jul 13 13:37:25 2021 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504649 Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:54 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 33/34] target/arm: Implement MVE scatter-gather immediate forms Date: Tue, 13 Jul 2021 14:37:25 +0100 Message-Id: <20210713133726.26842-34-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::331; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x331.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE VLDR/VSTR insns which do scatter-gather using base addresses from Qm plus or minus an immediate offset (possibly with writeback). Note that writeback is not predicated but it does have to honour ECI state, so we have to add an eci_mask check to the VSTR_SG macros (the VLDR_SG macros already needed this to be able to distinguish "skip beat" from "set predicated element to 0"). 
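A minimal standalone sketch of the behaviour described above, for the word-sized load form only. It is not the QEMU helper: the flat byte array standing in for guest memory and the names sketch_ld32, sketch_sg_offset and sketch_vldrw_sg_imm are hypothetical, and host-endian element indexing is ignored. It shows the three points from the description: the immediate is scaled and optionally negated, a predicated-off lane loads zero but still gets writeback, and a beat skipped by ECI is left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical little-endian 32-bit load from a flat byte array. */
static uint32_t sketch_ld32(const uint8_t *mem, size_t mem_size, uint32_t addr)
{
    assert((size_t)addr + 4 <= mem_size);
    return (uint32_t)mem[addr] |
           ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}

/* Word form: scale the 7-bit immediate by 4, negate it when A == 0. */
static uint32_t sketch_sg_offset(uint32_t imm7, int add)
{
    uint32_t offset = imm7 << 2;
    return add ? offset : 0u - offset;
}

/*
 * Model of a word gather with immediate offset.
 * m[] holds the per-lane base addresses (the Qm elements), d[] is the
 * destination. Both masks carry one bit per byte, so each 32-bit lane
 * consumes 4 bits and only the lowest bit of each group is tested.
 */
static void sketch_vldrw_sg_imm(const uint8_t *mem, size_t mem_size,
                                uint32_t d[4], uint32_t m[4], uint32_t offset,
                                uint16_t pred_mask, uint16_t eci_mask, int wb)
{
    unsigned e;

    for (e = 0; e < 4; e++, pred_mask >>= 4, eci_mask >>= 4) {
        uint32_t addr;

        if (!(eci_mask & 1)) {
            /* ECI says this beat was already done: no load, no writeback */
            continue;
        }
        addr = m[e] + offset;
        /* A predicated-off lane is zeroed rather than preserved */
        d[e] = (pred_mask & 1) ? sketch_ld32(mem, mem_size, addr) : 0;
        if (wb) {
            /* Writeback is not predicated */
            m[e] = addr;
        }
    }
}
```

Calling the model with lane 1 predicated off and lane 3 skipped by ECI leaves d[1] zeroed with m[1] updated, while d[3] and m[3] keep their old values.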
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- target/arm/helper-mve.h | 5 +++ target/arm/mve.decode | 10 +++++ target/arm/mve_helper.c | 91 ++++++++++++++++++++++++-------------- target/arm/translate-mve.c | 66 +++++++++++++++++++++++++++ 4 files changed, 140 insertions(+), 32 deletions(-) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 9c570270c61..16799b110fd 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -65,6 +65,11 @@ DEF_HELPER_FLAGS_4(mve_vstrh_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrw_sg_os_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrd_sg_os_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index b0e39f36723..76e9b9c721c 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -43,6 +43,7 @@ &vmaxv qm rda size &vabav qn qm rda size &vldst_sg qd qm rn size msize os +&vldst_sg_imm qd qm a w imm # scatter-gather memory size is in bits 6:4 %sg_msize 6:1 4:1 @@ -54,6 +55,10 @@ @vldst_sg .... .... .... rn:4 .... ... size:2 ... ... os:1 &vldst_sg \ qd=%qd qm=%qm msize=%sg_msize +# Qm is in the fields usually labeled Qn +@vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \ + qd=%qd qm=%qn + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... .. size:2 .... .... .... .... .... 
&2op qd=%qd qm=%qm qn=%qn @@ -148,6 +153,11 @@ VLDR_S_sg 111 0 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg VLDR_U_sg 111 1 1100 1 . 01 .... ... 0 111 . .... .... @vldst_sg VSTR_sg 111 0 1100 1 . 00 .... ... 0 111 . .... .... @vldst_sg +VLDRW_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1110 .... .... @vldst_sg_imm +VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm +VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm +VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 36592b88372..293c0e11819 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -220,7 +220,7 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) * For loads, predicated lanes are zeroed instead of retaining * their previous values. */ -#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN) \ +#define DO_VLDR_SG(OP, LDTYPE, ESIZE, TYPE, OFFTYPE, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ @@ -237,25 +237,35 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) addr = ADDRFN(base, m[H##ESIZE(e)]); \ d[H##ESIZE(e)] = (mask & 1) ? 
\ cpu_##LDTYPE##_data_ra(env, addr, GETPC()) : 0; \ + if (WB) { \ + m[H##ESIZE(e)] = addr; \ + } \ } \ mve_advance_vpt(env); \ } /* We know here TYPE is unsigned so always the same as the offset type */ -#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN) \ +#define DO_VSTR_SG(OP, STTYPE, ESIZE, TYPE, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ TYPE *d = vd; \ TYPE *m = vm; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned e; \ uint32_t addr; \ - for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) { \ + for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE, eci_mask >>= ESIZE) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ addr = ADDRFN(base, m[H##ESIZE(e)]); \ if (mask & 1) { \ cpu_##STTYPE##_data_ra(env, addr, d[H##ESIZE(e)], GETPC()); \ } \ + if (WB) { \ + m[H##ESIZE(e)] = addr; \ + } \ } \ mve_advance_vpt(env); \ } @@ -265,8 +275,10 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) * accesses, controlled by the predicate mask for the relevant beat, * and with a single 32-bit offset in the first of the two Qm elements. * Note that for QEMU our IMPDEF AIRCR.ENDIANNESS is always 0 (little). + * Address writeback happens on the odd beats and updates the address + * stored in the even-beat element. */ -#define DO_VLDR64_SG(OP, ADDRFN) \ +#define DO_VLDR64_SG(OP, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ @@ -283,25 +295,35 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) addr = ADDRFN(base, m[H4(e & ~1)]); \ addr += 4 * (e & 1); \ d[H4(e)] = (mask & 1) ? 
cpu_ldl_data_ra(env, addr, GETPC()) : 0; \ + if (WB && (e & 1)) { \ + m[H4(e & ~1)] = addr - 4; \ + } \ } \ mve_advance_vpt(env); \ } -#define DO_VSTR64_SG(OP, ADDRFN) \ +#define DO_VSTR64_SG(OP, ADDRFN, WB) \ void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm, \ uint32_t base) \ { \ uint32_t *d = vd; \ uint32_t *m = vm; \ uint16_t mask = mve_element_mask(env); \ + uint16_t eci_mask = mve_eci_mask(env); \ unsigned e; \ uint32_t addr; \ - for (e = 0; e < 16 / 4; e++, mask >>= 4) { \ + for (e = 0; e < 16 / 4; e++, mask >>= 4, eci_mask >>= 4) { \ + if (!(eci_mask & 1)) { \ + continue; \ + } \ addr = ADDRFN(base, m[H4(e & ~1)]); \ addr += 4 * (e & 1); \ if (mask & 1) { \ cpu_stl_data_ra(env, addr, d[H4(e)], GETPC()); \ } \ + if (WB && (e & 1)) { \ + m[H4(e & ~1)] = addr - 4; \ + } \ } \ mve_advance_vpt(env); \ } @@ -311,36 +333,41 @@ DO_VSTR(vstrh_w, 2, stw, 4, int32_t) #define ADDR_ADD_OSW(BASE, OFFSET) ((BASE) + ((OFFSET) << 2)) #define ADDR_ADD_OSD(BASE, OFFSET) ((BASE) + ((OFFSET) << 3)) -DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_sh, ldsb, 2, int16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_sw, ldsb, 4, int32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD, false) -DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrb_sg_uw, ldub, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD) -DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD) -DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD) +DO_VLDR_SG(vldrb_sg_ub, ldub, 1, uint8_t, uint8_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_uh, ldub, 2, uint16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrb_sg_uw, ldub, 
4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD, false) +DO_VLDR_SG(vldrh_sg_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR_SG(vldrw_sg_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, false) +DO_VLDR64_SG(vldrd_sg_ud, ADDR_ADD, false) -DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH) -DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW) -DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD) +DO_VLDR_SG(vldrh_sg_os_sw, ldsw, 4, int32_t, uint32_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrh_sg_os_uh, lduw, 2, uint16_t, uint16_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrh_sg_os_uw, lduw, 4, uint32_t, uint32_t, ADDR_ADD_OSH, false) +DO_VLDR_SG(vldrw_sg_os_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD_OSW, false) +DO_VLDR64_SG(vldrd_sg_os_ud, ADDR_ADD_OSD, false) -DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD) -DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD) -DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD) -DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD) -DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD) -DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD) -DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD) +DO_VSTR_SG(vstrb_sg_ub, stb, 1, uint8_t, ADDR_ADD, false) +DO_VSTR_SG(vstrb_sg_uh, stb, 2, uint16_t, ADDR_ADD, false) +DO_VSTR_SG(vstrb_sg_uw, stb, 4, uint32_t, ADDR_ADD, false) +DO_VSTR_SG(vstrh_sg_uh, stw, 2, uint16_t, ADDR_ADD, false) +DO_VSTR_SG(vstrh_sg_uw, stw, 4, uint32_t, ADDR_ADD, false) +DO_VSTR_SG(vstrw_sg_uw, stl, 4, uint32_t, ADDR_ADD, false) +DO_VSTR64_SG(vstrd_sg_ud, ADDR_ADD, false) -DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, uint16_t, ADDR_ADD_OSH) -DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH) -DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW) -DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD) +DO_VSTR_SG(vstrh_sg_os_uh, stw, 2, 
uint16_t, ADDR_ADD_OSH, false) +DO_VSTR_SG(vstrh_sg_os_uw, stw, 4, uint32_t, ADDR_ADD_OSH, false) +DO_VSTR_SG(vstrw_sg_os_uw, stl, 4, uint32_t, ADDR_ADD_OSW, false) +DO_VSTR64_SG(vstrd_sg_os_ud, ADDR_ADD_OSD, false) + +DO_VLDR_SG(vldrw_sg_wb_uw, ldl, 4, uint32_t, uint32_t, ADDR_ADD, true) +DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true) +DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true) +DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true) /* * The mergemask(D, R, M) macro performs the operation "*D = R" but diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c index b0e4bdeb1c5..f4229e308ba 100644 --- a/target/arm/translate-mve.c +++ b/target/arm/translate-mve.c @@ -300,6 +300,72 @@ static bool trans_VSTR_sg(DisasContext *s, arg_vldst_sg *a) #undef F +static bool do_ldst_sg_imm(DisasContext *s, arg_vldst_sg_imm *a, + MVEGenLdStSGFn *fn, unsigned msize) +{ + uint32_t offset; + TCGv_ptr qd, qm; + + if (!dc_isar_feature(aa32_mve, s) || + !mve_check_qreg_bank(s, a->qd | a->qm) || + !fn) { + return false; + } + + if (!mve_eci_check(s) || !vfp_access_check(s)) { + return true; + } + + offset = a->imm << msize; + if (!a->a) { + offset = -offset; + } + + qd = mve_qreg_ptr(a->qd); + qm = mve_qreg_ptr(a->qm); + fn(cpu_env, qd, qm, tcg_constant_i32(offset)); + tcg_temp_free_ptr(qd); + tcg_temp_free_ptr(qm); + mve_update_eci(s); + return true; +} + +static bool trans_VLDRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vldrw_sg_uw, + gen_helper_mve_vldrw_sg_wb_uw, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_32); +} + +static bool trans_VLDRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vldrd_sg_ud, + gen_helper_mve_vldrd_sg_wb_ud, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_64); +} + +static bool trans_VSTRW_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vstrw_sg_uw, + 
gen_helper_mve_vstrw_sg_wb_uw, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_32); +} + +static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a) +{ + static MVEGenLdStSGFn * const fns[] = { + gen_helper_mve_vstrd_sg_ud, + gen_helper_mve_vstrd_sg_wb_ud, + }; + return do_ldst_sg_imm(s, a, fns[a->w], MO_64); +} + static bool trans_VDUP(DisasContext *s, arg_VDUP *a) { TCGv_ptr qd; From patchwork Tue Jul 13 13:37:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 1504651 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=RquzfShV; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GPMpp1JWjz9sXL for ; Wed, 14 Jul 2021 00:05:22 +1000 (AEST) Received: from localhost ([::1]:49934 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1m3J2F-0005Ez-Sv for incoming@patchwork.ozlabs.org; Tue, 13 Jul 2021 10:05:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54848) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1m3Ibx-0000lD-2d for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:09 -0400 Received: from mail-wm1-x336.google.com ([2a00:1450:4864:20::336]:55857) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 
1m3Ibl-0003sz-Rg for qemu-devel@nongnu.org; Tue, 13 Jul 2021 09:38:08 -0400 Received: from orth.archaic.org.uk (orth.archaic.org.uk. 
[81.2.115.148]) by smtp.gmail.com with ESMTPSA id j6sm9827443wrm.97.2021.07.13.06.37.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Jul 2021 06:37:56 -0700 (PDT) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH for-6.2 34/34] target/arm: Implement MVE interleaving loads/stores Date: Tue, 13 Jul 2021 14:37:26 +0100 Message-Id: <20210713133726.26842-35-peter.maydell@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210713133726.26842-1-peter.maydell@linaro.org> References: <20210713133726.26842-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::336; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x336.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Implement the MVE interleaving load/store functions VLD2, VLD4, VST2 and VST4. VLD2 loads 16 bytes of data from memory and writes to 2 consecutive Qregs; VLD4 loads 16 bytes of data from memory and writes to 4 consecutive Qregs. The 'pattern' field in the encoding determines the offset into memory which is accessed and also which elements in the Qregs are written to. (The intention is that a sequence of four consecutive VLD4 with different pattern values performs a complete de-interleaving load of 64 bytes into all elements of the 4 Qregs.) VST2 and VST4 do the same, but for stores. 
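The claim that four VLD4 insns with different pattern values together de-interleave 64 bytes can be checked with a small standalone model. This is a sketch under assumptions, not the helper code: it covers 32-bit elements only, treats memory as an array of 16 words, ignores ECI and host-endian element indexing, and the names sketch_vld4w_off, sketch_vld4w and sketch_vld4w_all are hypothetical. The offset table and the Qreg selection formula are the ones used by the DO_VLD4W invocations in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets accessed by each pattern, one per beat (from DO_VLD4W). */
static const int sketch_vld4w_off[4][4] = {
    { 0, 1, 10, 11 },
    { 2, 3, 12, 13 },
    { 4, 5, 14, 15 },
    { 6, 7, 8, 9 },
};

/* One VLD4 (32-bit elements) with the given pattern: 4 beats, 16 bytes. */
static void sketch_vld4w(uint32_t q[4][4], const uint32_t mem[16], int pat)
{
    const int *off;
    int beat;

    assert(pat >= 0 && pat < 4);
    off = sketch_vld4w_off[pat];
    for (beat = 0; beat < 4; beat++) {
        /* Which of the 4 Qregs this beat writes to */
        int y = (beat + (off[0] & 2)) & 3;
        /* Element index within that Qreg is the word offset divided by 4 */
        q[y][off[beat] >> 2] = mem[off[beat]];
    }
}

/* The full sequence: patterns 0..3 fill every element of the 4 Qregs. */
static void sketch_vld4w_all(uint32_t q[4][4], const uint32_t mem[16])
{
    int pat;

    for (pat = 0; pat < 4; pat++) {
        sketch_vld4w(q, mem, pat);
    }
}
```

After sketch_vld4w_all(), element i of Qreg y holds memory word 4 * i + y, which is the de-interleaved layout. A single pattern on its own fills only four of the sixteen elements.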
Signed-off-by: Peter Maydell Reviewed-by: Richard Henderson --- I found the pseudocode description of these instructions pretty hard to follow, because (1) it is written to be generic over all sizes and pattern values and beat counts and (2) it accesses the vector elements by (Qreg number, beat within Qreg, element within beat). I ended up writing a little program to print out the various intermediate numbers and also calculate "index of element within the whole Qreg", which is what QEMU wants to access elements by. You can find that here: https://people.linaro.org/~peter.maydell/ldinter.c I then just stared at the numbers for each (pattern, esize) specialization and tried to come up with something that does less gluing together of random bits from curBeat, pattern and e than the pseudocode... --- target/arm/helper-mve.h | 48 ++++++ target/arm/mve.decode | 11 ++ target/arm/mve_helper.c | 342 +++++++++++++++++++++++++++++++++++++ target/arm/translate-mve.c | 94 ++++++++++ 4 files changed, 495 insertions(+) diff --git a/target/arm/helper-mve.h b/target/arm/helper-mve.h index 16799b110fd..9d1453ca174 100644 --- a/target/arm/helper-mve.h +++ b/target/arm/helper-mve.h @@ -70,6 +70,54 @@ DEF_HELPER_FLAGS_4(mve_vldrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrw_sg_wb_uw, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) DEF_HELPER_FLAGS_4(mve_vstrd_sg_wb_ud, TCG_CALL_NO_WG, void, env, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(mve_vld20b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld20h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld20w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld21b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld21h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld21w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld40b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld40h, TCG_CALL_NO_WG, void, env, i32, i32) 
+DEF_HELPER_FLAGS_3(mve_vld40w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld41b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld41h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld41w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld42b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld42h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld42w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vld43b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld43h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vld43w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst20b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst20h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst20w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst21b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst21h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst21w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst40b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst40h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst40w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst41b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst41h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst41w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst42b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst42h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst42w, TCG_CALL_NO_WG, void, env, i32, i32) + +DEF_HELPER_FLAGS_3(mve_vst43b, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst43h, TCG_CALL_NO_WG, void, env, i32, i32) +DEF_HELPER_FLAGS_3(mve_vst43w, TCG_CALL_NO_WG, void, env, i32, i32) + DEF_HELPER_FLAGS_3(mve_vdup, TCG_CALL_NO_WG, 
void, env, ptr, i32) DEF_HELPER_FLAGS_4(mve_vidupb, TCG_CALL_NO_WG, i32, env, ptr, i32, i32) diff --git a/target/arm/mve.decode b/target/arm/mve.decode index 76e9b9c721c..faff94cf6d5 100644 --- a/target/arm/mve.decode +++ b/target/arm/mve.decode @@ -44,6 +44,7 @@ &vabav qn qm rda size &vldst_sg qd qm rn size msize os &vldst_sg_imm qd qm a w imm +&vldst_il qd rn size pat w # scatter-gather memory size is in bits 6:4 %sg_msize 6:1 4:1 @@ -59,6 +60,10 @@ @vldst_sg_imm .... .... a:1 . w:1 . .... .... .... . imm:7 &vldst_sg_imm \ qd=%qd qm=%qn +# Deinterleaving load/interleaving store +@vldst_il .... .... .. w:1 . rn:4 .... ... size:2 pat:2 ..... &vldst_il \ + qd=%qd + @1op .... .... .... size:2 .. .... .... .... .... &1op qd=%qd qm=%qm @1op_nosz .... .... .... .... .... .... .... .... &1op qd=%qd qm=%qm size=0 @2op .... .... .. size:2 .... .... .... .... .... &2op qd=%qd qm=%qm qn=%qn @@ -158,6 +163,12 @@ VLDRD_sg_imm 111 1 1101 ... 1 ... 0 ... 1 1111 .... .... @vldst_sg_imm VSTRW_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1110 .... .... @vldst_sg_imm VSTRD_sg_imm 111 1 1101 ... 0 ... 0 ... 1 1111 .... .... @vldst_sg_imm +# deinterleaving loads/interleaving stores +VLD2 1111 1100 1 .. 1 .... ... 1 111 .. .. 00000 @vldst_il +VLD4 1111 1100 1 .. 1 .... ... 1 111 .. .. 00001 @vldst_il +VST2 1111 1100 1 .. 0 .... ... 1 111 .. .. 00000 @vldst_il +VST4 1111 1100 1 .. 0 .... ... 1 111 .. .. 00001 @vldst_il + # Moves between 2 32-bit vector lanes and 2 general purpose registers VMOV_to_2gp 1110 1100 0 . 00 rt2:4 ... 0 1111 000 idx:1 rt:4 qd=%qd VMOV_from_2gp 1110 1100 0 . 01 rt2:4 ... 
0 1111 000 idx:1 rt:4 qd=%qd diff --git a/target/arm/mve_helper.c b/target/arm/mve_helper.c index 293c0e11819..ec9677395fc 100644 --- a/target/arm/mve_helper.c +++ b/target/arm/mve_helper.c @@ -369,6 +369,348 @@ DO_VLDR64_SG(vldrd_sg_wb_ud, ADDR_ADD, true) DO_VSTR_SG(vstrw_sg_wb_uw, stl, 4, uint32_t, ADDR_ADD, true) DO_VSTR64_SG(vstrd_sg_wb_ud, ADDR_ADD, true) +/* + * Deinterleaving loads/interleaving stores. + * + * For these helpers we are passed the index of the first Qreg + * (VLD2/VST2 will also access Qn+1, VLD4/VST4 access Qn .. Qn+3) + * and the value of the base address register Rn. + * The helpers are specialized for pattern and element size, so + * for instance vld42h is VLD4 with pattern 2, element size MO_16. + * + * These insns are beatwise but not predicated, so we must honour ECI, + * but need not look at mve_element_mask(). + * + * The pseudocode implements these insns with multiple memory accesses + * of the element size, but rules R_VVVG and R_FXDM permit us to make + * one 32-bit memory access per beat. 
+ */ +#define DO_VLD4B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 4; e++, data >>= 8) { \ + uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \ + qd[H1(off[beat])] = data; \ + } \ + } \ + } + +#define DO_VLD4H(OP, O1, O2) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O1, O2, O2 }; \ + uint32_t addr, data; \ + int y; /* y counts 0 2 0 2 */ \ + uint16_t *qd; \ + for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 8 + (beat & 1) * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \ + qd[H2(off[beat])] = data; \ + data >>= 16; \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \ + qd[H2(off[beat])] = data; \ + } \ + } + +#define DO_VLD4W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + int y; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + y = (beat + (O1 & 2)) & 3; \ + qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \ + qd[H4(off[beat] >> 2)] = data; \ + } \ + } + +DO_VLD4B(vld40b, 0, 1, 
10, 11) +DO_VLD4B(vld41b, 2, 3, 12, 13) +DO_VLD4B(vld42b, 4, 5, 14, 15) +DO_VLD4B(vld43b, 6, 7, 8, 9) + +DO_VLD4H(vld40h, 0, 5) +DO_VLD4H(vld41h, 1, 6) +DO_VLD4H(vld42h, 2, 7) +DO_VLD4H(vld43h, 3, 4) + +DO_VLD4W(vld40w, 0, 1, 10, 11) +DO_VLD4W(vld41w, 2, 3, 12, 13) +DO_VLD4W(vld42w, 4, 5, 14, 15) +DO_VLD4W(vld43w, 6, 7, 8, 9) + +#define DO_VLD2B(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat, e; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint8_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 2; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 4; e++, data >>= 8) { \ + qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \ + qd[H1(off[beat] + (e >> 1))] = data; \ + } \ + } \ + } + +#define DO_VLD2H(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + int e; \ + uint16_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ + addr = base + off[beat] * 4; \ + data = cpu_ldl_le_data_ra(env, addr, GETPC()); \ + for (e = 0; e < 2; e++, data >>= 16) { \ + qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \ + qd[H2(off[beat])] = data; \ + } \ + } \ + } + +#define DO_VLD2W(OP, O1, O2, O3, O4) \ + void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \ + uint32_t base) \ + { \ + int beat; \ + uint16_t mask = mve_eci_mask(env); \ + const int off[4] = { O1, O2, O3, O4 }; \ + uint32_t addr, data; \ + uint32_t *qd; \ + for (beat = 0; beat < 4; beat++, mask >>= 4) { \ + if ((mask & 1) == 0) { \ + /* ECI says skip this beat */ \ + continue; \ + } \ 
+            addr = base + off[beat]; \
+            data = cpu_ldl_le_data_ra(env, addr, GETPC()); \
+            qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
+            qd[H4(off[beat] >> 3)] = data; \
+        } \
+    }
+
+DO_VLD2B(vld20b, 0, 2, 12, 14)
+DO_VLD2B(vld21b, 4, 6, 8, 10)
+
+DO_VLD2H(vld20h, 0, 1, 6, 7)
+DO_VLD2H(vld21h, 2, 3, 4, 5)
+
+DO_VLD2W(vld20w, 0, 4, 24, 28)
+DO_VLD2W(vld21w, 8, 12, 16, 20)
+
+#define DO_VST4B(OP, O1, O2, O3, O4) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat, e; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O2, O3, O4 }; \
+        uint32_t addr, data; \
+        for (beat = 0; beat < 4; beat++, mask >>= 4) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat] * 4; \
+            data = 0; \
+            for (e = 3; e >= 0; e--) { \
+                uint8_t *qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + e); \
+                data = (data << 8) | qd[H1(off[beat])]; \
+            } \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+#define DO_VST4H(OP, O1, O2) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O1, O2, O2 }; \
+        uint32_t addr, data; \
+        int y; /* y counts 0 2 0 2 */ \
+        uint16_t *qd; \
+        for (beat = 0, y = 0; beat < 4; beat++, mask >>= 4, y ^= 2) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat] * 8 + (beat & 1) * 4; \
+            qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y); \
+            data = qd[H2(off[beat])]; \
+            qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + y + 1); \
+            data |= qd[H2(off[beat])] << 16; \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+#define DO_VST4W(OP, O1, O2, O3, O4) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O2, O3, O4 }; \
+        uint32_t addr, data; \
+        uint32_t *qd; \
+        int y; \
+        for (beat = 0; beat < 4; beat++, mask >>= 4) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat] * 4; \
+            y = (beat + (O1 & 2)) & 3; \
+            qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + y); \
+            data = qd[H4(off[beat] >> 2)]; \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+DO_VST4B(vst40b, 0, 1, 10, 11)
+DO_VST4B(vst41b, 2, 3, 12, 13)
+DO_VST4B(vst42b, 4, 5, 14, 15)
+DO_VST4B(vst43b, 6, 7, 8, 9)
+
+DO_VST4H(vst40h, 0, 5)
+DO_VST4H(vst41h, 1, 6)
+DO_VST4H(vst42h, 2, 7)
+DO_VST4H(vst43h, 3, 4)
+
+DO_VST4W(vst40w, 0, 1, 10, 11)
+DO_VST4W(vst41w, 2, 3, 12, 13)
+DO_VST4W(vst42w, 4, 5, 14, 15)
+DO_VST4W(vst43w, 6, 7, 8, 9)
+
+#define DO_VST2B(OP, O1, O2, O3, O4) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat, e; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O2, O3, O4 }; \
+        uint32_t addr, data; \
+        uint8_t *qd; \
+        for (beat = 0; beat < 4; beat++, mask >>= 4) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat] * 2; \
+            data = 0; \
+            for (e = 3; e >= 0; e--) { \
+                qd = (uint8_t *)aa32_vfp_qreg(env, qnidx + (e & 1)); \
+                data = (data << 8) | qd[H1(off[beat] + (e >> 1))]; \
+            } \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+#define DO_VST2H(OP, O1, O2, O3, O4) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O2, O3, O4 }; \
+        uint32_t addr, data; \
+        int e; \
+        uint16_t *qd; \
+        for (beat = 0; beat < 4; beat++, mask >>= 4) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat] * 4; \
+            data = 0; \
+            for (e = 1; e >= 0; e--) { \
+                qd = (uint16_t *)aa32_vfp_qreg(env, qnidx + e); \
+                data = (data << 16) | qd[H2(off[beat])]; \
+            } \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+#define DO_VST2W(OP, O1, O2, O3, O4) \
+    void HELPER(mve_##OP)(CPUARMState *env, uint32_t qnidx, \
+                          uint32_t base) \
+    { \
+        int beat; \
+        uint16_t mask = mve_eci_mask(env); \
+        const int off[4] = { O1, O2, O3, O4 }; \
+        uint32_t addr, data; \
+        uint32_t *qd; \
+        for (beat = 0; beat < 4; beat++, mask >>= 4) { \
+            if ((mask & 1) == 0) { \
+                /* ECI says skip this beat */ \
+                continue; \
+            } \
+            addr = base + off[beat]; \
+            qd = (uint32_t *)aa32_vfp_qreg(env, qnidx + (beat & 1)); \
+            data = qd[H4(off[beat] >> 3)]; \
+            cpu_stl_le_data_ra(env, addr, data, GETPC()); \
+        } \
+    }
+
+DO_VST2B(vst20b, 0, 2, 12, 14)
+DO_VST2B(vst21b, 4, 6, 8, 10)
+
+DO_VST2H(vst20h, 0, 1, 6, 7)
+DO_VST2H(vst21h, 2, 3, 4, 5)
+
+DO_VST2W(vst20w, 0, 4, 24, 28)
+DO_VST2W(vst21w, 8, 12, 16, 20)
+
 /*
  * The mergemask(D, R, M) macro performs the operation "*D = R" but
  * storing only the bytes which correspond to 1 bits in M,
diff --git a/target/arm/translate-mve.c b/target/arm/translate-mve.c
index f4229e308ba..3ebc6356e4c 100644
--- a/target/arm/translate-mve.c
+++ b/target/arm/translate-mve.c
@@ -35,6 +35,7 @@ static inline int vidup_imm(DisasContext *s, int x)
 
 typedef void MVEGenLdStFn(TCGv_ptr, TCGv_ptr, TCGv_i32);
 typedef void MVEGenLdStSGFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
+typedef void MVEGenLdStIlFn(TCGv_ptr, TCGv_i32, TCGv_i32);
 typedef void MVEGenOneOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void MVEGenTwoOpFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr);
 typedef void MVEGenTwoOpScalarFn(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32);
@@ -366,6 +367,99 @@ static bool trans_VSTRD_sg_imm(DisasContext *s, arg_vldst_sg_imm *a)
     return do_ldst_sg_imm(s, a, fns[a->w], MO_64);
 }
 
+static bool do_vldst_il(DisasContext *s, arg_vldst_il *a, MVEGenLdStIlFn *fn,
+                        int addrinc)
+{
+    TCGv_i32 rn;
+
+    if (!dc_isar_feature(aa32_mve, s) ||
+        !mve_check_qreg_bank(s, a->qd) ||
+        !fn || (a->rn == 13 && a->w) || a->rn == 15) {
+        /* Variously UNPREDICTABLE or UNDEF or related-encoding */
+        return false;
+    }
+    if (!mve_eci_check(s) || !vfp_access_check(s)) {
+        return true;
+    }
+
+    rn = load_reg(s, a->rn);
+    /*
+     * We pass the index of Qd, not a pointer, because the helper must
+     * access multiple Q registers starting at Qd and working up.
+     */
+    fn(cpu_env, tcg_constant_i32(a->qd), rn);
+
+    if (a->w) {
+        tcg_gen_addi_i32(rn, rn, addrinc);
+        store_reg(s, a->rn, rn);
+    } else {
+        tcg_temp_free_i32(rn);
+    }
+    mve_update_and_store_eci(s);
+    return true;
+}
+
+/* This macro is just to make the arrays more compact in these functions */
+#define F(N) gen_helper_mve_##N
+
+static bool trans_VLD2(DisasContext *s, arg_vldst_il *a)
+{
+    static MVEGenLdStIlFn * const fns[4][4] = {
+        { F(vld20b), F(vld20h), F(vld20w), NULL, },
+        { F(vld21b), F(vld21h), F(vld21w), NULL, },
+        { NULL, NULL, NULL, NULL },
+        { NULL, NULL, NULL, NULL },
+    };
+    if (a->qd > 6) {
+        return false;
+    }
+    return do_vldst_il(s, a, fns[a->pat][a->size], 32);
+}
+
+static bool trans_VLD4(DisasContext *s, arg_vldst_il *a)
+{
+    static MVEGenLdStIlFn * const fns[4][4] = {
+        { F(vld40b), F(vld40h), F(vld40w), NULL, },
+        { F(vld41b), F(vld41h), F(vld41w), NULL, },
+        { F(vld42b), F(vld42h), F(vld42w), NULL, },
+        { F(vld43b), F(vld43h), F(vld43w), NULL, },
+    };
+    if (a->qd > 4) {
+        return false;
+    }
+    return do_vldst_il(s, a, fns[a->pat][a->size], 64);
+}
+
+static bool trans_VST2(DisasContext *s, arg_vldst_il *a)
+{
+    static MVEGenLdStIlFn * const fns[4][4] = {
+        { F(vst20b), F(vst20h), F(vst20w), NULL, },
+        { F(vst21b), F(vst21h), F(vst21w), NULL, },
+        { NULL, NULL, NULL, NULL },
+        { NULL, NULL, NULL, NULL },
+    };
+    if (a->qd > 6) {
+        return false;
+    }
+    return do_vldst_il(s, a, fns[a->pat][a->size], 32);
+}
+
+static bool trans_VST4(DisasContext *s, arg_vldst_il *a)
+{
+    static MVEGenLdStIlFn * const fns[4][4] = {
+        { F(vst40b), F(vst40h), F(vst40w), NULL, },
+        { F(vst41b), F(vst41h), F(vst41w), NULL, },
+        { F(vst42b), F(vst42h), F(vst42w), NULL, },
+        { F(vst43b), F(vst43h), F(vst43w), NULL, },
+    };
+    if (a->qd > 4) {
+        return false;
+    }
+    return do_vldst_il(s, a, fns[a->pat][a->size], 64);
+}
+
+#undef F
+
 static bool trans_VDUP(DisasContext *s, arg_VDUP *a)
 {
     TCGv_ptr qd;