From patchwork Thu Feb 16 02:57:10 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743256
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 01/30] include/qemu/cpuid: Introduce xgetbv_low
Date: Wed, 15 Feb 2023 16:57:10 -1000
Message-Id: <20230216025739.1211680-2-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Replace the two uses of asm to expand xgetbv with an inline function.
Since one of the two has been using the mnemonic, assume that the
comment about "older versions of the assembler" is obsolete, as even
that is 4 years old.
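
As a rough illustration (not part of the patch), both callers look only
at a few XCR0 state-component bits in the value returned by
xgetbv_low(0): 0x6 for SSE and AVX state, 0xe0 for the AVX-512 state.
The macro and helper names below are made up for this sketch; only the
bit tests come from the patch.

```c
#include <stdbool.h>

/* XCR0 state-component bits; names are illustrative, not QEMU's. */
#define SK_XCR0_SSE        (1u << 1)   /* XMM state */
#define SK_XCR0_AVX        (1u << 2)   /* upper halves of YMM */
#define SK_XCR0_OPMASK     (1u << 5)   /* AVX-512 k0-k7 */
#define SK_XCR0_ZMM_HI256  (1u << 6)   /* upper halves of ZMM0-15 */
#define SK_XCR0_HI16_ZMM   (1u << 7)   /* ZMM16-31 */

/* Mirrors the (bv & 6) == 6 test: the OS saves both SSE and AVX state. */
static bool sk_avx_usable(unsigned xcr0)
{
    unsigned need = SK_XCR0_SSE | SK_XCR0_AVX;
    return (xcr0 & need) == need;
}

/* Mirrors the nested (bv & 0xe0) == 0xe0 test for AVX-512 state. */
static bool sk_avx512_usable(unsigned xcr0)
{
    unsigned need = SK_XCR0_OPMASK | SK_XCR0_ZMM_HI256 | SK_XCR0_HI16_ZMM;
    return sk_avx_usable(xcr0) && (xcr0 & need) == need;
}
```

On a real host the argument would be xgetbv_low(0), and only after
CPUID has reported OSXSAVE, since the instruction faults otherwise.
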

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 include/qemu/cpuid.h | 7 +++++++
 util/bufferiszero.c | 3 +--
 tcg/i386/tcg-target.c.inc | 11 ++++-------
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 7adb12d320..1451e8ef2f 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,4 +71,11 @@
 #define bit_LZCNT (1 << 5)
 #endif
 
+static inline unsigned xgetbv_low(unsigned c)
+{
+    unsigned a, d;
+    asm("xgetbv" : "=a"(a), "=d"(d) : "c"(c));
+    return a;
+}
+
 #endif /* QEMU_CPUID_H */
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 1790ded7d4..1886bc5ba4 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -258,8 +258,7 @@ static void __attribute__((constructor)) init_cpuid_cache(void)
 
         /* We must check that AVX is not just available, but usable. */
         if ((c & bit_OSXSAVE) && (c & bit_AVX) && max >= 7) {
-            int bv;
-            __asm("xgetbv" : "=a"(bv), "=d"(d) : "c"(0));
+            unsigned bv = xgetbv_low(0);
             __cpuid_count(7, 0, a, b, c, d);
             if ((bv & 0x6) == 0x6 && (b & bit_AVX2)) {
                 cache |= CACHE_AVX2;
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 883ced8168..028ece62a0 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -4156,12 +4156,9 @@ static void tcg_target_init(TCGContext *s)
         /* There are a number of things we must check before we can be
            sure of not hitting invalid opcode. */
         if (c & bit_OSXSAVE) {
-            unsigned xcrl, xcrh;
-            /* The xgetbv instruction is not available to older versions of
-             * the assembler, so we encode the instruction manually.
-             */
-            asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (0));
-            if ((xcrl & 6) == 6) {
+            unsigned bv = xgetbv_low(0);
+
+            if ((bv & 6) == 6) {
                 have_avx1 = (c & bit_AVX) != 0;
                 have_avx2 = (b7 & bit_AVX2) != 0;
 
@@ -4172,7 +4169,7 @@ static void tcg_target_init(TCGContext *s)
                  * check that OPMASK and all extended ZMM state are enabled
                  * even if we're not using them -- the insns will fault.
                  */
-                if ((xcrl & 0xe0) == 0xe0
+                if ((bv & 0xe0) == 0xe0
                     && (b7 & bit_AVX512F)
                     && (b7 & bit_AVX512VL)) {
                     have_avx512vl = true;

From patchwork Thu Feb 16 02:57:11 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743249
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 02/30] include/exec/memop: Add bits describing atomicity
Date: Wed, 15 Feb 2023 16:57:11 -1000
Message-Id: <20230216025739.1211680-3-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

These bits may be used to describe the precise atomicity requirements
of the guest, which may then be used to constrain the methods by which
it may be emulated by the host.
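
A minimal sketch of how a consumer might pull the two new fields apart.
The enum values copy the ones this patch adds to include/exec/memop.h;
the helpers memop_atom() and memop_atmax_bytes() are hypothetical and
exist only for this illustration.

```c
/* Field layout as added by this patch. */
enum {
    MO_ATOM_SHIFT    = 8,
    MO_ATOM_MASK     = 0x3 << MO_ATOM_SHIFT,
    MO_ATOM_IFALIGN  = 0 << MO_ATOM_SHIFT,
    MO_ATOM_NONE     = 1 << MO_ATOM_SHIFT,
    MO_ATOM_SUBALIGN = 2 << MO_ATOM_SHIFT,
    MO_ATOM_WITHIN16 = 3 << MO_ATOM_SHIFT,

    MO_ATMAX_SHIFT   = 10,
    MO_ATMAX_MASK    = 0x3 << MO_ATMAX_SHIFT,
    MO_ATMAX_SIZE    = 0 << MO_ATMAX_SHIFT,
    MO_ATMAX_2       = 1 << MO_ATMAX_SHIFT,
    MO_ATMAX_4       = 2 << MO_ATMAX_SHIFT,
    MO_ATMAX_8       = 3 << MO_ATMAX_SHIFT,
};

/* Hypothetical helper: isolate the MO_ATOM_* field of a memop. */
static unsigned memop_atom(unsigned memop)
{
    return memop & MO_ATOM_MASK;
}

/*
 * Hypothetical helper: largest unit, in bytes, that must be single-copy
 * atomic.  size_bytes is the full operation size, used for MO_ATMAX_SIZE.
 */
static unsigned memop_atmax_bytes(unsigned memop, unsigned size_bytes)
{
    switch (memop & MO_ATMAX_MASK) {
    case MO_ATMAX_2:
        return 2;
    case MO_ATMAX_4:
        return 4;
    case MO_ATMAX_8:
        return 8;
    default:                /* MO_ATMAX_SIZE */
        return size_bytes;
    }
}
```

Because both default encodings are 0, a memop that sets neither field
decodes as MO_ATOM_IFALIGN with MO_ATMAX_SIZE.
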

For instance, the AArch64 LDP (32-bit) instruction changes semantics
with ARMv8.4 LSE2, from

  MO_64 | MO_ATMAX_4 | MO_ATOM_IFALIGN
  (64-bits, single-copy atomic only on 4 byte units,
   nonatomic if not aligned by 4),

to

  MO_64 | MO_ATMAX_SIZE | MO_ATOM_WITHIN16
  (64-bits, single-copy atomic within a 16 byte block).

The former may be implemented with two 4 byte loads, or a single 8 byte
load if that happens to be efficient on the host. The latter may not,
and may also require a helper when misaligned.

Signed-off-by: Richard Henderson
Reviewed-by: Alex Bennée
Reviewed-by: Philippe Mathieu-Daudé
---
 include/exec/memop.h | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/include/exec/memop.h b/include/exec/memop.h
index 25d027434a..04e4048f0b 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -81,6 +81,42 @@ typedef enum MemOp {
     MO_ALIGN_32 = 5 << MO_ASHIFT,
     MO_ALIGN_64 = 6 << MO_ASHIFT,
 
+    /*
+     * MO_ATOM_* describes the atomicity requirements of the operation:
+     * MO_ATOM_IFALIGN: the operation must be single-copy atomic if and
+     *    only if it is aligned; if unaligned there is no atomicity.
+     * MO_ATOM_NONE: the operation has no atomicity requirements.
+     * MO_ATOM_SUBALIGN: the operation is single-copy atomic by parts
+     *    by the alignment. E.g. if the address is 0 mod 4, then each
+     *    4-byte subobject is single-copy atomic.
+     *    This is the atomicity of IBM Power and S390X processors.
+     * MO_ATOM_WITHIN16: the operation is single-copy atomic, even if it
+     *    is unaligned, so long as it does not cross a 16-byte boundary;
+     *    if it crosses a 16-byte boundary there is no atomicity.
+     *    This is the atomicity of Arm FEAT_LSE2.
+     *
+     * MO_ATMAX_* describes the maximum atomicity unit required:
+     * MO_ATMAX_SIZE: the entire operation, i.e. MO_SIZE.
+     * MO_ATMAX_[248]: units of N bytes.
+     *
+     * Note the default (i.e. 0) values are single-copy atomic to the
+     * size of the operation, if aligned. This retains the behaviour
+     * from before these were introduced.
+     */
+    MO_ATOM_SHIFT = 8,
+    MO_ATOM_MASK = 0x3 << MO_ATOM_SHIFT,
+    MO_ATOM_IFALIGN = 0 << MO_ATOM_SHIFT,
+    MO_ATOM_NONE = 1 << MO_ATOM_SHIFT,
+    MO_ATOM_SUBALIGN = 2 << MO_ATOM_SHIFT,
+    MO_ATOM_WITHIN16 = 3 << MO_ATOM_SHIFT,
+
+    MO_ATMAX_SHIFT = 10,
+    MO_ATMAX_MASK = 0x3 << MO_ATMAX_SHIFT,
+    MO_ATMAX_SIZE = 0 << MO_ATMAX_SHIFT,
+    MO_ATMAX_2 = 1 << MO_ATMAX_SHIFT,
+    MO_ATMAX_4 = 2 << MO_ATMAX_SHIFT,
+    MO_ATMAX_8 = 3 << MO_ATMAX_SHIFT,
+
     /* Combinations of the above, for ease of use. */
     MO_UB = MO_8,
     MO_UW = MO_16,

From patchwork Thu Feb 16 02:57:12 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743251
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 03/30] accel/tcg: Add cpu_in_serial_context
Date: Wed, 15 Feb 2023 16:57:12 -1000
Message-Id: <20230216025739.1211680-4-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Like cpu_in_exclusive_context, but also true if
there is no other cpu against which we could race.

Use it in tb_flush as a direct replacement.
Use it in cpu_loop_exit_atomic to ensure that there
is no loop against cpu_exec_step_atomic.
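
The predicate can be modelled outside QEMU as below. This is only a
sketch: SketchCPU, its fields and SK_CF_PARALLEL are stand-ins invented
here (the flag value is a placeholder), and the in_exclusive field
replaces the call to cpu_in_exclusive_context().

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag value for the sketch; QEMU defines the real CF_PARALLEL. */
#define SK_CF_PARALLEL 0x8000u

typedef struct {
    uint32_t tcg_cflags;   /* stands in for CPUState.tcg_cflags */
    bool in_exclusive;     /* stands in for cpu_in_exclusive_context(cs) */
} SketchCPU;

/* Serial if no other vCPU can race with us, or if we hold exclusivity. */
static bool sk_in_serial_context(const SketchCPU *cs)
{
    return !(cs->tcg_cflags & SK_CF_PARALLEL) || cs->in_exclusive;
}
```

Only the parallel, non-exclusive case is not serial, which is the one
case where cpu_loop_exit_atomic may legitimately be reached.
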

Signed-off-by: Richard Henderson
Reviewed-by: Alex Bennée
Reviewed-by: Philippe Mathieu-Daudé
---
 accel/tcg/internal.h | 5 +++++
 accel/tcg/cpu-exec-common.c | 3 +++
 accel/tcg/tb-maint.c | 2 +-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index 6edff16fb0..e181872a93 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -64,4 +64,9 @@ static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
 #endif
 }
 
+static inline bool cpu_in_serial_context(CPUState *cs)
+{
+    return !(cs->tcg_cflags & CF_PARALLEL) || cpu_in_exclusive_context(cs);
+}
+
 #endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
index c7bc8c6efa..2fb4454c7a 100644
--- a/accel/tcg/cpu-exec-common.c
+++ b/accel/tcg/cpu-exec-common.c
@@ -21,6 +21,7 @@
 #include "sysemu/cpus.h"
 #include "sysemu/tcg.h"
 #include "exec/exec-all.h"
+#include "internal.h"
 
 bool tcg_allowed;
 
@@ -78,6 +79,8 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
 
 void cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc)
 {
+    /* Prevent looping if already executing in a serial context. */
+    g_assert(!cpu_in_serial_context(cpu));
     cpu->exception_index = EXCP_ATOMIC;
     cpu_loop_exit_restore(cpu, pc);
 }
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index b3d6529ae2..4f6b447149 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -758,7 +758,7 @@ void tb_flush(CPUState *cpu)
     if (tcg_enabled()) {
         unsigned tb_flush_count = qatomic_mb_read(&tb_ctx.tb_flush_count);
 
-        if (cpu_in_exclusive_context(cpu)) {
+        if (cpu_in_serial_context(cpu)) {
             do_tb_flush(cpu, RUN_ON_CPU_HOST_INT(tb_flush_count));
         } else {
             async_safe_run_on_cpu(cpu, do_tb_flush,

From patchwork Thu Feb 16 02:57:13 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743258
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 04/30] accel/tcg: Introduce tlb_read_idx
Date: Wed, 15 Feb 2023 16:57:13 -1000
Message-Id: <20230216025739.1211680-5-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Instead of playing with offsetof in various places, use
MMUAccessType to index an array. This is easily defined
instead of the previous dummy padding array in the union.
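
The layout trick can be tried in isolation with the sketch below. It is
an approximation under stated assumptions: a 64-bit guest address type,
32-byte entries, and Sk-prefixed names that are invented here rather
than taken from QEMU. The plain load stands in for the qatomic_read()
that the real helper uses when TCG_OVERSIZED_GUEST is not set.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sk_ulong;      /* assumption: 64-bit guest addresses */
#define SK_ENTRY_BITS 5         /* assumption: 32-byte TLB entries */

typedef enum {
    SK_DATA_LOAD = 0,
    SK_DATA_STORE = 1,
    SK_INST_FETCH = 2,
} SkAccessType;

typedef union {
    struct {
        sk_ulong addr_read;
        sk_ulong addr_write;
        sk_ulong addr_code;
        uintptr_t addend;
    };
    /* Pads to a power of two and aliases the three addr_* members. */
    sk_ulong addr_idx[(1 << SK_ENTRY_BITS) / sizeof(sk_ulong)];
} SkTLBEntry;

/* Compile-time checks that the enum order matches the member order. */
_Static_assert(offsetof(SkTLBEntry, addr_read) ==
               SK_DATA_LOAD * sizeof(sk_ulong), "addr_read moved");
_Static_assert(offsetof(SkTLBEntry, addr_write) ==
               SK_DATA_STORE * sizeof(sk_ulong), "addr_write moved");
_Static_assert(offsetof(SkTLBEntry, addr_code) ==
               SK_INST_FETCH * sizeof(sk_ulong), "addr_code moved");

/* Select the comparator by access type instead of by byte offset. */
static sk_ulong sk_read_idx(const SkTLBEntry *entry, SkAccessType type)
{
    return entry->addr_idx[type];
}
```

Indexing by access type lets callers pass the MMUAccessType they
already hold, so the offsetof plumbing and the switch over access types
are no longer needed.
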

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
Reviewed-by: Alex Bennée
---
 include/exec/cpu-defs.h | 7 ++-
 include/exec/cpu_ldst.h | 26 ++++++++--
 accel/tcg/cputlb.c | 104 +++++++++++++---------------------------
 3 files changed, 59 insertions(+), 78 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 21309cf567..7ce3bcb06b 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -128,8 +128,11 @@ typedef struct CPUTLBEntry {
                use the corresponding iotlb value. */
             uintptr_t addend;
         };
-        /* padding to get a power of two size */
-        uint8_t dummy[1 << CPU_TLB_ENTRY_BITS];
+        /*
+         * Padding to get a power of two size, as well as index
+         * access to addr_{read,write,code}.
+         */
+        target_ulong addr_idx[(1 << CPU_TLB_ENTRY_BITS) / TARGET_LONG_SIZE];
     };
 } CPUTLBEntry;
 
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index 09b55cc0ee..fad6efc0ad 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -360,13 +360,29 @@ static inline void clear_helper_retaddr(void)
 /* Needed for TCG_OVERSIZED_GUEST */
 #include "tcg/tcg.h"
 
+static inline target_ulong tlb_read_idx(const CPUTLBEntry *entry,
+                                        MMUAccessType access_type)
+{
+    /* Do not rearrange the CPUTLBEntry structure members. */
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_read) !=
+                      MMU_DATA_LOAD * TARGET_LONG_SIZE);
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
+                      MMU_DATA_STORE * TARGET_LONG_SIZE);
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
+                      MMU_INST_FETCH * TARGET_LONG_SIZE);
+
+    const target_ulong *ptr = &entry->addr_idx[access_type];
+#if TCG_OVERSIZED_GUEST
+    return *ptr;
+#else
+    /* ofs might correspond to .addr_write, so use qatomic_read */
+    return qatomic_read(ptr);
+#endif
+}
+
 static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
 {
-#if TCG_OVERSIZED_GUEST
-    return entry->addr_write;
-#else
-    return qatomic_read(&entry->addr_write);
-#endif
+    return tlb_read_idx(entry, MMU_DATA_STORE);
 }
 
 /* Find the TLB index corresponding to the mmu_idx + address pair. */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 4812d83961..3cadd35f5d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1442,34 +1442,17 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
     }
 }
 
-static inline target_ulong tlb_read_ofs(CPUTLBEntry *entry, size_t ofs)
-{
-#if TCG_OVERSIZED_GUEST
-    return *(target_ulong *)((uintptr_t)entry + ofs);
-#else
-    /* ofs might correspond to .addr_write, so use qatomic_read */
-    return qatomic_read((target_ulong *)((uintptr_t)entry + ofs));
-#endif
-}
-
 /* Return true if ADDR is present in the victim tlb, and has been copied
    back to the main tlb. */
 static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
-                           size_t elt_ofs, target_ulong page)
+                           MMUAccessType access_type, target_ulong page)
 {
     size_t vidx;
 
     assert_cpu_is_self(env_cpu(env));
     for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
         CPUTLBEntry *vtlb = &env_tlb(env)->d[mmu_idx].vtable[vidx];
-        target_ulong cmp;
-
-        /* elt_ofs might correspond to .addr_write, so use qatomic_read */
-#if TCG_OVERSIZED_GUEST
-        cmp = *(target_ulong *)((uintptr_t)vtlb + elt_ofs);
-#else
-        cmp = qatomic_read((target_ulong *)((uintptr_t)vtlb + elt_ofs));
-#endif
+        target_ulong cmp = tlb_read_idx(vtlb, access_type);
 
         if (cmp == page) {
             /* Found entry in victim tlb, swap tlb and iotlb. */
@@ -1491,11 +1474,6 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
     return false;
 }
 
-/* Macro to call the above, with local variables from the use context. */
-#define VICTIM_TLB_HIT(TY, ADDR) \
-  victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
-                 (ADDR) & TARGET_PAGE_MASK)
-
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
                            CPUTLBEntryFull *full, uintptr_t retaddr)
 {
@@ -1528,29 +1506,12 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 {
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr, page_addr;
-    size_t elt_ofs;
-    int flags;
+    target_ulong tlb_addr = tlb_read_idx(entry, access_type);
+    target_ulong page_addr = addr & TARGET_PAGE_MASK;
+    int flags = TLB_FLAGS_MASK;
 
-    switch (access_type) {
-    case MMU_DATA_LOAD:
-        elt_ofs = offsetof(CPUTLBEntry, addr_read);
-        break;
-    case MMU_DATA_STORE:
-        elt_ofs = offsetof(CPUTLBEntry, addr_write);
-        break;
-    case MMU_INST_FETCH:
-        elt_ofs = offsetof(CPUTLBEntry, addr_code);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    tlb_addr = tlb_read_ofs(entry, elt_ofs);
-
-    flags = TLB_FLAGS_MASK;
-    page_addr = addr & TARGET_PAGE_MASK;
     if (!tlb_hit_page(tlb_addr, page_addr)) {
-        if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
+        if (!victim_tlb_hit(env, mmu_idx, index, access_type, page_addr)) {
             CPUState *cs = env_cpu(env);
 
             if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
@@ -1572,7 +1533,7 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
              */
             flags &= ~TLB_INVALID_MASK;
         }
-        tlb_addr = tlb_read_ofs(entry, elt_ofs);
+        tlb_addr = tlb_read_idx(entry, access_type);
     }
     flags &= tlb_addr;
 
@@ -1786,7 +1747,8 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     if (prot & PAGE_WRITE) {
         tlb_addr = tlb_addr_write(tlbe);
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
+                                addr & TARGET_PAGE_MASK)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_STORE, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
@@ -1810,7 +1772,8 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     } else /* if (prot & PAGE_READ) */ {
         tlb_addr = tlbe->addr_read;
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_LOAD,
+                                addr & TARGET_PAGE_MASK)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_LOAD, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
@@ -1896,13 +1859,9 @@ load_memop(const void *haddr, MemOp op)
 
 static inline uint64_t QEMU_ALWAYS_INLINE
 load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
-            uintptr_t retaddr, MemOp op, bool code_read,
+            uintptr_t retaddr, MemOp op, MMUAccessType access_type,
             FullLoadHelper *full_load)
 {
-    const size_t tlb_off = code_read ?
-        offsetof(CPUTLBEntry, addr_code) : offsetof(CPUTLBEntry, addr_read);
-    const MMUAccessType access_type =
-        code_read ? MMU_INST_FETCH : MMU_DATA_LOAD;
     const unsigned a_bits = get_alignment_bits(get_memop(oi));
     const size_t size = memop_size(op);
     uintptr_t mmu_idx = get_mmuidx(oi);
@@ -1922,18 +1881,18 @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
 
     index = tlb_index(env, mmu_idx, addr);
     entry = tlb_entry(env, mmu_idx, addr);
-    tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+    tlb_addr = tlb_read_idx(entry, access_type);
 
     /* If the TLB entry is for a different page, reload and try again. */
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
+        if (!victim_tlb_hit(env, mmu_idx, index, access_type,
                             addr & TARGET_PAGE_MASK)) {
             tlb_fill(env_cpu(env), addr, size,
                      access_type, mmu_idx, retaddr);
             index = tlb_index(env, mmu_idx, addr);
             entry = tlb_entry(env, mmu_idx, addr);
         }
-        tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+        tlb_addr = tlb_read_idx(entry, access_type);
         tlb_addr &= ~TLB_INVALID_MASK;
     }
 
@@ -2019,7 +1978,8 @@ static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
                               MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_UB);
-    return load_helper(env, addr, oi, retaddr, MO_UB, false, full_ldub_mmu);
+    return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
+                       full_ldub_mmu);
 }
 
 tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
@@ -2032,7 +1992,7 @@ static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUW);
-    return load_helper(env, addr, oi, retaddr, MO_LEUW, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
                        full_le_lduw_mmu);
 }
 
@@ -2046,7 +2006,7 @@ static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUW);
-    return load_helper(env, addr, oi, retaddr, MO_BEUW, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
                        full_be_lduw_mmu);
 }
 
@@ -2060,7 +2020,7 @@ static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUL);
-    return load_helper(env, addr, oi, retaddr, MO_LEUL, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
                        full_le_ldul_mmu);
 }
 
@@ -2074,7 +2034,7 @@ static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUL);
-    return load_helper(env, addr, oi, retaddr, MO_BEUL, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
                        full_be_ldul_mmu);
 }
 
@@ -2088,7 +2048,7 @@ uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
                            MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUQ);
-    return load_helper(env, addr, oi, retaddr, MO_LEUQ, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
                        helper_le_ldq_mmu);
 }
 
@@ -2096,7 +2056,7 @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
                            MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUQ);
-    return load_helper(env, addr, oi, retaddr, MO_BEUQ, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
                        helper_be_ldq_mmu);
 }
 
@@ -2292,7 +2252,6 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
                        uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
                        bool big_endian)
 {
-    const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
     uintptr_t index, index2;
     CPUTLBEntry *entry, *entry2;
     target_ulong page1, page2, tlb_addr, tlb_addr2;
@@ -2314,7 +2273,7 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
 
     tlb_addr2 = tlb_addr_write(entry2);
     if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
-        if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
+        if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
             tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
                      mmu_idx, retaddr);
             index2 = tlb_index(env, mmu_idx, page2);
@@ -2367,7 +2326,6 @@ static inline void
QEMU_ALWAYS_INLINE store_helper(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr, MemOp op) { - const size_t tlb_off = offsetof(CPUTLBEntry, addr_write); const unsigned a_bits = get_alignment_bits(get_memop(oi)); const size_t size = memop_size(op); uintptr_t mmu_idx = get_mmuidx(oi); @@ -2390,7 +2348,7 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val, /* If the TLB entry is for a different page, reload and try again. */ if (!tlb_hit(tlb_addr, addr)) { - if (!victim_tlb_hit(env, mmu_idx, index, tlb_off, + if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE, addr & TARGET_PAGE_MASK)) { tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE, mmu_idx, retaddr); @@ -2696,7 +2654,8 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return load_helper(env, addr, oi, retaddr, MO_8, true, full_ldub_code); + return load_helper(env, addr, oi, retaddr, MO_8, + MMU_INST_FETCH, full_ldub_code); } uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr) @@ -2708,7 +2667,8 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr) static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return load_helper(env, addr, oi, retaddr, MO_TEUW, true, full_lduw_code); + return load_helper(env, addr, oi, retaddr, MO_TEUW, + MMU_INST_FETCH, full_lduw_code); } uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr) @@ -2720,7 +2680,8 @@ uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr) static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return load_helper(env, addr, oi, retaddr, MO_TEUL, true, full_ldl_code); + return load_helper(env, addr, oi, retaddr, MO_TEUL, + MMU_INST_FETCH, full_ldl_code); } uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr) @@ -2732,7 +2693,8 @@ uint32_t cpu_ldl_code(CPUArchState *env, 
abi_ptr addr) static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return load_helper(env, addr, oi, retaddr, MO_TEUQ, true, full_ldq_code); + return load_helper(env, addr, oi, retaddr, MO_TEUQ, + MMU_INST_FETCH, full_ldq_code); } uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)

From patchwork Thu Feb 16 02:57:14 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743232
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 05/30] accel/tcg: Reorg system mode load helpers
Date: Wed, 15 Feb 2023 16:57:14 -1000
Message-Id: <20230216025739.1211680-6-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Instead of trying to unify all operations on uint64_t, pull out
mmu_lookup() to perform the basic tlb hit and resolution.
Create individual functions to handle access by size.
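For readers of this patch, the cross-page handling can be sketched outside
of QEMU roughly as follows. This is only an illustration under stated
assumptions, not the code in the patch: the 4 KiB page size, the flat
buffer standing in for guest RAM, and every sketch_* / Sketch* name are
hypothetical. Unlike the patch, which uses do_ld_4() when the access stays
on one page, the sketch uses the byte-accumulating path in both cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_BITS 12
#define SKETCH_PAGE_MASK (~((UINT64_C(1) << SKETCH_PAGE_BITS) - 1))

/* Hypothetical stand-in for MMULookupPageData: one per-page piece. */
typedef struct SketchPart {
    uint64_t addr;   /* guest address of the first byte of this piece */
    int size;        /* number of bytes of the access in this piece */
} SketchPart;

/*
 * Split [addr, addr + size) at a page boundary, following the
 * arithmetic in mmu_lookup().  Returns true if two pages are touched.
 */
static bool sketch_split(uint64_t addr, int size, SketchPart part[2])
{
    uint64_t last_page = (addr + size - 1) & SKETCH_PAGE_MASK;
    bool crosspage = ((addr ^ last_page) & SKETCH_PAGE_MASK) != 0;

    part[0].addr = addr;
    part[0].size = size;
    part[1].addr = last_page;
    part[1].size = 0;

    if (crosspage) {
        int size1 = (int)(last_page - addr);   /* bytes left on page 0 */
        part[1].size = size - size1;
        part[0].size = size1;
    }
    return crosspage;
}

/* Append n bytes to acc, most significant first (cf. do_ld_bytes_beN). */
static uint64_t sketch_ld_bytes_be(const uint8_t *p, int n, uint64_t acc)
{
    for (int i = 0; i < n; i++) {
        acc = (acc << 8) | p[i];
    }
    return acc;
}

/* Portable 32-bit byte swap, standing in for QEMU's bswap32(). */
static uint32_t sketch_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/*
 * 4-byte load from a flat buffer that models guest RAM starting at
 * guest address 0.  Both pieces are accumulated big-endian; a
 * little-endian access swaps the final value, as do_ld4_mmu() does
 * on its cross-page path.
 */
static uint32_t sketch_ld4(const uint8_t *ram, uint64_t addr,
                           bool little_endian)
{
    SketchPart part[2];
    uint64_t acc = 0;

    sketch_split(addr, 4, part);
    acc = sketch_ld_bytes_be(ram + part[0].addr, part[0].size, acc);
    acc = sketch_ld_bytes_be(ram + part[1].addr, part[1].size, acc);

    return little_endian ? sketch_bswap32((uint32_t)acc) : (uint32_t)acc;
}
```

With bytes 0x11 0x22 at 0xffe..0xfff and 0x33 0x44 at 0x1000..0x1001,
sketch_ld4(ram, 0xffe, false) gives 0x11223344 and the little-endian
variant gives 0x44332211. Accumulating big-endian first and swapping once
at the end keeps the per-page helper free of any endianness parameter.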
Signed-off-by: Richard Henderson
Reviewed-by: Alex Bennée
---
 accel/tcg/cputlb.c | 612 +++++++++++++++++++++++++++++++--------------
 1 file changed, 419 insertions(+), 193 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 3cadd35f5d..1fba836790 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1701,6 +1701,178 @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong addr, int mmu_idx, #endif +/* + * Probe for a load/store operation. + * Return the host address, and store the TLB flags into @flags. + */ + +typedef struct MMULookupPageData { + CPUTLBEntryFull *full; + void *haddr; + target_ulong addr; + int flags; + int size; +} MMULookupPageData; + +typedef struct MMULookupLocals { + MMULookupPageData page[2]; + MemOp memop; + int mmu_idx; +} MMULookupLocals; + +/** + * mmu_lookup1: translate one page + * @env: cpu context + * @data: lookup parameters + * @mmu_idx: virtual address context + * @access_type: load/store/code + * @ra: return address into tcg generated code, or 0 + * + * Resolve the translation for the one page at @data.addr, filling in + * the rest of @data with the results. If the translation fails, + * tlb_fill will longjmp out. Return true if the softmmu tlb for + * @mmu_idx may have resized. + */ +static bool mmu_lookup1(CPUArchState *env, MMULookupPageData *data, + int mmu_idx, MMUAccessType access_type, uintptr_t ra) +{ + target_ulong addr = data->addr; + uintptr_t index = tlb_index(env, mmu_idx, addr); + CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); + target_ulong tlb_addr = tlb_read_idx(entry, access_type); + bool maybe_resized = false; + + /* If the TLB entry is for a different page, reload and try again.
*/ + if (!tlb_hit(tlb_addr, addr)) { + if (!victim_tlb_hit(env, mmu_idx, index, access_type, + addr & TARGET_PAGE_MASK)) { + tlb_fill(env_cpu(env), addr, data->size, access_type, mmu_idx, ra); + maybe_resized = true; + index = tlb_index(env, mmu_idx, addr); + entry = tlb_entry(env, mmu_idx, addr); + } + tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK; + } + + data->flags = tlb_addr & TLB_FLAGS_MASK; + data->full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; + /* Compute haddr speculatively; depending on flags it might be invalid. */ + data->haddr = (void *)((uintptr_t)addr + entry->addend); + + return maybe_resized; +} + +/** + * mmu_watch_or_dirty + * @env: cpu context + * @data: lookup parameters + * @access_type: load/store/code + * @ra: return address into tcg generated code, or 0 + * + * Trigger watchpoints for @data.addr:@data.size; + * record writes to protected clean pages. + */ +static void mmu_watch_or_dirty(CPUArchState *env, MMULookupPageData *data, + MMUAccessType access_type, uintptr_t ra) +{ + CPUTLBEntryFull *full = data->full; + target_ulong addr = data->addr; + int flags = data->flags; + int size = data->size; + + /* On watchpoint hit, this will longjmp out. */ + if (flags & TLB_WATCHPOINT) { + int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ; + cpu_check_watchpoint(env_cpu(env), addr, size, full->attrs, wp, ra); + flags &= ~TLB_WATCHPOINT; + } + + if (flags & TLB_NOTDIRTY) { + notdirty_write(env_cpu(env), addr, size, full, ra); + flags &= ~TLB_NOTDIRTY; + } + data->flags = flags; +} + +/** + * mmu_lookup: translate page(s) + * @env: cpu context + * @addr: virtual address + * @oi: combined mmu_idx and MemOp + * @ra: return address into tcg generated code, or 0 + * @type: load/store/code + * @l: output result + * + * Resolve the translation for the page(s) beginning at @addr, for MemOp.size + * bytes. Return true if the lookup crosses a page boundary.
+ */ +static bool mmu_lookup(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType type, MMULookupLocals *l) +{ + unsigned a_bits; + bool crosspage; + int flags; + + l->memop = get_memop(oi); + l->mmu_idx = get_mmuidx(oi); + + tcg_debug_assert(l->mmu_idx < NB_MMU_MODES); + + /* Handle CPU specific unaligned behaviour */ + a_bits = get_alignment_bits(l->memop); + if (addr & ((1 << a_bits) - 1)) { + cpu_unaligned_access(env_cpu(env), addr, type, l->mmu_idx, ra); + } + + l->page[0].addr = addr; + l->page[0].size = memop_size(l->memop); + l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK; + l->page[1].size = 0; + crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK; + + if (likely(!crosspage)) { + mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra); + + flags = l->page[0].flags; + if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) { + mmu_watch_or_dirty(env, &l->page[0], type, ra); + } + if (unlikely(flags & TLB_BSWAP)) { + l->memop ^= MO_BSWAP; + } + } else { + /* Finish compute of page crossing. */ + int size1 = l->page[1].addr - addr; + l->page[1].size = l->page[0].size - size1; + l->page[0].size = size1; + + /* + * Lookup both pages, recognizing exceptions from either. If the + * second lookup potentially resized, refresh first CPUTLBEntryFull. + */ + mmu_lookup1(env, &l->page[0], l->mmu_idx, type, ra); + if (mmu_lookup1(env, &l->page[1], l->mmu_idx, type, ra)) { + uintptr_t index = tlb_index(env, l->mmu_idx, addr); + l->page[0].full = &env_tlb(env)->d[l->mmu_idx].fulltlb[index]; + } + + flags = l->page[0].flags | l->page[1].flags; + if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) { + mmu_watch_or_dirty(env, &l->page[0], type, ra); + mmu_watch_or_dirty(env, &l->page[1], type, ra); + } + + /* + * Since target/sparc is the only user of TLB_BSWAP, and all + * Sparc accesses are aligned, any treatment across two pages + * would be arbitrary. Refuse it until there's a use. 
+ */ + tcg_debug_assert((flags & TLB_BSWAP) == 0); + } + + return crosspage; +} + /* * Probe for an atomic operation. Do not allow unaligned operations, * or io operations to proceed. Return the host address. @@ -1857,113 +2029,6 @@ load_memop(const void *haddr, MemOp op) } } -static inline uint64_t QEMU_ALWAYS_INLINE -load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi, - uintptr_t retaddr, MemOp op, MMUAccessType access_type, - FullLoadHelper *full_load) -{ - const unsigned a_bits = get_alignment_bits(get_memop(oi)); - const size_t size = memop_size(op); - uintptr_t mmu_idx = get_mmuidx(oi); - uintptr_t index; - CPUTLBEntry *entry; - target_ulong tlb_addr; - void *haddr; - uint64_t res; - - tcg_debug_assert(mmu_idx < NB_MMU_MODES); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, access_type, - mmu_idx, retaddr); - } - - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_read_idx(entry, access_type); - - /* If the TLB entry is for a different page, reload and try again. */ - if (!tlb_hit(tlb_addr, addr)) { - if (!victim_tlb_hit(env, mmu_idx, index, access_type, - addr & TARGET_PAGE_MASK)) { - tlb_fill(env_cpu(env), addr, size, - access_type, mmu_idx, retaddr); - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - } - tlb_addr = tlb_read_idx(entry, access_type); - tlb_addr &= ~TLB_INVALID_MASK; - } - - /* Handle anything that isn't just a straight memory access. */ - if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) { - CPUTLBEntryFull *full; - bool need_swap; - - /* For anything that is unaligned, recurse through full_load. */ - if ((addr & (size - 1)) != 0) { - goto do_unaligned_access; - } - - full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; - - /* Handle watchpoints. */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - /* On watchpoint hit, this will longjmp out. 
*/ - cpu_check_watchpoint(env_cpu(env), addr, size, - full->attrs, BP_MEM_READ, retaddr); - } - - need_swap = size > 1 && (tlb_addr & TLB_BSWAP); - - /* Handle I/O access. */ - if (likely(tlb_addr & TLB_MMIO)) { - return io_readx(env, full, mmu_idx, addr, retaddr, - access_type, op ^ (need_swap * MO_BSWAP)); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - - /* - * Keep these two load_memop separate to ensure that the compiler - * is able to fold the entire function to a single instruction. - * There is a build-time assert inside to remind you of this. ;-) - */ - if (unlikely(need_swap)) { - return load_memop(haddr, op ^ MO_BSWAP); - } - return load_memop(haddr, op); - } - - /* Handle slow unaligned access (it spans two pages or IO). */ - if (size > 1 - && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1 - >= TARGET_PAGE_SIZE)) { - target_ulong addr1, addr2; - uint64_t r1, r2; - unsigned shift; - do_unaligned_access: - addr1 = addr & ~((target_ulong)size - 1); - addr2 = addr1 + size; - r1 = full_load(env, addr1, oi, retaddr); - r2 = full_load(env, addr2, oi, retaddr); - shift = (addr & (size - 1)) * 8; - - if (memop_big_endian(op)) { - /* Big-endian combine. */ - res = (r1 << shift) | (r2 >> ((size * 8) - shift)); - } else { - /* Little-endian combine. */ - res = (r1 >> shift) | (r2 << ((size * 8) - shift)); - } - return res & MAKE_64BIT_MASK(0, size * 8); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - return load_memop(haddr, op); -} - /* * For the benefit of TCG generated code, we want to avoid the * complication of ABI-specific return type promotion and always @@ -1974,90 +2039,250 @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi, * We don't bother with this widened value for SOFTMMU_CODE_ACCESS. 
*/ -static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +/** + * do_ld_mmio_beN: + * @env: cpu context + * @p: translation parameters + * @ret_be: accumulated data + * @mmu_idx: virtual address context + * @ra: return address into tcg generated code, or 0 + * + * Load @p->size bytes from @p->addr, which is memory-mapped i/o. + * The bytes are concatenated in big-endian order with @ret_be. + */ +static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t ret_be, int mmu_idx, + MMUAccessType type, uintptr_t ra) { - validate_memop(oi, MO_UB); - return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD, - full_ldub_mmu); + CPUTLBEntryFull *full = p->full; + target_ulong addr = p->addr; + int i, size = p->size; + + QEMU_IOTHREAD_LOCK_GUARD(); + for (i = 0; i < size; i++) { + uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB); + ret_be = (ret_be << 8) | x; + } + return ret_be; +} + +/** + * do_ld_bytes_beN + * @p: translation parameters + * @ret_be: accumulated data + * + * Load @p->size bytes from @p->haddr, which is RAM. + * The bytes are concatenated in big-endian order with @ret_be. + */ +static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be) +{ + uint8_t *haddr = p->haddr; + int i, size = p->size; + + for (i = 0; i < size; i++) { + ret_be = (ret_be << 8) | haddr[i]; + } + return ret_be; +} + +/* + * Wrapper for the above.
+ */ +static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t ret_be, int mmu_idx, + MMUAccessType type, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra); + } else { + return do_ld_bytes_beN(p, ret_be); + } +} + +static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, MO_UB); + } else { + return *(uint8_t *)p->haddr; + } +} + +static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, MemOp memop, uintptr_t ra) +{ + uint64_t ret; + + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop); + } + + /* Perform the load host endian, then swap if necessary. */ + ret = load_memop(p->haddr, MO_UW); + if (memop & MO_BSWAP) { + ret = bswap16(ret); + } + return ret; +} + +static uint32_t do_ld_4(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, MemOp memop, uintptr_t ra) +{ + uint32_t ret; + + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop); + } + + /* Perform the load host endian. */ + ret = load_memop(p->haddr, MO_UL); + if (memop & MO_BSWAP) { + ret = bswap32(ret); + } + return ret; +} + +static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, MemOp memop, uintptr_t ra) +{ + uint64_t ret; + + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop); + } + + /* Perform the load host endian. 
*/ + ret = load_memop(p->haddr, MO_UQ); + if (memop & MO_BSWAP) { + ret = bswap64(ret); + } + return ret; +} + +static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) +{ + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + tcg_debug_assert(!crosspage); + + return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); } tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_ldub_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_UB); + return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } -static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) { - validate_memop(oi, MO_LEUW); - return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD, - full_le_lduw_mmu); + MMULookupLocals l; + bool crosspage; + uint16_t ret; + uint8_t a, b; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + return do_ld_2(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); + } + + a = do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); + b = do_ld_1(env, &l.page[1], l.mmu_idx, access_type, ra); + + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = a | (b << 8); + } else { + ret = b | (a << 8); + } + return ret; } tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_le_lduw_mmu(env, addr, oi, retaddr); -} - -static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); - return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD, - full_be_lduw_mmu); + validate_memop(oi, MO_LEUW); + return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } 
tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_be_lduw_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_BEUW); + return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } -static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) { - validate_memop(oi, MO_LEUL); - return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD, - full_le_ldul_mmu); + MMULookupLocals l; + bool crosspage; + uint32_t ret; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + return do_ld_4(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); + } + + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap32(ret); + } + return ret; } tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_le_ldul_mmu(env, addr, oi, retaddr); -} - -static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); - return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD, - full_be_ldul_mmu); + validate_memop(oi, MO_LEUL); + return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_be_ldul_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_BEUL); + return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); +} + +static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) +{ + MMULookupLocals l; + bool crosspage; + uint64_t ret; + + crosspage = mmu_lookup(env, 
addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + return do_ld_8(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); + } + + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap64(ret); + } + return ret; } uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_LEUQ); - return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD, - helper_le_ldq_mmu); + return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_BEUQ); - return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD, - helper_be_ldq_mmu); + return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } /* @@ -2100,56 +2325,85 @@ tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, * Load helpers for cpu_ldst.h. 
*/ -static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t retaddr, - FullLoadHelper *full_load) +static void plugin_load_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) { - uint64_t ret; - - ret = full_load(env, addr, oi, retaddr); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; } uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_ldub_mmu); + uint8_t ret; + + validate_memop(oi, MO_UB); + ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_be_lduw_mmu); + uint16_t ret; + + validate_memop(oi, MO_BEUW); + ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_be_ldul_mmu); + uint32_t ret; + + validate_memop(oi, MO_BEUL); + ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, helper_be_ldq_mmu); + uint64_t ret; + + validate_memop(oi, MO_BEUQ); + ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_le_lduw_mmu); + uint16_t ret; + + validate_memop(oi, MO_LEUW); + ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, 
full_le_ldul_mmu); + uint32_t ret; + + validate_memop(oi, MO_LEUL); + ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, helper_le_ldq_mmu); + uint64_t ret; + + validate_memop(oi, MO_LEUQ); + ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, @@ -2651,54 +2905,26 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, /* Code access functions. */ -static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_8, - MMU_INST_FETCH, full_ldub_code); -} - uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(env, true)); - return full_ldub_code(env, addr, oi, 0); -} - -static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_TEUW, - MMU_INST_FETCH, full_lduw_code); + return do_ld1_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(env, true)); - return full_lduw_code(env, addr, oi, 0); -} - -static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_TEUL, - MMU_INST_FETCH, full_ldl_code); + return do_ld2_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(env, true)); - return full_ldl_code(env, addr, oi, 0); -} - -static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, 
oi, retaddr, MO_TEUQ, - MMU_INST_FETCH, full_ldq_code); + return do_ld4_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(env, true)); - return full_ldq_code(env, addr, oi, 0); + return do_ld8_mmu(env, addr, oi, 0, MMU_INST_FETCH); } From patchwork Thu Feb 16 02:57:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743231 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Co58Z8CT; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKQ637Fwz240K for ; Thu, 16 Feb 2023 13:58:20 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUT9-0003Qw-2V; Wed, 15 Feb 2023 21:57:59 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUT5-0003QJ-CA for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:57:55 -0500 Received: from mail-pf1-x42c.google.com ([2607:f8b0:4864:20::42c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUT2-0005gr-4E for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:57:55 -0500 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 06/30] accel/tcg: Reorg system mode store helpers
Date: Wed, 15 Feb 2023 16:57:15 -1000
Message-Id: <20230216025739.1211680-7-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Instead of trying to unify all operations on uint64_t, use mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size.
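For readers following the cross-page path, here is a minimal standalone sketch of the "store by bytes in little-endian order, hand the remainder to the next page" idea described above. It is an illustration only, not the code in this patch: PagePart, swap32, st_bytes_le and st4_split are hypothetical names, MMIO and TLB flags are left out, and plain host buffers stand in for guest pages.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for MMULookupPageData: one page's share of a store. */
typedef struct {
    uint8_t *haddr;   /* host address of the first byte on this page */
    int size;         /* number of bytes that land on this page */
} PagePart;

/* Portable 32-bit byte swap, so the sketch does not rely on compiler builtins. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Store part->size bytes, least-significant byte first, and return
 * the bytes of val_le that have not been stored yet.
 */
static uint64_t st_bytes_le(const PagePart *part, uint64_t val_le)
{
    assert(part->size >= 0 && part->size <= 8);
    for (int i = 0; i < part->size; i++, val_le >>= 8) {
        part->haddr[i] = (uint8_t)val_le;
    }
    return val_le;
}

/*
 * Store a 32-bit value split across two parts.  For a big-endian
 * target layout, swap first so that one little-endian byte loop
 * serves both byte orders.
 */
static void st4_split(const PagePart *first, const PagePart *second,
                      uint32_t val, int big_endian)
{
    uint64_t rest = big_endian ? swap32(val) : val;

    assert(first->size + second->size == 4);
    rest = st_bytes_le(first, rest);
    (void)st_bytes_le(second, rest);
}
```

Returning the unstored remainder is what lets the caller chain the two pages without tracking a separate byte offset. The first call consumes its share and the second call continues with whatever is left.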
Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 408 +++++++++++++++++++++------------------------ 1 file changed, 193 insertions(+), 215 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 1fba836790..186a7f9510 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -2498,322 +2498,300 @@ store_memop(void *haddr, uint64_t val, MemOp op) } } -static void full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); - -static void __attribute__((noinline)) -store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val, - uintptr_t retaddr, size_t size, uintptr_t mmu_idx, - bool big_endian) +/** + * do_st_mmio_leN: + * @env: cpu context + * @p: translation parameters + * @val_le: data to store + * @mmu_idx: virtual address context + * @ra: return address into tcg generated code, or 0 + * + * Store @p->size bytes at @p->addr, which is memory-mapped i/o. + * The bytes to store are extracted in little-endian order from @val_le; + * return the bytes of @val_le beyond @p->size that have not been stored. + */ +static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p, + uint64_t val_le, int mmu_idx, uintptr_t ra) { - uintptr_t index, index2; - CPUTLBEntry *entry, *entry2; - target_ulong page1, page2, tlb_addr, tlb_addr2; - MemOpIdx oi; - size_t size2; - int i; + CPUTLBEntryFull *full = p->full; + target_ulong addr = p->addr; + int i, size = p->size; - /* - * Ensure the second page is in the TLB. Note that the first page - * is already guaranteed to be filled, and that the second page - * cannot evict the first. An exception to this rule is PAGE_WRITE_INV - * handling: the first page could have evicted itself. 
- */ - page1 = addr & TARGET_PAGE_MASK; - page2 = (addr + size) & TARGET_PAGE_MASK; - size2 = (addr + size) & ~TARGET_PAGE_MASK; - index2 = tlb_index(env, mmu_idx, page2); - entry2 = tlb_entry(env, mmu_idx, page2); - - tlb_addr2 = tlb_addr_write(entry2); - if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) { - if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) { - tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE, - mmu_idx, retaddr); - index2 = tlb_index(env, mmu_idx, page2); - entry2 = tlb_entry(env, mmu_idx, page2); - } - tlb_addr2 = tlb_addr_write(entry2); + QEMU_IOTHREAD_LOCK_GUARD(); + for (i = 0; i < size; i++, val_le >>= 8) { + io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB); } + return val_le; +} - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_addr_write(entry); +/** + * do_st_bytes_leN: + * @p: translation parameters + * @val_le: data to store + * + * Store @p->size bytes at @p->haddr, which is RAM. + * The bytes to store are extracted in little-endian order from @val_le; + * return the bytes of @val_le beyond @p->size that have not been stored. + */ +static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le) +{ + uint8_t *haddr = p->haddr; + int i, size = p->size; - /* - * Handle watchpoints. Since this may trap, all checks - * must happen before any store. - */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - cpu_check_watchpoint(env_cpu(env), addr, size - size2, - env_tlb(env)->d[mmu_idx].fulltlb[index].attrs, - BP_MEM_WRITE, retaddr); - } - if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) { - cpu_check_watchpoint(env_cpu(env), page2, size2, - env_tlb(env)->d[mmu_idx].fulltlb[index2].attrs, - BP_MEM_WRITE, retaddr); + for (i = 0; i < size; i++, val_le >>= 8) { + haddr[i] = val_le; } + return val_le; +} - /* - * XXX: not efficient, but simple. - * This loop must go in the forward direction to avoid issues - * with self-modifying code in Windows 64-bit. 
- */ - oi = make_memop_idx(MO_UB, mmu_idx); - if (big_endian) { - for (i = 0; i < size; ++i) { - /* Big-endian extract. */ - uint8_t val8 = val >> (((size - 1) * 8) - (i * 8)); - full_stb_mmu(env, addr + i, val8, oi, retaddr); - } +/* + * Wrapper for the above. + */ +static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, + uint64_t val_le, int mmu_idx, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return do_st_mmio_leN(env, p, val_le, mmu_idx, ra); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + return val_le >> (p->size * 8); } else { - for (i = 0; i < size; ++i) { - /* Little-endian extract. */ - uint8_t val8 = val >> (i * 8); - full_stb_mmu(env, addr + i, val8, oi, retaddr); - } + return do_st_bytes_leN(p, val_le); } } -static inline void QEMU_ALWAYS_INLINE -store_helper(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr, MemOp op) +static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, + int mmu_idx, uintptr_t ra) { - const unsigned a_bits = get_alignment_bits(get_memop(oi)); - const size_t size = memop_size(op); - uintptr_t mmu_idx = get_mmuidx(oi); - uintptr_t index; - CPUTLBEntry *entry; - target_ulong tlb_addr; - void *haddr; - - tcg_debug_assert(mmu_idx < NB_MMU_MODES); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, retaddr); + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, MO_UB); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + /* nothing */ + } else { + *(uint8_t *)p->haddr = val; } - - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_addr_write(entry); - - /* If the TLB entry is for a different page, reload and try again. 
*/ - if (!tlb_hit(tlb_addr, addr)) { - if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE, - addr & TARGET_PAGE_MASK)) { - tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE, - mmu_idx, retaddr); - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - } - tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK; - } - - /* Handle anything that isn't just a straight memory access. */ - if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) { - CPUTLBEntryFull *full; - bool need_swap; - - /* For anything that is unaligned, recurse through byte stores. */ - if ((addr & (size - 1)) != 0) { - goto do_unaligned_access; - } - - full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; - - /* Handle watchpoints. */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - /* On watchpoint hit, this will longjmp out. */ - cpu_check_watchpoint(env_cpu(env), addr, size, - full->attrs, BP_MEM_WRITE, retaddr); - } - - need_swap = size > 1 && (tlb_addr & TLB_BSWAP); - - /* Handle I/O access. */ - if (tlb_addr & TLB_MMIO) { - io_writex(env, full, mmu_idx, val, addr, retaddr, - op ^ (need_swap * MO_BSWAP)); - return; - } - - /* Ignore writes to ROM. */ - if (unlikely(tlb_addr & TLB_DISCARD_WRITE)) { - return; - } - - /* Handle clean RAM pages. */ - if (tlb_addr & TLB_NOTDIRTY) { - notdirty_write(env_cpu(env), addr, size, full, retaddr); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - - /* - * Keep these two store_memop separate to ensure that the compiler - * is able to fold the entire function to a single instruction. - * There is a build-time assert inside to remind you of this. ;-) - */ - if (unlikely(need_swap)) { - store_memop(haddr, val, op ^ MO_BSWAP); - } else { - store_memop(haddr, val, op); - } - return; - } - - /* Handle slow unaligned access (it spans two pages or IO). 
*/ - if (size > 1 - && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1 - >= TARGET_PAGE_SIZE)) { - do_unaligned_access: - store_helper_unaligned(env, addr, val, retaddr, size, - mmu_idx, memop_big_endian(op)); - return; - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - store_memop(haddr, val, op); } -static void __attribute__((noinline)) -full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +static void do_st_2(CPUArchState *env, MMULookupPageData *p, uint16_t val, + int mmu_idx, MemOp memop, uintptr_t ra) { - validate_memop(oi, MO_UB); - store_helper(env, addr, val, oi, retaddr, MO_UB); + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + /* nothing */ + } else { + /* Swap to host endian if necessary, then store. */ + if (memop & MO_BSWAP) { + val = bswap16(val); + } + store_memop(p->haddr, val, MO_UW); + } +} + +static void do_st_4(CPUArchState *env, MMULookupPageData *p, uint32_t val, + int mmu_idx, MemOp memop, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + /* nothing */ + } else { + /* Swap to host endian if necessary, then store. */ + if (memop & MO_BSWAP) { + val = bswap32(val); + } + store_memop(p->haddr, val, MO_UL); + } +} + +static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val, + int mmu_idx, MemOp memop, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + /* nothing */ + } else { + /* Swap to host endian if necessary, then store. 
*/ + if (memop & MO_BSWAP) { + val = bswap64(val); + } + store_memop(p->haddr, val, MO_UQ); + } } void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t retaddr) + MemOpIdx oi, uintptr_t ra) { - full_stb_mmu(env, addr, val, oi, retaddr); + MMULookupLocals l; + bool crosspage; + + validate_memop(oi, MO_UB); + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + tcg_debug_assert(!crosspage); + + do_st_1(env, &l.page[0], val, l.mmu_idx, ra); } -static void full_le_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t ra) { - validate_memop(oi, MO_LEUW); - store_helper(env, addr, val, oi, retaddr, MO_LEUW); + MMULookupLocals l; + bool crosspage; + uint8_t a, b; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + do_st_2(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + return; + } + + if ((l.memop & MO_BSWAP) == MO_LE) { + a = val, b = val >> 8; + } else { + b = val, a = val >> 8; + } + do_st_1(env, &l.page[0], a, l.mmu_idx, ra); + do_st_1(env, &l.page[1], b, l.mmu_idx, ra); } void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - full_le_stw_mmu(env, addr, val, oi, retaddr); -} - -static void full_be_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); - store_helper(env, addr, val, oi, retaddr, MO_BEUW); + validate_memop(oi, MO_LEUW); + do_st2_mmu(env, addr, val, oi, retaddr); } void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - full_be_stw_mmu(env, addr, val, oi, retaddr); + validate_memop(oi, MO_BEUW); + do_st2_mmu(env, addr, val, oi, retaddr); } -static void full_le_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, 
uintptr_t retaddr) +static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) { - validate_memop(oi, MO_LEUL); - store_helper(env, addr, val, oi, retaddr, MO_LEUL); + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + do_st_4(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + return; + } + + /* Swap to little endian for simplicity, then store by bytes. */ + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap32(val); + } + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); } void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - full_le_stl_mmu(env, addr, val, oi, retaddr); -} - -static void full_be_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); - store_helper(env, addr, val, oi, retaddr, MO_BEUL); + validate_memop(oi, MO_LEUL); + do_st4_mmu(env, addr, val, oi, retaddr); } void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - full_be_stl_mmu(env, addr, val, oi, retaddr); + validate_memop(oi, MO_BEUL); + do_st4_mmu(env, addr, val, oi, retaddr); +} + +static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + do_st_8(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + return; + } + + /* Swap to little endian for simplicity, then store by bytes. 
*/ + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap64(val); + } + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); } void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_LEUQ); - store_helper(env, addr, val, oi, retaddr, MO_LEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); } void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_BEUQ); - store_helper(env, addr, val, oi, retaddr, MO_BEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); } /* * Store Helpers for cpu_ldst.h */ -typedef void FullStoreHelper(CPUArchState *env, target_ulong addr, - uint64_t val, MemOpIdx oi, uintptr_t retaddr); - -static inline void cpu_store_helper(CPUArchState *env, target_ulong addr, - uint64_t val, MemOpIdx oi, uintptr_t ra, - FullStoreHelper *full_store) +static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) { - full_store(env, addr, val, oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_stb_mmu); + helper_ret_stb_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_be_stw_mmu); + helper_be_stw_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_be_stl_mmu); + helper_be_stl_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, 
uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, helper_be_stq_mmu); + helper_be_stq_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_le_stw_mmu); + helper_le_stw_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_le_stl_mmu); + helper_le_stl_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, helper_le_stq_mmu); + helper_le_stq_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, From patchwork Thu Feb 16 02:57:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743242 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Wo4/l5mK; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKRk6nw0z1yYg for ; Thu, 16 
Feb 2023 13:59:46 +1100 (AEDT)
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 07/30] accel/tcg: Honor atomicity of loads
Date: Wed, 15 Feb 2023 16:57:16 -1000
Message-Id: <20230216025739.1211680-8-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Create ldst_atomicity.c.inc.
Not required for user-only code loads, because we've ensured that the page is read-only before beginning to translate code. Signed-off-by: Richard Henderson Reviewed-by: Alex Bennée --- accel/tcg/cputlb.c | 170 +++++++--- accel/tcg/user-exec.c | 26 +- accel/tcg/ldst_atomicity.c.inc | 547 +++++++++++++++++++++++++++++++++ 3 files changed, 692 insertions(+), 51 deletions(-) create mode 100644 accel/tcg/ldst_atomicity.c.inc diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 186a7f9510..8e2fe4a271 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1653,6 +1653,9 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr, return qemu_ram_addr_from_host_nofail(p); } +/* Load/store with atomicity primitives. */ +#include "ldst_atomicity.c.inc" + #ifdef CONFIG_PLUGIN /* * Perform a TLB lookup and populate the qemu_plugin_hwaddr structure. @@ -2001,35 +2004,7 @@ static void validate_memop(MemOpIdx oi, MemOp expected) * specifically for reading instructions from system memory. It is * called by the translation loop and in some helpers where the code * is disassembled. It shouldn't be called directly by guest code. - */ - -typedef uint64_t FullLoadHelper(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); - -static inline uint64_t QEMU_ALWAYS_INLINE -load_memop(const void *haddr, MemOp op) -{ - switch (op) { - case MO_UB: - return ldub_p(haddr); - case MO_BEUW: - return lduw_be_p(haddr); - case MO_LEUW: - return lduw_le_p(haddr); - case MO_BEUL: - return (uint32_t)ldl_be_p(haddr); - case MO_LEUL: - return (uint32_t)ldl_le_p(haddr); - case MO_BEUQ: - return ldq_be_p(haddr); - case MO_LEUQ: - return ldq_le_p(haddr); - default: - qemu_build_not_reached(); - } -} - -/* + * * For the benefit of TCG generated code, we want to avoid the * complication of ABI-specific return type promotion and always * return a value extended to the register size of the host. 
This is @@ -2085,17 +2060,134 @@ static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be) return ret_be; } +/** + * do_ld_parts_beN + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but atomically on each aligned part. + */ +static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be) +{ + void *haddr = p->haddr; + int size = p->size; + + do { + uint64_t x; + int n; + + /* + * Find minimum of alignment and size. + * This is slightly stronger than required by MO_ATOM_SUBALIGN, which + * would have only checked the low bits of addr|size once at the start, + * but is just as easy. + */ + switch (((uintptr_t)haddr | size) & 7) { + case 4: + x = cpu_to_be32(load_atomic4(haddr)); + ret_be = (ret_be << 32) | x; + n = 4; + break; + case 2: + case 6: + x = cpu_to_be16(load_atomic2(haddr)); + ret_be = (ret_be << 16) | x; + n = 2; + break; + default: + x = *(uint8_t *)haddr; + ret_be = (ret_be << 8) | x; + n = 1; + break; + case 0: + g_assert_not_reached(); + } + haddr += n; + size -= n; + } while (size != 0); + return ret_be; +} + +/** + * do_ld_whole_be4 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * Four aligned bytes are guaranteed to cover the load. + */ +static uint64_t do_ld_whole_be4(MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 3; + uint32_t x = load_atomic4(p->haddr - o); + + x = cpu_to_be32(x); + x <<= o * 8; + x >>= (4 - p->size) * 8; + return (ret_be << (p->size * 8)) | x; +} + +/** + * do_ld_whole_be8 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * Eight aligned bytes are guaranteed to cover the load. 
+ */ +static uint64_t do_ld_whole_be8(CPUArchState *env, uintptr_t ra, + MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 7; + uint64_t x = load_atomic8_or_exit(env, ra, p->haddr - o); + + x = cpu_to_be64(x); + x <<= o * 8; + x >>= (8 - p->size) * 8; + return (ret_be << (p->size * 8)) | x; +} + /* * Wrapper for the above. */ static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, - uint64_t ret_be, int mmu_idx, - MMUAccessType type, uintptr_t ra) + uint64_t ret_be, int mmu_idx, MMUAccessType type, + MemOp mop, uintptr_t ra) { + MemOp atmax; + if (unlikely(p->flags & TLB_MMIO)) { return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra); - } else { + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the load as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + if (unlikely(p->size >= (1 << atmax))) { + if (!HAVE_al8_fast && p->size < 4) { + return do_ld_whole_be4(p, ret_be); + } else { + return do_ld_whole_be8(env, ra, p, ret_be); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: return do_ld_bytes_beN(p, ret_be); + case MO_ATOM_SUBALIGN: + return do_ld_parts_beN(p, ret_be); + default: + g_assert_not_reached(); } } @@ -2119,7 +2211,7 @@ static uint16_t do_ld_2(CPUArchState *env, MMULookupPageData *p, int mmu_idx, } /* Perform the load host endian, then swap if necessary. */ - ret = load_memop(p->haddr, MO_UW); + ret = load_atom_2(env, ra, p->haddr, memop); if (memop & MO_BSWAP) { ret = bswap16(ret); } @@ -2136,7 +2228,7 @@ static uint32_t do_ld_4(CPUArchState *env, MMULookupPageData *p, int mmu_idx, } /* Perform the load host endian. 
*/ - ret = load_memop(p->haddr, MO_UL); + ret = load_atom_4(env, ra, p->haddr, memop); if (memop & MO_BSWAP) { ret = bswap32(ret); } @@ -2153,7 +2245,7 @@ static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx, } /* Perform the load host endian. */ - ret = load_memop(p->haddr, MO_UQ); + ret = load_atom_8(env, ra, p->haddr, memop); if (memop & MO_BSWAP) { ret = bswap64(ret); } @@ -2229,8 +2321,8 @@ static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return do_ld_4(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); } - ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); - ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, l.memop, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, l.memop, ra); if ((l.memop & MO_BSWAP) == MO_LE) { ret = bswap32(ret); } @@ -2263,8 +2355,8 @@ static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return do_ld_8(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); } - ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); - ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, l.memop, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, l.memop, ra); if ((l.memop & MO_BSWAP) == MO_LE) { ret = bswap64(ret); } diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index ae67d84638..25e55a40fb 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -933,6 +933,8 @@ static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, return ret; } +#include "ldst_atomicity.c.inc" + uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { @@ -955,10 +957,10 @@ uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = 
lduw_be_p(haddr); + ret = load_atom_2(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be16(ret); } uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, @@ -969,10 +971,10 @@ uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldl_be_p(haddr); + ret = load_atom_4(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be32(ret); } uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, @@ -983,10 +985,10 @@ uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldq_be_p(haddr); + ret = load_atom_8(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be64(ret); } uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, @@ -997,10 +999,10 @@ uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = lduw_le_p(haddr); + ret = load_atom_2(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le16(ret); } uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, @@ -1011,10 +1013,10 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldl_le_p(haddr); + ret = load_atom_4(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le32(ret); } uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr 
addr, @@ -1025,10 +1027,10 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldq_le_p(haddr); + ret = load_atom_8(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le64(ret); } Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc new file mode 100644 index 0000000000..c93328fbaa --- /dev/null +++ b/accel/tcg/ldst_atomicity.c.inc @@ -0,0 +1,547 @@ +/* + * Routines common to user and system emulation of load/store. + * + * Copyright (c) 2022 Linaro, Ltd. + * + * SPDX-License-Identifier: GPL-2.0-or-later + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifdef CONFIG_ATOMIC64 +# define HAVE_al8 true +#else +# define HAVE_al8 false +#endif +#define HAVE_al8_fast (ATOMIC_REG_SIZE >= 8) + +#if defined(CONFIG_ATOMIC128) +# define HAVE_al16_fast true +#else +# define HAVE_al16_fast false +#endif + +/** + * required_atomicity: + * + * Return the lg2 bytes of atomicity required by @memop for @p. + * If the operation must be split into two operations to be + * examined separately for atomicity, return -lg2. 
+ */ +static int required_atomicity(CPUArchState *env, uintptr_t p, MemOp memop) +{ + int atmax = memop & MO_ATMAX_MASK; + int size = memop & MO_SIZE; + unsigned tmp; + + if (atmax == MO_ATMAX_SIZE) { + atmax = size; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + + switch (memop & MO_ATOM_MASK) { + case MO_ATOM_IFALIGN: + tmp = (1 << atmax) - 1; + if (p & tmp) { + return MO_8; + } + break; + case MO_ATOM_NONE: + return MO_8; + case MO_ATOM_SUBALIGN: + tmp = p & -p; + if (tmp != 0 && tmp < atmax) { + atmax = tmp; + } + break; + case MO_ATOM_WITHIN16: + tmp = p & 15; + if (tmp + (1 << size) <= 16) { + atmax = size; + } else if (atmax == size) { + return MO_8; + } else if (tmp + (1 << atmax) != 16) { + /* + * Paired load/store, where the pairs aren't aligned. + * One of the two must still be handled atomically. + */ + atmax = -atmax; + } + break; + default: + g_assert_not_reached(); + } + + /* + * Here we have the architectural atomicity of the operation. + * However, when executing in a serial context, we need no extra + * host atomicity in order to avoid racing. This reduction + * avoids looping with cpu_loop_exit_atomic. + */ + if (cpu_in_serial_context(env_cpu(env))) { + return MO_8; + } + return atmax; +} + +/** + * load_atomic2: + * @pv: host address + * + * Atomically load 2 aligned bytes from @pv. + */ +static inline uint16_t load_atomic2(void *pv) +{ + uint16_t *p = __builtin_assume_aligned(pv, 2); + return qatomic_read(p); +} + +/** + * load_atomic4: + * @pv: host address + * + * Atomically load 4 aligned bytes from @pv. + */ +static inline uint32_t load_atomic4(void *pv) +{ + uint32_t *p = __builtin_assume_aligned(pv, 4); + return qatomic_read(p); +} + +/** + * load_atomic8: + * @pv: host address + * + * Atomically load 8 aligned bytes from @pv. 
+ */ +static inline uint64_t load_atomic8(void *pv) +{ + uint64_t *p = __builtin_assume_aligned(pv, 8); + + qemu_build_assert(HAVE_al8); + return qatomic_read__nocheck(p); +} + +/** + * load_atomic16: + * @pv: host address + * + * Atomically load 16 aligned bytes from @pv. + */ +static inline Int128 load_atomic16(void *pv) +{ +#ifdef CONFIG_ATOMIC128 + __uint128_t *p = __builtin_assume_aligned(pv, 16); + Int128Alias r; + + r.u = qatomic_read__nocheck(p); + return r.s; +#else + qemu_build_not_reached(); +#endif +} + +/** + * load_atomic8_or_exit: + * @env: cpu context + * @ra: host unwind address + * @pv: host address + * + * Atomically load 8 aligned bytes from @pv. + * If this is not possible, longjmp out to restart serially. + */ +static uint64_t load_atomic8_or_exit(CPUArchState *env, uintptr_t ra, void *pv) +{ + if (HAVE_al8) { + return load_atomic8(pv); + } + +#ifdef CONFIG_USER_ONLY + /* + * If the page is not writable, then assume the value is immutable + * and requires no locking. This ignores the case of MAP_SHARED with + * another process, because the fallback start_exclusive solution + * provides no protection across processes. + */ + if (!page_check_range(h2g(pv), 8, PAGE_WRITE)) { + uint64_t *p = __builtin_assume_aligned(pv, 8); + return *p; + } +#endif + + /* Ultimate fallback: re-execute in serial context. */ + cpu_loop_exit_atomic(env_cpu(env), ra); +} + +/** + * load_atomic16_or_exit: + * @env: cpu context + * @ra: host unwind address + * @pv: host address + * + * Atomically load 16 aligned bytes from @pv. + * If this is not possible, longjmp out to restart serially. + */ +static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv) +{ + Int128 *p = __builtin_assume_aligned(pv, 16); + + if (HAVE_al16_fast) { + return load_atomic16(p); + } + +#ifdef CONFIG_USER_ONLY + /* + * We can only use cmpxchg to emulate a load if the page is writable. 
+ * If the page is not writable, then assume the value is immutable + * and requires no locking. This ignores the case of MAP_SHARED with + * another process, because the fallback start_exclusive solution + * provides no protection across processes. + */ + if (!page_check_range(h2g(p), 16, PAGE_WRITE)) { + return *p; + } +#endif + + /* + * In system mode all guest pages are writable, and for user-only + * we have just checked writability. Try cmpxchg. + */ +#if defined(CONFIG_CMPXCHG128) + /* Swap 0 with 0, with the side-effect of returning the old value. */ + { + Int128Alias r; + r.u = __sync_val_compare_and_swap_16((__uint128_t *)p, 0, 0); + return r.s; + } +#endif + + /* Ultimate fallback: re-execute in serial context. */ + cpu_loop_exit_atomic(env_cpu(env), ra); +} + +/** + * load_atom_extract_al4x2: + * @pv: host address + * + * Load 4 bytes from @p, from two sequential atomic 4-byte loads. + */ +static uint32_t load_atom_extract_al4x2(void *pv) +{ + uintptr_t pi = (uintptr_t)pv; + int sh = (pi & 3) * 8; + uint32_t a, b; + + pv = (void *)(pi & ~3); + a = load_atomic4(pv); + b = load_atomic4(pv + 4); + + if (HOST_BIG_ENDIAN) { + return (a << sh) | (b >> (-sh & 31)); + } else { + return (a >> sh) | (b << (-sh & 31)); + } +} + +/** + * load_atom_extract_al8x2: + * @pv: host address + * + * Load 8 bytes from @p, from two sequential atomic 8-byte loads. + */ +static uint64_t load_atom_extract_al8x2(void *pv) +{ + uintptr_t pi = (uintptr_t)pv; + int sh = (pi & 7) * 8; + uint64_t a, b; + + pv = (void *)(pi & ~7); + a = load_atomic8(pv); + b = load_atomic8(pv + 8); + + if (HOST_BIG_ENDIAN) { + return (a << sh) | (b >> (-sh & 63)); + } else { + return (a >> sh) | (b << (-sh & 63)); + } +} + +/** + * load_atom_extract_al8: + * @pv: host address + * @s: object size in bytes, @s <= 4. + * + * Atomically load @s bytes from @p, when p % s != 0, and [p, p+s-1] does + * not cross an 8-byte boundary. This means that we can perform an atomic + * 8-byte load and extract. 
+ * The value is returned in the low bits of a uint32_t. + */ +static uint32_t load_atom_extract_al8(void *pv, int s) +{ + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ? 8 - s - o : o) * 8; + + pv = (void *)(pi & ~7); + return load_atomic8(pv) >> shr; +} + +/** + * load_atom_extract_al16_or_exit: + * @env: cpu context + * @ra: host unwind address + * @p: host address + * @s: object size in bytes, @s <= 8. + * + * Atomically load @s bytes from @p, when p % 16 < 8 + * and p % 16 + s > 8. I.e. does not cross a 16-byte + * boundary, but *does* cross an 8-byte boundary. + * This is the slow version, so we must have eliminated + * any faster load_atom_extract_al8 case. + * + * If this is not possible, longjmp out to restart serially. + */ +static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra, + void *pv, int s) +{ + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8; + Int128 r; + + /* + * Note constraints above: p & 8 must be clear. + * Provoke SIGBUS if possible otherwise. + */ + pv = (void *)(pi & ~7); + r = load_atomic16_or_exit(env, ra, pv); + + r = int128_urshift(r, shr); + return int128_getlo(r); +} + +/** + * load_atom_extract_al16_or_al8: + * @p: host address + * @s: object size in bytes, @s <= 8. + * + * Load @s bytes from @p, when p % s != 0. If [p, p+s-1] does not + * cross a 16-byte boundary then the access must be 16-byte atomic, + * otherwise the access must be 8-byte atomic. + */ +static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s) +{ +#if defined(CONFIG_ATOMIC128) + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ?
16 - s - o : o) * 8; + __uint128_t r; + + pv = (void *)(pi & ~7); + if (pi & 8) { + uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8); + uint64_t a = qatomic_read__nocheck(p8); + uint64_t b = qatomic_read__nocheck(p8 + 1); + + if (HOST_BIG_ENDIAN) { + r = ((__uint128_t)a << 64) | b; + } else { + r = ((__uint128_t)b << 64) | a; + } + } else { + __uint128_t *p16 = __builtin_assume_aligned(pv, 16, 0); + r = qatomic_read__nocheck(p16); + } + return r >> shr; +#else + qemu_build_not_reached(); +#endif +} + +/** + * load_atom_4_by_2: + * @pv: host address + * + * Load 4 bytes from @pv, with two 2-byte atomic loads. + */ +static inline uint32_t load_atom_4_by_2(void *pv) +{ + uint32_t a = load_atomic2(pv); + uint32_t b = load_atomic2(pv + 2); + + if (HOST_BIG_ENDIAN) { + return (a << 16) | b; + } else { + return (b << 16) | a; + } +} + +/** + * load_atom_8_by_2: + * @pv: host address + * + * Load 8 bytes from @pv, with four 2-byte atomic loads. + */ +static inline uint64_t load_atom_8_by_2(void *pv) +{ + uint32_t a = load_atom_4_by_2(pv); + uint32_t b = load_atom_4_by_2(pv + 4); + + if (HOST_BIG_ENDIAN) { + return ((uint64_t)a << 32) | b; + } else { + return ((uint64_t)b << 32) | a; + } +} + +/** + * load_atom_8_by_4: + * @pv: host address + * + * Load 8 bytes from @pv, with two 4-byte atomic loads. + */ +static inline uint64_t load_atom_8_by_4(void *pv) +{ + uint32_t a = load_atomic4(pv); + uint32_t b = load_atomic4(pv + 4); + + if (HOST_BIG_ENDIAN) { + return ((uint64_t)a << 32) | b; + } else { + return ((uint64_t)b << 32) | a; + } +} + +/** + * load_atom_2: + * @p: host address + * @memop: the full memory op + * + * Load 2 bytes from @p, honoring the atomicity of @memop. 
+ */ +static uint16_t load_atom_2(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + if (likely((pi & 1) == 0)) { + return load_atomic2(pv); + } + if (HAVE_al16_fast) { + return load_atom_extract_al16_or_al8(pv, 2); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + return lduw_he_p(pv); + case MO_16: + /* The only case remaining is MO_ATOM_WITHIN16. */ + if (!HAVE_al8_fast && (pi & 3) == 1) { + /* Big or little endian, we want the middle two bytes. */ + return load_atomic4(pv - 1) >> 8; + } + if (unlikely((pi & 15) != 7)) { + return load_atom_extract_al8(pv, 2); + } + return load_atom_extract_al16_or_exit(env, ra, pv, 2); + default: + g_assert_not_reached(); + } +} + +/** + * load_atom_4: + * @p: host address + * @memop: the full memory op + * + * Load 4 bytes from @p, honoring the atomicity of @memop. + */ +static uint32_t load_atom_4(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + if (likely((pi & 3) == 0)) { + return load_atomic4(pv); + } + if (HAVE_al16_fast) { + return load_atom_extract_al16_or_al8(pv, 4); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + case MO_16: + case -MO_16: + /* + * For MO_ATOM_IFALIGN, this is more atomicity than required, + * but it's trivially supported on all hosts, better than 4 + * individual byte loads (when the host requires alignment), + * and overlaps with the MO_ATOM_SUBALIGN case of p % 2 == 0. + */ + return load_atom_extract_al4x2(pv); + case MO_32: + if (!(pi & 4)) { + return load_atom_extract_al8(pv, 4); + } + return load_atom_extract_al16_or_exit(env, ra, pv, 4); + default: + g_assert_not_reached(); + } +} + +/** + * load_atom_8: + * @p: host address + * @memop: the full memory op + * + * Load 8 bytes from @p, honoring the atomicity of @memop. 
+ */ +static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + /* + * If the host does not support 8-byte atomics, wait until we have + * examined the atomicity parameters below. + */ + if (HAVE_al8 && likely((pi & 7) == 0)) { + return load_atomic8(pv); + } + if (HAVE_al16_fast) { + return load_atom_extract_al16_or_al8(pv, 8); + } + + atmax = required_atomicity(env, pi, memop); + if (atmax == MO_64) { + if (!HAVE_al8 && (pi & 7) == 0) { + return load_atomic8_or_exit(env, ra, pv); + } + return load_atom_extract_al16_or_exit(env, ra, pv, 8); + } + if (HAVE_al8_fast) { + return load_atom_extract_al8x2(pv); + } + switch (atmax) { + case MO_8: + return ldq_he_p(pv); + case MO_16: + return load_atom_8_by_2(pv); + case MO_32: + return load_atom_8_by_4(pv); + case -MO_32: + if (HAVE_al8) { + return load_atom_extract_al8x2(pv); + } + cpu_loop_exit_atomic(env_cpu(env), ra); + default: + g_assert_not_reached(); + } +} From patchwork Thu Feb 16 02:57:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743240 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=xbqgqUog; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKRd1hhYz23yD for ; Thu, 16 Feb 2023 13:59:41
+1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTD-0003Sh-37; Wed, 15 Feb 2023 21:58:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUT7-0003Qs-N9 for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:57:58 -0500 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUT4-0005iW-Q9 for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:57:57 -0500 Received: by mail-pf1-x434.google.com with SMTP id 16so576115pfo.8 for ; Wed, 15 Feb 2023 18:57:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=5hg55pYj41zo8XEIo2eZDVmK1Cv3+pAAUaqjUmvzjaQ=; b=xbqgqUogLFwzXhrjNI8wayQA55kk0Y0HlvLzqnV4FQfQqW3XC0KUjQ8UUhU6sQdvaU DNo3jMHRNhHxAlA+F8wn9hD+wSReGKJ3Qp4g6k8YwYmaGm2O24P0VVV60UH+etYsFddU QCw4wHbpmYv6bdSs92kVu9blr76WtWNIoY1SnZ1mHzdwZt11jJRoEQU4AmuybqNUM3jc DbXLZONfWxwFr0TNoqbHL7MnsZI6hKO1QnIq9p9QfZFtRwNpjJTqlaIBaCQbARII1ytY 71ytUUzFF58LX4vgKpFbmVeL/IuhHIssfNbgc4pk2trFwaCuk7baSAE++EAiGjVix9Uy M/Sg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5hg55pYj41zo8XEIo2eZDVmK1Cv3+pAAUaqjUmvzjaQ=; b=3L9vn2v1YO6YEmeSaJZkjBUPEtE6OnosPHdDL4qv9S7riD3qLrSXKkyj9gmikpznfS grPwx9tsuWBXZegeoo3IFWWhPwIuTFfbBrbtukXr+CpyCEUFooZta0rtDDqoPYZeZL5S QL4Y7HdiDfCs06o2kfdO6T4kTVT88nIVirOMWypD88NvD+lMVyuazU51WIsh4MyLn7gf UMxd0pUZ2w+blwXUI0lIcgLUCjJUTiftlnCZ3QHgueAtb+f8mEN8BLoL6cHotPwvaU+g 
d5sDP0zZYvIIt6yD3TAlH1LDwVSW4wb9iC/SfeNWTZF0r/vARoUEvDskxr4n0/SR7pwA nxeA== X-Gm-Message-State: AO0yUKUYXDG938Lh7EbR929Q50lXMQMJi3v9JtdKEAHPAGWdS0nDaQbw g741Pqjm7h6yNiAqW7NfZBlMz+H8PjXgZiggOlY= X-Google-Smtp-Source: AK7set9C9H0ktiaWKDkVaX9XpUo1ekaKRTcZ55R7+8elNSvJ1jbBOljanvFUQ5DjeGpPc1gjwTBbRw== X-Received: by 2002:a05:6a00:4305:b0:5a8:51a3:7f69 with SMTP id cb5-20020a056a00430500b005a851a37f69mr792800pfb.2.1676516273260; Wed, 15 Feb 2023 18:57:53 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.57.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:57:52 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 08/30] accel/tcg: Honor atomicity of stores Date: Wed, 15 Feb 2023 16:57:17 -1000 Message-Id: <20230216025739.1211680-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 103 +++---- accel/tcg/user-exec.c | 12 +- 
accel/tcg/ldst_atomicity.c.inc | 491 +++++++++++++++++++++++++++++++++ 3 files changed, 540 insertions(+), 66 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 8e2fe4a271..6c93558a1c 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -2560,36 +2560,6 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, * Store Helpers */ -static inline void QEMU_ALWAYS_INLINE -store_memop(void *haddr, uint64_t val, MemOp op) -{ - switch (op) { - case MO_UB: - stb_p(haddr, val); - break; - case MO_BEUW: - stw_be_p(haddr, val); - break; - case MO_LEUW: - stw_le_p(haddr, val); - break; - case MO_BEUL: - stl_be_p(haddr, val); - break; - case MO_LEUL: - stl_le_p(haddr, val); - break; - case MO_BEUQ: - stq_be_p(haddr, val); - break; - case MO_LEUQ: - stq_le_p(haddr, val); - break; - default: - qemu_build_not_reached(); - } -} - /** * do_st_mmio_leN: * @env: cpu context @@ -2616,38 +2586,51 @@ static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p, return val_le; } -/** - * do_st_bytes_leN: - * @p: translation parameters - * @val_le: data to store - * - * Store @p->size bytes at @p->haddr, which is RAM. - * The bytes to store are extracted in little-endian order from @val_le; - * return the bytes of @val_le beyond @p->size that have not been stored. - */ -static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le) -{ - uint8_t *haddr = p->haddr; - int i, size = p->size; - - for (i = 0; i < size; i++, val_le >>= 8) { - haddr[i] = val_le; - } - return val_le; -} - /* * Wrapper for the above. 
*/ static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, - uint64_t val_le, int mmu_idx, uintptr_t ra) + uint64_t val_le, int mmu_idx, + MemOp mop, uintptr_t ra) { + MemOp atmax; + if (unlikely(p->flags & TLB_MMIO)) { return do_st_mmio_leN(env, p, val_le, mmu_idx, ra); } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { return val_le >> (p->size * 8); - } else { - return do_st_bytes_leN(p, val_le); + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the store as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + if (unlikely(p->size >= (1 << atmax))) { + if (!HAVE_al8_fast && p->size <= 4) { + return store_whole_le4(p->haddr, p->size, val_le); + } else if (HAVE_al8) { + return store_whole_le8(p->haddr, p->size, val_le); + } else { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: + return store_bytes_leN(p->haddr, p->size, val_le); + case MO_ATOM_SUBALIGN: + return store_parts_leN(p->haddr, p->size, val_le); + default: + g_assert_not_reached(); } } @@ -2675,7 +2658,7 @@ static void do_st_2(CPUArchState *env, MMULookupPageData *p, uint16_t val, if (memop & MO_BSWAP) { val = bswap16(val); } - store_memop(p->haddr, val, MO_UW); + store_atom_2(env, ra, p->haddr, memop, val); } } @@ -2691,7 +2674,7 @@ static void do_st_4(CPUArchState *env, MMULookupPageData *p, uint32_t val, if (memop & MO_BSWAP) { val = bswap32(val); } - store_memop(p->haddr, val, MO_UL); + store_atom_4(env, ra, p->haddr, memop, val); } } @@ -2707,7 +2690,7 @@ static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val, if (memop & MO_BSWAP) { val = bswap64(val); } - store_memop(p->haddr, val, MO_UQ); + store_atom_8(env, ra,
p->haddr, memop, val); } } @@ -2776,8 +2759,8 @@ static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, if ((l.memop & MO_BSWAP) != MO_LE) { val = bswap32(val); } - val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); - (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, @@ -2810,8 +2793,8 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, if ((l.memop & MO_BSWAP) != MO_LE) { val = bswap64(val); } - val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); - (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index 25e55a40fb..a4acf705f4 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -1088,7 +1088,7 @@ void cpu_stw_be_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, validate_memop(oi, MO_BEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stw_be_p(haddr, val); + store_atom_2(env, ra, haddr, get_memop(oi), be16_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1100,7 +1100,7 @@ void cpu_stl_be_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, validate_memop(oi, MO_BEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stl_be_p(haddr, val); + store_atom_4(env, ra, haddr, get_memop(oi), be32_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1112,7 +1112,7 @@ void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, validate_memop(oi, MO_BEUQ); haddr = cpu_mmu_lookup(env, 
addr, oi, ra, MMU_DATA_STORE); - stq_be_p(haddr, val); + store_atom_8(env, ra, haddr, get_memop(oi), be64_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1124,7 +1124,7 @@ void cpu_stw_le_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, validate_memop(oi, MO_LEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stw_le_p(haddr, val); + store_atom_2(env, ra, haddr, get_memop(oi), le16_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1136,7 +1136,7 @@ void cpu_stl_le_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, validate_memop(oi, MO_LEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stl_le_p(haddr, val); + store_atom_4(env, ra, haddr, get_memop(oi), le32_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1148,7 +1148,7 @@ void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, validate_memop(oi, MO_LEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stq_le_p(haddr, val); + store_atom_8(env, ra, haddr, get_memop(oi), le64_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index c93328fbaa..0e4292ec66 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -21,6 +21,12 @@ #else # define HAVE_al16_fast false #endif +#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128) +# define HAVE_al16 true +#else +# define HAVE_al16 false +#endif + /** * required_atomicity: @@ -545,3 +551,488 @@ static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, g_assert_not_reached(); } } + +/** + * store_atomic2: + * @pv: host address + * @val: value to store + * + * Atomically store 2 aligned bytes to @pv. 
+ */ +static inline void store_atomic2(void *pv, uint16_t val) +{ + uint16_t *p = __builtin_assume_aligned(pv, 2); + qatomic_set(p, val); +} + +/** + * store_atomic4: + * @pv: host address + * @val: value to store + * + * Atomically store 4 aligned bytes to @pv. + */ +static inline void store_atomic4(void *pv, uint32_t val) +{ + uint32_t *p = __builtin_assume_aligned(pv, 4); + qatomic_set(p, val); +} + +/** + * store_atomic8: + * @pv: host address + * @val: value to store + * + * Atomically store 8 aligned bytes to @pv. + */ +static inline void store_atomic8(void *pv, uint64_t val) +{ + uint64_t *p = __builtin_assume_aligned(pv, 8); + + qemu_build_assert(HAVE_al8); + qatomic_set__nocheck(p, val); +} + +/** + * store_atom_4_by_2 + */ +static inline void store_atom_4_by_2(void *pv, uint32_t val) +{ + store_atomic2(pv, val >> (HOST_BIG_ENDIAN ? 16 : 0)); + store_atomic2(pv + 2, val >> (HOST_BIG_ENDIAN ? 0 : 16)); +} + +/** + * store_atom_8_by_2 + */ +static inline void store_atom_8_by_2(void *pv, uint64_t val) +{ + store_atom_4_by_2(pv, val >> (HOST_BIG_ENDIAN ? 32 : 0)); + store_atom_4_by_2(pv + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32)); +} + +/** + * store_atom_8_by_4 + */ +static inline void store_atom_8_by_4(void *pv, uint64_t val) +{ + store_atomic4(pv, val >> (HOST_BIG_ENDIAN ? 32 : 0)); + store_atomic4(pv + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32)); +} + +/** + * store_atom_insert_al4: + * @p: host address + * @val: shifted value to store + * @msk: mask for value to store + * + * Atomically store @val to @p, masked by @msk.
+ */ +static void store_atom_insert_al4(uint32_t *p, uint32_t val, uint32_t msk) +{ + uint32_t old, new; + + p = __builtin_assume_aligned(p, 4); + old = qatomic_read(p); + do { + new = (old & ~msk) | val; + } while (!__atomic_compare_exchange_n(p, &old, new, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED)); +} + +/** + * store_atom_insert_al8: + * @p: host address + * @val: shifted value to store + * @msk: mask for value to store + * + * Atomically store @val to @p masked by @msk. + */ +static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk) +{ + uint64_t old, new; + + qemu_build_assert(HAVE_al8); + p = __builtin_assume_aligned(p, 8); + old = qatomic_read__nocheck(p); + do { + new = (old & ~msk) | val; + } while (!__atomic_compare_exchange_n(p, &old, new, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED)); +} + +/** + * store_atom_insert_al16: + * @p: host address + * @val: shifted value to store + * @msk: mask for value to store + * + * Atomically store @val to @p masked by @msk. + */ +static void store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) +{ +#if defined(CONFIG_ATOMIC128) + __uint128_t *pu, old, new; + + /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */ + pu = __builtin_assume_aligned(ps, 16); + old = *pu; + do { + new = (old & ~msk.u) | val.u; + } while (!__atomic_compare_exchange_n(pu, &old, new, true, + __ATOMIC_RELAXED, __ATOMIC_RELAXED)); +#elif defined(CONFIG_CMPXCHG128) + __uint128_t *pu, old, new; + + /* + * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always + * defer to libatomic, so we must use __sync_val_compare_and_swap_16 + * and accept the sequential consistency that comes with it. 
+ */ + pu = __builtin_assume_aligned(ps, 16); + do { + old = *pu; + new = (old & ~msk.u) | val.u; + } while (!__sync_bool_compare_and_swap_16(pu, old, new)); +#else + qemu_build_not_reached(); +#endif +} + +/** + * store_bytes_leN: + * @pv: host address + * @size: number of bytes to store + * @val_le: data to store + * + * Store @size bytes at @p. The bytes to store are extracted in little-endian order + * from @val_le; return the bytes of @val_le beyond @size that have not been stored. + */ +static uint64_t store_bytes_leN(void *pv, int size, uint64_t val_le) +{ + uint8_t *p = pv; + for (int i = 0; i < size; i++, val_le >>= 8) { + p[i] = val_le; + } + return val_le; +} + +/** + * store_parts_leN + * @pv: host address + * @size: number of bytes to store + * @val_le: data to store + * + * As store_bytes_leN, but atomically on each aligned part. + */ +G_GNUC_UNUSED +static uint64_t store_parts_leN(void *pv, int size, uint64_t val_le) +{ + do { + int n; + + /* Find minimum of alignment and size */ + switch (((uintptr_t)pv | size) & 7) { + case 4: + store_atomic4(pv, le32_to_cpu(val_le)); + val_le >>= 32; + n = 4; + break; + case 2: + case 6: + store_atomic2(pv, le16_to_cpu(val_le)); + val_le >>= 16; + n = 2; + break; + default: + *(uint8_t *)pv = val_le; + val_le >>= 8; + n = 1; + break; + case 0: + g_assert_not_reached(); + } + pv += n; + size -= n; + } while (size != 0); + + return val_le; +} + +/** + * store_whole_le4 + * @pv: host address + * @size: number of bytes to store + * @val_le: data to store + * + * As store_bytes_leN, but atomically as a whole. + * Four aligned bytes are guaranteed to cover the store. 
+ */
+static uint64_t store_whole_le4(void *pv, int size, uint64_t val_le)
+{
+    int sz = size * 8;
+    int o = (uintptr_t)pv & 3;
+    int sh = o * 8;
+    uint32_t m = MAKE_64BIT_MASK(0, sz);
+    uint32_t v;
+
+    if (HOST_BIG_ENDIAN) {
+        v = bswap32(val_le) >> sh;
+        m = bswap32(m) >> sh;
+    } else {
+        v = val_le << sh;
+        m <<= sh;
+    }
+    store_atom_insert_al4(pv - o, v, m);
+    return val_le >> sz;
+}
+
+/**
+ * store_whole_le8
+ * @pv: host address
+ * @size: number of bytes to store
+ * @val_le: data to store
+ *
+ * As store_bytes_leN, but atomically as a whole.
+ * Eight aligned bytes are guaranteed to cover the store.
+ */
+static uint64_t store_whole_le8(void *pv, int size, uint64_t val_le)
+{
+    int sz = size * 8;
+    int o = (uintptr_t)pv & 7;
+    int sh = o * 8;
+    uint64_t m = MAKE_64BIT_MASK(0, sz);
+    uint64_t v;
+
+    qemu_build_assert(HAVE_al8);
+    if (HOST_BIG_ENDIAN) {
+        v = bswap64(val_le) >> sh;
+        m = bswap64(m) >> sh;
+    } else {
+        v = val_le << sh;
+        m <<= sh;
+    }
+    store_atom_insert_al8(pv - o, v, m);
+    return val_le >> sz;
+}
+
+/**
+ * store_whole_le16
+ * @pv: host address
+ * @size: number of bytes to store
+ * @val_le: data to store
+ *
+ * As store_bytes_leN, but atomically as a whole.
+ * 16 aligned bytes are guaranteed to cover the store.
+ */
+static uint64_t store_whole_le16(void *pv, int size, Int128 val_le)
+{
+    int sz = size * 8;
+    int o = (uintptr_t)pv & 15;
+    int sh = o * 8;
+    Int128 m, v;
+
+    qemu_build_assert(HAVE_al16);
+
+    /* Like MAKE_64BIT_MASK(0, sz), but larger. */
+    if (sz <= 64) {
+        m = int128_make64(MAKE_64BIT_MASK(0, sz));
+    } else {
+        m = int128_make128(-1, MAKE_64BIT_MASK(0, sz - 64));
+    }
+
+    if (HOST_BIG_ENDIAN) {
+        v = int128_urshift(bswap128(val_le), sh);
+        m = int128_urshift(bswap128(m), sh);
+    } else {
+        v = int128_lshift(val_le, sh);
+        m = int128_lshift(m, sh);
+    }
+    store_atom_insert_al16(pv - o, v, m);
+
+    /* Unused if sz <= 64. */
+    return int128_gethi(val_le) >> (sz - 64);
+}
+
+/**
+ * store_atom_2:
+ * @pv: host address
+ * @val: the value to store
+ * @memop: the full memory op
+ *
+ * Store 2 bytes to @pv, honoring the atomicity of @memop.
+ */
+static void store_atom_2(CPUArchState *env, uintptr_t ra,
+                         void *pv, MemOp memop, uint16_t val)
+{
+    uintptr_t pi = (uintptr_t)pv;
+    int atmax;
+
+    if (likely((pi & 1) == 0)) {
+        store_atomic2(pv, val);
+        return;
+    }
+
+    atmax = required_atomicity(env, pi, memop);
+    if (atmax == MO_8) {
+        stw_he_p(pv, val);
+        return;
+    }
+
+    /*
+     * The only case remaining is MO_ATOM_WITHIN16.
+     * Big or little endian, we want the middle two bytes in each test.
+     */
+    if ((pi & 3) == 1) {
+        store_atom_insert_al4(pv - 1, (uint32_t)val << 8, MAKE_64BIT_MASK(8, 16));
+        return;
+    } else if ((pi & 7) == 3) {
+        if (HAVE_al8) {
+            store_atom_insert_al8(pv - 3, (uint64_t)val << 24, MAKE_64BIT_MASK(24, 16));
+            return;
+        }
+    } else if ((pi & 15) == 7) {
+        if (HAVE_al16) {
+            Int128 v = int128_lshift(int128_make64(val), 56);
+            Int128 m = int128_lshift(int128_make64(0xffff), 56);
+            store_atom_insert_al16(pv - 7, v, m);
+            return;
+        }
+    } else {
+        g_assert_not_reached();
+    }
+
+    cpu_loop_exit_atomic(env_cpu(env), ra);
+}
+
+/**
+ * store_atom_4:
+ * @pv: host address
+ * @val: the value to store
+ * @memop: the full memory op
+ *
+ * Store 4 bytes to @pv, honoring the atomicity of @memop.
+ */
+static void store_atom_4(CPUArchState *env, uintptr_t ra,
+                         void *pv, MemOp memop, uint32_t val)
+{
+    uintptr_t pi = (uintptr_t)pv;
+    int atmax;
+
+    if (likely((pi & 3) == 0)) {
+        store_atomic4(pv, val);
+        return;
+    }
+
+    atmax = required_atomicity(env, pi, memop);
+    switch (atmax) {
+    case MO_8:
+        stl_he_p(pv, val);
+        return;
+    case MO_16:
+        store_atom_4_by_2(pv, val);
+        return;
+    case -MO_16:
+        {
+            uint32_t val_le = cpu_to_le32(val);
+            int s2 = pi & 3;
+            int s1 = 4 - s2;
+
+            switch (s2) {
+            case 1:
+                val_le = store_whole_le4(pv, s1, val_le);
+                *(uint8_t *)(pv + 3) = val_le;
+                break;
+            case 3:
+                *(uint8_t *)pv = val_le;
+                store_whole_le4(pv + 1, s2, val_le >> 8);
+                break;
+            case 0: /* aligned */
+            case 2: /* atmax MO_16 */
+            default:
+                g_assert_not_reached();
+            }
+        }
+        return;
+    case MO_32:
+        if ((pi & 7) < 4) {
+            if (HAVE_al8) {
+                store_whole_le8(pv, 4, cpu_to_le32(val));
+                return;
+            }
+        } else {
+            if (HAVE_al16) {
+                store_whole_le16(pv, 4, int128_make64(cpu_to_le32(val)));
+                return;
+            }
+        }
+        cpu_loop_exit_atomic(env_cpu(env), ra);
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/**
+ * store_atom_8:
+ * @pv: host address
+ * @val: the value to store
+ * @memop: the full memory op
+ *
+ * Store 8 bytes to @pv, honoring the atomicity of @memop.
+ */
+static void store_atom_8(CPUArchState *env, uintptr_t ra,
+                         void *pv, MemOp memop, uint64_t val)
+{
+    uintptr_t pi = (uintptr_t)pv;
+    int atmax;
+
+    if (HAVE_al8 && likely((pi & 7) == 0)) {
+        store_atomic8(pv, val);
+        return;
+    }
+
+    atmax = required_atomicity(env, pi, memop);
+    switch (atmax) {
+    case MO_8:
+        stq_he_p(pv, val);
+        return;
+    case MO_16:
+        store_atom_8_by_2(pv, val);
+        return;
+    case MO_32:
+        store_atom_8_by_4(pv, val);
+        return;
+    case -MO_32:
+        if (HAVE_al8) {
+            uint64_t val_le = cpu_to_le64(val);
+            int s2 = pi & 7;
+            int s1 = 8 - s2;
+
+            switch (s2) {
+            case 1 ... 3:
+                val_le = store_whole_le8(pv, s1, val_le);
+                store_bytes_leN(pv + s1, s2, val_le);
+                break;
+            case 5 ... 7:
+                val_le = store_bytes_leN(pv, s1, val_le);
+                store_whole_le8(pv + s1, s2, val_le);
+                break;
+            case 0: /* aligned */
+            case 4: /* atmax MO_32 */
+            default:
+                g_assert_not_reached();
+            }
+            return;
+        }
+        break;
+    case MO_64:
+        if (HAVE_al16) {
+            store_whole_le16(pv, 8, int128_make64(cpu_to_le64(val)));
+            return;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    cpu_loop_exit_atomic(env_cpu(env), ra);
+}

From patchwork Thu Feb 16 02:57:18 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743246
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 09/30] tcg/tci: Use cpu_{ld,st}_mmu
Date: Wed, 15 Feb 2023 16:57:18 -1000
Message-Id: <20230216025739.1211680-10-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Unify the softmmu and the user-only paths by using the
official memory interface. Avoid double logging of memory
operations to plugins by relying on the ones within the
cpu_*_mmu functions.
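
A note for readers following the signed cases in the diff: the
cpu_ld*_mmu functions return fixed-width unsigned values, so the TCI
switch narrows the result with a cast such as (int8_t) before it is
returned as uint64_t. The block below is a minimal standalone sketch
of that idiom, not QEMU code. The names demo_load_u8, demo_ld_ub and
demo_ld_sb are hypothetical, and the sketch assumes the usual
two's-complement narrowing that GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned 8-bit load such as cpu_ldb_mmu. */
static uint8_t demo_load_u8(const uint8_t *mem, unsigned idx)
{
    return mem[idx];
}

/* Unsigned case: the uint8_t result is zero-extended to 64 bits. */
static uint64_t demo_ld_ub(const uint8_t *mem, unsigned idx)
{
    return demo_load_u8(mem, idx);
}

/*
 * Signed case: narrow to int8_t first (assumed two's-complement
 * narrowing), so that the implicit conversion to uint64_t on return
 * sign-extends the value.
 */
static uint64_t demo_ld_sb(const uint8_t *mem, unsigned idx)
{
    return (int8_t)demo_load_u8(mem, idx);
}
```

With a byte of 0x80 in memory, demo_ld_ub yields 0x80 while demo_ld_sb
yields 0xffffffffffffff80, which is the distinction the MO_UB and
MO_SB cases rely on.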
Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 tcg/tcg-op.c |   9 +++-
 tcg/tci.c    | 127 ++++++++-------------------------------------------
 2 files changed, 26 insertions(+), 110 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index c581ae77c4..da312dcf7e 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2916,7 +2916,12 @@ static void tcg_gen_req_mo(TCGBar type)

 static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
 {
-#ifdef CONFIG_PLUGIN
+    /*
+     * With TCI, we get memory tracing via cpu_{ld,st}_mmu.
+     * No need to instrument memory operations inline, and
+     * we don't want to log the same memory operation twice.
+     */
+#if defined(CONFIG_PLUGIN) && !defined(CONFIG_TCG_INTERPRETER)
     if (tcg_ctx->plugin_insn != NULL) {
         /* Save a copy of the vaddr for use after a load. */
         TCGv temp = tcg_temp_new();
@@ -2930,7 +2935,7 @@ static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
 static void plugin_gen_mem_callbacks(TCGv vaddr, MemOpIdx oi,
                                      enum qemu_plugin_mem_rw rw)
 {
-#ifdef CONFIG_PLUGIN
+#if defined(CONFIG_PLUGIN) && !defined(CONFIG_TCG_INTERPRETER)
     if (tcg_ctx->plugin_insn != NULL) {
         qemu_plugin_meminfo_t info = make_plugin_meminfo(oi, rw);
         plugin_gen_empty_mem_callback(vaddr, info);
diff --git a/tcg/tci.c b/tcg/tci.c
index fc67e7e767..170dcf1262 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -292,87 +292,34 @@ static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong taddr,
     MemOp mop = get_memop(oi);
     uintptr_t ra = (uintptr_t)tb_ptr;

-#ifdef CONFIG_SOFTMMU
     switch (mop & (MO_BSWAP | MO_SSIZE)) {
     case MO_UB:
-        return helper_ret_ldub_mmu(env, taddr, oi, ra);
+        return cpu_ldb_mmu(env, taddr, oi, ra);
     case MO_SB:
-        return helper_ret_ldsb_mmu(env, taddr, oi, ra);
+        return (int8_t)cpu_ldb_mmu(env, taddr, oi, ra);
     case MO_LEUW:
-        return helper_le_lduw_mmu(env, taddr, oi, ra);
+        return cpu_ldw_le_mmu(env, taddr, oi, ra);
     case MO_LESW:
-        return helper_le_ldsw_mmu(env, taddr, oi, ra);
+        return (int16_t)cpu_ldw_le_mmu(env, taddr, oi, ra);
     case MO_LEUL:
-        return helper_le_ldul_mmu(env, taddr, oi, ra);
+        return cpu_ldl_le_mmu(env, taddr, oi, ra);
     case MO_LESL:
-        return helper_le_ldsl_mmu(env, taddr, oi, ra);
+        return (int32_t)cpu_ldl_le_mmu(env, taddr, oi, ra);
     case MO_LEUQ:
-        return helper_le_ldq_mmu(env, taddr, oi, ra);
+        return cpu_ldq_le_mmu(env, taddr, oi, ra);
     case MO_BEUW:
-        return helper_be_lduw_mmu(env, taddr, oi, ra);
+        return cpu_ldw_be_mmu(env, taddr, oi, ra);
     case MO_BESW:
-        return helper_be_ldsw_mmu(env, taddr, oi, ra);
+        return (int16_t)cpu_ldw_be_mmu(env, taddr, oi, ra);
     case MO_BEUL:
-        return helper_be_ldul_mmu(env, taddr, oi, ra);
+        return cpu_ldl_be_mmu(env, taddr, oi, ra);
     case MO_BESL:
-        return helper_be_ldsl_mmu(env, taddr, oi, ra);
+        return (int32_t)cpu_ldl_be_mmu(env, taddr, oi, ra);
     case MO_BEUQ:
-        return helper_be_ldq_mmu(env, taddr, oi, ra);
+        return cpu_ldq_be_mmu(env, taddr, oi, ra);
     default:
         g_assert_not_reached();
     }
-#else
-    void *haddr = g2h(env_cpu(env), taddr);
-    unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
-    uint64_t ret;
-
-    set_helper_retaddr(ra);
-    if (taddr & a_mask) {
-        helper_unaligned_ld(env, taddr);
-    }
-    switch (mop & (MO_BSWAP | MO_SSIZE)) {
-    case MO_UB:
-        ret = ldub_p(haddr);
-        break;
-    case MO_SB:
-        ret = ldsb_p(haddr);
-        break;
-    case MO_LEUW:
-        ret = lduw_le_p(haddr);
-        break;
-    case MO_LESW:
-        ret = ldsw_le_p(haddr);
-        break;
-    case MO_LEUL:
-        ret = (uint32_t)ldl_le_p(haddr);
-        break;
-    case MO_LESL:
-        ret = (int32_t)ldl_le_p(haddr);
-        break;
-    case MO_LEUQ:
-        ret = ldq_le_p(haddr);
-        break;
-    case MO_BEUW:
-        ret = lduw_be_p(haddr);
-        break;
-    case MO_BESW:
-        ret = ldsw_be_p(haddr);
-        break;
-    case MO_BEUL:
-        ret = (uint32_t)ldl_be_p(haddr);
-        break;
-    case MO_BESL:
-        ret = (int32_t)ldl_be_p(haddr);
-        break;
-    case MO_BEUQ:
-        ret = ldq_be_p(haddr);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    clear_helper_retaddr();
-    return ret;
-#endif
 }

 static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
@@ -381,67 +328,31 @@ static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
     MemOp mop = get_memop(oi);
     uintptr_t ra = (uintptr_t)tb_ptr;

-#ifdef CONFIG_SOFTMMU
     switch (mop & (MO_BSWAP | MO_SIZE)) {
     case MO_UB:
-        helper_ret_stb_mmu(env, taddr, val, oi, ra);
+        cpu_stb_mmu(env, taddr, val, oi, ra);
         break;
     case MO_LEUW:
-        helper_le_stw_mmu(env, taddr, val, oi, ra);
+        cpu_stw_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_LEUL:
-        helper_le_stl_mmu(env, taddr, val, oi, ra);
+        cpu_stl_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_LEUQ:
-        helper_le_stq_mmu(env, taddr, val, oi, ra);
+        cpu_stq_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUW:
-        helper_be_stw_mmu(env, taddr, val, oi, ra);
+        cpu_stw_be_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUL:
-        helper_be_stl_mmu(env, taddr, val, oi, ra);
+        cpu_stl_be_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUQ:
-        helper_be_stq_mmu(env, taddr, val, oi, ra);
+        cpu_stq_be_mmu(env, taddr, val, oi, ra);
         break;
     default:
         g_assert_not_reached();
     }
-#else
-    void *haddr = g2h(env_cpu(env), taddr);
-    unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
-
-    set_helper_retaddr(ra);
-    if (taddr & a_mask) {
-        helper_unaligned_st(env, taddr);
-    }
-    switch (mop & (MO_BSWAP | MO_SIZE)) {
-    case MO_UB:
-        stb_p(haddr, val);
-        break;
-    case MO_LEUW:
-        stw_le_p(haddr, val);
-        break;
-    case MO_LEUL:
-        stl_le_p(haddr, val);
-        break;
-    case MO_LEUQ:
-        stq_le_p(haddr, val);
-        break;
-    case MO_BEUW:
-        stw_be_p(haddr, val);
-        break;
-    case MO_BEUL:
-        stl_be_p(haddr, val);
-        break;
-    case MO_BEUQ:
-        stq_be_p(haddr, val);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    clear_helper_retaddr();
-#endif
 }

 #if TCG_TARGET_REG_BITS == 64

From patchwork Thu Feb 16 02:57:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743244
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 10/30] tcg: Unify helper_{be,le}_{ld,st}*
Date: Wed, 15 Feb 2023 16:57:19 -1000
Message-Id: <20230216025739.1211680-11-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
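
To illustrate why the paired entry points are redundant: once the byte
order travels inside the memory-op descriptor, one helper per access
size is enough, and only the size is left to assert. The block below
is a minimal sketch under simplifying assumptions, not the cputlb.c
code. The DEMO_* flags and demo_* functions are hypothetical, and the
sketch uses an absolute big-endian bit, whereas QEMU's MO_BSWAP is
relative to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for MemOp bits. */
enum {
    DEMO_MO_16   = 1,        /* log2 of the access size in bytes */
    DEMO_MO_SIZE = 3,        /* mask for the size field */
    DEMO_MO_BE   = 1 << 3,   /* set: big-endian; clear: little-endian */
};

/* Shared worker: byte order is decided from the memop alone. */
static uint16_t demo_do_ld2(const uint8_t *p, unsigned memop)
{
    if (memop & DEMO_MO_BE) {
        return (uint16_t)((p[0] << 8) | p[1]);
    }
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Single entry point standing in for a former _le/_be pair.
 * Only the size is checked; the endianness comes from the memop.
 */
static uint16_t demo_helper_lduw(const uint8_t *p, unsigned memop)
{
    assert((memop & DEMO_MO_SIZE) == DEMO_MO_16);
    return demo_do_ld2(p, memop);
}
```

Calling demo_helper_lduw on the bytes 0x12 0x34 gives 0x3412 with
DEMO_MO_16 alone and 0x1234 with DEMO_MO_16 | DEMO_MO_BE, so the
caller selects the byte order without needing a second function.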
Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 docs/devel/loads-stores.rst      |  36 ++----
 include/tcg/tcg-ldst.h           |  60 ++++------
 accel/tcg/cputlb.c               | 190 ++++++++++---------------------
 tcg/aarch64/tcg-target.c.inc     |  39 +++----
 tcg/arm/tcg-target.c.inc         |  45 +++-----
 tcg/i386/tcg-target.c.inc        |  40 +++----
 tcg/loongarch64/tcg-target.c.inc |  25 ++--
 tcg/mips/tcg-target.c.inc        |  40 +++----
 tcg/ppc/tcg-target.c.inc         |  30 ++---
 tcg/riscv/tcg-target.c.inc       |  51 +++------
 tcg/s390x/tcg-target.c.inc       |  38 +++----
 tcg/sparc64/tcg-target.c.inc     |  37 +++---
 12 files changed, 226 insertions(+), 405 deletions(-)

diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
index ad5dfe133e..d2cefc77a2 100644
--- a/docs/devel/loads-stores.rst
+++ b/docs/devel/loads-stores.rst
@@ -297,31 +297,20 @@ swap: ``translator_ld{sign}{size}_swap(env, ptr, swap)``
 Regexes for git grep
  - ``\``

-``helper_*_{ld,st}*_mmu``
+``helper_{ld,st}*_mmu``
 ~~~~~~~~~~~~~~~~~~~~~~~~~

 These functions are intended primarily to be called by the code
-generated by the TCG backend. They may also be called by target
-CPU helper function code. Like the ``cpu_{ld,st}_mmuidx_ra`` functions
-they perform accesses by guest virtual address, with a given ``mmuidx``.
+generated by the TCG backend. Like the ``cpu_{ld,st}_mmu`` functions
+they perform accesses by guest virtual address, with a given ``MemOpIdx``.

-These functions specify an ``opindex`` parameter which encodes
-(among other things) the mmu index to use for the access. This parameter
-should be created by calling ``make_memop_idx()``.
+They differ from ``cpu_{ld,st}_mmu`` in that they take the endianness
+of the operation only from the MemOpIdx, and loads extend the return
+value to the size of a host general register (``tcg_target_ulong``).

-The ``retaddr`` parameter should be the result of GETPC() called directly
-from the top level HELPER(foo) function (or 0 if no guest CPU state
-unwinding is required).
+load: ``helper_ld{sign}{size}_mmu(env, addr, opindex, retaddr)``

-**TODO** The names of these functions are a bit odd for historical
-reasons because they were originally expected to be called only from
-within generated code. We should rename them to bring them more in
-line with the other memory access functions. The explicit endianness
-is the only feature they have beyond ``*_mmuidx_ra``.
-
-load: ``helper_{endian}_ld{sign}{size}_mmu(env, addr, opindex, retaddr)``
-
-store: ``helper_{endian}_st{size}_mmu(env, addr, val, opindex, retaddr)``
+store: ``helper_st{size}_mmu(env, addr, val, opindex, retaddr)``

 ``sign``
  - (empty) : for 32 or 64 bit sizes
@@ -334,14 +323,9 @@ store: ``helper_{endian}_st{size}_mmu(env, addr, val, opindex, retaddr)``
  - ``l`` : 32 bits
  - ``q`` : 64 bits

-``endian``
- - ``le`` : little endian
- - ``be`` : big endian
- - ``ret`` : target endianness
-
 Regexes for git grep
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``

 ``address_space_*``
 ~~~~~~~~~~~~~~~~~~~
diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 2ba22bd5fe..56fa7afe5e 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -28,47 +28,31 @@
 #ifdef CONFIG_SOFTMMU

 /* Value zero-extended to tcg register size.
*/ -tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); /* Value sign-extended to tcg register size. 
*/ -tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); -void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t retaddr); +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t retaddr); +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t retaddr); +void 
helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t retaddr); #else diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 6c93558a1c..e2e764f4da 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1978,25 +1978,6 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr, cpu_loop_exit_atomic(env_cpu(env), retaddr); } -/* - * Verify that we have passed the correct MemOp to the correct function. - * - * In the case of the helper_*_mmu functions, we will have done this by - * using the MemOp to look up the helper during code generation. - * - * In the case of the cpu_*_mmu functions, this is up to the caller. - * We could present one function to target code, and dispatch based on - * the MemOp, but so far we have worked hard to avoid an indirect function - * call along the memory path. - */ -static void validate_memop(MemOpIdx oi, MemOp expected) -{ -#ifdef CONFIG_DEBUG_TCG - MemOp have = get_memop(oi) & (MO_SIZE | MO_BSWAP); - assert(have == expected); -#endif -} - /* * Load Helpers * @@ -2264,10 +2245,10 @@ static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); } -tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_8); return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2295,17 +2276,10 @@ static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUW); - return do_ld2_mmu(env, addr, oi, retaddr, 
MMU_DATA_LOAD); -} - -tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_16); return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2329,17 +2303,10 @@ static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUL); - return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); -} - -tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_32); return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2363,17 +2330,10 @@ static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUQ); - return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); -} - -uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_64); return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2382,35 +2342,22 @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, * avoid this for 64-bit data, or for 32-bit data on 32-bit host. 
*/ - -tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int8_t)helper_ret_ldub_mmu(env, addr, oi, retaddr); + return (int8_t)helper_ldub_mmu(env, addr, oi, retaddr); } -tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int16_t)helper_le_lduw_mmu(env, addr, oi, retaddr); + return (int16_t)helper_lduw_mmu(env, addr, oi, retaddr); } -tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int16_t)helper_be_lduw_mmu(env, addr, oi, retaddr); -} - -tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return (int32_t)helper_le_ldul_mmu(env, addr, oi, retaddr); -} - -tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return (int32_t)helper_be_ldul_mmu(env, addr, oi, retaddr); + return (int32_t)helper_ldul_mmu(env, addr, oi, retaddr); } /* @@ -2426,7 +2373,7 @@ uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { uint8_t ret; - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_UB); ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2437,7 +2384,7 @@ uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, { uint16_t ret; - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUW); ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2448,7 +2395,7 @@ 
uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, { uint32_t ret; - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUL); ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2459,7 +2406,7 @@ uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, { uint64_t ret; - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUQ); ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2470,7 +2417,7 @@ uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, { uint16_t ret; - validate_memop(oi, MO_LEUW); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUW); ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2481,7 +2428,7 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, { uint32_t ret; - validate_memop(oi, MO_LEUL); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUL); ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2492,7 +2439,7 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, { uint64_t ret; - validate_memop(oi, MO_LEUQ); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUQ); ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2520,8 +2467,8 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - h = helper_be_ldq_mmu(env, addr, new_oi, ra); - l = helper_be_ldq_mmu(env, addr + 8, new_oi, ra); + h = helper_ldq_mmu(env, addr, new_oi, ra); + l = helper_ldq_mmu(env, addr + 8, new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return int128_make128(l, h); @@ -2549,8 +2496,8 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, mop = (mop & 
~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - l = helper_le_ldq_mmu(env, addr, new_oi, ra); - h = helper_le_ldq_mmu(env, addr + 8, new_oi, ra); + l = helper_ldq_mmu(env, addr, new_oi, ra); + h = helper_ldq_mmu(env, addr + 8, new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return int128_make128(l, h); @@ -2694,13 +2641,13 @@ static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val, } } -void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t ra) +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) { MMULookupLocals l; bool crosspage; - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_8); crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); tcg_debug_assert(!crosspage); @@ -2729,17 +2676,10 @@ static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val, do_st_1(env, &l.page[1], b, l.mmu_idx, ra); } -void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUW); - do_st2_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_16); do_st2_mmu(env, addr, val, oi, retaddr); } @@ -2763,17 +2703,10 @@ static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } -void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUL); - 
do_st4_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_32); do_st4_mmu(env, addr, val, oi, retaddr); } @@ -2797,17 +2730,10 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } -void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUQ); - do_st8_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_64); do_st8_mmu(env, addr, val, oi, retaddr); } @@ -2823,49 +2749,55 @@ static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_ret_stb_mmu(env, addr, val, oi, retaddr); + helper_stb_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stw_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUW); + do_st2_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stl_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUL); + do_st4_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, 
uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stq_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stw_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUW); + do_st2_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stl_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUL); + do_st4_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stq_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } @@ -2890,8 +2822,8 @@ void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - helper_be_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); - helper_be_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); + helper_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); + helper_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -2917,8 +2849,8 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - helper_le_stq_mmu(env, addr, int128_getlo(val), new_oi, ra); - helper_le_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); + helper_stq_mmu(env, 
addr, int128_getlo(val), new_oi, ra); + helper_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index a091326f84..05123cce35 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -1538,37 +1538,26 @@ static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target) } #ifdef CONFIG_SOFTMMU -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * MemOpIdx oi, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * MemOpIdx oi, uintptr_t ra) */ static void * const qemu_ld_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_ldub_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_lduw_mmu, - [MO_32] = helper_be_ldul_mmu, - [MO_64] = helper_be_ldq_mmu, -#else - [MO_16] = helper_le_lduw_mmu, - [MO_32] = helper_le_ldul_mmu, - [MO_64] = helper_le_ldq_mmu, -#endif + [MO_8] = helper_ldub_mmu, + [MO_16] = helper_lduw_mmu, + [MO_32] = helper_ldul_mmu, + [MO_64] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, MemOpIdx oi, - * uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, MemOpIdx oi, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_stb_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_stw_mmu, - [MO_32] = helper_be_stl_mmu, - [MO_64] = helper_be_stq_mmu, -#else - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, -#endif + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc index d06ac60c15..d0f97d2b14 100644 --- a/tcg/arm/tcg-target.c.inc +++ 
b/tcg/arm/tcg-target.c.inc @@ -1302,41 +1302,26 @@ static void tcg_out_vldst(TCGContext *s, ARMInsn insn, } #ifdef CONFIG_SOFTMMU -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[MO_SSIZE + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, -#if HOST_BIG_ENDIAN - [MO_UW] = helper_be_lduw_mmu, - [MO_UL] = helper_be_ldul_mmu, - [MO_UQ] = helper_be_ldq_mmu, - [MO_SW] = helper_be_ldsw_mmu, - [MO_SL] = helper_be_ldul_mmu, -#else - [MO_UW] = helper_le_lduw_mmu, - [MO_UL] = helper_le_ldul_mmu, - [MO_UQ] = helper_le_ldq_mmu, - [MO_SW] = helper_le_ldsw_mmu, - [MO_SL] = helper_le_ldul_mmu, -#endif +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, int mmu_idx, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_stb_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_stw_mmu, - [MO_32] = helper_be_stl_mmu, - [MO_64] = helper_be_stq_mmu, -#else - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, -#endif + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* Helper routines for marshalling helper function arguments into diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 028ece62a0..29dba3fa1c 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1728,30 +1728,26 @@ static void tcg_out_nopn(TCGContext *s, int n) } #if defined(CONFIG_SOFTMMU) -/* helper 
signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, int mmu_idx, uintptr_t ra) */ -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* Perform the TLB load and compare. @@ -1926,7 +1922,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) (uintptr_t)l->raddr); } - tcg_out_branch(s, 1, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_branch(s, 1, qemu_ld_helpers[opc & MO_SIZE]); data_reg = l->datalo_reg; switch (opc & MO_SSIZE) { @@ -2033,7 +2029,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) /* "Tail call" to the helper, with the return address back inline. 
*/ tcg_out_push(s, retaddr); - tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_jmp(s, qemu_st_helpers[opc & MO_SIZE]); return true; } #else diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc index c5f55afd68..5318f83a8a 100644 --- a/tcg/loongarch64/tcg-target.c.inc +++ b/tcg/loongarch64/tcg-target.c.inc @@ -774,26 +774,25 @@ static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, #if defined(CONFIG_SOFTMMU) /* - * helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * MemOpIdx oi, uintptr_t ra) + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * MemOpIdx oi, uintptr_t ra) */ static void * const qemu_ld_helpers[4] = { - [MO_8] = helper_ret_ldub_mmu, - [MO_16] = helper_le_lduw_mmu, - [MO_32] = helper_le_ldul_mmu, - [MO_64] = helper_le_ldq_mmu, + [MO_8] = helper_ldub_mmu, + [MO_16] = helper_lduw_mmu, + [MO_32] = helper_ldul_mmu, + [MO_64] = helper_ldq_mmu, }; /* - * helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, MemOpIdx oi, - * uintptr_t ra) + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, MemOpIdx oi, uintptr_t ra) */ static void * const qemu_st_helpers[4] = { - [MO_8] = helper_ret_stb_mmu, - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* We expect to use a 12-bit negative offset from ENV. 
*/ diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc index 80748d892e..56c4b2377b 100644 --- a/tcg/mips/tcg-target.c.inc +++ b/tcg/mips/tcg-target.c.inc @@ -1037,31 +1037,21 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg, } #if defined(CONFIG_SOFTMMU) -static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_LESL] = helper_le_ldsl_mmu, - [MO_BESL] = helper_be_ldsl_mmu, -#endif +static void * const qemu_ld_helpers[MO_SSIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* Helper routines for marshalling helper function arguments into @@ -1267,7 +1257,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) } i = tcg_out_call_iarg_imm(s, i, oi); i = tcg_out_call_iarg_imm(s, i, (intptr_t)l->raddr); - tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false); + tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false); /* delay slot */ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], 
TCG_AREG0); @@ -1345,7 +1335,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) computation to take place in the return address register. */ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (intptr_t)l->raddr); i = tcg_out_call_iarg_reg(s, i, TCG_REG_RA); - tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], true); + tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], true); /* delay slot */ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); return true; diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index afadf9a1e3..627256e46e 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -1955,27 +1955,21 @@ static const uint32_t qemu_exts_opc[4] = { /* helper signature: helper_ld_mmu(CPUState *env, target_ulong addr, * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; /* helper signature: helper_st_mmu(CPUState *env, target_ulong addr, * uintxx_t val, int mmu_idx, uintptr_t ra) */ -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* We expect to use a 16-bit negative offset from ENV. 
*/ @@ -2137,7 +2131,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) tcg_out_movi(s, TCG_TYPE_I32, arg++, oi); tcg_out32(s, MFSPR | RT(arg) | LR); - tcg_out_call_int(s, LK, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, LK, qemu_ld_helpers[opc & MO_SIZE]); lo = lb->datalo_reg; hi = lb->datahi_reg; @@ -2206,7 +2200,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) tcg_out_movi(s, TCG_TYPE_I32, arg++, oi); tcg_out32(s, MFSPR | RT(arg) | LR); - tcg_out_call_int(s, LK, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, LK, qemu_st_helpers[opc & MO_SIZE]); tcg_out_b(s, 0, lb->raddr); return true; diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc index 558de127ef..ad966112d4 100644 --- a/tcg/riscv/tcg-target.c.inc +++ b/tcg/riscv/tcg-target.c.inc @@ -878,46 +878,29 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0) */ #if defined(CONFIG_SOFTMMU) -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * MemOpIdx oi, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * MemOpIdx oi, uintptr_t ra) */ static void * const qemu_ld_helpers[MO_SSIZE + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, -#if HOST_BIG_ENDIAN - [MO_UW] = helper_be_lduw_mmu, - [MO_SW] = helper_be_ldsw_mmu, - [MO_UL] = helper_be_ldul_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_SL] = helper_be_ldsl_mmu, -#endif - [MO_UQ] = helper_be_ldq_mmu, -#else - [MO_UW] = helper_le_lduw_mmu, - [MO_SW] = helper_le_ldsw_mmu, - [MO_UL] = helper_le_ldul_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_SL] = helper_le_ldsl_mmu, -#endif - [MO_UQ] = helper_le_ldq_mmu, -#endif + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, 
target_ulong addr, - * uintxx_t val, MemOpIdx oi, - * uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, MemOpIdx oi, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_stb_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_stw_mmu, - [MO_32] = helper_be_stl_mmu, - [MO_64] = helper_be_stq_mmu, -#else - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, -#endif + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* We don't support oversize guests */ diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index 844532156b..7a94ff943b 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -450,29 +450,21 @@ static const uint8_t tcg_cond_to_ltr_cond[] = { }; #ifdef CONFIG_SOFTMMU -static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LESL] = helper_le_ldsl_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BESL] = helper_be_ldsl_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SSIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { 
+ [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; #endif @@ -1781,7 +1773,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) } tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, oi); tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R5, (uintptr_t)lb->raddr); - tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]); + tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE]); tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2); tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr); @@ -1822,7 +1814,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) } tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R5, oi); tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R6, (uintptr_t)lb->raddr); - tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE]); tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr); return true; diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc index ccc4144f7c..f731e161ec 100644 --- a/tcg/sparc64/tcg-target.c.inc +++ b/tcg/sparc64/tcg-target.c.inc @@ -868,8 +868,8 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0) } #ifdef CONFIG_SOFTMMU -static const tcg_insn_unit *qemu_ld_trampoline[(MO_SSIZE | MO_BSWAP) + 1]; -static const tcg_insn_unit *qemu_st_trampoline[(MO_SIZE | MO_BSWAP) + 1]; +static const tcg_insn_unit *qemu_ld_trampoline[MO_SSIZE + 1]; +static const tcg_insn_unit *qemu_st_trampoline[MO_SIZE + 1]; static void emit_extend(TCGContext *s, TCGReg r, int op) { @@ -895,25 +895,18 @@ static void emit_extend(TCGContext *s, TCGReg r, int op) static void build_trampolines(TCGContext *s) { static void * const qemu_ld_helpers[] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = 
helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; static void * const qemu_st_helpers[] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; int i; @@ -1182,9 +1175,9 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data, TCGReg addr, /* We use the helpers to extend SB and SW data, leaving the case of SL needing explicit extending below. */ if ((memop & MO_SSIZE) == MO_SL) { - func = qemu_ld_trampoline[memop & (MO_BSWAP | MO_SIZE)]; + func = qemu_ld_trampoline[MO_UL]; } else { - func = qemu_ld_trampoline[memop & (MO_BSWAP | MO_SSIZE)]; + func = qemu_ld_trampoline[memop & MO_SSIZE]; } tcg_debug_assert(func != NULL); tcg_out_call_nodelay(s, func, false); @@ -1324,7 +1317,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data, TCGReg addr, tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_O1, addrz); tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_O2, data); - func = qemu_st_trampoline[memop & (MO_BSWAP | MO_SIZE)]; + func = qemu_st_trampoline[memop & MO_SIZE]; tcg_debug_assert(func != NULL); tcg_out_call_nodelay(s, func, false); /* delay slot */

From patchwork Thu Feb 16 02:57:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743259
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 11/30] accel/tcg: Implement helper_{ld, st}*_mmu for user-only
Date: Wed, 15 Feb 2023 16:57:20 -1000
Message-Id: <20230216025739.1211680-12-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

TCG backends may need to defer to a helper to implement the atomicity required by a given operation. Mirror the interface used in system mode.

Signed-off-by: Richard Henderson
---
include/tcg/tcg-ldst.h | 6 +- accel/tcg/user-exec.c | 392 ++++++++++++++++++++++++++++------------- 2 files changed, 276 insertions(+), 122 deletions(-)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h index 56fa7afe5e..c1d945fd66 100644 --- a/include/tcg/tcg-ldst.h +++ b/include/tcg/tcg-ldst.h @@ -25,8 +25,6 @@ #ifndef TCG_LDST_H #define TCG_LDST_H -#ifdef CONFIG_SOFTMMU - /* Value zero-extended to tcg register size.
 */ tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr); @@ -54,10 +52,10 @@ void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr); -#else +#ifdef CONFIG_USER_ONLY G_NORETURN void helper_unaligned_ld(CPUArchState *env, target_ulong addr); G_NORETURN void helper_unaligned_st(CPUArchState *env, target_ulong addr); -#endif /* CONFIG_SOFTMMU */ +#endif /* CONFIG_USER_ONLY */ #endif /* TCG_LDST_H */ diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index a4acf705f4..2a4b9e2e63 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -891,21 +891,6 @@ void page_reset_target_data(target_ulong start, target_ulong end) { } /* The softmmu versions of these helpers are in cputlb.c. */ -/* - * Verify that we have passed the correct MemOp to the correct function. - * - * We could present one function to target code, and dispatch based on - * the MemOp, but so far we have worked hard to avoid an indirect function - * call along the memory path. 
- */ -static void validate_memop(MemOpIdx oi, MemOp expected) -{ -#ifdef CONFIG_DEBUG_TCG - MemOp have = get_memop(oi) & (MO_SIZE | MO_BSWAP); - assert(have == expected); -#endif -} - void helper_unaligned_ld(CPUArchState *env, target_ulong addr) { cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_LOAD, GETPC()); @@ -916,10 +901,9 @@ void helper_unaligned_st(CPUArchState *env, target_ulong addr) cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_STORE, GETPC()); } -static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t ra, MMUAccessType type) +static void *cpu_mmu_lookup(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra, MMUAccessType type) { - MemOp mop = get_memop(oi); int a_bits = get_alignment_bits(mop); void *ret; @@ -935,100 +919,206 @@ static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, #include "ldst_atomicity.c.inc" -uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) +static uint8_t do_ld1_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) { void *haddr; uint8_t ret; - validate_memop(oi, MO_UB); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); + tcg_debug_assert((mop & MO_SIZE) == MO_8); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); ret = ldub_p(haddr); clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + return do_ld1_mmu(env, addr, get_memop(oi), ra); +} + +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + return (int8_t)do_ld1_mmu(env, addr, get_memop(oi), ra); +} + +uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + uint8_t ret = do_ld1_mmu(env, addr, get_memop(oi), ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return ret; } +static uint16_t do_ld2_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ 
+ void *haddr; + uint16_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_16); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_2(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint16_t ret = do_ld2_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap16(ret); + } + return ret; +} + +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + int16_t ret = do_ld2_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap16(ret); + } + return ret; +} + uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint16_t ret; - validate_memop(oi, MO_BEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_2(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld2_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_be16(ret); } -uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - uint32_t ret; - - validate_memop(oi, MO_BEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_4(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return cpu_to_be32(ret); -} - -uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - uint64_t ret; - - validate_memop(oi, MO_BEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_8(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - 
return cpu_to_be64(ret); -} - uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint16_t ret; - validate_memop(oi, MO_LEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_2(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld2_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le16(ret); } +static uint32_t do_ld4_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ + void *haddr; + uint32_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_32); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_4(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint32_t ret = do_ld4_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap32(ret); + } + return ret; +} + +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + int32_t ret = do_ld4_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap32(ret); + } + return ret; +} + +uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint32_t ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld4_he_mmu(env, addr, mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return cpu_to_be32(ret); +} + uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint32_t ret; - validate_memop(oi, MO_LEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_4(env, ra, haddr, get_memop(oi)); - 
clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld4_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le32(ret); } +static uint64_t do_ld8_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ + void *haddr; + uint64_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_64); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_8(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint64_t ret = do_ld8_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap64(ret); + } + return ret; +} + +uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint64_t ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld8_he_mmu(env, addr, mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return cpu_to_be64(ret); +} + uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint64_t ret; - validate_memop(oi, MO_LEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_8(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld8_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le64(ret); } @@ -1039,7 +1129,7 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, void *haddr; Int128 ret; - validate_memop(oi, MO_128 | MO_BE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); memcpy(&ret, haddr, 16); clear_helper_retaddr(); @@ -1057,7 +1147,7 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, 
abi_ptr addr, void *haddr; Int128 ret; - validate_memop(oi, MO_128 | MO_LE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); memcpy(&ret, haddr, 16); clear_helper_retaddr(); @@ -1069,87 +1159,153 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, return ret; } -void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, - MemOpIdx oi, uintptr_t ra) +static void do_st1_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_UB); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_SIZE) == MO_8); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); stb_p(haddr, val); clear_helper_retaddr(); +} + +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) +{ + do_st1_mmu(env, addr, val, get_memop(oi), ra); +} + +void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) +{ + do_st1_mmu(env, addr, val, get_memop(oi), ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } +static void do_st2_he_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, + MemOp mop, uintptr_t ra) +{ + void *haddr; + + tcg_debug_assert((mop & MO_SIZE) == MO_16); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_2(env, ra, haddr, mop, val); + clear_helper_retaddr(); +} + +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap16(val); + } + do_st2_he_mmu(env, addr, val, mop, ra); +} + void cpu_stw_be_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - validate_memop(oi, MO_BEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_2(env, ra, haddr, get_memop(oi), 
be16_to_cpu(val)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); -} - -void cpu_stl_be_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - - validate_memop(oi, MO_BEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_4(env, ra, haddr, get_memop(oi), be32_to_cpu(val)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); -} - -void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - - validate_memop(oi, MO_BEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_8(env, ra, haddr, get_memop(oi), be64_to_cpu(val)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st2_he_mmu(env, addr, be16_to_cpu(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stw_le_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st2_he_mmu(env, addr, le16_to_cpu(val), mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); +} + +static void do_st4_he_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_LEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_2(env, ra, haddr, get_memop(oi), le16_to_cpu(val)); + tcg_debug_assert((mop & MO_SIZE) == MO_32); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_4(env, ra, haddr, mop, val); clear_helper_retaddr(); +} + +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap32(val); + } + do_st4_he_mmu(env, addr, val, mop, ra); +} + +void cpu_stl_be_mmu(CPUArchState *env, 
abi_ptr addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st4_he_mmu(env, addr, be32_to_cpu(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stl_le_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st4_he_mmu(env, addr, le32_to_cpu(val), mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); +} + +static void do_st8_he_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_LEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_4(env, ra, haddr, get_memop(oi), le32_to_cpu(val)); + tcg_debug_assert((mop & MO_SIZE) == MO_64); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_8(env, ra, haddr, mop, val); clear_helper_retaddr(); +} + +void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap64(val); + } + do_st8_he_mmu(env, addr, val, mop, ra); +} + +void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st8_he_mmu(env, addr, cpu_to_be64(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - validate_memop(oi, MO_LEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_8(env, ra, haddr, get_memop(oi), le64_to_cpu(val)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st8_he_mmu(env, addr, cpu_to_le64(val), mop, ra); 
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -1158,7 +1314,7 @@ void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, { void *haddr; - validate_memop(oi, MO_128 | MO_BE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); if (!HOST_BIG_ENDIAN) { val = bswap128(val); @@ -1173,7 +1329,7 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, { void *haddr; - validate_memop(oi, MO_128 | MO_LE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); if (HOST_BIG_ENDIAN) { val = bswap128(val); From patchwork Thu Feb 16 02:57:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743239 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=tG20XzmI; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKRP5CYZz23yD for ; Thu, 16 Feb 2023 13:59:29 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTH-0003Up-65; Wed, 15 Feb 2023 21:58:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) 
(Exim 4.90_1) (envelope-from ) id 1pSUTF-0003TV-6N for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:05 -0500 Received: from mail-pf1-x433.google.com ([2607:f8b0:4864:20::433]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTA-0005k0-AO for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:04 -0500 Received: by mail-pf1-x433.google.com with SMTP id r3so587373pfh.4 for ; Wed, 15 Feb 2023 18:57:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=WOPY5Dwrl2nWIi+WbdD7GF5IeQUqMimwzlA53ngq60M=; b=tG20XzmIbjawJuyKiyiVHgjk6JPKAGZTTUrlKqBNoqddsvLBWcz2ounE2O8oeZTEMi BQ9H/swVDiw+Mhmeej5MUCanoKPU9qkDmXCo0iH/X0b0B4EaOpmvnYC11b1zXyOjnHYx zEZTUHSlz2A5BnKw4FCEKCnOcxzVfOX9pbxFeG8hmP0gP1mTM/Sjcq8xvDGcf2Kjwehl 0zB2hC3kGBDVagDq5WAHYpc/1Vi3KTlKhUGTR5XlmmNVU5B6MxfBqKai5vJEflEWzan/ qd9RdRVULUpgYsXWI4LImP8QTEF1xtXCaeEWsq3WN8GEddQ6kd+DRJxHiLqE0IktarNP KSRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WOPY5Dwrl2nWIi+WbdD7GF5IeQUqMimwzlA53ngq60M=; b=cuiYW2Ql6/9PZMGBhYvCWsUJgEBNVny8ITghhN1hQhXYWOsiNDhO0MzoMZrczb8Quq totEnHuGgT2LxkjRSMNiHb3cnargzGQw/zpeEXLdDa58Gh/lytqIitgozjGqFTlKXUqt JLSfy5mT4P5ZDN8/PZd3EKcIflsHVqtrQA0VMyXojnRc11B5RiNDIe4rCSQGylLLN7DL vA6ZN1jgRiJZm4DTsvDRFNC1aDC1l0IH3Ph2sk3CmEcbO8PmJciN14Rn2HyPg+y3o+OT tOmHIK1QYSHX4j76NFIe3WpUNoR8X/+zNdqmvKFt4W6wVvf5C3T80KNlc8yGPXWKc8re FgFw== X-Gm-Message-State: AO0yUKXHfVeJ4LQjYAphLMOE75ZYt6TleOVdBLaIC62SXQ7KXlmfpiQD r7z/vQk9ekyWSKoB80A8TI+oFRXEIom0Q133gPg= X-Google-Smtp-Source: AK7set+YbFVVOk+N0mIOc8y9zcEYfAA9X0+Qefdlv2JjNQQ4fBWdx9hcvD8F8b1lY6hfDjTeqAlJ9w== X-Received: by 
2002:aa7:9a09:0:b0:59c:3fd7:45de with SMTP id w9-20020aa79a09000000b0059c3fd745demr3727286pfj.30.1676516278750; Wed, 15 Feb 2023 18:57:58 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.57.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:57:58 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 12/30] tcg: Add 128-bit guest memory primitives Date: Wed, 15 Feb 2023 16:57:21 -1000 Message-Id: <20230216025739.1211680-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::433; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 3 + include/tcg/tcg-ldst.h | 4 + accel/tcg/cputlb.c | 392 +++++++++++++++++++++++++-------- accel/tcg/user-exec.c | 94 ++++++-- tcg/tcg-op.c | 178 ++++++++++----- accel/tcg/ldst_atomicity.c.inc | 189 ++++++++++++++++ 6 files changed, 682 insertions(+), 178 deletions(-) diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index e141a6ab24..a7a2038901 
100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -39,6 +39,9 @@ DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env) DEF_HELPER_FLAGS_3(memset, TCG_CALL_NO_RWG, ptr, ptr, int, ptr) #endif /* IN_HELPER_PROTO */ +DEF_HELPER_FLAGS_3(ld_i128, TCG_CALL_NO_WG, i128, env, tl, i32) +DEF_HELPER_FLAGS_4(st_i128, TCG_CALL_NO_WG, void, env, tl, i128, i32) + DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG, i32, env, tl, i32, i32, i32) DEF_HELPER_FLAGS_5(atomic_cmpxchgw_be, TCG_CALL_NO_WG, diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h index c1d945fd66..3004e5292d 100644 --- a/include/tcg/tcg-ldst.h +++ b/include/tcg/tcg-ldst.h @@ -34,6 +34,8 @@ tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr); uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr); +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); /* Value sign-extended to tcg register size. 
 */ tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, @@ -51,6 +53,8 @@ void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr); void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr); +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr); #ifdef CONFIG_USER_ONLY diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index e2e764f4da..d7ca90e3b4 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -40,6 +40,7 @@ #include "qemu/plugin-memory.h" #endif #include "tcg/tcg-ldst.h" +#include "exec/helper-proto.h" /* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */ /* #define DEBUG_TLB */ @@ -2128,6 +2129,31 @@ static uint64_t do_ld_whole_be8(CPUArchState *env, uintptr_t ra, return (ret_be << (p->size * 8)) | x; } +/** + * do_ld_whole_be16 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * 16 aligned bytes are guaranteed to cover the load. + */ +static Int128 do_ld_whole_be16(CPUArchState *env, uintptr_t ra, + MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 15; + Int128 x, y = load_atomic16_or_exit(env, ra, p->haddr - o); + int size = p->size; + + if (!HOST_BIG_ENDIAN) { + y = bswap128(y); + } + y = int128_lshift(y, o * 8); + y = int128_urshift(y, (16 - size) * 8); + x = int128_make64(ret_be); + x = int128_lshift(x, size * 8); + return int128_or(x, y); +} + /* * Wrapper for the above. */ @@ -2172,6 +2198,59 @@ static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, } } +/* + * Wrapper for the above, for 8 < size < 16. 
+ */ +static Int128 do_ld16_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t a, int mmu_idx, MemOp mop, uintptr_t ra) +{ + int size = p->size; + uint64_t b; + MemOp atmax; + + if (unlikely(p->flags & TLB_MMIO)) { + p->size = size - 8; + a = do_ld_mmio_beN(env, p, a, mmu_idx, MMU_DATA_LOAD, ra); + p->addr += p->size; + p->size = 8; + b = do_ld_mmio_beN(env, p, 0, mmu_idx, MMU_DATA_LOAD, ra); + } else { + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the load as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax != MO_ATMAX_SIZE) { + atmax >>= MO_ATMAX_SHIFT; + if (unlikely(size >= (1 << atmax))) { + return do_ld_whole_be16(env, ra, p, a); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: + p->size = size - 8; + a = do_ld_bytes_beN(p, a); + b = ldq_be_p(p->haddr + size - 8); + break; + case MO_ATOM_SUBALIGN: + p->size = size - 8; + a = do_ld_parts_beN(p, a); + p->haddr += size - 8; + p->size = 8; + b = do_ld_parts_beN(p, 0); + break; + default: + g_assert_not_reached(); + } + } + + return int128_make128(b, a); +} + static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx, MMUAccessType type, uintptr_t ra) { @@ -2360,6 +2439,80 @@ tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, return (int32_t)helper_ldul_mmu(env, addr, oi, retaddr); } +static Int128 do_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + uint64_t a, b; + Int128 ret; + int first; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD, &l); + if (likely(!crosspage)) { + /* Perform the load host endian. 
*/ + if (unlikely(l.page[0].flags & TLB_MMIO)) { + QEMU_IOTHREAD_LOCK_GUARD(); + a = io_readx(env, l.page[0].full, l.mmu_idx, addr, + ra, MMU_DATA_LOAD, MO_64); + b = io_readx(env, l.page[0].full, l.mmu_idx, addr + 8, + ra, MMU_DATA_LOAD, MO_64); + ret = int128_make128(HOST_BIG_ENDIAN ? b : a, + HOST_BIG_ENDIAN ? a : b); + } else { + ret = load_atom_16(env, ra, l.page[0].haddr, l.memop); + } + if (l.memop & MO_BSWAP) { + ret = bswap128(ret); + } + return ret; + } + + first = l.page[0].size; + if (first == 8) { + MemOp mop8 = (l.memop & ~MO_SIZE) | MO_64; + + a = do_ld_8(env, &l.page[0], l.mmu_idx, MMU_DATA_LOAD, mop8, ra); + b = do_ld_8(env, &l.page[1], l.mmu_idx, MMU_DATA_LOAD, mop8, ra); + if ((mop8 & MO_BSWAP) == MO_LE) { + ret = int128_make128(a, b); + } else { + ret = int128_make128(b, a); + } + return ret; + } + + if (first < 8) { + a = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, + MMU_DATA_LOAD, l.memop, ra); + ret = do_ld16_beN(env, &l.page[1], a, l.mmu_idx, l.memop, ra); + } else { + ret = do_ld16_beN(env, &l.page[0], 0, l.mmu_idx, l.memop, ra); + b = int128_getlo(ret); + ret = int128_lshift(ret, l.page[1].size * 8); + a = int128_gethi(ret); + b = do_ld_beN(env, &l.page[1], b, l.mmu_idx, + MMU_DATA_LOAD, l.memop, ra); + ret = int128_make128(b, a); + } + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap128(ret); + } + return ret; +} + +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + uint32_t oi, uintptr_t retaddr) +{ + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_128); + return do_ld16_mmu(env, addr, oi, retaddr); +} + +Int128 helper_ld_i128(CPUArchState *env, target_ulong addr, uint32_t oi) +{ + return helper_ld16_mmu(env, addr, oi, GETPC()); +} + /* * Load helpers for cpu_ldst.h. 
*/ @@ -2448,59 +2601,23 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - uint64_t h, l; + Int128 ret; - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_BE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. */ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - h = helper_ldq_mmu(env, addr, new_oi, ra); - l = helper_ldq_mmu(env, addr + 8, new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return int128_make128(l, h); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_BE|MO_128)); + ret = do_ld16_mmu(env, addr, oi, ra); + plugin_load_cb(env, addr, oi); + return ret; } Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - uint64_t h, l; + Int128 ret; - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_LE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. 
*/ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - l = helper_ldq_mmu(env, addr, new_oi, ra); - h = helper_ldq_mmu(env, addr + 8, new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return int128_make128(l, h); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_LE|MO_128)); + ret = do_ld16_mmu(env, addr, oi, ra); + plugin_load_cb(env, addr, oi); + return ret; } /* @@ -2581,6 +2698,57 @@ static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, } } +/* + * Wrapper for the above, for 8 < size < 16. + */ +static uint64_t do_st16_leN(CPUArchState *env, MMULookupPageData *p, + Int128 val_le, int mmu_idx, + MemOp mop, uintptr_t ra) +{ + int size = p->size; + MemOp atmax; + + if (unlikely(p->flags & TLB_MMIO)) { + p->size = 8; + do_st_mmio_leN(env, p, int128_getlo(val_le), mmu_idx, ra); + p->size = size - 8; + p->addr += 8; + return do_st_mmio_leN(env, p, int128_gethi(val_le), mmu_idx, ra); + } else if (unlikely(p->flags & TLB_DISCARD_WRITE)) { + return int128_gethi(val_le) >> ((size - 8) * 8); + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the store as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. 
+ */ + atmax = mop & MO_ATMAX_MASK; + if (atmax != MO_ATMAX_SIZE) { + atmax >>= MO_ATMAX_SHIFT; + if (unlikely(size >= (1 << atmax))) { + if (HAVE_al16) { + return store_whole_le16(p->haddr, p->size, val_le); + } else { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: + stq_le_p(p->haddr, int128_getlo(val_le)); + return store_bytes_leN(p->haddr + 8, p->size - 8, int128_gethi(val_le)); + case MO_ATOM_SUBALIGN: + store_parts_leN(p->haddr, 8, int128_getlo(val_le)); + return store_parts_leN(p->haddr + 8, p->size - 8, int128_gethi(val_le)); + default: + g_assert_not_reached(); + } +} + static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, int mmu_idx, uintptr_t ra) { @@ -2737,6 +2905,80 @@ void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, do_st8_mmu(env, addr, val, oi, retaddr); } +static void do_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + uint64_t a, b; + int first; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + /* Swap to host endian if necessary, then store. 
*/ + if (l.memop & MO_BSWAP) { + val = bswap128(val); + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + QEMU_IOTHREAD_LOCK_GUARD(); + if (HOST_BIG_ENDIAN) { + b = int128_getlo(val), a = int128_gethi(val); + } else { + a = int128_getlo(val), b = int128_gethi(val); + } + io_writex(env, l.page[0].full, l.mmu_idx, a, addr, ra, MO_64); + io_writex(env, l.page[0].full, l.mmu_idx, b, addr + 8, ra, MO_64); + } else if (unlikely(l.page[0].flags & TLB_DISCARD_WRITE)) { + /* nothing */ + } else { + store_atom_16(env, ra, l.page[0].haddr, l.memop, val); + } + return; + } + + first = l.page[0].size; + if (first == 8) { + MemOp mop8 = (l.memop & ~(MO_SIZE | MO_BSWAP)) | MO_64; + + if (l.memop & MO_BSWAP) { + val = bswap128(val); + } + if (HOST_BIG_ENDIAN) { + b = int128_getlo(val), a = int128_gethi(val); + } else { + a = int128_getlo(val), b = int128_gethi(val); + } + do_st_8(env, &l.page[0], a, l.mmu_idx, mop8, ra); + do_st_8(env, &l.page[1], b, l.mmu_idx, mop8, ra); + return; + } + + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap128(val); + } + if (first < 8) { + do_st_leN(env, &l.page[0], int128_getlo(val), l.mmu_idx, l.memop, ra); + val = int128_urshift(val, first * 8); + do_st16_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); + } else { + b = do_st16_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + do_st_leN(env, &l.page[1], b, l.mmu_idx, l.memop, ra); + } +} + +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) +{ + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_128); + do_st16_mmu(env, addr, val, oi, retaddr); +} + +void helper_st_i128(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi) +{ + helper_st16_mmu(env, addr, val, oi, GETPC()); +} + /* * Store Helpers for cpu_ldst.h */ @@ -2801,58 +3043,20 @@ void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, plugin_store_cb(env, addr, oi); } -void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, - MemOpIdx oi, 
uintptr_t ra) +void cpu_st16_be_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_BE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. */ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - helper_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); - helper_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_BE|MO_128)); + do_st16_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } -void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, - MemOpIdx oi, uintptr_t ra) +void cpu_st16_le_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_LE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. 
*/ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - helper_stq_mmu(env, addr, int128_getlo(val), new_oi, ra); - helper_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_LE|MO_128)); + do_st16_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } #include "ldst_common.c.inc" diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index 2a4b9e2e63..b0f4fbace7 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -1123,18 +1123,45 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, return cpu_to_le64(ret); } -Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) +static Int128 do_ld16_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) { void *haddr; Int128 ret; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - memcpy(&ret, haddr, 16); + tcg_debug_assert((mop & MO_SIZE) == MO_128); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_16(env, ra, haddr, mop); clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return ret; +} +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + Int128 ret = do_ld16_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap128(ret); + } + return ret; +} + +Int128 helper_ld_i128(CPUArchState *env, target_ulong addr, MemOpIdx oi) +{ + return helper_ld16_mmu(env, addr, oi, GETPC()); +} + +Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + Int128 ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld16_he_mmu(env, addr, mop, ra); + 
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); if (!HOST_BIG_ENDIAN) { ret = bswap128(ret); } @@ -1144,15 +1171,12 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); Int128 ret; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - memcpy(&ret, haddr, 16); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld16_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - if (HOST_BIG_ENDIAN) { ret = bswap128(ret); } @@ -1309,33 +1333,57 @@ void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } -void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, - Int128 val, MemOpIdx oi, uintptr_t ra) +static void do_st16_he_mmu(CPUArchState *env, abi_ptr addr, Int128 val, + MemOp mop, uintptr_t ra) { void *haddr; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_SIZE) == MO_128); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_16(env, ra, haddr, mop, val); + clear_helper_retaddr(); +} + +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap128(val); + } + do_st16_he_mmu(env, addr, val, mop, ra); +} + +void helper_st_i128(CPUArchState *env, target_ulong addr, + Int128 val, MemOpIdx oi) +{ + helper_st16_mmu(env, addr, val, oi, GETPC()); +} + +void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, + Int128 val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); if 
(!HOST_BIG_ENDIAN) { val = bswap128(val); } - memcpy(haddr, &val, 16); - clear_helper_retaddr(); + do_st16_he_mmu(env, addr, val, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); if (HOST_BIG_ENDIAN) { val = bswap128(val); } - memcpy(haddr, &val, 16); - clear_helper_retaddr(); + do_st16_he_mmu(env, addr, val, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index da312dcf7e..93ac864aed 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -3114,6 +3114,48 @@ void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, MemOp memop) } } +/* + * Return true if @mop, without knowledge of the pointer alignment, + * does not require 16-byte atomicity, and it would be advantageous + * to avoid a call to a helper function. + */ +static bool use_two_i64_for_i128(MemOp mop) +{ +#ifdef CONFIG_SOFTMMU + /* Two softmmu tlb lookups are larger than one function call. */ + return false; +#else + /* + * For user-only, two 64-bit operations may well be smaller than a call. + * Determine if that would be legal for the requested atomicity. + */ + MemOp atom = mop & MO_ATOM_MASK; + MemOp atmax = mop & MO_ATMAX_MASK; + + /* In a serialized context, no atomicity is required.
*/ + if (!(tcg_ctx->gen_tb->cflags & CF_PARALLEL)) { + return true; + } + + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + switch (atom) { + case MO_ATOM_NONE: + return true; + case MO_ATOM_IFALIGN: + case MO_ATOM_SUBALIGN: + return atmax < MO_128; + case MO_ATOM_WITHIN16: + return atmax == MO_8; + default: + g_assert_not_reached(); + } +#endif +} + static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig) { MemOp mop_1 = orig, mop_2; @@ -3161,91 +3203,105 @@ static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig) void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOp mop[2]; - TCGv addr_p8; - TCGv_i64 x, y; + MemOpIdx oi = make_memop_idx(memop, idx); - canonicalize_memop_i128_as_i64(mop, memop); + tcg_debug_assert((memop & MO_SIZE) == MO_128); + tcg_debug_assert((memop & MO_SIGN) == 0); tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); addr = plugin_prep_mem_callbacks(addr); - /* TODO: respect atomicity of the operation. */ /* TODO: allow the tcg backend to see the whole operation. */ - /* - * Since there are no global TCGv_i128, there is no visible state - * changed if the second load faults. Load directly into the two - * subwords. - */ - if ((memop & MO_BSWAP) == MO_LE) { - x = TCGV128_LOW(val); - y = TCGV128_HIGH(val); + if (use_two_i64_for_i128(memop)) { + MemOp mop[2]; + TCGv addr_p8; + TCGv_i64 x, y; + + canonicalize_memop_i128_as_i64(mop, memop); + + /* + * Since there are no global TCGv_i128, there is no visible state + * changed if the second load faults. Load directly into the two + * subwords. 
+ */ + if ((memop & MO_BSWAP) == MO_LE) { + x = TCGV128_LOW(val); + y = TCGV128_HIGH(val); + } else { + x = TCGV128_HIGH(val); + y = TCGV128_LOW(val); + } + + gen_ldst_i64(INDEX_op_qemu_ld_i64, x, addr, mop[0], idx); + + if ((mop[0] ^ memop) & MO_BSWAP) { + tcg_gen_bswap64_i64(x, x); + } + + addr_p8 = tcg_temp_new(); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_ld_i64, y, addr_p8, mop[1], idx); + tcg_temp_free(addr_p8); + + if ((mop[0] ^ memop) & MO_BSWAP) { + tcg_gen_bswap64_i64(y, y); + } } else { - x = TCGV128_HIGH(val); - y = TCGV128_LOW(val); + gen_helper_ld_i128(val, cpu_env, addr, tcg_constant_i32(oi)); } - gen_ldst_i64(INDEX_op_qemu_ld_i64, x, addr, mop[0], idx); - - if ((mop[0] ^ memop) & MO_BSWAP) { - tcg_gen_bswap64_i64(x, x); - } - - addr_p8 = tcg_temp_new(); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_ld_i64, y, addr_p8, mop[1], idx); - tcg_temp_free(addr_p8); - - if ((mop[0] ^ memop) & MO_BSWAP) { - tcg_gen_bswap64_i64(y, y); - } - - plugin_gen_mem_callbacks(addr, make_memop_idx(memop, idx), - QEMU_PLUGIN_MEM_R); + plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_R); } void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOp mop[2]; - TCGv addr_p8; - TCGv_i64 x, y; + MemOpIdx oi = make_memop_idx(memop, idx); - canonicalize_memop_i128_as_i64(mop, memop); + tcg_debug_assert((memop & MO_SIZE) == MO_128); + tcg_debug_assert((memop & MO_SIGN) == 0); tcg_gen_req_mo(TCG_MO_ST_LD | TCG_MO_ST_ST); addr = plugin_prep_mem_callbacks(addr); - /* TODO: respect atomicity of the operation. */ /* TODO: allow the tcg backend to see the whole operation. 
*/ - if ((memop & MO_BSWAP) == MO_LE) { - x = TCGV128_LOW(val); - y = TCGV128_HIGH(val); + if (use_two_i64_for_i128(memop)) { + MemOp mop[2]; + TCGv addr_p8; + TCGv_i64 x, y; + + canonicalize_memop_i128_as_i64(mop, memop); + + if ((memop & MO_BSWAP) == MO_LE) { + x = TCGV128_LOW(val); + y = TCGV128_HIGH(val); + } else { + x = TCGV128_HIGH(val); + y = TCGV128_LOW(val); + } + + addr_p8 = tcg_temp_new(); + if ((mop[0] ^ memop) & MO_BSWAP) { + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_bswap64_i64(t, x); + gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr, mop[0], idx); + tcg_gen_bswap64_i64(t, y); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr_p8, mop[1], idx); + tcg_temp_free_i64(t); + } else { + gen_ldst_i64(INDEX_op_qemu_st_i64, x, addr, mop[0], idx); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_st_i64, y, addr_p8, mop[1], idx); + } + tcg_temp_free(addr_p8); } else { - x = TCGV128_HIGH(val); - y = TCGV128_LOW(val); + gen_helper_st_i128(cpu_env, addr, val, tcg_constant_i32(oi)); } - addr_p8 = tcg_temp_new(); - if ((mop[0] ^ memop) & MO_BSWAP) { - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_bswap64_i64(t, x); - gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr, mop[0], idx); - tcg_gen_bswap64_i64(t, y); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr_p8, mop[1], idx); - tcg_temp_free_i64(t); - } else { - gen_ldst_i64(INDEX_op_qemu_st_i64, x, addr, mop[0], idx); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_st_i64, y, addr_p8, mop[1], idx); - } - tcg_temp_free(addr_p8); - - plugin_gen_mem_callbacks(addr, make_memop_idx(memop, idx), - QEMU_PLUGIN_MEM_W); + plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_W); } static void tcg_gen_ext_i32(TCGv_i32 ret, TCGv_i32 val, MemOp opc) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index 0e4292ec66..40bf63a4b5 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -420,6 
+420,21 @@ static inline uint64_t load_atom_8_by_4(void *pv) } } +/** + * load_atom_8_by_8_or_4: + * @pv: host address + * + * Load 8 bytes from aligned @pv, with at least 4-byte atomicity. + */ +static inline uint64_t load_atom_8_by_8_or_4(void *pv) +{ + if (HAVE_al8_fast) { + return load_atomic8(pv); + } else { + return load_atom_8_by_4(pv); + } +} + /** * load_atom_2: * @p: host address @@ -552,6 +567,64 @@ static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, } } +/** + * load_atom_16: + * @p: host address + * @memop: the full memory op + * + * Load 16 bytes from @p, honoring the atomicity of @memop. + */ +static Int128 load_atom_16(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + Int128 r; + uint64_t a, b; + + /* + * If the host does not support 16-byte atomics, wait until we have + * examined the atomicity parameters below. + */ + if (HAVE_al16_fast && likely((pi & 15) == 0)) { + return load_atomic16(pv); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + memcpy(&r, pv, 16); + return r; + case MO_16: + a = load_atom_8_by_2(pv); + b = load_atom_8_by_2(pv + 8); + break; + case MO_32: + a = load_atom_8_by_4(pv); + b = load_atom_8_by_4(pv + 8); + break; + case MO_64: + if (!HAVE_al8) { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + a = load_atomic8(pv); + b = load_atomic8(pv + 8); + break; + case -MO_64: + if (!HAVE_al8) { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + a = load_atom_extract_al8x2(pv); + b = load_atom_extract_al8x2(pv + 8); + break; + case MO_128: + return load_atomic16_or_exit(env, ra, pv); + default: + g_assert_not_reached(); + } + return int128_make128(HOST_BIG_ENDIAN ? b : a, HOST_BIG_ENDIAN ? 
a : b); +} + /** * store_atomic2: * @pv: host address @@ -593,6 +666,40 @@ static inline void store_atomic8(void *pv, uint64_t val) qatomic_set__nocheck(p, val); } +/** + * store_atomic16: + * @pv: host address + * @val: value to store + * + * Atomically store 16 aligned bytes to @pv. + */ +static inline void store_atomic16(void *pv, Int128 val) +{ +#if defined(CONFIG_ATOMIC128) + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + Int128Alias new; + + new.s = val; + qatomic_set__nocheck(pu, new.u); +#elif defined(CONFIG_CMPXCHG128) + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + __uint128_t o; + Int128Alias n; + + /* + * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always + * defer to libatomic, so we must use __sync_val_compare_and_swap_16 + * and accept the sequential consistency that comes with it. + */ + n.s = val; + do { + o = *pu; + } while (!__sync_bool_compare_and_swap_16(pu, o, n.u)); +#else + qemu_build_not_reached(); +#endif +} + /** * store_atom_4x2 */ @@ -1036,3 +1143,85 @@ static void store_atom_8(CPUArchState *env, uintptr_t ra, } cpu_loop_exit_atomic(env_cpu(env), ra); } + +/** + * store_atom_16: + * @p: host address + * @val: the value to store + * @memop: the full memory op + * + * Store 16 bytes to @p, honoring the atomicity of @memop. + */ +static void store_atom_16(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop, Int128 val) +{ + uintptr_t pi = (uintptr_t)pv; + uint64_t a, b; + int atmax; + + if (HAVE_al16_fast && likely((pi & 15) == 0)) { + store_atomic16(pv, val); + return; + } + + atmax = required_atomicity(env, pi, memop); + + a = HOST_BIG_ENDIAN ? int128_gethi(val) : int128_getlo(val); + b = HOST_BIG_ENDIAN ? 
int128_getlo(val) : int128_gethi(val); + switch (atmax) { + case MO_8: + memcpy(pv, &val, 16); + return; + case MO_16: + store_atom_8_by_2(pv, a); + store_atom_8_by_2(pv + 8, b); + return; + case MO_32: + store_atom_8_by_4(pv, a); + store_atom_8_by_4(pv + 8, b); + return; + case MO_64: + if (HAVE_al8) { + store_atomic8(pv, a); + store_atomic8(pv + 8, b); + return; + } + break; + case -MO_64: + if (HAVE_al16) { + uint64_t val_le; + int s2 = pi & 15; + int s1 = 16 - s2; + + if (HOST_BIG_ENDIAN) { + val = bswap128(val); + } + switch (s2) { + case 1 ... 7: + val_le = store_whole_le16(pv, s1, val); + store_bytes_leN(pv + s1, s2, val_le); + break; + case 9 ... 15: + store_bytes_leN(pv, s1, int128_getlo(val)); + val = int128_urshift(val, s1 * 8); + store_whole_le16(pv + s1, s2, val); + break; + case 0: /* aligned */ + case 8: /* atmax MO_64 */ + default: + g_assert_not_reached(); + } + return; + } + break; + case MO_128: + if (HAVE_al16) { + store_atomic16(pv, val); + return; + } + break; + default: + g_assert_not_reached(); + } + cpu_loop_exit_atomic(env_cpu(env), ra); +} From patchwork Thu Feb 16 02:57:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743252 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=Eole4val; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKTn1TfSz23h0 for ; Thu, 16 Feb 2023 14:01:33 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTG-0003UJ-GX; Wed, 15 Feb 2023 21:58:06 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTD-0003Su-Sf for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:04 -0500 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTB-0005kA-G8 for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:03 -0500 Received: by mail-pg1-x541.google.com with SMTP id c29so415592pgm.5 for ; Wed, 15 Feb 2023 18:58:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=QmXLwNFeU+xZu9xjrhAU7pKrmUasowHjjl+QfLvQ2bo=; b=Eole4val/WiTsAky44w69zKh7PVT6mdO78tbMD7H6TYOWY0FNHps/phVgw/kFi61TO fP19QViuWXUDpMt8g700NDPbr/1C5Cd3M8w/Lg8e0q8Qfx8Ldr94iJ/bToUm73uZo0lm Z86bRlelc+nToizXYJga97KxsotYyYirZHmLd7jAQo4H2qSwL5D3NRzCdfqZeGcqDeOK iYqQPM42dceGIMhW0Lq5y6M70tH7MwCRu3x9oXaCCaQMJNRb0yibz00FcIgfMlAFWwo4 Yl3DBOHdVyNHDO93kahu69yIV+rKVizIFE3DErTbaZW/DK9k0vx3vCGGaPWOTmgT6Y1L EkMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QmXLwNFeU+xZu9xjrhAU7pKrmUasowHjjl+QfLvQ2bo=; b=8R/ze80vot1c0xt3cyjp/mUHi23uRiN6YCVrSmePSf1LVcNVjg6yWAunepsPvL2dAk XUUqa+5LElNJMroTO+hM4tF4Pg2Isn7PgMQcMAtPTY8u9fZyUizWNx9opoiHiwvWFqrq VEyK1qbNH0aGN84h7eWqcK6TlF5M1umLiLhXxN8yPBeGdhr9m8byekWPYIAofApa/+dL 
oNeyelUUpc059KagO0vvLr1uyFBO7zY8F6sM+GBkmyU88Ng0POC5bPegD57vfibiz1np esEzFtp3xPZVHNc/M0UTHnhsfW+TOvLW86/YP/XRGQHEyduADbRobfWDbeNm8647QtOZ 8h7A== X-Gm-Message-State: AO0yUKWhdBQnbiK97lbBkZr4sZaZnbJqHy//KJ+wAgeVvP2660kYqh5Q JLA1TnpeAXrIt7B07w31/gPn1V6+YNLxIgcyzOa+YQ== X-Google-Smtp-Source: AK7set95cj0oXm4U5XUarYPHbKc7bWnhdm2ug+R3gam1lGzfkvBzVY6wopa2gLA73VDwu4pIK0UKJQ== X-Received: by 2002:aa7:9ad0:0:b0:5a8:58b5:bfa5 with SMTP id x16-20020aa79ad0000000b005a858b5bfa5mr3630249pfp.4.1676516279978; Wed, 15 Feb 2023 18:57:59 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.57.58 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:57:59 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 13/30] meson: Detect atomic128 support with optimization Date: Wed, 15 Feb 2023 16:57:22 -1000 Message-Id: <20230216025739.1211680-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::541; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x541.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org There is an edge condition prior 
to gcc13 for which optimization is required to generate 16-byte atomic sequences. Detect this. Signed-off-by: Richard Henderson --- meson.build | 52 ++++++++++++++++++++++------------ accel/tcg/ldst_atomicity.c.inc | 38 ++++++++++++++++++------- 2 files changed, 61 insertions(+), 29 deletions(-) diff --git a/meson.build b/meson.build index a76c855312..964469780a 100644 --- a/meson.build +++ b/meson.build @@ -2229,23 +2229,21 @@ config_host_data.set('HAVE_BROKEN_SIZE_MAX', not cc.compiles(''' return printf("%zu", SIZE_MAX); }''', args: ['-Werror'])) -atomic_test = ''' +# See if 64-bit atomic operations are supported. +# Note that without __atomic builtins, we can only +# assume atomic loads/stores max at pointer size. +config_host_data.set('CONFIG_ATOMIC64', cc.links(''' #include <stdint.h> int main(void) { - @0@ x = 0, y = 0; + uint64_t x = 0, y = 0; y = __atomic_load_n(&x, __ATOMIC_RELAXED); __atomic_store_n(&x, y, __ATOMIC_RELAXED); __atomic_compare_exchange_n(&x, &y, x, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); __atomic_exchange_n(&x, y, __ATOMIC_RELAXED); __atomic_fetch_add(&x, y, __ATOMIC_RELAXED); return 0; - }''' - -# See if 64-bit atomic operations are supported. -# Note that without __atomic builtins, we can only -# assume atomic loads/stores max at pointer size. -config_host_data.set('CONFIG_ATOMIC64', cc.links(atomic_test.format('uint64_t'))) + }''')) has_int128 = cc.links(''' __int128_t a; @@ -2263,21 +2261,39 @@ if has_int128 # "do we have 128-bit atomics which are handled inline and specifically not # via libatomic". The reason we can't use libatomic is documented in the # comment starting "GCC is a house divided" in include/qemu/atomic128.h. - has_atomic128 = cc.links(atomic_test.format('unsigned __int128')) + # We only care about these operations on 16-byte aligned pointers, so + # force 16-byte alignment of the pointer, which may be greater than + # __alignof(unsigned __int128) for the host.
+ atomic_test_128 = ''' + int main(int ac, char **av) { + unsigned __int128 *p = __builtin_assume_aligned(av[ac - 1], 16); + p[1] = __atomic_load_n(&p[0], __ATOMIC_RELAXED); + __atomic_store_n(&p[2], p[3], __ATOMIC_RELAXED); + __atomic_compare_exchange_n(&p[4], &p[5], p[6], 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); + return 0; + }''' + has_atomic128 = cc.links(atomic_test_128) config_host_data.set('CONFIG_ATOMIC128', has_atomic128) if not has_atomic128 - has_cmpxchg128 = cc.links(''' - int main(void) - { - unsigned __int128 x = 0, y = 0; - __sync_val_compare_and_swap_16(&x, y, x); - return 0; - } - ''') + # Even with __builtin_assume_aligned, the above test may have failed + # without optimization enabled. Try again with optimizations locally + # enabled for the function. See + # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389 + has_atomic128_opt = cc.links('__attribute__((optimize("O1")))' + atomic_test_128) + config_host_data.set('CONFIG_ATOMIC128_OPT', has_atomic128_opt) - config_host_data.set('CONFIG_CMPXCHG128', has_cmpxchg128) + if not has_atomic128_opt + config_host_data.set('CONFIG_CMPXCHG128', cc.links(''' + int main(void) + { + unsigned __int128 x = 0, y = 0; + __sync_val_compare_and_swap_16(&x, y, x); + return 0; + } + ''')) + endif endif endif diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index 40bf63a4b5..c7999e0ef1 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -16,6 +16,23 @@ #endif #define HAVE_al8_fast (ATOMIC_REG_SIZE >= 8) +/* + * If __alignof(unsigned __int128) < 16, GCC may refuse to inline atomics + * that are supported by the host, e.g. s390x. We can force the pointer to + * have our known alignment with __builtin_assume_aligned, however prior to + * GCC 13 that was only reliable with optimization enabled. 
See + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389 + */ +#if defined(CONFIG_ATOMIC128_OPT) +# if !defined(__OPTIMIZE__) +# define ATTRIBUTE_ATOMIC128_OPT __attribute__((optimize("O1"))) +# endif +# define CONFIG_ATOMIC128 +#endif +#ifndef ATTRIBUTE_ATOMIC128_OPT +# define ATTRIBUTE_ATOMIC128_OPT +#endif + #if defined(CONFIG_ATOMIC128) # define HAVE_al16_fast true #else @@ -136,7 +153,8 @@ static inline uint64_t load_atomic8(void *pv) * * Atomically load 16 aligned bytes from @pv. */ -static inline Int128 load_atomic16(void *pv) +static inline Int128 ATTRIBUTE_ATOMIC128_OPT +load_atomic16(void *pv) { #ifdef CONFIG_ATOMIC128 __uint128_t *p = __builtin_assume_aligned(pv, 16); @@ -337,7 +355,8 @@ static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra, * cross an 16-byte boundary then the access must be 16-byte atomic, * otherwise the access must be 8-byte atomic. */ -static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s) +static inline uint64_t ATTRIBUTE_ATOMIC128_OPT +load_atom_extract_al16_or_al8(void *pv, int s) { #if defined(CONFIG_ATOMIC128) uintptr_t pi = (uintptr_t)pv; @@ -673,28 +692,24 @@ static inline void store_atomic8(void *pv, uint64_t val) * * Atomically store 16 aligned bytes to @pv. */ -static inline void store_atomic16(void *pv, Int128 val) +static inline void ATTRIBUTE_ATOMIC128_OPT +store_atomic16(void *pv, Int128Alias val) { #if defined(CONFIG_ATOMIC128) __uint128_t *pu = __builtin_assume_aligned(pv, 16); - Int128Alias new; - - new.s = val; - qatomic_set__nocheck(pu, new.u); + qatomic_set__nocheck(pu, val.u); #elif defined(CONFIG_CMPXCHG128) __uint128_t *pu = __builtin_assume_aligned(pv, 16); __uint128_t o; - Int128Alias n; /* * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always * defer to libatomic, so we must use __sync_val_compare_and_swap_16 * and accept the sequential consistency that comes with it. 
*/ - n.s = val; do { o = *pu; - } while (!__sync_bool_compare_and_swap_16(pu, o, n.u)); + } while (!__sync_bool_compare_and_swap_16(pu, o, val.u)); #else qemu_build_not_reached(); #endif @@ -776,7 +791,8 @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk) * * Atomically store @val to @p masked by @msk. */ -static void store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) +static void ATTRIBUTE_ATOMIC128_OPT +store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) { #if defined(CONFIG_ATOMIC128) __uint128_t *pu, old, new; From patchwork Thu Feb 16 02:57:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743241 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=i9OR7AVN; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKRh35P0z1yYg for ; Thu, 16 Feb 2023 13:59:44 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTH-0003VD-RU; Wed, 15 Feb 2023 21:58:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTE-0003T0-OP for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:04 -0500 
Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTC-0005ka-Np for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:04 -0500 Received: by mail-pf1-x441.google.com with SMTP id bw10so617576pfb.0 for ; Wed, 15 Feb 2023 18:58:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=QP+/EHiQKRHx1fO7mLC4dIgIU6Ym3+F7e4Wu/m2d1pQ=; b=i9OR7AVN0KsuyYPTZnRm2cIoFPcpmpTQ70AER6hKsFGo0Wf5neh+dR2swaZpa/MS6I VF44WqvDjWH7WEqTSSckMhYb45hn7QWePMMZvgegfRk1k15lrFtPPQMfzrvGL8WXF+nc KYyaWzcedxnDyFnHKMPcl1Nnd2xSZDkS80tpAyoQN/PfzYGjA4J5I7rqYRv/FngxMPnc bw/7pKnplqjDEYDN57hhbh+SL5+hbo0DDtIuqZ0w2jdfLg9oLwyOpJcfTOqN/cVyCsOL 7WJ9zf+d2OEgkMy81RlooJu/YfQ12HRefCAKI0gdnQktX51p9p6gI+A6/dI1cwwsvQZk Xtjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QP+/EHiQKRHx1fO7mLC4dIgIU6Ym3+F7e4Wu/m2d1pQ=; b=SvFwiZiD0RImmk/Zn5BWUxUmpD86NEjIs+DZU/ofFF4UCyTPwsSlkHd+ei0mvQc/6U FnL28TM+L9LEiXjB0W5MI8BXvK8kyQdRK28iyAFaEK6Evrk4/kC6rn3cScPXU5MlShFo gL8WA+gXLYUr/Pgiz7NgOndiZTyuhU8SY/a4CyrI8hzPkq8OvilkRpEszBUlAF2DpfvT baQQvJjqxH6eEG9mKsEoiCaMwsV4PU0+HrKWMHNUtINAnRmspT+QXyNBSJ3xLUDWcGGd /YrYhy4PEcfQab8xDtgrG1q2GTcwNh5mW89g4v0qZxM82ZYbX37tdoi+DTuPCHh4EEof nwQw== X-Gm-Message-State: AO0yUKUwvRHjwhHLyTgnQ++r6rkHUh2gZIVU/V7orpFiF2wzf7rMEX5J I8WyJxzylaETouqUqVSkUVrTJwYlMSUrKiP55oZOnQ== X-Google-Smtp-Source: AK7set/OJ2559IdK8XefOD2/IBi9BYSQ8d6E0yv1dT5cN+sOCMucCewBJ5MHprYKSXlIClIzlD9c4Q== X-Received: by 2002:aa7:8428:0:b0:5a8:bbac:1cf2 with SMTP id q8-20020aa78428000000b005a8bbac1cf2mr3574334pfn.1.1676516281304; Wed, 15 Feb 
2023 18:58:01 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.58.00 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:58:00 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 14/30] tcg/i386: Add have_atomic16 Date: Wed, 15 Feb 2023 16:57:23 -1000 Message-Id: <20230216025739.1211680-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::441; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x441.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Notice when Intel or AMD have guaranteed that vmovdqa is atomic. The new variable will also be used in generated code. 
Signed-off-by: Richard Henderson --- include/qemu/cpuid.h | 18 ++++++++++++++++++ tcg/i386/tcg-target.h | 1 + tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++ 3 files changed, 46 insertions(+) diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h index 1451e8ef2f..35325f1995 100644 --- a/include/qemu/cpuid.h +++ b/include/qemu/cpuid.h @@ -71,6 +71,24 @@ #define bit_LZCNT (1 << 5) #endif +/* + * Signatures for different CPU implementations as returned from Leaf 0. + */ + +#ifndef signature_INTEL_ecx +/* "Genu" "ineI" "ntel" */ +#define signature_INTEL_ebx 0x756e6547 +#define signature_INTEL_edx 0x49656e69 +#define signature_INTEL_ecx 0x6c65746e +#endif + +#ifndef signature_AMD_ecx +/* "Auth" "enti" "cAMD" */ +#define signature_AMD_ebx 0x68747541 +#define signature_AMD_edx 0x69746e65 +#define signature_AMD_ecx 0x444d4163 +#endif + static inline unsigned xgetbv_low(unsigned c) { unsigned a, d; diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index d4f2a6f8c2..0421776cb8 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -120,6 +120,7 @@ extern bool have_avx512dq; extern bool have_avx512vbmi2; extern bool have_avx512vl; extern bool have_movbe; +extern bool have_atomic16; /* optional instructions */ #define TCG_TARGET_HAS_div2_i32 1 diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 29dba3fa1c..977650263b 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -185,6 +185,7 @@ bool have_avx512dq; bool have_avx512vbmi2; bool have_avx512vl; bool have_movbe; +bool have_atomic16; #ifdef CONFIG_CPUID_H static bool have_bmi2; @@ -4173,6 +4174,32 @@ static void tcg_target_init(TCGContext *s) have_avx512dq = (b7 & bit_AVX512DQ) != 0; have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0; } + + /* + * The Intel SDM has added: + * Processors that enumerate support for Intel® AVX + * (by setting the feature flag CPUID.01H:ECX.AVX[bit 28]) + * guarantee that the 16-byte memory operations performed + * by the 
following instructions will always be carried + * out atomically: + * - MOVAPD, MOVAPS, and MOVDQA. + * - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128. + * - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded + * with EVEX.128 and k0 (masking disabled). + * Note that these instructions require the linear addresses + * of their memory operands to be 16-byte aligned. + * + * AMD has provided an even stronger guarantee that processors + * with AVX provide 16-byte atomicity for all cachable, + * naturally aligned single loads and stores, e.g. MOVDQU. + * + * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 + */ + if (have_avx1) { + __cpuid(0, a, b, c, d); + have_atomic16 = (c == signature_INTEL_ecx || + c == signature_AMD_ecx); + } } } }

From patchwork Thu Feb 16 02:57:24 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743253
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 15/30] accel/tcg: Use have_atomic16 in ldst_atomicity.c.inc
Date: Wed, 15 Feb 2023 16:57:24 -1000
Message-Id: <20230216025739.1211680-16-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Hosts using Intel and AMD AVX cpus are quite common. Add fast paths through ldst_atomicity using this.
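One of the fast paths in this patch (in load_atom_extract_al16_or_al8) loads two 8-byte halves and merges them with a single `shrd`. For readers less familiar with that instruction, here is a portable sketch of what it computes for a 64-bit operand. The helpers `shrd64` and `shift_for_address` are hypothetical names for illustration only; the patch itself uses inline asm and has no such functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: portable equivalent of the x86 instruction
 * "shrd %cl, hi, lo" on 64-bit operands.  The low half is shifted
 * right and the vacated top bits are filled from the low bits of
 * the high half.
 */
static inline uint64_t shrd64(uint64_t lo, uint64_t hi, unsigned shr)
{
    assert(shr < 64);
    if (shr == 0) {
        /* hi << 64 would be undefined in C; the result is just lo. */
        return lo;
    }
    return (lo >> shr) | (hi << (64 - shr));
}

/*
 * Hypothetical helper: the bit shift the patch derives from the low
 * three address bits, i.e. the byte offset within an 8-byte unit
 * converted to bits.
 */
static inline unsigned shift_for_address(uintptr_t addr)
{
    return (unsigned)(addr & 7) * 8;
}
```

For example, an address ending in 3 gives a shift of 24 bits, so the result takes its low 40 bits from the first half and its high 24 bits from the second.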
Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 76 +++++++++++++++++++++++++++------- 1 file changed, 60 insertions(+), 16 deletions(-) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index c7999e0ef1..07982e021d 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -35,6 +35,14 @@ #if defined(CONFIG_ATOMIC128) # define HAVE_al16_fast true +#elif defined(CONFIG_TCG_INTERPRETER) +/* + * FIXME: host specific detection for this is in tcg/$host/, + * but we're using tcg/tci/ instead. + */ +# define HAVE_al16_fast false +#elif defined(__x86_64__) +# define HAVE_al16_fast likely(have_atomic16) #else # define HAVE_al16_fast false #endif @@ -162,6 +170,12 @@ load_atomic16(void *pv) r.u = qatomic_read__nocheck(p); return r.s; +#elif defined(__x86_64__) + Int128Alias r; + + /* Via HAVE_al16_fast, have_atomic16 is true. */ + asm("vmovdqa %1, %0" : "=x" (r.u) : "m" (*(Int128 *)pv)); + return r.s; #else qemu_build_not_reached(); #endif @@ -380,6 +394,24 @@ load_atom_extract_al16_or_al8(void *pv, int s) r = qatomic_read__nocheck(p16); } return r >> shr; +#elif defined(__x86_64__) + uintptr_t pi = (uintptr_t)pv; + int shr = (pi & 7) * 8; + uint64_t a, b; + + /* Via HAVE_al16_fast, have_atomic16 is true. 
*/ + pv = (void *)(pi & ~7); + if (pi & 8) { + uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8); + a = qatomic_read__nocheck(p8); + b = qatomic_read__nocheck(p8 + 1); + } else { + asm("vmovdqa %2, %0\n\tvpextrq $1, %0, %1" + : "=x"(a), "=r"(b) : "m" (*(__uint128_t *)pv)); + } + asm("shrd %b2, %1, %0" : "+r"(a) : "r"(b), "c"(shr)); + + return a; #else qemu_build_not_reached(); #endif @@ -696,23 +728,35 @@ static inline void ATTRIBUTE_ATOMIC128_OPT store_atomic16(void *pv, Int128Alias val) { #if defined(CONFIG_ATOMIC128) - __uint128_t *pu = __builtin_assume_aligned(pv, 16); - qatomic_set__nocheck(pu, val.u); -#elif defined(CONFIG_CMPXCHG128) - __uint128_t *pu = __builtin_assume_aligned(pv, 16); - __uint128_t o; - - /* - * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always - * defer to libatomic, so we must use __sync_val_compare_and_swap_16 - * and accept the sequential consistency that comes with it. - */ - do { - o = *pu; - } while (!__sync_bool_compare_and_swap_16(pu, o, val.u)); -#else - qemu_build_not_reached(); + { + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + qatomic_set__nocheck(pu, val.u); + return; + } #endif +#if defined(__x86_64__) + if (HAVE_al16_fast) { + asm("vmovdqa %1, %0" : "=m"(*(__uint128_t *)pv) : "x" (val.u)); + return; + } +#endif +#if defined(CONFIG_CMPXCHG128) + { + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + __uint128_t o; + + /* + * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always + * defer to libatomic, so we must use __sync_val_compare_and_swap_16 + * and accept the sequential consistency that comes with it. 
+ */ + do { + o = *pu; + } while (!__sync_bool_compare_and_swap_16(pu, o, val.u)); + return; + } +#endif + qemu_build_not_reached(); } /**

From patchwork Thu Feb 16 02:57:25 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743235
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 16/30] accel/tcg: Add aarch64 specific support in ldst_atomicity
Date: Wed, 15 Feb 2023 16:57:25 -1000
Message-Id: <20230216025739.1211680-17-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

We have code in atomic128.h noting that through GCC 8, there was no support for atomic operations on __uint128. This has been fixed in GCC 10. But we can still improve over any basic compare-and-swap loop using the ldxp/stxp instructions.
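For comparison, a basic compare-and-swap loop of the kind mentioned above might look like the sketch below. It performs a masked insert, new = (old & ~msk) | val, which is the same read-modify-write that this patch expresses with ldxp/bic/orr/stxp in store_atom_insert_al16. To stay buildable with plain C11 atomics on any host, it operates on a 64-bit word rather than 128 bits; that narrowing, the relaxed memory ordering, and the name `insert_masked` are assumptions of this example, not something taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper: atomically replace the bits selected by msk
 * with the corresponding bits of val, leaving the other bits as they
 * were.  val is expected to be zero outside msk.
 */
static inline void insert_masked(_Atomic uint64_t *p, uint64_t val,
                                 uint64_t msk)
{
    uint64_t old = atomic_load_explicit(p, memory_order_relaxed);
    uint64_t new_val;

    assert((val & ~msk) == 0);
    do {
        /* Same arithmetic as the bic + orr pair in the asm loop. */
        new_val = (old & ~msk) | val;
        /* On failure, old is refreshed with the current contents. */
    } while (!atomic_compare_exchange_weak_explicit(p, &old, new_val,
                                                    memory_order_relaxed,
                                                    memory_order_relaxed));
}
```

With load-exclusive/store-conditional, the load, the arithmetic and the store form one retry loop, whereas the compare-and-swap version needs a separate initial load before the first exchange attempt.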
Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 60 ++++++++++++++++++++++++++++++++-- 1 file changed, 57 insertions(+), 3 deletions(-) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index 07982e021d..9a95ac327d 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -247,7 +247,22 @@ static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv) * In system mode all guest pages are writable, and for user-only * we have just checked writability. Try cmpxchg. */ -#if defined(CONFIG_CMPXCHG128) +#if defined(__aarch64__) + /* We can do better than cmpxchg for AArch64. */ + { + uint64_t l, h; + uint32_t fail; + + /* The load must be paired with the store to guarantee not tearing. */ + asm("0: ldxp %0, %1, %3\n\t" + "stxp %w2, %0, %1, %3\n\t" + "cbnz %w2, 0b" + : "=&r"(l), "=&r"(h), "=&r"(fail) : "Q"(*p)); + + qemu_build_assert(!HOST_BIG_ENDIAN); + return int128_make128(l, h); + } +#elif defined(CONFIG_CMPXCHG128) /* Swap 0 with 0, with the side-effect of returning the old value. */ { Int128Alias r; @@ -740,7 +755,22 @@ store_atomic16(void *pv, Int128Alias val) return; } #endif -#if defined(CONFIG_CMPXCHG128) +#if defined(__aarch64__) + /* We can do better than cmpxchg for AArch64. */ + { + uint64_t l, h, t; + + qemu_build_assert(!HOST_BIG_ENDIAN); + l = int128_getlo(val.s); + h = int128_gethi(val.s); + + asm("0: ldxp %0, xzr, %1\n\t" + "stxp %w0, %2, %3, %1\n\t" + "cbnz %w0, 0b" + : "=&r"(t), "=Q"(*(__uint128_t *)pv) : "r"(l), "r"(h)); + return; + } +#elif defined(CONFIG_CMPXCHG128) { __uint128_t *pu = __builtin_assume_aligned(pv, 16); __uint128_t o; @@ -838,7 +868,31 @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk) static void ATTRIBUTE_ATOMIC128_OPT store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) { -#if defined(CONFIG_ATOMIC128) +#if defined(__aarch64__) + /* + * GCC only implements __sync* primitives for int128 on aarch64. 
+ * We can do better without the barriers, and integrating the + * arithmetic into the load-exclusive/store-conditional pair. + */ + uint64_t tl, th, vl, vh, ml, mh; + uint32_t fail; + + qemu_build_assert(!HOST_BIG_ENDIAN); + vl = int128_getlo(val.s); + vh = int128_gethi(val.s); + ml = int128_getlo(msk.s); + mh = int128_gethi(msk.s); + + asm("0: ldxp %[l], %[h], %[mem]\n\t" + "bic %[l], %[l], %[ml]\n\t" + "bic %[h], %[h], %[mh]\n\t" + "orr %[l], %[l], %[vl]\n\t" + "orr %[h], %[h], %[vh]\n\t" + "stxp %w[f], %[l], %[h], %[mem]\n\t" + "cbnz %w[f], 0b\n" + : [mem] "+Q"(*ps), [f] "=&r"(fail), [l] "=&r"(tl), [h] "=&r"(th) + : [vl] "r"(vl), [vh] "r"(vh), [ml] "r"(ml), [mh] "r"(mh)); +#elif defined(CONFIG_ATOMIC128) __uint128_t *pu, old, new; /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */

From patchwork Thu Feb 16 02:57:26 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743233
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 17/30] tcg/aarch64: Detect have_lse, have_lse2 for linux
Date: Wed, 15 Feb 2023 16:57:26 -1000
Message-Id: <20230216025739.1211680-18-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Notice when the host has additional atomic instructions. The new variables will also be used in generated code.
Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- tcg/aarch64/tcg-target.h | 3 +++ tcg/aarch64/tcg-target.c.inc | 12 ++++++++++++ 2 files changed, 15 insertions(+) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index c0b0f614ba..3c0b0d312d 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -57,6 +57,9 @@ typedef enum { #define TCG_TARGET_CALL_ARG_I128 TCG_CALL_ARG_EVEN #define TCG_TARGET_CALL_RET_I128 TCG_CALL_RET_NORMAL +extern bool have_lse; +extern bool have_lse2; + /* optional instructions */ #define TCG_TARGET_HAS_div_i32 1 #define TCG_TARGET_HAS_rem_i32 1 diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index 05123cce35..d144d1a769 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -13,6 +13,9 @@ #include "../tcg-ldst.c.inc" #include "../tcg-pool.c.inc" #include "qemu/bitops.h" +#ifdef __linux__ +#include +#endif /* We're going to re-use TCGType in setting of the SF bit, which controls the size of the operation performed. 
If we know the values match, it @@ -71,6 +74,9 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot) return TCG_REG_X0 + slot; } +bool have_lse; +bool have_lse2; + #define TCG_REG_TMP TCG_REG_X30 #define TCG_VEC_TMP TCG_REG_V31 @@ -2912,6 +2918,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) static void tcg_target_init(TCGContext *s) { +#ifdef __linux__ + unsigned long hwcap = qemu_getauxval(AT_HWCAP); + have_lse = hwcap & HWCAP_ATOMICS; + have_lse2 = hwcap & HWCAP_USCAT; +#endif + tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffffu; tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffffu; tcg_target_available_regs[TCG_TYPE_V64] = 0xffffffff00000000ull;

From patchwork Thu Feb 16 02:57:27 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743255
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 18/30] tcg/aarch64: Detect have_lse, have_lse2 for darwin
Date: Wed, 15 Feb 2023 16:57:27 -1000
Message-Id: <20230216025739.1211680-19-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

These features are present for Apple M1.
Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé --- tcg/aarch64/tcg-target.c.inc | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc index d144d1a769..1a295791b4 100644 --- a/tcg/aarch64/tcg-target.c.inc +++ b/tcg/aarch64/tcg-target.c.inc @@ -16,6 +16,9 @@ #ifdef __linux__ #include #endif +#ifdef CONFIG_DARWIN +#include +#endif /* We're going to re-use TCGType in setting of the SF bit, which controls the size of the operation performed. If we know the values match, it @@ -2916,6 +2919,27 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) } } +#ifdef CONFIG_DARWIN +static bool sysctl_for_bool(const char *name) +{ + int val = 0; + size_t len = sizeof(val); + + if (sysctlbyname(name, &val, &len, NULL, 0) == 0) { + return val != 0; + } + + /* + * We might ask for properties not present in older kernels, + * but we're only asking about static properties, all of which + * should be 'int'. So we shouldn't see ENOMEM (val too small), + * or any of the other more exotic errors.
+ */ + assert(errno == ENOENT); + return false; +} +#endif + static void tcg_target_init(TCGContext *s) { #ifdef __linux__ @@ -2923,6 +2947,10 @@ static void tcg_target_init(TCGContext *s) have_lse = hwcap & HWCAP_ATOMICS; have_lse2 = hwcap & HWCAP_USCAT; #endif +#ifdef CONFIG_DARWIN + have_lse = sysctl_for_bool("hw.optional.arm.FEAT_LSE"); + have_lse2 = sysctl_for_bool("hw.optional.arm.FEAT_LSE2"); +#endif tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffffu; tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffffu;

From patchwork Thu Feb 16 02:57:28 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743248
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 19/30] accel/tcg: Add have_lse2 support in ldst_atomicity
Date: Wed, 15 Feb 2023 16:57:28 -1000
Message-Id: <20230216025739.1211680-20-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Add fast paths for FEAT_LSE2, using the detection in tcg.
Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 37 ++++++++++++++++++++++++++++++---- 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index 9a95ac327d..277629f241 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -41,6 +41,8 @@ * but we're using tcg/tci/ instead.
  */
 # define HAVE_al16_fast false
+#elif defined(__aarch64__)
+# define HAVE_al16_fast likely(have_lse2)
 #elif defined(__x86_64__)
 # define HAVE_al16_fast likely(have_atomic16)
 #else
@@ -48,6 +50,8 @@
 #endif
 #if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
 # define HAVE_al16 true
+#elif defined(__aarch64__)
+# define HAVE_al16 true
 #else
 # define HAVE_al16 false
 #endif
@@ -170,6 +174,14 @@ load_atomic16(void *pv)
 
     r.u = qatomic_read__nocheck(p);
     return r.s;
+#elif defined(__aarch64__)
+    uint64_t l, h;
+
+    /* Via HAVE_al16_fast, FEAT_LSE2 is present: LDP becomes atomic. */
+    asm("ldp %0, %1, %2" : "=r"(l), "=r"(h) : "m"(*(__uint128_t *)pv));
+
+    qemu_build_assert(!HOST_BIG_ENDIAN);
+    return int128_make128(l, h);
 #elif defined(__x86_64__)
     Int128Alias r;
 
@@ -409,6 +421,18 @@ load_atom_extract_al16_or_al8(void *pv, int s)
         r = qatomic_read__nocheck(p16);
     }
     return r >> shr;
+#elif defined(__aarch64__)
+    /*
+     * Via HAVE_al16_fast, FEAT_LSE2 is present.
+     * LDP becomes single-copy atomic if 16-byte aligned, and
+     * single-copy atomic on the parts if 8-byte aligned.
+     */
+    uintptr_t pi = (uintptr_t)pv;
+    int shr = (pi & 7) * 8;
+    uint64_t l, h;
+
+    asm("ldp %0, %1, %2" : "=r"(l), "=r"(h) : "m"(*(__uint128_t *)(pi & ~7)));
+    return (l >> shr) | (h << (-shr & 63));
 #elif defined(__x86_64__)
     uintptr_t pi = (uintptr_t)pv;
     int shr = (pi & 7) * 8;
@@ -764,10 +788,15 @@ store_atomic16(void *pv, Int128Alias val)
         l = int128_getlo(val.s);
         h = int128_gethi(val.s);
 
-        asm("0: ldxp %0, xzr, %1\n\t"
-            "stxp %w0, %2, %3, %1\n\t"
-            "cbnz %w0, 0b"
-            : "=&r"(t), "=Q"(*(__uint128_t *)pv) : "r"(l), "r"(h));
+        if (HAVE_al16_fast) {
+            /* Via HAVE_al16_fast, FEAT_LSE2 is present: STP becomes atomic. */
+            asm("stp %1, %2, %0" : "=Q"(*(__uint128_t *)pv) : "r"(l), "r"(h));
+        } else {
+            asm("0: ldxp %0, xzr, %1\n\t"
+                "stxp %w0, %2, %3, %1\n\t"
+                "cbnz %w0, 0b"
+                : "=&r"(t), "=Q"(*(__uint128_t *)pv) : "r"(l), "r"(h));
+        }
         return;
     }
 #elif defined(CONFIG_CMPXCHG128)

From patchwork Thu Feb 16 02:57:29 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743243
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 20/30] tcg: Introduce TCG_OPF_TYPE_MASK
Date: Wed, 15 Feb 2023 16:57:29 -1000
Message-Id: <20230216025739.1211680-21-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Reorg TCG_OPF_64BIT and TCG_OPF_VECTOR into a two-bit field so
that we can add TCG_OPF_128BIT without requiring another bit.
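
For illustration only (not part of the patch): a minimal sketch of why a
two-bit type field has to be tested with mask-and-compare rather than with
a single-bit AND. The enum values are copied from the tcg.h hunk in this
patch; the helper names opf_type and opf_is_vector are hypothetical and do
not exist in QEMU.

```c
#include <stdint.h>

/* Values as laid out by this patch: the low two bits encode the type. */
enum {
    TCG_OPF_TYPE_MASK = 0x03,
    TCG_OPF_32BIT     = 0x00,
    TCG_OPF_64BIT     = 0x01,
    TCG_OPF_VECTOR    = 0x02,
    TCG_OPF_128BIT    = 0x03,
    TCG_OPF_BB_EXIT   = 0x04,   /* first of the independent flag bits */
};

/* Hypothetical helper: isolate the two-bit type field. */
static inline unsigned opf_type(uint8_t flags)
{
    return flags & TCG_OPF_TYPE_MASK;
}

/*
 * Hypothetical helper: compare the whole field.  A plain
 * (flags & TCG_OPF_VECTOR) test would also fire for TCG_OPF_128BIT,
 * because 0x03 has the 0x02 bit set.
 */
static inline int opf_is_vector(uint8_t flags)
{
    return opf_type(flags) == TCG_OPF_VECTOR;
}
```

This is the same pattern the tcg.c and tcg-target.c.inc hunks switch to:
every former single-bit test becomes a comparison against the masked field.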

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 include/tcg/tcg.h            | 22 ++++++++++++----------
 tcg/optimize.c               | 15 ++++++++++++---
 tcg/tcg.c                    |  4 ++--
 tcg/aarch64/tcg-target.c.inc |  8 +++++---
 tcg/tci/tcg-target.c.inc     |  3 ++-
 5 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 59854f95b1..23369541fe 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -987,24 +987,26 @@ typedef struct TCGArgConstraint {
 
 /* Bits for TCGOpDef->flags, 8 bits available, all used. */
 enum {
+    /* Two bits describing the output type. */
+    TCG_OPF_TYPE_MASK = 0x03,
+    TCG_OPF_32BIT = 0x00,
+    TCG_OPF_64BIT = 0x01,
+    TCG_OPF_VECTOR = 0x02,
+    TCG_OPF_128BIT = 0x03,
     /* Instruction exits the translation block. */
-    TCG_OPF_BB_EXIT = 0x01,
+    TCG_OPF_BB_EXIT = 0x04,
     /* Instruction defines the end of a basic block. */
-    TCG_OPF_BB_END = 0x02,
+    TCG_OPF_BB_END = 0x08,
     /* Instruction clobbers call registers and potentially update globals. */
-    TCG_OPF_CALL_CLOBBER = 0x04,
+    TCG_OPF_CALL_CLOBBER = 0x10,
     /* Instruction has side effects: it cannot be removed if its outputs
        are not used, and might trigger exceptions. */
-    TCG_OPF_SIDE_EFFECTS = 0x08,
-    /* Instruction operands are 64-bits (otherwise 32-bits). */
-    TCG_OPF_64BIT = 0x10,
+    TCG_OPF_SIDE_EFFECTS = 0x20,
     /* Instruction is optional and not implemented by the host, or insn
        is generic and should not be implemened by the host. */
-    TCG_OPF_NOT_PRESENT = 0x20,
-    /* Instruction operands are vectors. */
-    TCG_OPF_VECTOR = 0x40,
+    TCG_OPF_NOT_PRESENT = 0x40,
     /* Instruction is a conditional branch. */
-    TCG_OPF_COND_BRANCH = 0x80
+    TCG_OPF_COND_BRANCH = 0x80,
 };
 
 typedef struct TCGOpDef {
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 763bca9ea6..5c0bd6b6e6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2053,12 +2053,21 @@ void tcg_optimize(TCGContext *s)
         copy_propagate(&ctx, op, def->nb_oargs, def->nb_iargs);
 
         /* Pre-compute the type of the operation. */
-        if (def->flags & TCG_OPF_VECTOR) {
+        switch (def->flags & TCG_OPF_TYPE_MASK) {
+        case TCG_OPF_VECTOR:
             ctx.type = TCG_TYPE_V64 + TCGOP_VECL(op);
-        } else if (def->flags & TCG_OPF_64BIT) {
+            break;
+        case TCG_OPF_128BIT:
+            ctx.type = TCG_TYPE_I128;
+            break;
+        case TCG_OPF_64BIT:
             ctx.type = TCG_TYPE_I64;
-        } else {
+            break;
+        case TCG_OPF_32BIT:
             ctx.type = TCG_TYPE_I32;
+            break;
+        default:
+            qemu_build_not_reached();
         }
 
         /* Assume all bits affected, no bits known zero, no sign reps. */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a4a3da6804..07522d50ee 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2118,7 +2118,7 @@ static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
             nb_iargs = def->nb_iargs;
             nb_cargs = def->nb_cargs;
 
-            if (def->flags & TCG_OPF_VECTOR) {
+            if ((def->flags & TCG_OPF_TYPE_MASK) == TCG_OPF_VECTOR) {
                 col += ne_fprintf(f, "v%d,e%d,", 64 << TCGOP_VECL(op),
                                   8 << TCGOP_VECE(op));
             }
@@ -4375,7 +4375,7 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op)
     }
 
     /* emit instruction */
-    if (def->flags & TCG_OPF_VECTOR) {
+    if ((def->flags & TCG_OPF_TYPE_MASK) == TCG_OPF_VECTOR) {
         tcg_out_vec_op(s, op->opc, TCGOP_VECL(op), TCGOP_VECE(op),
                        new_args, const_args);
     } else {
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 1a295791b4..6e40a453e6 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1922,9 +1922,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        const TCGArg args[TCG_MAX_OP_ARGS],
                        const int const_args[TCG_MAX_OP_ARGS])
 {
-    /* 99% of the time, we can signal the use of extension registers
-       by looking to see if the opcode handles 64-bit data. */
-    TCGType ext = (tcg_op_defs[opc].flags & TCG_OPF_64BIT) != 0;
+    /*
+     * 99% of the time, we can signal the use of extension registers
+     * by looking to see if the opcode handles 32-bit data or not.
+     */
+    TCGType ext = (tcg_op_defs[opc].flags & TCG_OPF_TYPE_MASK) != TCG_OPF_32BIT;
 
     /* Hoist the loads of the most common arguments. */
     TCGArg a0 = args[0];
diff --git a/tcg/tci/tcg-target.c.inc b/tcg/tci/tcg-target.c.inc
index c1d34d7bd1..570b8c160e 100644
--- a/tcg/tci/tcg-target.c.inc
+++ b/tcg/tci/tcg-target.c.inc
@@ -697,7 +697,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
     CASE_32_64(sextract) /* Optional (TCG_TARGET_HAS_sextract_*). */
         {
             TCGArg pos = args[2], len = args[3];
-            TCGArg max = tcg_op_defs[opc].flags & TCG_OPF_64BIT ? 64 : 32;
+            TCGArg max = ((tcg_op_defs[opc].flags & TCG_OPF_TYPE_MASK)
+                          == TCG_OPF_32BIT ? 32 : 64);
 
             tcg_debug_assert(pos < max);
             tcg_debug_assert(pos + len <= max);

From patchwork Thu Feb 16 02:57:30 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743260
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 21/30] tcg: Add INDEX_op_qemu_{ld,st}_i128
Date: Wed, 15 Feb 2023 16:57:30 -1000
Message-Id: <20230216025739.1211680-22-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Add opcodes for backend support for 128-bit memory operations.
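
For illustration only (not part of the patch): in the tcg-op.c hunks of
this patch, a byte-swapped 128-bit access on a host without
TCG_TARGET_HAS_MEMORY_BSWAP is emitted as a plain access with the two
64-bit halves exchanged and each half byte-swapped. A minimal standalone
sketch of that identity follows. The names U128Pair, bswap64_sketch and
bswap128_sketch are hypothetical and do not appear in QEMU.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128Pair;

/* Portable 64-bit byte reversal: swap bytes, then 16-bit, then 32-bit units. */
static uint64_t bswap64_sketch(uint64_t x)
{
    x = ((x & 0x00ff00ff00ff00ffULL) << 8) |
        ((x >> 8) & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) |
        ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/*
 * Reversing all 16 bytes equals exchanging the halves and reversing
 * the bytes within each half.
 */
static U128Pair bswap128_sketch(U128Pair v)
{
    U128Pair r;

    r.lo = bswap64_sketch(v.hi);
    r.hi = bswap64_sketch(v.lo);
    return r;
}
```

In the load path the patch gets the exchange for free by passing the
destination halves in the opposite order, and only the two 64-bit swaps
remain as explicit ops.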

Reviewed-by: Philippe Mathieu-Daudé
Signed-off-by: Richard Henderson
---
 docs/devel/tcg-ops.rst       | 11 +++---
 include/tcg/tcg-opc.h        |  8 +++++
 tcg/aarch64/tcg-target.h     |  2 ++
 tcg/arm/tcg-target.h         |  2 ++
 tcg/i386/tcg-target.h        |  2 ++
 tcg/loongarch64/tcg-target.h |  1 +
 tcg/mips/tcg-target.h        |  2 ++
 tcg/ppc/tcg-target.h         |  2 ++
 tcg/riscv/tcg-target.h       |  2 ++
 tcg/s390x/tcg-target.h       |  2 ++
 tcg/sparc64/tcg-target.h     |  2 ++
 tcg/tci/tcg-target.h         |  2 ++
 tcg/tcg-op.c                 | 67 ++++++++++++++++++++++++++++++++----
 tcg/tcg.c                    |  4 +++
 14 files changed, 99 insertions(+), 10 deletions(-)

diff --git a/docs/devel/tcg-ops.rst b/docs/devel/tcg-ops.rst
index 9adc0c9b6c..dd14dbe3fa 100644
--- a/docs/devel/tcg-ops.rst
+++ b/docs/devel/tcg-ops.rst
@@ -629,19 +629,20 @@ QEMU specific operations
        | This operation is optional. If the TCG backend does not implement the
          goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).
 
-   * - qemu_ld_i32/i64 *t0*, *t1*, *flags*, *memidx*
+   * - qemu_ld_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*
 
-       qemu_st_i32/i64 *t0*, *t1*, *flags*, *memidx*
+       qemu_st_i32/i64/i128 *t0*, *t1*, *flags*, *memidx*
 
        qemu_st8_i32 *t0*, *t1*, *flags*, *memidx*
 
      - | Load data at the guest address *t1* into *t0*, or store data in *t0* at guest
-         address *t1*. The _i32/_i64 size applies to the size of the input/output
+         address *t1*. The _i32/_i64/_i128 size applies to the size of the input/output
          register *t0* only. The address *t1* is always sized according to the guest,
          and the width of the memory operation is controlled by *flags*.
        |
        | Both *t0* and *t1* may be split into little-endian ordered pairs of registers
-         if dealing with 64-bit quantities on a 32-bit host.
+         if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities on
+         a 64-bit host.
        |
        | The *memidx* selects the qemu tlb index to use (e.g. user or kernel access).
          The flags are the MemOp bits, selecting the sign, width, and endianness
@@ -650,6 +651,8 @@ QEMU specific operations
        | For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
          64-bit memory access specified in *flags*.
        |
+       | For qemu_ld/st_i128, these are only supported for a 64-bit host.
+       |
        | For i386, qemu_st8_i32 is exactly like qemu_st_i32, except the size of
          the memory operation is known to be 8-bit. This allows the backend to
          provide a different set of register constraints.
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index dd444734d9..94cf7c5d6a 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -213,6 +213,14 @@ DEF(qemu_st8_i32, 0, TLADDR_ARGS + 1, 1,
     TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS |
     IMPL(TCG_TARGET_HAS_qemu_st8_i32))
 
+/* Only for 64-bit hosts at the moment. */
+DEF(qemu_ld_i128, 2, 1, 1,
+    TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS | TCG_OPF_64BIT |
+    IMPL(TCG_TARGET_HAS_qemu_ldst_i128))
+DEF(qemu_st_i128, 0, 3, 1,
+    TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS | TCG_OPF_64BIT |
+    IMPL(TCG_TARGET_HAS_qemu_ldst_i128))
+
 /* Host vector support. */
 
 #define IMPLVEC TCG_OPF_VECTOR | IMPL(TCG_TARGET_MAYBE_vec)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 3c0b0d312d..60ed1f3042 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -129,6 +129,8 @@ extern bool have_lse2;
 #define TCG_TARGET_HAS_muluh_i64 1
 #define TCG_TARGET_HAS_mulsh_i64 1
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_HAS_v64 1
 #define TCG_TARGET_HAS_v128 1
 #define TCG_TARGET_HAS_v256 0
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index def2a189e6..c8d1e32a27 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -125,6 +125,8 @@ extern bool use_neon_instructions;
 #define TCG_TARGET_HAS_rem_i32 0
 #define TCG_TARGET_HAS_qemu_st8_i32 0
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_HAS_v64 use_neon_instructions
 #define TCG_TARGET_HAS_v128 use_neon_instructions
 #define TCG_TARGET_HAS_v256 0
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 0421776cb8..6d8a536a32 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -194,6 +194,8 @@ extern bool have_atomic16;
 #define TCG_TARGET_HAS_qemu_st8_i32 1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 /* We do not support older SSE systems, only beginning with AVX1. */
 #define TCG_TARGET_HAS_v64 have_avx1
 #define TCG_TARGET_HAS_v128 have_avx1
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 17b8193aa5..53ff05a75b 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -168,6 +168,7 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i64 0
 #define TCG_TARGET_HAS_muluh_i64 1
 #define TCG_TARGET_HAS_mulsh_i64 1
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
 
 #define TCG_TARGET_DEFAULT_MO (0)
 
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index 68b11e4d48..bade6a50ee 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -203,6 +203,8 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_ext16u_i64 0 /* andi rt, rs, 0xffff */
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
 
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index af81c5a57f..8d6939ee82 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -149,6 +149,8 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_mulsh_i64 1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 /*
  * While technically Altivec could support V64, it has no 64-bit store
  * instruction and substituting two 32-bit stores makes the generated
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 0deb33701f..3949646f4b 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -167,6 +167,8 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i64 1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_DEFAULT_MO (0)
 
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index a05b473117..a0546d77b4 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -140,6 +140,8 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_muluh_i64 0
 #define TCG_TARGET_HAS_mulsh_i64 0
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_HAS_v64 HAVE_FACILITY(VECTOR)
#define TCG_TARGET_HAS_v128 HAVE_FACILITY(VECTOR) #define TCG_TARGET_HAS_v256 0 diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h index ffe22b1d21..32733949bf 100644 --- a/tcg/sparc64/tcg-target.h +++ b/tcg/sparc64/tcg-target.h @@ -151,6 +151,8 @@ extern bool use_vis3_instructions; #define TCG_TARGET_HAS_muluh_i64 use_vis3_instructions #define TCG_TARGET_HAS_mulsh_i64 0 +#define TCG_TARGET_HAS_qemu_ldst_i128 0 + #define TCG_AREG0 TCG_REG_I0 #define TCG_TARGET_DEFAULT_MO (0) diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h index 7140a76a73..8cf6b87040 100644 --- a/tcg/tci/tcg-target.h +++ b/tcg/tci/tcg-target.h @@ -127,6 +127,8 @@ #define TCG_TARGET_HAS_mulu2_i32 1 #endif /* TCG_TARGET_REG_BITS == 64 */ +#define TCG_TARGET_HAS_qemu_ldst_i128 0 + /* Number of registers available. */ #define TCG_TARGET_NB_REGS 16 diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 93ac864aed..1784873314 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -3203,7 +3203,7 @@ static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig) void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOpIdx oi = make_memop_idx(memop, idx); + const MemOpIdx oi = make_memop_idx(memop, idx); tcg_debug_assert((memop & MO_SIZE) == MO_128); tcg_debug_assert((memop & MO_SIGN) == 0); @@ -3211,9 +3211,35 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); addr = plugin_prep_mem_callbacks(addr); - /* TODO: allow the tcg backend to see the whole operation. */ + /* TODO: For now, force 32-bit hosts to use the helper. */ + if (TCG_TARGET_HAS_qemu_ldst_i128 && TCG_TARGET_REG_BITS == 64) { + TCGv_i64 lo, hi; + TCGArg addr_arg; + MemOpIdx adj_oi; - if (use_two_i64_for_i128(memop)) { + /* TODO: Make TCG_TARGET_HAS_MEMORY_BSWAP fine grained. 
*/ + if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) { + lo = TCGV128_HIGH(val); + hi = TCGV128_LOW(val); + adj_oi = make_memop_idx(memop & ~MO_BSWAP, idx); + } else { + lo = TCGV128_LOW(val); + hi = TCGV128_HIGH(val); + adj_oi = oi; + } + +#if TARGET_LONG_BITS == 32 + addr_arg = tcgv_i32_arg(addr); +#else + addr_arg = tcgv_i64_arg(addr); +#endif + tcg_gen_op4ii_i64(INDEX_op_qemu_ld_i128, lo, hi, addr_arg, adj_oi); + + if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) { + tcg_gen_bswap64_i64(lo, lo); + tcg_gen_bswap64_i64(hi, hi); + } + } else if (use_two_i64_for_i128(memop)) { MemOp mop[2]; TCGv addr_p8; TCGv_i64 x, y; @@ -3256,7 +3282,7 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOpIdx oi = make_memop_idx(memop, idx); + const MemOpIdx oi = make_memop_idx(memop, idx); tcg_debug_assert((memop & MO_SIZE) == MO_128); tcg_debug_assert((memop & MO_SIGN) == 0); @@ -3264,9 +3290,38 @@ void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) tcg_gen_req_mo(TCG_MO_ST_LD | TCG_MO_ST_ST); addr = plugin_prep_mem_callbacks(addr); - /* TODO: allow the tcg backend to see the whole operation. */ + /* TODO: For now, force 32-bit hosts to use the helper. */ - if (use_two_i64_for_i128(memop)) { + if (TCG_TARGET_HAS_qemu_ldst_i128 && TCG_TARGET_REG_BITS == 64) { + TCGv_i64 lo, hi; + TCGArg addr_arg; + MemOpIdx adj_oi; + + /* TODO: Make TCG_TARGET_HAS_MEMORY_BSWAP fine grained. 
*/ + if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) { + lo = tcg_temp_new_i64(); + hi = tcg_temp_new_i64(); + tcg_gen_bswap64_i64(lo, TCGV128_HIGH(val)); + tcg_gen_bswap64_i64(hi, TCGV128_LOW(val)); + adj_oi = make_memop_idx(memop & ~MO_BSWAP, idx); + } else { + lo = TCGV128_LOW(val); + hi = TCGV128_HIGH(val); + adj_oi = oi; + } + +#if TARGET_LONG_BITS == 32 + addr_arg = tcgv_i32_arg(addr); +#else + addr_arg = tcgv_i64_arg(addr); +#endif + tcg_gen_op4ii_i64(INDEX_op_qemu_st_i128, lo, hi, addr_arg, adj_oi); + + if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) { + tcg_temp_free_i64(lo); + tcg_temp_free_i64(hi); + } + } else if (use_two_i64_for_i128(memop)) { MemOp mop[2]; TCGv addr_p8; TCGv_i64 x, y; diff --git a/tcg/tcg.c b/tcg/tcg.c index 07522d50ee..a82fac66e3 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1538,6 +1538,10 @@ bool tcg_op_supported(TCGOpcode op) case INDEX_op_qemu_st8_i32: return TCG_TARGET_HAS_qemu_st8_i32; + case INDEX_op_qemu_ld_i128: + case INDEX_op_qemu_st_i128: + return TCG_TARGET_HAS_qemu_ldst_i128; + case INDEX_op_mov_i32: case INDEX_op_setcond_i32: case INDEX_op_brcond_i32: From patchwork Thu Feb 16 02:57:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743236 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=dLDyBikn; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) 
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 22/30] tcg/i386: Introduce tcg_out_mov2
Date: Wed, 15 Feb 2023 16:57:31 -1000
Message-Id: <20230216025739.1211680-23-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Sender:
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Create a helper for data movement minding register overlap. Use the more general xchg instruction, which consumes one extra byte, but simplifies the more general function. Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 977650263b..5547f47a26 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -461,6 +461,7 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct) #define OPC_VPTERNLOGQ (0x25 | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX) #define OPC_VZEROUPPER (0x77 | P_EXT) #define OPC_XCHG_ax_r32 (0x90) +#define OPC_XCHG_EvGv (0x87) #define OPC_GRP3_Eb (0xf6) #define OPC_GRP3_Ev (0xf7) @@ -1880,6 +1881,24 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, } } +/* Move src1 to dst1 and src2 to dst2, minding possible overlap. */ +static void tcg_out_mov2(TCGContext *s, + TCGType type1, TCGReg dst1, TCGReg src1, + TCGType type2, TCGReg dst2, TCGReg src2) +{ + if (dst1 != src2) { + tcg_out_mov(s, type1, dst1, src1); + tcg_out_mov(s, type2, dst2, src2); + } else if (dst2 != src1) { + tcg_out_mov(s, type2, dst2, src2); + tcg_out_mov(s, type1, dst1, src1); + } else { + /* dst1 == src2 && dst2 == src1 -> xchg. */ + int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 
0 : P_REXW); + tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2); + } +} + /* * Generate code for the slow path for a load at the end of block */ @@ -1947,13 +1966,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) case MO_UQ: if (TCG_TARGET_REG_BITS == 64) { tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX); - } else if (data_reg == TCG_REG_EDX) { - /* xchg %edx, %eax */ - tcg_out_opc(s, OPC_XCHG_ax_r32 + TCG_REG_EDX, 0, 0, 0); - tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EAX); } else { - tcg_out_mov(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX); - tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX); + tcg_out_mov2(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX, + TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX); } break; default: From patchwork Thu Feb 16 02:57:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743247 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=AYiAC7SE; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKTH0fdbz23h0 for ; Thu, 16 Feb 2023 14:01:07 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTR-0003ZB-SU; Wed, 15 Feb 2023 21:58:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by 
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé
Subject: [PATCH v2 23/30] tcg/i386: Introduce tcg_out_testi
Date: Wed, 15 Feb 2023 16:57:32 -1000
Message-Id: <20230216025739.1211680-24-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>
MIME-Version: 1.0

Split out a helper for choosing testb vs testl.
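
The selection rule the helper encapsulates can be modelled on its own. The following is a minimal sketch, not the backend code: it emits nothing and only reproduces the decision between the byte and long forms. The names `TestWidth`, `choose_test_width` and `imm_bytes` are hypothetical and exist only for this illustration; register numbers follow the x86 hardware encoding (0..3 are %eax, %ecx, %edx, %ebx).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two encodings the helper can pick. */
typedef enum { TEST_BYTE, TEST_LONG } TestWidth;

/*
 * Model of the selection rule only; no instruction bytes are produced.
 *   reg_bits: host register width, 32 or 64.
 *   reg:      hardware register number.
 *   imm:      immediate mask to test against.
 * On a 32-bit host only registers 0..3 have a low-byte form, so
 * %esi/%edi (6 and 7) must fall back to the 32-bit test.
 */
static TestWidth choose_test_width(int reg_bits, unsigned reg, uint32_t imm)
{
    bool fits_in_byte = imm <= 0xff;
    bool has_byte_reg = reg_bits == 64 || reg < 4;

    return (fits_in_byte && has_byte_reg) ? TEST_BYTE : TEST_LONG;
}

/* Size of the immediate that follows the opcode: 1 byte or 4 bytes. */
static unsigned imm_bytes(TestWidth w)
{
    return w == TEST_BYTE ? 1 : 4;
}

/* Small self-check of the cases named in the patch comment. */
static void check_test_width(void)
{
    assert(choose_test_width(64, 6, 0x0f) == TEST_BYTE); /* %rsi, REX form */
    assert(choose_test_width(32, 6, 0x0f) == TEST_LONG); /* %esi on i686 */
    assert(imm_bytes(TEST_BYTE) == 1);
}
```

Since alignment masks are small, the byte form is the common case, which is why it saves three immediate bytes per check.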
Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 5547f47a26..a75fe91e86 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1729,6 +1729,23 @@ static void tcg_out_nopn(TCGContext *s, int n) tcg_out8(s, 0x90); } +/* Test register R vs immediate bits I, setting Z flag for EQ/NE. */ +static void __attribute__((unused)) +tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i) +{ + /* + * This is used for testing alignment, so we can usually use testb. + * For i686, we have to use testl for %esi/%edi. + */ + if (i <= 0xff && (TCG_TARGET_REG_BITS == 64 || r < 4)) { + tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, r); + tcg_out8(s, i); + } else { + tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, r); + tcg_out32(s, i); + } +} + #if defined(CONFIG_SOFTMMU) /* * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, @@ -2056,18 +2073,7 @@ static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo, unsigned a_mask = (1 << a_bits) - 1; TCGLabelQemuLdst *label; - /* - * We are expecting a_bits to max out at 7, so we can usually use testb. - * For i686, we have to use testl for %esi/%edi. 
- */ - if (a_mask <= 0xff && (TCG_TARGET_REG_BITS == 64 || addrlo < 4)) { - tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, addrlo); - tcg_out8(s, a_mask); - } else { - tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, addrlo); - tcg_out32(s, a_mask); - } - + tcg_out_testi(s, addrlo, a_mask); /* jne slow_path */ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); From patchwork Thu Feb 16 02:57:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743234 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=XD+fuRn/; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKQt4fMZz23yD for ; Thu, 16 Feb 2023 13:59:02 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTT-0003fi-T0; Wed, 15 Feb 2023 21:58:19 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTS-0003cD-Dc for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:18 -0500 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTP-0005o1-QT for qemu-devel@nongnu.org; Wed, 15 Feb 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 24/30] tcg/i386: Use full load/store helpers in user-only mode
Date: Wed, 15 Feb 2023 16:57:33 -1000
Message-Id: <20230216025739.1211680-25-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>
MIME-Version: 1.0

Instead of using helper_unaligned_{ld,st}, use the full load/store helpers. This will allow the fast path to increase alignment to implement atomicity while not immediately raising an alignment exception.
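
As a rough model of the user-only flow in this patch: the alignment test is emitted only when `a_bits` is non-zero, and a failing test branches to the slow path, which now calls the full helper instead of unconditionally raising. The sketch below assumes nothing beyond that. `AccessPath`, `align_mask` and `classify_access` are hypothetical names for illustration, and the sketch classifies an address rather than performing any access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the two outcomes of the inline check. */
typedef enum { PATH_FAST, PATH_SLOW_HELPER } AccessPath;

/* Low-bit mask for an alignment of 2^a_bits bytes, as computed in the patch. */
static unsigned align_mask(unsigned a_bits)
{
    return (1u << a_bits) - 1;
}

/*
 * Decide which path a guest address would take.
 * With a_bits == 0 no test is emitted, so the access is always inline.
 * Otherwise any set bit under the mask sends the access to the full
 * load/store helper, which then decides how to complete it.
 */
static AccessPath classify_access(uint64_t addr, unsigned a_bits)
{
    if (a_bits == 0) {
        return PATH_FAST;
    }
    return (addr & align_mask(a_bits)) ? PATH_SLOW_HELPER : PATH_FAST;
}

/* Small self-check: a 16-byte alignment request (a_bits == 4). */
static void check_classify_access(void)
{
    assert(classify_access(0x1000, 4) == PATH_FAST);
    assert(classify_access(0x1008, 4) == PATH_SLOW_HELPER);
}
```

The point of the change is visible in the second case: an 8-byte-aligned address that fails a 16-byte check need not be an error, so routing it to a helper that can handle it is more flexible than a dedicated fault stub.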
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 332 ++++++++++++++++---------------------- 1 file changed, 142 insertions(+), 190 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index a75fe91e86..cad1775133 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1746,7 +1746,6 @@ tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i) } } -#if defined(CONFIG_SOFTMMU) /* * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, * int mmu_idx, uintptr_t ra) @@ -1769,108 +1768,6 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = { [MO_UQ] = helper_stq_mmu, }; -/* Perform the TLB load and compare. - - Inputs: - ADDRLO and ADDRHI contain the low and high part of the address. - - MEM_INDEX and S_BITS are the memory context and log2 size of the load. - - WHICH is the offset into the CPUTLBEntry structure of the slot to read. - This should be offsetof addr_read or addr_write. - - Outputs: - LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses) - positions of the displacements of forward jumps to the TLB miss case. - - Second argument register is loaded with the low part of the address. - In the TLB hit case, it has been adjusted as indicated by the TLB - and so is a host address. In the TLB miss case, it continues to - hold a guest address. - - First argument register is clobbered. 
*/ - -static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, - int mem_index, MemOp opc, - tcg_insn_unit **label_ptr, int which) -{ - const TCGReg r0 = TCG_REG_L0; - const TCGReg r1 = TCG_REG_L1; - TCGType ttype = TCG_TYPE_I32; - TCGType tlbtype = TCG_TYPE_I32; - int trexw = 0, hrexw = 0, tlbrexw = 0; - unsigned a_bits = get_alignment_bits(opc); - unsigned s_bits = opc & MO_SIZE; - unsigned a_mask = (1 << a_bits) - 1; - unsigned s_mask = (1 << s_bits) - 1; - target_ulong tlb_mask; - - if (TCG_TARGET_REG_BITS == 64) { - if (TARGET_LONG_BITS == 64) { - ttype = TCG_TYPE_I64; - trexw = P_REXW; - } - if (TCG_TYPE_PTR == TCG_TYPE_I64) { - hrexw = P_REXW; - if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) { - tlbtype = TCG_TYPE_I64; - tlbrexw = P_REXW; - } - } - } - - tcg_out_mov(s, tlbtype, r0, addrlo); - tcg_out_shifti(s, SHIFT_SHR + tlbrexw, r0, - TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS); - - tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, r0, TCG_AREG0, - TLB_MASK_TABLE_OFS(mem_index) + - offsetof(CPUTLBDescFast, mask)); - - tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r0, TCG_AREG0, - TLB_MASK_TABLE_OFS(mem_index) + - offsetof(CPUTLBDescFast, table)); - - /* If the required alignment is at least as large as the access, simply - copy the address and mask. For lesser alignments, check that we don't - cross pages for the complete access. */ - if (a_bits >= s_bits) { - tcg_out_mov(s, ttype, r1, addrlo); - } else { - tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask - a_mask); - } - tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask; - tgen_arithi(s, ARITH_AND + trexw, r1, tlb_mask, 0); - - /* cmp 0(r0), r1 */ - tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, which); - - /* Prepare for both the fast path add of the tlb addend, and the slow - path function argument setup. 
*/ - tcg_out_mov(s, ttype, r1, addrlo); - - /* jne slow_path */ - tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); - label_ptr[0] = s->code_ptr; - s->code_ptr += 4; - - if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { - /* cmp 4(r0), addrhi */ - tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, which + 4); - - /* jne slow_path */ - tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); - label_ptr[1] = s->code_ptr; - s->code_ptr += 4; - } - - /* TLB Hit. */ - - /* add addend(r0), r1 */ - tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r1, r0, - offsetof(CPUTLBEntry, addend)); -} - /* * Record the context of a call to the out of line helper code for the slow path * for a load or store, so that we can later generate the correct helper code @@ -1893,9 +1790,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, label->addrhi_reg = addrhi; label->raddr = tcg_splitwx_to_rx(raddr); label->label_ptr[0] = label_ptr[0]; - if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { - label->label_ptr[1] = label_ptr[1]; - } + label->label_ptr[1] = label_ptr[1]; } /* Move src1 to dst1 and src2 to dst2, minding possible overlap. */ @@ -1929,7 +1824,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) /* resolve label address */ tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4); - if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { + if (label_ptr[1]) { tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4); } @@ -1952,8 +1847,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs); } else { + tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1], + l->addrlo_reg); tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); - /* The second argument is already loaded with addrlo. 
*/ tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi); tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3], (uintptr_t)l->raddr); @@ -2010,7 +1906,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) /* resolve label address */ tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4); - if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { + if (label_ptr[1]) { tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4); } @@ -2043,10 +1939,11 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr); tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs); } else { + tcg_out_mov2(s, TCG_TYPE_TL, + tcg_target_call_iarg_regs[1], l->addrlo_reg, + s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32, + tcg_target_call_iarg_regs[2], l->datalo_reg); tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); - /* The second argument is already loaded with addrlo. */ - tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32), - tcg_target_call_iarg_regs[2], l->datalo_reg); tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi); if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) { @@ -2065,72 +1962,129 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) tcg_out_jmp(s, qemu_st_helpers[opc & MO_SIZE]); return true; } + +#if defined(CONFIG_SOFTMMU) +/* + * Perform the TLB load and compare. + * + * Inputs: + * ADDRLO and ADDRHI contain the low and high part of the address. + * + * MEM_INDEX and S_BITS are the memory context and log2 size of the load. + * + * WHICH is the offset into the CPUTLBEntry structure of the slot to read. + * This should be offsetof addr_read or addr_write. + * + * Outputs: + * LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses) + * positions of the displacements of forward jumps to the TLB miss case. + * + * Second argument register is loaded with the low part of the address. 
+ * In the TLB hit case, it has been adjusted as indicated by the TLB + * and so is a host address. In the TLB miss case, it continues to + * hold a guest address. + * + * First argument register is clobbered. + */ +static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, + int mem_index, MemOp opc, + tcg_insn_unit **label_ptr, int which) +{ + const TCGReg r0 = TCG_REG_L0; + const TCGReg r1 = TCG_REG_L1; + TCGType ttype = TCG_TYPE_I32; + TCGType tlbtype = TCG_TYPE_I32; + int trexw = 0, hrexw = 0, tlbrexw = 0; + unsigned a_bits = get_alignment_bits(opc); + unsigned s_bits = opc & MO_SIZE; + unsigned a_mask = (1 << a_bits) - 1; + unsigned s_mask = (1 << s_bits) - 1; + target_ulong tlb_mask; + + if (TCG_TARGET_REG_BITS == 64) { + if (TARGET_LONG_BITS == 64) { + ttype = TCG_TYPE_I64; + trexw = P_REXW; + } + if (TCG_TYPE_PTR == TCG_TYPE_I64) { + hrexw = P_REXW; + if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) { + tlbtype = TCG_TYPE_I64; + tlbrexw = P_REXW; + } + } + } + + tcg_out_mov(s, tlbtype, r0, addrlo); + tcg_out_shifti(s, SHIFT_SHR + tlbrexw, r0, + TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS); + + tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, r0, TCG_AREG0, + TLB_MASK_TABLE_OFS(mem_index) + + offsetof(CPUTLBDescFast, mask)); + + tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r0, TCG_AREG0, + TLB_MASK_TABLE_OFS(mem_index) + + offsetof(CPUTLBDescFast, table)); + + /* + * If the required alignment is at least as large as the access, simply + * copy the address and mask. For lesser alignments, check that we don't + * cross pages for the complete access. 
+ */ + if (a_bits >= s_bits) { + tcg_out_mov(s, ttype, r1, addrlo); + } else { + tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask - a_mask); + } + tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask; + tgen_arithi(s, ARITH_AND + trexw, r1, tlb_mask, 0); + + /* cmp 0(r0), r1 */ + tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, which); + + /* + * Prepare for both the fast path add of the tlb addend, and the slow + * path function argument setup. + */ + tcg_out_mov(s, ttype, r1, addrlo); + + /* jne slow_path */ + tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); + label_ptr[0] = s->code_ptr; + s->code_ptr += 4; + + if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { + /* cmp 4(r0), addrhi */ + tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, which + 4); + + /* jne slow_path */ + tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); + label_ptr[1] = s->code_ptr; + s->code_ptr += 4; + } + + /* TLB Hit. */ + + /* add addend(r0), r1 */ + tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r1, r0, + offsetof(CPUTLBEntry, addend)); +} + #else -static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo, - TCGReg addrhi, unsigned a_bits) +static void tcg_out_test_alignment(TCGContext *s, TCGReg addrlo, + unsigned a_bits, tcg_insn_unit **label_ptr) { unsigned a_mask = (1 << a_bits) - 1; - TCGLabelQemuLdst *label; tcg_out_testi(s, addrlo, a_mask); /* jne slow_path */ tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0); - - label = new_ldst_label(s); - label->is_ld = is_ld; - label->addrlo_reg = addrlo; - label->addrhi_reg = addrhi; - label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4); - label->label_ptr[0] = s->code_ptr; - + *label_ptr = s->code_ptr; s->code_ptr += 4; } -static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l) -{ - /* resolve label address */ - tcg_patch32(l->label_ptr[0], s->code_ptr - l->label_ptr[0] - 4); - - if (TCG_TARGET_REG_BITS == 32) { - int ofs = 0; - - tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs); - ofs += 4; - - 
tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs); - ofs += 4; - if (TARGET_LONG_BITS == 64) { - tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs); - ofs += 4; - } - - tcg_out_pushi(s, (uintptr_t)l->raddr); - } else { - tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1], - l->addrlo_reg); - tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); - - tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RAX, (uintptr_t)l->raddr); - tcg_out_push(s, TCG_REG_RAX); - } - - /* "Tail call" to the helper, with the return address back inline. */ - tcg_out_jmp(s, (const void *)(l->is_ld ? helper_unaligned_ld - : helper_unaligned_st)); - return true; -} - -static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) -{ - return tcg_out_fail_alignment(s, l); -} - -static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) -{ - return tcg_out_fail_alignment(s, l); -} - #if TCG_TARGET_REG_BITS == 32 # define x86_guest_base_seg 0 # define x86_guest_base_index -1 @@ -2165,7 +2119,7 @@ static inline int setup_guest_base_seg(void) return 0; } # endif -#endif +#endif /* TCG_TARGET_REG_BITS == 32 */ #endif /* SOFTMMU */ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, @@ -2272,10 +2226,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) TCGReg addrhi __attribute__((unused)); MemOpIdx oi; MemOp opc; -#if defined(CONFIG_SOFTMMU) - int mem_index; - tcg_insn_unit *label_ptr[2]; -#else + tcg_insn_unit *label_ptr[2] = { }; +#ifndef CONFIG_SOFTMMU unsigned a_bits; #endif @@ -2287,26 +2239,27 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) opc = get_memop(oi); #if defined(CONFIG_SOFTMMU) - mem_index = get_mmuidx(oi); - - tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc, + tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc, label_ptr, offsetof(CPUTLBEntry, addr_read)); /* TLB Hit. 
*/ tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc); /* Record the current context of a load into ldst label */ - add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, addrlo, addrhi, - s->code_ptr, label_ptr); + add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, + TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else a_bits = get_alignment_bits(opc); if (a_bits) { - tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits); + tcg_out_test_alignment(s, addrlo, a_bits, label_ptr); } - tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index, x86_guest_base_offset, x86_guest_base_seg, is64, opc); + if (a_bits) { + add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, + addrlo, addrhi, s->code_ptr, label_ptr); + } #endif } @@ -2368,10 +2321,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) TCGReg addrhi __attribute__((unused)); MemOpIdx oi; MemOp opc; -#if defined(CONFIG_SOFTMMU) - int mem_index; - tcg_insn_unit *label_ptr[2]; -#else + tcg_insn_unit *label_ptr[2] = { }; +#ifndef CONFIG_SOFTMMU unsigned a_bits; #endif @@ -2383,25 +2334,26 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) opc = get_memop(oi); #if defined(CONFIG_SOFTMMU) - mem_index = get_mmuidx(oi); - - tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc, + tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc, label_ptr, offsetof(CPUTLBEntry, addr_write)); /* TLB Hit. 
*/ tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc); /* Record the current context of a store into ldst label */ - add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, addrlo, addrhi, - s->code_ptr, label_ptr); + add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, + TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else a_bits = get_alignment_bits(opc); if (a_bits) { - tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits); + tcg_out_test_alignment(s, addrlo, a_bits, label_ptr); } - tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index, x86_guest_base_offset, x86_guest_base_seg, opc); + if (a_bits) { + add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, + addrlo, addrhi, s->code_ptr, label_ptr); + } #endif } From patchwork Thu Feb 16 02:57:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743250 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=F02Xy/SN; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKTW6cTCz23h0 for ; Thu, 16 Feb 2023 14:01:19 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTV-0003jj-I0; Wed, 15 Feb 2023 21:58:21 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by 
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTT-0003eU-3A for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:19 -0500 Received: from mail-pf1-x42a.google.com ([2607:f8b0:4864:20::42a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTR-0005oJ-1n for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:18 -0500 Received: by mail-pf1-x42a.google.com with SMTP id x13so355916pfu.7 for ; Wed, 15 Feb 2023 18:58:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=li5lluLj9lzXTBlzleuIeRSOMn7SYzIz3nRkH1+ZWYA=; b=F02Xy/SNhesgv4JhzigHi0GH54aez9P+/tCTywf9F8m/WOFrQnYrTWgZYdkXUgeFTv rQrYem9MpMMEWqDK8b04nVuxLRNukRyhSwKWnfBD8pZe2xhC7mlOvbzmXLolpRbg3Nit ZDwPGKIxtiy24s2qv+JhH4FwDbu705YPeLxQf18+MwUNwLw03f7ZHZ6C+65Vbs0KqPsO VlxCN+RW1P2FLfqm6tEUzWkY/Qidv8QAQ7DusWA4yCLL1hNXIFhJ8xtC67UtR5W+Xx+L aYqG/6fiiV2266NF/8VMGC7xUuMTtS6a4wlFs3V3eIrPZSGKi4wwyE/liz6n8sYWZgnq xOpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=li5lluLj9lzXTBlzleuIeRSOMn7SYzIz3nRkH1+ZWYA=; b=5Jbp1HiI2Pb3AXzipkzzHW+T2M8dNncptUznc/lZQW/Zwav6gcAFazh8l8Ifb87g2r CcO7RxhwRImTI1xyATAs2loXoybl70+DUws8HtEvTGGm1zSrmW9aO4ZSKRE1Ba3mbC8A SnzZBkVxdTgJGyuWEKb+lWKNC1GSOysUsewS5pJe7oRcyargUXsauowdFMwDCVvBVlLd r4bOWXTVRYR0XE6lUWDagCzfnTc9PkL50Ge8zivZrU6fCyhvVwJqauClsjHzHiAAs5+q 8s0L1vRWa2by0CM0zhj0G9TNsaET+GLopzmP/GEmPK2ox6PZ5LXJi8DRErC32/8GW+sM 6/pg== X-Gm-Message-State: AO0yUKUQb8C5B2mU7hpu2N3Yfx1GmNCsQTPq8/BfcQK4OQJqB5NWx+jx j683hD3aKMfond32YVSvkBU+EBndY3yyqpYjBiw= X-Google-Smtp-Source: 
AK7set/QCKwMa6INYseZbC+Q2g2S3JpN4NNNWeCpFrdMCwpbH2d6b29gS3i9aXgFdT8jF55YJtFdGA== X-Received: by 2002:aa7:9e5d:0:b0:5a8:aae8:1160 with SMTP id z29-20020aa79e5d000000b005a8aae81160mr3475025pfq.20.1676516295699; Wed, 15 Feb 2023 18:58:15 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.58.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:58:15 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Cc: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= Subject: [PATCH v2 25/30] tcg/i386: Replace is64 with type in qemu_ld/st routines Date: Wed, 15 Feb 2023 16:57:34 -1000 Message-Id: <20230216025739.1211680-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42a; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Prepare for TCG_TYPE_I128 by not using a boolean. 
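For illustration only, a minimal standalone sketch of why a type enum scales where a boolean does not. The names DemoType, DEMO_REXW, rexw_from_bool, rexw_from_type and demo_data_regs are hypothetical stand-ins invented for this note, not QEMU definitions; the real patch uses TCGType and P_REXW, and asserts on unexpected types rather than returning -1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TCGType and P_REXW; not the QEMU definitions. */
typedef enum { DEMO_TYPE_I32, DEMO_TYPE_I64, DEMO_TYPE_I128 } DemoType;
enum { DEMO_REXW = 0x1000 };

/* Old style: a boolean can only tell two operand widths apart. */
static int rexw_from_bool(bool is64)
{
    return is64 * DEMO_REXW;
}

/* New style: anything that is not 32-bit gets the REX.W prefix flag. */
static int rexw_from_type(DemoType type)
{
    return type == DEMO_TYPE_I32 ? 0 : DEMO_REXW;
}

/*
 * Number of host registers holding the data operand.  A switch leaves
 * room for a later I128 case; until then unknown types are rejected
 * (this sketch returns -1 where the patch uses g_assert_not_reached).
 */
static int demo_data_regs(DemoType type, int host_bits)
{
    assert(host_bits == 32 || host_bits == 64);
    switch (type) {
    case DEMO_TYPE_I32:
        return 1;
    case DEMO_TYPE_I64:
        return host_bits == 32 ? 2 : 1;
    default:
        return -1;
    }
}
```

With the enum in place, a later patch can add a 128-bit case to the switch without touching any caller's signature again.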
Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 54 ++++++++++++++++++++++++++------------- 1 file changed, 36 insertions(+), 18 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index cad1775133..5dcea7e198 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1772,7 +1772,7 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = { * Record the context of a call to the out of line helper code for the slow path * for a load or store, so that we can later generate the correct helper code */ -static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, +static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGType type, MemOpIdx oi, TCGReg datalo, TCGReg datahi, TCGReg addrlo, TCGReg addrhi, @@ -1783,7 +1783,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, label->is_ld = is_ld; label->oi = oi; - label->type = is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32; + label->type = type; label->datalo_reg = datalo; label->datahi_reg = datahi; label->addrlo_reg = addrlo; @@ -2124,10 +2124,10 @@ static inline int setup_guest_base_seg(void) static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, TCGReg base, int index, intptr_t ofs, - int seg, bool is64, MemOp memop) + int seg, TCGType type, MemOp memop) { bool use_movbe = false; - int rexw = is64 * P_REXW; + int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW); int movop = OPC_MOVL_GvEv; /* Do big-endian loads with movbe. */ @@ -2220,7 +2220,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and EAX. It will be useful once fixed registers globals are less common. 
*/ -static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) +static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type) { TCGReg datalo, datahi, addrlo; TCGReg addrhi __attribute__((unused)); @@ -2232,7 +2232,16 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) #endif datalo = *args++; - datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0); + switch (type) { + case TCG_TYPE_I32: + datahi = 0; + break; + case TCG_TYPE_I64: + datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0); + break; + default: + g_assert_not_reached(); + } addrlo = *args++; addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0); oi = *args++; @@ -2243,10 +2252,10 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) label_ptr, offsetof(CPUTLBEntry, addr_read)); /* TLB Hit. */ - tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc); + tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type, opc); /* Record the current context of a load into ldst label */ - add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, + add_qemu_ldst_label(s, true, type, oi, datalo, datahi, TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else a_bits = get_alignment_bits(opc); @@ -2255,9 +2264,9 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64) } tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index, x86_guest_base_offset, x86_guest_base_seg, - is64, opc); + type, opc); if (a_bits) { - add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, + add_qemu_ldst_label(s, true, type, oi, datalo, datahi, addrlo, addrhi, s->code_ptr, label_ptr); } #endif @@ -2315,7 +2324,7 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, } } -static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) +static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type) { TCGReg datalo, datahi, addrlo; TCGReg addrhi 
__attribute__((unused)); @@ -2327,7 +2336,16 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) #endif datalo = *args++; - datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0); + switch (type) { + case TCG_TYPE_I32: + datahi = 0; + break; + case TCG_TYPE_I64: + datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0); + break; + default: + g_assert_not_reached(); + } addrlo = *args++; addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0); oi = *args++; @@ -2341,7 +2359,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc); /* Record the current context of a store into ldst label */ - add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, + add_qemu_ldst_label(s, false, type, oi, datalo, datahi, TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else a_bits = get_alignment_bits(opc); @@ -2351,7 +2369,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64) tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index, x86_guest_base_offset, x86_guest_base_seg, opc); if (a_bits) { - add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, + add_qemu_ldst_label(s, false, type, oi, datalo, datahi, addrlo, addrhi, s->code_ptr, label_ptr); } #endif @@ -2655,17 +2673,17 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_qemu_ld_i32: - tcg_out_qemu_ld(s, args, 0); + tcg_out_qemu_ld(s, args, TCG_TYPE_I32); break; case INDEX_op_qemu_ld_i64: - tcg_out_qemu_ld(s, args, 1); + tcg_out_qemu_ld(s, args, TCG_TYPE_I64); break; case INDEX_op_qemu_st_i32: case INDEX_op_qemu_st8_i32: - tcg_out_qemu_st(s, args, 0); + tcg_out_qemu_st(s, args, TCG_TYPE_I32); break; case INDEX_op_qemu_st_i64: - tcg_out_qemu_st(s, args, 1); + tcg_out_qemu_st(s, args, TCG_TYPE_I64); break; OP_32_64(mulu2): From patchwork Thu Feb 16 02:57:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743245 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=WZKVtdf3; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKT51qwpz240K for ; Thu, 16 Feb 2023 14:00:57 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTU-0003iO-KS; Wed, 15 Feb 2023 21:58:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTU-0003fq-0Y for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:20 -0500 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTS-0005oV-D0 for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:19 -0500 Received: by mail-pl1-x635.google.com with SMTP id r8so708135pls.2 for ; Wed, 15 Feb 2023 18:58:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/9MYkYHWZA2BK/uGhp/+Wu1+AiImWiE3k4spOtSXkSE=; 
b=WZKVtdf3zU0ANvXnRmUD0SjFo2nu7PQ3Bfn5skyEi0PNCsNWDMBD9bxxDj9SuC12kD n6Tnk4xEufGEksK1ooN0UEW6hmZV2dcBRUgsYh9JJf4hLDFuMadongk8TegFjilLt8Gv mnhOg8s6pVTFGNx2hSWADbsJr5GdB62g6xjLZNiaKzU0NQrGLZoSd8UgwEgyOTOKbERV xMhzlesHmM7nDZuYB4bkaaGcnAkDBrUc9FAPXdisqZq4rugPQe372VXVdS+GzRTH8lPW 82Q2D3k8AEqfqYvC8XwgB921z60wLmgkA8G0fQk9c1qnh3eHzzI+T/CQPndKw52Tbolp 58yA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/9MYkYHWZA2BK/uGhp/+Wu1+AiImWiE3k4spOtSXkSE=; b=3E2T6G3+spIeDjvrjp4vs2ektzt+VKLW5lHnVX1lJEEGi3HkrS7oG4If61QvlkyV0i iZRzjTcNGgwhpyrE1EbJHDIXywibGTvuKQyfprQeeXcLMUPUCLSyPCHj0qnEu06SmJOj tHbXeuXCZTZeCF10azs69pPpcOI+BPHVV/Tw1f8nl6Ip6UgZBBRPWwe99qoFpTqXVWaK KlN+ZCds5tkI8yX0elajsGNm+szDz9IVGZgBCEfCxgPHmX9+fTPZsu5x5krXiWfJOhJA wyQCRC2xxqZNQMZWIUwcU8VR5JKCnODiI0Ny2T0fwe4sQ0+zNEAt/8alMsix8Dt/wDpo JNow== X-Gm-Message-State: AO0yUKWJME/GiiE/nPWjPKKCRZzlFJQWRmp9nHKA7ReOsCxKixYJoTEB tkG4z13HZWdtcXCJuuwupFj5Q/juRuSkw7NKyBw= X-Google-Smtp-Source: AK7set/9MU7/DfoFLOk53rZphZXAR9fmXEugH/V42nQSl2Gk8AdlTRwM54fel+v1F0vtas75LoK15A== X-Received: by 2002:a05:6a20:8416:b0:c7:13bf:3fd0 with SMTP id c22-20020a056a20841600b000c713bf3fd0mr1363274pzd.25.1676516297101; Wed, 15 Feb 2023 18:58:17 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. 
[74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.58.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:58:16 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Cc: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= Subject: [PATCH v2 26/30] tcg/i386: Mark Win64 call-saved vector regs as reserved Date: Wed, 15 Feb 2023 16:57:35 -1000 Message-Id: <20230216025739.1211680-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org While we do not include these in tcg_target_reg_alloc_order, and therefore they ought never be allocated, it seems safer to mark them reserved as well. 
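As a rough sketch of what marking a register reserved amounts to, assuming the register set is a plain 64-bit bitmask: DemoRegSet, DEMO_REG_XMM0 (with an assumed register number of 16) and the demo_* helpers are hypothetical names made up for this note, not the TCG API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register set: one bit per host register. */
typedef uint64_t DemoRegSet;

/* Assumed numbering for this sketch only: XMM0 is register 16. */
enum { DEMO_REG_XMM0 = 16 };

/* Set the bit for one register in the set. */
static void demo_regset_set_reg(DemoRegSet *set, int reg)
{
    assert(reg >= 0 && reg < 64);
    *set |= (DemoRegSet)1 << reg;
}

/* Report whether a register's bit is set. */
static bool demo_regset_test_reg(DemoRegSet set, int reg)
{
    return (set >> reg) & 1;
}

/* Reserve XMM6..XMM15, the vector registers Win64 treats as call-saved. */
static DemoRegSet demo_reserve_win64_vec(DemoRegSet reserved)
{
    for (int i = 6; i <= 15; i++) {
        demo_regset_set_reg(&reserved, DEMO_REG_XMM0 + i);
    }
    return reserved;
}
```

An allocator that masks its candidates against such a set cannot hand out these registers even if one were later added to the allocation order by mistake, which is the extra safety the commit message refers to.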
Reviewed-by: Philippe Mathieu-Daudé Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 5dcea7e198..21442c9339 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -4232,6 +4232,19 @@ static void tcg_target_init(TCGContext *s) s->reserved_regs = 0; tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK); +#ifdef _WIN64 + /* These are call saved, and we don't save them, so don't use them. */ + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM7); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM8); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM9); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM10); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM11); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM12); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM13); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM14); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM15); +#endif } typedef struct { From patchwork Thu Feb 16 02:57:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743254 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=RsY4Q3Rm; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No 
client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKTv4C6Xz240K for ; Thu, 16 Feb 2023 14:01:39 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTW-0003pO-Rq; Wed, 15 Feb 2023 21:58:22 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTV-0003l3-N2 for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:21 -0500 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTT-0005o1-AR for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:21 -0500 Received: by mail-pf1-x430.google.com with SMTP id b1so600477pft.1 for ; Wed, 15 Feb 2023 18:58:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=SCpgDrDSeXYRNeccRXJlncfjqmKvR7yI1DKKfk5rUfY=; b=RsY4Q3Rmb1KNuaGxBVT2iNzZJXXCJygYwRQXAYbUzpEpP2y0BiUpiX7vxUygSSLD14 Mv5Gh4qrBStVCf7uOiNNA11QAKcSJTQ8BjeYaZepHFZV7+PBcpTEXxZIye/zgdn2C/M8 dP57Fyu3FF5H8BCsMbDn3edD/MxuMBfOtmYmi22t94I/x7OU0a0+sz2keiIW1k7u3E4J Z9ztGjhiTwdSZY7/jnZ3B9ilmnFepNo9nGkTRsd2F1j6cg8d0TgrKU4j8IsPV9MnrYj/ qI3pY0E0lWVlhollO25/96hJWfIXBQKNYx5f3KiUGXkU1e2huTKkjrdNnHvpd+wvkNjK zuTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SCpgDrDSeXYRNeccRXJlncfjqmKvR7yI1DKKfk5rUfY=; b=oAMhLJzBMi5DWkjsXuntWuELfqXZpI2A3hzoj+0VRAHX0IneKFvRzNo5rlEREhFRpD TWnNBwm/W8otruWFjYtJ8wiO30jab8fv3YFySIc8cK27G9wtoxehVIACAVkYrfc87DUX 
R4+AI1Ri/DXxSMCR4GpnfAYYoQTeWeBjf2prLuqeV5bpWGqbFGLH9mlO4UNGbUeDmp9Q ViNX3urG/ZSGjfpi8x0bpdoc5Yc+e40K+HhnP07D2V9Gi/biXVW3f6T9negYkYHna4Bw 9RgvELQw+2E219EqePCQm+CnT4ExLeXpaz7beheyq36M5axW9S+KRjZwDO+cAy7miNl7 SnWA== X-Gm-Message-State: AO0yUKU1lu3u0mM+0BqNcN/nxXL1NrtInihs3GsL4hG0Gsef6SItyQ/h EAEMXpCeW4bBrICW6P3NbBAc4hohyfhXd6XEazg= X-Google-Smtp-Source: AK7set9yCCXS+L+j1sWxL6iNYrUKWZjbb89R3R5dvwKPQWU44nr3+wuAA+7xjVPn4h4V3rLe4zAStQ== X-Received: by 2002:aa7:9591:0:b0:5a8:ab21:be2e with SMTP id z17-20020aa79591000000b005a8ab21be2emr3797849pfj.18.1676516298413; Wed, 15 Feb 2023 18:58:18 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.58.17 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:58:17 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 27/30] tcg/i386: Examine MemOp for atomicity and alignment Date: Wed, 15 Feb 2023 16:57:36 -1000 Message-Id: <20230216025739.1211680-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::430; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x430.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: 
qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org No change to the ultimate load/store routines yet, so some atomicity conditions not yet honored, but plumbs the change to alignment through the adjacent functions. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.c.inc | 128 ++++++++++++++++++++++++++++++-------- 1 file changed, 101 insertions(+), 27 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 21442c9339..6ee7bc5a9a 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1746,6 +1746,83 @@ tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i) } } +/* + * Return the alignment and atomicity to use for the inline fast path + * for the given memory operation. The alignment may be larger than + * that specified in @opc, and the correct alignment will be diagnosed + * by the slow path helper. + */ +static MemOp atom_and_align_for_opc(TCGContext *s, MemOp opc, MemOp *out_al) +{ + MemOp align = get_alignment_bits(opc); + MemOp atom, atmax, atsub, size = opc & MO_SIZE; + + /* When serialized, no further atomicity required. */ + if (s->gen_tb->cflags & CF_PARALLEL) { + atom = opc & MO_ATOM_MASK; + } else { + atom = MO_ATOM_NONE; + } + + atmax = opc & MO_ATMAX_MASK; + if (atmax == MO_ATMAX_SIZE) { + atmax = size; + } else { + atmax = atmax >> MO_ATMAX_SHIFT; + } + + switch (atom) { + case MO_ATOM_NONE: + /* The operation requires no specific atomicity. */ + atmax = MO_8; + atsub = MO_8; + break; + case MO_ATOM_IFALIGN: + /* If unaligned, the subobjects are bytes. */ + atsub = MO_8; + break; + case MO_ATOM_WITHIN16: + /* If unaligned, there are subobjects if atmax < size. */ + atsub = (atmax < size ? atmax : MO_8); + atmax = size; + break; + case MO_ATOM_SUBALIGN: + /* If unaligned but not odd, there are subobjects up to atmax - 1. */ + atsub = (atmax == MO_8 ? 
MO_8 : atmax - 1); + break; + default: + g_assert_not_reached(); + } + + /* + * Per Intel Architecture SDM, Volume 3 Section 8.1.1, + * - Pentium family guarantees atomicity of aligned <= 64-bit. + * - P6 family guarantees atomicity of unaligned <= 64-bit + * which fit within a cache line. + * - AVX guarantees atomicity of aligned 128-bit VMOVDQA (et al). + * + * There is no language in the Intel manual specifying what happens + * with the partial memory operations when crossing a cache line. + * When there is required atomicity of subobjects, we must perform + * an additional runtime test for alignment and then perform either + * the full operation, or two half-sized operations. + * + * For x86_64, and MO_64, we do not have a scratch register with + * which to do this. Only allow splitting for MO_64 on i386, + * where the data is already separated, or MO_128. + * Otherwise, require full alignment and fall back to the helper + * for the misaligned case. + */ + if (align < atmax + && atsub != MO_8 + && size != (TCG_TARGET_REG_BITS == 64 ? MO_128 : MO_64)) { + align = size; + } + + *out_al = align; + return atmax; +} + /* * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, * int mmu_idx, uintptr_t ra) @@ -1987,7 +2064,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) * First argument register is clobbered. 
*/ static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, - int mem_index, MemOp opc, + int mem_index, MemOp a_bits, MemOp s_bits, tcg_insn_unit **label_ptr, int which) { const TCGReg r0 = TCG_REG_L0; @@ -1995,8 +2072,6 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi, TCGType ttype = TCG_TYPE_I32; TCGType tlbtype = TCG_TYPE_I32; int trexw = 0, hrexw = 0, tlbrexw = 0; - unsigned a_bits = get_alignment_bits(opc); - unsigned s_bits = opc & MO_SIZE; unsigned a_mask = (1 << a_bits) - 1; unsigned s_mask = (1 << s_bits) - 1; target_ulong tlb_mask; @@ -2124,7 +2199,8 @@ static inline int setup_guest_base_seg(void) static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, TCGReg base, int index, intptr_t ofs, - int seg, TCGType type, MemOp memop) + int seg, TCGType type, MemOp memop, + MemOp atom, MemOp align) { bool use_movbe = false; int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW); @@ -2225,11 +2301,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type) TCGReg datalo, datahi, addrlo; TCGReg addrhi __attribute__((unused)); MemOpIdx oi; - MemOp opc; + MemOp opc, atom, align; tcg_insn_unit *label_ptr[2] = { }; -#ifndef CONFIG_SOFTMMU - unsigned a_bits; -#endif datalo = *args++; switch (type) { @@ -2246,26 +2319,27 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type) addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0); oi = *args++; opc = get_memop(oi); + atom = atom_and_align_for_opc(s, opc, &align); #if defined(CONFIG_SOFTMMU) - tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc, + tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), align, opc & MO_SIZE, label_ptr, offsetof(CPUTLBEntry, addr_read)); /* TLB Hit. 
*/ - tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type, opc); + tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type, + opc, atom, align); /* Record the current context of a load into ldst label */ add_qemu_ldst_label(s, true, type, oi, datalo, datahi, TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else - a_bits = get_alignment_bits(opc); - if (a_bits) { - tcg_out_test_alignment(s, addrlo, a_bits, label_ptr); + if (align) { + tcg_out_test_alignment(s, addrlo, align, label_ptr); } tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index, x86_guest_base_offset, x86_guest_base_seg, - type, opc); - if (a_bits) { + type, opc, atom, align); + if (align) { add_qemu_ldst_label(s, true, type, oi, datalo, datahi, addrlo, addrhi, s->code_ptr, label_ptr); } @@ -2274,7 +2348,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type) static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, TCGReg base, int index, intptr_t ofs, - int seg, MemOp memop) + int seg, MemOp memop, + MemOp atom, MemOp align) { bool use_movbe = false; int movop = OPC_MOVL_EvGv; @@ -2329,11 +2404,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type) TCGReg datalo, datahi, addrlo; TCGReg addrhi __attribute__((unused)); MemOpIdx oi; - MemOp opc; + MemOp opc, atom, align; tcg_insn_unit *label_ptr[2] = { }; -#ifndef CONFIG_SOFTMMU - unsigned a_bits; -#endif datalo = *args++; switch (type) { @@ -2350,25 +2422,27 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type) addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0); oi = *args++; opc = get_memop(oi); + atom = atom_and_align_for_opc(s, opc, &align); #if defined(CONFIG_SOFTMMU) - tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc, + tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), align, opc & MO_SIZE, label_ptr, offsetof(CPUTLBEntry, addr_write)); /* TLB Hit. 
*/ - tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc); + tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, + opc, atom, align); /* Record the current context of a store into ldst label */ add_qemu_ldst_label(s, false, type, oi, datalo, datahi, TCG_REG_L1, addrhi, s->code_ptr, label_ptr); #else - a_bits = get_alignment_bits(opc); - if (a_bits) { - tcg_out_test_alignment(s, addrlo, a_bits, label_ptr); + if (align) { + tcg_out_test_alignment(s, addrlo, align, label_ptr); } tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index, - x86_guest_base_offset, x86_guest_base_seg, opc); - if (a_bits) { + x86_guest_base_offset, x86_guest_base_seg, + opc, atom, align); + if (align) { add_qemu_ldst_label(s, false, type, oi, datalo, datahi, addrlo, addrhi, s->code_ptr, label_ptr); } From patchwork Thu Feb 16 02:57:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1743237 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=VBQRfrFe; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4PHKRL5Xw5z23yD for ; Thu, 16 Feb 2023 13:59:26 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pSUTY-00043M-Nj; Wed, 15 Feb 2023 
21:58:24 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pSUTX-0003uY-CG for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:23 -0500 Received: from mail-pf1-x433.google.com ([2607:f8b0:4864:20::433]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1pSUTU-0005k0-IT for qemu-devel@nongnu.org; Wed, 15 Feb 2023 21:58:23 -0500 Received: by mail-pf1-x433.google.com with SMTP id r3so587762pfh.4 for ; Wed, 15 Feb 2023 18:58:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=BN63CK+ADcQ15q4HiVZXL27r9Of47VrWbcuZQkDw4c8=; b=VBQRfrFeL5+Gfe5kLLHKSnETD8L88xCzRqMfdfnsuzlcyCslPWezxXrorvfRugbdbI tub1W5YJvLPF5JZnF8ozO8cvorLbI+C2RSpFRp6OpjUWnwL8wmt5aHqUQWlvXPFfVA9W HUNqj3OVqG5bSHv5Ezq3H7U3g1dzMafMuyjAwAL5yCCaekL8hv20TYBLC414GGD//+jM oU484p/wMHafY9joZv1zarVUccOt5oPNHE6UHaUYtTPY+BV3JoN/aGlrNFlZQLIhgncl vmmlO9QpGSqHrGK/2MTFyPgRXGDjCmfqyrAb/0BQ6qybtd5kjWKurdVi99ln7CsDPScp UEGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BN63CK+ADcQ15q4HiVZXL27r9Of47VrWbcuZQkDw4c8=; b=fwtVGpoM5KYtitMGLfnh9LLg+Z8VITLS/dPUoIYdgGG5CqDAEod9oq+J/eOxKUbvQJ luW7U0O9HR3GM3M0ASD6gvsOXT7ySQihdU4yyolXqJR6MAtylNtlW/PqJi7XzoQY5qzM NeGf0MI9M1E+Y5cEXRz4pga3HYALWS990OpDC7SDJGyDhoviLXdYDsDfn+qUQiykUd/t /vikdjvnsbPgyI39YJHkpEdpN+nTHPFXHt/FqN6jabCQflmxJCfDnMpkgrfKyUp8OAYx kQXOP+Gr0gVLzfs4DxjXnLbY6OrG9lt3uz2Gp8urLlO1vitZBD1OrDpNNIHJs9nrUv+n mILA== X-Gm-Message-State: AO0yUKVRKMduIB5qwAgDNcBETdkO/LDk/vNJLfL3pFK5gqh+AWK0byss Rc8xiBQDTO0RP8um2s6eI7COe3Fz1mZNrs5Acm8= 
X-Google-Smtp-Source: AK7set8ulcYiUF3F+93/pAHJHbMDiu1/OYUQDwqs5eMl0rs/6F9Mw1uth2g24v3XlhnuwKLXeTbWKQ== X-Received: by 2002:aa7:8424:0:b0:5a8:380d:7822 with SMTP id q4-20020aa78424000000b005a8380d7822mr3932175pfn.23.1676516299686; Wed, 15 Feb 2023 18:58:19 -0800 (PST) Received: from stoup.. (rrcs-74-87-59-234.west.biz.rr.com. [74.87.59.234]) by smtp.gmail.com with ESMTPSA id e14-20020a62aa0e000000b005a816b7c3e8sm89655pff.24.2023.02.15.18.58.18 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Feb 2023 18:58:19 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v2 28/30] tcg/i386: Support 128-bit load/store with have_atomic16 Date: Wed, 15 Feb 2023 16:57:37 -1000 Message-Id: <20230216025739.1211680-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org> References: <20230216025739.1211680-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::433; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x433.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 3 +- tcg/i386/tcg-target.c.inc | 325 +++++++++++++++++++++++++++++++++++--- 2 files changed, 304 insertions(+), 24 deletions(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 6d8a536a32..37d8e70fdc 100644 --- 
a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -194,7 +194,8 @@ extern bool have_atomic16; #define TCG_TARGET_HAS_qemu_st8_i32 1 #endif -#define TCG_TARGET_HAS_qemu_ldst_i128 0 +#define TCG_TARGET_HAS_qemu_ldst_i128 \ + (TCG_TARGET_REG_BITS == 64 && have_atomic16) /* We do not support older SSE systems, only beginning with AVX1. */ #define TCG_TARGET_HAS_v64 have_avx1 diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 6ee7bc5a9a..6fdf79020f 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -91,6 +91,8 @@ static const int tcg_target_reg_alloc_order[] = { #endif }; +#define TCG_TMP_VEC TCG_REG_XMM5 + static const int tcg_target_call_iarg_regs[] = { #if TCG_TARGET_REG_BITS == 64 #if defined(_WIN64) @@ -347,6 +349,8 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct) #define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16) #define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16) #define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16) +#define OPC_PEXTRD (0x16 | P_EXT3A | P_DATA16) +#define OPC_PINSRD (0x22 | P_EXT3A | P_DATA16) #define OPC_PMAXSB (0x3c | P_EXT38 | P_DATA16) #define OPC_PMAXSW (0xee | P_EXT | P_DATA16) #define OPC_PMAXSD (0x3d | P_EXT38 | P_DATA16) @@ -1730,8 +1734,7 @@ static void tcg_out_nopn(TCGContext *s, int n) } /* Test register R vs immediate bits I, setting Z flag for EQ/NE. */ -static void __attribute__((unused)) -tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i) +static void tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i) { /* * This is used for testing alignment, so we can usually use testb. 
@@ -1828,10 +1831,11 @@ static MemOp atom_and_align_for_opc(TCGContext *s, MemOp opc, MemOp *out_al) * int mmu_idx, uintptr_t ra) */ static void * const qemu_ld_helpers[MO_SIZE + 1] = { - [MO_UB] = helper_ldub_mmu, - [MO_UW] = helper_lduw_mmu, - [MO_UL] = helper_ldul_mmu, - [MO_UQ] = helper_ldq_mmu, + [MO_8] = helper_ldub_mmu, + [MO_16] = helper_lduw_mmu, + [MO_32] = helper_ldul_mmu, + [MO_64] = helper_ldq_mmu, + [MO_128] = helper_ld16_mmu, }; /* @@ -1839,10 +1843,11 @@ static void * const qemu_ld_helpers[MO_SIZE + 1] = { * uintxx_t val, int mmu_idx, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_UB] = helper_stb_mmu, - [MO_UW] = helper_stw_mmu, - [MO_UL] = helper_stl_mmu, - [MO_UQ] = helper_stq_mmu, + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, + [MO_128] = helper_st16_mmu, }; /* @@ -1870,6 +1875,13 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGType type, label->label_ptr[1] = label_ptr[1]; } +static void tcg_out_mov2_xchg(TCGContext *s, TCGType type1, TCGType type2, + TCGReg dst1, TCGReg dst2) +{ + int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 0 : P_REXW); + tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2); +} + /* Move src1 to dst1 and src2 to dst2, minding possible overlap. */ static void tcg_out_mov2(TCGContext *s, TCGType type1, TCGReg dst1, TCGReg src1, @@ -1883,11 +1895,69 @@ static void tcg_out_mov2(TCGContext *s, tcg_out_mov(s, type1, dst1, src1); } else { /* dst1 == src2 && dst2 == src1 -> xchg. */ - int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 0 : P_REXW); - tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2); + tcg_out_mov2_xchg(s, type1, type2, dst1, dst2); } } +/* Similarly for 3 pairs. 
*/ +static void tcg_out_mov3(TCGContext *s, + TCGType type1, TCGReg dst1, TCGReg src1, + TCGType type2, TCGReg dst2, TCGReg src2, + TCGType type3, TCGReg dst3, TCGReg src3) +{ + if (dst1 != src2 && dst1 != src3) { + tcg_out_mov(s, type1, dst1, src1); + tcg_out_mov2(s, type2, dst2, src2, type3, dst3, src3); + return; + } + if (dst2 != src1 && dst2 != src3) { + tcg_out_mov(s, type2, dst2, src2); + tcg_out_mov2(s, type1, dst1, src1, type3, dst3, src3); + return; + } + if (dst3 != src1 && dst3 != src2) { + tcg_out_mov(s, type3, dst3, src3); + tcg_out_mov2(s, type1, dst1, src1, type2, dst2, src2); + return; + } + /* Three-way overlap present, at least one xchg needed. */ + if (dst1 == src2) { + tcg_out_mov2_xchg(s, type1, type2, src1, src2); + tcg_out_mov2(s, type2, dst2, src1, type3, dst3, src3); + return; + } + if (dst1 == src3) { + tcg_out_mov2_xchg(s, type1, type3, src1, src3); + tcg_out_mov2(s, type2, dst2, src2, type3, dst3, src1); + return; + } + g_assert_not_reached(); +} + +static void tcg_out_vec_to_pair(TCGContext *s, TCGType type, + TCGReg l, TCGReg h, TCGReg v) +{ + int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW; + + /* vmov{d,q} %v, %l */ + tcg_out_vex_modrm(s, OPC_MOVD_EyVy + rexw, v, 0, l); + /* vpextr{d,q} $1, %v, %h */ + tcg_out_vex_modrm(s, OPC_PEXTRD + rexw, v, 0, h); + tcg_out8(s, 1); +} + +static void tcg_out_pair_to_vec(TCGContext *s, TCGType type, + TCGReg v, TCGReg l, TCGReg h) +{ + int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW; + + /* vmov{d,q} %l, %v */ + tcg_out_vex_modrm(s, OPC_MOVD_VyEy + rexw, v, 0, l); + /* vpinsr{d,q} $1, %h, %v, %v */ + tcg_out_vex_modrm(s, OPC_PINSRD + rexw, v, v, h); + tcg_out8(s, 1); +} + /* * Generate code for the slow path for a load at the end of block */ @@ -1897,7 +1967,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) MemOp opc = get_memop(oi); TCGReg data_reg; tcg_insn_unit **label_ptr = &l->label_ptr[0]; - int rexw = (l->type == TCG_TYPE_I64 ?
P_REXW : 0); + int rexw = (l->type == TCG_TYPE_I32 ? 0 : P_REXW); /* resolve label address */ tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4); @@ -1961,6 +2031,22 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX); } break; + case MO_128: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + switch (TCG_TARGET_CALL_RET_I128) { + case TCG_CALL_RET_NORMAL: + tcg_out_mov2(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX, + TCG_TYPE_I64, l->datahi_reg, TCG_REG_RDX); + break; + case TCG_CALL_RET_BY_VEC: + tcg_out_vec_to_pair(s, TCG_TYPE_I64, + data_reg, l->datahi_reg, TCG_REG_XMM0); + break; + default: + qemu_build_not_reached(); + } + break; + default: tcg_abort(); } @@ -1977,7 +2063,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) { MemOpIdx oi = l->oi; MemOp opc = get_memop(oi); - MemOp s_bits = opc & MO_SIZE; tcg_insn_unit **label_ptr = &l->label_ptr[0]; TCGReg retaddr; @@ -2004,9 +2089,15 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs); ofs += 4; - if (s_bits == MO_64) { + switch (l->type) { + case TCG_TYPE_I32: + break; + case TCG_TYPE_I64: tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs); ofs += 4; + break; + default: + g_assert_not_reached(); } tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs); @@ -2016,15 +2107,54 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr); tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs); } else { - tcg_out_mov2(s, TCG_TYPE_TL, - tcg_target_call_iarg_regs[1], l->addrlo_reg, - s_bits == MO_64 ? 
TCG_TYPE_I64 : TCG_TYPE_I32, - tcg_target_call_iarg_regs[2], l->datalo_reg); - tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); - tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi); + int slot; - if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) { - retaddr = tcg_target_call_iarg_regs[4]; + switch (l->type) { + case TCG_TYPE_I32: + case TCG_TYPE_I64: + tcg_out_mov2(s, TCG_TYPE_TL, + tcg_target_call_iarg_regs[1], l->addrlo_reg, + l->type, tcg_target_call_iarg_regs[2], l->datalo_reg); + slot = 3; + break; + case TCG_TYPE_I128: + switch (TCG_TARGET_CALL_ARG_I128) { + case TCG_CALL_ARG_NORMAL: + tcg_out_mov3(s, TCG_TYPE_TL, + tcg_target_call_iarg_regs[1], l->addrlo_reg, + TCG_TYPE_I64, + tcg_target_call_iarg_regs[2], l->datalo_reg, + TCG_TYPE_I64, + tcg_target_call_iarg_regs[3], l->datahi_reg); + slot = 4; + break; + case TCG_CALL_ARG_BY_REF: + /* Leave room for retaddr below, take next 16 aligned bytes. */ + tcg_out_st(s, TCG_TYPE_I64, l->datalo_reg, + TCG_REG_ESP, TCG_TARGET_CALL_STACK_OFFSET + 16); + tcg_out_st(s, TCG_TYPE_I64, l->datahi_reg, + TCG_REG_ESP, TCG_TARGET_CALL_STACK_OFFSET + 24); + tcg_out_mov(s, TCG_TYPE_TL, + tcg_target_call_iarg_regs[1], l->addrlo_reg); + tcg_out_modrm_offset(s, OPC_LEA + P_REXW, + tcg_target_call_iarg_regs[2], TCG_REG_ESP, + TCG_TARGET_CALL_STACK_OFFSET + 16); + slot = 3; + break; + default: + qemu_build_not_reached(); + } + break; + default: + g_assert_not_reached(); + } + + tcg_debug_assert(slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs) - 1); + tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); + tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[slot++], oi); + + if (slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs)) { + retaddr = tcg_target_call_iarg_regs[slot]; tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr); } else { retaddr = TCG_REG_RAX; @@ -2288,6 +2418,71 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, } } break; + + 
case MO_128: + { + TCGLabel *l1 = NULL, *l2 = NULL; + bool use_pair = atom < MO_128; + + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + + if (use_movbe) { + TCGReg t = datalo; + datalo = datahi; + datahi = t; + } + if (!use_pair) { + /* + * Atomicity requires that we use VMOVDQA. + * If we've already checked for 16-byte alignment, that's all + * we need. If we arrive here with lesser alignment, then we + * have determined that less than 16-byte alignment can be + * satisfied with two 8-byte loads. + */ + if (align < MO_128) { + use_pair = true; + l1 = gen_new_label(); + l2 = gen_new_label(); + + tcg_out_testi(s, base, align == MO_64 ? 8 : 15); + tcg_out_jxx(s, JCC_JNE, l2, true); + } + + tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + seg, + TCG_TMP_VEC, 0, + base, index, 0, ofs); + tcg_out_vec_to_pair(s, TCG_TYPE_I64, + datalo, datahi, TCG_TMP_VEC); + + if (use_movbe) { + tcg_out_bswap64(s, datalo); + tcg_out_bswap64(s, datahi); + } + + if (use_pair) { + tcg_out_jxx(s, JCC_JMP, l1, true); + tcg_out_label(s, l2); + } + } + if (use_pair) { + if (base != datalo) { + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo, + base, index, 0, ofs); + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi, + base, index, 0, ofs + 8); + } else { + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi, + base, index, 0, ofs + 8); + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo, + base, index, 0, ofs); + } + } + if (l1) { + tcg_out_label(s, l1); + } + } + break; + default: g_assert_not_reached(); } @@ -2312,6 +2507,10 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type) case TCG_TYPE_I64: datahi = (TCG_TARGET_REG_BITS == 32 ?
*args++ : 0); break; + case TCG_TYPE_I128: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + datahi = *args++; + break; default: g_assert_not_reached(); } @@ -2394,6 +2593,68 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, base, index, 0, ofs + 4); } break; + + case MO_128: + { + TCGLabel *l1 = NULL, *l2 = NULL; + bool use_pair = atom < MO_128; + + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + + if (use_movbe) { + TCGReg t = datalo; + datalo = datahi; + datahi = t; + } + if (!use_pair) { + /* + * Atomicity requires that we use VMOVDQA. + * If we've already checked for 16-byte alignment, that's all + * we need. If we arrive here with lesser alignment, then we + * have determined that less than 16-byte alignment can be + * satisfied with two 8-byte stores. + */ + if (align < MO_128) { + use_pair = true; + l1 = gen_new_label(); + l2 = gen_new_label(); + + tcg_out_testi(s, base, align == MO_64 ? 8 : 15); + tcg_out_jxx(s, JCC_JNE, l2, true); + } + + if (use_movbe) { + /* Byte swap while storing to the stack. */ + tcg_out_modrm_offset(s, movop + P_REXW + seg, datalo, + TCG_REG_ESP, 0); + tcg_out_modrm_offset(s, movop + P_REXW + seg, datahi, + TCG_REG_ESP, 8); + tcg_out_ld(s, TCG_TYPE_V128, TCG_TMP_VEC, TCG_REG_ESP, 0); + } else { + tcg_out_pair_to_vec(s, TCG_TYPE_I64, + TCG_TMP_VEC, datalo, datahi); + } + tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + seg, + TCG_TMP_VEC, 0, + base, index, 0, ofs); + + if (use_pair) { + tcg_out_jxx(s, JCC_JMP, l1, true); + tcg_out_label(s, l2); + } + } + if (use_pair) { + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo, + base, index, 0, ofs); + tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi, + base, index, 0, ofs + 8); + } + if (l1) { + tcg_out_label(s, l1); + } + } + break; + default: g_assert_not_reached(); } @@ -2415,6 +2676,10 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type) case TCG_TYPE_I64: datahi = (TCG_TARGET_REG_BITS == 32 ?
*args++ : 0); break; + case TCG_TYPE_I128: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + datahi = *args++; + break; default: g_assert_not_reached(); } @@ -2752,6 +3017,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, case INDEX_op_qemu_ld_i64: tcg_out_qemu_ld(s, args, TCG_TYPE_I64); break; + case INDEX_op_qemu_ld_i128: + tcg_out_qemu_ld(s, args, TCG_TYPE_I128); + break; case INDEX_op_qemu_st_i32: case INDEX_op_qemu_st8_i32: tcg_out_qemu_st(s, args, TCG_TYPE_I32); @@ -2759,6 +3027,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, case INDEX_op_qemu_st_i64: tcg_out_qemu_st(s, args, TCG_TYPE_I64); break; + case INDEX_op_qemu_st_i128: + tcg_out_qemu_st(s, args, TCG_TYPE_I128); + break; OP_32_64(mulu2): tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_MUL, args[3]); @@ -3449,6 +3720,13 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op) : TARGET_LONG_BITS <= TCG_TARGET_REG_BITS ? C_O0_I3(L, L, L) : C_O0_I4(L, L, L, L)); + case INDEX_op_qemu_ld_i128: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + return C_O2_I1(r, r, L); + case INDEX_op_qemu_st_i128: + tcg_debug_assert(TCG_TARGET_REG_BITS == 64); + return C_O0_I3(L, L, L); + case INDEX_op_brcond2_i32: return C_O0_I4(r, r, ri, ri); @@ -4306,6 +4584,7 @@ static void tcg_target_init(TCGContext *s) s->reserved_regs = 0; tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK); + tcg_regset_set_reg(s->reserved_regs, TCG_TMP_VEC); #ifdef _WIN64 /* These are call saved, and we don't save them, so don't use them. 
*/ tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6);

From patchwork Thu Feb 16 02:57:38 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743238
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 29/30] tcg/i386: Add vex_v argument to tcg_out_vex_modrm_pool
Date: Wed, 15 Feb 2023 16:57:38 -1000
Message-Id: <20230216025739.1211680-30-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/i386/tcg-target.c.inc | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 6fdf79020f..834978f7a6 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -841,9 +841,9 @@ static inline void tcg_out_modrm_pool(TCGContext *s, int opc, int r) } /* Output an opcode with an expected reference to the constant pool.
*/ -static inline void tcg_out_vex_modrm_pool(TCGContext *s, int opc, int r) +static inline void tcg_out_vex_modrm_pool(TCGContext *s, int opc, int r, int v) { - tcg_out_vex_opc(s, opc, r, 0, 0, 0); + tcg_out_vex_opc(s, opc, r, v, 0, 0); /* Absolute for 32-bit, pc-relative for 64-bit. */ tcg_out8(s, LOWREGMASK(r) << 3 | 5); tcg_out32(s, 0); @@ -990,18 +990,18 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece, if (TCG_TARGET_REG_BITS == 32 && vece < MO_64) { if (have_avx2) { - tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTD + vex_l, ret); + tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTD + vex_l, ret, 0); } else { - tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSS, ret); + tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSS, ret, 0); } new_pool_label(s, arg, R_386_32, s->code_ptr - 4, 0); } else { if (type == TCG_TYPE_V64) { - tcg_out_vex_modrm_pool(s, OPC_MOVQ_VqWq, ret); + tcg_out_vex_modrm_pool(s, OPC_MOVQ_VqWq, ret, 0); } else if (have_avx2) { - tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTQ + vex_l, ret); + tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTQ + vex_l, ret, 0); } else { - tcg_out_vex_modrm_pool(s, OPC_MOVDDUP, ret); + tcg_out_vex_modrm_pool(s, OPC_MOVDDUP, ret, 0); } if (TCG_TARGET_REG_BITS == 64) { new_pool_label(s, arg, R_386_PC32, s->code_ptr - 4, -4); @@ -1024,7 +1024,7 @@ static void tcg_out_movi_vec(TCGContext *s, TCGType type, } int rexw = (type == TCG_TYPE_I32 ? 
0 : P_REXW); - tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy + rexw, ret); + tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy + rexw, ret, 0); if (TCG_TARGET_REG_BITS == 64) { new_pool_label(s, arg, R_386_PC32, s->code_ptr - 4, -4); } else {

From patchwork Thu Feb 16 02:57:39 2023
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 1743257
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v2 30/30] tcg/i386: Honor 64-bit atomicity in 32-bit mode
Date: Wed, 15 Feb 2023 16:57:39 -1000
Message-Id: <20230216025739.1211680-31-richard.henderson@linaro.org>
In-Reply-To: <20230216025739.1211680-1-richard.henderson@linaro.org>
References: <20230216025739.1211680-1-richard.henderson@linaro.org>

Use the x87 FPU (FILD/FISTP with a 64-bit memory operand) to perform
atomic 64-bit loads and stores on 32-bit hosts.
Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target.c.inc | 119 +++++++++++++++++++++++++++++++++-----
 1 file changed, 106 insertions(+), 13 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 834978f7a6..2ac0f5cf4e 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -472,6 +472,10 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct) #define OPC_GRP5 (0xff) #define OPC_GRP14 (0x73 | P_EXT | P_DATA16) +#define OPC_ESCDF (0xdf) +#define ESCDF_FILD_m64 5 +#define ESCDF_FISTP_m64 7 + /* Group 1 opcode extensions for 0x80-0x83. These are also used as modifiers for OPC_ARITH. */ #define ARITH_ADD 0 @@ -2400,21 +2404,65 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo, base, index, 0, ofs); } else { + TCGLabel *l1 = NULL, *l2 = NULL; + bool use_pair = atom < MO_64; + if (use_movbe) { TCGReg t = datalo; datalo = datahi; datahi = t; } - if (base != datalo) { - tcg_out_modrm_sib_offset(s, movop + seg, datalo, - base, index, 0, ofs); - tcg_out_modrm_sib_offset(s, movop + seg, datahi, - base, index, 0, ofs + 4); - } else { - tcg_out_modrm_sib_offset(s, movop + seg, datahi, - base, index, 0, ofs + 4); - tcg_out_modrm_sib_offset(s, movop + seg, datalo, + + if (!use_pair) { + /* + * Atomicity requires that we use a single 8-byte load. + * For simplicity, and code size, always use the FPU for this. + * Similar insns using SSE/AVX are merely larger. + * Load from memory in one go, then store back to the stack, + * from whence we can load into the correct integer regs. + * + * If we've already checked for 8-byte alignment, or not + * checked for alignment at all, that's all we need. + * If we arrive here with lesser but non-zero alignment, + * then we have determined that subalignment can be + * satisfied with two 4-byte loads.
+ */ + if (align > MO_8 && align < MO_64) { + use_pair = true; + l1 = gen_new_label(); + l2 = gen_new_label(); + + tcg_out_testi(s, base, align == MO_32 ? 4 : 7); + tcg_out_jxx(s, JCC_JNE, l2, true); + } + + tcg_out_modrm_sib_offset(s, OPC_ESCDF + seg, ESCDF_FILD_m64, base, index, 0, ofs); + tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FISTP_m64, + TCG_REG_ESP, 0); + tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0); + tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4); + + if (use_pair) { + tcg_out_jxx(s, JCC_JMP, l1, true); + tcg_out_label(s, l2); + } + } + if (use_pair) { + if (base != datalo) { + tcg_out_modrm_sib_offset(s, movop + seg, datalo, + base, index, 0, ofs); + tcg_out_modrm_sib_offset(s, movop + seg, datahi, + base, index, 0, ofs + 4); + } else { + tcg_out_modrm_sib_offset(s, movop + seg, datahi, + base, index, 0, ofs + 4); + tcg_out_modrm_sib_offset(s, movop + seg, datalo, + base, index, 0, ofs); + } + } + if (l1) { + tcg_out_label(s, l1); } } break; @@ -2577,20 +2625,65 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, case MO_32: tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs); break; + case MO_64: if (TCG_TARGET_REG_BITS == 64) { tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo, base, index, 0, ofs); } else { + TCGLabel *l1 = NULL, *l2 = NULL; + bool use_pair = atom < MO_64; + if (use_movbe) { TCGReg t = datalo; datalo = datahi; datahi = t; } - tcg_out_modrm_sib_offset(s, movop + seg, datalo, - base, index, 0, ofs); - tcg_out_modrm_sib_offset(s, movop + seg, datahi, - base, index, 0, ofs + 4); + + if (!use_pair) { + /* + * Atomicity requires that we use one 8-byte store. + * For simplicity, and code size, always use the FPU for this. + * Similar insns using SSE/AVX are merely larger. + * Assemble the 8-byte quantity in required endianness + * on the stack, load to coproc unit, and store.
+ * + * If we've already checked for 8-byte alignment, or not + * checked for alignment at all, that's all we need. + * If we arrive here with lesser but non-zero alignment, + * then we have determined that subalignment can be + * satisfied with two 4-byte stores. + */ + if (align > MO_8 && align < MO_64) { + use_pair = true; + l1 = gen_new_label(); + l2 = gen_new_label(); + + tcg_out_testi(s, base, align == MO_32 ? 4 : 7); + tcg_out_jxx(s, JCC_JNE, l2, true); + } + + tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0); + tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4); + tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FILD_m64, + TCG_REG_ESP, 0); + tcg_out_modrm_sib_offset(s, OPC_ESCDF + seg, ESCDF_FISTP_m64, + base, index, 0, ofs); + + if (use_pair) { + tcg_out_jxx(s, JCC_JMP, l1, true); + tcg_out_label(s, l2); + } + } + if (use_pair) { + tcg_out_modrm_sib_offset(s, movop + seg, datalo, + base, index, 0, ofs); + tcg_out_modrm_sib_offset(s, movop + seg, datahi, + base, index, 0, ofs + 4); + } + if (l1) { + tcg_out_label(s, l1); + } } break;
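
Note on the runtime alignment test used in the 64-bit paths above: the mask passed to tcg_out_testi() depends on how much alignment is already known. The following is only a minimal host-side sketch of that reasoning, not code from the patch. The names ALIGN_2, ALIGN_4, align_test_mask, takes_pair_path and is_aligned8 are hypothetical and do not exist in QEMU. The sketch assumes the known alignment is given as log2 of a byte count, in the way MO_32 stands for 4-byte alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the known-alignment values (log2 of bytes). */
enum { ALIGN_2 = 1, ALIGN_4 = 2 };

/*
 * Mask tested against the address, mirroring "align == MO_32 ? 4 : 7".
 * If the address is already known to be 4-byte aligned, bits 0 and 1 are
 * zero, so testing bit 2 alone decides 8-byte alignment.  Otherwise all
 * three low bits must be tested.
 */
static uint32_t align_test_mask(int known_align)
{
    return known_align == ALIGN_4 ? 4 : 7;
}

/* True when the generated code would branch to the two 4-byte accesses. */
static bool takes_pair_path(uint32_t addr, int known_align)
{
    return (addr & align_test_mask(known_align)) != 0;
}

/* Reference predicate: is the address 8-byte aligned? */
static bool is_aligned8(uint32_t addr)
{
    return (addr & 7) == 0;
}
```

Under these assumptions, the single-access FPU path is taken exactly when the address is 8-byte aligned, for both the short mask and the full mask, provided the caller's alignment guarantee actually holds.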