From patchwork Tue Jan 17 22:08:54 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727783
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela, peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 01/21] update linux headers
Date: Tue, 17 Jan 2023 17:08:54 -0500
Message-Id: <20230117220914.2062125-2-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

Signed-off-by: Peter Xu
---
 include/standard-headers/drm/drm_fourcc.h | 63 +++-
 include/standard-headers/linux/ethtool.h | 81 ++++-
 include/standard-headers/linux/fuse.h | 20 +-
 .../linux/input-event-codes.h | 4 +
 include/standard-headers/linux/pci_regs.h | 2 +
 include/standard-headers/linux/virtio_blk.h | 19 ++
 include/standard-headers/linux/virtio_bt.h | 8 +
 include/standard-headers/linux/virtio_net.h | 4 +
 linux-headers/asm-arm64/kvm.h | 1 +
 linux-headers/asm-generic/hugetlb_encode.h | 26 +-
 linux-headers/asm-generic/mman-common.h | 4 +
 linux-headers/asm-mips/mman.h | 4 +
 linux-headers/asm-riscv/kvm.h | 7 +
 linux-headers/asm-x86/kvm.h | 11 +-
 linux-headers/linux/kvm.h | 32 +-
 linux-headers/linux/psci.h | 14 +
 linux-headers/linux/userfaultfd.h | 4 +
 linux-headers/linux/vfio.h | 278 +++++++++++++++++-
 18 files changed, 526 insertions(+), 56 deletions(-)

diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
index 48b620cbef..69cab17b38 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -98,18 +98,42 @@ extern "C" {
 #define DRM_FORMAT_INVALID 0
 
 /* color index */
+#define DRM_FORMAT_C1 fourcc_code('C', '1', ' ', ' ') /* [7:0] C0:C1:C2:C3:C4:C5:C6:C7 1:1:1:1:1:1:1:1 eight pixels/byte */
+#define DRM_FORMAT_C2 fourcc_code('C', '2', ' ', ' ') /* [7:0] C0:C1:C2:C3 2:2:2:2 four pixels/byte */
+#define DRM_FORMAT_C4 fourcc_code('C', '4', ' ', ' ') /* [7:0] C0:C1 4:4 two pixels/byte */
 #define DRM_FORMAT_C8 fourcc_code('C', '8', ' ', ' ') /* [7:0] C */
 
-/* 8 bpp Red */
+/* 1 bpp Darkness (inverse relationship between channel value and brightness) */
+#define DRM_FORMAT_D1 fourcc_code('D', '1', ' ', ' ') /* [7:0] D0:D1:D2:D3:D4:D5:D6:D7 1:1:1:1:1:1:1:1 eight pixels/byte */
+
+/* 2 bpp Darkness (inverse relationship between channel value and brightness) */
+#define DRM_FORMAT_D2 fourcc_code('D', '2', ' ', ' ') /* [7:0] D0:D1:D2:D3 2:2:2:2 four pixels/byte */
+
+/* 4 bpp Darkness (inverse relationship between channel value and brightness) */
+#define DRM_FORMAT_D4 fourcc_code('D', '4', ' ', ' ') /* [7:0] D0:D1 4:4 two pixels/byte */
+
+/* 8 bpp Darkness (inverse relationship between channel value and brightness) */
+#define DRM_FORMAT_D8 fourcc_code('D', '8', ' ', ' ') /* [7:0] D */
+
+/* 1 bpp Red (direct relationship between channel value and brightness) */
+#define DRM_FORMAT_R1 fourcc_code('R', '1', ' ', ' ') /* [7:0] R0:R1:R2:R3:R4:R5:R6:R7 1:1:1:1:1:1:1:1 eight pixels/byte */
+
+/* 2 bpp Red (direct relationship between channel value and brightness) */
+#define DRM_FORMAT_R2 fourcc_code('R', '2', ' ', ' ') /* [7:0] R0:R1:R2:R3 2:2:2:2 four pixels/byte */
+
+/* 4 bpp Red (direct relationship between channel value and brightness) */
+#define DRM_FORMAT_R4 fourcc_code('R', '4', ' ', ' ') /* [7:0] R0:R1 4:4 two pixels/byte */
+
+/* 8 bpp Red (direct relationship between channel value and brightness) */
 #define DRM_FORMAT_R8 fourcc_code('R', '8', ' ', ' ') /* [7:0] R */
 
-/* 10 bpp Red */
+/* 10 bpp Red (direct relationship between channel value and brightness) */
 #define DRM_FORMAT_R10 fourcc_code('R', '1', '0', ' ') /* [15:0] x:R 6:10 little endian */
 
-/* 12 bpp Red */
+/* 12 bpp Red (direct relationship between channel value and brightness) */
 #define DRM_FORMAT_R12 fourcc_code('R', '1', '2', ' ') /* [15:0] x:R 4:12 little endian */
 
-/* 16 bpp Red */
+/* 16 bpp Red (direct relationship between channel value and brightness) */
 #define DRM_FORMAT_R16 fourcc_code('R', '1', '6', ' ') /* [15:0] R little endian */
 
 /* 16 bpp RG */
@@ -204,7 +228,9 @@ extern "C" {
 #define DRM_FORMAT_VYUY fourcc_code('V', 'Y', 'U', 'Y') /* [31:0] Y1:Cb0:Y0:Cr0 8:8:8:8 little endian */
 
 #define DRM_FORMAT_AYUV fourcc_code('A', 'Y', 'U', 'V') /* [31:0] A:Y:Cb:Cr 8:8:8:8 little endian */
+#define DRM_FORMAT_AVUY8888 fourcc_code('A', 'V', 'U', 'Y') /* [31:0] A:Cr:Cb:Y 8:8:8:8 little endian */
 #define DRM_FORMAT_XYUV8888 fourcc_code('X', 'Y', 'U', 'V') /* [31:0] X:Y:Cb:Cr 8:8:8:8 little endian */
+#define DRM_FORMAT_XVUY8888 fourcc_code('X', 'V', 'U', 'Y') /* [31:0] X:Cr:Cb:Y 8:8:8:8 little endian */
 #define DRM_FORMAT_VUY888 fourcc_code('V', 'U', '2', '4') /* [23:0] Cr:Cb:Y 8:8:8 little endian */
 #define DRM_FORMAT_VUY101010 fourcc_code('V', 'U', '3', '0') /* Y followed by U then V, 10:10:10. Non-linear modifier only */
 
@@ -717,6 +743,35 @@ extern "C" {
  */
 #define DRM_FORMAT_MOD_VIVANTE_SPLIT_SUPER_TILED fourcc_mod_code(VIVANTE, 4)
 
+/*
+ * Vivante TS (tile-status) buffer modifiers. They can be combined with all of
+ * the color buffer tiling modifiers defined above. When TS is present it's a
+ * separate buffer containing the clear/compression status of each tile. The
+ * modifiers are defined as VIVANTE_MOD_TS_c_s, where c is the color buffer
+ * tile size in bytes covered by one entry in the status buffer and s is the
+ * number of status bits per entry.
+ * We reserve the top 8 bits of the Vivante modifier space for tile status
+ * clear/compression modifiers, as future cores might add some more TS layout
+ * variations.
+ */
+#define VIVANTE_MOD_TS_64_4 (1ULL << 48)
+#define VIVANTE_MOD_TS_64_2 (2ULL << 48)
+#define VIVANTE_MOD_TS_128_4 (3ULL << 48)
+#define VIVANTE_MOD_TS_256_4 (4ULL << 48)
+#define VIVANTE_MOD_TS_MASK (0xfULL << 48)
+
+/*
+ * Vivante compression modifiers. Those depend on a TS modifier being present
+ * as the TS bits get reinterpreted as compression tags instead of simple
+ * clear markers when compression is enabled.
+ */
+#define VIVANTE_MOD_COMP_DEC400 (1ULL << 52)
+#define VIVANTE_MOD_COMP_MASK (0xfULL << 52)
+
+/* Masking out the extension bits will yield the base modifier. */
+#define VIVANTE_MOD_EXT_MASK (VIVANTE_MOD_TS_MASK | \
+	VIVANTE_MOD_COMP_MASK)
+
 /* NVIDIA frame buffer modifiers */
 
 /*
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 4537da20cc..87176ab075 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -159,8 +159,10 @@ static inline uint32_t ethtool_cmd_speed(const struct ethtool_cmd *ep)
  * in its bus driver structure (e.g. pci_driver::name). Must
  * not be an empty string.
  * @version: Driver version string; may be an empty string
- * @fw_version: Firmware version string; may be an empty string
- * @erom_version: Expansion ROM version string; may be an empty string
+ * @fw_version: Firmware version string; driver defined; may be an
+ * empty string
+ * @erom_version: Expansion ROM version string; driver defined; may be
+ * an empty string
  * @bus_info: Device bus address. This should match the dev_name()
  * string for the underlying bus device, if there is one. May be
  * an empty string.
@@ -179,10 +181,6 @@ static inline uint32_t ethtool_cmd_speed(const struct ethtool_cmd *ep)
  *
  * Users can use the %ETHTOOL_GSSET_INFO command to get the number of
  * strings in any string set (from Linux 2.6.34).
- *
- * Drivers should set at most @driver, @version, @fw_version and
- * @bus_info in their get_drvinfo() implementation. The ethtool
- * core fills in the other fields using other driver operations.
  */
 struct ethtool_drvinfo {
 	uint32_t cmd;
@@ -736,6 +734,51 @@ enum ethtool_module_power_mode {
 	ETHTOOL_MODULE_POWER_MODE_HIGH,
 };
 
+/**
+ * enum ethtool_podl_pse_admin_state - operational state of the PoDL PSE
+ * functions. IEEE 802.3-2018 30.15.1.1.2 aPoDLPSEAdminState
+ * @ETHTOOL_PODL_PSE_ADMIN_STATE_UNKNOWN: state of PoDL PSE functions are
+ * unknown
+ * @ETHTOOL_PODL_PSE_ADMIN_STATE_DISABLED: PoDL PSE functions are disabled
+ * @ETHTOOL_PODL_PSE_ADMIN_STATE_ENABLED: PoDL PSE functions are enabled
+ */
+enum ethtool_podl_pse_admin_state {
+	ETHTOOL_PODL_PSE_ADMIN_STATE_UNKNOWN = 1,
+	ETHTOOL_PODL_PSE_ADMIN_STATE_DISABLED,
+	ETHTOOL_PODL_PSE_ADMIN_STATE_ENABLED,
+};
+
+/**
+ * enum ethtool_podl_pse_pw_d_status - power detection status of the PoDL PSE.
+ * IEEE 802.3-2018 30.15.1.1.3 aPoDLPSEPowerDetectionStatus:
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_UNKNOWN: PoDL PSE
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_DISABLED: "The enumeration “disabled” is
+ * asserted true when the PoDL PSE state diagram variable mr_pse_enable is
+ * false"
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_SEARCHING: "The enumeration “searching” is
+ * asserted true when either of the PSE state diagram variables
+ * pi_detecting or pi_classifying is true."
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_DELIVERING: "The enumeration “deliveringPower”
+ * is asserted true when the PoDL PSE state diagram variable pi_powered is
+ * true."
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_SLEEP: "The enumeration “sleep” is asserted
+ * true when the PoDL PSE state diagram variable pi_sleeping is true."
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_IDLE: "The enumeration “idle” is asserted true
+ * when the logical combination of the PoDL PSE state diagram variables
+ * pi_prebiased*!pi_sleeping is true."
+ * @ETHTOOL_PODL_PSE_PW_D_STATUS_ERROR: "The enumeration “error” is asserted
+ * true when the PoDL PSE state diagram variable overload_held is true."
+ */
+enum ethtool_podl_pse_pw_d_status {
+	ETHTOOL_PODL_PSE_PW_D_STATUS_UNKNOWN = 1,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_DISABLED,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_SEARCHING,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_DELIVERING,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_SLEEP,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_IDLE,
+	ETHTOOL_PODL_PSE_PW_D_STATUS_ERROR,
+};
+
 /**
  * struct ethtool_gstrings - string set for data tagging
  * @cmd: Command number = %ETHTOOL_GSTRINGS
@@ -1692,6 +1735,13 @@ enum ethtool_link_mode_bit_indices {
 	ETHTOOL_LINK_MODE_100baseFX_Half_BIT = 90,
 	ETHTOOL_LINK_MODE_100baseFX_Full_BIT = 91,
 	ETHTOOL_LINK_MODE_10baseT1L_Full_BIT = 92,
+	ETHTOOL_LINK_MODE_800000baseCR8_Full_BIT = 93,
+	ETHTOOL_LINK_MODE_800000baseKR8_Full_BIT = 94,
+	ETHTOOL_LINK_MODE_800000baseDR8_Full_BIT = 95,
+	ETHTOOL_LINK_MODE_800000baseDR8_2_Full_BIT = 96,
+	ETHTOOL_LINK_MODE_800000baseSR8_Full_BIT = 97,
+	ETHTOOL_LINK_MODE_800000baseVR8_Full_BIT = 98,
+
 	/* must be last entry */
 	__ETHTOOL_LINK_MODE_MASK_NBITS
 };
@@ -1803,6 +1853,7 @@ enum ethtool_link_mode_bit_indices {
 #define SPEED_100000 100000
 #define SPEED_200000 200000
 #define SPEED_400000 400000
+#define SPEED_800000 800000
 
 #define SPEED_UNKNOWN -1
 
@@ -1840,6 +1891,20 @@ static inline int ethtool_validate_duplex(uint8_t duplex)
 #define MASTER_SLAVE_STATE_SLAVE 3
 #define MASTER_SLAVE_STATE_ERR 4
 
+/* These are used to throttle the rate of data on the phy interface when the
+ * native speed of the interface is higher than the link speed. These should
+ * not be used for phy interfaces which natively support multiple speeds (e.g.
+ * MII or SGMII).
+ */
+/* No rate matching performed. */
+#define RATE_MATCH_NONE 0
+/* The phy sends pause frames to throttle the MAC. */
+#define RATE_MATCH_PAUSE 1
+/* The phy asserts CRS to prevent the MAC from transmitting. */
+#define RATE_MATCH_CRS 2
+/* The MAC is programmed with a sufficiently-large IPG. */
+#define RATE_MATCH_OPEN_LOOP 3
+
 /* Which connector port. */
 #define PORT_TP 0x00
 #define PORT_AUI 0x01
@@ -2033,8 +2098,8 @@ enum ethtool_reset_flags {
  * reported consistently by PHYLIB. Read-only.
  * @master_slave_cfg: Master/slave port mode.
  * @master_slave_state: Master/slave port state.
+ * @rate_matching: Rate adaptation performed by the PHY
  * @reserved: Reserved for future use; see the note on reserved space.
- * @reserved1: Reserved for future use; see the note on reserved space.
  * @link_mode_masks: Variable length bitmaps.
  *
  * If autonegotiation is disabled, the speed and @duplex represent the
@@ -2085,7 +2150,7 @@ struct ethtool_link_settings {
 	uint8_t transceiver;
 	uint8_t master_slave_cfg;
 	uint8_t master_slave_state;
-	uint8_t reserved1[1];
+	uint8_t rate_matching;
 	uint32_t reserved[7];
 	uint32_t link_mode_masks[];
 	/* layout of link_mode_masks fields:
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
index bda06258be..a1af78d989 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -194,6 +194,13 @@
  * - add FUSE_SECURITY_CTX init flag
  * - add security context to create, mkdir, symlink, and mknod requests
  * - add FUSE_HAS_INODE_DAX, FUSE_ATTR_DAX
+ *
+ * 7.37
+ * - add FUSE_TMPFILE
+ *
+ * 7.38
+ * - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry
+ * - add FOPEN_PARALLEL_DIRECT_WRITES
  */
 
 #ifndef _LINUX_FUSE_H
@@ -225,7 +232,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 36
+#define FUSE_KERNEL_MINOR_VERSION 38
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -297,6 +304,7 @@ struct fuse_file_lock {
  * FOPEN_CACHE_DIR: allow caching this directory
  * FOPEN_STREAM: the file is stream-like (no file position at all)
  * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
+ * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
  */
 #define FOPEN_DIRECT_IO (1 << 0)
 #define FOPEN_KEEP_CACHE (1 << 1)
@@ -304,6 +312,7 @@ struct fuse_file_lock {
 #define FOPEN_CACHE_DIR (1 << 3)
 #define FOPEN_STREAM (1 << 4)
 #define FOPEN_NOFLUSH (1 << 5)
+#define FOPEN_PARALLEL_DIRECT_WRITES (1 << 6)
 
 /**
  * INIT request/reply flags
@@ -484,6 +493,12 @@ struct fuse_file_lock {
  */
 #define FUSE_SETXATTR_ACL_KILL_SGID (1 << 0)
 
+/**
+ * notify_inval_entry flags
+ * FUSE_EXPIRE_ONLY
+ */
+#define FUSE_EXPIRE_ONLY (1 << 0)
+
 enum fuse_opcode {
 	FUSE_LOOKUP = 1,
 	FUSE_FORGET = 2, /* no reply */
@@ -533,6 +548,7 @@ enum fuse_opcode {
 	FUSE_SETUPMAPPING = 48,
 	FUSE_REMOVEMAPPING = 49,
 	FUSE_SYNCFS = 50,
+	FUSE_TMPFILE = 51,
 
 	/* CUSE specific operations */
 	CUSE_INIT = 4096,
@@ -911,7 +927,7 @@ struct fuse_notify_inval_inode_out {
 struct fuse_notify_inval_entry_out {
 	uint64_t parent;
 	uint32_t namelen;
-	uint32_t padding;
+	uint32_t flags;
 };
 
 struct fuse_notify_delete_out {
diff --git a/include/standard-headers/linux/input-event-codes.h b/include/standard-headers/linux/input-event-codes.h
index 50790aee5a..f6bab08540 100644
--- a/include/standard-headers/linux/input-event-codes.h
+++ b/include/standard-headers/linux/input-event-codes.h
@@ -614,6 +614,9 @@
 #define KEY_KBD_LAYOUT_NEXT 0x248 /* AC Next Keyboard Layout Select */
 #define KEY_EMOJI_PICKER 0x249 /* Show/hide emoji picker (HUTRR101) */
 #define KEY_DICTATE 0x24a /* Start or Stop Voice Dictation Session (HUTRR99) */
+#define KEY_CAMERA_ACCESS_ENABLE 0x24b /* Enables programmatic access to camera devices. (HUTRR72) */
+#define KEY_CAMERA_ACCESS_DISABLE 0x24c /* Disables programmatic access to camera devices. (HUTRR72) */
+#define KEY_CAMERA_ACCESS_TOGGLE 0x24d /* Toggles the current state of the camera access control. (HUTRR72) */
 
 #define KEY_BRIGHTNESS_MIN 0x250 /* Set Brightness to Minimum */
 #define KEY_BRIGHTNESS_MAX 0x251 /* Set Brightness to Maximum */
@@ -862,6 +865,7 @@
 #define ABS_TOOL_WIDTH 0x1c
 
 #define ABS_VOLUME 0x20
+#define ABS_PROFILE 0x21
 
 #define ABS_MISC 0x28
 
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index 57b8e2ffb1..85ab127881 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -1058,6 +1058,7 @@
 /* Precision Time Measurement */
 #define PCI_PTM_CAP 0x04 /* PTM Capability */
 #define PCI_PTM_CAP_REQ 0x00000001 /* Requester capable */
+#define PCI_PTM_CAP_RES 0x00000002 /* Responder capable */
 #define PCI_PTM_CAP_ROOT 0x00000004 /* Root capable */
 #define PCI_PTM_GRANULARITY_MASK 0x0000FF00 /* Clock granularity */
 #define PCI_PTM_CTRL 0x08 /* PTM Control */
@@ -1119,6 +1120,7 @@
 #define PCI_DOE_STATUS_DATA_OBJECT_READY 0x80000000 /* Data Object Ready */
 #define PCI_DOE_WRITE 0x10 /* DOE Write Data Mailbox Register */
 #define PCI_DOE_READ 0x14 /* DOE Read Data Mailbox Register */
+#define PCI_DOE_CAP_SIZEOF 0x18 /* Size of DOE register block */
 
 /* DOE Data Object - note not actually registers */
 #define PCI_DOE_DATA_OBJECT_HEADER_1_VID 0x0000ffff
diff --git a/include/standard-headers/linux/virtio_blk.h b/include/standard-headers/linux/virtio_blk.h
index 2dcc90826a..e81715cd70 100644
--- a/include/standard-headers/linux/virtio_blk.h
+++ b/include/standard-headers/linux/virtio_blk.h
@@ -40,6 +40,7 @@
 #define VIRTIO_BLK_F_MQ 12 /* support more than one vq */
 #define VIRTIO_BLK_F_DISCARD 13 /* DISCARD is supported */
 #define VIRTIO_BLK_F_WRITE_ZEROES 14 /* WRITE ZEROES is supported */
+#define VIRTIO_BLK_F_SECURE_ERASE 16 /* Secure Erase is supported */
 
 /* Legacy feature bits */
 #ifndef VIRTIO_BLK_NO_LEGACY
@@ -119,6 +120,21 @@ struct virtio_blk_config {
 	uint8_t write_zeroes_may_unmap;
 
 	uint8_t unused1[3];
+
+	/* the next 3 entries are guarded by VIRTIO_BLK_F_SECURE_ERASE */
+	/*
+	 * The maximum secure erase sectors (in 512-byte sectors) for
+	 * one segment.
+	 */
+	__virtio32 max_secure_erase_sectors;
+	/*
+	 * The maximum number of secure erase segments in a
+	 * secure erase command.
+	 */
+	__virtio32 max_secure_erase_seg;
+	/* Secure erase commands must be aligned to this number of sectors. */
+	__virtio32 secure_erase_sector_alignment;
+
 } QEMU_PACKED;
 
 /*
@@ -153,6 +169,9 @@ struct virtio_blk_config {
 /* Write zeroes command */
 #define VIRTIO_BLK_T_WRITE_ZEROES 13
 
+/* Secure erase command */
+#define VIRTIO_BLK_T_SECURE_ERASE 14
+
 #ifndef VIRTIO_BLK_NO_LEGACY
 /* Barrier before this op. */
 #define VIRTIO_BLK_T_BARRIER 0x80000000
diff --git a/include/standard-headers/linux/virtio_bt.h b/include/standard-headers/linux/virtio_bt.h
index 245e1eff4b..a11ecc3f92 100644
--- a/include/standard-headers/linux/virtio_bt.h
+++ b/include/standard-headers/linux/virtio_bt.h
@@ -9,6 +9,7 @@
 #define VIRTIO_BT_F_VND_HCI 0 /* Indicates vendor command support */
 #define VIRTIO_BT_F_MSFT_EXT 1 /* Indicates MSFT vendor support */
 #define VIRTIO_BT_F_AOSP_EXT 2 /* Indicates AOSP vendor support */
+#define VIRTIO_BT_F_CONFIG_V2 3 /* Use second version configuration */
 
 enum virtio_bt_config_type {
 	VIRTIO_BT_CONFIG_TYPE_PRIMARY = 0,
@@ -28,4 +29,11 @@ struct virtio_bt_config {
 	uint16_t msft_opcode;
 } QEMU_PACKED;
 
+struct virtio_bt_config_v2 {
+	uint8_t type;
+	uint8_t alignment;
+	uint16_t vendor;
+	uint16_t msft_opcode;
+};
+
 #endif /* _LINUX_VIRTIO_BT_H */
diff --git a/include/standard-headers/linux/virtio_net.h b/include/standard-headers/linux/virtio_net.h
index 42c68caf71..c0e797067a 100644
--- a/include/standard-headers/linux/virtio_net.h
+++ b/include/standard-headers/linux/virtio_net.h
@@ -57,6 +57,9 @@
  * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23 /* Set MAC address */
 #define VIRTIO_NET_F_NOTF_COAL 53 /* Device supports notifications coalescing */
+#define VIRTIO_NET_F_GUEST_USO4 54 /* Guest can handle USOv4 in. */
+#define VIRTIO_NET_F_GUEST_USO6 55 /* Guest can handle USOv6 in. */
+#define VIRTIO_NET_F_HOST_USO 56 /* Host can handle USO in. */
 #define VIRTIO_NET_F_HASH_REPORT 57 /* Supports hash report */
 #define VIRTIO_NET_F_RSS 60 /* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT 61 /* extended coalescing info */
@@ -130,6 +133,7 @@ struct virtio_net_hdr_v1 {
 #define VIRTIO_NET_HDR_GSO_TCPV4 1 /* GSO frame, IPv4 TCP (TSO) */
 #define VIRTIO_NET_HDR_GSO_UDP 3 /* GSO frame, IPv4 UDP (UFO) */
 #define VIRTIO_NET_HDR_GSO_TCPV6 4 /* GSO frame, IPv6 TCP */
+#define VIRTIO_NET_HDR_GSO_UDP_L4 5 /* GSO frame, IPv4& IPv6 UDP (USO) */
 #define VIRTIO_NET_HDR_GSO_ECN 0x80 /* TCP has ECN set */
 	uint8_t gso_type;
 	__virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 4bf2d7246e..a7cfefb3a8 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -43,6 +43,7 @@
 #define __KVM_HAVE_VCPU_EVENTS
 
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
+#define KVM_DIRTY_LOG_PAGE_OFFSET 64
 
 #define KVM_REG_SIZE(id) \
 	(1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT))
diff --git a/linux-headers/asm-generic/hugetlb_encode.h b/linux-headers/asm-generic/hugetlb_encode.h
index 4f3d5aaa11..de687009bf 100644
--- a/linux-headers/asm-generic/hugetlb_encode.h
+++ b/linux-headers/asm-generic/hugetlb_encode.h
@@ -20,18 +20,18 @@
 #define HUGETLB_FLAG_ENCODE_SHIFT 26
 #define HUGETLB_FLAG_ENCODE_MASK 0x3f
 
-#define HUGETLB_FLAG_ENCODE_16KB (14 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_64KB (16 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_512KB (19 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_1MB (20 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_2MB (21 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_8MB (23 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_16MB (24 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_32MB (25 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_256MB (28 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_512MB (29 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_1GB (30 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_2GB (31 << HUGETLB_FLAG_ENCODE_SHIFT)
-#define HUGETLB_FLAG_ENCODE_16GB (34 << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16KB (14U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_64KB (16U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_512KB (19U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_1MB (20U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_2MB (21U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_8MB (23U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16MB (24U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_32MB (25U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_256MB (28U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_512MB (29U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_1GB (30U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_2GB (31U << HUGETLB_FLAG_ENCODE_SHIFT)
+#define HUGETLB_FLAG_ENCODE_16GB (34U << HUGETLB_FLAG_ENCODE_SHIFT)
 
 #endif /* _ASM_GENERIC_HUGETLB_ENCODE_H_ */
diff --git a/linux-headers/asm-generic/mman-common.h b/linux-headers/asm-generic/mman-common.h
index 6c1aa92a92..996e8ded09 100644
--- a/linux-headers/asm-generic/mman-common.h
+++ b/linux-headers/asm-generic/mman-common.h
@@ -77,6 +77,10 @@
 
 #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */
 
+#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */
+
+#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE 0
 
diff --git a/linux-headers/asm-mips/mman.h b/linux-headers/asm-mips/mman.h
index 1be428663c..f8a74a3a09 100644
--- a/linux-headers/asm-mips/mman.h
+++ b/linux-headers/asm-mips/mman.h
@@ -103,6 +103,10 @@
 
 #define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */
 
+#define MADV_COLLAPSE 25 /* Synchronous hugepage collapse */
+
+#define MADV_SPLIT 26 /* Enable hugepage high-granularity APIs */
+
 /* compatibility flags */
 #define MAP_FILE 0
 
diff --git a/linux-headers/asm-riscv/kvm.h b/linux-headers/asm-riscv/kvm.h
index 7351417afd..92af6f3f05 100644
--- a/linux-headers/asm-riscv/kvm.h
+++ b/linux-headers/asm-riscv/kvm.h
@@ -48,6 +48,10 @@ struct kvm_sregs {
 /* CONFIG registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
 struct kvm_riscv_config {
 	unsigned long isa;
+	unsigned long zicbom_block_size;
+	unsigned long mvendorid;
+	unsigned long marchid;
+	unsigned long mimpid;
 };
 
 /* CORE registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
@@ -98,6 +102,9 @@ enum KVM_RISCV_ISA_EXT_ID {
 	KVM_RISCV_ISA_EXT_M,
 	KVM_RISCV_ISA_EXT_SVPBMT,
 	KVM_RISCV_ISA_EXT_SSTC,
+	KVM_RISCV_ISA_EXT_SVINVAL,
+	KVM_RISCV_ISA_EXT_ZIHINTPAUSE,
+	KVM_RISCV_ISA_EXT_ZICBOM,
 	KVM_RISCV_ISA_EXT_MAX,
 };
 
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index 46de10a809..2747d2ce14 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -53,14 +53,6 @@
 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
 
-struct kvm_memory_alias {
-	__u32 slot; /* this has a different namespace than memory slots */
-	__u32 flags;
-	__u64 guest_phys_addr;
-	__u64 memory_size;
-	__u64 target_phys_addr;
-};
-
 /* for KVM_GET_IRQCHIP and KVM_SET_IRQCHIP */
 struct kvm_pic_state {
 	__u8 last_irr; /* edge detection */
@@ -214,6 +206,8 @@ struct kvm_msr_list {
 struct kvm_msr_filter_range {
 #define KVM_MSR_FILTER_READ (1 << 0)
 #define KVM_MSR_FILTER_WRITE (1 << 1)
+#define KVM_MSR_FILTER_RANGE_VALID_MASK (KVM_MSR_FILTER_READ | \
+	KVM_MSR_FILTER_WRITE)
 	__u32 flags;
 	__u32 nmsrs; /* number of msrs in bitmap */
 	__u32 base; /* MSR index the bitmap starts at */
@@ -224,6 +218,7 @@ struct kvm_msr_filter_range {
 struct kvm_msr_filter {
 #define KVM_MSR_FILTER_DEFAULT_ALLOW (0 << 0)
 #define KVM_MSR_FILTER_DEFAULT_DENY (1 << 0)
+#define KVM_MSR_FILTER_VALID_MASK (KVM_MSR_FILTER_DEFAULT_DENY)
 	__u32 flags;
 	struct kvm_msr_filter_range ranges[KVM_MSR_FILTER_MAX_RANGES];
 };
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index ebdafa576d..30b2795d10 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -86,14 +86,6 @@ struct kvm_debug_guest {
 
 /* *** End of deprecated interfaces *** */
 
-/* for KVM_CREATE_MEMORY_REGION */
-struct kvm_memory_region {
-	__u32 slot;
-	__u32 flags;
-	__u64 guest_phys_addr;
-	__u64 memory_size; /* bytes */
-};
-
 /* for KVM_SET_USER_MEMORY_REGION */
 struct kvm_userspace_memory_region {
 	__u32 slot;
@@ -104,9 +96,9 @@ struct kvm_userspace_memory_region {
 };
 
 /*
- * The bit 0 ~ bit 15 of kvm_memory_region::flags are visible for userspace,
- * other bits are reserved for kvm internal use which are defined in
- * include/linux/kvm_host.h.
+ * The bit 0 ~ bit 15 of kvm_userspace_memory_region::flags are visible for
+ * userspace, other bits are reserved for kvm internal use which are defined
+ * in include/linux/kvm_host.h.
  */
 #define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
 #define KVM_MEM_READONLY (1UL << 1)
@@ -483,6 +475,9 @@ struct kvm_run {
 #define KVM_MSR_EXIT_REASON_INVAL (1 << 0)
 #define KVM_MSR_EXIT_REASON_UNKNOWN (1 << 1)
 #define KVM_MSR_EXIT_REASON_FILTER (1 << 2)
+#define KVM_MSR_EXIT_REASON_VALID_MASK (KVM_MSR_EXIT_REASON_INVAL | \
+	KVM_MSR_EXIT_REASON_UNKNOWN | \
+	KVM_MSR_EXIT_REASON_FILTER)
 	__u32 reason; /* kernel -> user */
 	__u32 index; /* kernel -> user */
 	__u64 data; /* kernel <-> user */
@@ -1175,6 +1170,9 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 220
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
+#define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
+#define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
+#define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1264,6 +1262,7 @@ struct kvm_x86_mce {
 #define KVM_XEN_HVM_CONFIG_RUNSTATE (1 << 3)
 #define KVM_XEN_HVM_CONFIG_EVTCHN_2LEVEL (1 << 4)
 #define KVM_XEN_HVM_CONFIG_EVTCHN_SEND (1 << 5)
+#define KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG (1 << 6)
 
 struct kvm_xen_hvm_config {
 	__u32 flags;
@@ -1434,18 +1433,12 @@ struct kvm_vfio_spapr_tce {
 	__s32 tablefd;
 };
 
-/*
- * ioctls for VM fds
- */
-#define KVM_SET_MEMORY_REGION _IOW(KVMIO, 0x40, struct kvm_memory_region)
 /*
  * KVM_CREATE_VCPU receives as a parameter the vcpu slot, and returns
  * a vcpu fd.
*/ #define KVM_CREATE_VCPU _IO(KVMIO, 0x41) #define KVM_GET_DIRTY_LOG _IOW(KVMIO, 0x42, struct kvm_dirty_log) -/* KVM_SET_MEMORY_ALIAS is obsolete: */ -#define KVM_SET_MEMORY_ALIAS _IOW(KVMIO, 0x43, struct kvm_memory_alias) #define KVM_SET_NR_MMU_PAGES _IO(KVMIO, 0x44) #define KVM_GET_NR_MMU_PAGES _IO(KVMIO, 0x45) #define KVM_SET_USER_MEMORY_REGION _IOW(KVMIO, 0x46, \ @@ -1737,6 +1730,8 @@ enum pv_cmd_id { KVM_PV_UNSHARE_ALL, KVM_PV_INFO, KVM_PV_DUMP, + KVM_PV_ASYNC_CLEANUP_PREPARE, + KVM_PV_ASYNC_CLEANUP_PERFORM, }; struct kvm_pv_cmd { @@ -1767,6 +1762,7 @@ struct kvm_xen_hvm_attr { union { __u8 long_mode; __u8 vector; + __u8 runstate_update_flag; struct { __u64 gfn; } shared_info; @@ -1807,6 +1803,8 @@ struct kvm_xen_hvm_attr { /* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_EVTCHN_SEND */ #define KVM_XEN_ATTR_TYPE_EVTCHN 0x3 #define KVM_XEN_ATTR_TYPE_XEN_VERSION 0x4 +/* Available with KVM_CAP_XEN_HVM / KVM_XEN_HVM_CONFIG_RUNSTATE_UPDATE_FLAG */ +#define KVM_XEN_ATTR_TYPE_RUNSTATE_UPDATE_FLAG 0x5 /* Per-vCPU Xen attributes */ #define KVM_XEN_VCPU_GET_ATTR _IOWR(KVMIO, 0xca, struct kvm_xen_vcpu_attr) diff --git a/linux-headers/linux/psci.h b/linux-headers/linux/psci.h index 213b2a0f70..e60dfd8907 100644 --- a/linux-headers/linux/psci.h +++ b/linux-headers/linux/psci.h @@ -48,12 +48,26 @@ #define PSCI_0_2_FN64_MIGRATE_INFO_UP_CPU PSCI_0_2_FN64(7) #define PSCI_1_0_FN_PSCI_FEATURES PSCI_0_2_FN(10) +#define PSCI_1_0_FN_CPU_FREEZE PSCI_0_2_FN(11) +#define PSCI_1_0_FN_CPU_DEFAULT_SUSPEND PSCI_0_2_FN(12) +#define PSCI_1_0_FN_NODE_HW_STATE PSCI_0_2_FN(13) #define PSCI_1_0_FN_SYSTEM_SUSPEND PSCI_0_2_FN(14) #define PSCI_1_0_FN_SET_SUSPEND_MODE PSCI_0_2_FN(15) +#define PSCI_1_0_FN_STAT_RESIDENCY PSCI_0_2_FN(16) +#define PSCI_1_0_FN_STAT_COUNT PSCI_0_2_FN(17) + #define PSCI_1_1_FN_SYSTEM_RESET2 PSCI_0_2_FN(18) +#define PSCI_1_1_FN_MEM_PROTECT PSCI_0_2_FN(19) +#define PSCI_1_1_FN_MEM_PROTECT_CHECK_RANGE PSCI_0_2_FN(19) +#define PSCI_1_0_FN64_CPU_DEFAULT_SUSPEND 
PSCI_0_2_FN64(12) +#define PSCI_1_0_FN64_NODE_HW_STATE PSCI_0_2_FN64(13) #define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14) +#define PSCI_1_0_FN64_STAT_RESIDENCY PSCI_0_2_FN64(16) +#define PSCI_1_0_FN64_STAT_COUNT PSCI_0_2_FN64(17) + #define PSCI_1_1_FN64_SYSTEM_RESET2 PSCI_0_2_FN64(18) +#define PSCI_1_1_FN64_MEM_PROTECT_CHECK_RANGE PSCI_0_2_FN64(19) /* PSCI v0.2 power state encoding for CPU_SUSPEND function */ #define PSCI_0_2_POWER_STATE_ID_MASK 0xffff diff --git a/linux-headers/linux/userfaultfd.h b/linux-headers/linux/userfaultfd.h index a3a377cd44..ba5d0df52f 100644 --- a/linux-headers/linux/userfaultfd.h +++ b/linux-headers/linux/userfaultfd.h @@ -12,6 +12,10 @@ #include +/* ioctls for /dev/userfaultfd */ +#define USERFAULTFD_IOC 0xAA +#define USERFAULTFD_IOC_NEW _IO(USERFAULTFD_IOC, 0x00) + /* * If the UFFDIO_API is upgraded someday, the UFFDIO_UNREGISTER and * UFFDIO_WAKE ioctls should be defined as _IOW and not as _IOR. In diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h index ede44b5572..c59692ce0b 100644 --- a/linux-headers/linux/vfio.h +++ b/linux-headers/linux/vfio.h @@ -819,12 +819,20 @@ struct vfio_device_feature { * VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P means that RUNNING_P2P * is supported in addition to the STOP_COPY states. * + * VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_PRE_COPY means that + * PRE_COPY is supported in addition to the STOP_COPY states. + * + * VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P | VFIO_MIGRATION_PRE_COPY + * means that RUNNING_P2P, PRE_COPY and PRE_COPY_P2P are supported + * in addition to the STOP_COPY states. + * * Other combinations of flags have behavior to be defined in the future. 
*/ struct vfio_device_feature_migration { __aligned_u64 flags; #define VFIO_MIGRATION_STOP_COPY (1 << 0) #define VFIO_MIGRATION_P2P (1 << 1) +#define VFIO_MIGRATION_PRE_COPY (1 << 2) }; #define VFIO_DEVICE_FEATURE_MIGRATION 1 @@ -875,8 +883,13 @@ struct vfio_device_feature_mig_state { * RESUMING - The device is stopped and is loading a new internal state * ERROR - The device has failed and must be reset * - * And 1 optional state to support VFIO_MIGRATION_P2P: + * And optional states to support VFIO_MIGRATION_P2P: * RUNNING_P2P - RUNNING, except the device cannot do peer to peer DMA + * And VFIO_MIGRATION_PRE_COPY: + * PRE_COPY - The device is running normally but tracking internal state + * changes + * And VFIO_MIGRATION_P2P | VFIO_MIGRATION_PRE_COPY: + * PRE_COPY_P2P - PRE_COPY, except the device cannot do peer to peer DMA * * The FSM takes actions on the arcs between FSM states. The driver implements * the following behavior for the FSM arcs: @@ -908,20 +921,48 @@ struct vfio_device_feature_mig_state { * * To abort a RESUMING session the device must be reset. * + * PRE_COPY -> RUNNING * RUNNING_P2P -> RUNNING * While in RUNNING the device is fully operational, the device may generate * interrupts, DMA, respond to MMIO, all vfio device regions are functional, * and the device may advance its internal state. * + * The PRE_COPY arc will terminate a data transfer session. + * + * PRE_COPY_P2P -> RUNNING_P2P * RUNNING -> RUNNING_P2P * STOP -> RUNNING_P2P * While in RUNNING_P2P the device is partially running in the P2P quiescent * state defined below. * + * The PRE_COPY_P2P arc will terminate a data transfer session. + * + * RUNNING -> PRE_COPY + * RUNNING_P2P -> PRE_COPY_P2P * STOP -> STOP_COPY - * This arc begin the process of saving the device state and will return a - * new data_fd. + * PRE_COPY, PRE_COPY_P2P and STOP_COPY form the "saving group" of states + * which share a data transfer session. 
Moving between these states alters + * what is streamed in session, but does not terminate or otherwise affect + * the associated fd. + * + * These arcs begin the process of saving the device state and will return a + * new data_fd. The migration driver may perform actions such as enabling + * dirty logging of device state when entering PRE_COPY or PRE_COPY_P2P. + * + * Each arc does not change the device operation, the device remains + * RUNNING, P2P quiesced or in STOP. The STOP_COPY state is described below + * in PRE_COPY_P2P -> STOP_COPY. + * + * PRE_COPY -> PRE_COPY_P2P + * Entering PRE_COPY_P2P continues all the behaviors of PRE_COPY above. + * However, while in the PRE_COPY_P2P state, the device is partially running + * in the P2P quiescent state defined below, like RUNNING_P2P. + * + * PRE_COPY_P2P -> PRE_COPY + * This arc allows returning the device to a full RUNNING behavior while + * continuing all the behaviors of PRE_COPY. * + * PRE_COPY_P2P -> STOP_COPY * While in the STOP_COPY state the device has the same behavior as STOP * with the addition that the data transfers session continues to stream the * migration state. End of stream on the FD indicates the entire device @@ -939,6 +980,13 @@ struct vfio_device_feature_mig_state { * device state for this arc if required to prepare the device to receive the * migration data. * + * STOP_COPY -> PRE_COPY + * STOP_COPY -> PRE_COPY_P2P + * These arcs are not permitted and return error if requested. Future + * revisions of this API may define behaviors for these arcs, in this case + * support will be discoverable by a new flag in + * VFIO_DEVICE_FEATURE_MIGRATION.
+ * * any -> ERROR * ERROR cannot be specified as a device state, however any transition request * can be failed with an errno return and may then move the device_state into @@ -950,7 +998,7 @@ struct vfio_device_feature_mig_state { * The optional peer to peer (P2P) quiescent state is intended to be a quiescent * state for the device for the purposes of managing multiple devices within a * user context where peer-to-peer DMA between devices may be active. The - * RUNNING_P2P states must prevent the device from initiating + * RUNNING_P2P and PRE_COPY_P2P states must prevent the device from initiating * any new P2P DMA transactions. If the device can identify P2P transactions * then it can stop only P2P DMA, otherwise it must stop all DMA. The migration * driver must complete any such outstanding operations prior to completing the @@ -963,6 +1011,8 @@ struct vfio_device_feature_mig_state { * above FSM arcs. As there are multiple paths through the FSM arcs the path * should be selected based on the following rules: * - Select the shortest path. + * - The path cannot have saving group states as interior arcs, only + * starting/end states. * Refer to vfio_mig_get_next_state() for the result of the algorithm. * * The automatic transit through the FSM arcs that make up the combination @@ -976,6 +1026,9 @@ struct vfio_device_feature_mig_state { * support them. The user can discover if these states are supported by using * VFIO_DEVICE_FEATURE_MIGRATION. By using combination transitions the user can * avoid knowing about these optional states if the kernel driver supports them. + * + * Arcs touching PRE_COPY and PRE_COPY_P2P are removed if support for PRE_COPY + * is not present. 
*/ enum vfio_device_mig_state { VFIO_DEVICE_STATE_ERROR = 0, @@ -984,8 +1037,225 @@ enum vfio_device_mig_state { VFIO_DEVICE_STATE_STOP_COPY = 3, VFIO_DEVICE_STATE_RESUMING = 4, VFIO_DEVICE_STATE_RUNNING_P2P = 5, + VFIO_DEVICE_STATE_PRE_COPY = 6, + VFIO_DEVICE_STATE_PRE_COPY_P2P = 7, +}; + +/** + * VFIO_MIG_GET_PRECOPY_INFO - _IO(VFIO_TYPE, VFIO_BASE + 21) + * + * This ioctl is used on the migration data FD in the precopy phase of the + * migration data transfer. It returns an estimate of the current data sizes + * remaining to be transferred. It allows the user to judge when it is + * appropriate to leave PRE_COPY for STOP_COPY. + * + * This ioctl is valid only in PRE_COPY states and kernel driver should + * return -EINVAL from any other migration state. + * + * The vfio_precopy_info data structure returned by this ioctl provides + * estimates of data available from the device during the PRE_COPY states. + * This estimate is split into two categories, initial_bytes and + * dirty_bytes. + * + * The initial_bytes field indicates the amount of initial precopy + * data available from the device. This field should have a non-zero initial + * value and decrease as migration data is read from the device. + * It is recommended to leave PRE_COPY for STOP_COPY only after this field + * reaches zero. Leaving PRE_COPY earlier might make things slower. + * + * The dirty_bytes field tracks device state changes relative to data + * previously retrieved. This field starts at zero and may increase as + * the internal device state is modified or decrease as that modified + * state is read from the device. + * + * Userspace may use the combination of these fields to estimate the + * potential data size available during the PRE_COPY phases, as well as + * trends relative to the rate the device is dirtying its internal + * state, but these fields are not required to have any bearing relative + * to the data size available during the STOP_COPY phase. 
+ * + * Drivers have a lot of flexibility in when and what they transfer during the + * PRE_COPY phase, and how they report this from VFIO_MIG_GET_PRECOPY_INFO. + * + * During pre-copy the migration data FD has a temporary "end of stream" that is + * reached when both initial_bytes and dirty_bytes are zero. For instance, this + * may indicate that the device is idle and not currently dirtying any internal + * state. When read() is done on this temporary end of stream the kernel driver + * should return ENOMSG from read(). Userspace can wait for more data (which may + * never come) by using poll. + * + * Once in STOP_COPY the migration data FD has a permanent end of stream + * signaled in the usual way by read() always returning 0 and poll always + * returning readable. ENOMSG may not be returned in STOP_COPY. + * Support for this ioctl is mandatory if a driver claims to support + * VFIO_MIGRATION_PRE_COPY. + * + * Return: 0 on success, -1 and errno set on failure. + */ +struct vfio_precopy_info { + __u32 argsz; + __u32 flags; + __aligned_u64 initial_bytes; + __aligned_u64 dirty_bytes; +}; + +#define VFIO_MIG_GET_PRECOPY_INFO _IO(VFIO_TYPE, VFIO_BASE + 21) + +/* + * Upon VFIO_DEVICE_FEATURE_SET, allow the device to be moved into a low power + * state with the platform-based power management. Device use of lower power + * states depends on factors managed by the runtime power management core, + * including system level support and coordinating support among dependent + * devices. Enabling device low power entry does not guarantee lower power + * usage by the device, nor is a mechanism provided through this feature to + * know the current power state of the device. If any device access happens + * (either from the host or through the vfio uAPI) when the device is in the + * low power state, then the host will move the device out of the low power + * state as necessary prior to the access. Once the access is completed, the + * device may re-enter the low power state.
For single shot low power support + * with wake-up notification, see + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP below. Access to mmap'd + * device regions is disabled on LOW_POWER_ENTRY and may only be resumed after + * calling LOW_POWER_EXIT. + */ +#define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY 3 + +/* + * This device feature has the same behavior as + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY with the exception that the user + * provides an eventfd for wake-up notification. When the device moves out of + * the low power state for the wake-up, the host will not allow the device to + * re-enter a low power state without a subsequent user call to one of the low + * power entry device feature IOCTLs. Access to mmap'd device regions is + * disabled on LOW_POWER_ENTRY_WITH_WAKEUP and may only be resumed after the + * low power exit. The low power exit can happen either through LOW_POWER_EXIT + * or through any other access (where the wake-up notification has been + * generated). The access to mmap'd device regions will not trigger low power + * exit. + * + * The notification through the provided eventfd will be generated only when + * the device has entered and is resumed from a low power state after + * calling this device feature IOCTL. A device that has not entered low power + * state, as managed through the runtime power management core, will not + * generate a notification through the provided eventfd on access. Calling the + * LOW_POWER_EXIT feature is optional in the case where notification has been + * signaled on the provided eventfd that a resume from low power has occurred. + */ +struct vfio_device_low_power_entry_with_wakeup { + __s32 wakeup_eventfd; + __u32 reserved; +}; + +#define VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP 4 + +/* + * Upon VFIO_DEVICE_FEATURE_SET, disallow use of device low power states as + * previously enabled via VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY or + * VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP device features. 
+ * This device feature IOCTL may itself generate a wakeup eventfd notification + * in the latter case if the device had previously entered a low power state. + */ +#define VFIO_DEVICE_FEATURE_LOW_POWER_EXIT 5 + +/* + * Upon VFIO_DEVICE_FEATURE_SET start/stop device DMA logging. + * VFIO_DEVICE_FEATURE_PROBE can be used to detect if the device supports + * DMA logging. + * + * DMA logging allows a device to internally record what DMAs the device is + * initiating and report them back to userspace. It is part of the VFIO + * migration infrastructure that allows implementing dirty page tracking + * during the pre copy phase of live migration. Only DMA WRITEs are logged, + * and this API is not connected to VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE. + * + * When DMA logging is started a range of IOVAs to monitor is provided and the + * device can optimize its logging to cover only the IOVA range given. Each + * DMA that the device initiates inside the range will be logged by the device + * for later retrieval. + * + * page_size is an input that hints what tracking granularity the device + * should try to achieve. If the device cannot do the hinted page size then + * it's the driver choice which page size to pick based on its support. + * On output the device will return the page size it selected. + * + * ranges is a pointer to an array of + * struct vfio_device_feature_dma_logging_range. + * + * The core kernel code guarantees to support by minimum num_ranges that fit + * into a single kernel page. User space can try higher values but should give + * up if the above can't be achieved as of some driver limitations. + * + * A single call to start device DMA logging can be issued and a matching stop + * should follow at the end. Another start is not allowed in the meantime. 
+ */ +struct vfio_device_feature_dma_logging_control { + __aligned_u64 page_size; + __u32 num_ranges; + __u32 __reserved; + __aligned_u64 ranges; }; +struct vfio_device_feature_dma_logging_range { + __aligned_u64 iova; + __aligned_u64 length; +}; + +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_START 6 + +/* + * Upon VFIO_DEVICE_FEATURE_SET stop device DMA logging that was started + * by VFIO_DEVICE_FEATURE_DMA_LOGGING_START + */ +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP 7 + +/* + * Upon VFIO_DEVICE_FEATURE_GET read back and clear the device DMA log + * + * Query the device's DMA log for written pages within the given IOVA range. + * During querying the log is cleared for the IOVA range. + * + * bitmap is a pointer to an array of u64s that will hold the output bitmap + * with 1 bit reporting a page_size unit of IOVA. The mapping of IOVA to bits + * is given by: + * bitmap[(addr - iova)/page_size] & (1ULL << (addr % 64)) + * + * The input page_size can be any power of two value and does not have to + * match the value given to VFIO_DEVICE_FEATURE_DMA_LOGGING_START. The driver + * will format its internal logging to match the reporting page size, possibly + * by replicating bits if the internal page size is lower than requested. + * + * The LOGGING_REPORT will only set bits in the bitmap and never clear or + * perform any initialization of the user provided bitmap. + * + * If any error is returned userspace should assume that the dirty log is + * corrupted. Error recovery is to consider all memory dirty and try to + * restart the dirty tracking, or to abort/restart the whole migration. + * + * If DMA logging is not enabled, an error will be returned. 
+ * + */ +struct vfio_device_feature_dma_logging_report { + __aligned_u64 iova; + __aligned_u64 length; + __aligned_u64 page_size; + __aligned_u64 bitmap; +}; + +#define VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT 8 + +/* + * Upon VFIO_DEVICE_FEATURE_GET read back the estimated data length that will + * be required to complete stop copy. + * + * Note: Can be called on each device state. + */ + +struct vfio_device_feature_mig_data_size { + __aligned_u64 stop_copy_length; +}; + +#define VFIO_DEVICE_FEATURE_MIG_DATA_SIZE 9 + /* -------- API for Type1 VFIO IOMMU -------- */ /** From patchwork Tue Jan 17 22:08:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727768 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=U652nYRD; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNNy07lpz23gM for ; Wed, 18 Jan 2023 09:10:10 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9D-0008WR-3N; Tue, 17 Jan 2023 17:09:39 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu95-0008Sf-2D for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:32 -0500 Received: 
from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu90-0007X8-Dh for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993364; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nOS3CKEGPjANiM7I6HWXSEnAfE8Rh5Z/OAp17l2f8LQ=; b=U652nYRDeCvvefdpMLgugcXik88TigkO1GBylOw3L6/jmsGTzgKke2BlBPqdfl8Du75gk9 Vd46c1V+ar5L0ATl5kccPtTKX2jYFT7il5hInVsq0gjW0wWkCxPty1Pm/sCAP6KFQNf7QR Q/MTZyTvIrmNYGm5tArfADi3a68n1pY= Received: from mail-vk1-f198.google.com (mail-vk1-f198.google.com [209.85.221.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-509-OLqE4R94Pb6WvyGJo-8ErA-1; Tue, 17 Jan 2023 17:09:22 -0500 X-MC-Unique: OLqE4R94Pb6WvyGJo-8ErA-1 Received: by mail-vk1-f198.google.com with SMTP id d130-20020a1f9b88000000b003b87d0db0d9so9626334vke.15 for ; Tue, 17 Jan 2023 14:09:22 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nOS3CKEGPjANiM7I6HWXSEnAfE8Rh5Z/OAp17l2f8LQ=; b=xlGZHz1TTeiQcAVJmQlVYWZYcf4NjwONbo+kPKN+DIF+c3/Ow0j+82wIryWmU5tQ7d SaMqUQm8dgHKsULAGm8vHYMz5HixEUlPA3qTjiUvZLkvB3UDNKMTfMih7mAu1wXb6SIK niLfVlC8sgrupcxNjxied19eJszfT/ZLqMo8NFLS/V3m3wO6D70GbQyODHfzwncFcBSA LWt/TYk3KkdMHh6/fI7R5GuDErgpui5HxKj2jTbq2sSImeW+TBfnPW4yjURlCEIq71GO Ak0TuWNRg6s4nShloCjOAz9w3HdalX4cmHt7wuNiqQLQ7G9L8ILhBvZcUwAj/KpUiWhu j2Ng== X-Gm-Message-State: AFqh2koeNcWFuqoPfv/49K/qzXnJgzHmUWGBK6HWPYJG9wVjLwd6rN3/ 
U9WcWt3aUJ9jV2UKn1bkGd7MQZ7zhq9Sbgm+DrER5JlSWkdg9cWlTAxaOi8tkVipfQ/MfJXwH2q 7q0VYvb6Q2v0JUyfPl1uXO+9GUzqfuxSl6KoUvoAHCoPq2WOjrGBHefnQMmAGItAv X-Received: by 2002:ac5:cd4e:0:b0:3bc:8a9a:2c70 with SMTP id n14-20020ac5cd4e000000b003bc8a9a2c70mr2904834vkm.1.1673993360893; Tue, 17 Jan 2023 14:09:20 -0800 (PST) X-Google-Smtp-Source: AMrXdXvuXP1Ct3TMZdoYa/9m8Ky+NfDR+SbWufrlptlsK0CVyWkSL9vtSyQN9BJ8EfXrqSHBPBHL5g== X-Received: by 2002:ac5:cd4e:0:b0:3bc:8a9a:2c70 with SMTP id n14-20020ac5cd4e000000b003bc8a9a2c70mr2904806vkm.1.1673993360591; Tue, 17 Jan 2023 14:09:20 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.09.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:09:19 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . 
David Alan Gilbert" Subject: [PATCH RFC 02/21] util: Include osdep.h first in util/mmap-alloc.c Date: Tue, 17 Jan 2023 17:08:55 -0500 Message-Id: <20230117220914.2062125-3-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Without it, CONFIG_LINUX is not yet defined when the #ifdef is evaluated, even on Linux, so linux/mman.h is never actually included. Signed-off-by: Peter Xu Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Juan Quintela --- util/mmap-alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c index 5ed7d29183..040599b0e3 100644 --- a/util/mmap-alloc.c +++ b/util/mmap-alloc.c @@ -9,6 +9,7 @@ * This work is licensed under the terms of the GNU GPL, version 2 or * later. See the COPYING file in the top-level directory.
*/ +#include "qemu/osdep.h" #ifdef CONFIG_LINUX #include <linux/mman.h> @@ -17,7 +18,6 @@ #define MAP_SHARED_VALIDATE 0x0 #endif /* CONFIG_LINUX */ -#include "qemu/osdep.h" #include "qemu/mmap-alloc.h" #include "qemu/host-utils.h" #include "qemu/cutils.h" From patchwork Tue Jan 17 22:08:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727764 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=hQZdD6cs; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNNY60hwz23fp for ; Wed, 18 Jan 2023 09:09:49 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9F-00006X-Om; Tue, 17 Jan 2023 17:09:41 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9D-00005Z-Nj for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:40 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu96-0007XN-4T for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:35 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719;
t=1673993366; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PC4mh0B0hMSMRIj8sb0h21wn9Jn8Ce4T2CHIcXSGJ8A=; b=hQZdD6csxfKplwaYJYqYKwLCLpq1vIEBC7h4S9X8sbJc3wrb27btLIvcugnxIJ9eFGX+U3 NOfIaKxgVeyJpmy8JgrQc6y0Tr2zSzM5y5mXoaeVEFKhPDdIEnPE/pVW5ZxYOyuqNkTKRS jZmdJECY65cCQh3b6JcrtzkapWncyg0= Received: from mail-vk1-f199.google.com (mail-vk1-f199.google.com [209.85.221.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-442-RwOLqgfmODikMKjmxxiGGg-1; Tue, 17 Jan 2023 17:09:24 -0500 X-MC-Unique: RwOLqgfmODikMKjmxxiGGg-1 Received: by mail-vk1-f199.google.com with SMTP id m84-20020a1fa357000000b003bcb3e83df3so9620095vke.7 for ; Tue, 17 Jan 2023 14:09:24 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PC4mh0B0hMSMRIj8sb0h21wn9Jn8Ce4T2CHIcXSGJ8A=; b=XKAeawyHF6hdqEuqM5vozABdwVtIFFyVDjOCUlMtXBixM/C/w3fvqO9MET34GAJ0VT 0VqSL/J/Cv+0/XGIgR/gTWyK8mwzpmM7KBMXNPzqMmdU2oYN6IWKeXyBhewCbeInN3H5 qBRuFyU/Fu3kVfPLRB284IKLso8Ek/q7H/oLpgKitjHRtCudRWW5sLoSDoJ7mApZNIdP VGRyX40LtlrCdjszMKZ6Z/7+GhemBPEjz5sZvJXr4eHsyqk9rZW+SJIoOrCQVKoY6CIo pod++dOZqrW2+ZGqBpQyxtY240inIUipNCH1LFTc2Tpx9HlBZvoKEC298TVwmeeYfIbz 7xDg== X-Gm-Message-State: AFqh2kqASqrCmvQxvuiHUAphHrUtqkee2099XpnASh6TMh4nPEfIlVc0 JWc0vMZmjf/jTxFAQKqlyQKNQEmyG2vMkUXlAOyUnWvo+1FG7yCV5IMOnN2epnEluFCeaZdqLrn u5FAVDZJlCgoteN4/z9eTeL62zqq7zNLQ5CB9ldHtv7NwwOiUJfZHrfdBxYJey040 X-Received: by 2002:a67:ee97:0:b0:3d0:ce41:85d6 with SMTP id n23-20020a67ee97000000b003d0ce4185d6mr2290957vsp.25.1673993363727; Tue, 17 Jan 2023 14:09:23 -0800 (PST) X-Google-Smtp-Source: 
AMrXdXtWoZYc9RTP2JiPaqdTvMnPe3x9+rQMc1gtpSIsM+O2IF4F4We4LoPY1j6c4WfVBXRKwPBHwA== X-Received: by 2002:a67:ee97:0:b0:3d0:ce41:85d6 with SMTP id n23-20020a67ee97000000b003d0ce4185d6mr2290934vsp.25.1673993363424; Tue, 17 Jan 2023 14:09:23 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.09.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:09:22 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . David Alan Gilbert" Subject: [PATCH RFC 03/21] physmem: Add qemu_ram_is_hugetlb() Date: Tue, 17 Jan 2023 17:08:56 -0500 Message-Id: <20230117220914.2062125-4-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Returns true for a hugetlbfs mapping, false otherwise. Signed-off-by: Peter Xu Reviewed-by: Dr. 
David Alan Gilbert
Reviewed-by: Juan Quintela
---
 include/exec/cpu-common.h | 1 +
 softmmu/physmem.c         | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 6feaa40ca7..94452aa17f 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -95,6 +95,7 @@ void qemu_ram_unset_migratable(RAMBlock *rb);
 int qemu_ram_get_fd(RAMBlock *rb);
 
 size_t qemu_ram_pagesize(RAMBlock *block);
+bool qemu_ram_is_hugetlb(RAMBlock *rb);
 size_t qemu_ram_pagesize_largest(void);
 
 /**
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index edec095c7a..a4fb129d8f 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1798,6 +1798,11 @@ size_t qemu_ram_pagesize(RAMBlock *rb)
     return rb->page_size;
 }
 
+bool qemu_ram_is_hugetlb(RAMBlock *rb)
+{
+    return rb->page_size > qemu_real_host_page_size();
+}
+
 /* Returns the largest size of page in use */
 size_t qemu_ram_pagesize_largest(void)
 {

From patchwork Tue Jan 17 22:08:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727765 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=htNAu2vz; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNNs0gSlz23h2 for ; Wed, 18 Jan 2023 09:10:05 +1100 (AEDT) Received: from localhost ([::1]
 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9F-00006Y-S5; Tue, 17 Jan 2023 17:09:41 -0500
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr. David Alan Gilbert"
Subject: [PATCH RFC 04/21] madvise: Include linux/mman.h under linux-headers/
Date: Tue, 17 Jan 2023 17:08:57 -0500
Message-Id: <20230117220914.2062125-5-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

This will allow qemu/madvise.h to always include linux/mman.h under the
linux-headers/.

Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
---
 include/qemu/madvise.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/qemu/madvise.h b/include/qemu/madvise.h
index e155f59a0d..b6fa49553f 100644
--- a/include/qemu/madvise.h
+++ b/include/qemu/madvise.h
@@ -8,6 +8,10 @@
 #ifndef QEMU_MADVISE_H
 #define QEMU_MADVISE_H
 
+#ifdef CONFIG_LINUX
+#include "linux/mman.h"
+#endif
+
 #define QEMU_MADV_INVALID -1
 
 #if defined(CONFIG_MADVISE)

From patchwork Tue Jan 17 22:08:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727775
Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from ) id 1pHu9A-0007YB-1I for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:40 -0500
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr. David Alan Gilbert"
Subject: [PATCH RFC 05/21] madvise: Add QEMU_MADV_SPLIT
Date: Tue, 17 Jan 2023 17:08:58 -0500
Message-Id: <20230117220914.2062125-6-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

MADV_SPLIT is a new madvise() on Linux.  Define QEMU_MADV_SPLIT.

Signed-off-by: Peter Xu
Reviewed-by: Juan Quintela
---
 include/qemu/madvise.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/qemu/madvise.h b/include/qemu/madvise.h
index b6fa49553f..3dddd25065 100644
--- a/include/qemu/madvise.h
+++ b/include/qemu/madvise.h
@@ -63,6 +63,11 @@
 #else
 #define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
 #endif
+#ifdef MADV_SPLIT
+#define QEMU_MADV_SPLIT MADV_SPLIT
+#else
+#define QEMU_MADV_SPLIT QEMU_MADV_INVALID
+#endif
 
 #elif defined(CONFIG_POSIX_MADVISE)
 
@@ -77,6 +82,7 @@
 #define QEMU_MADV_NOHUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_REMOVE QEMU_MADV_DONTNEED
 #define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
+#define QEMU_MADV_SPLIT QEMU_MADV_INVALID
 
 #else /* no-op */
 
@@ -91,6 +97,7 @@
 #define QEMU_MADV_NOHUGEPAGE QEMU_MADV_INVALID
 #define QEMU_MADV_REMOVE QEMU_MADV_INVALID
 #define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
+#define QEMU_MADV_SPLIT QEMU_MADV_INVALID
 
 #endif
 

From patchwork Tue Jan 17 22:08:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727784
Received: from
 localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9H-0000C8-Dh; Tue, 17 Jan 2023 17:09:43 -0500
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr. David Alan Gilbert"
Subject: [PATCH RFC 06/21] madvise: Add QEMU_MADV_COLLAPSE
Date: Tue, 17 Jan 2023 17:08:59 -0500
Message-Id: <20230117220914.2062125-7-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

MADV_COLLAPSE is a new madvise() on Linux.  Define it.

Signed-off-by: Peter Xu
Reviewed-by: Juan Quintela
---
 include/qemu/madvise.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/qemu/madvise.h b/include/qemu/madvise.h
index 3dddd25065..794e5fb0a7 100644
--- a/include/qemu/madvise.h
+++ b/include/qemu/madvise.h
@@ -68,6 +68,11 @@
 #else
 #define QEMU_MADV_SPLIT QEMU_MADV_INVALID
 #endif
+#ifdef MADV_COLLAPSE
+#define QEMU_MADV_COLLAPSE MADV_COLLAPSE
+#else
+#define QEMU_MADV_COLLAPSE QEMU_MADV_INVALID
+#endif
 
 #elif defined(CONFIG_POSIX_MADVISE)
 
@@ -83,6 +88,7 @@
 #define QEMU_MADV_REMOVE QEMU_MADV_DONTNEED
 #define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
 #define QEMU_MADV_SPLIT QEMU_MADV_INVALID
+#define QEMU_MADV_COLLAPSE QEMU_MADV_INVALID
 
 #else /* no-op */
 
@@ -98,6 +104,7 @@
 #define QEMU_MADV_REMOVE QEMU_MADV_INVALID
 #define QEMU_MADV_POPULATE_WRITE QEMU_MADV_INVALID
 #define QEMU_MADV_SPLIT QEMU_MADV_INVALID
+#define QEMU_MADV_COLLAPSE QEMU_MADV_INVALID
 
 #endif
 

From patchwork Tue Jan 17 22:09:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727785
Received: from localhost ([::1]
 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9H-0000D2-Pw; Tue, 17 Jan 2023 17:09:43 -0500
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr. David Alan Gilbert"
Subject: [PATCH RFC 07/21] ramblock: Cache file offset for file-backed ramblocks
Date: Tue, 17 Jan 2023 17:09:00 -0500
Message-Id: <20230117220914.2062125-8-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

This value was only used for mmap() when we want to map at a specific
offset of the file for memory.  To prepare for mapping the same range
again later, for whatever reason, cache the offset so we know how to
redo the mapping.

Signed-off-by: Peter Xu
Reviewed-by: Juan Quintela
---
 include/exec/ramblock.h | 5 +++++
 softmmu/physmem.c       | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index adc03df59c..76cd0812c8 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -41,6 +41,11 @@ struct RAMBlock {
     QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
     int fd;
     size_t page_size;
+    /*
+     * Cache for file offset to map the ramblock. Only used for
+     * file-backed ramblocks.
+     */
+    off_t file_offset;
     /* dirty bitmap used during migration */
     unsigned long *bmap;
     /* bitmap of already received pages in postcopy */
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index a4fb129d8f..aa1a7466e5 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1543,6 +1543,8 @@ static void *file_ram_alloc(RAMBlock *block,
     uint32_t qemu_map_flags;
     void *area;
 
+    /* Remember the offset just in case we'll need to map the range again */
+    block->file_offset = offset;
     block->page_size = qemu_fd_getpagesize(fd);
     if (block->mr->align % block->page_size) {
         error_setg(errp, "alignment 0x%" PRIx64

From patchwork Tue Jan 17 22:09:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727782
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9G-00008D-PM for qemu-devel@nongnu.org; Tue, 17
 Jan 2023 17:09:42 -0500
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr. David Alan Gilbert"
Subject: [PATCH RFC 08/21] ramblock: Cache the length to do file mmap() on ramblocks
Date: Tue, 17 Jan 2023 17:09:01 -0500
Message-Id: <20230117220914.2062125-9-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

We do proper page size alignment for file-backed mmap()s for ramblocks.
Even if it's as simple as that, cache the value because it'll be used in
multiple places.

While at it, drop the size argument of file_ram_alloc() and just use
max_length, because the two are always equal for file-backed ramblocks.

Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
---
 include/exec/ramblock.h |  2 ++
 softmmu/physmem.c       | 14 +++++++-------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 76cd0812c8..3f31ce1591 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -32,6 +32,8 @@ struct RAMBlock {
     ram_addr_t offset;
     ram_addr_t used_length;
     ram_addr_t max_length;
+    /* Only used for file-backed ramblocks */
+    ram_addr_t mmap_length;
     void (*resized)(const char*, uint64_t length, void *host);
     uint32_t flags;
     /* Protected by iothread lock. */
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index aa1a7466e5..b5be02f1cb 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1533,7 +1533,6 @@ static int file_ram_open(const char *path,
 }
 
 static void *file_ram_alloc(RAMBlock *block,
-                            ram_addr_t memory,
                             int fd,
                             bool readonly,
                             bool truncate,
@@ -1563,14 +1562,14 @@ static void *file_ram_alloc(RAMBlock *block,
     }
 #endif
 
-    if (memory < block->page_size) {
+    if (block->max_length < block->page_size) {
         error_setg(errp, "memory size 0x" RAM_ADDR_FMT " must be equal to "
                    "or larger than page size 0x%zx",
-                   memory, block->page_size);
+                   block->max_length, block->page_size);
         return NULL;
     }
 
-    memory = ROUND_UP(memory, block->page_size);
+    block->mmap_length = ROUND_UP(block->max_length, block->page_size);
 
     /*
      * ftruncate is not supported by hugetlbfs in older
@@ -1586,7 +1585,7 @@ static void *file_ram_alloc(RAMBlock *block,
      * those labels. Therefore, extending the non-empty backend file
      * is disabled as well.
      */
-    if (truncate && ftruncate(fd, memory)) {
+    if (truncate && ftruncate(fd, block->mmap_length)) {
         perror("ftruncate");
     }
 
@@ -1594,7 +1593,8 @@ static void *file_ram_alloc(RAMBlock *block,
     qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0;
     qemu_map_flags |= (block->flags & RAM_PMEM) ? QEMU_MAP_SYNC : 0;
     qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE : 0;
-    area = qemu_ram_mmap(fd, memory, block->mr->align, qemu_map_flags, offset);
+    area = qemu_ram_mmap(fd, block->mmap_length, block->mr->align,
+                         qemu_map_flags, offset);
     if (area == MAP_FAILED) {
         error_setg_errno(errp, errno,
                          "unable to map backing store for guest RAM");
@@ -2100,7 +2100,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
     new_block->used_length = size;
     new_block->max_length = size;
     new_block->flags = ram_flags;
-    new_block->host = file_ram_alloc(new_block, size, fd, readonly,
+    new_block->host = file_ram_alloc(new_block, fd, readonly,
                                      !file_size, offset, errp);
     if (!new_block->host) {
         g_free(new_block);

From patchwork Tue Jan 17 22:09:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727769
X-Gm-Message-State:
AFqh2ko3vfv6kMl2FZcTnVEwoYEDOt3AU4naiuVP989Vs/U/WcUWBSOt p+KQsAZ8md/Cdi/bN5zjwFNbhHLVKEwWbWDuETxbaMh8DlRHj+oNDjTRXgA3bXe6he5y9W8SXah 9tvH4cM4hzjcEHpVZ5OGVSEo8ph+znu6RJgT4DStT2I2PTkqseXi57mj9HHbftYFP X-Received: by 2002:ac8:46ca:0:b0:3a4:fddd:f8ef with SMTP id h10-20020ac846ca000000b003a4fdddf8efmr5335785qto.53.1673993381326; Tue, 17 Jan 2023 14:09:41 -0800 (PST) X-Google-Smtp-Source: AMrXdXtLm3/2lIyL1Ho0rzAroaYrRvaCab7D2BJmfUeND/i+Cnr8pJsEB6zNlVaci2iK3U+3nStsTQ== X-Received: by 2002:ac8:46ca:0:b0:3a4:fddd:f8ef with SMTP id h10-20020ac846ca000000b003a4fdddf8efmr5335745qto.53.1673993380912; Tue, 17 Jan 2023 14:09:40 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.09.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:09:39 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . 
David Alan Gilbert"
Subject: [PATCH RFC 09/21] ramblock: Add RAM_READONLY
Date: Tue, 17 Jan 2023 17:09:02 -0500
Message-Id: <20230117220914.2062125-10-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>
MIME-Version: 1.0

Add RAM_READONLY so that it can be set in ram_flags to show that a
ramblock can only be read, not written.

We used to pass a readonly boolean along the way when allocating the
ramblock; now let it live together with the rest of the ramblock flags.

The main purpose of this patch is not cleanup, though: it is to cache the
mapping information of each ramblock, so that when we want to mmap() it
again for whatever reason we have all the information at hand.

Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
---
 backends/hostmem-file.c |  3 ++-
 include/exec/memory.h   |  4 ++--
 include/exec/ram_addr.h |  5 ++---
 softmmu/memory.c        |  8 +++-----
 softmmu/physmem.c       | 16 +++++++---------
 5 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 25141283c4..1daf00d2da 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -56,9 +56,10 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags = backend->share ? RAM_SHARED : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
+    ram_flags |= fb->readonly ? RAM_READONLY : 0;
     memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
                                      backend->size, fb->align, ram_flags,
-                                     fb->mem_path, fb->readonly, errp);
+                                     fb->mem_path, errp);
     g_free(name);
 #endif
 }
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c37ffdbcd1..006ba77ede 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -188,6 +188,8 @@ typedef struct IOMMUTLBEvent {
 /* RAM is a persistent kind memory */
 #define RAM_PMEM (1 << 5)
 
+/* RAM is read-only */
+#define RAM_READONLY (1 << 6)
 
 /*
  * UFFDIO_WRITEPROTECT is used on this RAMBlock to
@@ -1292,7 +1294,6 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE,
  * @path: the path in which to allocate the RAM.
- * @readonly: true to open @path for reading, false for read/write.
  * @errp: pointer to Error*, to store an error if it happens.
  *
  * Note that this function does not do anything to cause the data in the
@@ -1305,7 +1306,6 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t align,
                                       uint32_t ram_flags,
                                       const char *path,
-                                      bool readonly,
                                       Error **errp);
 
 /**
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index f4fb6a2111..0bf9cfc659 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -110,7 +110,6 @@ long qemu_maxrampagesize(void);
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE.
  * @mem_path or @fd: specify the backing file or device
- * @readonly: true to open @path for reading, false for read/write.
  * @errp: pointer to Error*, to store an error if it happens
  *
  * Return:
@@ -119,10 +118,10 @@ long qemu_maxrampagesize(void);
  */
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                    uint32_t ram_flags, const char *mem_path,
-                                   bool readonly, Error **errp);
+                                   Error **errp);
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
                                  uint32_t ram_flags, int fd, off_t offset,
-                                 bool readonly, Error **errp);
+                                 Error **errp);
 
 RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
                                   MemoryRegion *mr, Error **errp);
diff --git a/softmmu/memory.c b/softmmu/memory.c
index e05332d07f..2137028773 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1601,18 +1601,16 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
                                       uint64_t align,
                                       uint32_t ram_flags,
                                       const char *path,
-                                      bool readonly,
                                       Error **errp)
 {
     Error *err = NULL;
     memory_region_init(mr, owner, name, size);
     mr->ram = true;
-    mr->readonly = readonly;
+    mr->readonly = ram_flags & RAM_READONLY;
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
     mr->align = align;
-    mr->ram_block = qemu_ram_alloc_from_file(size, mr, ram_flags, path,
-                                             readonly, &err);
+    mr->ram_block = qemu_ram_alloc_from_file(size, mr, ram_flags, path, &err);
     if (err) {
         mr->size = int128_zero();
         object_unparent(OBJECT(mr));
@@ -1635,7 +1633,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
     mr->terminates = true;
     mr->destructor = memory_region_destructor_ram;
     mr->ram_block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, offset,
-                                           false, &err);
+                                           &err);
     if (err) {
         mr->size = int128_zero();
         object_unparent(OBJECT(mr));
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index b5be02f1cb..6096eac286 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1534,7 +1534,6 @@ static int file_ram_open(const char *path,
 
 static void *file_ram_alloc(RAMBlock *block,
                             int fd,
-                            bool readonly,
                             bool truncate,
                             off_t offset,
                             Error **errp)
@@ -1589,7 +1588,7 @@ static void *file_ram_alloc(RAMBlock *block,
         perror("ftruncate");
     }
 
-    qemu_map_flags = readonly ? QEMU_MAP_READONLY : 0;
+    qemu_map_flags = (block->flags & RAM_READONLY) ? QEMU_MAP_READONLY : 0;
     qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0;
     qemu_map_flags |= (block->flags & RAM_PMEM) ? QEMU_MAP_SYNC : 0;
     qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE : 0;
@@ -2057,7 +2056,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
 #ifdef CONFIG_POSIX
 RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
                                  uint32_t ram_flags, int fd, off_t offset,
-                                 bool readonly, Error **errp)
+                                 Error **errp)
 {
     RAMBlock *new_block;
     Error *local_err = NULL;
@@ -2065,7 +2064,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
     /* Just support these ram flags by now. */
     assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
-                          RAM_PROTECTED)) == 0);
+                          RAM_PROTECTED | RAM_READONLY)) == 0);
 
     if (xen_enabled()) {
         error_setg(errp, "-mem-path not supported with Xen");
@@ -2100,8 +2099,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
     new_block->used_length = size;
     new_block->max_length = size;
     new_block->flags = ram_flags;
-    new_block->host = file_ram_alloc(new_block, fd, readonly,
-                                     !file_size, offset, errp);
+    new_block->host = file_ram_alloc(new_block, fd, !file_size, offset, errp);
     if (!new_block->host) {
         g_free(new_block);
         return NULL;
@@ -2120,11 +2118,11 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
 
 RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
                                    uint32_t ram_flags, const char *mem_path,
-                                   bool readonly, Error **errp)
+                                   Error **errp)
 {
     int fd;
-    bool created;
     RAMBlock *block;
+    bool created, readonly = ram_flags & RAM_READONLY;
 
     fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created,
                        errp);
@@ -2132,7 +2130,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
         return NULL;
     }
 
-    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, 0, readonly, errp);
+    block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, 0, errp);
     if (!block) {
         if (created) {
             unlink(mem_path);

From patchwork Tue Jan 17 22:09:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727766
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela, peterx@redhat.com, "Dr . 
David Alan Gilbert" Subject: [PATCH RFC 10/21] ramblock: Add ramblock_file_map() Date: Tue, 17 Jan 2023 17:09:03 -0500 Message-Id: <20230117220914.2062125-11-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Add a helper to do mmap() for a ramblock based on the cached informations. A trivial thing to mention is we need to move ramblock->fd setup to be earlier, before the ramblock_file_map() call, because it'll need to reference the fd being mapped. However that should not be a problem at all, majorly because the fd won't be freed if successful, and if it failed the fd will be freeed (or to be explicit, close()ed) by the caller. Export it - prepare to be used outside this file. Signed-off-by: Peter Xu Reviewed-by: Dr. 
David Alan Gilbert Reviewed-by: Juan Quintela --- include/exec/ram_addr.h | 1 + softmmu/physmem.c | 25 +++++++++++++++++-------- 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 0bf9cfc659..56db25009a 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -98,6 +98,7 @@ bool ramblock_is_pmem(RAMBlock *rb); long qemu_minrampagesize(void); long qemu_maxrampagesize(void); +void *ramblock_file_map(RAMBlock *block); /** * qemu_ram_alloc_from_file, diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 6096eac286..cdda7eaea5 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -1532,17 +1532,31 @@ static int file_ram_open(const char *path, return fd; } +/* Do the mmap() for a ramblock based on information already setup */ +void *ramblock_file_map(RAMBlock *block) +{ + uint32_t qemu_map_flags; + + qemu_map_flags = (block->flags & RAM_READONLY) ? QEMU_MAP_READONLY : 0; + qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0; + qemu_map_flags |= (block->flags & RAM_PMEM) ? QEMU_MAP_SYNC : 0; + qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE : 0; + + return qemu_ram_mmap(block->fd, block->mmap_length, block->mr->align, + qemu_map_flags, block->file_offset); +} + static void *file_ram_alloc(RAMBlock *block, int fd, bool truncate, off_t offset, Error **errp) { - uint32_t qemu_map_flags; void *area; /* Remember the offset just in case we'll need to map the range again */ block->file_offset = offset; + block->fd = fd; block->page_size = qemu_fd_getpagesize(fd); if (block->mr->align % block->page_size) { error_setg(errp, "alignment 0x%" PRIx64 @@ -1588,19 +1602,14 @@ static void *file_ram_alloc(RAMBlock *block, perror("ftruncate"); } - qemu_map_flags = (block->flags & RAM_READONLY) ? QEMU_MAP_READONLY : 0; - qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0; - qemu_map_flags |= (block->flags & RAM_PMEM) ? 
QEMU_MAP_SYNC : 0; - qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE : 0; - area = qemu_ram_mmap(fd, block->mmap_length, block->mr->align, - qemu_map_flags, offset); + area = ramblock_file_map(block); + if (area == MAP_FAILED) { error_setg_errno(errp, errno, "unable to map backing store for guest RAM"); return NULL; } - block->fd = fd; return area; } #endif From patchwork Tue Jan 17 22:09:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727770 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Fp/+G3dC; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNPj0ldKz23fp for ; Wed, 18 Jan 2023 09:10:49 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9S-00017R-Qu; Tue, 17 Jan 2023 17:09:54 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9S-000146-3S for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:54 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9Q-0007aS-AG for 
qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:09:53 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993391; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=g9eBoW98vzMiN4TfhNXS6GssyAtqf8+iJR0Ats5Su2c=; b=Fp/+G3dCOnsOXtO9yP1Kg7yBnDLE57T5dskGYj6hJBsUQ6jvzRIIm7+9T7lV0qFsLNoNxe t4tmGrjo/JmbtHioXm6mMixPrF6TSu5ufRKFygr+sSKrFKLDasTznuJRPjUxttejv2aRsA NYHldxCdd1UElTZ9++THmlc9EA3JQIs= Received: from mail-vk1-f199.google.com (mail-vk1-f199.google.com [209.85.221.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-365-xJuVeRDsMuyWucsvr7tCpg-1; Tue, 17 Jan 2023 17:09:49 -0500 X-MC-Unique: xJuVeRDsMuyWucsvr7tCpg-1 Received: by mail-vk1-f199.google.com with SMTP id u187-20020a1fabc4000000b003ca3e899f8fso9759564vke.22 for ; Tue, 17 Jan 2023 14:09:49 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=g9eBoW98vzMiN4TfhNXS6GssyAtqf8+iJR0Ats5Su2c=; b=aVO53cswUSOXFjvNHKID45m+ASwhKvdBh1Eh4ulHiHSxgcW+yhiSUSyrFS7+B8xoo9 4hl5thYsKXukTsAoMTlz2nfwWuosdE+flhi1DBvb04N149TQSCxc0qe/02AcsFZtM9uf YrFnpeJe5qk9P2hlXWvVM2v29BR2ZRiRDpO66XBaR2jbXtShAa9+6PVefiw/6m/BCr7m WjjLIw3ePivyAYfbZrbTiw4Jj/HM1f+z3Yae06yB/ZBcdaT2k6Fslyo5yw6HrdBMFM9+ 7j4GxtqINR1zxy4PgMwnZQ2b7wrGKc0wMKcO09SHE3rwLYkzVcSv9ytJ7FcaosIGBaIO ++CA== X-Gm-Message-State: AFqh2krOyXDQD9N1pYFxriQkygm43O1bOPI617M3G1VeXWJj9w10r9g2 hFB3MCGjDTVY1DC6FNsm45z9fXnNRjcLqijixwpWea/dJGbvcovbkjUTALqY4Brm/W1cdk1YVLO DcDKdqvm9Hb3zs0ZkJG+bJvzL1ywLuDhiI7VqugAVblhdYdKI1h9h2hrgdumIMMqX X-Received: by 2002:a05:6122:18b8:b0:3e1:6412:33a5 with SMTP id 
bi56-20020a05612218b800b003e1641233a5mr2026183vkb.6.1673993388110; Tue, 17 Jan 2023 14:09:48 -0800 (PST) X-Google-Smtp-Source: AMrXdXucHZXq8j8BOqPZadj0cN/80ZzIGDEqGyQ7b9PNkGBshGZ/KYByd5gm0e84+CwTh8lyohU+nw== X-Received: by 2002:a05:6122:18b8:b0:3e1:6412:33a5 with SMTP id bi56-20020a05612218b800b003e1641233a5mr2026163vkb.6.1673993387799; Tue, 17 Jan 2023 14:09:47 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.09.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:09:46 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . David Alan Gilbert" Subject: [PATCH RFC 11/21] migration: Add hugetlb-doublemap cap Date: Tue, 17 Jan 2023 17:09:04 -0500 Message-Id: <20230117220914.2062125-12-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Add a new cap to allow mapping hugetlbfs backed RAMs in small page 
sizes. Signed-off-by: Peter Xu Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Juan Quintela --- migration/migration.c | 48 ++++++++++++++++++++++++++++++++++++++++++- migration/migration.h | 1 + qapi/migration.json | 7 ++++++- 3 files changed, 54 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 64f74534e2..b174f2af92 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -17,6 +17,7 @@ #include "qemu/cutils.h" #include "qemu/error-report.h" #include "qemu/main-loop.h" +#include "qemu/madvise.h" #include "migration/blocker.h" #include "exec.h" #include "fd.h" @@ -62,6 +63,7 @@ #include "sysemu/cpus.h" #include "yank_functions.h" #include "sysemu/qtest.h" +#include "exec/ramblock.h" #define MAX_THROTTLE (128 << 20) /* Migration transfer speed throttling */ @@ -1363,12 +1365,47 @@ static bool migrate_caps_check(bool *cap_list, "Zero copy only available for non-compressed non-TLS multifd migration"); return false; } + + if (cap_list[MIGRATION_CAPABILITY_HUGETLB_DOUBLEMAP]) { + RAMBlock *rb; + + /* Check whether the platform/binary supports the new madvise()s */ + +#if QEMU_MADV_SPLIT == QEMU_MADV_INVALID + error_setg(errp, "MADV_SPLIT is not supported by the QEMU binary"); + return false; +#endif + +#if QEMU_MADV_COLLAPSE == QEMU_MADV_INVALID + error_setg(errp, "MADV_COLLAPSE is not supported by the QEMU binary"); + return false; +#endif + + /* + * Check against kernel support of MADV_SPLIT is not easy, delay + * that until we have all the hugetlb mappings ready on dest node, + * meanwhile do the best effort check here because doublemap + * requires the hugetlb ramblocks to be shared first. 
+ */ + RAMBLOCK_FOREACH_NOT_IGNORED(rb) { + if (qemu_ram_is_hugetlb(rb) && !qemu_ram_is_shared(rb)) { + error_setg(errp, "RAMBlock '%s' needs to be shared for doublemap", + rb->idstr); + return false; + } + } + } #else if (cap_list[MIGRATION_CAPABILITY_ZERO_COPY_SEND]) { error_setg(errp, "Zero copy currently only available on Linux"); return false; } + + if (cap_list[MIGRATION_CAPABILITY_HUGETLB_DOUBLEMAP]) { + error_setg(errp, "Hugetlb doublemap is only supported on Linux"); + return false; + } #endif if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) { @@ -2792,6 +2829,13 @@ bool migrate_postcopy_preempt(void) return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]; } +bool migrate_hugetlb_doublemap(void) +{ + MigrationState *s = migrate_get_current(); + + return s->enabled_capabilities[MIGRATION_CAPABILITY_HUGETLB_DOUBLEMAP]; +} + /* migration thread support */ /* * Something bad happened to the RP stream, mark an error @@ -4472,7 +4516,9 @@ static Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH), DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD), DEFINE_PROP_MIG_CAP("x-background-snapshot", - MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT), + MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT), + DEFINE_PROP_MIG_CAP("hugetlb-doublemap", + MIGRATION_CAPABILITY_HUGETLB_DOUBLEMAP), #ifdef CONFIG_LINUX DEFINE_PROP_MIG_CAP("x-zero-copy-send", MIGRATION_CAPABILITY_ZERO_COPY_SEND), diff --git a/migration/migration.h b/migration/migration.h index 5674a13876..bbd610a2d5 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -447,6 +447,7 @@ bool migrate_use_events(void); bool migrate_postcopy_blocktime(void); bool migrate_background_snapshot(void); bool migrate_postcopy_preempt(void); +bool migrate_hugetlb_doublemap(void); /* Sending on the return path - generic and then for each message type */ void migrate_send_rp_shut(MigrationIncomingState *mis, diff --git a/qapi/migration.json 
b/qapi/migration.json index 88ecf86ac8..b23516e75e 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -477,6 +477,11 @@ # will be handled faster. This is a performance feature and # should not affect the correctness of postcopy migration. # (since 7.1) +# @hugetlb-doublemap: If enabled, the migration process will allow postcopy +# to handle page faults based on small pages even if +# hugetlb is used. This will drastically reduce page +# fault latencies when hugetlb is used as the guest RAM +# backends. (since 7.3) # # Features: # @unstable: Members @x-colo and @x-ignore-shared are experimental. @@ -492,7 +497,7 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send', 'postcopy-preempt'] } + 'zero-copy-send', 'postcopy-preempt', 'hugetlb-doublemap'] } ## # @MigrationCapabilityStatus: From patchwork Tue Jan 17 22:09:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727776 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=f+3vIusC; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNQG43fkz23fp for ; Wed, 18 Jan 2023 09:11:18 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) 
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 12/21] migration: Introduce page size for-migration-only
Date: Tue, 17 Jan 2023 17:09:05 -0500
Message-Id: <20230117220914.2062125-13-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

Migration may not always want to treat memory in chunks of the ramblock's
backing page size; sometimes we want to handle it in smaller chunks, e.g.
when it is doubly mapped as both huge and small pages. In those cases we
assume the memory is always mapped with small pages
(qemu_real_host_page_size()) and handle it exactly as if the pages were only
mapped small.

Let's do this to prepare for postcopy double-mapping of hugetlbfs.
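As a rough illustration of the idea above (not code from this series), the
sketch below shows how a single selector for the migration page size changes
the alignment of a faulting address. DemoBlock, demo_doublemap_enabled and the
fixed 4 KiB small page size are hypothetical stand-ins for RAMBlock,
migrate_hugetlb_doublemap() and qemu_real_host_page_size(); they are
assumptions made only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RAMBlock: only the backing page size matters here. */
typedef struct {
    size_t page_size;           /* e.g. 2 MiB for a hugetlbfs-backed block */
} DemoBlock;

/* Hypothetical stand-in for the hugetlb-doublemap capability switch. */
bool demo_doublemap_enabled;

/* Assumed small page size; a real host would be queried for this value. */
#define DEMO_SMALL_PAGE_SIZE ((size_t)4096)

/* Page size to use for migration purposes only. */
size_t demo_migration_pagesize(const DemoBlock *block)
{
    assert(block != NULL);
    if (demo_doublemap_enabled) {
        /* Doubly mapped: always work at small-page granularity. */
        return DEMO_SMALL_PAGE_SIZE;
    }
    return block->page_size;
}

/* Round a faulting address down to the start of its migration page. */
uint64_t demo_fault_page_start(const DemoBlock *block, uint64_t haddr)
{
    uint64_t pagesize = demo_migration_pagesize(block);

    /* Page sizes are powers of two, so masking is enough. */
    return haddr & ~(pagesize - 1);
}
```

With a 2 MiB block, address 0x201234 rounds down to 0x200000 when the switch
is off and to 0x201000 when it is on, which is the granularity change the
callers converted in this patch would observe.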
Signed-off-by: Peter Xu
Reviewed-by: Juan Quintela
---
 migration/migration.c    |  6 ++++--
 migration/postcopy-ram.c | 16 +++++++++-------
 migration/ram.c          | 29 ++++++++++++++++++++++-------
 migration/ram.h          |  1 +
 4 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index b174f2af92..f6fe474fc3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -408,7 +408,7 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis,
 {
     uint8_t bufc[12 + 1 + 255]; /* start (8), len (4), rbname up to 256 */
     size_t msglen = 12; /* start + len */
-    size_t len = qemu_ram_pagesize(rb);
+    size_t len = migration_ram_pagesize(rb);
     enum mig_rp_message_type msg_type;
     const char *rbname;
     int rbname_len;
@@ -443,8 +443,10 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis,
 int migrate_send_rp_req_pages(MigrationIncomingState *mis,
                               RAMBlock *rb, ram_addr_t start, uint64_t haddr)
 {
-    void *aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, qemu_ram_pagesize(rb));
     bool received = false;
+    void *aligned;
+
+    aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, migration_ram_pagesize(rb));
 
     WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) {
         received = ramblock_recv_bitmap_test_byte_offset(rb, start);
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 2c86bfc091..acae1dc6ae 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -694,7 +694,7 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
                          uint64_t client_addr,
                          RAMBlock *rb)
 {
-    size_t pagesize = qemu_ram_pagesize(rb);
+    size_t pagesize = migration_ram_pagesize(rb);
     struct uffdio_range range;
     int ret;
     trace_postcopy_wake_shared(client_addr, qemu_ram_get_idstr(rb));
@@ -712,7 +712,9 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd,
 static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
                                  ram_addr_t start, uint64_t haddr)
 {
-    void *aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, qemu_ram_pagesize(rb));
+    void *aligned;
+
+    aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, migration_ram_pagesize(rb));
 
     /*
      * Discarded pages (via RamDiscardManager) are never migrated. On unlikely
@@ -722,7 +724,7 @@ static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
      * Checking a single bit is sufficient to handle pagesize > TPS as either
      * all relevant bits are set or not.
      */
-    assert(QEMU_IS_ALIGNED(start, qemu_ram_pagesize(rb)));
+    assert(QEMU_IS_ALIGNED(start, migration_ram_pagesize(rb)));
     if (ramblock_page_is_discarded(rb, start)) {
         bool received = ramblock_recv_bitmap_test_byte_offset(rb, start);
 
@@ -740,7 +742,7 @@ static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb,
 int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb,
                                  uint64_t client_addr, uint64_t rb_offset)
 {
-    uint64_t aligned_rbo = ROUND_DOWN(rb_offset, qemu_ram_pagesize(rb));
+    uint64_t aligned_rbo = ROUND_DOWN(rb_offset, migration_ram_pagesize(rb));
     MigrationIncomingState *mis = migration_incoming_get_current();
 
     trace_postcopy_request_shared_page(pcfd->idstr, qemu_ram_get_idstr(rb),
@@ -1020,7 +1022,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
                 break;
             }
 
-            rb_offset = ROUND_DOWN(rb_offset, qemu_ram_pagesize(rb));
+            rb_offset = ROUND_DOWN(rb_offset, migration_ram_pagesize(rb));
             trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address,
                                                 qemu_ram_get_idstr(rb),
                                                 rb_offset,
@@ -1281,7 +1283,7 @@ int postcopy_notify_shared_wake(RAMBlock *rb, uint64_t offset)
 int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
                         RAMBlock *rb)
 {
-    size_t pagesize = qemu_ram_pagesize(rb);
+    size_t pagesize = migration_ram_pagesize(rb);
 
     /* copy also acks to the kernel waking the stalled thread up
      * TODO: We can inhibit that ack and only do it if it was requested
@@ -1308,7 +1310,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from,
 int postcopy_place_page_zero(MigrationIncomingState *mis, void *host,
                              RAMBlock *rb)
 {
-    size_t pagesize = qemu_ram_pagesize(rb);
+    size_t pagesize = migration_ram_pagesize(rb);
     trace_postcopy_place_page_zero(host);
 
     /* Normal RAMBlocks can zero a page using UFFDIO_ZEROPAGE
diff --git a/migration/ram.c b/migration/ram.c
index 334309f1c6..945c6477fd 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -121,6 +121,20 @@ static struct {
     uint8_t *decoded_buf;
 } XBZRLE;
 
+/* Get the page size we should use for migration purpose. */
+size_t migration_ram_pagesize(RAMBlock *block)
+{
+    /*
+     * When hugetlb doublemap is enabled, we should always use the smallest
+     * page for migration.
+     */
+    if (migrate_hugetlb_doublemap()) {
+        return qemu_real_host_page_size();
+    }
+
+    return qemu_ram_pagesize(block);
+}
+
 static void XBZRLE_cache_lock(void)
 {
     if (migrate_use_xbzrle()) {
@@ -1049,7 +1063,7 @@ bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start)
         MemoryRegionSection section = {
             .mr = rb->mr,
             .offset_within_region = start,
-            .size = int128_make64(qemu_ram_pagesize(rb)),
+            .size = int128_make64(migration_ram_pagesize(rb)),
         };
 
         return !ram_discard_manager_is_populated(rdm, &section);
@@ -2152,7 +2166,7 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
      */
     if (postcopy_preempt_active()) {
         ram_addr_t page_start = start >> TARGET_PAGE_BITS;
-        size_t page_size = qemu_ram_pagesize(ramblock);
+        size_t page_size = migration_ram_pagesize(ramblock);
         PageSearchStatus *pss = &ram_state->pss[RAM_CHANNEL_POSTCOPY];
         int ret = 0;
 
@@ -2316,7 +2330,7 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
 static void pss_host_page_prepare(PageSearchStatus *pss)
 {
     /* How many guest pages are there in one host page? */
-    size_t guest_pfns = qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
+    size_t guest_pfns = migration_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
 
     pss->host_page_sending = true;
     pss->host_page_start = ROUND_DOWN(pss->page, guest_pfns);
@@ -2425,7 +2439,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
     bool page_dirty, preempt_active = postcopy_preempt_active();
     int tmppages, pages = 0;
     size_t pagesize_bits =
-        qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
+        migration_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
     unsigned long start_page = pss->page;
     int res;
 
@@ -3518,7 +3532,7 @@ static void *host_page_from_ram_block_offset(RAMBlock *block,
 {
     /* Note: Explicitly no check against offset_in_ramblock(). */
     return (void *)QEMU_ALIGN_DOWN((uintptr_t)(block->host + offset),
-                                   block->page_size);
+                                   migration_ram_pagesize(block));
 }
 
 static ram_addr_t host_page_offset_from_ram_block_offset(RAMBlock *block,
@@ -3970,7 +3984,8 @@ int ram_load_postcopy(QEMUFile *f, int channel)
                 break;
             }
             tmp_page->target_pages++;
-            matches_target_page_size = block->page_size == TARGET_PAGE_SIZE;
+            matches_target_page_size =
+                migration_ram_pagesize(block) == TARGET_PAGE_SIZE;
             /*
              * Postcopy requires that we place whole host pages atomically;
              * these may be huge pages for RAMBlocks that are backed by
@@ -4005,7 +4020,7 @@ int ram_load_postcopy(QEMUFile *f, int channel)
                  * page
                  */
                 if (tmp_page->target_pages ==
-                    (block->page_size / TARGET_PAGE_SIZE)) {
+                    (migration_ram_pagesize(block) / TARGET_PAGE_SIZE)) {
                     place_needed = true;
                 }
                 place_source = tmp_page->tmp_huge_page;
diff --git a/migration/ram.h b/migration/ram.h
index 81cbb0947c..162b3e7cb8 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -68,6 +68,7 @@ bool ramblock_is_ignored(RAMBlock *block);
     if (!qemu_ram_is_migratable(block)) {} else
 
 int xbzrle_cache_resize(uint64_t new_size, Error **errp);
+size_t migration_ram_pagesize(RAMBlock *block);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
 void mig_throttle_counter_reset(void);

From patchwork Tue Jan 17 22:09:06 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727767
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 13/21] migration: Add migration_ram_pagesize_largest()
Date: Tue, 17 Jan 2023 17:09:06 -0500
Message-Id: <20230117220914.2062125-14-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

Let it replace the old qemu_ram_pagesize_largest(), fetching page sizes via
migration_ram_pagesize() so that the effect of double mapping is taken into
account during migration. Also don't account for ignored ramblocks, as they
won't be migrated.

Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
---
 include/exec/cpu-common.h |  1 -
 migration/migration.c     |  2 +-
 migration/ram.c           | 12 ++++++++++++
 migration/ram.h           |  1 +
 softmmu/physmem.c         | 13 -------------
 5 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 94452aa17f..4c394ccdfc 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -96,7 +96,6 @@ int qemu_ram_get_fd(RAMBlock *rb);
 
 size_t qemu_ram_pagesize(RAMBlock *block);
 bool qemu_ram_is_hugetlb(RAMBlock *rb);
-size_t qemu_ram_pagesize_largest(void);
 
 /**
  * cpu_address_space_init:
diff --git a/migration/migration.c b/migration/migration.c
index f6fe474fc3..7724e00c47 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -604,7 +604,7 @@ process_incoming_migration_co(void *opaque)
 
     assert(mis->from_src_file);
     mis->migration_incoming_co = qemu_coroutine_self();
-    mis->largest_page_size = qemu_ram_pagesize_largest();
+    mis->largest_page_size = migration_ram_pagesize_largest();
     postcopy_state_set(POSTCOPY_INCOMING_NONE);
     migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
                       MIGRATION_STATUS_ACTIVE);
diff --git a/migration/ram.c b/migration/ram.c
index 945c6477fd..2ebf414f5f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -135,6 +135,18 @@ size_t migration_ram_pagesize(RAMBlock *block)
     return qemu_ram_pagesize(block);
 }
 
+size_t migration_ram_pagesize_largest(void)
+{
+    RAMBlock *block;
+    size_t largest = 0;
+
+    RAMBLOCK_FOREACH_NOT_IGNORED(block) {
+        largest = MAX(largest, migration_ram_pagesize(block));
+    }
+
+    return largest;
+}
+
 static void XBZRLE_cache_lock(void)
 {
     if (migrate_use_xbzrle()) {
diff --git a/migration/ram.h b/migration/ram.h
index 162b3e7cb8..cefe166841 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -69,6 +69,7 @@ bool ramblock_is_ignored(RAMBlock *block);
 
 int xbzrle_cache_resize(uint64_t new_size, Error **errp);
 size_t migration_ram_pagesize(RAMBlock *block);
+size_t migration_ram_pagesize_largest(void);
 uint64_t ram_bytes_remaining(void);
 uint64_t ram_bytes_total(void);
 void mig_throttle_counter_reset(void);
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index cdda7eaea5..536c204811 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1813,19 +1813,6 @@ bool qemu_ram_is_hugetlb(RAMBlock *rb)
     return rb->page_size > qemu_real_host_page_size();
 }
 
-/* Returns the largest size of page in use */
-size_t qemu_ram_pagesize_largest(void)
-{
-    RAMBlock *block;
-    size_t largest = 0;
-
-    RAMBLOCK_FOREACH(block) {
-        largest = MAX(largest, qemu_ram_pagesize(block));
-    }
-
-    return largest;
-}
-
 static int memory_try_enable_merging(void *addr, size_t len)
 {
     if (!machine_mem_merge(current_machine)) {

From patchwork Tue Jan 17 22:09:07 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727779
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela,
 peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 14/21] migration: Map hugetlbfs ramblocks twice, and pre-allocate
Date: Tue, 17 Jan 2023 17:09:07 -0500
Message-Id: <20230117220914.2062125-15-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

Add RAMBlock.host_mirror for all hugetlbfs-backed guest memory. It holds a
second mapping of the same region, which is used to service page faults with
UFFDIO_CONTINUE.

To make sure all accesses to these ranges generate minor page faults rather
than missing page faults, we need to pre-allocate the files so that the page
cache exists from the very beginning.

Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
---
 include/exec/ramblock.h |  7 +++++
 migration/ram.c         | 59 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 3f31ce1591..c76683c3c8 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -28,6 +28,13 @@ struct RAMBlock {
     struct rcu_head rcu;
     struct MemoryRegion *mr;
     uint8_t *host;
+    /*
+     * This is only used for hugetlbfs ramblocks where doublemap is
+     * enabled. The pointer is managed by dest host migration code, and
+     * should be NULL when migration is finished. On src host, it should
+     * always be NULL.
+     */
+    uint8_t *host_mirror;
     uint8_t *colo_cache; /* For colo, VM's ram cache */
     ram_addr_t offset;
     ram_addr_t used_length;
diff --git a/migration/ram.c b/migration/ram.c
index 2ebf414f5f..37d7b3553a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3879,6 +3879,57 @@ void colo_release_ram_cache(void)
     ram_state_cleanup(&ram_state);
 }
 
+static int migrate_hugetlb_doublemap_init(void)
+{
+    RAMBlock *rb;
+    void *addr;
+    int ret;
+
+    if (!migrate_hugetlb_doublemap()) {
+        return 0;
+    }
+
+    RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
+        if (qemu_ram_is_hugetlb(rb)) {
+            /*
+             * Firstly, we remap the same ramblock into another range of
+             * virtual address, so that we can write to the pages without
+             * touching the page tables that directly mapped for the guest.
+             */
+            addr = ramblock_file_map(rb);
+            if (addr == MAP_FAILED) {
+                ret = -errno;
+                error_report("%s: Duplicate mapping for hugetlb ramblock '%s'"
+                             " failed: %s", __func__, qemu_ram_get_idstr(rb),
+                             strerror(errno));
+                return ret;
+            }
+            rb->host_mirror = addr;
+
+            /*
+             * We need to make sure we pre-allocate the range with
+             * hugetlbfs pages before hand, so that all the page fault will
+             * be trapped as MINOR faults always, rather than MISSING
+             * faults in userfaultfd.
+             */
+            ret = qemu_madvise(addr, rb->mmap_length, QEMU_MADV_POPULATE_WRITE);
+            if (ret) {
+                error_report("Failed to populate hugetlb ramblock '%s': "
+                             "%s", qemu_ram_get_idstr(rb), strerror(-ret));
+                return ret;
+            }
+        }
+    }
+
+    /*
+     * When reach here, it means we've setup the mirror mapping for all the
+     * hugetlbfs pages. Hence when page fault happens, we'll be able to
+     * resolve page faults using UFFDIO_CONTINUE for hugetlbfs pages, but
+     * we'll keep using UFFDIO_COPY for anonymous pages.
+     */
+    return 0;
+}
+
 /**
  * ram_load_setup: Setup RAM for migration incoming side
  *
@@ -3893,6 +3944,10 @@ static int ram_load_setup(QEMUFile *f, void *opaque)
         return -1;
     }
 
+    if (migrate_hugetlb_doublemap_init()) {
+        return -1;
+    }
+
     xbzrle_load_setup();
     ramblock_recv_map_init();
 
@@ -3913,6 +3968,10 @@ static int ram_load_cleanup(void *opaque)
     RAMBLOCK_FOREACH_NOT_IGNORED(rb) {
         g_free(rb->receivedmap);
         rb->receivedmap = NULL;
+        if (rb->host_mirror) {
+            munmap(rb->host_mirror, rb->mmap_length);
+            rb->host_mirror = NULL;
+        }
     }
 
     return 0;

From patchwork Tue Jan 17 22:09:08 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727780
4NxNQT1lPKz23fp for ; Wed, 18 Jan 2023 09:11:29 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9t-0002pa-1p; Tue, 17 Jan 2023 17:10:21 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9m-0002VO-Ok for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:15 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9i-0007pg-RA for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:13 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993409; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Tet6KDshR52hByMThCbEyNb3otDtocztY2T3g7R+w9w=; b=RegdF2j1amoFWZ5oGSdQiWPrJZIRSzRMIHU+6/7ASTCG/wkRIJhgow7zZ6Bwsil9uYZpWa AnRVrAOqw3JLFdRlf+5uXqeq9kdRFTySkcfmJ3Ih4YfXmHBTdNSTTvdMkvqhTKYulT3OXq l2SLptSSc9KDJZvyNk2F3HdOMJ71lo8= Received: from mail-vs1-f71.google.com (mail-vs1-f71.google.com [209.85.217.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-633-l3ZVXT-ePB6u6c_1YkRVGA-1; Tue, 17 Jan 2023 17:10:00 -0500 X-MC-Unique: l3ZVXT-ePB6u6c_1YkRVGA-1 Received: by mail-vs1-f71.google.com with SMTP id 68-20020a670347000000b003bf750cb86eso8138825vsd.8 for ; Tue, 17 Jan 2023 14:10:00 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Tet6KDshR52hByMThCbEyNb3otDtocztY2T3g7R+w9w=; 
b=MaKdqVHEG9TA+G5kC3UHyXq69G0QoK2AbHNp1PVMF2qTfno2dWPwG61c91gax9U5NP O84x7lIsnrrP6Cih61/abofdQgJnqwYzqPSrWpQK5kQ63NsH375AIZbACRWKsARGnHW9 DKo8XrABfFYCaH6P5Pisp5ogY+bEQkiysmiJr8+meDle+OY9KAV4oFaUPGsp7igCTiMR FiLDPkrIfViyGCpV60+ZsYLr18R6O4v8gdlkB0alZWgT9FulP0T1ekFGA3URjDQsJrYr /G2DAs6+ZWSosuRZRUVYk3CLncDIMtDUabPt3pdVHdBy1STUdUhIuJsCuZFSLEvja+IG gkpg== X-Gm-Message-State: AFqh2kq4le6vOD47xn7lXXgi1qcfl7b5hTlTsaXvLnp/dtmqKGQayS41 9ocWV46i1PhaE6sOp1noK3c8vaLKj7b9+clfUqXLD1KjjO9H1yL+qZ8PZTRicvovApaPZTYEovj ZkKAvHjPJbtuHjfUdyhlAjiR/clA3UXbZhP/vxCFQIt1hSQ5qfrTJiYiu47x/Dwc3 X-Received: by 2002:a05:6102:941:b0:3ce:b848:d673 with SMTP id a1-20020a056102094100b003ceb848d673mr2916594vsi.32.1673993397684; Tue, 17 Jan 2023 14:09:57 -0800 (PST) X-Google-Smtp-Source: AMrXdXuqkJLrYhxHRdX8hihOW/AWMfuSuL07PVB97bsNcj6Q7A6XkiyZ/gF9R+x449HElhoAFuM/gQ== X-Received: by 2002:a05:6102:941:b0:3ce:b848:d673 with SMTP id a1-20020a056102094100b003ceb848d673mr2916560vsi.32.1673993397290; Tue, 17 Jan 2023 14:09:57 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.09.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:09:55 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . 
David Alan Gilbert" Subject: [PATCH RFC 15/21] migration: Teach qemu about minor faults and doublemap Date: Tue, 17 Jan 2023 17:09:08 -0500 Message-Id: <20230117220914.2062125-16-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org When a ramblock is backed by hugetlbfs and the user specified using double-map feature, we trap the faults on these regions using minor mode. Teach QEMU about that. Add some sanity check on the fault flags when receiving a uffd message. For minor fault trapped ranges, we should always see the MINOR flag set, while when using generic missing faults we should never see it. 
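For illustration only, the flag check described above can be reduced to a small standalone helper. This is a sketch rather than the code in the patch: `sketch_fault_flags_ok` and `SKETCH_PAGEFAULT_FLAG_MINOR` are hypothetical names, and the bit value is an assumption standing in for UFFD_PAGEFAULT_FLAG_MINOR from the kernel uapi header.

```c
#include <assert.h>   /* only needed by callers that assert on the result */
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for UFFD_PAGEFAULT_FLAG_MINOR. The bit position is
 * an assumption made so the sketch builds without <linux/userfaultfd.h>.
 */
#define SKETCH_PAGEFAULT_FLAG_MINOR (UINT64_C(1) << 2)

/*
 * Return true when the fault flags agree with the mode the range was
 * registered with: the MINOR bit must be set if and only if minor-fault
 * trapping is in use for that ramblock.
 */
static bool sketch_fault_flags_ok(bool use_minor_fault, uint64_t fault_flags)
{
    bool minor_flag = (fault_flags & SKETCH_PAGEFAULT_FLAG_MINOR) != 0;

    /* XOR of two bools is true exactly when they disagree. */
    return !(use_minor_fault ^ minor_flag);
}
```

A caller would pass the per-ramblock decision (postcopy_use_minor_fault() in this series) together with the flags field of the received fault message, and report an error when the helper returns false.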
Signed-off-by: Peter Xu Reviewed-by: Juan Quintela --- migration/postcopy-ram.c | 99 ++++++++++++++++++++++++++++++++-------- migration/postcopy-ram.h | 1 + 2 files changed, 81 insertions(+), 19 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index acae1dc6ae..86ff73c2c0 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -325,12 +325,25 @@ static bool ufd_check_and_apply(int ufd, MigrationIncomingState *mis) if (qemu_real_host_page_size() != ram_pagesize_summary()) { bool have_hp = false; - /* We've got a huge page */ + + /* + * If we're using doublemap, we need MINOR fault, otherwise we need + * MISSING fault (which is the default). + */ + if (migrate_hugetlb_doublemap()) { +#ifdef UFFD_FEATURE_MINOR_HUGETLBFS + have_hp = supported_features & UFFD_FEATURE_MINOR_HUGETLBFS; +#endif + } else { #ifdef UFFD_FEATURE_MISSING_HUGETLBFS - have_hp = supported_features & UFFD_FEATURE_MISSING_HUGETLBFS; + have_hp = supported_features & UFFD_FEATURE_MISSING_HUGETLBFS; #endif + } + if (!have_hp) { - error_report("Userfault on this host does not support huge pages"); + error_report("Userfault on this host does not support huge pages " + "with %s fault traps", migrate_hugetlb_doublemap() ? + "MINOR" : "MISSING"); return false; } } @@ -669,22 +682,43 @@ static int ram_block_enable_notify(RAMBlock *rb, void *opaque) { MigrationIncomingState *mis = opaque; struct uffdio_register reg_struct; + bool minor_fault = postcopy_use_minor_fault(rb); reg_struct.range.start = (uintptr_t)qemu_ram_get_host_addr(rb); reg_struct.range.len = rb->postcopy_length; + + /* + * For hugetlbfs with double-map enabled, we trap pages using minor + * mode, otherwise we use missing mode. Note: we also register missing + * mode for doublemap, but we should never hit it. 
+ */ reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; + if (minor_fault) { + reg_struct.mode |= UFFDIO_REGISTER_MODE_MINOR; + } /* Now tell our userfault_fd that it's responsible for this area */ if (ioctl(mis->userfault_fd, UFFDIO_REGISTER, &reg_struct)) { error_report("%s userfault register: %s", __func__, strerror(errno)); return -1; } - if (!(reg_struct.ioctls & ((__u64)1 << _UFFDIO_COPY))) { - error_report("%s userfault: Region doesn't support COPY", __func__); - return -1; - } - if (reg_struct.ioctls & ((__u64)1 << _UFFDIO_ZEROPAGE)) { - qemu_ram_set_uf_zeroable(rb); + + if (minor_fault) { + /* Using minor faults for this ramblock */ + if (!(reg_struct.ioctls & ((__u64)1 << _UFFDIO_CONTINUE))) { + error_report("%s userfault: Region doesn't support CONTINUE", + __func__); + return -1; + } + } else { + /* Using missing faults for this ramblock */ + if (!(reg_struct.ioctls & ((__u64)1 << _UFFDIO_COPY))) { + error_report("%s userfault: Region doesn't support COPY", __func__); + return -1; + } + if (reg_struct.ioctls & ((__u64)1 << _UFFDIO_ZEROPAGE)) { + qemu_ram_set_uf_zeroable(rb); + } } return 0; @@ -916,6 +950,7 @@ static void *postcopy_ram_fault_thread(void *opaque) { MigrationIncomingState *mis = opaque; struct uffd_msg msg; + uint64_t address; int ret; size_t index; RAMBlock *rb = NULL; @@ -945,6 +980,7 @@ static void *postcopy_ram_fault_thread(void *opaque) } while (true) { + bool use_minor_fault, minor_flag; ram_addr_t rb_offset; int poll_result; @@ -1022,22 +1058,37 @@ static void *postcopy_ram_fault_thread(void *opaque) break; } - rb_offset = ROUND_DOWN(rb_offset, migration_ram_pagesize(rb)); - trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address, - qemu_ram_get_idstr(rb), - rb_offset, - msg.arg.pagefault.feat.ptid); - mark_postcopy_blocktime_begin( - (uintptr_t)(msg.arg.pagefault.address), - msg.arg.pagefault.feat.ptid, rb); + address = ROUND_DOWN(msg.arg.pagefault.address, + migration_ram_pagesize(rb)); + use_minor_fault =
postcopy_use_minor_fault(rb); + minor_flag = !!(msg.arg.pagefault.flags & + UFFD_PAGEFAULT_FLAG_MINOR); + /* + * Do sanity check on the message flags to make sure this is + * the one we expect to receive. When using minor fault on + * this ramblock, it should _always_ be set; when not using + * minor fault, it should _never_ be set. + */ + if (use_minor_fault ^ minor_flag) { + error_report("%s: Unexpected page fault flags (0x%"PRIx64") " + "for address 0x%"PRIx64" (mode=%s)", __func__, + (uint64_t)msg.arg.pagefault.flags, + (uint64_t)msg.arg.pagefault.address, + use_minor_fault ? "MINOR" : "MISSING"); + } + + trace_postcopy_ram_fault_thread_request( + address, qemu_ram_get_idstr(rb), rb_offset, + msg.arg.pagefault.feat.ptid); + mark_postcopy_blocktime_begin( + (uintptr_t)(address), msg.arg.pagefault.feat.ptid, rb); retry: /* * Send the request to the source - we want to request one * of our host page sizes (which is >= TPS) */ - ret = postcopy_request_page(mis, rb, rb_offset, - msg.arg.pagefault.address); + ret = postcopy_request_page(mis, rb, rb_offset, address); if (ret) { /* May be network failure, try to wait for recovery */ postcopy_pause_fault_thread(mis); @@ -1694,3 +1745,13 @@ void *postcopy_preempt_thread(void *opaque) return NULL; } + +/* + * Whether we should use MINOR fault to trap page faults? It will be used + * when doublemap is enabled on hugetlbfs. The default value will be + * false, which means we'll keep using the legacy MISSING faults. 
+ */ +bool postcopy_use_minor_fault(RAMBlock *rb) +{ + return migrate_hugetlb_doublemap() && qemu_ram_is_hugetlb(rb); +} diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index b4867a32d5..32734d2340 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -193,5 +193,6 @@ enum PostcopyChannels { void postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file); void postcopy_preempt_setup(MigrationState *s); int postcopy_preempt_establish_channel(MigrationState *s); +bool postcopy_use_minor_fault(RAMBlock *rb); #endif

From patchwork Tue Jan 17 22:09:09 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727777
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela, peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 16/21] migration: Enable doublemap with MADV_SPLIT
Date: Tue, 17 Jan 2023 17:09:09 -0500
Message-Id: <20230117220914.2062125-17-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

MADV_SPLIT enables doublemap on hugetlb. Do that if doublemap=true is specified for the migration.

Signed-off-by: Peter Xu Reviewed-by: Juan Quintela --- migration/postcopy-ram.c | 16 ++++++++++++++++ migration/ram.c | 18 ++++++++++++++++++ 2 files changed, 34 insertions(+) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 86ff73c2c0..dbc7e54e4a 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -694,6 +694,22 @@ static int ram_block_enable_notify(RAMBlock *rb, void *opaque) */ reg_struct.mode = UFFDIO_REGISTER_MODE_MISSING; if (minor_fault) { + /* + * MADV_SPLIT implicitly enables doublemap mode for hugetlb. If + * that fails (e.g. on old kernels) we need to fail the migration.
+ * + * It's a bit late to fail here as we could have migrated lots of + * pages in precopy, but early failure will require us to allocate + * hugetlb pages secretly in QEMU which is not friendly to admins + * and it may affect the global hugetlb pool. Considering it is + * normally always limited, keep the failure late but tolerable. + */ + if (qemu_madvise(qemu_ram_get_host_addr(rb), rb->postcopy_length, + QEMU_MADV_SPLIT)) { + error_report("%s: madvise(MADV_SPLIT) failed (ret=%d) but " + "required for doublemap.", __func__, -errno); + return -1; + } reg_struct.mode |= UFFDIO_REGISTER_MODE_MINOR; } diff --git a/migration/ram.c b/migration/ram.c index 37d7b3553a..4d786f4b97 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3891,6 +3891,19 @@ static int migrate_hugetlb_doublemap_init(void) RAMBLOCK_FOREACH_NOT_IGNORED(rb) { if (qemu_ram_is_hugetlb(rb)) { + /* + * MADV_SPLIT implicitly enables doublemap mode for hugetlb on + * the guest mapped ranges. If that fails (e.g. on old + * kernels) we need to fail the migration. Note, the + * host_mirror mapping below can be kept as hugely mapped. + */ + if (qemu_madvise(qemu_ram_get_host_addr(rb), rb->mmap_length, + QEMU_MADV_SPLIT)) { + error_report("%s: madvise(MADV_SPLIT) required for doublemap", + __func__); + return -1; + } + /* * Firstly, we remap the same ramblock into another range of * virtual address, so that we can write to the pages without @@ -3898,6 +3911,11 @@ static int migrate_hugetlb_doublemap_init(void) */ addr = ramblock_file_map(rb); if (addr == MAP_FAILED) { + /* + * No need to undo MADV_SPLIT because this is dest node and + * we're going to bail out anyway. Leave that for mm exit + * to clean things up. 
+ */ ret = -errno; error_report("%s: Duplicate mapping for hugetlb ramblock '%s'" "failed: %s", __func__, qemu_ram_get_idstr(rb),

From patchwork Tue Jan 17 22:09:10 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727771
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela, peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 17/21] migration: Rework ram discard logic for hugetlb double-map
Date: Tue, 17 Jan 2023 17:09:10 -0500
Message-Id: <20230117220914.2062125-18-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

Hugetlb double-map makes the RAM discard logic different. The overall idea stays the same: we need to do a bitmap sync between src/dst before we switch to postcopy.
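As a rough illustration of the discard behaviour this patch depends on (page tables dropped while the file's pages survive), here is a minimal standalone sketch. It is not QEMU code and makes several assumptions: it uses an ordinary scratch file under /tmp rather than hugetlbfs, the name `sketch_zap_keeps_page_cache` is hypothetical, and it does not involve userfaultfd at all. It only shows that MADV_DONTNEED on a shared file mapping removes the mapping's page table entries without throwing away the data written through a second mapping of the same file.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map one page of a scratch file twice ("guest" and "mirror"), fill it via
 * the mirror, zap the guest mapping with MADV_DONTNEED, then read it back
 * through the guest mapping.
 * Returns 1 if the data survived, 0 if it was lost, -1 on a setup error.
 */
static int sketch_zap_keeps_page_cache(void)
{
    char path[] = "/tmp/zap-sketch-XXXXXX";   /* hypothetical scratch file */
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *guest, *mirror;
    int fd, survived = -1;

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                             /* the fd keeps the file alive */
    if (page <= 0 || ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }

    guest = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    mirror = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (guest != MAP_FAILED && mirror != MAP_FAILED) {
        memset(mirror, 0x5a, (size_t)page);   /* populate through the mirror */
        assert(guest[0] == 0x5a);             /* guest mapping is now faulted in */

        /* Drop the guest mapping's page tables; the file page stays cached. */
        if (madvise(guest, (size_t)page, MADV_DONTNEED) == 0) {
            survived = (guest[0] == 0x5a && guest[page - 1] == 0x5a);
        }
    }

    if (guest != MAP_FAILED) {
        munmap(guest, (size_t)page);
    }
    if (mirror != MAP_FAILED) {
        munmap(mirror, (size_t)page);
    }
    close(fd);
    return survived;
}
```

In the sketch the second access through the guest mapping is resolved silently by the kernel. The series instead registers the range with userfaultfd in minor mode, so the same access is expected to be reported to QEMU as a MINOR fault.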
When discarding a range, we only erase the page tables that were mapped for the guest, leveraging the semantics of MADV_DONTNEED on Linux. This guarantees that when a guest access is triggered we'll receive a MINOR fault message rather than a MISSING fault message.

Signed-off-by: Peter Xu Reviewed-by: Juan Quintela --- include/exec/cpu-common.h | 1 + migration/ram.c | 16 +++++++++++++++- migration/trace-events | 1 + softmmu/physmem.c | 31 +++++++++++++++++++++++++++++++ 4 files changed, 48 insertions(+), 1 deletion(-) diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 4c394ccdfc..09378c6ada 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -155,6 +155,7 @@ typedef int (RAMBlockIterFunc)(RAMBlock *rb, void *opaque); int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque); int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length); +int ram_block_zap_range(RAMBlock *rb, uint64_t start, size_t length); #endif diff --git a/migration/ram.c b/migration/ram.c index 4d786f4b97..4da56d925c 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2770,6 +2770,12 @@ static void postcopy_each_ram_send_discard(MigrationState *ms) * host-page size chunks, mark any partially dirty host-page size * chunks as all dirty. In this case the host-page is the host-page * for the particular RAMBlock, i.e. it might be a huge page. + * + * Note: we need to do huge page truncation when double-map is + * enabled too, _only_ because we use MADV_DONTNEED to drop + * pgtables on dest QEMU, and it (at least so far...) does not + * support dropping partial of the hugetlb pgtables. If it can one + * day, we can skip this "chunk" operation as further optimization.
*/ postcopy_chunk_hostpages_pass(ms, block); @@ -2913,7 +2919,15 @@ int ram_discard_range(const char *rbname, uint64_t start, size_t length) length >> qemu_target_page_bits()); } - return ram_block_discard_range(rb, start, length); + if (postcopy_use_minor_fault(rb)) { + /* + * We need to keep the page cache exist, so as to trigger MINOR + * faults for every future page accesses on old pages. + */ + return ram_block_zap_range(rb, start, length); + } else { + return ram_block_discard_range(rb, start, length); + } } /* diff --git a/migration/trace-events b/migration/trace-events index 57003edcbd..6b418a0e9e 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -92,6 +92,7 @@ migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64 migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx" migration_throttle(void) "" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" +postcopy_discard_range(const char *rbname, uint64_t start, void *host, size_t len) "%s: start=%" PRIx64 " haddr=%p len=%zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" ram_postcopy_send_discard_bitmap(void) "" diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 536c204811..12c0bc9aee 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -3567,6 +3567,37 @@ int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque) return ret; } +/* + * Zap page tables for specified range. Only applicable for file-backed + * memory. We're relying on Linux's MADV_DONTNEED behavior here for + * zapping the pgtables, it may or may not work on other OSes. Before we + * know that, fail them. 
+ */ +int ram_block_zap_range(RAMBlock *rb, uint64_t start, size_t length) +{ +#ifdef CONFIG_LINUX + uint8_t *host_addr = rb->host + start; + int ret; + + if (rb->fd == -1) { + /* The zap magic only works with file-backed */ + return -EINVAL; + } + + ret = madvise(host_addr, length, MADV_DONTNEED); + if (ret) { + ret = -errno; + error_report("%s: Failed to zap ramblock start=0x%"PRIx64 + " addr=0x%"PRIx64" length=0x%zx", __func__, + start, (uint64_t)host_addr, length); + } + + return ret; +#else + return -EINVAL; +#endif +} + /* * Unmap pages of memory from start to start+length such that * they a) read as 0, b) Trigger whatever fault mechanism

From patchwork Tue Jan 17 22:09:11 2023
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1727772
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Leonardo Bras Soares Passos, James Houghton, Juan Quintela, peterx@redhat.com, "Dr . David Alan Gilbert"
Subject: [PATCH RFC 18/21] migration: Allow postcopy_register_shared_ufd() to fail
Date: Tue, 17 Jan 2023 17:09:11 -0500
Message-Id: <20230117220914.2062125-19-peterx@redhat.com>
In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com>
References: <20230117220914.2062125-1-peterx@redhat.com>

For now, fail double-map for vhost-user and any other potential user that can have a remote userfaultfd.
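The calling convention this change introduces (0 on success, a negative errno value on failure) can be sketched in isolation as follows. Everything here is hypothetical and only mirrors the shape of the change: `struct sketch_state`, `sketch_register_shared_ufd` and `sketch_register_error` are made-up names, and the fixed-size array stands in for the GArray QEMU uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_FDS 4

/* Hypothetical stand-in for the incoming-migration state. */
struct sketch_state {
    bool doublemap_enabled;
    int remote_fds[SKETCH_MAX_FDS];
    size_t nr_remote_fds;
};

/* Register a remote fd. Returns 0 on success, a negative errno on failure. */
static int sketch_register_shared_ufd(struct sketch_state *s, int fd)
{
    if (s->doublemap_enabled) {
        /* Remote userfaultfds are refused while double-map is enabled. */
        return -EINVAL;
    }
    if (s->nr_remote_fds == SKETCH_MAX_FDS) {
        return -ENOSPC;
    }
    s->remote_fds[s->nr_remote_fds++] = fd;
    return 0;
}

/*
 * Caller side: strerror() expects a positive errno value, so the negative
 * return code is negated before it is turned into a message.
 */
static const char *sketch_register_error(int ret)
{
    return ret < 0 ? strerror(-ret) : NULL;
}
```

A caller such as a vhost-user backend would check the return value, build its error message from `sketch_register_error(ret)`, and propagate `ret` upwards so that the migration fails early.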
Signed-off-by: Peter Xu Reviewed-by: Juan Quintela --- hw/virtio/vhost-user.c | 9 ++++++++- migration/postcopy-ram.c | 9 +++++++-- migration/postcopy-ram.h | 4 ++-- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index d9ce0501b2..00351bd67a 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -1952,7 +1952,14 @@ static int vhost_user_postcopy_advise(struct vhost_dev *dev, Error **errp) u->postcopy_fd.handler = vhost_user_postcopy_fault_handler; u->postcopy_fd.waker = vhost_user_postcopy_waker; u->postcopy_fd.idstr = "vhost-user"; /* Need to find unique name */ - postcopy_register_shared_ufd(&u->postcopy_fd); + + ret = postcopy_register_shared_ufd(&u->postcopy_fd); + if (ret) { + error_setg(errp, "%s: Register of shared userfaultfd failed: %s", + __func__, strerror(-ret)); + return ret; + } + return 0; #else error_setg(errp, "Postcopy not supported on non-Linux systems"); diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index dbc7e54e4a..0cfe5174a5 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1582,14 +1582,19 @@ PostcopyState postcopy_state_set(PostcopyState new_state) } /* Register a handler for external shared memory postcopy - * called on the destination. + * called on the destination. Returns 0 if success, <0 for err. 
*/ -void postcopy_register_shared_ufd(struct PostCopyFD *pcfd) +int postcopy_register_shared_ufd(struct PostCopyFD *pcfd) { MigrationIncomingState *mis = migration_incoming_get_current(); + if (migrate_hugetlb_doublemap()) { + return -EINVAL; + } + mis->postcopy_remote_fds = g_array_append_val(mis->postcopy_remote_fds, *pcfd); + return 0; } /* Unregister a handler for external shared memory postcopy diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index 32734d2340..94adad6fb8 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -161,9 +161,9 @@ struct PostCopyFD { }; /* Register a userfaultfd owned by an external process for - * shared memory. + * shared memory. Returns 0 if succeeded, <0 if error. */ -void postcopy_register_shared_ufd(struct PostCopyFD *pcfd); +int postcopy_register_shared_ufd(struct PostCopyFD *pcfd); void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd); /* Call each of the shared 'waker's registered telling them of * availability of a block. 
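Since postcopy_register_shared_ufd() now returns 0 or a negative errno, callers are expected to check the result and propagate it. The sketch below is a minimal standalone model of that contract, not QEMU code: `struct shared_ufd`, `register_shared_ufd()`, `advise()`, `doublemap_enabled` and the fixed-size table are hypothetical stand-ins for PostCopyFD, the registration function, its vhost-user caller, migrate_hugetlb_doublemap() and the GArray of remote fds. It assumes the usual convention that a negative errno has to be negated before it is handed to strerror().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's struct PostCopyFD. */
struct shared_ufd {
    int fd;
    const char *idstr;
};

#define MAX_SHARED_UFDS 8

/* Hypothetical stand-in for mis->postcopy_remote_fds. */
static struct shared_ufd registered[MAX_SHARED_UFDS];
static size_t n_registered;

/* Hypothetical stand-in for migrate_hugetlb_doublemap(). */
static bool doublemap_enabled;

/* Returns 0 on success, a negative errno on failure. */
static int register_shared_ufd(const struct shared_ufd *ufd)
{
    assert(ufd != NULL);

    if (doublemap_enabled) {
        /* Remote userfaultfds are refused while double-map is on. */
        return -EINVAL;
    }
    if (n_registered == MAX_SHARED_UFDS) {
        return -ENOSPC;
    }
    registered[n_registered++] = *ufd;
    return 0;
}

/*
 * Caller-side pattern: check the result, format a message and
 * propagate the negative errno unchanged.
 */
static int advise(const struct shared_ufd *ufd, char *err, size_t errlen)
{
    int ret = register_shared_ufd(ufd);

    if (ret < 0) {
        /* ret is negative, strerror() wants the positive errno. */
        snprintf(err, errlen, "%s: register of shared userfaultfd failed: %s",
                 __func__, strerror(-ret));
        return ret;
    }
    return 0;
}
```

With `doublemap_enabled` set, advise() returns -EINVAL, leaves the table untouched and fills `err` with the text for EINVAL; with it clear, the fd is appended and 0 is returned.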
From patchwork Tue Jan 17 22:09:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727778 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=eVMbDNGJ; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNQM48mPz23fp for ; Wed, 18 Jan 2023 09:11:23 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9q-0002eC-9Z; Tue, 17 Jan 2023 17:10:18 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9m-0002VL-OG for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:15 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9k-0007q5-FD for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:13 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993411; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=ZrG55HiwnGXPFSX6+W/ipSAfxUfsFmficRwkiZNYMjg=; b=eVMbDNGJLJIh37/Jb7A6cD5xjcBNXgZPRvTl4sIQ7sutfH9Fmc2lRJV92eCTHa2LMfa+d8 wlGOlm0A2jlQdvT4sJ2WZ3awbzqOBkQWzX4VOOyu769OeEbCicLyVUFwdPLf3EWbcRSwmh 7iaxvjINrmu9rM4sEaUP1m1bZD0/tcg= Received: from mail-vs1-f72.google.com (mail-vs1-f72.google.com [209.85.217.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-307-21cWty5LMDGJzT_oxSVWjw-1; Tue, 17 Jan 2023 17:10:08 -0500 X-MC-Unique: 21cWty5LMDGJzT_oxSVWjw-1 Received: by mail-vs1-f72.google.com with SMTP id k8-20020a056102004800b003d0f2b18a22so4118641vsp.5 for ; Tue, 17 Jan 2023 14:10:08 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZrG55HiwnGXPFSX6+W/ipSAfxUfsFmficRwkiZNYMjg=; b=aAQBEu42JaWxpn2tpvMgsoSDZFJMKnk4z5xuqnr27MrBtM5S4XTB+6jiKcMwxcw1iz cZRBntsmfMGtgywW7tY94RV+hC+dqoigYSlXLpArS3uEw/vaAbC8H6Ykf1Mmwe3mq0c2 JCMhyk/3XTu5/kr1kUv7Wu2YtDNoxGREr3bNnhC/2kA6Len+4kpTi0gBZxTZPYc6CniD JT7oss/+XTN0w6IajI9RXY/girvyFkZlvNA+iM4MSHcmdzJR2BnMN8Js8KDOAJelyiIA KKrVceYnX2cKh+VsU6yqnGuRg5rIeC7D4ZCnPENj/tFnHYmdH53QdmWoiC7uiAQpwqk6 FDmQ== X-Gm-Message-State: AFqh2koprK3T/5VjAGPN93miermjbYAG8u+/zeZ1/KTKskt2gITboOW8 BnAnOoM3buFXZVcJpgh0tOHyLOrjhXa3Q/YSTys7u8Ii6/RLNNyf53SfoUWwO7Xsb1SsW1zSd1C EMq2cs1nlBjHVXzb0eFm7Q3I5FAVHWudNkXok9pV7p0Q4iODTO5eRdRIRYX3P1FiM X-Received: by 2002:a05:6122:106a:b0:3bd:795f:27b2 with SMTP id k10-20020a056122106a00b003bd795f27b2mr2418111vko.7.1673993407307; Tue, 17 Jan 2023 14:10:07 -0800 (PST) X-Google-Smtp-Source: AMrXdXsnJFBsunKabOjIpHKNV5cxf2l1L5FHTd0jjAHImX8iFp9CmXE9OWX9WpiJISWv/kC15wNnxw== X-Received: by 2002:a05:6122:106a:b0:3bd:795f:27b2 with SMTP id k10-20020a056122106a00b003bd795f27b2mr2418079vko.7.1673993406949; Tue, 17 Jan 2023 14:10:06 -0800 (PST) Received: from x1n.redhat.com 
(bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.10.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:10:06 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . David Alan Gilbert" Subject: [PATCH RFC 19/21] migration: Add postcopy_mark_received() Date: Tue, 17 Jan 2023 17:09:12 -0500 Message-Id: <20230117220914.2062125-20-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org We have some maintenance work to do after we UFFDIO_[ZERO]COPY a page, e.g. updating the list of requested pages or measuring page latencies. Move those steps into a separate function so that they can be easily reused when we add support for UFFDIO_CONTINUE. 
Signed-off-by: Peter Xu Reviewed-by: Juan Quintela --- migration/postcopy-ram.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 0cfe5174a5..8a2259581e 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1288,6 +1288,25 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis) return 0; } +static void +postcopy_mark_received(MigrationIncomingState *mis, RAMBlock *rb, + void *host_addr, size_t npages) +{ + qemu_mutex_lock(&mis->page_request_mutex); + ramblock_recv_bitmap_set_range(rb, host_addr, npages); + /* + * If this page resolves a page fault for a previous recorded faulted + * address, take a special note to maintain the requested page list. + */ + if (g_tree_lookup(mis->page_requested, host_addr)) { + g_tree_remove(mis->page_requested, host_addr); + mis->page_requested_count--; + trace_postcopy_page_req_del(host_addr, mis->page_requested_count); + } + qemu_mutex_unlock(&mis->page_request_mutex); + mark_postcopy_blocktime_end((uintptr_t)host_addr); +} + static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, void *from_addr, uint64_t pagesize, RAMBlock *rb) { @@ -1309,20 +1328,8 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); } if (!ret) { - qemu_mutex_lock(&mis->page_request_mutex); - ramblock_recv_bitmap_set_range(rb, host_addr, - pagesize / qemu_target_page_size()); - /* - * If this page resolves a page fault for a previous recorded faulted - * address, take a special note to maintain the requested page list. 
- */ - if (g_tree_lookup(mis->page_requested, host_addr)) { - g_tree_remove(mis->page_requested, host_addr); - mis->page_requested_count--; - trace_postcopy_page_req_del(host_addr, mis->page_requested_count); - } - qemu_mutex_unlock(&mis->page_request_mutex); - mark_postcopy_blocktime_end((uintptr_t)host_addr); + postcopy_mark_received(mis, rb, host_addr, + pagesize / qemu_target_page_size()); } return ret; } From patchwork Tue Jan 17 22:09:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727781 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=geyINHpG; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNQY32Xqz23fp for ; Wed, 18 Jan 2023 09:11:33 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHu9s-0002hy-5B; Tue, 17 Jan 2023 17:10:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9o-0002Wm-H7 for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:16 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1pHu9k-0007q7-Ju for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993412; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7+lyqMf7FS/xt0fUNaJo0X2RsMrbw86pu+I7arjRFgs=; b=geyINHpGCziy+DClnn/njyhq44Gl7IQP8GsE8aslR/uOljMvoUJVg8TWTdPOq6tFmSVQ4f PMnhnp3FgOIWa7Z/CsfKthQEOZTUmt19XQuvl/iv+3CLVlL1//3mXqIdUqAXgXtfyTkTXD yOt7gCeWG1HGAaKL7L0l+CIN86q4jr0= Received: from mail-yb1-f197.google.com (mail-yb1-f197.google.com [209.85.219.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-503-SM2TDBSFOya7wXUDsiKQew-1; Tue, 17 Jan 2023 17:10:10 -0500 X-MC-Unique: SM2TDBSFOya7wXUDsiKQew-1 Received: by mail-yb1-f197.google.com with SMTP id m187-20020a2558c4000000b007f17c91f06fso2796577ybb.6 for ; Tue, 17 Jan 2023 14:10:10 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7+lyqMf7FS/xt0fUNaJo0X2RsMrbw86pu+I7arjRFgs=; b=HcLzX7j1alPRfciiaheG8KHdUhOfoyRJTy9uAko6uOC7a2PJrxHapGa/XzP1cvNEgp wbhuEbmcj8S9zphXgUE4FKG15cCVrsg7FE2YBIcXpt6v+gXbwgS3xDPXk2cDFYHf9HXl HE2i6enOffeEn4H9Dk9eXw2iESOAn6vtv0GHuMJekzVPEMyGFSasndWrPI3SNwQGGg3v 4ZPE41kxkyvaLPuc3h/ptiUSKyg91Do2JSqRPx9BZg/Mm7BKKUi+6lYYVfpEVNycuyze 3WgRMnsR/4rySs5VqDFgBHD+tffG+DQ37ldLu1XXuwZu3Rm4li9VmofJ3e87KRajm3jw rz3A== X-Gm-Message-State: AFqh2krYv5WIgB3fSXUhGSTonytrW6Y7WbR+AWgnS3Ivz3LEsAqadKQu xRv3qP1U1CxNmaI3kD3+yw19oR0zA6wThuaWSl09mwNoYZCDvpRufvUnzpXV7dQKZaGsaGt8IYK 2CM9rraVa+pWTck6SIdifdPTIqo4ibNHrQSdzhOFcilqf4t5LxvtDJS1FaXsoQkWz X-Received: by 2002:a25:b52:0:b0:7c2:82a5:292f with SMTP id 
79-20020a250b52000000b007c282a5292fmr4760769ybl.32.1673993409301; Tue, 17 Jan 2023 14:10:09 -0800 (PST) X-Google-Smtp-Source: AMrXdXuh4diQw1eV/fPVemocx+IX7icllN9AA2J+U2iDK6T1MCoWJUzzF7OigYVFncpJzipHo4376w== X-Received: by 2002:a25:b52:0:b0:7c2:82a5:292f with SMTP id 79-20020a250b52000000b007c282a5292fmr4760741ybl.32.1673993408866; Tue, 17 Jan 2023 14:10:08 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.10.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:10:07 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . David Alan Gilbert" Subject: [PATCH RFC 20/21] migration: Handle page faults using UFFDIO_CONTINUE Date: Tue, 17 Jan 2023 17:09:13 -0500 Message-Id: <20230117220914.2062125-21-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Teach QEMU to be able to handle page faults using 
UFFDIO_CONTINUE for hugetlbfs double-mapped ranges. To copy the data, we memcpy() it into the mirror buffer created per ramblock; we can then kick the faulted threads with UFFDIO_CONTINUE, which installs the page tables. Move trace_postcopy_place_page() earlier so that it traces both the UFFDIO_COPY and the UFFDIO_CONTINUE paths. Signed-off-by: Peter Xu --- migration/postcopy-ram.c | 55 ++++++++++++++++++++++++++++++++++++++-- migration/trace-events | 4 +-- 2 files changed, 55 insertions(+), 4 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 8a2259581e..c4bd338e22 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1350,6 +1350,43 @@ int postcopy_notify_shared_wake(RAMBlock *rb, uint64_t offset) return 0; } +/* Returns the mirror_host addr for a specific host address in ramblock */ +static inline void *migration_ram_get_mirror_addr(RAMBlock *rb, void *host) +{ + return (void *)((__u64)rb->host_mirror + ((__u64)host - (__u64)rb->host)); +} + +static int +qemu_uffd_continue(MigrationIncomingState *mis, RAMBlock *rb, void *host, + void *from) +{ + void *mirror_addr = migration_ram_get_mirror_addr(rb, host); + /* Doublemap uses small host page size */ + uint64_t psize = qemu_real_host_page_size(); + struct uffdio_continue req; + + /* + * Copy data first into the mirror host pointer; we can't directly copy + * data into rb->host because otherwise our thread will get trapped too. 
+ */ + memcpy(mirror_addr, from, psize); + + /* Kick off the faulted threads to fetch data from the page cache */ + req.range.start = (__u64)host; + req.range.len = psize; + req.mode = 0; + if (ioctl(mis->userfault_fd, UFFDIO_CONTINUE, &req)) { + error_report("%s: UFFDIO_CONTINUE failed for start=%p" + " len=0x%"PRIx64": %s", __func__, host, + psize, strerror(-req.mapped)); + return req.mapped; + } + + postcopy_mark_received(mis, rb, host, psize / qemu_target_page_size()); + + return 0; +} + /* * Place a host page (from) at (host) atomically * returns 0 on success @@ -1359,6 +1396,18 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from, { size_t pagesize = migration_ram_pagesize(rb); + trace_postcopy_place_page(rb->idstr, (uint8_t *)host - rb->host, host); + + if (postcopy_use_minor_fault(rb)) { + /* + * If minor fault used, we use UFFDIO_CONTINUE instead. + * + * TODO: support shared uffds (e.g. vhost-user). Currently we're + * skipping them. + */ + return qemu_uffd_continue(mis, rb, host, from); + } + /* copy also acks to the kernel waking the stalled thread up * TODO: We can inhibit that ack and only do it if it was requested * which would be slightly cheaper, but we'd have to be careful @@ -1372,7 +1421,6 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from, return -e; } - trace_postcopy_place_page(host); return postcopy_notify_shared_wake(rb, qemu_ram_block_host_offset(rb, host)); } @@ -1385,10 +1433,13 @@ int postcopy_place_page_zero(MigrationIncomingState *mis, void *host, RAMBlock *rb) { size_t pagesize = migration_ram_pagesize(rb); - trace_postcopy_place_page_zero(host); + trace_postcopy_place_page_zero(rb->idstr, (uint8_t *)host - rb->host, host); /* Normal RAMBlocks can zero a page using UFFDIO_ZEROPAGE * but it's not available for everything (e.g. 
hugetlbpages) + * + * NOTE: when hugetlb double-map enabled, then this ramblock will never + * have RAM_UF_ZEROPAGE, so it'll always go to postcopy_place_page(). */ if (qemu_ram_is_uf_zeroable(rb)) { if (qemu_ufd_copy_ioctl(mis, host, NULL, pagesize, rb)) { diff --git a/migration/trace-events b/migration/trace-events index 6b418a0e9e..7baf235d22 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -265,8 +265,8 @@ postcopy_discard_send_range(const char *ramblock, unsigned long start, unsigned postcopy_cleanup_range(const char *ramblock, void *host_addr, size_t offset, size_t length) "%s: %p offset=0x%zx length=0x%zx" postcopy_init_range(const char *ramblock, void *host_addr, size_t offset, size_t length) "%s: %p offset=0x%zx length=0x%zx" postcopy_nhp_range(const char *ramblock, void *host_addr, size_t offset, size_t length) "%s: %p offset=0x%zx length=0x%zx" -postcopy_place_page(void *host_addr) "host=%p" -postcopy_place_page_zero(void *host_addr) "host=%p" +postcopy_place_page(const char *id, size_t offset, void *host_addr) "id=%s offset=0x%zx host=%p" +postcopy_place_page_zero(const char *id, size_t offset, void *host_addr) "id=%s offset=0x%zx host=%p" postcopy_ram_enable_notify(void) "" mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, int received) "addr: 0x%" PRIx64 ", dd: %p, time: %u, cpu: %d, already_received: %d" mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d" From patchwork Tue Jan 17 22:09:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1727774 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; 
envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=LhDH3EvA; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NxNQ70pz6z23fp for ; Wed, 18 Jan 2023 09:11:11 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pHuA5-00036m-DS; Tue, 17 Jan 2023 17:10:33 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9q-0002ez-F3 for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:19 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pHu9o-0007r0-9k for qemu-devel@nongnu.org; Tue, 17 Jan 2023 17:10:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673993414; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ia6RdXcdii0hyfKSWhplcLO26KVeNp4Sk3B79jomOc4=; b=LhDH3EvAAqaeA2W9yOf/dyPCihMs3VHwx8wyAE/9aQchGuq9gx3gUrUMXoapotoeOWHsm1 wxE9jRiBNcuxLJGqkg9sOAFJRSHQd88ltyycdXTmTx0fZao9FRPDx7PfS+Lmk+0lTY1k1l omQfKS1fgxlgeE43wVDSLDgbY8Q8mPk= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-479-0r7A6VeEO4-Ol9x1quo3Gw-1; Tue, 17 Jan 2023 17:10:13 
-0500 X-MC-Unique: 0r7A6VeEO4-Ol9x1quo3Gw-1 Received: by mail-qt1-f200.google.com with SMTP id f23-20020ac84717000000b003b645f1491aso655072qtp.6 for ; Tue, 17 Jan 2023 14:10:13 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ia6RdXcdii0hyfKSWhplcLO26KVeNp4Sk3B79jomOc4=; b=7ODNS/8JQvh246b4zVSBEn5SDaili9/P8chXiFi19aBKsnO0tovKMbpLuDZAmDi9eg y8hXgehbRtwsLMfV5eNIPeEsb6T24BwYTaVsjnyod4MgSlc+TP0j9dQTX35e63tUPdDe WTeVIc6V+hge7uWg9q0xewWpal2J86xnFM604gxDoJMkm6CLcYp9T4YmnIc1V6QZ7bKr fS8VMtR8gqDmkl2AflPkld3W97G0GeFRKDC2W89kWDYV/uDPjy1NNbQomw2+w9ARibnP TfJpYI7iCNh0XNQVbboQy0uYQrVRAbZoJ5ulUigp8jOzLo4Sb7fpMRjexWlxD5MYI7ps KPhw== X-Gm-Message-State: AFqh2koif+iBJkSf/rtJ37G+/t5SXK7zLhNck3bQE2ve9Djg0XkMm/yI 1/dkjUal4sWpq+Vp7BnaLBbWU4sYTw4Iq6ZcaAMFJp6cHB1rqiuZHbDPwqRSFvfhf9li+t6YW5c ww5Lu8u909BU+XOuPstvDaiYhfK45aX/brUigXFxVyjjbtDheXMue3eVUoOZsU6ao X-Received: by 2002:a05:6214:328c:b0:533:6733:2bd5 with SMTP id mu12-20020a056214328c00b0053367332bd5mr6696089qvb.52.1673993411962; Tue, 17 Jan 2023 14:10:11 -0800 (PST) X-Google-Smtp-Source: AMrXdXsgfEsm17g5nCU8Knu9+te1+qx0IYuLERmZwMz2tXKsPimPTx/Y7HNrottduPHI9P3QYA6DsQ== X-Received: by 2002:a05:6214:328c:b0:533:6733:2bd5 with SMTP id mu12-20020a056214328c00b0053367332bd5mr6696064qvb.52.1673993411684; Tue, 17 Jan 2023 14:10:11 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-56-70-30-145-63.dsl.bell.ca. [70.30.145.63]) by smtp.gmail.com with ESMTPSA id bm16-20020a05620a199000b006e16dcf99c8sm21142978qkb.71.2023.01.17.14.10.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 17 Jan 2023 14:10:11 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Leonardo Bras Soares Passos , James Houghton , Juan Quintela , peterx@redhat.com, "Dr . 
David Alan Gilbert" Subject: [PATCH RFC 21/21] migration: Collapse huge pages again after postcopy finished Date: Tue, 17 Jan 2023 17:09:14 -0500 Message-Id: <20230117220914.2062125-22-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20230117220914.2062125-1-peterx@redhat.com> References: <20230117220914.2062125-1-peterx@redhat.com> MIME-Version: 1.0 Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org When hugetlb-doublemap is enabled, pages are migrated at small page size during postcopy. When the migration finishes, the page tables need to be rebuilt explicitly so that these ranges are mapped with huge pages again. 
Signed-off-by: Peter Xu --- migration/ram.c | 31 +++++++++++++++++++++++++++++++ migration/trace-events | 1 + 2 files changed, 32 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 4da56d925c..178739f8c3 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3986,6 +3986,31 @@ static int ram_load_setup(QEMUFile *f, void *opaque) return 0; } +#define MADV_COLLAPSE_CHUNK_SIZE (1UL << 30) /* 1G */ + +static void ramblock_rebuild_huge_mappings(RAMBlock *rb) +{ + unsigned long addr, size; + + assert(qemu_ram_is_hugetlb(rb)); + + addr = (unsigned long)qemu_ram_get_host_addr(rb); + size = rb->mmap_length; + + while (size) { + unsigned long chunk = MIN(size, MADV_COLLAPSE_CHUNK_SIZE); + + if (qemu_madvise((void *)addr, chunk, QEMU_MADV_COLLAPSE)) { + error_report("%s: madvise(MADV_COLLAPSE) failed " + "for ramblock '%s'", __func__, rb->idstr); + } else { + trace_ramblock_rebuild_huge_mappings(rb->idstr, addr, chunk); + } + addr += chunk; + size -= chunk; + } +} + static int ram_load_cleanup(void *opaque) { RAMBlock *rb; @@ -4001,6 +4026,12 @@ static int ram_load_cleanup(void *opaque) g_free(rb->receivedmap); rb->receivedmap = NULL; if (rb->host_mirror) { + /* + * If host_mirror set, it means this is an hugetlb ramblock, + * and we've enabled double mappings for it. Rebuild the huge + * page tables here. 
+ */ + ramblock_rebuild_huge_mappings(rb); munmap(rb->host_mirror, rb->mmap_length); rb->host_mirror = NULL; } diff --git a/migration/trace-events b/migration/trace-events index 7baf235d22..6b52bb691c 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -119,6 +119,7 @@ postcopy_preempt_hit(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64 postcopy_preempt_send_host_page(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64 postcopy_preempt_switch_channel(int channel) "%d" postcopy_preempt_reset_channel(void) "" +ramblock_rebuild_huge_mappings(char *str, unsigned long start, unsigned long size) "ramblock %s start 0x%lx size 0x%lx" # multifd.c multifd_new_send_channel_async(uint8_t id) "channel %u"
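The chunked walk in ramblock_rebuild_huge_mappings() can be illustrated in isolation. The following is a rough sketch under stated assumptions, not the QEMU implementation: `for_each_chunk()`, `chunk_fn`, `struct chunk_stats`, `record_chunk()` and `COLLAPSE_CHUNK_SIZE` are hypothetical names, and the callback stands in for the qemu_madvise(..., QEMU_MADV_COLLAPSE) call so that the loop can be exercised without a hugetlb mapping. As in the patch, a failing chunk is noted but does not stop the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Same 1 GiB chunk size as MADV_COLLAPSE_CHUNK_SIZE in the patch. */
#define COLLAPSE_CHUNK_SIZE (1UL << 30)

/* Hypothetical per-chunk operation; returns 0 on success. */
typedef int (*chunk_fn)(unsigned long addr, unsigned long len, void *opaque);

/*
 * Walk [addr, addr + size) in pieces of at most chunk_size bytes and
 * call fn on each piece. Returns the number of pieces for which fn
 * reported a failure; the walk always covers the whole range.
 */
static unsigned long for_each_chunk(unsigned long addr, unsigned long size,
                                    unsigned long chunk_size,
                                    chunk_fn fn, void *opaque)
{
    unsigned long failed = 0;

    assert(chunk_size > 0);
    assert(fn != NULL);

    while (size) {
        unsigned long chunk = size < chunk_size ? size : chunk_size;

        if (fn(addr, chunk, opaque)) {
            failed++;
        }
        addr += chunk;
        size -= chunk;
    }
    return failed;
}

/* Bookkeeping used by the illustrative callback below. */
struct chunk_stats {
    unsigned long calls;    /* number of chunks visited */
    unsigned long total;    /* sum of all chunk lengths */
    unsigned long last_len; /* length of the final chunk */
    int fail;               /* when non-zero, every chunk "fails" */
};

/* Illustrative callback: records the chunk instead of calling madvise(). */
static int record_chunk(unsigned long addr, unsigned long len, void *opaque)
{
    struct chunk_stats *st = opaque;

    (void)addr;
    st->calls++;
    st->total += len;
    st->last_len = len;
    return st->fail ? -1 : 0;
}
```

For a 2.5 GiB range this visits two full 1 GiB chunks followed by one 512 MiB tail, which is why the loop clamps the final chunk with a minimum rather than assuming the length is a multiple of the chunk size.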