From patchwork Mon Oct 19 22:57:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1384526 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=WrUp7Uqj; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CFXHD3HHyz9sSs for ; Tue, 20 Oct 2020 09:58:30 +1100 (AEDT) Received: from localhost ([::1]:37326 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kUe6k-00070f-FV for incoming@patchwork.ozlabs.org; Mon, 19 Oct 2020 18:58:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:37464) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kUe5q-00070P-VJ for qemu-devel@nongnu.org; Mon, 19 Oct 2020 18:57:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:54770) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1kUe5o-0005cJ-IK for qemu-devel@nongnu.org; Mon, 19 Oct 2020 18:57:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1603148247; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n4IR530G0jn3KaGrAMU+kHWU9guPG9Q80wHEffp3OJw=; b=WrUp7UqjNSJnhPiGza7P0Iv87t/y2LN4l/f22wdWkkvs6jwaKxvTfYa7O2HYDDYYGmH5Eu fKT+CEuaAH2mumYeYIIJhb0WoIhBC0XQSLiHDGv+mTeDXd7f1KeUky6J+Yn0WG+1sdsz8U S0U1nNl3Y7ZwXsYDBF468Wupo9iR4+s= Received: from mail-il1-f197.google.com (mail-il1-f197.google.com [209.85.166.197]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-487-_tvVK4qIMt2CMa5Yv7a6hQ-1; Mon, 19 Oct 2020 18:57:25 -0400 X-MC-Unique: _tvVK4qIMt2CMa5Yv7a6hQ-1 Received: by mail-il1-f197.google.com with SMTP id p17so1456595ilj.0 for ; Mon, 19 Oct 2020 15:57:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=n4IR530G0jn3KaGrAMU+kHWU9guPG9Q80wHEffp3OJw=; b=iW2EUcUN5F4UofdWyPy1rCt4jcfCvsdrj8BMwbfLaZcTA6iiupgwWek37qEwcYPm0J GXz3uFfr2CAn8D2Kvu39iuSI44wF3s9OvQJBTXk9vI/Nr5Y9c4/DkqxFW4IC321J2AtQ RmTw5v2jFjSJJLTuzt5ksdkoiTdJVeqTOjY71psxt57JMXX3UP3Qk6XioymrzfrVUDyo p1LS5XuFN0/oerVDtv/bPVKQN674FAmtXKC8RvkH7ktx3JbsfLVXHDFjToOjCEEbTJZ1 PCkoaN/ivFD2eXbFSpy9WyQHmYmjhCMucNj2G4CR0KYEA/xYJqKqb0CqsGrqwt2Cho7F FaXw== X-Gm-Message-State: AOAM533fXHfjQqgGyLiVInWF7qh3AI/bpsjO/S0dL4w6/ZWPuRt5FWcU rey/LP2zW6ZozRjmpAvrFUctQN0m2LIW82CDiKwjpXAFzzqWV4k0DIQYctPxVMzYfSB8aJF8HsJ zmInw3dQ3DQ5rayE= X-Received: by 2002:a5e:d719:: with SMTP id v25mr54997iom.32.1603148244718; Mon, 19 Oct 2020 15:57:24 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxN63MEt+Wcot7AwVcM3Pym62fevDxqfrAUC6UTLm3lhCFpqeNaRvMAGxgrorpA04e/7B+9iQ== X-Received: by 2002:a5e:d719:: with SMTP id v25mr54983iom.32.1603148244536; Mon, 19 Oct 2020 15:57:24 -0700 (PDT) Received: from xz-x1.redhat.com (toroon474qw-lp140-04-174-95-215-133.dsl.bell.ca. 
[174.95.215.133]) by smtp.gmail.com with ESMTPSA id z89sm6017ilk.4.2020.10.19.15.57.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Oct 2020 15:57:23 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH v5 1/6] migration: Pass incoming state into qemu_ufd_copy_ioctl() Date: Mon, 19 Oct 2020 18:57:15 -0400 Message-Id: <20201019225720.172743-2-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201019225720.172743-1-peterx@redhat.com> References: <20201019225720.172743-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/19 15:28:27 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Thomas Huth , "Dr . David Alan Gilbert" , peterx@redhat.com, Juan Quintela Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" It'll be used in follow up patches to access more fields out of it. Meanwhile fetch the userfaultfd inside the function. Reviewed-by: Dr. 
David Alan Gilbert Signed-off-by: Peter Xu --- migration/postcopy-ram.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 0a2f88a87d..722034dc01 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1128,10 +1128,12 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis) return 0; } -static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr, +static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, void *from_addr, uint64_t pagesize, RAMBlock *rb) { + int userfault_fd = mis->userfault_fd; int ret; + if (from_addr) { struct uffdio_copy copy_struct; copy_struct.dst = (uint64_t)(uintptr_t)host_addr; @@ -1185,7 +1187,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from, * which would be slightly cheaper, but we'd have to be careful * of the order of updating our page state. */ - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, from, pagesize, rb)) { + if (qemu_ufd_copy_ioctl(mis, host, from, pagesize, rb)) { int e = errno; error_report("%s: %s copy host: %p from: %p (size: %zd)", __func__, strerror(e), host, from, pagesize); @@ -1212,7 +1214,7 @@ int postcopy_place_page_zero(MigrationIncomingState *mis, void *host, * but it's not available for everything (e.g. 
hugetlbpages) */ if (qemu_ram_is_uf_zeroable(rb)) { - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, NULL, pagesize, rb)) { + if (qemu_ufd_copy_ioctl(mis, host, NULL, pagesize, rb)) { int e = errno; error_report("%s: %s zero host: %p", __func__, strerror(e), host);

From patchwork Mon Oct 19 22:57:16 2020
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1384531
From: Peter Xu
To: qemu-devel@nongnu.org
Subject: [PATCH v5 2/6] migration: Introduce migrate_send_rp_message_req_pages()
Date: Mon, 19 Oct 2020 18:57:16 -0400
Message-Id: <20201019225720.172743-3-peterx@redhat.com>
In-Reply-To: <20201019225720.172743-1-peterx@redhat.com>
References: <20201019225720.172743-1-peterx@redhat.com>
Cc: Peter Maydell, Thomas Huth, "Dr. David Alan Gilbert", peterx@redhat.com, Juan Quintela

This adds another wrapper layer for sending a page request to the source VM.
The new migrate_send_rp_message_req_pages() will be used elsewhere in coming patches. Reviewed-by: Dr. David Alan Gilbert Signed-off-by: Peter Xu --- migration/migration.c | 10 ++++++++-- migration/migration.h | 2 ++ 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index aca7fdcd0b..b2dac6b39c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -316,8 +316,8 @@ error: * Start: Address offset within the RB * Len: Length in bytes required - must be a multiple of pagesize */ -int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, - ram_addr_t start) +int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, + RAMBlock *rb, ram_addr_t start) { uint8_t bufc[12 + 1 + 255]; /* start (8), len (4), rbname up to 256 */ size_t msglen = 12; /* start + len */ @@ -353,6 +353,12 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, return migrate_send_rp_message(mis, msg_type, msglen, bufc); } +int migrate_send_rp_req_pages(MigrationIncomingState *mis, + RAMBlock *rb, ram_addr_t start) +{ + return migrate_send_rp_message_req_pages(mis, rb, start); +} + static bool migration_colo_enabled; bool migration_incoming_colo_enabled(void) { diff --git a/migration/migration.h b/migration/migration.h index deb411aaad..e853ccf8b1 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -333,6 +333,8 @@ void migrate_send_rp_pong(MigrationIncomingState *mis, uint32_t value); int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, ram_addr_t start); +int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, + RAMBlock *rb, ram_addr_t start); void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis, char *block_name); void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value); From patchwork Mon Oct 19 22:57:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1384528
From: Peter Xu
To: qemu-devel@nongnu.org
Subject: [PATCH v5 3/6] migration: Maintain postcopy faulted addresses
Date: Mon, 19 Oct 2020 18:57:17 -0400
Message-Id: <20201019225720.172743-4-peterx@redhat.com>
In-Reply-To: <20201019225720.172743-1-peterx@redhat.com>
References: <20201019225720.172743-1-peterx@redhat.com>
Cc: Peter Maydell, Thomas Huth, "Dr. David Alan Gilbert", peterx@redhat.com, Juan Quintela

Maintain a list of the faulted addresses on the destination host that we are waiting on.
This is implemented using a GTree rather than a real list so that, even when plenty of vCPUs/threads are faulting, the lookup stays fast at O(log(N)) (we do that lookup after placing each page). It should bring a slight overhead, but that should not be a big problem because in most cases the list of requested pages will be short.

We already do something similar for postcopy blocktime measurements. This patch does not reuse that because:

(1) Blocktime measurement covers vCPU threads only, but here we need to record all faulted addresses, including those from the main thread and external threads (e.g., DPDK via vhost-user).

(2) Blocktime measurement requires UFFD_FEATURE_THREAD_ID, but here we don't want to add that extra dependency on the kernel version since it is not necessary. E.g., we don't need to know which thread faulted on which page, and we don't care about multiple threads faulting on the same page. We only care about which addresses have faulted and are therefore waiting for a page to be copied from the source.

(3) Blocktime measurement is not enabled by default, whereas we need this by default, especially for postcopy recovery.

Another thing to mention is that this patch introduces a new mutex to serialize accesses to the receivedmap and the page_requested tree; however, that serialization does not cover other procedures like UFFDIO_COPY.
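For illustration only, the bookkeeping described above can be sketched as follows. This is not the code from this patch: it swaps the GTree for a small sorted array with binary search so that it builds without GLib, it omits the page_request_mutex locking, and the names PageReqSet, REQ_MAX, addr_cmp(), req_find(), req_add() and req_del() are hypothetical. The compare helper uses the same overflow-safe three-way idiom as page_request_addr_cmp() in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REQ_MAX 64 /* hypothetical capacity, for this sketch only */

/* Hypothetical stand-in for the GTree: pending page addresses, kept sorted. */
typedef struct {
    uintptr_t addr[REQ_MAX];
    size_t count;
} PageReqSet;

/* Three-way compare that cannot overflow, unlike "a - b". */
static int addr_cmp(uintptr_t a, uintptr_t b)
{
    return (a > b) - (a < b);
}

/* Binary search: index of addr if present, else its insertion point. */
static size_t req_find(const PageReqSet *s, uintptr_t addr, bool *found)
{
    size_t lo = 0, hi = s->count;

    *found = false;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = addr_cmp(s->addr[mid], addr);

        if (c == 0) {
            *found = true;
            return mid;
        }
        if (c < 0) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo;
}

/* Fault path: record the page unless it is already pending. */
static bool req_add(PageReqSet *s, uintptr_t addr)
{
    bool found;
    size_t i = req_find(s, addr, &found);

    if (found) {
        return false; /* already requested, nothing to queue */
    }
    assert(s->count < REQ_MAX);
    memmove(&s->addr[i + 1], &s->addr[i],
            (s->count - i) * sizeof(s->addr[0]));
    s->addr[i] = addr;
    s->count++;
    return true;
}

/* Page-placed path: drop the page if it was pending. */
static bool req_del(PageReqSet *s, uintptr_t addr)
{
    bool found;
    size_t i = req_find(s, addr, &found);

    if (!found) {
        return false; /* page arrived without a recorded fault */
    }
    memmove(&s->addr[i], &s->addr[i + 1],
            (s->count - i - 1) * sizeof(s->addr[0]));
    s->count--;
    return true;
}
```

In the sketch, req_add() corresponds to the fault path (queue the address only if it is not already pending) and req_del() to the path taken after a page has been placed. A real implementation would hold a lock around both, as the patch does with page_request_mutex.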
Signed-off-by: Peter Xu --- migration/migration.c | 41 +++++++++++++++++++++++++++++++++++++++- migration/migration.h | 19 ++++++++++++++++++- migration/postcopy-ram.c | 17 ++++++++++++++--- migration/trace-events | 2 ++ 4 files changed, 74 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index b2dac6b39c..0b4fcff01f 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -143,6 +143,13 @@ static int migration_maybe_pause(MigrationState *s, int new_state); static void migrate_fd_cancel(MigrationState *s); +static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp) +{ + uintptr_t a = (uintptr_t) ap, b = (uintptr_t) bp; + + return (a > b) - (a < b); +} + void migration_object_init(void) { MachineState *ms = MACHINE(qdev_get_machine()); @@ -165,6 +172,8 @@ void migration_object_init(void) qemu_event_init(&current_incoming->main_thread_load_event, false); qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0); qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0); + qemu_mutex_init(&current_incoming->page_request_mutex); + current_incoming->page_requested = g_tree_new(page_request_addr_cmp); if (!migration_object_check(current_migration, &err)) { error_report_err(err); @@ -240,6 +249,11 @@ void migration_incoming_state_destroy(void) qemu_event_reset(&mis->main_thread_load_event); + if (mis->page_requested) { + g_tree_destroy(mis->page_requested); + mis->page_requested = NULL; + } + if (mis->socket_address_list) { qapi_free_SocketAddressList(mis->socket_address_list); mis->socket_address_list = NULL; @@ -354,8 +368,33 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, } int migrate_send_rp_req_pages(MigrationIncomingState *mis, - RAMBlock *rb, ram_addr_t start) + RAMBlock *rb, ram_addr_t start, uint64_t haddr) { + void *aligned = (void *)(uintptr_t)(haddr & qemu_real_host_page_mask); + bool received; + + WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) { + received = 
ramblock_recv_bitmap_test_byte_offset(rb, start); + if (!received && !g_tree_lookup(mis->page_requested, aligned)) { + /* + * The page has not been received, and it's not yet in the page + * request list. Queue it. Set the value of element to 1, so that + * things like g_tree_lookup() will return TRUE (1) when found. + */ + g_tree_insert(mis->page_requested, aligned, (gpointer)1); + mis->page_requested_count++; + trace_postcopy_page_req_add(aligned, mis->page_requested_count); + } + } + + /* + * If the page is there, skip sending the message. We don't even need the + * lock because as long as the page arrived, it'll be there forever. + */ + if (received) { + return 0; + } + return migrate_send_rp_message_req_pages(mis, rb, start); } diff --git a/migration/migration.h b/migration/migration.h index e853ccf8b1..8d2d1ce839 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -104,6 +104,23 @@ struct MigrationIncomingState { /* List of listening socket addresses */ SocketAddressList *socket_address_list; + + /* A tree of pages that we requested from the source VM */ + GTree *page_requested; + /* For debugging purposes only, but would be nice to keep */ + int page_requested_count; + /* + * The mutex helps to maintain the requested pages that we sent to the + * source, IOW, to guarantee coherence between the page_requested tree and + * the per-ramblock receivedmap. Note! This does not guarantee consistency + * of the real page copy procedures (using UFFDIO_[ZERO]COPY). E.g., even + * if one bit in receivedmap is cleared, UFFDIO_COPY could have happened + * for that page already. This is intended so that the mutex is not + * serialized with, and blocked by, slow operations like UFFDIO_* ioctls. + * However, this should be enough to make sure the page_requested tree + * always contains valid information. 
+ */ + QemuMutex page_request_mutex; }; MigrationIncomingState *migration_incoming_get_current(void); @@ -332,7 +349,7 @@ void migrate_send_rp_shut(MigrationIncomingState *mis, void migrate_send_rp_pong(MigrationIncomingState *mis, uint32_t value); int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, - ram_addr_t start); + ram_addr_t start, uint64_t haddr); int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, RAMBlock *rb, ram_addr_t start); void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis, diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 722034dc01..ca1daf0024 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -684,7 +684,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb, qemu_ram_get_idstr(rb), rb_offset); return postcopy_wake_shared(pcfd, client_addr, rb); } - migrate_send_rp_req_pages(mis, rb, aligned_rbo); + migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr); return 0; } @@ -979,7 +979,8 @@ retry: * Send the request to the source - we want to request one * of our host page sizes (which is >= TPS) */ - ret = migrate_send_rp_req_pages(mis, rb, rb_offset); + ret = migrate_send_rp_req_pages(mis, rb, rb_offset, + msg.arg.pagefault.address); if (ret) { /* May be network failure, try to wait for recovery */ if (ret == -EIO && postcopy_pause_fault_thread(mis)) { @@ -1149,10 +1150,20 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); } if (!ret) { + qemu_mutex_lock(&mis->page_request_mutex); ramblock_recv_bitmap_set_range(rb, host_addr, pagesize / qemu_target_page_size()); + /* + * If this page resolves a page fault for a previously recorded faulted + * address, take a special note to maintain the requested page list. 
+ */ + if (g_tree_lookup(mis->page_requested, host_addr)) { + g_tree_remove(mis->page_requested, host_addr); + mis->page_requested_count--; + trace_postcopy_page_req_del(host_addr, mis->page_requested_count); + } + qemu_mutex_unlock(&mis->page_request_mutex); mark_postcopy_blocktime_end((uintptr_t)host_addr); - } return ret; } diff --git a/migration/trace-events b/migration/trace-events index 338f38b3dd..e4d5eb94ca 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -162,6 +162,7 @@ postcopy_pause_return_path(void) "" postcopy_pause_return_path_continued(void) "" postcopy_pause_continued(void) "" postcopy_start_set_run(void) "" +postcopy_page_req_add(void *addr, int count) "new page req %p total %d" source_return_path_thread_bad_end(void) "" source_return_path_thread_end(void) "" source_return_path_thread_entry(void) "" @@ -272,6 +273,7 @@ postcopy_ram_incoming_cleanup_blocktime(uint64_t total) "total blocktime %" PRIu postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_offset) "for %s in %s offset 0x%"PRIx64 postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s" +postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d" get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u"

From patchwork Mon Oct 19 22:57:18 2020
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1384532
[174.95.215.133]) by smtp.gmail.com with ESMTPSA id z89sm6017ilk.4.2020.10.19.15.57.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Oct 2020 15:57:28 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH v5 4/6] migration: Sync requested pages after postcopy recovery Date: Mon, 19 Oct 2020 18:57:18 -0400 Message-Id: <20201019225720.172743-5-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201019225720.172743-1-peterx@redhat.com> References: <20201019225720.172743-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/19 15:28:27 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Thomas Huth , Juan Quintela , Xiaohui Li , "Dr . David Alan Gilbert" , peterx@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We synchronize the requested pages right after a postcopy recovery happens. This helps to synchronize the prioritized pages on source so that the faulted threads can be served faster. Reported-by: Xiaohui Li Signed-off-by: Peter Xu Reviewed-by: Dr. 
David Alan Gilbert --- migration/savevm.c | 57 ++++++++++++++++++++++++++++++++++++++++++ migration/trace-events | 1 + 2 files changed, 58 insertions(+) diff --git a/migration/savevm.c b/migration/savevm.c index d2e141f7b1..33acbba1a4 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2011,6 +2011,49 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis) return LOADVM_QUIT; } +/* We must be with page_request_mutex held */ +static gboolean postcopy_sync_page_req(gpointer key, gpointer value, + gpointer data) +{ + MigrationIncomingState *mis = data; + void *host_addr = (void *) key; + ram_addr_t rb_offset; + RAMBlock *rb; + int ret; + + rb = qemu_ram_block_from_host(host_addr, true, &rb_offset); + if (!rb) { + /* + * This should _never_ happen. However be nice for a migrating VM to + * not crash/assert. Post an error (note: intended to not use *_once + * because we do want to see all the illegal addresses; and this can + * never be triggered by the guest so we're safe) and move on next. + */ + error_report("%s: illegal host addr %p", __func__, host_addr); + /* Try the next entry */ + return FALSE; + } + + ret = migrate_send_rp_message_req_pages(mis, rb, rb_offset); + if (ret) { + /* Please refer to above comment. 
*/ + error_report("%s: send rp message failed for addr %p", + __func__, host_addr); + return FALSE; + } + + trace_postcopy_page_req_sync(host_addr); + + return FALSE; +} + +static void migrate_send_rp_req_pages_pending(MigrationIncomingState *mis) +{ + WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) { + g_tree_foreach(mis->page_requested, postcopy_sync_page_req, mis); + } +} + static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) { if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) { @@ -2033,6 +2076,20 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) /* Tell source that "we are ready" */ migrate_send_rp_resume_ack(mis, MIGRATION_RESUME_ACK_VALUE); + /* + * After a postcopy recovery, the source should have lost the postcopy + * queue, or potentially the requested pages could have been lost during + * the network down phase. Let's re-sync with the source VM by re-sending + * all the pending pages that we eagerly need, so these threads won't get + * blocked too long due to the recovery. + * + * Without this procedure, the faulted destination VM threads (waiting for + * page requests right before the postcopy is interrupted) can keep hanging + * until the pages are sent by the source during the background copying of + * pages, or another thread faulted on the same address accidentally. 
+ */ + migrate_send_rp_req_pages_pending(mis); + return 0; } diff --git a/migration/trace-events b/migration/trace-events index e4d5eb94ca..0fbfd2da60 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -49,6 +49,7 @@ vmstate_save(const char *idstr, const char *vmsd_name) "%s, %s" vmstate_load(const char *idstr, const char *vmsd_name) "%s, %s" postcopy_pause_incoming(void) "" postcopy_pause_incoming_continued(void) "" +postcopy_page_req_sync(void *host_addr) "sync page req %p" # vmstate.c vmstate_load_field_error(const char *field, int ret) "field \"%s\" load failed, ret = %d" From patchwork Mon Oct 19 22:57:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 1384529 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Afe/1H3+; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CFXHT1KbGz9sSC for ; Tue, 20 Oct 2020 09:58:45 +1100 (AEDT) Received: from localhost ([::1]:37864 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kUe71-0007FL-1H for incoming@patchwork.ozlabs.org; Mon, 19 Oct 2020 18:58:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:37512) by lists.gnu.org with 
esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kUe5w-00079p-MR for qemu-devel@nongnu.org; Mon, 19 Oct 2020 18:57:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:53363) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1kUe5u-0005cx-Us for qemu-devel@nongnu.org; Mon, 19 Oct 2020 18:57:36 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1603148254; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D5HFQfdM4TJN4VGII06GowLAi2up3lb9QwDK/dU70iw=; b=Afe/1H3+KGSmP8Fhbyrc2Hn9xU+osGPkewu+uT7jybk6tS3NUjkMlyuzwSGie3hGG4IJaU 3N9jknRfVf29YiRU27tbX1QOTUQCQ4XllOFPTNFo3X4SZy04KfXodD+/ZqAix5dMp95oZM 8AkA+Bk9ia7DGYaKqarER62Spap+dBc= Received: from mail-io1-f71.google.com (mail-io1-f71.google.com [209.85.166.71]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-595-_0hkHQUNMhWfvAPFDrrk8A-1; Mon, 19 Oct 2020 18:57:32 -0400 X-MC-Unique: _0hkHQUNMhWfvAPFDrrk8A-1 Received: by mail-io1-f71.google.com with SMTP id x13so1339695iom.10 for ; Mon, 19 Oct 2020 15:57:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=D5HFQfdM4TJN4VGII06GowLAi2up3lb9QwDK/dU70iw=; b=Ep+qfYhMtuIBt4zrgD44D5fRDkhx3LLzdIFrroX7h5+MjbWER3tfLhtg6mQANn6U7/ B/LEvNij3Bw9uESdHATD7I/noh5uUxmJhndEDyghpAJPA7sZaWlY9DNY8nb4zpgbGrxF XI945oZ8UFpXplUPe3yDiV7kt0SbgMksj9f6qOS6c5omUnvhSd/Qzb86mhFAZKPYSlN6 lAeRPhx6ijD6Xcl5bXBfp8wRCGYwvefAQl8YtICCxgFGsWMM2t+dsDlK0GUSs1Os9DX+ A2xtnkUHIynfuf6M4cbtm/Zmrs48CwlrzfQU8BSUBPZcLF/yBfTR/Msms4vXYNNd1fkU KxUw== X-Gm-Message-State: 
AOAM532SgqTskQH5dve1LQZIhrrnhH7XgsgN9EqGTOP2OXH9DDHnn2GE wX9NQCOMi+vvGOPir5XcBHj96zQLw9vmQZB0Rt3mYQi59pLAfq+6lVYWRTDtZOlLgG1wCpa/95R ZykUMPFl/wq4B7Yo= X-Received: by 2002:a92:d709:: with SMTP id m9mr2053137iln.226.1603148251634; Mon, 19 Oct 2020 15:57:31 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzrHKKdgocksKmW88BekSWF8FxFlKMkfhuGqm4kBj2vERLcVumo9AMK0xXE8x9yBFi7IoPJ8A== X-Received: by 2002:a92:d709:: with SMTP id m9mr2053121iln.226.1603148251321; Mon, 19 Oct 2020 15:57:31 -0700 (PDT) Received: from xz-x1.redhat.com (toroon474qw-lp140-04-174-95-215-133.dsl.bell.ca. [174.95.215.133]) by smtp.gmail.com with ESMTPSA id z89sm6017ilk.4.2020.10.19.15.57.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Oct 2020 15:57:30 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH v5 5/6] migration/postcopy: Release fd before going into 'postcopy-pause' Date: Mon, 19 Oct 2020 18:57:19 -0400 Message-Id: <20201019225720.172743-6-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20201019225720.172743-1-peterx@redhat.com> References: <20201019225720.172743-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/10/19 15:28:27 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 
Cc: Peter Maydell, Thomas Huth, "Dr . David Alan Gilbert", peterx@redhat.com, Juan Quintela

Logically, the race below could trigger with the old code:

  test program                        migration thread
  ------------                        ----------------
  wait_until('postcopy-pause')
                                      postcopy_pause()
                                        set_state('postcopy-pause')
  do_postcopy_recover()
    arm s->to_dst_file with new fd
                                        release s->to_dst_file [1]

Here [1] could have released the just-installed recovering channel.
Then the migration could hang without really resuming.

Instead, it should be very safe to release the fd before setting the
state into 'postcopy-pause', because there's no reason for any other
thread to touch it during 'postcopy-active'.

Dave reported a very rare postcopy recovery hang in which the
migration-test program kept waiting for the migration to complete in
migrate_postcopy_complete(). We suspect it is the same problem that
this patch fixes, though it is hard to tell. Either way, since we have
noticed the race, fix it regardless of the hang report.

Cc: Dr. David Alan Gilbert
Cc: Juan Quintela
Signed-off-by: Peter Xu
Reviewed-by: Dr. David Alan Gilbert
---
 migration/migration.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 0b4fcff01f..50df6251b7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3182,9 +3182,6 @@ static MigThrError postcopy_pause(MigrationState *s)
     while (true) {
         QEMUFile *file;
 
-        migrate_set_state(&s->state, s->state,
-                          MIGRATION_STATUS_POSTCOPY_PAUSED);
-
         /* Current channel is possibly broken. Release it. */
         assert(s->to_dst_file);
         qemu_mutex_lock(&s->qemu_file_lock);
@@ -3195,6 +3192,9 @@ static MigThrError postcopy_pause(MigrationState *s)
         qemu_file_shutdown(file);
         qemu_fclose(file);
 
+        migrate_set_state(&s->state, s->state,
+                          MIGRATION_STATUS_POSTCOPY_PAUSED);
+
         error_report("Detected IO failure for postcopy. "
                      "Migration paused.");
 

From patchwork Mon Oct 19 22:57:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 1384530
From: Peter Xu
To: qemu-devel@nongnu.org
Subject: [PATCH v5 6/6] migration-test: Only hide error if !QTEST_LOG
Date: Mon, 19 Oct 2020 18:57:20 -0400
Message-Id: <20201019225720.172743-7-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20201019225720.172743-1-peterx@redhat.com>
References: <20201019225720.172743-1-peterx@redhat.com>
Cc: Peter Maydell, Thomas Huth, "Dr . David Alan Gilbert", peterx@redhat.com, Juan Quintela

The errors are very useful when debugging qtest failures, especially
when QTEST_LOG=1 is set. Let's allow overriding
MigrateStart.hide_stderr when QTEST_LOG=1 is specified, because that
means the user wants to be verbose.

It is not very nice to introduce the first QTEST_LOG env access in
migration-test.c, but it should be handy. Without this patch, I was
hacking error_report() when debugging such errors. Let's make things
easier.

Signed-off-by: Peter Xu
---
 tests/qtest/migration-test.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 00a233cd8c..ff9ed70029 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -461,6 +461,10 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
 }
 
 typedef struct {
+    /*
+     * QTEST_LOG=1 may override this. When QTEST_LOG=1, we always dump errors
+     * unconditionally, because it means the user would like to be verbose.
+     */
     bool hide_stderr;
     bool use_shmem;
     /* only launch the target process */
@@ -554,7 +558,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
 
     g_free(bootpath);
 
-    if (!getenv("QTEST_LOG") && args->hide_stderr) {
+    if (!getenv("QTEST_LOG") && args->hide_stderr) {
         ignore_stderr = "2>/dev/null";
     } else {
         ignore_stderr = "";
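
For readers who want to reason about the new condition in the last hunk of tests/qtest/migration-test.c on its own, here is a minimal standalone sketch. The helper name stderr_redirect() and its qtest_log parameter are hypothetical and not part of the patch; qtest_log stands for whatever getenv("QTEST_LOG") returned, with NULL meaning the variable is unset.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: mirrors the condition
 *     !getenv("QTEST_LOG") && args->hide_stderr
 * and returns the shell redirection to append to the QEMU command line.
 *
 * qtest_log is the result of getenv("QTEST_LOG"); NULL means "unset".
 */
const char *stderr_redirect(bool hide_stderr, const char *qtest_log)
{
    /* Hide stderr only if the test asked for it AND QTEST_LOG is unset. */
    if (!qtest_log && hide_stderr) {
        return "2>/dev/null";
    }
    /* Otherwise leave stderr alone so errors stay visible. */
    return "";
}
```

Note that the check only tests whether the variable exists: getenv() returns a non-NULL pointer for any set value, including an empty string, so under this sketch "QTEST_LOG=" keeps stderr visible just as "QTEST_LOG=1" does.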