From patchwork Fri Aug 3 09:13:39 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 953136
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="NFVI2MmY"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhJy3lV3z9s0R for ; Fri, 3 Aug 2018 19:17:38 +1000 (AEST)
Received: from localhost ([::1]:49752 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWDH-00045I-Ud for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:17:35 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52689) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flW9q-0001O0-S1 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:04 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9p-0001BI-0m for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:02 -0400
Received: from mail-pg1-x535.google.com ([2607:f8b0:4864:20::535]:42315) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9o-0001AO-PD for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:00 -0400
Received: by mail-pg1-x535.google.com with SMTP id y4-v6so2562664pgp.9 for ; Fri, 03 Aug 2018 02:14:00 -0700 (PDT)
DKIM-Signature: v=1;
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Gw1eRPs9Eiy6kcsN8PtO8YyrNNK7Uhup63R4pYFlYZQ=; b=NFVI2MmYB9a1nOH2vDfTlQBddcvjVKAiEi541VW8NhJ4YbORaCQIDOa0jYpkRio129 1eV0qR9iaU3uinthoBBqAEEgDD7lFdAD5YzgQB9usIAGvx4IYc76OkdoWfl7pqDtpah5 3CjwjBYwdpgOoO2BEPLLPOUVKntXYARrcTwXjszJX5wxjOvNaf2YQbe1rP6LoBt2zAOT iuNtVIckqFCM032hAqgowo8yP0XrZ/mb8rZcpONV0X4LlFvmvqeMlP4SvAd3UI42Vchq mSAUs3G54/5U7cHFGQhV9Q0g1RtJCOhQ1ZK05d2cr9BdVTtUCXYczpnXwW7Ma0hJKcWI hDPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Gw1eRPs9Eiy6kcsN8PtO8YyrNNK7Uhup63R4pYFlYZQ=; b=CFM/Eo8cJdAjqfFr03D2BWs5mBkn9VRpLnJ3ISSDxCHk34P1G6pH80HlHTCdcCN/Mn tF9dtwfSgtEecCC5s7NCtCNzsQ3iEf6K3rvJJYuMgDutQ+QMn/8sbnLA5L8Zvqoo3UTB cGPceUCTwnV7BteireZNjLA25jxEOei8jvilIGtWqcUiRb6xALOobu3fMfcmFJzjBABt A8i/I9XxnpBfKWyvJ0HIcfRibGTEiXROTQs45XNff+uIx6MGdXNnkDgT64yM26MjnF57 Qd+bKFj6whxqh42mwQv45WCUIL1OcD7wxzJYnzvHgPiDltfSdZmfh+3VnyuZC9kXS05D c/bA== X-Gm-Message-State: AOUpUlE2Enf3HA8s7BmVl5I4jf8NnC3lA6UCt1fdG4j1JTXBkYaCnxb0 ajpHjcRhPy1Ux3ZmjbW5ReM= X-Google-Smtp-Source: AAOMgpeemQCcYoa1HZsYwVR3E5Noa7oDZDqKC13StrBG4zfZP3zAiOZ8/E17ifRXhjGfYVQan5OGjQ== X-Received: by 2002:a65:6455:: with SMTP id s21-v6mr2814308pgv.394.1533287640022; Fri, 03 Aug 2018 02:14:00 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.13.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:13:59 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:39 +0800 Message-Id: <1533287630-4221-2-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> 
References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::535
Subject: [Qemu-devel] [PATCH v6 01/12] migration: disable RDMA WRITE after postcopy started
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

RDMA WRITE operations are performed with no notification to the
destination QEMU, so the destination QEMU cannot wake up. This patch
disables RDMA WRITE after postcopy has started.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/qemu-file.c |  8 ++++++--
 migration/rdma.c      | 12 ++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
     if (f->hooks && f->hooks->save_page) {
         int ret = f->hooks->save_page(f, f->opaque, block_offset,
                                       offset, size, bytes_sent);
-        f->bytes_xfer += size;
-        if (ret != RAM_SAVE_CONTROL_DELAYED) {
+        if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+            f->bytes_xfer += size;
+        }
+
+        if (ret != RAM_SAVE_CONTROL_DELAYED &&
+            ret != RAM_SAVE_CONTROL_NOT_SUPP) {
             if (bytes_sent && *bytes_sent > 0) {
                 qemu_update_position(f, *bytes_sent);
             } else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index 8bd7159..76424a5 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2921,6 +2921,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return RAM_SAVE_CONTROL_NOT_SUPP;
+    }
+
     qemu_fflush(f);
 
     if (size > 0) {
@@ -3480,6 +3484,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     trace_qemu_rdma_registration_start(flags);
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
@@ -3502,6 +3510,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     qemu_fflush(f);
     ret = qemu_rdma_drain_cq(f, rdma);
 

From patchwork Fri Aug 3 09:13:40 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 953139
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="VBOtDDy2"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhMm3Rsvz9s2g for ; Fri, 3 Aug 2018 19:20:04 +1000 (AEST)
Received: from localhost ([::1]:49763 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWFe-0005ro-3P for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:20:02 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52710) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flW9s-0001OF-HY for
qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9q-0001FV-Ox for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:04 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:34142) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9q-0001Do-H6 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:02 -0400 Received: by mail-pg1-x542.google.com with SMTP id y5-v6so2578927pgv.1 for ; Fri, 03 Aug 2018 02:14:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=xaKjbDdmaFRrENVSxuPNBBGbclo+QaleQR6T32e6rxo=; b=VBOtDDy2603WMAVdJMBViqBr2pwaAoOc31IBziSi1fGwXv/TBILBIvYmrqFV7FdGtl H9uuBejH14nO6ODHki3zITfDVB7LLj6eCaFdvZEXcbU0QSBe4N+WIiB0aUwJ/XIsc03N 3hx3pZsSTw99JxyLolr72lWYAhzW6HQ02jkbBG+fqkcAPLU1abkTeKColIsXNVy3THpu tkk95s7aIPaITl2MDsY6Msn6fCuXOpvlC6jk6668CqsKLbGJwFa4YGw2lkGo6GgkwX3E BC4zJuOSit3zgsAOSWEQOzDkSkevaOuJfcjT7H8sQt9CPyt637/7uwvGdwV7s94twSJz 0FZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=xaKjbDdmaFRrENVSxuPNBBGbclo+QaleQR6T32e6rxo=; b=lg+NHs5yL7jCEq2TtbBbYiDzlB6iZfcqv5XlRr94VVK02EwcEumc2pajDbsrsl379Y GIDgzsKiIv0G5DHD/G6L/dUsttqhN1iHkfNqth4gvrNiFsApiXqixWqYmIl83Cby6b6Y 7LswyA9uz0e3ClMZMVD3WhsirIHTvoZjpw5f057QffBItw+8xPjMVvkUe5OUB5KKAQgh 4ORqSnyYb6euGkNf0/YSdF2XKuEiBF8yhBazkfdFbfewQPJ0WplTPf702wo+U6CBKpgE 61dKA6woC53ep8fT3scyxEaJ10T59Dj5zRyAfxn6P5aizij9uVIFl+vllRTR54PzOgZU 15og== X-Gm-Message-State: AOUpUlEuh8rKvjbuLsYUN9soHF8oPGQ2kzA8tnmqMkdwvH3BmcSuUFJ9 RwEg8IEuHLz9/IigT20vZ5VQbBdp6NM= X-Google-Smtp-Source: AAOMgpeVQ2Ei1zlL+GpxyXZajheWzFU8IpQLHsfzOuLj9WbEMxNLAGvhCTIeFw+NRo6h2DhUL+ss4Q== X-Received: by 2002:a63:f54c:: with SMTP id e12-v6mr2872091pgk.286.1533287641741; Fri, 03 Aug 
2018 02:14:01 -0700 (PDT)
Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.00 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:01 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 3 Aug 2018 17:13:40 +0800
Message-Id: <1533287630-4221-3-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::542
Subject: [Qemu-devel] [PATCH v6 02/12] migration: create a dedicated connection for rdma return path
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

When an RDMA migration is started with postcopy enabled, the source
QEMU establishes a dedicated connection for the return path.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 76424a5..57af5ed 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
     uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
 
     GHashTable *blockmap;
+
+    /* the RDMAContext for return path */
+    struct RDMAContext *return_path;
+    bool is_return_path;
 } RDMAContext;
 
 #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2323,10 +2327,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma_destroy_id(rdma->cm_id);
         rdma->cm_id = NULL;
     }
+
+    /* on the destination side, listen_id and channel are shared */
     if (rdma->listen_id) {
-        rdma_destroy_id(rdma->listen_id);
+        if (!rdma->is_return_path) {
+            rdma_destroy_id(rdma->listen_id);
+        }
         rdma->listen_id = NULL;
+
+        if (rdma->channel) {
+            if (!rdma->is_return_path) {
+                rdma_destroy_event_channel(rdma->channel);
+            }
+            rdma->channel = NULL;
+        }
     }
+
     if (rdma->channel) {
         rdma_destroy_event_channel(rdma->channel);
         rdma->channel = NULL;
@@ -2555,6 +2571,25 @@ err_dest_init_create_listen_id:
 
 }
 
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+                                            RDMAContext *rdma)
+{
+    int idx;
+
+    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+        rdma_return_path->wr_data[idx].control_len = 0;
+        rdma_return_path->wr_data[idx].control_curr = NULL;
+    }
+
+    /* the CM channel and CM id are shared */
+    rdma_return_path->channel = rdma->channel;
+    rdma_return_path->listen_id = rdma->listen_id;
+
+    rdma->return_path = rdma_return_path;
+    rdma_return_path->return_path = rdma;
+    rdma_return_path->is_return_path = true;
+}
+
 static void *qemu_rdma_data_init(const char *host_port, Error **errp)
 {
     RDMAContext *rdma = NULL;
@@ -3012,6 +3047,8 @@ err:
     return ret;
 }
 
+static void rdma_accept_incoming_migration(void *opaque);
+
 static int qemu_rdma_accept(RDMAContext *rdma)
 {
     RDMACapabilities cap;
@@ -3106,7 +3143,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
         }
     }
 
-    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    /* Accept the second connection request for return path */
+    if (migrate_postcopy() && !rdma->is_return_path) {
+        qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+                            NULL,
+                            (void *)(intptr_t)rdma->return_path);
+    } else {
+        qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    }
 
     ret = rdma_accept(rdma->cm_id, &conn_param);
     if (ret) {
@@ -3691,6 +3735,10 @@ static void rdma_accept_incoming_migration(void *opaque)
 
     trace_qemu_rdma_accept_incoming_migration_accepted();
 
+    if (rdma->is_return_path) {
+        return;
+    }
+
     f = qemu_fopen_rdma(rdma, "rb");
     if (f == NULL) {
         ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3705,7 +3753,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
     int ret;
-    RDMAContext *rdma;
+    RDMAContext *rdma, *rdma_return_path = NULL;
     Error *local_err = NULL;
 
     trace_rdma_start_incoming_migration();
@@ -3732,12 +3780,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
 
     trace_rdma_start_incoming_migration_after_rdma_listen();
 
+    /* initialize the RDMAContext for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+    }
+
     qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
                         NULL, (void *)(intptr_t)rdma);
     return;
 err:
     error_propagate(errp, local_err);
     g_free(rdma);
+    g_free(rdma_return_path);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
@@ -3745,6 +3805,7 @@ void rdma_start_outgoing_migration(void *opaque,
 {
     MigrationState *s = opaque;
     RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+    RDMAContext *rdma_return_path = NULL;
     int ret = 0;
 
     if (rdma == NULL) {
@@ -3765,6 +3826,32 @@ void rdma_start_outgoing_migration(void *opaque,
         goto err;
     }
 
+    /* RDMA postcopy needs a separate queue pair for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, errp);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        ret = qemu_rdma_source_init(rdma_return_path,
+            s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        ret = qemu_rdma_connect(rdma_return_path, errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        rdma->return_path = rdma_return_path;
+        rdma_return_path->return_path = rdma;
+        rdma_return_path->is_return_path = true;
+    }
+
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
     s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
@@ -3772,4 +3859,5 @@ void rdma_start_outgoing_migration(void *opaque,
     return;
 err:
     g_free(rdma);
+    g_free(rdma_return_path);
 }

From patchwork Fri Aug 3 09:13:41 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 953141
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="G1kEGg4U"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhQ704S1z9s2g for ; Fri, 3 Aug 2018 19:22:05 +1000 (AEST)
Received: from localhost ([::1]:49779 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim
4.71) (envelope-from ) id 1flWHa-0007rF-Nw for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:22:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52726) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flW9t-0001Op-L9 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9s-0001Jg-HN for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:05 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:41479) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9s-0001Ih-AR for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:04 -0400 Received: by mail-pg1-x544.google.com with SMTP id z8-v6so2562220pgu.8 for ; Fri, 03 Aug 2018 02:14:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RWIMgk03BRlmqbTHDnYPfsRwdbqRWy3x1CB6C7/BT8U=; b=G1kEGg4U8pxR/FODrvmwCGKxtALOW8R6RAjXi0IjCrr2Kn8k30tQDh1TBbOLEkCjyN PItYE0I+/9eisNtf7Qlthm6RZhJhu2SZehqwlMXepJ1QucLlzbmyoPt0iV9eUZgbZLPI B1aEYQL6DUHRYhhG5wfL8/J+fJdF6utMhtW2awgDbg0zjg3WMEOJZEeDaK/hmADbrwjO fAP4ueNtYTsu7hL6/4Adzqlth33HOf4Ek5cWXadnyfVC/+dTGJGjYia+PMUy1KhT4vou c8U0TEbwzDaMbzNRASnDRDGxbZOL3noziAwOGCFEghpUFhTvzZVM8Vjc+rU6lpfu+Lfe cZiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RWIMgk03BRlmqbTHDnYPfsRwdbqRWy3x1CB6C7/BT8U=; b=dIJ0/IdIHmJ0rFY7Z3l9QT7InH/svLtgxfxLh1AtqgWfB+gIpTqAaUnU+MTx65skV2 9oWn2BckVWrT5Ge9/Txr5hUgb1WL1OIU+8NHnDIAyq2qCrnYm76pEMCI/5k7Uy6qJAm7 zVKDBKOs1z8Sj5NVwH6zfn5c+nnyPibMSrPkUHdng2UqI2hWKvU5fyJ9mHTQBbFQg65T p9vTom7dNfgd5mHu6gxZosXi9r2vM+l92lwow2AELMA2KdlkUnNzvEXY6I2VR/buiKhZ 
oePoCb5jOOMIlHnmobLpBLeKwE/brNJjUwGgm9jn8t5NswQIPceqOphEiKWpNxKxMnyS DmtA==
X-Gm-Message-State: AOUpUlEQzb615Du3SOAZVGseA4nSaV+OasEBd+YulCQHNvUvIxn0x3F+ Ghcha3bPnf3tjrEycK+7R5I=
X-Google-Smtp-Source: AAOMgpcCjbqsZg7MIqZ0sAB7gIZoQkVYucj1p3oF3G3kesf8t1w4/QBH0x1/huqy+85XrlTH/C94GA==
X-Received: by 2002:a65:5004:: with SMTP id f4-v6mr2902535pgo.54.1533287643507; Fri, 03 Aug 2018 02:14:03 -0700 (PDT)
Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:02 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 3 Aug 2018 17:13:41 +0800
Message-Id: <1533287630-4221-4-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
MIME-Version: 1.0
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::544
Subject: [Qemu-devel] [PATCH v6 03/12] migration: avoid concurrent invoke channel_close by different threads
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

channel_close may be invoked by different threads. For example, the
source QEMU invokes qemu_fclose in the main thread, the migration thread
and the return path thread. The destination QEMU invokes qemu_fclose in
the main thread, the listen thread and the COLO incoming thread.

Add qemu_file_close_lock and take it around the close callback in
qemu_fclose so that these calls are serialized.

Signed-off-by: Lidong Chen
Reviewed-by: Daniel P. Berrangé
---
 migration/migration.c | 2 ++
 migration/migration.h | 7 +++++++
 migration/qemu-file.c | 6 ++++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index b7d9854..a3a0756 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3200,6 +3200,7 @@ static void migration_instance_finalize(Object *obj)
     qemu_sem_destroy(&ms->postcopy_pause_sem);
     qemu_sem_destroy(&ms->postcopy_pause_rp_sem);
     qemu_sem_destroy(&ms->rp_state.rp_sem);
+    qemu_mutex_destroy(&ms->qemu_file_close_lock);
     error_free(ms->error);
 }
 
@@ -3236,6 +3237,7 @@ static void migration_instance_init(Object *obj)
     qemu_sem_init(&ms->rp_state.rp_sem, 0);
     qemu_sem_init(&ms->rate_limit_sem, 0);
     qemu_mutex_init(&ms->qemu_file_lock);
+    qemu_mutex_init(&ms->qemu_file_close_lock);
 }
 
 /*
diff --git a/migration/migration.h b/migration/migration.h
index 64a7b33..a50c2de 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -122,6 +122,13 @@ struct MigrationState
     QemuMutex qemu_file_lock;
 
     /*
+     * The to_src_file and from_dst_file point to one QIOChannelRDMA,
+     * and qemu_fclose may be invoked by different threads. Use this lock
+     * to avoid concurrent invocation of channel_close by different threads.
+     */
+    QemuMutex qemu_file_close_lock;
+
+    /*
      * Used to allow urgent requests to override rate limiting.
      */
     QemuSemaphore rate_limit_sem;
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 977b9ae..74c48e0 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -323,12 +323,14 @@ void qemu_update_position(QEMUFile *f, size_t size)
  */
 int qemu_fclose(QEMUFile *f)
 {
-    int ret;
+    int ret, ret2;
     qemu_fflush(f);
     ret = qemu_file_get_error(f);
 
     if (f->ops->close) {
-        int ret2 = f->ops->close(f->opaque);
+        qemu_mutex_lock(&migrate_get_current()->qemu_file_close_lock);
+        ret2 = f->ops->close(f->opaque);
+        qemu_mutex_unlock(&migrate_get_current()->qemu_file_close_lock);
         if (ret >= 0) {
             ret = ret2;
         }

From patchwork Fri Aug 3 09:13:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 953135
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Hbnedu8J"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhH75Qvsz9s3Z for ; Fri, 3 Aug 2018 19:16:03 +1000 (AEST)
Received: from localhost ([::1]:49743 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWBk-0002TD-9D for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:16:00 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52795) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id
1flW9x-0001RR-AT for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9u-0001Nv-Kn for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:09 -0400 Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]:36318) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9u-0001Mt-Ai for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:06 -0400 Received: by mail-pf1-x432.google.com with SMTP id b11-v6so2930815pfo.3 for ; Fri, 03 Aug 2018 02:14:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ve+ewzePoLVm5jRNrYZ1DCfvOb5UloP+RAKRNeBmMoQ=; b=Hbnedu8JVlx8w5YEJIwEZwpRms5masX2kgYeGb6sP10wcfby+fP3Xcq55ohcoj8SZ0 KxN6X+pb/ypXo8RyKdkNCtCycGoqztl6uGOAeSxy4NBpcYIIIbCEM1PzR5mGBd7VNbf9 wEgP+hGuQomzMJG+YlvcI8APvA9wn4XATezy6VYIkDnq+gFLaT8i4nCrKyArm7wCWZ26 euNIcw/kvP0X5TEsGQccr1CCTDpDik1PLv9CKScR6L+1VGU4EMwIP9UUU/15AQd+zJDH ysGj2s3xv8AcwR9DnNO6twnSdwLMtKDH9mna2DznyYvjugA1IJFmfHatmEg2hCZixs7J PoIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ve+ewzePoLVm5jRNrYZ1DCfvOb5UloP+RAKRNeBmMoQ=; b=mRGLaX16o4f9C22gMNmm4oqsUTHjusF0sZC7tiDXKlN8HPnQ6LtGR+5Pah6iuVYc/l PNYBP81UAvNwlrumrIZTmv/8f4NKVHLC6OCPWVYK6+FPzeiu0AaR9DZ/vHPzAEU2OJHT k+QH/lee5+g1oLzPv7kG9GKlwI2qYOhrL/j5ct5BDjC6vHCzfKb2yAIFByB5VKQh1Ch3 WKHB7CiGg9/W/eFhHuoqXGpskJEZZNJe3qoqDVaN1Kwr0nGjTHQbfmDA4cLNCGcMYorv jDYJBpc5+MRZ7qUvwsX26BwIDGvPErZs9XoA/GSyapFCMoi/yJcFk0WxV1zqtU0KdiMr efjA== X-Gm-Message-State: AOUpUlHrH3tCIaFfuQsF2XjE5q5sH/y6Bi4HuzX0299QJcfL7nQzHZrH qM2r2rnCL69p+74DF38aN/4= X-Google-Smtp-Source: AAOMgpemJmnUtJGg43Ymv8gG25xK9n6vQ66w9quE1GwOCCkt5957gE4O2qeC0Qlp1d9/zYZfXXOaKg== X-Received: by 2002:a63:9f0a:: with SMTP id 
g10-v6mr2903640pge.324.1533287645328; Fri, 03 Aug 2018 02:14:05 -0700 (PDT)
Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:04 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com
Date: Fri, 3 Aug 2018 17:13:42 +0800
Message-Id: <1533287630-4221-5-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::432
Subject: [Qemu-devel] [PATCH v6 04/12] migration: implement bi-directional RDMA QIOChannel
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

This patch implements a bi-directional RDMA QIOChannel. Because
different threads may access the QIOChannelRDMA concurrently, this
patch uses RCU to protect it.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c         |   2 +
 migration/migration.c    |   2 +
 migration/postcopy-ram.c |   2 +
 migration/ram.c          |   4 +
 migration/rdma.c         | 196 ++++++++++++++++++++++++++++++++++++++++-------
 migration/savevm.c       |   3 +
 6 files changed, 183 insertions(+), 26 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 4381067..88936f5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque)
     uint64_t value;
     Error *local_err = NULL;
 
+    rcu_register_thread();
     qemu_sem_init(&mis->colo_incoming_sem, 0);
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
@@ -666,5 +667,6 @@ out:
     }
     migration_incoming_exit_colo();
 
+    rcu_unregister_thread();
     return NULL;
 }
diff --git a/migration/migration.c b/migration/migration.c
index a3a0756..f190964 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2104,6 +2104,7 @@ static void *source_return_path_thread(void *opaque)
     int res;
 
     trace_source_return_path_thread_entry();
+    rcu_register_thread();
 
 retry:
     while (!ms->rp_state.error && !qemu_file_get_error(rp) &&
@@ -2243,6 +2244,7 @@ out:
     trace_source_return_path_thread_end();
     ms->rp_state.from_dst_file = NULL;
     qemu_fclose(rp);
+    rcu_unregister_thread();
     return NULL;
 }
 
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 932f188..3952d78 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -853,6 +853,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
     RAMBlock *rb = NULL;
 
     trace_postcopy_ram_fault_thread_entry();
+    rcu_register_thread();
     mis->last_rb = NULL; /* last RAMBlock we sent part of */
     qemu_sem_post(&mis->fault_thread_sem);
 
@@ -1059,6 +1060,7 @@ retry:
             }
         }
     }
+    rcu_unregister_thread();
     trace_postcopy_ram_fault_thread_exit();
     g_free(pfd);
     return NULL;
diff --git a/migration/ram.c b/migration/ram.c
index 24dea27..4da0930 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -988,6 +988,7 @@ static void *multifd_send_thread(void *opaque)
     int ret;
 
     trace_multifd_send_thread_start(p->id);
+    rcu_register_thread();
 
     if (multifd_send_initial_packet(p, &local_err) < 0) {
         goto out;
@@ -1050,6 +1051,7 @@ out:
     p->running = false;
     qemu_mutex_unlock(&p->mutex);
 
+    rcu_unregister_thread();
     trace_multifd_send_thread_end(p->id, p->num_packets, p->num_pages);
 
     return NULL;
@@ -1219,6 +1221,7 @@ static void *multifd_recv_thread(void *opaque)
     int ret;
 
     trace_multifd_recv_thread_start(p->id);
+    rcu_register_thread();
 
     while (true) {
         uint32_t used;
@@ -1265,6 +1268,7 @@ static void *multifd_recv_thread(void *opaque)
     p->running = false;
     qemu_mutex_unlock(&p->mutex);
 
+    rcu_unregister_thread();
     trace_multifd_recv_thread_end(p->id, p->num_packets, p->num_pages);
 
     return NULL;
diff --git a/migration/rdma.c b/migration/rdma.c
index 57af5ed..a5535fb 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL;
                                 " to abort!"); \
                 rdma->error_reported = 1; \
             } \
+            rcu_read_unlock(); \
             return rdma->error_state; \
         } \
     } while (0)
@@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA;
 
 struct QIOChannelRDMA {
     QIOChannel parent;
-    RDMAContext *rdma;
+    RDMAContext *rdmain;
+    RDMAContext *rdmaout;
     QEMUFile *file;
     bool blocking; /* XXX we don't actually honour this yet */
 };
@@ -2630,12 +2632,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
     QEMUFile *f = rioc->file;
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     int ret;
     ssize_t done = 0;
     size_t i;
     size_t len = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmaout);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     /*
@@ -2645,6 +2655,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
     ret = qemu_rdma_write_flush(f, rdma);
     if (ret < 0) {
         rdma->error_state = ret;
+        rcu_read_unlock();
         return ret;
     }
 
@@ -2664,6 +2675,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
 
             if (ret < 0) {
                 rdma->error_state = ret;
+                rcu_read_unlock();
                 return ret;
             }
 
@@ -2672,6 +2684,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc,
         }
     }
 
+    rcu_read_unlock();
     return done;
 }
 
@@ -2705,12 +2718,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
                                       Error **errp)
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
-    RDMAContext *rdma = rioc->rdma;
+    RDMAContext *rdma;
     RDMAControlHeader head;
     int ret = 0;
     ssize_t i;
     size_t done = 0;
 
+    rcu_read_lock();
+    rdma = atomic_rcu_read(&rioc->rdmain);
+
+    if (!rdma) {
+        rcu_read_unlock();
+        return -EIO;
+    }
+
     CHECK_ERROR_STATE();
 
     for (i = 0; i < niov; i++) {
@@ -2722,7 +2743,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
          * were given and dish out the bytes until we run
          * out of bytes.
          */
-        ret = qemu_rdma_fill(rioc->rdma, data, want, 0);
+        ret = qemu_rdma_fill(rdma, data, want, 0);
         done += ret;
         want -= ret;
         /* Got what we needed, so go to next iovec */
@@ -2744,25 +2765,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc,
 
         if (ret < 0) {
             rdma->error_state = ret;
+            rcu_read_unlock();
             return ret;
         }
 
         /*
          * SEND was received with new bytes, now try again.
*/ - ret = qemu_rdma_fill(rioc->rdma, data, want, 0); + ret = qemu_rdma_fill(rdma, data, want, 0); done += ret; want -= ret; /* Still didn't get enough, so lets just return */ if (want) { if (done == 0) { + rcu_read_unlock(); return QIO_CHANNEL_ERR_BLOCK; } else { break; } } } + rcu_read_unlock(); return done; } @@ -2814,15 +2838,29 @@ qio_channel_rdma_source_prepare(GSource *source, gint *timeout) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; *timeout = -1; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when prepare Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2830,14 +2868,28 @@ static gboolean qio_channel_rdma_source_check(GSource *source) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when check Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2848,14 +2900,28 @@ qio_channel_rdma_source_dispatch(GSource *source, { QIOChannelFunc func = (QIOChannelFunc)callback; QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = 
atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when dispatch Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return (*func)(QIO_CHANNEL(rsource->rioc), (cond & rsource->condition), user_data); @@ -2900,15 +2966,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + RDMAContext *rdmain, *rdmaout; trace_qemu_rdma_close(); - if (rioc->rdma) { - if (!rioc->rdma->error_state) { - rioc->rdma->error_state = qemu_file_get_error(rioc->file); - } - qemu_rdma_cleanup(rioc->rdma); - g_free(rioc->rdma); - rioc->rdma = NULL; + + rdmain = rioc->rdmain; + if (rdmain) { + atomic_rcu_set(&rioc->rdmain, NULL); + } + + rdmaout = rioc->rdmaout; + if (rdmaout) { + atomic_rcu_set(&rioc->rdmaout, NULL); } + + synchronize_rcu(); + + if (rdmain) { + qemu_rdma_cleanup(rdmain); + } + + if (rdmaout) { + qemu_rdma_cleanup(rdmaout); + } + + g_free(rdmain); + g_free(rdmaout); + return 0; } @@ -2951,12 +3034,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, size_t size, uint64_t *bytes_sent) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int ret; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return RAM_SAVE_CONTROL_NOT_SUPP; } @@ -3041,9 +3133,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, } } + rcu_read_unlock(); return RAM_SAVE_CONTROL_DELAYED; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3219,8 +3313,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) RDMAControlHeader blocks = { .type = 
RDMA_CONTROL_RAM_BLOCKS_RESULT, .repeat = 1 }; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; - RDMALocalBlocks *local = &rdma->local_ram_blocks; + RDMAContext *rdma; + RDMALocalBlocks *local; RDMAControlHeader head; RDMARegister *reg, *registers; RDMACompress *comp; @@ -3233,8 +3327,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) int count = 0; int i = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); + local = &rdma->local_ram_blocks; do { trace_qemu_rdma_registration_handle_wait(); @@ -3468,6 +3571,7 @@ out: if (ret < 0) { rdma->error_state = ret; } + rcu_read_unlock(); return ret; } @@ -3481,10 +3585,18 @@ out: static int rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) { - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int curr; int found = -1; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + /* Find the matching RAMBlock in our local list */ for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) { if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) { @@ -3495,6 +3607,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) if (found == -1) { error_report("RAMBlock '%s' not found on destination", name); + rcu_read_unlock(); return -ENOENT; } @@ -3502,6 +3615,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) trace_rdma_block_notification_handle(name, rdma->next_src_index); rdma->next_src_index++; + rcu_read_unlock(); return 0; } @@ -3524,11 +3638,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, uint64_t flags, void *data) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; + + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); 
+ return -EIO; + } CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3536,6 +3658,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, qemu_put_be64(f, RAM_SAVE_FLAG_HOOK); qemu_fflush(f); + rcu_read_unlock(); return 0; } @@ -3548,13 +3671,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, { Error *local_err = NULL, **errp = &local_err; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; RDMAControlHeader head = { .len = 0, .repeat = 1 }; int ret = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3586,6 +3717,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, qemu_rdma_reg_whole_ram_blocks : NULL); if (ret < 0) { ERROR(errp, "receiving remote info!"); + rcu_read_unlock(); return ret; } @@ -3609,6 +3741,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, "not identical on both the source and destination.", local->nb_blocks, nb_dest_blocks); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } @@ -3625,6 +3758,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, local->block[i].length, rdma->dest_blocks[i].length); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } local->block[i].remote_host_addr = @@ -3642,9 +3776,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, goto err; } + rcu_read_unlock(); return 0; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3662,10 +3798,15 @@ static const QEMUFileHooks rdma_write_hooks = { static void qio_channel_rdma_finalize(Object *obj) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj); - if (rioc->rdma) { - qemu_rdma_cleanup(rioc->rdma); 
- g_free(rioc->rdma); - rioc->rdma = NULL; + if (rioc->rdmain) { + qemu_rdma_cleanup(rioc->rdmain); + g_free(rioc->rdmain); + rioc->rdmain = NULL; + } + if (rioc->rdmaout) { + qemu_rdma_cleanup(rioc->rdmaout); + g_free(rioc->rdmaout); + rioc->rdmaout = NULL; } } @@ -3705,13 +3846,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode) } rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); - rioc->rdma = rdma; if (mode[0] == 'w') { rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc)); + rioc->rdmaout = rdma; + rioc->rdmain = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_write_hooks); } else { rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc)); + rioc->rdmain = rdma; + rioc->rdmaout = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_read_hooks); } diff --git a/migration/savevm.c b/migration/savevm.c index 7f92567..13e51f0 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1622,6 +1622,7 @@ static void *postcopy_ram_listen_thread(void *opaque) qemu_sem_post(&mis->listen_thread_sem); trace_postcopy_ram_listen_thread_start(); + rcu_register_thread(); /* * Because we're a thread and not a coroutine we can't yield * in qemu_file, and thus we must be blocking now. @@ -1662,6 +1663,7 @@ static void *postcopy_ram_listen_thread(void *opaque) * to leave the guest running and fire MCEs for pages that never * arrived as a desperate recovery step. 
*/ + rcu_unregister_thread(); exit(EXIT_FAILURE); } @@ -1676,6 +1678,7 @@ static void *postcopy_ram_listen_thread(void *opaque) migration_incoming_state_destroy(); qemu_loadvm_state_cleanup(); + rcu_unregister_thread(); return NULL; } From patchwork Fri Aug 3 09:13:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953133 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ntwGPjUE"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhFz06p9z9s0R for ; Fri, 3 Aug 2018 19:15:02 +1000 (AEST) Received: from localhost ([::1]:49738 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWAj-0001Ut-Dw for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:14:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52833) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flW9z-0001Sj-7S for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9v-0001QH-Vi for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:11 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:34659) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) 
(envelope-from ) id 1flW9v-0001Ou-PG for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:07 -0400 Received: by mail-pf1-x444.google.com with SMTP id k19-v6so2936236pfi.1 for ; Fri, 03 Aug 2018 02:14:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=BhTjWSxI5vI0JqfpTA2MCjzO8Oc/jwHf7sGsu2PlpTU=; b=ntwGPjUEC0uruCP+MpFkGLxMYXjD/2VMBoi6uleZpZpM/191ZIqhfoOSprbcq8bB8J 2vUgA2Ey7SziLnZBJWF1W4ca8nY91aoFDWG8vkPscdTUDmpivOIWSTwvU0rAnPh+xtCT s4ZsQkMFcb3g84Npuz081QMIrEUz3WNeesuLF64oo/w6cG8o2ToQX1VPHNk8T0kFLIBP 0cAbjt2JGBJ2Fu8z3MKSWj7V4Jyxx8mfolXnAXvAOCk99RiP0Oqu43hC0SOye7NiNjRD iPienYfo6NuiGssLmeCxg/GVTRCd+7+nQrgIe/UTlimoB47CoWXd7Bqv789AF3BSE0/X jl/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BhTjWSxI5vI0JqfpTA2MCjzO8Oc/jwHf7sGsu2PlpTU=; b=bkZdb+OeZ3wM0fdUWvwxJx8DhWE+wDzB6a8tpvITnj4RmXOUYYWybVyF22g1I7eNyg h2i8cDYrDYVhs7Yf+5lwgBY9CHwJlFoppT8Y6BP2sJeyDfAoLUykOh5WiA8RlMcfANCK RKDTQz1Bw5ytWaW+V67RW7grmyvamHlDdgxaUqdmCHOeQhuRZU3yWqpmBHS5+AVYlGIx o/VbBWfDaImWRBZPFuc5g9kAq6f5Hp3zcUGRHTHoeW6++XcOdS2GRIAjmg68aw5rGtNj LnSCqB52L+8qI/xTNdOZvueDyAnM0eTv2Si/ZcUTzyKxoPlJ80DR3xIANV9kjb0jkCXd Tz2g== X-Gm-Message-State: AOUpUlHTqMxr3Acf8D1X/RQp+1SZxWiyugTIDlcHKRh0r9X3fbTa27HK 1daes0v6cEG359EnXxaJM+k= X-Google-Smtp-Source: AAOMgpeMVuRVK8jMCceObLCaob1Suc1BA8RIprU/D3RXxmlNSVTFU8tCvop6XT3lRmDBgAGmeCfUmA== X-Received: by 2002:a63:be4a:: with SMTP id g10-v6mr2863595pgo.378.1533287647020; Fri, 03 Aug 2018 02:14:07 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:06 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, 
quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:43 +0800 Message-Id: <1533287630-4221-6-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::444 Subject: [Qemu-devel] [PATCH v6 05/12] migration: Stop rdma yielding during incoming postcopy X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen During incoming postcopy, the destination QEMU invokes qemu_rdma_wait_comp_channel in a separate thread, so it must not use the RDMA yield there; poll the completion channel fd instead. Signed-off-by: Lidong Chen Reviewed-by: Dr. David Alan Gilbert --- migration/rdma.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/migration/rdma.c b/migration/rdma.c index a5535fb..cfb0671 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -1493,11 +1493,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma) * Coroutine doesn't start until migration_fd_process_incoming() * so don't yield unless we know we're running inside of a coroutine. */ - if (rdma->migration_started_on_destination) { + if (rdma->migration_started_on_destination && + migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) { yield_until_fd_readable(rdma->comp_channel->fd); } else { /* This is the source side, we're in a separate thread * or destination prior to migration_fd_process_incoming() + * after postcopy, the destination is also in a separate thread. * we can't yield; so we have to poll the fd.
* But we need to be able to handle 'cancel' or an error * without hanging forever. From patchwork Fri Aug 3 09:13:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953132 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="nydgSbRL"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhFw618gz9s0R for ; Fri, 3 Aug 2018 19:15:00 +1000 (AEST) Received: from localhost ([::1]:49739 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWAk-0001Ve-CA for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:14:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52845) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flW9z-0001T9-Oc for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9x-0001WZ-To for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:11 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:38122) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9x-0001Tu-JB for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:09 -0400 Received: by mail-pf1-x429.google.com with SMTP id 
x17-v6so2928832pfh.5 for ; Fri, 03 Aug 2018 02:14:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=i+9cu6YEw6HZ98kuAAxb7164x+SFub9Zft6Br89mO6c=; b=nydgSbRLaiX4u1s5gA94N3LTxvuRzovqDQfI4sjrxDI/PCb2nEQjMuU5hyLB9T1Q1e 64SbMm5ospr58jklE6IKMyGS4FX9q3nfkufyWiO7Mmk+j5jENFfMV8Z9W9Ot8/z4tRoE 8XkzfQ7fkqOyH8wmi8aaE2QkvJCRNfWleqlj22Ulfbwo6QuemgIXtllXxSoFtd4U+Upz bdbjL+gQCa9FG+Hsq0HI/3kpVMXUuC+WWX1Q/ZtdKNxATMNa0tEyT6FMjrwDqwzAeOTV BtCPXYhwVmvuqov4F2/m0r+hvCa5N97mPtw5g2RqtTUnKmhVP+c7UvDSpsyZHfQfCPjm LKag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=i+9cu6YEw6HZ98kuAAxb7164x+SFub9Zft6Br89mO6c=; b=QCDmtp/zba6WFzAbO+3tInQ87Zp0sVGX+PRs+C+vSpcWVfFS+BgFank4gJB9T9tulw bJ0fN8EMY8eQG/6W+/JqK64mD6k3+2Qv/ENpAdl8O9m3Ktu6ww50juArR+7+2HGLoXwr Sudvt81rV5s1UKdr+i8/Z6i7tqiT+r6MUPuorp8gkpB/5oFKD/6ohPVUNeOcNP39XaVN gaxRvhCo1hDzxZCLuO7E1mAZQs6Ra0BHUciszN5SjSJ4BCXu7LmS8uPxnVSaUNwIZ8b4 i0Y5OIF+wCMBY9SP/PMC9/KwlMAZEPUfd2Evs9UPPXBMme8S1GYXXyVjz8ZZTECGSvdc rSjw== X-Gm-Message-State: AOUpUlE6fkoA2BzVD0K8I+fHDQfXWm9vd0Cs1gcLQiNW6tkxGWusp3Vg iFzqc1nLp2WHo1KA5+RRWMJjbZQ2NcQ= X-Google-Smtp-Source: AAOMgpc8/8epSVfH+87tDn3KVz/QGPmZx4dgkQFPlbstKHMBTv+0TEiv2egpkdzEIx5qTKelXGaXgQ== X-Received: by 2002:a62:4c0f:: with SMTP id z15-v6mr3465076pfa.110.1533287648813; Fri, 03 Aug 2018 02:14:08 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.07 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:08 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:44 +0800 Message-Id: <1533287630-4221-7-git-send-email-lidongchen@tencent.com> X-Mailer: 
git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::429 Subject: [Qemu-devel] [PATCH v6 06/12] migration: implement io_set_aio_fd_handler function for RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen If qio_channel_rdma_readv returns QIO_CHANNEL_ERR_BLOCK, the destination QEMU crashes. The backtrace is: (gdb) bt #0 0x0000000000000000 in ?? () #1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080, io_read=0x8db841 , io_write=0x0, opaque=0x38111e0) at io/channel.c: #2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438 #3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47 #4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327 at migration/qemu-file-channel.c:83 #5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299 #6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562 #7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575 #8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655 #9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126 #10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366 #11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1 #12 0x00007f9c0db56d40 in ??
() from /lib64/libc.so.6 #13 0x00007f96fe858760 in ?? () #14 0x0000000000000000 in ?? () The RDMA QIOChannel does not implement io_set_aio_fd_handler, so qio_channel_set_aio_fd_handler ends up calling a NULL function pointer. Signed-off-by: Lidong Chen Reviewed-by: Juan Quintela --- migration/rdma.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/migration/rdma.c b/migration/rdma.c index cfb0671..d6bbf28 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2963,6 +2963,21 @@ static GSource *qio_channel_rdma_create_watch(QIOChannel *ioc, return source; } +static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc, + AioContext *ctx, + IOHandler *io_read, + IOHandler *io_write, + void *opaque) +{ + QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + if (io_read) { + aio_set_fd_handler(ctx, rioc->rdmain->comp_channel->fd, + false, io_read, io_write, NULL, opaque); + } else { + aio_set_fd_handler(ctx, rioc->rdmaout->comp_channel->fd, + false, io_read, io_write, NULL, opaque); + } +} static int qio_channel_rdma_close(QIOChannel *ioc, Error **errp) @@ -3822,6 +3837,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass, ioc_klass->io_set_blocking = qio_channel_rdma_set_blocking; ioc_klass->io_close = qio_channel_rdma_close; ioc_klass->io_create_watch = qio_channel_rdma_create_watch; + ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler; } static const TypeInfo qio_channel_rdma_info = { From patchwork Fri Aug 3 09:13:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953143 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail
(p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="lY4at5Dx"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhS64bWMz9s2g for ; Fri, 3 Aug 2018 19:23:50 +1000 (AEST) Received: from localhost ([::1]:49791 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWJI-0000aQ-4W for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:23:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52884) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA1-0001UY-I4 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flW9z-0001Zd-N8 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:13 -0400 Received: from mail-pf1-x432.google.com ([2607:f8b0:4864:20::432]:35383) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flW9z-0001Xo-A6 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:11 -0400 Received: by mail-pf1-x432.google.com with SMTP id p12-v6so2939392pfh.2 for ; Fri, 03 Aug 2018 02:14:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; b=lY4at5DxzP/OmsUeG9y4KowZOTbmfoQ3KTZOCTUY/j4PuupiVwqJ5wq0+XHm5/Ar0j sQ+uy99Ns6gOKMXQcD78bedc3XuytWcV2J9Y9eGVcG3rOAc6Kdm0azh0fMJuHlKSMmru fxK3dVZgrrd/gGF9jLHUO6ajkw+0Q5EEy0rAZ33ZgSkwK6browCxW/zpNx9wpsCuuNAh Dqu+UvBv66laO80QcfzoDT3yQqBwv65DvuDwnF/E9EPpeMZZAM3tmciI6ncKMn65vPGi JRDCkb7LIx88KP9FMJHz0pP7uW0UmMZxDewXxNVzyyIYWYZMKlL8IVXIUStkCcUu8x7Q MRoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; b=nbIQil41q2VLgxeiadmaPXJLpu3iCxsm+Q2BrLUgO0w1E9mgF/JtZ8Cd51Tb+wQxL1 tXW7POASdgnqB6zD/OSWNo1ZmNXafjDUSeZqbc/ZwaZBVKbT1YLhNKRGU0GZL+5NEPE1 NtjauVGtp0Tmmu990pIwb/vzlGK2SJ4myefey047W5ULDZ8+44eCDV12lwNegKqPs+D5 uVTikdlgR7ftKhSFcLTIB3IC8tuh+viopAO7KjUJa4Z8tfuw2xktGdfpPP9VeLyqars3 AE3cQqf7q1i24/WaOmARmiMOMDZElU5SwSWk8i8U+cYQ7P+zIM9YZo1eQ3tQUHzlzISH JhKw== X-Gm-Message-State: AOUpUlEW1FNFkGu9vz8ErshCrhfPZZygvc96QN2yoMuT9146Ye687yUg 7j4okYmHDURWzEL4Eyl0t1o= X-Google-Smtp-Source: AAOMgpfKCrQ8eZZGoLnCEikLO/Il37/iako53UfP6+53MMtxZaH+WgZWoKxsUUQBkQ4KK/k5vwlOsw== X-Received: by 2002:a62:4083:: with SMTP id f3-v6mr3460170pfd.229.1533287650506; Fri, 03 Aug 2018 02:14:10 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:10 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:45 +0800 Message-Id: <1533287630-4221-8-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::432 Subject: [Qemu-devel] [PATCH v6 07/12] migration: invoke qio_channel_yield only when qemu_in_coroutine() X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen When qio_channel_read returns QIO_CHANNEL_ERR_BLOCK, the source QEMU crashes. The backtrace is: (gdb) bt #0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6 #1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6 #2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460 #5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768) at migration/qemu-file-channel.c:83 #6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299 #7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562 #8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575 #9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647 #10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794 #11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504 #12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0 #13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6 This patch fixes the crash by invoking qio_channel_yield only when qemu_in_coroutine() is true, and falling back to qio_channel_wait otherwise.
Signed-off-by: Lidong Chen Reviewed-by: Juan Quintela --- migration/qemu-file-channel.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c index e202d73..8e639eb 100644 --- a/migration/qemu-file-channel.c +++ b/migration/qemu-file-channel.c @@ -49,7 +49,11 @@ static ssize_t channel_writev_buffer(void *opaque, ssize_t len; len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL); if (len == QIO_CHANNEL_ERR_BLOCK) { - qio_channel_wait(ioc, G_IO_OUT); + if (qemu_in_coroutine()) { + qio_channel_yield(ioc, G_IO_OUT); + } else { + qio_channel_wait(ioc, G_IO_OUT); + } continue; } if (len < 0) { @@ -80,7 +84,11 @@ static ssize_t channel_get_buffer(void *opaque, ret = qio_channel_read(ioc, (char *)buf, size, NULL); if (ret < 0) { if (ret == QIO_CHANNEL_ERR_BLOCK) { - qio_channel_yield(ioc, G_IO_IN); + if (qemu_in_coroutine()) { + qio_channel_yield(ioc, G_IO_IN); + } else { + qio_channel_wait(ioc, G_IO_IN); + } } else { /* XXX handle Error * object */ return -EIO; From patchwork Fri Aug 3 09:13:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953137 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="FiXgbURY"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client 
certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhKM0q1mz9s0R for ; Fri, 3 Aug 2018 19:17:59 +1000 (AEST) Received: from localhost ([::1]:49755 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWDc-0004My-Mf for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:17:56 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52914) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA3-0001Vg-0Q for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flWA1-0001dc-Ix for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:14 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:33091) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flWA1-0001cM-Ab for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:13 -0400 Received: by mail-pg1-x541.google.com with SMTP id r5-v6so2582940pgv.0 for ; Fri, 03 Aug 2018 02:14:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=NJ8AlY4mjuV9hYO17hGvZ8dmi6Tk0gA05ynUsTBv4Fw=; b=FiXgbURYs0OhG2i4wrD6PM8klDiCpBv8gRxWAfiGwU7okMeuWDrzGklFmuUzQ1WZlH aEU76p+5nmGoWMYvtteXbix7eBH7DW8PBiX7xUdp/agoChoKw2nVopXD5/TdUdd0NX7S 4ypAAsM612+1OJNOfV8MEk6rMVQAbdMvRnbOVBMuw9nHg0Tdbdy+OzLF8VevcKMjA1ob gQ3+Q3Sj9LS8dfC6xseFcK+X2I+sIQpjceKN3UwID93hFScv/ZjmvkTasaeea5Ow67lw Ir0EOBIwl4PlGhqxhqeBGiCt1C2FmxVueM5lVURQbt+AmVo2N3J6jCUIsx62IELh1dwH PQPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=NJ8AlY4mjuV9hYO17hGvZ8dmi6Tk0gA05ynUsTBv4Fw=; b=iA0TvHJDzdzzf5Tw60Oxt2xLnOXPrAqCUIcVL0ToGKl2MUg7NDOebttR0AHpXoyoya xDQo4kQH7ieB+PJZF+k7nu7y2XU9XlCEiLfOqOylyHQg40qgn+oqiyfhUW37sX+MlUQI 
IvoG6385Te+3a1TNdiNSVw/kJj8q0sbzgsaEI+e4wv1g8YiPQ/tFaderWdyW6Kk/Spn3 EyjbOPsfbVhMIfxvbjxXmAbEczaz/ASscLz4ZsqPa2kU0yThCn6c4rTNjS0D109vXL7E Ol99Aki3wImrN1O8Tq8xPK7dVlgQIPQbTdqIxS/q4JhMDTWcnREfDOg24EryIfyWIXKf RKVg== X-Gm-Message-State: AOUpUlFDxu5igqflRqRiPlamUaWDGW0R03FqP0RiBCwOGEdlxAO9gXAA GGSiIGHoN6NdD+LqSSojGOvB9avfcJI= X-Google-Smtp-Source: AAOMgpcwxiS2FYEc76s1vk6e085CAS2ITtq8XhoDrtoDRzMp5fSesZ5dw6rePF6Ez1JwWfvKtVrAzQ== X-Received: by 2002:aa7:8591:: with SMTP id w17-v6mr3506785pfn.77.1533287652493; Fri, 03 Aug 2018 02:14:12 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:12 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:46 +0800 Message-Id: <1533287630-4221-9-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 Subject: [Qemu-devel] [PATCH v6 08/12] migration: poll the cm event while wait RDMA work request completion X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Gal Shachaf , Lidong Chen , Aviad Yehezkel , qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen If the peer qemu is crashed, the qemu_rdma_wait_comp_channel function maybe loop forever. 
So we should also poll the cm event fd, and treat RDMA_CM_EVENT_DISCONNECTED and RDMA_CM_EVENT_DEVICE_REMOVAL as errors. Signed-off-by: Lidong Chen Signed-off-by: Gal Shachaf Signed-off-by: Aviad Yehezkel --- migration/rdma.c | 33 ++++++++++++++++++++++++++++++--- 1 file changed, 30 insertions(+), 3 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index d6bbf28..673f126 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out, */ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma) { + struct rdma_cm_event *cm_event; + int ret = -1; + /* * Coroutine doesn't start until migration_fd_process_incoming() * so don't yield unless we know we're running inside of a coroutine. @@ -1505,13 +1508,37 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma) * without hanging forever. */ while (!rdma->error_state && !rdma->received_error) { - GPollFD pfds[1]; + GPollFD pfds[2]; pfds[0].fd = rdma->comp_channel->fd; pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + pfds[0].revents = 0; + + pfds[1].fd = rdma->channel->fd; + pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR; + pfds[1].revents = 0; + /* 0.1s timeout, should be fine for a 'cancel' */ - switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) { + switch (qemu_poll_ns(pfds, 2, 100 * 1000 * 1000)) { + case 2: case 1: /* fd active */ - return 0; + if (pfds[0].revents) { + return 0; + } + + if (pfds[1].revents) { + ret = rdma_get_cm_event(rdma->channel, &cm_event); + if (!ret) { + rdma_ack_cm_event(cm_event); + } + + error_report("receive cm event while wait comp channel," + "cm event is %d", cm_event->event); + if (cm_event->event == RDMA_CM_EVENT_DISCONNECTED || + cm_event->event == RDMA_CM_EVENT_DEVICE_REMOVAL) { + return -EPIPE; + } + } + break; case 0: /* Timeout, go around again */ break; From patchwork Fri Aug 3 09:13:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953144 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="jZCRlJKT"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhV426Vjz9s2g for ; Fri, 3 Aug 2018 19:25:32 +1000 (AEST) Received: from localhost ([::1]:49800 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWKv-0002GH-Ui for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:25:29 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52933) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA4-0001Xa-OK for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flWA3-0001hs-C7 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:16 -0400 Received: from mail-pl0-x232.google.com ([2607:f8b0:400e:c01::232]:43714) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flWA3-0001fu-2U for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:15 -0400 Received: by mail-pl0-x232.google.com with SMTP id x6-v6so2297874plv.10 for ; Fri, 03 Aug 2018 02:14:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=UaUc+NfHOvncluzDJkjXLlo2mSi7ZcHceZZz9mI6xM4=; b=jZCRlJKTTOUlGmSHLwAODgsO/AesyzG7phoGXK3YggKd7zsX5lkbXjDv4pmnuOTMn/ P3Abudrv41r0il7jRYbrLRSBnvmrNrgpu+qGWGUmulowmei8ATCBs/1klj2+twmCWUPD ifbhANoYaUJuSwzVgHX4mOI2RH1ADLK7mDs5tC1HTA2ED9WSDT/A6yYNSaYXEm/NXaIU ZDP5UEX7lCaxNJey+wv3/DkooniNm5Ac6lmN1II1jCiUDZgSC0t4F1dtQZ5oOllMc0iO avc4C/Lnjl72+d310BK29bmXjLO2keqzIXPHE9Y9aCVAdBOEuznQtr4bx5iIhCF5KLzZ BkhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=UaUc+NfHOvncluzDJkjXLlo2mSi7ZcHceZZz9mI6xM4=; b=UHD9htatJ0doZbU0itrSIuSwuivSPD+cGc6xt7/2y6wugSqGDmNjBs/QZCHvHj8vFf hXRHhT2yzcDxOWfTrM8cWGG5Nms+vMj6kDylSjYtxq9cQI7z39Vm+qP7RFpkT3Z6z22v OoUPNHpVP+NAyo2MFrUJwGlEqrXuSsUu6a52UeEonzluUCnxFCXUIX5R8G7fVTVGIOd6 YrpjWXywAPCSAQZcGewq3h6bRluxshz7CuxCcHWYhNVc02zZcSCxm6ghSur0kpd3ovpC QkuK1uKZ2hK9x/8D9m7mBm8xzn5H7j1w13zqxct0vdqS+wDdTdKtDqPFbKMzkUJoydyI La8w== X-Gm-Message-State: AOUpUlESAIiRnN/u+j8Df0LiJ4EP11PRdWOwXkZGmiOIaPo8wmFQL1hn mVEsc/s2tFTxQ0Hdrj2zTU8= X-Google-Smtp-Source: AAOMgpcRrohT5Lj78mcNI37y+qO2Sl3hZonrUNFB7z4zc4hPSpRjo1vMry0bEfNB3Zc/VIOgrL9tEw== X-Received: by 2002:a17:902:d906:: with SMTP id c6-v6mr2726015plz.65.1533287654263; Fri, 03 Aug 2018 02:14:14 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:13 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:47 +0800 Message-Id: <1533287630-4221-10-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: 
<1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::232 Subject: [Qemu-devel] [PATCH v6 09/12] migration: implement the shutdown for RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen Because the RDMA QIOChannel does not implement the shutdown function, if to_dst_file is set to an error state, the return path thread will wait forever, and the migration thread will wait for the return path thread to exit. The backtrace of the return path thread is: (gdb) bt #0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6 #1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000) at qemu-timer.c:325 #2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000) at migration/rdma.c:1501 #3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000, byte_len=0x7ef7091d0640) at migration/rdma.c:1580 #4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000, head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726 #5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720, expecting=3) at migration/rdma.c:1903 #6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8, size=32768) at migration/rdma.c:2714 #7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232 #8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0) at migration/qemu-file.c:502 #9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515 #10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591 #11 0x00000000006a46d3 in
source_return_path_thread ( opaque=0xd826a0 ) at migration/migration.c:1331 #12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 #13 0x00007f372a77635d in clone () from /lib64/libc.so.6 The backtrace of the migration thread is: (gdb) bt #0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0 #1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 ) at util/qemu-thread-posix.c:504 #2 0x00000000006a4bc5 in await_return_path_close_on_source ( ms=0xd826a0 ) at migration/migration.c:1460 #3 0x00000000006a53e4 in migration_completion (s=0xd826a0 , current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980) at migration/migration.c:1695 #4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 ) at migration/migration.c:1837 #5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0 #6 0x00007f372a77635d in clone () from /lib64/libc.so.6 Signed-off-by: Lidong Chen Reviewed-by: Dr. David Alan Gilbert --- migration/rdma.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/migration/rdma.c b/migration/rdma.c index 673f126..1affc46 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -3039,6 +3039,45 @@ static int qio_channel_rdma_close(QIOChannel *ioc, return 0; } +static int +qio_channel_rdma_shutdown(QIOChannel *ioc, + QIOChannelShutdown how, + Error **errp) +{ + QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + RDMAContext *rdmain, *rdmaout; + + rcu_read_lock(); + + rdmain = atomic_rcu_read(&rioc->rdmain); + rdmaout = atomic_rcu_read(&rioc->rdmaout); + + switch (how) { + case QIO_CHANNEL_SHUTDOWN_READ: + if (rdmain) { + rdmain->error_state = -1; + } + break; + case QIO_CHANNEL_SHUTDOWN_WRITE: + if (rdmaout) { + rdmaout->error_state = -1; + } + break; + case QIO_CHANNEL_SHUTDOWN_BOTH: + default: + if (rdmain) { + rdmain->error_state = -1; + } + if (rdmaout) { + rdmaout->error_state = -1; + } + break; + } + + rcu_read_unlock(); + return 0; +} + /* * Parameters: * @offset
== 0 : @@ -3865,6 +3904,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass, ioc_klass->io_close = qio_channel_rdma_close; ioc_klass->io_create_watch = qio_channel_rdma_create_watch; ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler; + ioc_klass->io_shutdown = qio_channel_rdma_shutdown; } static const TypeInfo qio_channel_rdma_info = { From patchwork Fri Aug 3 09:13:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953138 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ETbNHjT5"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhLL5WCWz9s2g for ; Fri, 3 Aug 2018 19:18:50 +1000 (AEST) Received: from localhost ([::1]:49757 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWES-0004y9-Ct for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:18:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52951) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA5-0001Yd-R2 for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flWA4-0001lZ-Ne for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:17 -0400 Received: 
from mail-pl0-x231.google.com ([2607:f8b0:400e:c01::231]:45562) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flWA4-0001jy-Gu for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:16 -0400 Received: by mail-pl0-x231.google.com with SMTP id j8-v6so2292912pll.12 for ; Fri, 03 Aug 2018 02:14:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=tmaMPgA/4D3Vt0rma7Rf6cf1XnX0c3yLSjIs9mPFLn8=; b=ETbNHjT5QnxDAAQycXz/kyJ2qIUj5yYcU3lgeRp2SLb8ZKyDr3Qv+WJdX5O+yW+lPu 7Pn2aat9CMRRkOnS+Ht3QuK9PFh+tyXTnqhVeCdfCr6vtsopZOGSQ4xzQjYIpFJ8AL8P qH4HzlPCSAEBSNk/6lsNNxPWAY1hkSfKOyRgpb/nB80IQX5ZYCXEFNX5RttELj39gcFQ GF7COtYJ9pbSyvml4fuDca6x0FV4LD/N38KVq5epeRJm7aIXvZa+Lyn12THgA/g6YrTC gLsDsPIYDtwFcR0Ap1+woPMTgJTjFaZ6XQyIpSHcItIscYoydHRUHb2wH3RVwjwaGLdS bsqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=tmaMPgA/4D3Vt0rma7Rf6cf1XnX0c3yLSjIs9mPFLn8=; b=iFubsy98GsvxbgW/kSooUlpEfE4t+VUDyehAuWhJiCcmfugFgsHBFIxWCIG7RVDI3f QcAPBTmUT/Ti0ccaCcqQVX/UM3KJfZe/FDVBKBZvzUfGe0h44n+LRT7OFuxosbGl4Ctg r5Fsk/ZiHOvQTMyS3EBhe5mg+1kaixvdPc2yxQNfjYdPCysaUv2HvBD4vCobDRs8j4iS +lWdFzaphxCx67FyemJK+/n9B8PawuN5wHKLjIo4G3t2dCPp333zFXKLopKMep1yopy5 zJ2XJccCzNHz4S/iktAjQLG0DTxivunf+SXbe+djhln0oqh4b6lWcwxou5GT3nyhZ0LZ f+qQ== X-Gm-Message-State: AOUpUlGuklsIDlXHs9gqgzP0+uO+49xuj32MPBSGBHWcUxn3gC3F0gqN UpzKYzFiDXq5n1pgG8CFrRw= X-Google-Smtp-Source: AAOMgpeR82VFRCs2qWiL5bxge6ZUbpprXqx7dsw7xa3dlWYLHY6C06If3LY/1q3xK6acGpcg9lF4iQ== X-Received: by 2002:a17:902:8b86:: with SMTP id ay6-v6mr2690224plb.25.1533287655772; Fri, 03 Aug 2018 02:14:15 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 
Aug 2018 02:14:15 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:48 +0800 Message-Id: <1533287630-4221-11-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::231 Subject: [Qemu-devel] [PATCH v6 10/12] migration: poll the cm event for destination qemu X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The destination qemu only polls the comp_channel->fd in qemu_rdma_wait_comp_channel. But when the source qemu disconnects the rdma connection, the destination qemu should be notified.
Signed-off-by: Lidong Chen --- migration/migration.c | 3 ++- migration/rdma.c | 31 ++++++++++++++++++++++++++++++- 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index f190964..360ee94 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -389,6 +389,7 @@ static void process_incoming_migration_co(void *opaque) int ret; assert(mis->from_src_file); + mis->migration_incoming_co = qemu_coroutine_self(); mis->largest_page_size = qemu_ram_pagesize_largest(); postcopy_state_set(POSTCOPY_INCOMING_NONE); migrate_set_state(&mis->state, MIGRATION_STATUS_NONE, @@ -418,7 +419,6 @@ static void process_incoming_migration_co(void *opaque) /* we get COLO info, and know if we are in COLO mode */ if (!ret && migration_incoming_enable_colo()) { - mis->migration_incoming_co = qemu_coroutine_self(); qemu_thread_create(&mis->colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); mis->have_colo_incoming_thread = true; @@ -442,6 +442,7 @@ static void process_incoming_migration_co(void *opaque) } mis->bh = qemu_bh_new(process_incoming_migration_bh, mis); qemu_bh_schedule(mis->bh); + mis->migration_incoming_co = NULL; } static void migration_incoming_setup(QEMUFile *f) diff --git a/migration/rdma.c b/migration/rdma.c index 1affc46..62de2ec 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -3226,6 +3226,34 @@ err: static void rdma_accept_incoming_migration(void *opaque); +static void rdma_cm_poll_handler(void *opaque) +{ + RDMAContext *rdma = opaque; + int ret; + struct rdma_cm_event *cm_event; + MigrationIncomingState *mis = migration_incoming_get_current(); + + ret = rdma_get_cm_event(rdma->channel, &cm_event); + if (ret) { + return; + } + + if (cm_event->event == RDMA_CM_EVENT_DISCONNECTED || + cm_event->event == RDMA_CM_EVENT_DEVICE_REMOVAL) { + rdma_ack_cm_event(cm_event); + error_report("receive cm event, cm event is %d", cm_event->event); + rdma->error_state = 
-EPIPE; + if (rdma->return_path) { + rdma->return_path->error_state = -EPIPE; + } + + if (mis->migration_incoming_co) { + qemu_coroutine_enter(mis->migration_incoming_co); + } + return; + } +} + static int qemu_rdma_accept(RDMAContext *rdma) { RDMACapabilities cap; @@ -3326,7 +3354,8 @@ static int qemu_rdma_accept(RDMAContext *rdma) NULL, (void *)(intptr_t)rdma->return_path); } else { - qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL); + qemu_set_fd_handler(rdma->channel->fd, rdma_cm_poll_handler, + NULL, rdma); } ret = rdma_accept(rdma->cm_id, &conn_param); From patchwork Fri Aug 3 09:13:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953140 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ESR/Fae3"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhN75J2Mz9s2g for ; Fri, 3 Aug 2018 19:20:23 +1000 (AEST) Received: from localhost ([::1]:49765 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWFx-0006NN-Bf for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:20:21 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52987) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA9-0001Zr-4T for qemu-devel@nongnu.org; Fri, 03 
Aug 2018 05:14:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flWA6-0001pT-AO for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:21 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:38336) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flWA6-0001ng-4o for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:18 -0400 Received: by mail-pl0-x244.google.com with SMTP id u11-v6so2310480plq.5 for ; Fri, 03 Aug 2018 02:14:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pEFezLYVk4Oxp4U1uMRNW8tjw1DGFNquuUXNxb7t4Nk=; b=ESR/Fae3dy794IpO/17Mf3iiCBrm5QOeZ/agSLmwYxE+smbysQe+9D49E+aahH6a18 jc2hS7RHWJ8m/q2Q4mwTuTc8Y1POBlzgGrKush8YGfSUvQbv6JZQPDX+Xm3D9+dKDKAR hIOastNaELsPR8x32jjXAT0dQXbdvONuNm++jX2WVzpPi3gxLOpH836wXK4iaUcIPd6h qRckGkzI7bOg7ekHhj41WNl7Ky8ZKPaoTBQMhsU1OoS4u61CRSXm+P57KJeXbIUJEJUK UI8+cltb9Ew+GMCuTaK6Te6QgkHRLDMbqBi/5Tenc1KVj8OTV9PB7i8jm/caOEE9cWks pAaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pEFezLYVk4Oxp4U1uMRNW8tjw1DGFNquuUXNxb7t4Nk=; b=K+I2aZIpxZaaGVeKKnK/OxubTkV6+NceSJ/R7KJKOFsmYEP++VCaf20ZvBNbXXhAve 82F85ZZLjr1HraAwcy9/8xcd/EyGM/jLbiRAyIvTYCJ3lVsJmop9HbGvZD5KCs8E+J1d xaWWyY6946brSe+r8GFGkxWNJmp/h4qUM5vfcBbLqbP0Wl6o07yxQl+KX8BflvIcZo00 KynAEjPWzjXIX+OZLMwQPBcDmiR6X8v50AMcmb5wfQqXQEzsz1V8HN3OZmdLgHG3nN2s UZ5NT0nsYKTXSL0F5vBEyot8Mc1XbP4TbRKlof4q9lXZAZkaPemgk94FKUIWHhpHx1FT 5Vrg== X-Gm-Message-State: AOUpUlGNQeqRhu/HdaoIBIsdOkx2rwrOq3yRENki9s0RxLeJLrGzEWbZ 11o/lnYbCFPjBVszOAmyZ24= X-Google-Smtp-Source: AAOMgpcKRQOQn5VMxZz7pGfml54q36gqO4h+ra3QzmtixPy3GIbKck9y1Fqfwzq1FtZJCD+2xKeS1Q== X-Received: by 2002:a17:902:622:: with SMTP id 31-v6mr2732103plg.153.1533287657339; Fri, 03 Aug 2018 02:14:17 -0700 (PDT) Received: 
from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:16 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:49 +0800 Message-Id: <1533287630-4221-12-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v6 11/12] migration: remove the unnecessary RDMA_CONTROL_ERROR message X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" It's not necessary to send RDMA_CONTROL_ERROR when cleaning up rdma resources. If rdma->error_state is set, the message may not be sent successfully, and the cm event can also notify the peer qemu. Signed-off-by: Lidong Chen --- migration/rdma.c | 11 ----------- 1 file changed, 11 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index 62de2ec..14cdf82 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2305,17 +2305,6 @@ static void qemu_rdma_cleanup(RDMAContext *rdma) int idx; if (rdma->cm_id && rdma->connected) { - if ((rdma->error_state || - migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) && - !rdma->received_error) { - RDMAControlHeader head = { .len = 0, - .type = RDMA_CONTROL_ERROR, - .repeat = 1, - }; - error_report("Early error.
Sending error."); - qemu_rdma_post_send_control(rdma, NULL, &head); - } - rdma_disconnect(rdma->cm_id); trace_qemu_rdma_cleanup_disconnect(); rdma->connected = false; From patchwork Fri Aug 3 09:13:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 953142 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="US4ciwJw"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41hhQK1MTlz9s2g for ; Fri, 3 Aug 2018 19:22:16 +1000 (AEST) Received: from localhost ([::1]:49783 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWHm-00080P-8J for incoming@patchwork.ozlabs.org; Fri, 03 Aug 2018 05:22:14 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52986) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1flWA9-0001Zm-3T for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1flWA7-0001tn-Sg for qemu-devel@nongnu.org; Fri, 03 Aug 2018 05:14:21 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:36972) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1flWA7-0001ri-M6 for qemu-devel@nongnu.org; Fri, 
03 Aug 2018 05:14:19 -0400 Received: by mail-pf1-x441.google.com with SMTP id a26-v6so2927509pfo.4 for ; Fri, 03 Aug 2018 02:14:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AP9lCjBQnayg5NRv2Z3sgi+XbfOW1T9ZCaxxobQuIRg=; b=US4ciwJwS0doYzJb+2cFTnBDKomlM5fdo/P8K4LyvLi8ka0OgbnVxypu/4s6TjsFbq Z3nAfIQZVl+QlHnsC7mtTrM/cB/47L+U1wibAImBvIzzYaYHVoLiaWMVgqup7OQXeZHz 0Ur0XN0OKzeSTKCh78bNxf2n5vUVNrsryLbI99i7BBA/8uBK7A+dVOqDmQRV6NVBj51P i/C6m/SEMRfq0hKwB20Bw5OprAd/NqXFupeLMaT53SR6MOmeL9N6kvSTOzAYnHI/SF1o OJ4mnkzsn0nr2EpQDNPiFhMvFCrNjP5jwqMeLTFJ+C75vYSgq3a/Ui2L2QNjoMPBHXJD bi7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AP9lCjBQnayg5NRv2Z3sgi+XbfOW1T9ZCaxxobQuIRg=; b=ir6s+hD7T+Db9k6n1sbyNMnZMLHZivP/I0bCBpd0kZsvPk27YUQ9oHV9wN1NHTtx7U EQyac/rvt84GNJ1UuFMgKNibT/CIyfJAycURrqiaXi32lmNuo9o8/OsYP21PJYYOmySy 3mQcQvTETShSAtm0vZHJrF3H94QcEbYIs7j+g0os7OZjDk43BdVCC5EgEoM6k+XZ/LgC uBCzDOolcqNyNxFCQEZTECP3J2AEJrcLdUOQVsLkd/bUS6B3dCTaUJCBOfjLmEpD/xtH UrsB+NzzZA82rfMLqbvPBCGLpNrQ8uavK2/sQxD5CJRLBGXqkTHJEok3fXZ+x0JFMjtq hSlw== X-Gm-Message-State: AOUpUlFZ7zc25BFdvvsDC6O3szYvqmo02JFPzjT5OyFIWOZ7huHRseiO srNKPT9WcbZygLG2+UifX4M= X-Google-Smtp-Source: AAOMgpe0nb3fPuA83zwtqLvx6cCexXcVRNGVmtFUf+d+dRXZEfSCTVJ6HbjN1PPD8cLwqGVfob7Q7A== X-Received: by 2002:a63:d04f:: with SMTP id s15-v6mr2986203pgi.42.1533287658900; Fri, 03 Aug 2018 02:14:18 -0700 (PDT) Received: from VM_120_46_centos.localdomain ([119.28.87.64]) by smtp.gmail.com with ESMTPSA id e21-v6sm7991352pfl.187.2018.08.03.02.14.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:14:18 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com Date: Fri, 3 Aug 2018 17:13:50 +0800 Message-Id: 
<1533287630-4221-13-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> References: <1533287630-4221-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 Subject: [Qemu-devel] [PATCH v6 12/12] migration: create a dedicated thread to release rdma resource X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

ibv_dereg_mr takes a long time for virtual servers with a large memory size. The test results are:

    10GB    326ms
    20GB    699ms
    30GB    1021ms
    40GB    1387ms
    50GB    1712ms
    60GB    2034ms
    70GB    2457ms
    80GB    2807ms
    90GB    3107ms
    100GB   3474ms
    110GB   3735ms
    120GB   4064ms
    130GB   4567ms
    140GB   4886ms

This causes the guest OS to hang for a while when migration finishes, so create a dedicated thread to release the rdma resources.
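
The general pattern, handing a slow release to a detached thread and publishing a completion flag, might look roughly like the following POSIX threads sketch. Everything named here (struct cleanup_job, cleanup_worker, start_cleanup, and the heap buffer standing in for the RDMA state) is hypothetical and is not part of the patch. The patch uses qemu_thread_create() with QEMU_THREAD_DETACHED and a plain bool field; the sketch assumes a C11 atomic flag instead.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical job descriptor for this sketch; not a QEMU structure. */
struct cleanup_job {
    void *resource;       /* heap buffer standing in for the RDMA state */
    atomic_bool *done;    /* set to true once the release has finished */
};

/* Worker: perform the slow release off the caller's thread. */
static void *cleanup_worker(void *arg)
{
    struct cleanup_job *job = arg;
    atomic_bool *done = job->done;

    free(job->resource);          /* stand-in for the slow teardown */
    free(job);
    atomic_store(done, true);     /* publish completion last */
    return NULL;
}

/*
 * Hand the resource to a detached thread and return immediately.
 * Returns 0 on success or an error number on failure; on failure the
 * caller still owns the resource.
 */
static int start_cleanup(void *resource, atomic_bool *done)
{
    struct cleanup_job *job = malloc(sizeof(*job));
    pthread_attr_t attr;
    pthread_t tid;
    int ret;

    if (!job) {
        return ENOMEM;
    }
    job->resource = resource;
    job->done = done;
    atomic_store(done, false);

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    ret = pthread_create(&tid, &attr, cleanup_worker, job);
    pthread_attr_destroy(&attr);
    if (ret != 0) {
        /* No thread was started, so undo the hand-off. */
        free(job);
        atomic_store(done, true);
    }
    return ret;
}
```

Code that wants to start a new operation can check the flag first and refuse while a previous cleanup is still running, which is the role rdma_cleanup_thread_quit plays in migrate_prepare().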
Signed-off-by: Lidong Chen --- migration/migration.c | 6 ++++++ migration/migration.h | 3 +++ migration/rdma.c | 47 +++++++++++++++++++++++++++++++---------------- 3 files changed, 40 insertions(+), 16 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 360ee94..f58fe45 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1499,6 +1499,7 @@ void migrate_init(MigrationState *s) s->vm_was_running = false; s->iteration_initial_bytes = 0; s->threshold_size = 0; + s->rdma_cleanup_thread_quit = true; } static GSList *migration_blockers; @@ -1660,6 +1661,10 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc, return false; } + if (s->rdma_cleanup_thread_quit != true) { + return false; + } + if (runstate_check(RUN_STATE_INMIGRATE)) { error_setg(errp, "Guest is waiting for an incoming migration"); return false; @@ -3214,6 +3219,7 @@ static void migration_instance_init(Object *obj) ms->state = MIGRATION_STATUS_NONE; ms->mbps = -1; + ms->rdma_cleanup_thread_quit = true; qemu_sem_init(&ms->pause_sem, 0); qemu_mutex_init(&ms->error_mutex); diff --git a/migration/migration.h b/migration/migration.h index a50c2de..4d1be08 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -231,6 +231,9 @@ struct MigrationState * do not trigger spurious decompression errors. 
*/ bool decompress_error_check; + + /* Set to true once the rdma resources have been released */ + bool rdma_cleanup_thread_quit; }; void migrate_set_state(int *state, int old_state, int new_state); diff --git a/migration/rdma.c b/migration/rdma.c index 14cdf82..3d1a4ad 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2995,35 +2995,50 @@ static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc, } } -static int qio_channel_rdma_close(QIOChannel *ioc, - Error **errp) +static void *qio_channel_rdma_close_thread(void *arg) { - QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); - RDMAContext *rdmain, *rdmaout; - trace_qemu_rdma_close(); - - rdmain = rioc->rdmain; - if (rdmain) { - atomic_rcu_set(&rioc->rdmain, NULL); - } + RDMAContext **rdma = arg; + RDMAContext *rdmain = rdma[0]; + RDMAContext *rdmaout = rdma[1]; + MigrationState *s = migrate_get_current(); - rdmaout = rioc->rdmaout; - if (rdmaout) { - atomic_rcu_set(&rioc->rdmaout, NULL); - } + rcu_register_thread(); synchronize_rcu(); - if (rdmain) { qemu_rdma_cleanup(rdmain); } - if (rdmaout) { qemu_rdma_cleanup(rdmaout); } g_free(rdmain); g_free(rdmaout); + g_free(rdma); + + rcu_unregister_thread(); + s->rdma_cleanup_thread_quit = true; + return NULL; +} + +static int qio_channel_rdma_close(QIOChannel *ioc, + Error **errp) +{ + QemuThread t; + QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + RDMAContext **rdma = g_new0(RDMAContext*, 2); + MigrationState *s = migrate_get_current(); + + trace_qemu_rdma_close(); + if (rioc->rdmain || rioc->rdmaout) { + rdma[0] = rioc->rdmain; + rdma[1] = rioc->rdmaout; + atomic_rcu_set(&rioc->rdmain, NULL); + atomic_rcu_set(&rioc->rdmaout, NULL); + s->rdma_cleanup_thread_quit = false; + qemu_thread_create(&t, "rdma cleanup", qio_channel_rdma_close_thread, + rdma, QEMU_THREAD_DETACHED); + } return 0; }