From patchwork Tue Jun 5 15:28:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925524 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="oI3oVR8E"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bLl6ZKDz9s0W for ; Wed, 6 Jun 2018 01:29:02 +1000 (AEST) Received: from localhost ([::1]:47501 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDtL-00084L-64 for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:28:59 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39998) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsl-00083u-OI for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDsj-0003mH-DN for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:23 -0400 Received: from mail-pl0-x230.google.com ([2607:f8b0:400e:c01::230]:46320) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDsj-0003lt-67 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:21 -0400 Received: by mail-pl0-x230.google.com with SMTP id 30-v6so1720750pld.13 for ; Tue, 05 Jun 2018 08:28:21 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=NlDdsPqvKyL/hLM8p1ZbcsrVMLFk27hIrWz1IvXFCBQ=; b=oI3oVR8EEIHRbbZfk+r42ptLAR0wS8ifcCMTq+Nxq2L4UsY/4IC4ZDj6uuR7/gpZBu Mq9eK2sHgcHHje4NYnG/Q45+5Mq+ddfWTt5G5kxO55/KWgIzu5OqcyDUtmIiusUhZSDu aO97p5FTYrvJOSZlMravpHS2UV7F9KM+qJDNWCSDmKsUpeQ21WWrICnlHe4FvcpJvLcQ 94LMKirgFDYbYzBVYG5JG/hx2c06WkfhShI4lvisnbm1dWCsijrJ4asSxoE1GITLwF5o mjekhb4E6Yty86Gh1WMEX/i2IRJAeyQThY5mja/r0yNHT75khopVj+BGqMtAstYkYorO FBAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=NlDdsPqvKyL/hLM8p1ZbcsrVMLFk27hIrWz1IvXFCBQ=; b=CRrrKYhzoHtwPMPuIgVg23WfOaREvCCD66vfd1BpQftByIYbPU7YVFGrNzXRLwNPnI cbgf7kigkb2eJC51NyOFiwJEqSGhu5D7fBt2lwwYX1k6a8T6o8cFsnkft9x0BtIpEvxz eH2NQyjKjU07REn6iqtxLcQPy/qRBh94Dyryw7Ff2QvJlO0dLgNs/1ujMQDIHC8vzyT1 BxwDfy/cmdyhZD97t9A5mjjMQPds4Q0IOMTooRWQ3Ba2yar43Irxwy4WAtLTVVjU82BQ YILqHOhg6O5nlfjN0YeR8YhKX5DvZnUmuw3OjSV7dk4g/l9IqikbMvfsoaTvebR9HkNH uL4Q== X-Gm-Message-State: ALKqPwcrEmMEX8A0zogiZSdgtzS1kNl5UbU7EcFembPrboLL5cuWMOAh BBsMQ7hbirWLJuiW4K81ces= X-Google-Smtp-Source: ADUXVKIKbXJwYvs1ymOY91BKfyN1U3hAQX0XygxxCjFhAxHiao60cSCDTsB23RSLO0FrjnHi/IiTfQ== X-Received: by 2002:a17:902:3a5:: with SMTP id d34-v6mr27416487pld.103.1528212500426; Tue, 05 Jun 2018 08:28:20 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:19 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:00 +0800 Message-Id: <1528212489-19137-2-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 
1.8.3.1 In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::230 Subject: [Qemu-devel] [PATCH v5 01/10] migration: disable RDMA WRITE after postcopy started X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen RDMA WRITE operations are performed with no notification to the destination QEMU, so the destination QEMU cannot wake up. This patch disables RDMA WRITE after postcopy has started. Signed-off-by: Lidong Chen Reviewed-by: Dr. David Alan Gilbert --- migration/qemu-file.c | 8 ++++++-- migration/rdma.c | 12 ++++++++++++ 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 0463f4c..977b9ae 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset, if (f->hooks && f->hooks->save_page) { int ret = f->hooks->save_page(f, f->opaque, block_offset, offset, size, bytes_sent); - f->bytes_xfer += size; - if (ret != RAM_SAVE_CONTROL_DELAYED) { + if (ret != RAM_SAVE_CONTROL_NOT_SUPP) { + f->bytes_xfer += size; + } + + if (ret != RAM_SAVE_CONTROL_DELAYED && + ret != RAM_SAVE_CONTROL_NOT_SUPP) { if (bytes_sent && *bytes_sent > 0) { qemu_update_position(f, *bytes_sent); } else if (ret < 0) { diff --git a/migration/rdma.c b/migration/rdma.c index 05aee3d..185ed98 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2921,6 +2921,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, CHECK_ERROR_STATE(); 
+ if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + return RAM_SAVE_CONTROL_NOT_SUPP; + } + qemu_fflush(f); if (size > 0) { @@ -3480,6 +3484,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, CHECK_ERROR_STATE(); + if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + return 0; + } + trace_qemu_rdma_registration_start(flags); qemu_put_be64(f, RAM_SAVE_FLAG_HOOK); qemu_fflush(f); @@ -3502,6 +3510,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, CHECK_ERROR_STATE(); + if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + return 0; + } + qemu_fflush(f); ret = qemu_rdma_drain_cq(f, rdma); From patchwork Tue Jun 5 15:28:01 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925526 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="EoMoUaUu"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bLn1CHsz9s0x for ; Wed, 6 Jun 2018 01:29:05 +1000 (AEST) Received: from localhost ([::1]:47503 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDtO-00086O-Pp for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:29:02 -0400 Received: from eggs.gnu.org 
([2001:4830:134:3::10]:40021) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsn-000847-B9 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDsm-0003ns-5F for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:25 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:44767) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDsl-0003nS-T4 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:24 -0400 Received: by mail-pl0-x244.google.com with SMTP id z9-v6so1724419plk.11 for ; Tue, 05 Jun 2018 08:28:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=23iHKOTQc/bOBqtX6dX4OSMjtbP9Vg3O7l4a24RZsJE=; b=EoMoUaUu4Aaw9yKLeZbElJ4UfEONKnsbQ/ulLekwHocwjO97lVo+5Vn99BbJOh2pvW MhIiEhDDp0Pc/kslaN/BFzAOCwCXM2dTwuEMdb/z6Nz7G0usjYwfALTeFovgnXt/ibN8 5fWPdjPJEOINI4dGPh5OJLoujz6KKmSzq5c0Ud0fuzOY3rjJAUgVTrXHc54DNLMlCCha LbzUfAbgsLH1nTBDODKZANTOfzCK8f7FZynPrA8SzgqGFux3M2d5GFN8dKzpGEeHLE8p jpUWMZ7UL1WJmygs7+JFiOiGiXzlO1+83lnaVLSvYu6c5Q/rocWyuyu//3ko9KxlPEBm rdEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=23iHKOTQc/bOBqtX6dX4OSMjtbP9Vg3O7l4a24RZsJE=; b=r3zM1/quw2ofy/8UK0jgAE9FVtnxPhEr4NDt23Caz9dFabgzrmaMDf/bGJsh7ldCj0 1+klEbX98egOxo9x0njUhdLvKccbyevwRWlJuNzZXznmkIGkqgcxzeC3VEpm17aruDWn tsrZnni1Bzqfdn70C40oV52HcEqRQYFyR3O6DBtoMwACTEJfodUSmrT5J6Jn+Xv9zeyG E1WT5iSn8z3OEtR50nXSqkcMZ29ui61q25ftvKA0uudV8XtOyHZ4Sh996fRxH33oFZIJ 4B1ltXViv2jGt8HRaRJC3evRM1hOO+wfJ5vSrmswlC4kRgOSL4kQ5L1SkPvqwyf3iDhT 2zjg== X-Gm-Message-State: ALKqPwc/5LagDBy5SaotWn8YZKczKTHwuVmTj5Htc39mAppbUvQXUZqM kwObLAytDwz6VHfrD1OvCeo= X-Google-Smtp-Source: 
ADUXVKLKnQhUrisof0TiD4zV7/Dbm1tMdXU9hF+e2nxqDzZqUTObvqf/0pokwoyzFICKF9pus9/ALg== X-Received: by 2002:a17:902:8f94:: with SMTP id z20-v6mr27121690plo.391.1528212503087; Tue, 05 Jun 2018 08:28:23 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:22 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:01 +0800 Message-Id: <1528212489-19137-3-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v5 02/10] migration: create a dedicated connection for rdma return path X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen When an RDMA migration is started with postcopy enabled, the source QEMU establishes a dedicated connection for the return path. Signed-off-by: Lidong Chen Reviewed-by: Dr. 
David Alan Gilbert --- migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 91 insertions(+), 3 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index 185ed98..f6705a3 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -387,6 +387,10 @@ typedef struct RDMAContext { uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX]; GHashTable *blockmap; + + /* the RDMAContext for return path */ + struct RDMAContext *return_path; + bool is_return_path; } RDMAContext; #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma" @@ -2323,10 +2327,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma) rdma_destroy_id(rdma->cm_id); rdma->cm_id = NULL; } + + /* the destination side, listen_id and channel is shared */ if (rdma->listen_id) { - rdma_destroy_id(rdma->listen_id); + if (!rdma->is_return_path) { + rdma_destroy_id(rdma->listen_id); + } rdma->listen_id = NULL; + + if (rdma->channel) { + if (!rdma->is_return_path) { + rdma_destroy_event_channel(rdma->channel); + } + rdma->channel = NULL; + } } + if (rdma->channel) { rdma_destroy_event_channel(rdma->channel); rdma->channel = NULL; @@ -2555,6 +2571,25 @@ err_dest_init_create_listen_id: } +static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path, + RDMAContext *rdma) +{ + int idx; + + for (idx = 0; idx < RDMA_WRID_MAX; idx++) { + rdma_return_path->wr_data[idx].control_len = 0; + rdma_return_path->wr_data[idx].control_curr = NULL; + } + + /*the CM channel and CM id is shared*/ + rdma_return_path->channel = rdma->channel; + rdma_return_path->listen_id = rdma->listen_id; + + rdma->return_path = rdma_return_path; + rdma_return_path->return_path = rdma; + rdma_return_path->is_return_path = true; +} + static void *qemu_rdma_data_init(const char *host_port, Error **errp) { RDMAContext *rdma = NULL; @@ -3012,6 +3047,8 @@ err: return ret; } +static void rdma_accept_incoming_migration(void *opaque); + static int qemu_rdma_accept(RDMAContext *rdma) { RDMACapabilities cap; @@ 
-3106,7 +3143,14 @@ static int qemu_rdma_accept(RDMAContext *rdma) } } - qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL); + /* Accept the second connection request for return path */ + if (migrate_postcopy() && !rdma->is_return_path) { + qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration, + NULL, + (void *)(intptr_t)rdma->return_path); + } else { + qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL); + } ret = rdma_accept(rdma->cm_id, &conn_param); if (ret) { @@ -3691,6 +3735,10 @@ static void rdma_accept_incoming_migration(void *opaque) trace_qemu_rdma_accept_incoming_migration_accepted(); + if (rdma->is_return_path) { + return; + } + f = qemu_fopen_rdma(rdma, "rb"); if (f == NULL) { ERROR(errp, "could not qemu_fopen_rdma!"); @@ -3705,7 +3753,7 @@ static void rdma_accept_incoming_migration(void *opaque) void rdma_start_incoming_migration(const char *host_port, Error **errp) { int ret; - RDMAContext *rdma; + RDMAContext *rdma, *rdma_return_path = NULL; Error *local_err = NULL; trace_rdma_start_incoming_migration(); @@ -3732,12 +3780,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp) trace_rdma_start_incoming_migration_after_rdma_listen(); + /* initialize the RDMAContext for return path */ + if (migrate_postcopy()) { + rdma_return_path = qemu_rdma_data_init(host_port, &local_err); + + if (rdma_return_path == NULL) { + goto err; + } + + qemu_rdma_return_path_dest_init(rdma_return_path, rdma); + } + qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration, NULL, (void *)(intptr_t)rdma); return; err: error_propagate(errp, local_err); g_free(rdma); + g_free(rdma_return_path); } void rdma_start_outgoing_migration(void *opaque, @@ -3745,6 +3805,7 @@ void rdma_start_outgoing_migration(void *opaque, { MigrationState *s = opaque; RDMAContext *rdma = qemu_rdma_data_init(host_port, errp); + RDMAContext *rdma_return_path = NULL; int ret = 0; if (rdma == NULL) { @@ -3765,6 +3826,32 @@ void 
rdma_start_outgoing_migration(void *opaque, goto err; } + /* RDMA postcopy needs a separate queue pair for the return path */ + if (migrate_postcopy()) { + rdma_return_path = qemu_rdma_data_init(host_port, errp); + + if (rdma_return_path == NULL) { + goto err; + } + + ret = qemu_rdma_source_init(rdma_return_path, + s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp); + + if (ret) { + goto err; + } + + ret = qemu_rdma_connect(rdma_return_path, errp); + + if (ret) { + goto err; + } + + rdma->return_path = rdma_return_path; + rdma_return_path->return_path = rdma; + rdma_return_path->is_return_path = true; + } + trace_rdma_start_outgoing_migration_after_rdma_connect(); s->to_dst_file = qemu_fopen_rdma(rdma, "wb"); @@ -3772,4 +3859,5 @@ void rdma_start_outgoing_migration(void *opaque, return; err: g_free(rdma); + g_free(rdma_return_path); } From patchwork Tue Jun 5 15:28:02 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925530 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="JXSVQTXD"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bQB2pnVz9s0W for ; Wed, 6 Jun 2018 01:32:02 +1000 (AEST) Received: from localhost ([::1]:47524 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 
4.71) (envelope-from ) id 1fQDwG-0002ku-2l for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:32:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40041) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsp-00085A-BP for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDso-0003pR-CC for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:27 -0400 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:37101) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDso-0003pF-53 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:26 -0400 Received: by mail-pl0-x243.google.com with SMTP id 31-v6so1733848plc.4 for ; Tue, 05 Jun 2018 08:28:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=RdErOtDtb6zGWvaVNsJ9OQWLC9q//RU38Y9qtUQkDBE=; b=JXSVQTXDZpDk14e3mJKBiTnKXJMM1oyxBvghrT6aM7O6UBnbFVekljFLkcTgWXeLNt 12b9x2p6Z2pdNgERGG33WuRozo1Ck2B9uLch1iKVD2IEo22is1ScJJ+sAsLjnSBXXFMl 50nHtWY6r/tZV64zm1qEcQgWOsx/PctvX+rxQcyPRAO9TEOha9ze1/EqVQwmRq6AfDr0 7p2bq3WNQs5c0CYuQIenkN1N0P1KTPuC8dnaXyQ6s7+DOGRRY9Hw4byIFWNYnHaQmjB6 vMWxg1IWO1QWA5qubA3C32dBcYwriYVkF4Pv/yCEaIzW1LbhWDXVqSyU7XWKuMG197Ca SjXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RdErOtDtb6zGWvaVNsJ9OQWLC9q//RU38Y9qtUQkDBE=; b=K6vwUoGKOUi9MBRoZIHLejVKT4moOqnbpk2q0FfsgiQBkARL6njE8+CPDbKX1mpcp0 9/aM7QyfNbpaAAyPXaOMZEXT7t+WKU1+NFTM+pcGU62LJ0gMJwuyVweUskSwrpKN3RcT uEja8c8wvHC6y2zRoLNAyhKXh76DpFkRIrl548QrD5pFxAW314t9o5caP0QVOqLyog6J koeHICXw798U/yrlXCwUKqLdp4Eouczd/PfeNNZgpAZfO5Z/qYgMUxCqRNbjJalczewJ 1SerhR8miQp750c/cV1a+6k/OWfQXQXAELrPnXTBNnYWTyFa2mBzOpFVmvRN+yYYhH9S cktA== X-Gm-Message-State: 
ALKqPwd6nj65d8g0U75oYgqStT/HwWwgn6UiDrIBDVwuiIayeEejeMfB xaqCzBxKj1zaMbjN6Km98cM= X-Google-Smtp-Source: ADUXVKItmPBFmTmos+B+d4I6xeF2BDcjsJCfXNZ37oyT//R8H68imhbRy5WnKESkCoN+Mu9Z4g3cDQ== X-Received: by 2002:a17:902:3103:: with SMTP id w3-v6mr26637746plb.353.1528212505384; Tue, 05 Jun 2018 08:28:25 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:24 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:02 +0800 Message-Id: <1528212489-19137-4-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v5 03/10] migration: avoid concurrent invoke channel_close by different threads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: galsha@mellanox.com, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" channel_close may be invoked concurrently by different threads. For example, the source QEMU invokes qemu_fclose in the main thread, the migration thread and the return path thread. The destination QEMU invokes qemu_fclose in the main thread, the listen thread and the COLO incoming thread. This patch adds qemu_file_close_lock to serialize those calls. Signed-off-by: Lidong Chen Reviewed-by: Daniel P. 
Berrangé --- migration/migration.c | 2 ++ migration/migration.h | 7 +++++++ migration/qemu-file.c | 6 ++++-- 3 files changed, 13 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 1e99ec9..1d0aaec 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3075,6 +3075,7 @@ static void migration_instance_finalize(Object *obj) qemu_mutex_destroy(&ms->error_mutex); qemu_mutex_destroy(&ms->qemu_file_lock); + qemu_mutex_destroy(&ms->qemu_file_close_lock); g_free(params->tls_hostname); g_free(params->tls_creds); qemu_sem_destroy(&ms->pause_sem); @@ -3115,6 +3116,7 @@ static void migration_instance_init(Object *obj) qemu_sem_init(&ms->postcopy_pause_rp_sem, 0); qemu_sem_init(&ms->rp_state.rp_sem, 0); qemu_mutex_init(&ms->qemu_file_lock); + qemu_mutex_init(&ms->qemu_file_close_lock); } /* diff --git a/migration/migration.h b/migration/migration.h index 5af57d6..7a6025a 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -121,6 +121,13 @@ struct MigrationState */ QemuMutex qemu_file_lock; + /* + * The to_src_file and from_dst_file point to one QIOChannelRDMA, + * and qemu_fclose may be invoked by different threads. Use this lock + * to avoid concurrent invocation of channel_close by different threads. 
+ */ + QemuMutex qemu_file_close_lock; + /* bytes already send at the beggining of current interation */ uint64_t iteration_initial_bytes; /* time at the start of current iteration */ diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 977b9ae..74c48e0 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -323,12 +323,14 @@ void qemu_update_position(QEMUFile *f, size_t size) */ int qemu_fclose(QEMUFile *f) { - int ret; + int ret, ret2; qemu_fflush(f); ret = qemu_file_get_error(f); if (f->ops->close) { - int ret2 = f->ops->close(f->opaque); + qemu_mutex_lock(&migrate_get_current()->qemu_file_close_lock); + ret2 = f->ops->close(f->opaque); + qemu_mutex_unlock(&migrate_get_current()->qemu_file_close_lock); if (ret >= 0) { ret = ret2; } From patchwork Tue Jun 5 15:28:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925529 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="NewljKEe"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bPy2Wlfz9s0W for ; Wed, 6 Jun 2018 01:31:50 +1000 (AEST) Received: from localhost ([::1]:47522 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDw3-0002ZH-Tn for incoming@patchwork.ozlabs.org; Tue, 05 
Jun 2018 11:31:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40099) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsw-0008DL-As for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDsr-0003qt-Be for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:34 -0400 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:37433) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDsr-0003qT-28 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:29 -0400 Received: by mail-pf0-x244.google.com with SMTP id y5-v6so580319pfn.4 for ; Tue, 05 Jun 2018 08:28:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=+GDOELYu8h6qcL8JkmsrAYj2jytAXQ+WQZd0nd36A38=; b=NewljKEeU8iOa6Kz2qkwvhb88iO7iWlxveHAuaMgG9y/jiGOpU9n8V3IGpVgNuhhBy kUc4yFH1uxktQvN/JogdngySEW7aNy+eMIgdwpE84haRDbHPnro0g66q8Jd/Tx/T+Yyd bH4qQlTw2kdZmFpRXA3hH5KohJJRDYiHI2l7AYKHD0VvBZko3CAdNDMSmuMFtbHQ6STv eZ5GNKtsK6/mEKlftFjdUqYSgpNq7FCKP+iIVGqxXhhr6mcYxihNam9tfdFSeZn4RPGV oyTrsfQf05Twh+iT+T3wGZzFbKcRZCXwEebjt9GJ57mA7M6Ph++bwuBxi7HVfGGs3FQF BKSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+GDOELYu8h6qcL8JkmsrAYj2jytAXQ+WQZd0nd36A38=; b=b9hLagCcC8UiLO+ey47lyN9iO0hAE24HYTvKVyUvRiZbNlehqcml9Xzac29BRrM2o+ iyYBslMae41svORwP2eoOkcjFly7yqxT+nIS5Y3QC1X4AvXIpluYdqbzRYZRTKcFGyLp 1gk5RX0upKefxo+35Vmt/vAf3OirwHwK1qPcbQ2mETcc82qKSBy3Clua1F7K8YTLQiKT K9NDkaPJ4l3cnYkpiDUog6YlmZR4Spl9L+av9/L/Sr2hXqkmIKSvOmwtxeKxnSUATRnY fe1+9WcAKWmYv+NDvqY+JxTxpS8tCaOw9opF9utFf+0E2ml0LpflJmSjqqEA5uRYIwCY Ozag== X-Gm-Message-State: ALKqPwfuT8mITABeOxfKyG1J+lyMxxw2NyY7f3YKWMRsIrsw2kf+/0oh frxGbtH+UVSu9deFYlLVD5M= X-Google-Smtp-Source: 
ADUXVKJS6wmyvfC67UZZAtRib/9l1qSVqQzVcAE/wOGXpVNhNWvWKFYgimXtpjuiMgWEXhDfv4WY0g== X-Received: by 2002:a65:5c89:: with SMTP id a9-v6mr21155062pgt.51.1528212508025; Tue, 05 Jun 2018 08:28:28 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:27 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:03 +0800 Message-Id: <1528212489-19137-5-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v5 04/10] migration: implement bi-directional RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen This patch implements a bi-directional RDMA QIOChannel. Because different threads may access the QIOChannelRDMA concurrently, this patch uses RCU to protect it. Signed-off-by: Lidong Chen Reviewed-by: Dr. 
David Alan Gilbert --- migration/colo.c | 2 + migration/migration.c | 2 + migration/postcopy-ram.c | 2 + migration/ram.c | 4 + migration/rdma.c | 196 ++++++++++++++++++++++++++++++++++++++++------- migration/savevm.c | 3 + 6 files changed, 183 insertions(+), 26 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 4381067..88936f5 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t value; Error *local_err = NULL; + rcu_register_thread(); qemu_sem_init(&mis->colo_incoming_sem, 0); migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE, @@ -666,5 +667,6 @@ out: } migration_incoming_exit_colo(); + rcu_unregister_thread(); return NULL; } diff --git a/migration/migration.c b/migration/migration.c index 1d0aaec..4253d9f 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2028,6 +2028,7 @@ static void *source_return_path_thread(void *opaque) int res; trace_source_return_path_thread_entry(); + rcu_register_thread(); retry: while (!ms->rp_state.error && !qemu_file_get_error(rp) && @@ -2167,6 +2168,7 @@ out: trace_source_return_path_thread_end(); ms->rp_state.from_dst_file = NULL; qemu_fclose(rp); + rcu_unregister_thread(); return NULL; } diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 48e5155..98613eb 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -853,6 +853,7 @@ static void *postcopy_ram_fault_thread(void *opaque) RAMBlock *rb = NULL; trace_postcopy_ram_fault_thread_entry(); + rcu_register_thread(); mis->last_rb = NULL; /* last RAMBlock we sent part of */ qemu_sem_post(&mis->fault_thread_sem); @@ -1059,6 +1060,7 @@ retry: } } } + rcu_unregister_thread(); trace_postcopy_ram_fault_thread_exit(); g_free(pfd); return NULL; diff --git a/migration/ram.c b/migration/ram.c index a500015..a674fb5 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -683,6 +683,7 @@ static void *multifd_send_thread(void *opaque) 
MultiFDSendParams *p = opaque; Error *local_err = NULL; + rcu_register_thread(); if (multifd_send_initial_packet(p, &local_err) < 0) { goto out; } @@ -706,6 +707,7 @@ out: p->running = false; qemu_mutex_unlock(&p->mutex); + rcu_unregister_thread(); return NULL; } @@ -819,6 +821,7 @@ static void *multifd_recv_thread(void *opaque) { MultiFDRecvParams *p = opaque; + rcu_register_thread(); while (true) { qemu_mutex_lock(&p->mutex); if (p->quit) { @@ -833,6 +836,7 @@ static void *multifd_recv_thread(void *opaque) p->running = false; qemu_mutex_unlock(&p->mutex); + rcu_unregister_thread(); return NULL; } diff --git a/migration/rdma.c b/migration/rdma.c index f6705a3..769f443 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL; " to abort!"); \ rdma->error_reported = 1; \ } \ + rcu_read_unlock(); \ return rdma->error_state; \ } \ } while (0) @@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA; struct QIOChannelRDMA { QIOChannel parent; - RDMAContext *rdma; + RDMAContext *rdmain; + RDMAContext *rdmaout; QEMUFile *file; bool blocking; /* XXX we don't actually honour this yet */ }; @@ -2630,12 +2632,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); QEMUFile *f = rioc->file; - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int ret; ssize_t done = 0; size_t i; size_t len = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); /* @@ -2645,6 +2655,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, ret = qemu_rdma_write_flush(f, rdma); if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -2664,6 +2675,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -2672,6 +2684,7 @@ static ssize_t 
qio_channel_rdma_writev(QIOChannel *ioc, } } + rcu_read_unlock(); return done; } @@ -2705,12 +2718,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; RDMAControlHeader head; int ret = 0; ssize_t i; size_t done = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); for (i = 0; i < niov; i++) { @@ -2722,7 +2743,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, * were given and dish out the bytes until we run * out of bytes. */ - ret = qemu_rdma_fill(rioc->rdma, data, want, 0); + ret = qemu_rdma_fill(rdma, data, want, 0); done += ret; want -= ret; /* Got what we needed, so go to next iovec */ @@ -2744,25 +2765,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } /* * SEND was received with new bytes, now try again. 
*/ - ret = qemu_rdma_fill(rioc->rdma, data, want, 0); + ret = qemu_rdma_fill(rdma, data, want, 0); done += ret; want -= ret; /* Still didn't get enough, so lets just return */ if (want) { if (done == 0) { + rcu_read_unlock(); return QIO_CHANNEL_ERR_BLOCK; } else { break; } } } + rcu_read_unlock(); return done; } @@ -2814,15 +2838,29 @@ qio_channel_rdma_source_prepare(GSource *source, gint *timeout) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; *timeout = -1; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when prepare Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2830,14 +2868,28 @@ static gboolean qio_channel_rdma_source_check(GSource *source) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when check Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2848,14 +2900,28 @@ qio_channel_rdma_source_dispatch(GSource *source, { QIOChannelFunc func = (QIOChannelFunc)callback; QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = 
atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when dispatch Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return (*func)(QIO_CHANNEL(rsource->rioc), (cond & rsource->condition), user_data); @@ -2900,15 +2966,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + RDMAContext *rdmain, *rdmaout; trace_qemu_rdma_close(); - if (rioc->rdma) { - if (!rioc->rdma->error_state) { - rioc->rdma->error_state = qemu_file_get_error(rioc->file); - } - qemu_rdma_cleanup(rioc->rdma); - g_free(rioc->rdma); - rioc->rdma = NULL; + + rdmain = rioc->rdmain; + if (rdmain) { + atomic_rcu_set(&rioc->rdmain, NULL); + } + + rdmaout = rioc->rdmaout; + if (rdmaout) { + atomic_rcu_set(&rioc->rdmaout, NULL); } + + synchronize_rcu(); + + if (rdmain) { + qemu_rdma_cleanup(rdmain); + } + + if (rdmaout) { + qemu_rdma_cleanup(rdmaout); + } + + g_free(rdmain); + g_free(rdmaout); + return 0; } @@ -2951,12 +3034,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, size_t size, uint64_t *bytes_sent) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int ret; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return RAM_SAVE_CONTROL_NOT_SUPP; } @@ -3041,9 +3133,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, } } + rcu_read_unlock(); return RAM_SAVE_CONTROL_DELAYED; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3219,8 +3313,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) RDMAControlHeader blocks = { .type = 
RDMA_CONTROL_RAM_BLOCKS_RESULT, .repeat = 1 }; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; - RDMALocalBlocks *local = &rdma->local_ram_blocks; + RDMAContext *rdma; + RDMALocalBlocks *local; RDMAControlHeader head; RDMARegister *reg, *registers; RDMACompress *comp; @@ -3233,8 +3327,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) int count = 0; int i = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); + local = &rdma->local_ram_blocks; do { trace_qemu_rdma_registration_handle_wait(); @@ -3468,6 +3571,7 @@ out: if (ret < 0) { rdma->error_state = ret; } + rcu_read_unlock(); return ret; } @@ -3481,10 +3585,18 @@ out: static int rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) { - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int curr; int found = -1; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + /* Find the matching RAMBlock in our local list */ for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) { if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) { @@ -3495,6 +3607,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) if (found == -1) { error_report("RAMBlock '%s' not found on destination", name); + rcu_read_unlock(); return -ENOENT; } @@ -3502,6 +3615,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) trace_rdma_block_notification_handle(name, rdma->next_src_index); rdma->next_src_index++; + rcu_read_unlock(); return 0; } @@ -3524,11 +3638,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, uint64_t flags, void *data) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; + + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); 
+ return -EIO; + } CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3536,6 +3658,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, qemu_put_be64(f, RAM_SAVE_FLAG_HOOK); qemu_fflush(f); + rcu_read_unlock(); return 0; } @@ -3548,13 +3671,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, { Error *local_err = NULL, **errp = &local_err; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; RDMAControlHeader head = { .len = 0, .repeat = 1 }; int ret = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3586,6 +3717,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, qemu_rdma_reg_whole_ram_blocks : NULL); if (ret < 0) { ERROR(errp, "receiving remote info!"); + rcu_read_unlock(); return ret; } @@ -3609,6 +3741,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, "not identical on both the source and destination.", local->nb_blocks, nb_dest_blocks); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } @@ -3625,6 +3758,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, local->block[i].length, rdma->dest_blocks[i].length); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } local->block[i].remote_host_addr = @@ -3642,9 +3776,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, goto err; } + rcu_read_unlock(); return 0; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3662,10 +3798,15 @@ static const QEMUFileHooks rdma_write_hooks = { static void qio_channel_rdma_finalize(Object *obj) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj); - if (rioc->rdma) { - qemu_rdma_cleanup(rioc->rdma); 
- g_free(rioc->rdma); - rioc->rdma = NULL; + if (rioc->rdmain) { + qemu_rdma_cleanup(rioc->rdmain); + g_free(rioc->rdmain); + rioc->rdmain = NULL; + } + if (rioc->rdmaout) { + qemu_rdma_cleanup(rioc->rdmaout); + g_free(rioc->rdmaout); + rioc->rdmaout = NULL; } } @@ -3705,13 +3846,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode) } rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); - rioc->rdma = rdma; if (mode[0] == 'w') { rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc)); + rioc->rdmaout = rdma; + rioc->rdmain = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_write_hooks); } else { rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc)); + rioc->rdmain = rdma; + rioc->rdmaout = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_read_hooks); } diff --git a/migration/savevm.c b/migration/savevm.c index c2f34ff..21c07d4 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1622,6 +1622,7 @@ static void *postcopy_ram_listen_thread(void *opaque) qemu_sem_post(&mis->listen_thread_sem); trace_postcopy_ram_listen_thread_start(); + rcu_register_thread(); /* * Because we're a thread and not a coroutine we can't yield * in qemu_file, and thus we must be blocking now. @@ -1662,6 +1663,7 @@ static void *postcopy_ram_listen_thread(void *opaque) * to leave the guest running and fire MCEs for pages that never * arrived as a desperate recovery step. 
*/ + rcu_unregister_thread(); exit(EXIT_FAILURE); } @@ -1676,6 +1678,7 @@ static void *postcopy_ram_listen_thread(void *opaque) migration_incoming_state_destroy(); qemu_loadvm_state_cleanup(); + rcu_unregister_thread(); return NULL; } From patchwork Tue Jun 5 15:28:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925528 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="jL0HD2V4"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bPs6z5sz9s08 for ; Wed, 6 Jun 2018 01:31:45 +1000 (AEST) Received: from localhost ([::1]:47521 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDvz-0002WU-Hx for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:31:43 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40098) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsw-0008DK-B7 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:35 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDst-0003t0-ET for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:34 -0400 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:45631) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) 
(envelope-from ) id 1fQDst-0003s7-8G for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:31 -0400 Received: by mail-pl0-x243.google.com with SMTP id c23-v6so1719771plz.12 for ; Tue, 05 Jun 2018 08:28:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=2yHE8OL8KNi32FMDWIm3OmOwkLTKbSn7g6pxPl68C08=; b=jL0HD2V4M16ftEFlm/NtZfiqswmrBwgu6T2n/cH6tL3Pzc2gAZAmwsBeVQdTcyn1y1 l739pWYxPsVFeO8Ojw0Bl3TMb08KyefYs5o6OdCPZ4ASB0F3ZdZxQ4qZSgFP8AnMb8aD /qgDgJKa/14dpatGNvthOOc2SWOjC7yN1eQv0BpD/MXq+TU2+g7BehFsZ0JUZOYmFxxp a5TSDtIELolC7Kdh+nTGdartqubq8oPoVUh0n0vlTQrLE4j19bG2HQMjOOiyf4fd7lim DU3gwK//L/XUdD6kKKNbmQk9xUT5RsToZ1GU9+szKrrgzhlFy4KdReLzAObtIgf6uazG gPQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=2yHE8OL8KNi32FMDWIm3OmOwkLTKbSn7g6pxPl68C08=; b=S0j2i6hNnIFDLBkFIOXcwQjkClhPk/6Jfq8tewKGgLikgki80vz9QjDlQ1FfLgnCi9 CKSpnBrtzcG49OFCe/Z1YfYW2ifmJWWy01DiaBNzYD4ypRRxV1I/d52+xaftmcxr1ZG3 gRZt/LZKaoRCXxfqdZlGTjbBs8M6q2+AXyJy09ndltqPoMuOWAgDF9wkCRR1RaX4gaUI r4dwD4iADcLeiy1yMTPmjio9+JXomIn4663PxZ8160ObWlX7rG4EeFd1/LboloG5dHaI 4Lzv2fMaAlNG5XrHi2pUQtATplf3hGOaD/U3SAbtuSDvL1gyR7hus9VbZJvGJF6T0eiz tkQw== X-Gm-Message-State: APt69E389ZIBleZWyYl9wPscUz6+ONUFixTvYZxj3x5gNBkpkbzHbKRe AusoA9YiGkd+P6oScmLxbIM= X-Google-Smtp-Source: ADUXVKKjUSC50pi3I24aonEt3KaeCr0/Rz2f5hh0s/XCnp6JVDwLIniZk0VEQydyFSJUltbPDLghPQ== X-Received: by 2002:a17:902:8648:: with SMTP id y8-v6mr7363335plt.86.1528212510563; Tue, 05 Jun 2018 08:28:30 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:29 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, 
quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Tue, 5 Jun 2018 23:28:04 +0800
Message-Id: <1528212489-19137-6-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::243
Subject: [Qemu-devel] [PATCH v5 05/10] migration: Stop rdma yielding during incoming postcopy
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

During incoming postcopy, the destination qemu invokes
qemu_rdma_wait_comp_channel in a separate thread, so do not use the rdma
yield there; poll the completion channel fd instead.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 769f443..92e4d30 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1493,11 +1493,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
      */
-    if (rdma->migration_started_on_destination) {
+    if (rdma->migration_started_on_destination &&
+        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
         yield_until_fd_readable(rdma->comp_channel->fd);
     } else {
         /* This is the source side, we're in a separate thread
          * or destination prior to migration_fd_process_incoming()
+         * after postcopy, the destination also in a separate thread.
          * we can't yield; so we have to poll the fd.
          * But we need to be able to handle 'cancel' or an error
          * without hanging forever.

From patchwork Tue Jun 5 15:28:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 925533
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="pKbEPPlI"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bTN5RmKz9s08 for ; Wed, 6 Jun 2018 01:34:48 +1000 (AEST)
Received: from localhost ([::1]:47540 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDyw-0005Dc-Fa for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:34:46 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40120) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsx-0008Dz-3m for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:36 -0400
Received: from
Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDsw-0003vA-17 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:35 -0400 Received: from mail-pl0-x22c.google.com ([2607:f8b0:400e:c01::22c]:42188) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDsv-0003uI-RQ for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:33 -0400 Received: by mail-pl0-x22c.google.com with SMTP id w17-v6so1724312pll.9 for ; Tue, 05 Jun 2018 08:28:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=lux+N7RLSocDI3oxa3LwaDyLJuk54eW4aEDrU0k5j8k=; b=pKbEPPlIb39QR8VlEu7COByIJEPgMr/Z2C58u4qVfDsqz9ZqRyA3DR7CWx7/8MgcRI 50ji89L/FZtOBVpu8BS8zoqcWnXoEn6RhSnyuO/0izEWFYp7ODzByH0g93Wl9j/FPcH0 g7u1UhZ1s/m6NWOBmMh+PBhVwu0OkuTP7KPTY7+OlzQqHm5xMg8Ql8jp8sGy0HhiS/Es IQb4qGFiJijpgfq39qYZ8yFe9Y/YzKK+tOQ4Bd8h11v27y/BWHGnhGhk9FIHmkWL3jyb ZpIwL0TxxFaJMCP9708Hq+U4Vgz6zTTyKVySzT8OaE6V+DFWnnIdssttOQ5uU346q92s o6/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lux+N7RLSocDI3oxa3LwaDyLJuk54eW4aEDrU0k5j8k=; b=XbwWAOu3vys+V4Zov3kol3gLz5V4TwM62/oE/KLikNc+2evS/suYZZEgRjf44x6yG9 Dh8QHfqSwYdWw4PUVqachVNuKnYUPQUTDX023zsr4j/sOo0Qu15ghxD87k6OrWNNEogH OspLFlrNnsRAfniZiJbr5Ar15kttFK5SK8cAMd27VWI2WLQBSSmKYHD/bSDvDeF+IPYl jMdYn09t1KvJ/bYhjSxJmSYW/IpmyIaANMHXptrXN1+MSVy2vEWgg46IZXqtaIC/b0rO UqYUFEBeJoRtPdOccoe3XrYyZ9vdAODHToRYPb5RBK0MXw8vIo0OQanKkfVT40mmUOyd E1qg== X-Gm-Message-State: ALKqPwc8Z6gr7qdyJmG7HNHZ0eJaHbAqClwdxb/2e6w44PYapC7vNIxU g+KI4RYE1atw0JuLHkAZwEA= X-Google-Smtp-Source: ADUXVKJgU8r3j9Qrum7n8ZMRbOxpY2LEDf+p6NFSALJoaoBhIwKkfwy5TOzuq4erM38Vk0f1NF0Mqg== X-Received: by 2002:a17:902:70ca:: with SMTP id l10-v6mr19530983plt.174.1528212513074; Tue, 05 Jun 2018 08:28:33 -0700 (PDT) Received: from VM_93_245_centos.localdomain 
([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:32 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Tue, 5 Jun 2018 23:28:05 +0800
Message-Id: <1528212489-19137-7-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::22c
Subject: [Qemu-devel] [PATCH v5 06/10] migration: implement io_set_aio_fd_handler function for RDMA QIOChannel
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

If qio_channel_rdma_readv returns QIO_CHANNEL_ERR_BLOCK, the destination
qemu crashes.

The backtrace is:
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080, io_read=0x8db841 , io_write=0x0, opaque=0x38111e0) at io/channel.c:
#2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
#3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
#4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327 at migration/qemu-file-channel.c:83
#5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
#6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
#7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
#8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
#9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
#10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
#13 0x00007f96fe858760 in ?? ()
#14 0x0000000000000000 in ?? ()

The RDMA QIOChannel does not implement io_set_aio_fd_handler, so
qio_channel_set_aio_fd_handler ends up calling through a NULL pointer.
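
As an illustration of this failure mode, the following is a minimal sketch and not QEMU code: a class structure carries an optional hook that stays NULL unless a subclass fills it in. The names Channel, ChannelClass, demo_set_fd_handler and channel_set_fd_handler are hypothetical stand-ins. The NULL check in the dispatcher is an assumption made for the sketch; frame #0 at address 0 in the backtrace above suggests that the real dispatcher calls the hook without such a check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the QEMU types. */
typedef struct Channel Channel;

typedef struct ChannelClass {
    /* Optional hook: stays NULL unless a subclass fills it in. */
    void (*set_fd_handler)(Channel *ch, int *registered_fd);
} ChannelClass;

struct Channel {
    const ChannelClass *klass;
    int fd;
};

/* Subclass hook: records which fd would be watched. */
void demo_set_fd_handler(Channel *ch, int *registered_fd)
{
    *registered_fd = ch->fd;
}

/*
 * Dispatcher. Calling klass->set_fd_handler unconditionally would jump to
 * address 0 when the hook is missing; this sketch reports the gap instead.
 * Returns 0 on success, -1 if the class has no hook.
 */
int channel_set_fd_handler(Channel *ch, int *registered_fd)
{
    assert(ch != NULL && ch->klass != NULL);
    if (ch->klass->set_fd_handler == NULL) {
        return -1;
    }
    ch->klass->set_fd_handler(ch, registered_fd);
    return 0;
}

/* One class that forgot the hook, one that provides it. */
const ChannelClass class_without_hook = { NULL };
const ChannelClass class_with_hook = { demo_set_fd_handler };
```

A channel built on class_without_hook corresponds to the RDMA channel before this patch; one built on class_with_hook corresponds to it after the hook is registered in class_init.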

Signed-off-by: Lidong Chen
Reviewed-by: Juan Quintela
---
 migration/rdma.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index 92e4d30..dfa4f77 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2963,6 +2963,21 @@ static GSource *qio_channel_rdma_create_watch(QIOChannel *ioc,
     return source;
 }
 
+static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
+                                                AioContext *ctx,
+                                                IOHandler *io_read,
+                                                IOHandler *io_write,
+                                                void *opaque)
+{
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    if (io_read) {
+        aio_set_fd_handler(ctx, rioc->rdmain->comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+    } else {
+        aio_set_fd_handler(ctx, rioc->rdmaout->comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+    }
+}
 
 static int qio_channel_rdma_close(QIOChannel *ioc,
                                   Error **errp)
@@ -3822,6 +3837,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass,
     ioc_klass->io_set_blocking = qio_channel_rdma_set_blocking;
     ioc_klass->io_close = qio_channel_rdma_close;
     ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
+    ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
 }
 
 static const TypeInfo qio_channel_rdma_info = {

From patchwork Tue Jun 5 15:28:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 925532
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com
header.b="oGSed8Jw"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bTJ0zZLz9s08 for ; Wed, 6 Jun 2018 01:34:42 +1000 (AEST) Received: from localhost ([::1]:47538 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDyp-00058H-Fy for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:34:39 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40147) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDsz-0008GI-De for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDsy-0003wN-Ia for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:37 -0400 Received: from mail-pf0-x231.google.com ([2607:f8b0:400e:c00::231]:42690) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDsy-0003wA-C8 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:36 -0400 Received: by mail-pf0-x231.google.com with SMTP id w7-v6so1480113pfn.9 for ; Tue, 05 Jun 2018 08:28:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; b=oGSed8JwGF9ZzJTs+lH32Vn/ywsQPftL93xgVT5Jy4LyZVYVBAnHp9hrOBJjZuxGNs 56f9LU3i7gBTB8Zi3uctBJU9fhc4I0CBc/GdMDCXZDb5midw/ymwnDMoDviXGu3ulsYW UfxhcluKNPjJ//LpwLVkua81tNV4Bxd16mmMJ48bAFRaKxO0bGomaTL+gWGkY1rwYyL5 O1h4TlVcVuHEz6xazSja3YgGOBD8ojeJdqc+IMpk3lSpzmG2yd2MGMb8sxO4eNDE+HyW N9PvneuYgDUR0INpPfD335zVrDfsH+uw9IE8eKNVw2kGG2ZAhfKLpOZNiv7V2tJyBq4R 6TOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; 
b=fTP3TT3FS/U4Wi7u43NwoW4+K3QrQ52RGqSU3yGP2/ovnYy3ZusnjBnj61WVwBDC0H 2+1H/aviI8w+7FnYvAbNAPR5lOBtW5q/n1aPeIjSfVbzneEtVrkMSvksacGxUOwZP54U QRSI038xfU78gpblt+d0NhnSxZFooAzQerEYzDYp6a5KVj+fSZjuMYnGdfDpDbRYame0 CCN38/abmz+WjZzzMQeq3blOyyM6f4wYcwManMEE7NRI688AO2gn6nEy50CLqAJno+Hp 8B7Pq/SBumt/RcNBBr/euTIDLuobmFhi34l+iymjhWCQtSO5Z5EMYz/RDsD+WOWAabSJ Fyuw== X-Gm-Message-State: APt69E3SWapW+qH+qdQkl/Yn/SdY6ul2wDPzTULhM9ptiGw2S3KZWc3w ZVwl+wU7jTrE2yltZn6U0GI= X-Google-Smtp-Source: ADUXVKJE1hKZttVMFhAjr8jrZTf+qSB+upESUwcFuwWZbHqhmMpgA1kjF+LifFnijiTP8ihfEYmK5A== X-Received: by 2002:a65:5084:: with SMTP id r4-v6mr856528pgp.202.1528212515583; Tue, 05 Jun 2018 08:28:35 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:35 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:06 +0800 Message-Id: <1528212489-19137-8-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::231
Subject: [Qemu-devel] [PATCH v5 07/10] migration: invoke qio_channel_yield only when qemu_in_coroutine()
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

From: Lidong Chen

When qio_channel_read returns QIO_CHANNEL_ERR_BLOCK, the source qemu
crashes.

The backtrace is:
(gdb) bt
#0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
#1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
#2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
#5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768) at migration/qemu-file-channel.c:83
#6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
#7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
#8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
#9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
#10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6

This patch fixes it by invoking qio_channel_yield only when
qemu_in_coroutine() is true, and falling back to qio_channel_wait otherwise.
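
The dispatch this patch introduces can be sketched in isolation. This is a minimal sketch, not QEMU code: in_coroutine, demo_yield, demo_wait and wait_for_channel are hypothetical stand-ins for qemu_in_coroutine(), qio_channel_yield(), qio_channel_wait() and the patched call sites. It assumes that the assertion seen failing inside qio_channel_yield in the backtrace above is a check for coroutine context.

```c
#include <assert.h>
#include <stdbool.h>

/* Which waiting primitive was chosen. */
typedef enum { WAIT_YIELD, WAIT_BLOCK } WaitKind;

/* Hypothetical flag standing in for qemu_in_coroutine(). */
bool in_coroutine = false;

/* Stand-in for qio_channel_yield(): only legal inside a coroutine. */
WaitKind demo_yield(void)
{
    assert(in_coroutine);
    return WAIT_YIELD;
}

/* Stand-in for qio_channel_wait(): blocks the calling thread. */
WaitKind demo_wait(void)
{
    return WAIT_BLOCK;
}

/* Pick the waiting primitive based on the calling context. */
WaitKind wait_for_channel(void)
{
    if (in_coroutine) {
        return demo_yield();
    }
    return demo_wait();
}
```

With the flag cleared, as in a plain thread such as source_return_path_thread, the sketch takes the blocking path and never reaches the assertion.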

Signed-off-by: Lidong Chen
Reviewed-by: Juan Quintela
---
 migration/qemu-file-channel.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index e202d73..8e639eb 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -49,7 +49,11 @@ static ssize_t channel_writev_buffer(void *opaque,
         ssize_t len;
         len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
         if (len == QIO_CHANNEL_ERR_BLOCK) {
-            qio_channel_wait(ioc, G_IO_OUT);
+            if (qemu_in_coroutine()) {
+                qio_channel_yield(ioc, G_IO_OUT);
+            } else {
+                qio_channel_wait(ioc, G_IO_OUT);
+            }
             continue;
         }
         if (len < 0) {
@@ -80,7 +84,11 @@ static ssize_t channel_get_buffer(void *opaque,
         ret = qio_channel_read(ioc, (char *)buf, size, NULL);
         if (ret < 0) {
             if (ret == QIO_CHANNEL_ERR_BLOCK) {
-                qio_channel_yield(ioc, G_IO_IN);
+                if (qemu_in_coroutine()) {
+                    qio_channel_yield(ioc, G_IO_IN);
+                } else {
+                    qio_channel_wait(ioc, G_IO_IN);
+                }
             } else {
                 /* XXX handle Error * object */
                 return -EIO;

From patchwork Tue Jun 5 15:28:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 925527
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="qwa352qO"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client
certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bMB3Vn3z9s0W for ; Wed, 6 Jun 2018 01:29:26 +1000 (AEST) Received: from localhost ([::1]:47505 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDtk-0008OG-44 for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:29:24 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40175) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDt2-0008Ix-0z for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDt1-0003xY-1m for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:40 -0400 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:41673) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDt0-0003x5-OW for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:38 -0400 Received: by mail-pf0-x243.google.com with SMTP id a11-v6so1484134pff.8 for ; Tue, 05 Jun 2018 08:28:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=MiVhFCP4aas7BoWLX4QmXx/HnZ+qlLWa1ujmNDXEV3c=; b=qwa352qOnCNCrZVsAPjmRd4oX+axa8wqbuQvcJtZHJcaabdnmsxK464Zm0R+lAZxg9 B/EDEJbRYpTpe5hE7SvWxhS8BHLawzQor9zobRK97qD2z4qlieDgr6GOb/qSWPDhZTuh LyA8BUDKnYjxE7eIDl0njtkP90EGAs6oRscoqRmsk5fXhtNHOJbg4WkH8z1+TU5p9bWg j/j2XnJ0lif1maxWbf/KCyk2/R47uUeZzvdwLuVw9SyNoKd5CXZ70tDFnCZ50FxhiTYX HSj/itY47IZWcgbDMJZHS+ZnUORqrpwE4tEmxHpXMgNVw8zzWisE1TsPCUNbL/dwRnLx U5qw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=MiVhFCP4aas7BoWLX4QmXx/HnZ+qlLWa1ujmNDXEV3c=; b=PH/c7b0goaPWj9cXMkzcPztG6r9s14ruPKeDjJOJyZSK0cgA0Ewo7oaRZbNpwyfzEp maK8Kjy5sWv5ZQmBHdSrbEdec6c7xIvUYfyDgPgyNsTyUUIjaSTLvfto+QgNIMvEId5X 
YH4WTVzL4RPS1F+bODrsEd13tbNyfUBtQZXk69vIp8xmTqUci11Z5XUmQOWa15Kw2pZP s4fjuEHDPrsHC2l928R2XQ+36rjpKNDMBZuOJS7Rb5mp2LE9096F9zIe4Lqi6cdxMkgG pyF7DlwnMwtHUe+/g5jkQV80BkgxL4Z0+LU17QJ2oAIFdo8biOH+qXE87Bq3y6FjhcZC xPHA==
X-Gm-Message-State: ALKqPweElfkeetO/OMEsb70al8C+skovbDgcpwu0A5cvfvQKVXAa/JrN xQez8v63bY2TE4iIJ4/YXAw=
X-Google-Smtp-Source: ADUXVKLmssD1+fo0JfHaT+Mwz8fKnSFI6jzZcflO3Cq6PTqqVqx7gT3Z1lEmgtrOs2ToAC5+PUGgkA==
X-Received: by 2002:a63:a60a:: with SMTP id t10-v6mr21171234pge.351.1528212517962; Tue, 05 Jun 2018 08:28:37 -0700 (PDT)
Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.35 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:37 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Tue, 5 Jun 2018 23:28:07 +0800
Message-Id: <1528212489-19137-9-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c00::243
Subject: [Qemu-devel] [PATCH v5 08/10] migration: create a dedicated thread to release rdma resource
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

ibv_dereg_mr() takes a long time on virtual servers with a large amount of memory.
The test result is:
  10GB  326ms
  20GB  699ms
  30GB  1021ms
  40GB  1387ms
  50GB  1712ms
  60GB  2034ms
  70GB  2457ms
  80GB  2807ms
  90GB  3107ms
  100GB 3474ms
  110GB 3735ms
  120GB 4064ms
  130GB 4567ms
  140GB 4886ms

This causes the guest OS to hang for a while when migration finishes,
so create a dedicated thread to release the RDMA resources.

Signed-off-by: Lidong Chen
---
 migration/rdma.c | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index dfa4f77..f12e8d5 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2979,35 +2979,46 @@ static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
     }
 }
 
-static int qio_channel_rdma_close(QIOChannel *ioc,
-                                  Error **errp)
+static void *qio_channel_rdma_close_thread(void *arg)
 {
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
-    RDMAContext *rdmain, *rdmaout;
-    trace_qemu_rdma_close();
+    RDMAContext **rdma = arg;
+    RDMAContext *rdmain = rdma[0];
+    RDMAContext *rdmaout = rdma[1];
 
-    rdmain = rioc->rdmain;
-    if (rdmain) {
-        atomic_rcu_set(&rioc->rdmain, NULL);
-    }
-
-    rdmaout = rioc->rdmaout;
-    if (rdmaout) {
-        atomic_rcu_set(&rioc->rdmaout, NULL);
-    }
+    rcu_register_thread();
 
     synchronize_rcu();
-
     if (rdmain) {
         qemu_rdma_cleanup(rdmain);
     }
-
     if (rdmaout) {
         qemu_rdma_cleanup(rdmaout);
     }
 
     g_free(rdmain);
     g_free(rdmaout);
+    g_free(rdma);
+
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static int qio_channel_rdma_close(QIOChannel *ioc,
+                                  Error **errp)
+{
+    QemuThread t;
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    RDMAContext **rdma = g_new0(RDMAContext*, 2);
+
+    trace_qemu_rdma_close();
+    if (rioc->rdmain || rioc->rdmaout) {
+        rdma[0] = rioc->rdmain;
+        rdma[1] = rioc->rdmaout;
+        qemu_thread_create(&t, "rdma cleanup", qio_channel_rdma_close_thread,
+                           rdma, QEMU_THREAD_DETACHED);
+        atomic_rcu_set(&rioc->rdmain, NULL);
+        atomic_rcu_set(&rioc->rdmaout, NULL);
+    }
 
     return 0;
 }

From patchwork Tue Jun 5 15:28:08 2018
Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 925531 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="So1QoFYH"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bQS2fNWz9s08 for ; Wed, 6 Jun 2018 01:32:16 +1000 (AEST) Received: from localhost ([::1]:47526 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDwT-0002vh-Ub for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:32:14 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40222) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDt7-0008Mw-21 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDt3-0003yg-Ek for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:45 -0400 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:41292) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDt3-0003yN-9K for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:41 -0400 Received: by mail-pl0-x241.google.com with SMTP id az12-v6so1722726plb.8 for ; Tue, 05 Jun 2018 08:28:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=+pUg748FIAj70423DtRMW7r7pF08+rU+AmRIjUoxV9Y=; b=So1QoFYHoj2cjXRPoLhE8viprzo0p/DxAjKDn+SNKSJbq++wVsnqYmO/BhN/YuC0XW UzPUAOVN7MMgEObShHqKi1J6UGMO6fzRBQENBoAYMGfDYn6nv1lzOPaBj3Uv5NpENO14 7zNd7W9pxuOjYJimUMiVeNvt7Empg4A7uZZK6fEG/9XlM+pb3kTwwqpwz/SR12f9cbGS nLhJuADNILcqQ+lUy7VZWC8jNrd+8KcksPXQkPLfqtN27mkJJzil2hN1SpTKL8L+K3xZ kXtECsoUsbxX70tr8pdLtWRSRxaPSrOzHZHq7/rDjV34O6gd16/kAbhadEsrHFpVSc/C aZeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+pUg748FIAj70423DtRMW7r7pF08+rU+AmRIjUoxV9Y=; b=UaoR80Z9WUNmwL18L5+vJsslOYDOku/ZSTGxvXy34dSmi20HWHnYwkcLVs/BkDMHM0 K91MmslJVokyn/zk8ormC/mfwf3r5wrG/DFi9h4XJtiQ3jwcZjzrqMmfmAJKEtzERY8f CiNRF20f6f0sEVRBpyNNYtL4OGq01Uzb19vK86fxbk+7q/x4aoaNVENikGupfiwFH6QU 0/Wfi2Rk2yU+AvaHfS+kpCT4b+PNYWPvAE4wZZpNnM78gfFkP0ogiLbu4KyXwFrMlUlJ QJFLHaaP19vsI6vEZ8vIqSrw8fKjayLrjBX8hxQLou8m/1w2op6gUEQH0YV9wAVfclHZ Sk3w== X-Gm-Message-State: ALKqPwdR/l9QghTKY4t9OX6f/pcRiFQ6VZoimLg4EcGd5aowErSQ7pk4 W3VWaVqqpnUNkXjnzsYgCqQ= X-Google-Smtp-Source: ADUXVKIsM7i08X5XNzR1opTpXrDfCtI4VappSLG226tAX/uWaJZkQbtasBQiJ6r41uait3IGJE+eGw== X-Received: by 2002:a17:902:a416:: with SMTP id p22-v6mr26650587plq.228.1528212520472; Tue, 05 Jun 2018 08:28:40 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:39 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Tue, 5 Jun 2018 23:28:08 +0800 Message-Id: <1528212489-19137-10-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: 
<1528212489-19137-1-git-send-email-lidongchen@tencent.com>
References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::241
Subject: [Qemu-devel] [PATCH v5 09/10] migration: poll the cm event while wait RDMA work request completion
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

If the peer QEMU has crashed, qemu_rdma_wait_comp_channel() may loop
forever. So also poll the CM event fd, and treat the connection as failed
when RDMA_CM_EVENT_DISCONNECTED or RDMA_CM_EVENT_DEVICE_REMOVAL is received.

Signed-off-by: Lidong Chen
---
 migration/rdma.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index f12e8d5..bb6989e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
  */
 static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
 {
+    struct rdma_cm_event *cm_event;
+    int ret = -1;
+
     /*
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
@@ -1505,13 +1508,37 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
          * without hanging forever.
          */
         while (!rdma->error_state && !rdma->received_error) {
-            GPollFD pfds[1];
+            GPollFD pfds[2];
             pfds[0].fd = rdma->comp_channel->fd;
             pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            pfds[0].revents = 0;
+
+            pfds[1].fd = rdma->channel->fd;
+            pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            pfds[1].revents = 0;
+
             /* 0.1s timeout, should be fine for a 'cancel' */
-            switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
+            switch (qemu_poll_ns(pfds, 2, 100 * 1000 * 1000)) {
+            case 2:
             case 1: /* fd active */
-                return 0;
+                if (pfds[0].revents) {
+                    return 0;
+                }
+
+                if (pfds[1].revents) {
+                    ret = rdma_get_cm_event(rdma->channel, &cm_event);
+                    if (!ret) {
+                        rdma_ack_cm_event(cm_event);
+                    }
+
+                    error_report("receive cm event while wait comp channel,"
+                                 "cm event is %d", cm_event->event);
+                    if (cm_event->event == RDMA_CM_EVENT_DISCONNECTED ||
+                        cm_event->event == RDMA_CM_EVENT_DEVICE_REMOVAL) {
+                        return -EPIPE;
+                    }
+                }
+                break;
 
             case 0: /* Timeout, go around again */
                 break;

From patchwork Tue Jun 5 15:28:09 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 925534
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="GYzvj2pg"; dkim-atps=neutral
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 410bWc4PSpz9s01 for ; Wed, 6 Jun
2018 01:36:44 +1000 (AEST) Received: from localhost ([::1]:47560 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQE0o-00077m-AB for incoming@patchwork.ozlabs.org; Tue, 05 Jun 2018 11:36:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40223) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fQDt7-0008My-27 for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fQDt5-0003zu-Sc for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:45 -0400 Received: from mail-pf0-x241.google.com ([2607:f8b0:400e:c00::241]:34678) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fQDt5-0003zN-JP for qemu-devel@nongnu.org; Tue, 05 Jun 2018 11:28:43 -0400 Received: by mail-pf0-x241.google.com with SMTP id a63-v6so1492802pfl.1 for ; Tue, 05 Jun 2018 08:28:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=BU/AaHONtx2y8sW4Yxt/LJSbPRVczbwYHznqDXS1FTI=; b=GYzvj2pgTFgLsZbjw7vrs4Tlb+LVieuBET18YnIJOWUxDZbPr2t0YJArpQ0QlJ8TAx +w0B4zCSgvLiuPgux3F1j8LVLJMyuecKp+UoKq5V7ZlwbTb8z/PV8bSdVxXtT/LycvhJ TmciPnOHZdMGYh+sfxPCg5tvpes5RvXgXXOeZXfgrQdy08hf28UZK9f/+E202q4lskgA inKPqqzNWc/14SXBrg4GGt27lJrlpl+v7aU9NnGOLWxjrzQFwmWXEmmnasIvDUnmxUV4 p676R9vvFhZlbIuAeJP5DJtZ2uHvsgYyLR2NWYCRU8xUOjbkgDShzStuOoehGZVqJg7r u5Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BU/AaHONtx2y8sW4Yxt/LJSbPRVczbwYHznqDXS1FTI=; b=BLvrU1C3Xc7Po9WT8NWfZWaER5iUXR3QHgJ9tLmZzhxTLj2iK43IsUY59I1I84MhvI Y9GOJ8k0Z/wjK28GbZV+0igqZ8XFKYN3pZi3fo5grF1mVSkvnePDCH7Ke46BtC8DJdRn RjqAKx27SMQNohPiLZ6GHGcCR1Sh9fZbWnQBm6OpJOLfm1LVUQGZ2NokjHxLQA+1F2XV G0B3g2yHM4LE67+KvBjCmJuhr7Nsfgx5J+d1FFR0mgjjAHr3iDQBoc++knlY3+UfKgHx 
S2ySRPIEBjY0D3WJFZYXKJXqNDJbwfup+xC3WOFsAUGo683gTlIiFtsQJbYSV5YkPvF4 OOig==
X-Gm-Message-State: ALKqPweCp1b/vODUfAsuJ+RgM/4svW3NKk3G3u/++HuofSqyDi394UV+ ubzb7C4521ejBJPkubBkqPE=
X-Google-Smtp-Source: ADUXVKLMr3VxpCo2MwdorsnCGHgaMlm7iZ079xF+5C3PO705rHVQnji1YLFhvVQFVso/Bah8c9JbGQ==
X-Received: by 2002:a63:3759:: with SMTP id g25-v6mr21789763pgn.59.1528212522778; Tue, 05 Jun 2018 08:28:42 -0700 (PDT)
Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id t14-v6sm101952007pfh.109.2018.06.05.08.28.40 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Jun 2018 08:28:42 -0700 (PDT)
From: Lidong Chen
X-Google-Original-From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Tue, 5 Jun 2018 23:28:09 +0800
Message-Id: <1528212489-19137-11-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
References: <1528212489-19137-1-git-send-email-lidongchen@tencent.com>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c00::241
Subject: [Qemu-devel] [PATCH v5 10/10] migration: implement the shutdown for RDMA QIOChannel
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: galsha@mellanox.com, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"

Because the RDMA QIOChannel does not implement the shutdown function, if an
error is set on to_dst_file, the return path thread waits forever, and the
migration thread in turn waits for the return path thread to exit.

The backtrace of the return path thread is:

(gdb) bt
#0  0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
#1  0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000) at qemu-timer.c:325
#2  0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000) at migration/rdma.c:1501
#3  0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000, byte_len=0x7ef7091d0640) at migration/rdma.c:1580
#4  0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000, head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
#5  0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720, expecting=3) at migration/rdma.c:1903
#6  0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8, size=32768) at migration/rdma.c:2714
#7  0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
#8  0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0) at migration/qemu-file.c:502
#9  0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
#10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
#11 0x00000000006a46d3 in source_return_path_thread ( opaque=0xd826a0 ) at migration/migration.c:1331
#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f372a77635d in clone () from /lib64/libc.so.6

The backtrace of the migration thread is:

(gdb) bt
#0  0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
#1  0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 ) at util/qemu-thread-posix.c:504
#2  0x00000000006a4bc5 in await_return_path_close_on_source ( ms=0xd826a0 ) at migration/migration.c:1460
#3  0x00000000006a53e4 in migration_completion (s=0xd826a0 , current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980) at migration/migration.c:1695
#4  0x00000000006a5c54 in migration_thread (opaque=0xd826a0 ) at migration/migration.c:1837
#5
0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#6  0x00007f372a77635d in clone () from /lib64/libc.so.6

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index bb6989e..0b35d65 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3050,6 +3050,45 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
     return 0;
 }
 
+static int
+qio_channel_rdma_shutdown(QIOChannel *ioc,
+                          QIOChannelShutdown how,
+                          Error **errp)
+{
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    RDMAContext *rdmain, *rdmaout;
+
+    rcu_read_lock();
+
+    rdmain = atomic_rcu_read(&rioc->rdmain);
+    rdmaout = atomic_rcu_read(&rioc->rdmaout);
+
+    switch (how) {
+    case QIO_CHANNEL_SHUTDOWN_READ:
+        if (rdmain) {
+            rdmain->error_state = -1;
+        }
+        break;
+    case QIO_CHANNEL_SHUTDOWN_WRITE:
+        if (rdmaout) {
+            rdmaout->error_state = -1;
+        }
+        break;
+    case QIO_CHANNEL_SHUTDOWN_BOTH:
+    default:
+        if (rdmain) {
+            rdmain->error_state = -1;
+        }
+        if (rdmaout) {
+            rdmaout->error_state = -1;
+        }
+        break;
+    }
+
+    rcu_read_unlock();
+    return 0;
+}
+
 /*
  * Parameters:
  *    @offset == 0 :
@@ -3876,6 +3915,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass,
     ioc_klass->io_close = qio_channel_rdma_close;
     ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
     ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
+    ioc_klass->io_shutdown = qio_channel_rdma_shutdown;
 }
 
 static const TypeInfo qio_channel_rdma_info = {
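
For readers following the series, the way a shutdown hook that only sets
error_state can unblock a waiter that polls in bounded slices may be easier
to see outside QEMU. The following is a minimal sketch under stated
assumptions, not the QEMU code: struct demo_ctx, struct demo_channel,
enum demo_shutdown, demo_shutdown() and demo_wait() are hypothetical
stand-ins invented for illustration, plain poll(2) replaces qemu_poll_ns(),
and a C11 atomic replaces the RCU-protected pointers used in rdma.c.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical stand-in for RDMAContext: only what the sketch needs. */
struct demo_ctx {
    atomic_int error_state; /* non-zero once this direction is shut down */
    int fd;                 /* descriptor the waiter polls */
};

/* Hypothetical stand-in for QIOChannelRDMA: one context per direction. */
struct demo_channel {
    struct demo_ctx *in;
    struct demo_ctx *out;
};

enum demo_shutdown {
    DEMO_SHUTDOWN_READ,
    DEMO_SHUTDOWN_WRITE,
    DEMO_SHUTDOWN_BOTH,
};

/* Flag the requested direction(s) as failed; NULL contexts are skipped. */
static int demo_shutdown(struct demo_channel *ch, enum demo_shutdown how)
{
    if (how != DEMO_SHUTDOWN_WRITE && ch->in) {
        atomic_store(&ch->in->error_state, -1);
    }
    if (how != DEMO_SHUTDOWN_READ && ch->out) {
        atomic_store(&ch->out->error_state, -1);
    }
    return 0;
}

/*
 * Poll ctx->fd in slices of timeout_ms, re-checking error_state before
 * each slice. Returns 0 when poll() reports activity on the fd, -EIO once
 * error_state is set, -ETIMEDOUT after max_rounds idle slices, or -errno
 * if poll() itself fails.
 */
static int demo_wait(struct demo_ctx *ctx, int timeout_ms, int max_rounds)
{
    for (int i = 0; i < max_rounds; i++) {
        if (atomic_load(&ctx->error_state)) {
            return -EIO;
        }
        struct pollfd pfd = { .fd = ctx->fd, .events = POLLIN, .revents = 0 };
        int n = poll(&pfd, 1, timeout_ms);
        if (n > 0) {
            return 0;
        }
        if (n < 0 && errno != EINTR) {
            return -errno;
        }
    }
    return atomic_load(&ctx->error_state) ? -EIO : -ETIMEDOUT;
}
```

With a pipe as the polled descriptor, demo_wait() on an idle read end times
out, returns -EIO after demo_shutdown(&ch, DEMO_SHUTDOWN_READ), and returns 0
once a byte has been written and error_state is clear. The short poll slice
is what bounds how long the waiter takes to notice the flag, which is the
same trade-off as the 0.1s timeout in qemu_rdma_wait_comp_channel().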