From patchwork Wed May 30 09:43:25 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 922667
From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Wed, 30 May 2018 17:43:25 +0800
Message-Id: <1527673416-31268-2-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
Subject: [Qemu-devel] [PATCH v4 01/12] migration: disable RDMA WRITE after postcopy started
Cc: Lidong Chen, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen

From: Lidong Chen

RDMA WRITE operations are performed with no notification to the
destination QEMU, so the destination QEMU cannot wake up. This patch
disables RDMA WRITE after postcopy has started.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/qemu-file.c |  8 ++++++--
 migration/rdma.c      | 12 ++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 0463f4c..977b9ae 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -253,8 +253,12 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
     if (f->hooks && f->hooks->save_page) {
         int ret = f->hooks->save_page(f, f->opaque, block_offset,
                                       offset, size, bytes_sent);
-        f->bytes_xfer += size;
-        if (ret != RAM_SAVE_CONTROL_DELAYED) {
+        if (ret != RAM_SAVE_CONTROL_NOT_SUPP) {
+            f->bytes_xfer += size;
+        }
+
+        if (ret != RAM_SAVE_CONTROL_DELAYED &&
+            ret != RAM_SAVE_CONTROL_NOT_SUPP) {
             if (bytes_sent && *bytes_sent > 0) {
                 qemu_update_position(f, *bytes_sent);
             } else if (ret < 0) {
diff --git a/migration/rdma.c b/migration/rdma.c
index 7d233b0..a0748f4 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2930,6 +2930,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return RAM_SAVE_CONTROL_NOT_SUPP;
+    }
+
     qemu_fflush(f);
 
     if (size > 0) {
@@ -3489,6 +3493,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     trace_qemu_rdma_registration_start(flags);
     qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
     qemu_fflush(f);
@@ -3511,6 +3519,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
 
     CHECK_ERROR_STATE();
 
+    if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+        return 0;
+    }
+
     qemu_fflush(f);
     ret = qemu_rdma_drain_cq(f, rdma);
 

From patchwork Wed May 30 09:43:26 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 922668
From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Wed, 30 May 2018 17:43:26 +0800
Message-Id: <1527673416-31268-3-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
Subject: [Qemu-devel] [PATCH v4 02/12] migration: create a dedicated connection for rdma return path
Cc: Lidong Chen, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen

From: Lidong Chen

If an RDMA migration is started with postcopy enabled, the source QEMU
establishes a dedicated connection for the return path.

Signed-off-by: Lidong Chen
Reviewed-by: Dr.
David Alan Gilbert
---
 migration/rdma.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 91 insertions(+), 3 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index a0748f4..ec4bbff 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -387,6 +387,10 @@ typedef struct RDMAContext {
     uint64_t unregistrations[RDMA_SIGNALED_SEND_MAX];
 
     GHashTable *blockmap;
+
+    /* the RDMAContext for return path */
+    struct RDMAContext *return_path;
+    bool is_return_path;
 } RDMAContext;
 
 #define TYPE_QIO_CHANNEL_RDMA "qio-channel-rdma"
@@ -2332,10 +2336,22 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma_destroy_id(rdma->cm_id);
         rdma->cm_id = NULL;
     }
+
+    /* the destination side, listen_id and channel is shared */
     if (rdma->listen_id) {
-        rdma_destroy_id(rdma->listen_id);
+        if (!rdma->is_return_path) {
+            rdma_destroy_id(rdma->listen_id);
+        }
         rdma->listen_id = NULL;
+
+        if (rdma->channel) {
+            if (!rdma->is_return_path) {
+                rdma_destroy_event_channel(rdma->channel);
+            }
+            rdma->channel = NULL;
+        }
     }
+
     if (rdma->channel) {
         rdma_destroy_event_channel(rdma->channel);
         rdma->channel = NULL;
@@ -2564,6 +2580,25 @@ err_dest_init_create_listen_id:
 
 }
 
+static void qemu_rdma_return_path_dest_init(RDMAContext *rdma_return_path,
+                                            RDMAContext *rdma)
+{
+    int idx;
+
+    for (idx = 0; idx < RDMA_WRID_MAX; idx++) {
+        rdma_return_path->wr_data[idx].control_len = 0;
+        rdma_return_path->wr_data[idx].control_curr = NULL;
+    }
+
+    /*the CM channel and CM id is shared*/
+    rdma_return_path->channel = rdma->channel;
+    rdma_return_path->listen_id = rdma->listen_id;
+
+    rdma->return_path = rdma_return_path;
+    rdma_return_path->return_path = rdma;
+    rdma_return_path->is_return_path = true;
+}
+
 static void *qemu_rdma_data_init(const char *host_port, Error **errp)
 {
     RDMAContext *rdma = NULL;
@@ -3021,6 +3056,8 @@ err:
     return ret;
 }
 
+static void rdma_accept_incoming_migration(void *opaque);
+
 static int qemu_rdma_accept(RDMAContext *rdma)
 {
     RDMACapabilities cap;
@@ -3115,7 +3152,14 @@ static int qemu_rdma_accept(RDMAContext *rdma)
         }
     }
 
-    qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    /* Accept the second connection request for return path */
+    if (migrate_postcopy() && !rdma->is_return_path) {
+        qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
+                            NULL,
+                            (void *)(intptr_t)rdma->return_path);
+    } else {
+        qemu_set_fd_handler(rdma->channel->fd, NULL, NULL, NULL);
+    }
 
     ret = rdma_accept(rdma->cm_id, &conn_param);
     if (ret) {
@@ -3700,6 +3744,10 @@ static void rdma_accept_incoming_migration(void *opaque)
 
     trace_qemu_rdma_accept_incoming_migration_accepted();
 
+    if (rdma->is_return_path) {
+        return;
+    }
+
     f = qemu_fopen_rdma(rdma, "rb");
     if (f == NULL) {
         ERROR(errp, "could not qemu_fopen_rdma!");
@@ -3714,7 +3762,7 @@ static void rdma_accept_incoming_migration(void *opaque)
 void rdma_start_incoming_migration(const char *host_port, Error **errp)
 {
     int ret;
-    RDMAContext *rdma;
+    RDMAContext *rdma, *rdma_return_path = NULL;
     Error *local_err = NULL;
 
     trace_rdma_start_incoming_migration();
@@ -3741,12 +3789,24 @@ void rdma_start_incoming_migration(const char *host_port, Error **errp)
 
     trace_rdma_start_incoming_migration_after_rdma_listen();
 
+    /* initialize the RDMAContext for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, &local_err);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        qemu_rdma_return_path_dest_init(rdma_return_path, rdma);
+    }
+
     qemu_set_fd_handler(rdma->channel->fd, rdma_accept_incoming_migration,
                         NULL, (void *)(intptr_t)rdma);
     return;
 err:
     error_propagate(errp, local_err);
     g_free(rdma);
+    g_free(rdma_return_path);
 }
 
 void rdma_start_outgoing_migration(void *opaque,
@@ -3754,6 +3814,7 @@ void rdma_start_outgoing_migration(void *opaque,
 {
     MigrationState *s = opaque;
     RDMAContext *rdma = qemu_rdma_data_init(host_port, errp);
+    RDMAContext *rdma_return_path = NULL;
     int ret = 0;
 
     if (rdma == NULL) {
@@ -3774,6 +3835,32 @@ void rdma_start_outgoing_migration(void *opaque,
         goto err;
     }
 
+    /* RDMA postcopy need a separate queue pair for return path */
+    if (migrate_postcopy()) {
+        rdma_return_path = qemu_rdma_data_init(host_port, errp);
+
+        if (rdma_return_path == NULL) {
+            goto err;
+        }
+
+        ret = qemu_rdma_source_init(rdma_return_path,
+            s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL], errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        ret = qemu_rdma_connect(rdma_return_path, errp);
+
+        if (ret) {
+            goto err;
+        }
+
+        rdma->return_path = rdma_return_path;
+        rdma_return_path->return_path = rdma;
+        rdma_return_path->is_return_path = true;
+    }
+
     trace_rdma_start_outgoing_migration_after_rdma_connect();
 
     s->to_dst_file = qemu_fopen_rdma(rdma, "wb");
@@ -3781,4 +3868,5 @@ void rdma_start_outgoing_migration(void *opaque,
     return;
 err:
     g_free(rdma);
+    g_free(rdma_return_path);
 }

From patchwork Wed May 30 09:43:27 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: 858585 jemmy
X-Patchwork-Id: 922670
From: Lidong Chen
To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com
Date: Wed, 30 May 2018 17:43:27 +0800
Message-Id: <1527673416-31268-4-git-send-email-lidongchen@tencent.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com>
Subject: [Qemu-devel] [PATCH v4 03/12] migration: remove unnecessary variables len in QIOChannelRDMA
Cc: Lidong Chen, adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen

From: Lidong Chen

Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary
variable len from QIOChannelRDMA and uses a local variable instead.

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Daniel P.
Berrangé --- migration/rdma.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index ec4bbff..9b6da4d 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -404,7 +404,6 @@ struct QIOChannelRDMA { QIOChannel parent; RDMAContext *rdma; QEMUFile *file; - size_t len; bool blocking; /* XXX we don't actually honour this yet */ }; @@ -2643,6 +2642,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, int ret; ssize_t done = 0; size_t i; + size_t len = 0; CHECK_ERROR_STATE(); @@ -2662,10 +2662,10 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, while (remaining) { RDMAControlHeader head; - rioc->len = MIN(remaining, RDMA_SEND_INCREMENT); - remaining -= rioc->len; + len = MIN(remaining, RDMA_SEND_INCREMENT); + remaining -= len; - head.len = rioc->len; + head.len = len; head.type = RDMA_CONTROL_QEMU_FILE; ret = qemu_rdma_exchange_send(rdma, &head, data, NULL, NULL, NULL); @@ -2675,8 +2675,8 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, return ret; } - data += rioc->len; - done += rioc->len; + data += len; + done += len; } } @@ -2771,8 +2771,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, } } } - rioc->len = done; - return rioc->len; + return done; } /* From patchwork Wed May 30 09:43:28 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922671 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) 
header.d=gmail.com header.i=@gmail.com header.b="FNjjq88v"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm3n2n5Hz9s1b for ; Wed, 30 May 2018 19:47:49 +1000 (AEST) Received: from localhost ([::1]:37253 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxhr-0002nb-0w for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:47:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40315) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeH-0000Q0-NL for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeE-0006ps-K6 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:05 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:42774) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeE-0006pg-EP for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:02 -0400 Received: by mail-pf0-x22a.google.com with SMTP id p14-v6so8747481pfh.9 for ; Wed, 30 May 2018 02:44:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=L0Ramt5azYV3DW/8/DFcHqf/V23s75JfWn8Ox5NsZwI=; b=FNjjq88vNrmeNKVjiVx0SAY2upFBDFh8+hpmVBUxuoIezdvhZ1o/OXNbInGVaiKceB weEgagox2nSixupg3VXDLjXPlcAs2MaSuV6JIF82lcxAtq+r2f5h7p5S1dbu5yP7oXFh x4Tpnp4njGz4z2zQhauqTxpUDL0/R6BbzyjvhF9KSXWq0Bd2YOACzkPnFkYJDotQXXzE LFzdispk19ThEwXwpdyWp2zQ/iPpFwOI/1dNy9j7j7WeUimTlnCdNT8FZYEbY0u1J95G aDGRHC9WOfw200I57qd+PYt7YQlzTcyTeafGgzeINv3iVJeAWVAcKGdmbDBlAqIt2Q5Z +cpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; 
bh=L0Ramt5azYV3DW/8/DFcHqf/V23s75JfWn8Ox5NsZwI=; b=C9LQerrxIXwZbqpv+W5bpNIshk222vxZ5qV9Lr06/h5QPU7pAmQMQGhegXdP8yHEt/ 14a7DukhBR3qosstovoDuvJaeHGJIrKxPj5l/lwYRFGLiVDqumvlLyUfz2XT8wY7ty+4 GbshM3X5TbTbv3q9mXRF2hwde6KhqbRhmtSOXvBzHA6iiWyZCR6NVJUgTxlvNlMyJrdK El2fY4OIamwIICAXDYpHY21tAiu6NgIOj3GLXLT6iYCUnFymr4D3Z7pD8PXrlleMOQjL owwjgiikOdlfgiKHGxok4IxSNSLYML72OPRIGBO1zxZY2W6UhDMzgLHnQ+X9iatzQopx Qa1Q== X-Gm-Message-State: ALKqPwd3lab8mf35ikPTPXHHjtNBRTTJFaNLQL1n70pXNiyZDi0SwTiz 1XrWkfb1SxlyzOAlNXoylC0= X-Google-Smtp-Source: ADUXVKJIYHr1R+UJo2XeWRfbRmoOEDGASmPJg//mC7qxxGpmAwwwoz6rb6yydpK/vSdvjpLLpiUu+A== X-Received: by 2002:a63:6d05:: with SMTP id i5-v6mr1660211pgc.321.1527673441622; Wed, 30 May 2018 02:44:01 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.43.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:01 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:28 +0800 Message-Id: <1527673416-31268-5-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v4 04/12] migration: avoid concurrent invoke channel_close by different threads X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen The channel_close maybe invoked by different threads. For example, source qemu invokes qemu_fclose in main thread, migration thread and return path thread. Destination qemu invokes qemu_fclose in main thread, listen thread and COLO incoming thread. Add a mutex in QEMUFile struct to avoid concurrent invoke channel_close. Signed-off-by: Lidong Chen --- migration/qemu-file.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 977b9ae..87d0f05 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -52,6 +52,7 @@ struct QEMUFile { unsigned int iovcnt; int last_error; + QemuMutex lock; }; /* @@ -96,6 +97,7 @@ QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops) f = g_new0(QEMUFile, 1); + qemu_mutex_init(&f->lock); f->opaque = opaque; f->ops = ops; return f; @@ -328,7 +330,9 @@ int qemu_fclose(QEMUFile *f) ret = qemu_file_get_error(f); if (f->ops->close) { + qemu_mutex_lock(&f->lock); int ret2 = f->ops->close(f->opaque); + qemu_mutex_unlock(&f->lock); if (ret >= 0) { ret = ret2; } @@ -339,6 +343,7 @@ int qemu_fclose(QEMUFile *f) if (f->last_error) { ret = f->last_error; } + qemu_mutex_destroy(&f->lock); g_free(f); trace_qemu_file_fclose(); return ret; From patchwork Wed May 30 09:43:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922674 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Ey53oR8l"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm702hpyz9s1R for ; Wed, 30 May 2018 19:50:36 +1000 (AEST) Received: from localhost ([::1]:37266 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxkY-0004sb-1v for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:50:34 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40336) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeJ-0000Rh-Gc for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeH-0006qf-Fj for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:07 -0400 Received: from mail-pf0-x241.google.com ([2607:f8b0:400e:c00::241]:36562) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeH-0006qT-6p for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:05 -0400 Received: by mail-pf0-x241.google.com with SMTP id w129-v6so8756929pfd.3 for ; Wed, 30 May 2018 02:44:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=xoNXjkhSRfGFabOTEuJXpjBSwWuY76cIvudGti5sy9w=; b=Ey53oR8lL+3yYjpmVML/M5isWmrcx7T8LDlkeTnENDOeKQXrS1g15oMi4YJxPwxTxD 
/G1aDt67Tgj4/XmdIx0WIYiCyFdszA6sMk86V0SF6YXwaq2dhuLpvmwu0i3zhzKcwMqF HvtPDl0yJmquLl7B+cqas0X13X4E3Jmbi+EYv14DycLdOgWEI2YpRFOCqVTP8g1dOr2a I/pu6WpYttfgMh+3fIf0YUrAfC+U72j5XNf7UZWkwimrzcd1b55wFw7vTQ4LFW1eW5b4 W8AAkXM/6D7xpP/1qOuCXAPD2+o4oRHKoEnYcG4EFEmUVhht4U2S6oMiM5Z53hbGS0kQ BX9Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=xoNXjkhSRfGFabOTEuJXpjBSwWuY76cIvudGti5sy9w=; b=fg6DRbADoEJ/5XX25akTLaGYNdQXtO2N1KYp2wK5a+t5tB77ofTgWL80jvfvKzlk5q F0eDKSsBXEfv9ZWzy5zvvGyVYin1MshH/p3b4/uU2yjPLalXYN6I0AB2Cm5OJZweDB5T NmFrfDiD4NUlcI/WiLJu42EW5Y+DaqeeQtRh6rnNlal/Bl9yHBlq6YmAAYg8w+7Z7UPJ p9z/hypsvLnCX4q+FzSFbtR12mpeWnSS2RuPNzHhZrQynklCJ1amnQvZaivJU74US+2A T+lL7pASDAha/eyB2+wSXgd7DDDBFFQmLXTLlOvzgqea35MuNUPYj4u+lGNcIEz7/x6I 1iDA== X-Gm-Message-State: ALKqPwc1QatrhOhxzrrYTZFSnfuurB4AmINZfYp+lXIWaRrjxb2DGpQg cC95G1rJWLtffyAaP1FLTmQ= X-Google-Smtp-Source: ADUXVKLhTag8vaC7SbG9Oox30jq3xTvZ0baxwQbAGTgKcok04Vkqp8LIDqUFHsTadr71HxPYvp3/Xg== X-Received: by 2002:a62:b204:: with SMTP id x4-v6mr2086404pfe.21.1527673444138; Wed, 30 May 2018 02:44:04 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:03 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:29 +0800 Message-Id: <1527673416-31268-6-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::241 Subject: [Qemu-devel] [PATCH v4 05/12] migration: implement bi-directional RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen This patch implements a bi-directional RDMA QIOChannel. Because different threads may access the RDMA QIOChannel concurrently, this patch uses RCU to protect it. Signed-off-by: Lidong Chen --- migration/colo.c | 2 + migration/migration.c | 2 + migration/postcopy-ram.c | 2 + migration/ram.c | 4 + migration/rdma.c | 196 ++++++++++++++++++++++++++++++++++++++++------- migration/savevm.c | 3 + 6 files changed, 183 insertions(+), 26 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 4381067..88936f5 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -534,6 +534,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t value; Error *local_err = NULL; + rcu_register_thread(); qemu_sem_init(&mis->colo_incoming_sem, 0); migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE, @@ -666,5 +667,6 @@ out: } migration_incoming_exit_colo(); + rcu_unregister_thread(); return NULL; } diff --git a/migration/migration.c b/migration/migration.c index 05aec2c..6217ef1 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2008,6 +2008,7 @@ static void *source_return_path_thread(void *opaque) int res; trace_source_return_path_thread_entry(); + rcu_register_thread(); retry: while (!ms->rp_state.error && !qemu_file_get_error(rp) && @@ -2147,6 +2148,7 @@ out: trace_source_return_path_thread_end(); ms->rp_state.from_dst_file = NULL; qemu_fclose(rp); + rcu_unregister_thread(); return NULL; } diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 658b750..a5de61d 100644 ---
a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -853,6 +853,7 @@ static void *postcopy_ram_fault_thread(void *opaque) RAMBlock *rb = NULL; trace_postcopy_ram_fault_thread_entry(); + rcu_register_thread(); mis->last_rb = NULL; /* last RAMBlock we sent part of */ qemu_sem_post(&mis->fault_thread_sem); @@ -1059,6 +1060,7 @@ retry: } } } + rcu_unregister_thread(); trace_postcopy_ram_fault_thread_exit(); g_free(pfd); return NULL; diff --git a/migration/ram.c b/migration/ram.c index c53e836..85c8c39 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -678,6 +678,7 @@ static void *multifd_send_thread(void *opaque) MultiFDSendParams *p = opaque; Error *local_err = NULL; + rcu_register_thread(); if (multifd_send_initial_packet(p, &local_err) < 0) { goto out; } @@ -701,6 +702,7 @@ out: p->running = false; qemu_mutex_unlock(&p->mutex); + rcu_unregister_thread(); return NULL; } @@ -814,6 +816,7 @@ static void *multifd_recv_thread(void *opaque) { MultiFDRecvParams *p = opaque; + rcu_register_thread(); while (true) { qemu_mutex_lock(&p->mutex); if (p->quit) { @@ -828,6 +831,7 @@ static void *multifd_recv_thread(void *opaque) p->running = false; qemu_mutex_unlock(&p->mutex); + rcu_unregister_thread(); return NULL; } diff --git a/migration/rdma.c b/migration/rdma.c index 9b6da4d..45f01e6 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -86,6 +86,7 @@ static uint32_t known_capabilities = RDMA_CAPABILITY_PIN_ALL; " to abort!"); \ rdma->error_reported = 1; \ } \ + rcu_read_unlock(); \ return rdma->error_state; \ } \ } while (0) @@ -402,7 +403,8 @@ typedef struct QIOChannelRDMA QIOChannelRDMA; struct QIOChannelRDMA { QIOChannel parent; - RDMAContext *rdma; + RDMAContext *rdmain; + RDMAContext *rdmaout; QEMUFile *file; bool blocking; /* XXX we don't actually honour this yet */ }; @@ -2638,12 +2640,20 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); QEMUFile *f = rioc->file; - RDMAContext *rdma = rioc->rdma; 
+ RDMAContext *rdma; int ret; ssize_t done = 0; size_t i; size_t len = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); /* @@ -2653,6 +2663,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, ret = qemu_rdma_write_flush(f, rdma); if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -2672,6 +2683,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -2680,6 +2692,7 @@ static ssize_t qio_channel_rdma_writev(QIOChannel *ioc, } } + rcu_read_unlock(); return done; } @@ -2713,12 +2726,20 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; RDMAControlHeader head; int ret = 0; ssize_t i; size_t done = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); for (i = 0; i < niov; i++) { @@ -2730,7 +2751,7 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, * were given and dish out the bytes until we run * out of bytes. */ - ret = qemu_rdma_fill(rioc->rdma, data, want, 0); + ret = qemu_rdma_fill(rdma, data, want, 0); done += ret; want -= ret; /* Got what we needed, so go to next iovec */ @@ -2752,25 +2773,28 @@ static ssize_t qio_channel_rdma_readv(QIOChannel *ioc, if (ret < 0) { rdma->error_state = ret; + rcu_read_unlock(); return ret; } /* * SEND was received with new bytes, now try again. 
*/ - ret = qemu_rdma_fill(rioc->rdma, data, want, 0); + ret = qemu_rdma_fill(rdma, data, want, 0); done += ret; want -= ret; /* Still didn't get enough, so lets just return */ if (want) { if (done == 0) { + rcu_read_unlock(); return QIO_CHANNEL_ERR_BLOCK; } else { break; } } } + rcu_read_unlock(); return done; } @@ -2822,15 +2846,29 @@ qio_channel_rdma_source_prepare(GSource *source, gint *timeout) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; *timeout = -1; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when prepare Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2838,14 +2876,28 @@ static gboolean qio_channel_rdma_source_check(GSource *source) { QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when check Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return cond & rsource->condition; } @@ -2856,14 +2908,28 @@ qio_channel_rdma_source_dispatch(GSource *source, { QIOChannelFunc func = (QIOChannelFunc)callback; QIOChannelRDMASource *rsource = (QIOChannelRDMASource *)source; - RDMAContext *rdma = rsource->rioc->rdma; + RDMAContext *rdma; GIOCondition cond = 0; + rcu_read_lock(); + if (rsource->condition == G_IO_IN) { + rdma = 
atomic_rcu_read(&rsource->rioc->rdmain); + } else { + rdma = atomic_rcu_read(&rsource->rioc->rdmaout); + } + + if (!rdma) { + error_report("RDMAContext is NULL when dispatch Gsource"); + rcu_read_unlock(); + return FALSE; + } + if (rdma->wr_data[0].control_len) { cond |= G_IO_IN; } cond |= G_IO_OUT; + rcu_read_unlock(); return (*func)(QIO_CHANNEL(rsource->rioc), (cond & rsource->condition), user_data); @@ -2908,15 +2974,32 @@ static int qio_channel_rdma_close(QIOChannel *ioc, Error **errp) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); + RDMAContext *rdmain, *rdmaout; trace_qemu_rdma_close(); - if (rioc->rdma) { - if (!rioc->rdma->error_state) { - rioc->rdma->error_state = qemu_file_get_error(rioc->file); - } - qemu_rdma_cleanup(rioc->rdma); - g_free(rioc->rdma); - rioc->rdma = NULL; + + rdmain = rioc->rdmain; + if (rdmain) { + atomic_rcu_set(&rioc->rdmain, NULL); + } + + rdmaout = rioc->rdmaout; + if (rdmaout) { + atomic_rcu_set(&rioc->rdmaout, NULL); } + + synchronize_rcu(); + + if (rdmain) { + qemu_rdma_cleanup(rdmain); + } + + if (rdmaout) { + qemu_rdma_cleanup(rdmaout); + } + + g_free(rdmain); + g_free(rdmaout); + return 0; } @@ -2959,12 +3042,21 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, size_t size, uint64_t *bytes_sent) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int ret; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return RAM_SAVE_CONTROL_NOT_SUPP; } @@ -3049,9 +3141,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque, } } + rcu_read_unlock(); return RAM_SAVE_CONTROL_DELAYED; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3227,8 +3321,8 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) RDMAControlHeader blocks = { .type = 
RDMA_CONTROL_RAM_BLOCKS_RESULT, .repeat = 1 }; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; - RDMALocalBlocks *local = &rdma->local_ram_blocks; + RDMAContext *rdma; + RDMALocalBlocks *local; RDMAControlHeader head; RDMARegister *reg, *registers; RDMACompress *comp; @@ -3241,8 +3335,17 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque) int count = 0; int i = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); + local = &rdma->local_ram_blocks; do { trace_qemu_rdma_registration_handle_wait(); @@ -3476,6 +3579,7 @@ out: if (ret < 0) { rdma->error_state = ret; } + rcu_read_unlock(); return ret; } @@ -3489,10 +3593,18 @@ out: static int rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) { - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; int curr; int found = -1; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmain); + + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + /* Find the matching RAMBlock in our local list */ for (curr = 0; curr < rdma->local_ram_blocks.nb_blocks; curr++) { if (!strcmp(rdma->local_ram_blocks.block[curr].block_name, name)) { @@ -3503,6 +3615,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) if (found == -1) { error_report("RAMBlock '%s' not found on destination", name); + rcu_read_unlock(); return -ENOENT; } @@ -3510,6 +3623,7 @@ rdma_block_notification_handle(QIOChannelRDMA *rioc, const char *name) trace_rdma_block_notification_handle(name, rdma->next_src_index); rdma->next_src_index++; + rcu_read_unlock(); return 0; } @@ -3532,11 +3646,19 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, uint64_t flags, void *data) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; + + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); 
+ return -EIO; + } CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3544,6 +3666,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque, qemu_put_be64(f, RAM_SAVE_FLAG_HOOK); qemu_fflush(f); + rcu_read_unlock(); return 0; } @@ -3556,13 +3679,21 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, { Error *local_err = NULL, **errp = &local_err; QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(opaque); - RDMAContext *rdma = rioc->rdma; + RDMAContext *rdma; RDMAControlHeader head = { .len = 0, .repeat = 1 }; int ret = 0; + rcu_read_lock(); + rdma = atomic_rcu_read(&rioc->rdmaout); + if (!rdma) { + rcu_read_unlock(); + return -EIO; + } + CHECK_ERROR_STATE(); if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) { + rcu_read_unlock(); return 0; } @@ -3594,6 +3725,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, qemu_rdma_reg_whole_ram_blocks : NULL); if (ret < 0) { ERROR(errp, "receiving remote info!"); + rcu_read_unlock(); return ret; } @@ -3617,6 +3749,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, "not identical on both the source and destination.", local->nb_blocks, nb_dest_blocks); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } @@ -3633,6 +3766,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, local->block[i].length, rdma->dest_blocks[i].length); rdma->error_state = -EINVAL; + rcu_read_unlock(); return -EINVAL; } local->block[i].remote_host_addr = @@ -3650,9 +3784,11 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque, goto err; } + rcu_read_unlock(); return 0; err: rdma->error_state = ret; + rcu_read_unlock(); return ret; } @@ -3670,10 +3806,15 @@ static const QEMUFileHooks rdma_write_hooks = { static void qio_channel_rdma_finalize(Object *obj) { QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(obj); - if (rioc->rdma) { - qemu_rdma_cleanup(rioc->rdma); 
- g_free(rioc->rdma); - rioc->rdma = NULL; + if (rioc->rdmain) { + qemu_rdma_cleanup(rioc->rdmain); + g_free(rioc->rdmain); + rioc->rdmain = NULL; + } + if (rioc->rdmaout) { + qemu_rdma_cleanup(rioc->rdmaout); + g_free(rioc->rdmaout); + rioc->rdmaout = NULL; } } @@ -3713,13 +3854,16 @@ static QEMUFile *qemu_fopen_rdma(RDMAContext *rdma, const char *mode) } rioc = QIO_CHANNEL_RDMA(object_new(TYPE_QIO_CHANNEL_RDMA)); - rioc->rdma = rdma; if (mode[0] == 'w') { rioc->file = qemu_fopen_channel_output(QIO_CHANNEL(rioc)); + rioc->rdmaout = rdma; + rioc->rdmain = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_write_hooks); } else { rioc->file = qemu_fopen_channel_input(QIO_CHANNEL(rioc)); + rioc->rdmain = rdma; + rioc->rdmaout = rdma->return_path; qemu_file_set_hooks(rioc->file, &rdma_read_hooks); } diff --git a/migration/savevm.c b/migration/savevm.c index 4251125..90cd00f 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1621,6 +1621,7 @@ static void *postcopy_ram_listen_thread(void *opaque) qemu_sem_post(&mis->listen_thread_sem); trace_postcopy_ram_listen_thread_start(); + rcu_register_thread(); /* * Because we're a thread and not a coroutine we can't yield * in qemu_file, and thus we must be blocking now. @@ -1661,6 +1662,7 @@ static void *postcopy_ram_listen_thread(void *opaque) * to leave the guest running and fire MCEs for pages that never * arrived as a desperate recovery step. 
*/ + rcu_unregister_thread(); exit(EXIT_FAILURE); } @@ -1675,6 +1677,7 @@ static void *postcopy_ram_listen_thread(void *opaque) migration_incoming_state_destroy(); qemu_loadvm_state_cleanup(); + rcu_unregister_thread(); return NULL; } From patchwork Wed May 30 09:43:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922673 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="unmsigvR"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm6y68VSz9s1b for ; Wed, 30 May 2018 19:50:33 +1000 (AEST) Received: from localhost ([::1]:37264 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxkU-0004pK-54 for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:50:30 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40350) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeK-0000SW-Ae for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeJ-0006rI-F7 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:08 -0400 Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:33344) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 
4.71) (envelope-from ) id 1fNxeJ-0006rA-9j for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:07 -0400 Received: by mail-pg0-x244.google.com with SMTP id e21-v6so7906761pgv.0 for ; Wed, 30 May 2018 02:44:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=hOvy2JS8aIvKTdzR40q4ec4jdt9nO3bN04jPZoHKiX0=; b=unmsigvRvuO1H7hscdPE4auPpRe9RYV+L6zIjgM6Et4qAGQz0Gx/456yZsZL3FAIn7 QWzXqPrS/7T/JxPBjgNmF01jxWl71I9vdC9GzEf8G3dY7OMyR/pDgKKrb6p9mdpccHe2 jnHDuCZznakQpC5KEHhsZuN+GbXQE8znCk2o0RIwoVrBMIAu4N1FsiR46lEa+/m2mCH0 DCknt+O7HHpFuMj7aqS3Q6NGZ5ydQbCn+ICbi+ZeHZhwwa83F8Utka+qeNKlpqwNNKCj ITuf0N4C54D/c7PLwGIa8Ww9xs7TA5QCTmvi07nSqsCL/djePoX/TDmXs71hHlc/9OqE WNQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=hOvy2JS8aIvKTdzR40q4ec4jdt9nO3bN04jPZoHKiX0=; b=bwuPInsQpcjzIrGHc4TR3eiAjGiFYZBMV3iYN5ij6rhFPNTT8uw8dPuOSQg1nHgV2i GUh2ZV+Lzt8Ixkxu6e9kHhMRxZFk2gT0JamdENBcArx8h/CyxNl5iVwFwKdF0bAAOLLl ZUpLdd80Y0uVCybfrmnt6Yy9jy8xmtvzoLNcoWFTJeGzF/mPhYeM0oYveGKaQmvEg+Mh 4E3KEISbvzX2kPNEtElGXmhZAt1k2rSjwh/LlqjigRmGWqZ/qceC5ArpHK6hto8/OTMi ks5gZoYaArgTMP92ISCoKsJKmiRIhOrS9inEnyRVyEziwgN8Ox9+e5+O2S1tIXcDOvH5 ioLA== X-Gm-Message-State: ALKqPwemL3YC07ewT+qDjsxLbnngUthmYreRbqf+4z/NjCDB+umsarMM jmyNQf0Mg6lb7XUg72eDmPc= X-Google-Smtp-Source: ADUXVKIVKUUw4y6GpGkcBTEVDk+/3Mj+1cKH7NR/OmlOr5VNFRlPpB6t4bOOL2aXea+QW3ePgg5b+Q== X-Received: by 2002:a63:6501:: with SMTP id z1-v6mr1654370pgb.452.1527673446445; Wed, 30 May 2018 02:44:06 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.04 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:05 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, 
quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:30 +0800 Message-Id: <1527673416-31268-7-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v4 06/12] migration: Stop rdma yielding during incoming postcopy X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen During incoming postcopy, the destination qemu invokes qemu_rdma_wait_comp_channel in a separate thread, so do not use the rdma yield there; poll the completion channel fd instead. Signed-off-by: Lidong Chen Reviewed-by: Dr. David Alan Gilbert --- migration/rdma.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/migration/rdma.c b/migration/rdma.c index 45f01e6..0dd4033 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -1493,11 +1493,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma) * Coroutine doesn't start until migration_fd_process_incoming() * so don't yield unless we know we're running inside of a coroutine.
*/ - if (rdma->migration_started_on_destination) { + if (rdma->migration_started_on_destination && + migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) { yield_until_fd_readable(rdma->comp_channel->fd); } else { /* This is the source side, we're in a separate thread * or destination prior to migration_fd_process_incoming() + * after postcopy, the destination also in a seprate thread. * we can't yield; so we have to poll the fd. * But we need to be able to handle 'cancel' or an error * without hanging forever. From patchwork Wed May 30 09:43:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922678 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="O4jYIGkc"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm9r6BnBz9s1R for ; Wed, 30 May 2018 19:53:04 +1000 (AEST) Received: from localhost ([::1]:37281 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxmw-0006p5-GS for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:53:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40380) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeN-0000Us-1K for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:14 -0400 Received: from 
Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeL-0006sG-Ql for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:11 -0400 Received: from mail-pf0-x232.google.com ([2607:f8b0:400e:c00::232]:35408) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeL-0006rv-J0 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:09 -0400 Received: by mail-pf0-x232.google.com with SMTP id x9-v6so8752037pfm.2 for ; Wed, 30 May 2018 02:44:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=PBW95m8Mv4iQVCaF+sBliU9LxTO0gnKOMUpB8qan9BE=; b=O4jYIGkcLZiWjCId2WosELpavJUz3Ce8Qh1XmPQ7XCQ11jworlN5QF43DbqZdUwqR4 kBM9/iFIjzftF/AWl/fBz1xIGkV5LDZZw50zimiwhswbGnvjP1x0MkfdyQqicdRKX8Oi 7fchUnKL/1/+LKB0vjWldd7p0o8aTUIthnVFDoNftdNm2YwYGEZgcD0bXt6yHoHmr4/v izpxqijebL9KKOEjo62OPgpF0kzIV4EXEdtKVKvY3Lv6TV0soAc2x5NxV7PPOdvFgpaW hzz9MKL6+FDa1cPZG86ZeEsUt4fbD4lfbC3Xop2iXCChAm2f+i/lqvxR8IwLnpKRR246 YZ0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=PBW95m8Mv4iQVCaF+sBliU9LxTO0gnKOMUpB8qan9BE=; b=LsJK9NwVMm/NFukla1L8eqWw7qsdAMy8MAIkbw1gNx2fd8ePE5/VqK4v94BrlTCPmg G8wuCAxgKJWbNYzy+/s1AJ7U3W3bvc9r0gPvYaMCMrgkmiE2pRemiMZOlsR0KMWPCUJi 5WmnVqzB205+zXsvoknnfmfgzUHcZ6YEr90sfN1YVcUGuXXsyHOF1wlVum7rPRV5fDjT VxnsV7nWAeVq8TBRqNTzi0csPonXl8rDnGzFs082YdxcZT3PyyLq0AdqvEFFQyMpfjIb Jbh9zSUKV6I+mP17Ldeug9nYR+b4DZLldl2t6uNUSaj0QjIAqeeXZ50TdS4bQQgC0E6f +LUQ== X-Gm-Message-State: ALKqPwfV770tICAdtZRKRKkrasOK4hdgG52SVUUAEB/rwU3GktGvZSOK RV0qWPgXkNIzlTbzHMgV5fg= X-Google-Smtp-Source: ADUXVKL1EqRFqtVmRelnWSbJrCM4pNyEuMAD/r6A4Mz51YXR6GMTYlhTj4jsc9bK2C/TCmBaQOf8XQ== X-Received: by 2002:a63:2485:: with SMTP id k127-v6mr1657161pgk.434.1527673448706; Wed, 30 May 2018 02:44:08 -0700 (PDT) Received: from VM_93_245_centos.localdomain 
([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:08 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:31 +0800 Message-Id: <1527673416-31268-8-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::232 Subject: [Qemu-devel] [PATCH v4 07/12] migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Lidong Chen When migration is cancelled during RDMA precopy, the source qemu main thread sometimes hangs.
The backtrace is:

(gdb) bt
#0 0x00007f249eabd43d in write () from /lib64/libpthread.so.0
#1 0x00007f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, event=0x7ffe2f643dd0) at src/cma.c:2189
#2 0x00000000007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at migration/rdma.c:2296
#3 0x00000000007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) at migration/rdma.c:2999
#4 0x00000000008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at io/channel.c:273
#5 0x00000000007a8765 in channel_close (opaque=0x3bfcc30) at migration/qemu-file-channel.c:98
#6 0x00000000007a71f9 in qemu_fclose (f=0x527c000) at migration/qemu-file.c:334
#7 0x0000000000795b96 in migrate_fd_cleanup (opaque=0x3b46280) at migration/migration.c:1162
#8 0x000000000093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90
#9 0x000000000093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118
#10 0x000000000093f2ad in aio_dispatch (ctx=0x3b121c0) at util/aio-posix.c:436
#11 0x000000000093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, user_data=0x0) at util/async.c:261
#12 0x00007f249f73c7aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#13 0x000000000093dc5e in glib_pollfds_poll () at util/main-loop.c:215
#14 0x000000000093dd4e in os_host_main_loop_wait (timeout=28000000) at util/main-loop.c:263
#15 0x000000000093de05 in main_loop_wait (nonblocking=0) at util/main-loop.c:522
#16 0x00000000005bc6a5 in main_loop () at vl.c:1944
#17 0x00000000005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, envp=0x3ad0030) at vl.c:4752

After rdma_disconnect, the RDMA_CM_EVENT_DISCONNECTED event sometimes never arrives.

According to the IB spec, once the active side sends a DREQ message, it should wait for the DREP message, and only once that has arrived should it trigger a DISCONNECT event. The DREP message can be dropped due to network issues. For that case the spec defines a DREP_timeout state in the CM state machine: if the DREP is dropped we should get a timeout, and a TIMEWAIT_EXIT event will be triggered.
Unfortunately, the current kernel CM implementation doesn't include the DREP_timeout state, so in the above scenario we will get neither a DISCONNECT nor a TIMEWAIT_EXIT event. qemu_rdma_cleanup should therefore not invoke rdma_get_cm_event, which may hang forever; the event channel is destroyed in qemu_rdma_cleanup anyway. Signed-off-by: Lidong Chen Reviewed-by: Dr. David Alan Gilbert --- migration/rdma.c | 12 ++---------- migration/trace-events | 1 - 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index 0dd4033..92e4d30 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2275,8 +2275,7 @@ static int qemu_rdma_write(QEMUFile *f, RDMAContext *rdma, static void qemu_rdma_cleanup(RDMAContext *rdma) { - struct rdma_cm_event *cm_event; - int ret, idx; + int idx; if (rdma->cm_id && rdma->connected) { if ((rdma->error_state || @@ -2290,14 +2289,7 @@ static void qemu_rdma_cleanup(RDMAContext *rdma) qemu_rdma_post_send_control(rdma, NULL, &head); } - ret = rdma_disconnect(rdma->cm_id); - if (!ret) { - trace_qemu_rdma_cleanup_waiting_for_disconnect(); - ret = rdma_get_cm_event(rdma->channel, &cm_event); - if (!ret) { - rdma_ack_cm_event(cm_event); - } - } + rdma_disconnect(rdma->cm_id); trace_qemu_rdma_cleanup_disconnect(); rdma->connected = false; } diff --git a/migration/trace-events b/migration/trace-events index 3c798dd..4a768ea 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -146,7 +146,6 @@ qemu_rdma_accept_pin_state(bool pin) "%d" qemu_rdma_accept_pin_verbsc(void *verbs) "Verbs context after listen: %p" qemu_rdma_block_for_wrid_miss(const char *wcompstr, int wcomp, const char *gcompstr, uint64_t req) "A Wanted wrid %s (%d) but got %s (%" PRIu64 ")" qemu_rdma_cleanup_disconnect(void) "" -qemu_rdma_cleanup_waiting_for_disconnect(void) "" qemu_rdma_close(void) "" qemu_rdma_connect_pin_all_requested(void) "" qemu_rdma_connect_pin_all_outcome(bool pin) "%d" From patchwork Wed May 30 09:43:32 2018 Content-Type: text/plain;
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922672 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="EwvXBunC"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm3q0nPMz9s1R for ; Wed, 30 May 2018 19:47:51 +1000 (AEST) Received: from localhost ([::1]:37256 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxhs-0002qq-PB for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:47:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40408) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeQ-0000Xb-82 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:15 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeO-0006sp-6F for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:14 -0400 Received: from mail-pg0-x230.google.com ([2607:f8b0:400e:c05::230]:44726) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeO-0006sh-0O for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:12 -0400 Received: by mail-pg0-x230.google.com with SMTP id p21-v6so7899040pgd.11 for ; Wed, 30 May 2018 02:44:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=lux+N7RLSocDI3oxa3LwaDyLJuk54eW4aEDrU0k5j8k=; b=EwvXBunCyl4mtRMlmxPWyVoAGwwG85VRbBHuIU1GlLQoRXqw0GC2MUkOFl2UAxvb5N pK+aaR6CT/C2yyfeJPWM6WVq7r2B7rMROc7CfoiBgbwVCV4qzb2+5mLPqHnIHI8RC8/1 jVYt3pXNX9q9684B7Lz/tSl2Y3I5buBx0PZE6b9xLRgclHicq5acTW0klkVk3ZM7gfr0 Yi0i09nWhbuAHVwb84MaOPcwk7vWgLCQFjKyySWu37pu2bSjNEG16x1QBj4E5x2pX4Ay PTi2isjUQWvdow+oejB46AtgtT92sGcdCwL6Zm+lSTgALCZA4TPPRX626i2q0/A7z9H/ RmwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lux+N7RLSocDI3oxa3LwaDyLJuk54eW4aEDrU0k5j8k=; b=ULjVq/JXOkvIkta6TSru9sQQjlGMJ8iKQnP3c5Fy1+Z3NAMfr5ISfTfcg5VwId5EPv DgsuI1OQaqwsprHq6AMG+zzoJadlg7K6SZARpReoZjRkJbpHsWQFJ505ftYQ1AgY9iAt io7Ti8oibNeVEoRuGWb3Ng7qmxTPEQdnIRgIhi79ajMCaYkdIBkbwS/CSd+xN8CKJC5n SmyVyayHqpdAaAS212yvmSblz1kF8fdUAY75SuTQHMY21AR8tl7EjJ/XlgNJkliyJXGW SxBOScgqg2tDGCoWayaqzWCuhSW54d3XBn2oFWQcJ6hr/FGlbzbY3V0DzKvHDDpr7tRX 82mg== X-Gm-Message-State: ALKqPwfMLwMnzn2Xk/llquafAuoKfJp3uQrrJHITVcSzeCQyXSDE4lQZ nSb6wzaTplWGxV4KYXlEXVQMLQ== X-Google-Smtp-Source: ADUXVKJ+4+M+DmfcPRPWnBApqpVf/94IlgCm/FPSyixyskKJmuQwqZE3bC8QCDfatvoF+LLy/FTcsA== X-Received: by 2002:a63:6383:: with SMTP id x125-v6mr1601424pgb.277.1527673451152; Wed, 30 May 2018 02:44:11 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:10 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:32 +0800 Message-Id: <1527673416-31268-9-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: 
<1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::230 Subject: [Qemu-devel] [PATCH v4 08/12] migration: implement io_set_aio_fd_handler function for RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

From: Lidong Chen

If qio_channel_rdma_readv returns QIO_CHANNEL_ERR_BLOCK, the destination
QEMU crashes.

The backtrace is:
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x00000000008db50e in qio_channel_set_aio_fd_handler (ioc=0x38111e0, ctx=0x3726080, io_read=0x8db841 , io_write=0x0, opaque=0x38111e0) at io/channel.c:
#2 0x00000000008db952 in qio_channel_set_aio_fd_handlers (ioc=0x38111e0) at io/channel.c:438
#3 0x00000000008dbab4 in qio_channel_yield (ioc=0x38111e0, condition=G_IO_IN) at io/channel.c:47
#4 0x00000000007a870b in channel_get_buffer (opaque=0x38111e0, buf=0x440c038 "", pos=0, size=327 at migration/qemu-file-channel.c:83
#5 0x00000000007a70f6 in qemu_fill_buffer (f=0x440c000) at migration/qemu-file.c:299
#6 0x00000000007a79d0 in qemu_peek_byte (f=0x440c000, offset=0) at migration/qemu-file.c:562
#7 0x00000000007a7a22 in qemu_get_byte (f=0x440c000) at migration/qemu-file.c:575
#8 0x00000000007a7c78 in qemu_get_be32 (f=0x440c000) at migration/qemu-file.c:655
#9 0x00000000007a0508 in qemu_loadvm_state (f=0x440c000) at migration/savevm.c:2126
#10 0x0000000000794141 in process_incoming_migration_co (opaque=0x0) at migration/migration.c:366
#11 0x000000000095c598 in coroutine_trampoline (i0=84033984, i1=0) at util/coroutine-ucontext.c:1
#12 0x00007f9c0db56d40 in ?? () from /lib64/libc.so.6
#13 0x00007f96fe858760 in ?? ()
#14 0x0000000000000000 in ?? ()

The RDMA QIOChannel does not implement io_set_aio_fd_handler, so
qio_channel_set_aio_fd_handler ends up calling a NULL function pointer
(frame #0 above).

Signed-off-by: Lidong Chen
Reviewed-by: Juan Quintela
---
 migration/rdma.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index 92e4d30..dfa4f77 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2963,6 +2963,21 @@ static GSource *qio_channel_rdma_create_watch(QIOChannel *ioc,
 
     return source;
 }
+static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
+                                                AioContext *ctx,
+                                                IOHandler *io_read,
+                                                IOHandler *io_write,
+                                                void *opaque)
+{
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    if (io_read) {
+        aio_set_fd_handler(ctx, rioc->rdmain->comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+    } else {
+        aio_set_fd_handler(ctx, rioc->rdmaout->comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+    }
+}
 
 static int qio_channel_rdma_close(QIOChannel *ioc,
                                   Error **errp)
@@ -3822,6 +3837,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass,
     ioc_klass->io_set_blocking = qio_channel_rdma_set_blocking;
     ioc_klass->io_close = qio_channel_rdma_close;
     ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
+    ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
 }
 
 static const TypeInfo qio_channel_rdma_info = {

From patchwork Wed May 30 09:43:33 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922669 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail
(p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="c70yoJBa"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm0Z4lhqz9s1R for ; Wed, 30 May 2018 19:45:02 +1000 (AEST) Received: from localhost ([::1]:37239 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxfA-0000a4-7j for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:45:00 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40421) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeR-0000YW-Ab for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeQ-0006tp-FB for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:15 -0400 Received: from mail-pf0-x236.google.com ([2607:f8b0:400e:c00::236]:43742) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeQ-0006tb-9h for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:14 -0400 Received: by mail-pf0-x236.google.com with SMTP id j20-v6so8747081pff.10 for ; Wed, 30 May 2018 02:44:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; b=c70yoJBa+cc/cgPMQyUADt91a4QHWwOKywxcjHB3KaDeAFXPqruZrFtZOipIUMF0DK nl5735M052+tIWrdUJAZBnW832lG3N1LJM3uCQbCcBC3nDlRDRA7jSyxHMip38AaE8bl KhXfFhDdUzq1uwqdlJCnI/fQa8wr5vUQHyV/RQJU/pn7g+QraJxe6M5ddONiX7oo8dCs jI9C94luuO8R4i+CW9ViaLlCyjTJPDXK5jnd6+X2s3Y+ZaZ9PNGKl4XaIPLcq2c3qcL6 SyCq/IbYW5pa5RYLY622qCP/AvK/XfU3zm4mZy0czW7nBsLX3p8QyhsXnbcKx3GWfJUg KirQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=V6VZn/TLDJ0rcpWXxTwI8ayG4VlrvGWwAP//SDZcvS8=; b=lGFZ8Jm8tt6uKO8C+3nPh5932r9TieeL7aJh+p3seRp5PHRPwkjaOUqKKMbGPh4NkQ woMYxxEfPhvWdFfE5ZQW78Y0GldAKUBm//egGCI4/egwfWN54pxBpwcOPtjZ9Tltf2P/ Xk+Z+oeCjdhsQ4c1YVeniKD4oPihTdvFRM9os+6RCoYSpV3GDZEp+NaguNT5F5Jdm9VD CjqeuZkR0wCDUX6f8V6vABGaumAjTdjw4er3xohbqNmC0DOqn4q3KbX9kRYMqetSpET4 dfE3MfT5aHCgp6FJhMuZNOeYhF7LPe2rmn4pwnKrgnm6ML+reMGhBBaFKfRdbrobf6JN jXcQ== X-Gm-Message-State: ALKqPwea1wLCK/g8wjpx7c7ubR6F7hBtcKiW+l7RVQ11kmkhMKyt1oR5 jCNn8v9+n4LyqM+7Z478wHw= X-Google-Smtp-Source: ADUXVKL7vL4XJ9u+XfjhDvHd9O6LNYGF8aNdLia45jayAWP59OL4cWSPoBsHTu8Nicx67T5z/lLenw== X-Received: by 2002:a62:1549:: with SMTP id 70-v6mr2089022pfv.91.1527673453401; Wed, 30 May 2018 02:44:13 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.11 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:12 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:33 +0800 Message-Id: <1527673416-31268-10-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::236 Subject: [Qemu-devel] [PATCH v4 09/12] migration: invoke qio_channel_yield only when qemu_in_coroutine() X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lidong Chen , adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

From: Lidong Chen

When qio_channel_read returns QIO_CHANNEL_ERR_BLOCK, the source QEMU
crashes.

The backtrace is:
(gdb) bt
#0 0x00007fb20aba91d7 in raise () from /lib64/libc.so.6
#1 0x00007fb20abaa8c8 in abort () from /lib64/libc.so.6
#2 0x00007fb20aba2146 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007fb20aba21f2 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000008dba2d in qio_channel_yield (ioc=0x22f9e20, condition=G_IO_IN) at io/channel.c:460
#5 0x00000000007a870b in channel_get_buffer (opaque=0x22f9e20, buf=0x3d54038 "", pos=0, size=32768) at migration/qemu-file-channel.c:83
#6 0x00000000007a70f6 in qemu_fill_buffer (f=0x3d54000) at migration/qemu-file.c:299
#7 0x00000000007a79d0 in qemu_peek_byte (f=0x3d54000, offset=0) at migration/qemu-file.c:562
#8 0x00000000007a7a22 in qemu_get_byte (f=0x3d54000) at migration/qemu-file.c:575
#9 0x00000000007a7c46 in qemu_get_be16 (f=0x3d54000) at migration/qemu-file.c:647
#10 0x0000000000796db7 in source_return_path_thread (opaque=0x2242280) at migration/migration.c:1794
#11 0x00000000009428fa in qemu_thread_start (args=0x3e58420) at util/qemu-thread-posix.c:504
#12 0x00007fb20af3ddc5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007fb20ac6b74d in clone () from /lib64/libc.so.6

The assertion in qio_channel_yield fails because it is reached from
source_return_path_thread, which is a plain thread and not a coroutine.
Fix this by invoking qio_channel_yield only when qemu_in_coroutine() is
true, and falling back to qio_channel_wait otherwise.
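
A minimal sketch of the dispatch this patch introduces, written outside
QEMU so that it compiles on its own. Everything in it is a hypothetical
stand-in: struct channel, wait_yield, wait_blocking and wait_for_channel
are not QEMU names, and the in_coroutine flag merely models what
qemu_in_coroutine() would report for the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none of these are QEMU APIs. */
enum wait_kind { WAIT_NONE, WAIT_YIELD, WAIT_BLOCKING };

struct channel {
    bool in_coroutine;        /* models the result of qemu_in_coroutine() */
    enum wait_kind last_wait; /* records which wait path was taken */
};

/* Models qio_channel_yield(): only legal when running inside a coroutine. */
static void wait_yield(struct channel *c)
{
    assert(c->in_coroutine);
    c->last_wait = WAIT_YIELD;
}

/* Models qio_channel_wait(): a blocking wait that is safe in a plain thread. */
static void wait_blocking(struct channel *c)
{
    c->last_wait = WAIT_BLOCKING;
}

/* The pattern from the patch: choose the wait primitive by execution context. */
static void wait_for_channel(struct channel *c)
{
    if (c->in_coroutine) {
        wait_yield(c);
    } else {
        wait_blocking(c);
    }
}
```

A caller running in a plain thread (in_coroutine == false), such as the
return path thread in the backtrace, ends up in wait_blocking(); a
coroutine caller ends up in wait_yield().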

Signed-off-by: Lidong Chen
Reviewed-by: Juan Quintela
---
 migration/qemu-file-channel.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/migration/qemu-file-channel.c b/migration/qemu-file-channel.c
index e202d73..8e639eb 100644
--- a/migration/qemu-file-channel.c
+++ b/migration/qemu-file-channel.c
@@ -49,7 +49,11 @@ static ssize_t channel_writev_buffer(void *opaque,
         ssize_t len;
         len = qio_channel_writev(ioc, local_iov, nlocal_iov, NULL);
         if (len == QIO_CHANNEL_ERR_BLOCK) {
-            qio_channel_wait(ioc, G_IO_OUT);
+            if (qemu_in_coroutine()) {
+                qio_channel_yield(ioc, G_IO_OUT);
+            } else {
+                qio_channel_wait(ioc, G_IO_OUT);
+            }
             continue;
         }
         if (len < 0) {
@@ -80,7 +84,11 @@ static ssize_t channel_get_buffer(void *opaque,
         ret = qio_channel_read(ioc, (char *)buf, size, NULL);
         if (ret < 0) {
             if (ret == QIO_CHANNEL_ERR_BLOCK) {
-                qio_channel_yield(ioc, G_IO_IN);
+                if (qemu_in_coroutine()) {
+                    qio_channel_yield(ioc, G_IO_IN);
+                } else {
+                    qio_channel_wait(ioc, G_IO_IN);
+                }
             } else {
                 /* XXX handle Error * object */
                 return -EIO;

From patchwork Wed May 30 09:43:34 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922677 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="l3Y8IDMQ"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client
certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm9l1Jk9z9s1R for ; Wed, 30 May 2018 19:52:59 +1000 (AEST) Received: from localhost ([::1]:37278 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxmq-0006kb-SJ for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:52:56 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40466) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeX-0000cD-36 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeS-0006uP-Fh for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:21 -0400 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:41051) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeS-0006uD-A4 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:16 -0400 Received: by mail-pf0-x243.google.com with SMTP id v63-v6so8745120pfk.8 for ; Wed, 30 May 2018 02:44:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=tutEnQN1b8+0xuMhMGhMt4Xc3KmSHGHA0nVCLjitWMI=; b=l3Y8IDMQjQ3mef0NX3Q5X1WRf2KKIj32RUxW7wvnvnLkuls3N/5Od7D74j5rmvL6b4 c8lXv0YpyD7Bvigw+uRUpg+NpAVe7g741/VhBLutNtGIAkoF1Tyy/QpfRpFr2/W/bC7Q +WDYv4nLGEDOnOxP5R80Ym23j9PoZiA0w3DdTUlmPkrPY3txhdWxw2zMo36pZ/rrm5gN QHfzLSNzyJ08eoTDC/TmQuZZhqurEk8C6u/oGk8ntbXOIkDfXm8naZXuNIhj2ifcrE9+ ULP0152R3vsANy0TnLbpB20ldcJSZqmCKl+QEKLusNGmmqOWtUiQuqriwwkGep05yOUD dxGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=tutEnQN1b8+0xuMhMGhMt4Xc3KmSHGHA0nVCLjitWMI=; b=hMcfsVPd5Iz1lWVNFcHpQPRz4Gn3EXRARXAvy5kLMWwOk4xo9WcrASypm5tVBP6BFi 1xHeS055fQqQ/oS4PLpOy7tiv5HQvk8CXKgH71Z5rzbehDKOjtnW9FQCnyAWcGB4J9qw 
5Jwli1jxec8OoZWz9OK51lI5NwgKGFzOTXjhfr2Q0Co29w3xR3kAxGsfNArCrdr5vcox J1esD9FIBtCdunu4Ago7Wc2lvBjC/UL14HnI0iOlTupegePg/8whN0dydbsbIufxEmqe 9LN7vI1fTMBOMfUmnQvk6Pd6qhMikS7H3FMkpy6vb9zFTVuyzPX+kc0rU5pCBWzYNqOn zprw== X-Gm-Message-State: ALKqPwepP5BhJDVziSUwURn3hKRGE+FPKjOfoaDFenb+0byZFBWDrpp6 V5G7/0hF8wuZXJCBX7F/LTo= X-Google-Smtp-Source: ADUXVKJjheOnV0nnz1XH3x2RAO63Qnw3lfgQGKHGg1XwEypuvVO8VNoS5oXwl+cbmBFts2YL0LVWfg== X-Received: by 2002:a62:2394:: with SMTP id q20-v6mr2083601pfj.1.1527673455477; Wed, 30 May 2018 02:44:15 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.13 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:15 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:34 +0800 Message-Id: <1527673416-31268-11-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v4 10/12] migration: create a dedicated thread to release rdma resource X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

ibv_dereg_mr takes a long time for virtual servers with a large memory
size.

The test results (guest memory size, time spent) are:
  10GB   326ms
  20GB   699ms
  30GB  1021ms
  40GB  1387ms
  50GB  1712ms
  60GB  2034ms
  70GB  2457ms
  80GB  2807ms
  90GB  3107ms
 100GB  3474ms
 110GB  3735ms
 120GB  4064ms
 130GB  4567ms
 140GB  4886ms

This causes the guest OS to hang for a while when the migration
finishes, so create a dedicated thread to release the RDMA resources.

Signed-off-by: Lidong Chen
---
 migration/rdma.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index dfa4f77..1b9e261 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2979,12 +2979,12 @@ static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
     }
 }
 
-static int qio_channel_rdma_close(QIOChannel *ioc,
-                                  Error **errp)
+static void *qio_channel_rdma_close_thread(void *arg)
 {
-    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    QIOChannelRDMA *rioc = arg;
     RDMAContext *rdmain, *rdmaout;
-    trace_qemu_rdma_close();
+
+    rcu_register_thread();
 
     rdmain = rioc->rdmain;
     if (rdmain) {
@@ -3009,6 +3009,19 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
 
     g_free(rdmain);
     g_free(rdmaout);
+    rcu_unregister_thread();
+    return NULL;
+}
+
+static int qio_channel_rdma_close(QIOChannel *ioc,
+                                  Error **errp)
+{
+    QemuThread t;
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    trace_qemu_rdma_close();
+
+    qemu_thread_create(&t, "rdma cleanup", qio_channel_rdma_close_thread,
+                       rioc, QEMU_THREAD_DETACHED);
 
     return 0;
 }

From patchwork Wed May 30 09:43:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922679 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none)
header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Q9xkr6aY"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wmD41WrVz9s1R for ; Wed, 30 May 2018 19:55:00 +1000 (AEST) Received: from localhost ([::1]:37290 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxon-0008Aj-Tc for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:54:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40450) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeV-0000aL-GK for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeU-0006v7-M4 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:19 -0400 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:36565) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeU-0006ux-Dm for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:18 -0400 Received: by mail-pf0-x244.google.com with SMTP id w129-v6so8757264pfd.3 for ; Wed, 30 May 2018 02:44:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=PLY3XFHAd8mJL9+yVghHeSjCauckPGLwpP9im9j+9Es=; b=Q9xkr6aY803vmWI+19TLcavYr3MazWT7zDPLX3KXYYYF98omRnyPgIVRLMTNQZjsJe waKLc+HtV8Z4tkfrNPqfGqVJuwFSQtEuYG8X5X3jA5T9tsSTabUuT6GmVg+Mm3YdZ2Cb ZFCtUfotcbdNgcWNxhl20gq8KUIxxuH0KreHV/QO4zelW5lpK2vQIm65BCmAAR4gJoY7 x/HUgcZH7t+biJdol2uqJq7qiwEtAR6v2oxYhZaW8G0MGzdoEjzKcWn9kYbda8fEQFID onGwwKqGCCVCO0zXpD/KDYXw45EE1LI1kFncSNuY6vaKSGhi0owFNTVSNUbaqYbXhf9J evzw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=PLY3XFHAd8mJL9+yVghHeSjCauckPGLwpP9im9j+9Es=; b=PcpQJLsft0o2d992TeXs+G8BTFMPojksvchSdJ4p3rn1xysuuwhZ3MuemVDg76WAOb jiO5uveC5h9MNWn3ZFtmh6JhfJ5dndbJIdSWtWM7LdINTXTXDN/hKWNFPzB3heRrTJIL 5LxgObxp7aoANNrhNUS3k7OO/PU7gTQu+DhLjUSCZZB+PnqzJR6MP1FRuIVu0GnULTxz Orpvj1jDS6wy5gGzMwY9Yqury9gLCV5ME0bn2HFLwrgBimgWPbf6JmcFJhpW1x6Hujqu wmlrarw5HFZ1zkB/JZx69VS4XoiHa2/NdG594QHlWcdiF4BZ58gKgwS9loP/Q55MM/VM chng== X-Gm-Message-State: ALKqPweJpMO/kjScfs4Y7wlap7j/I15WNExwAhxtxEEQM/0qwGMJ3KnV Jj4kQ2LbcE57IJV0X4RJ6Tw= X-Google-Smtp-Source: ADUXVKKG4G+rMStNon1aeUlA10hN0CnJSrK8YpXmEUO7b0XNOj1T/GXYmAYrgMzSVMP3k4WYW1EfIw== X-Received: by 2002:a62:d11d:: with SMTP id z29-v6mr2072475pfg.246.1527673457575; Wed, 30 May 2018 02:44:17 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:17 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:35 +0800 Message-Id: <1527673416-31268-12-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v4 11/12] migration: poll the cm event while wait RDMA work request completion X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

If the peer QEMU has crashed, qemu_rdma_wait_comp_channel may loop
forever. So also poll the CM event fd, and treat the receipt of any CM
event as an error.

Signed-off-by: Lidong Chen
---
 migration/rdma.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 1b9e261..d611a06 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1489,6 +1489,9 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
  */
 static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
 {
+    struct rdma_cm_event *cm_event;
+    int ret = -1;
+
     /*
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
@@ -1504,25 +1507,35 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
          * But we need to be able to handle 'cancel' or an error
          * without hanging forever.
          */
-        while (!rdma->error_state && !rdma->received_error) {
-            GPollFD pfds[1];
+        while (!rdma->error_state && !rdma->received_error) {
+            GPollFD pfds[2];
             pfds[0].fd = rdma->comp_channel->fd;
             pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            pfds[0].revents = 0;
+
+            pfds[1].fd = rdma->channel->fd;
+            pfds[1].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            pfds[1].revents = 0;
+
             /* 0.1s timeout, should be fine for a 'cancel' */
-            switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
-            case 1: /* fd active */
-                return 0;
+            qemu_poll_ns(pfds, 2, 100 * 1000 * 1000);
 
-            case 0: /* Timeout, go around again */
-                break;
+            if (pfds[1].revents) {
+                ret = rdma_get_cm_event(rdma->channel, &cm_event);
+                if (!ret) {
+                    error_report("receive cm event while wait comp channel, "
+                                 "cm event is %d", cm_event->event);
+                    rdma_ack_cm_event(cm_event);
+                }
 
-            default: /* Error of some type -
-                      * I don't trust errno from qemu_poll_ns
-                      */
-                error_report("%s: poll failed", __func__);
+                /* consider any rdma communication event as an error */
                 return -EPIPE;
             }
 
+            if (pfds[0].revents) {
+                return 0;
+            }
+
             if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
                 /* Bail out and let the cancellation happen */
                 return -EPIPE;

From patchwork Wed May 30 09:43:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 858585 jemmy X-Patchwork-Id: 922675 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="PdOhgZ+l";
dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40wm726gR9z9s1R for ; Wed, 30 May 2018 19:50:38 +1000 (AEST) Received: from localhost ([::1]:37268 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxka-0004uz-Hv for incoming@patchwork.ozlabs.org; Wed, 30 May 2018 05:50:36 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40481) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fNxeZ-0000f5-U2 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fNxeW-0006vm-P1 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:23 -0400 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:34650) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fNxeW-0006va-H8 for qemu-devel@nongnu.org; Wed, 30 May 2018 05:44:20 -0400 Received: by mail-pl0-x241.google.com with SMTP id ay10-v6so10767801plb.1 for ; Wed, 30 May 2018 02:44:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=11VfxFQD7s5h6U1r0LB5IYQubrSVzc6pJoMQOfiI8xk=; b=PdOhgZ+lifBqtjIz/NSKWR4K7kRFG5425lf6Cq+/L9cu9fEK3CTzrDTVFC1/VLWKS8 D7iRpYO5dkoIxa9FLLfwPrMRUlcVEbzTMnxdonBsF1X/jJG6w5tzs+XLvaM3PYYih0Wa pKUfr++9rR05H4AaDxWDjBlxm7Xg4aoAUoTu4JDbBGg07wjBSPDH79WV4eI5Zk/pZ4FF 1jPEhFc0kToCBPpmyD81W5nIvJEKmcXquAqWk1hgUepQtDLZ4mbeJ0M4DzJanW14UloZ K72zV2Y+q4027isdzSBQLlQyCwWW+BLgJeQvIcGTVViMXXu5XrV1sRabw08pxOHxtad5 HKnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=11VfxFQD7s5h6U1r0LB5IYQubrSVzc6pJoMQOfiI8xk=; 
b=jaHiJDOa8olLD/lRi8qP0uLzIKrJ2gEtZcRjEoJV4n5SbnrxEzU5ciyrUN0EM6jUaQ zkZZnNiCxFHDRk/y0N0CS/ZChrHgo4mzsBAI2xrcVs8+rEkWO1+Oh0vEycfbj0TzHiTs sLgxG86j6q+blRNXzq8+Cdm2d/KVdj/LQqR3o0DWd3Fu8rqj0opHo1YwMIn4lDIZoNHX Oi2MYb1qvVKSFfwXOxABfP4TRVzG7Io5ClfmRG946H+apLIeKtOsgUlwEXRaeWt3DepA fqEwgqmz8g4g/0OjKFO4kAj+FswjC/85aJxks+OskbOB1nbeisSK81UXCCgFlm1SSmus fjcw== X-Gm-Message-State: ALKqPwf+w8J2kHjeAjA4mWUTktqBMwzJPjq/uARXVAsXtllqIsJtNp29 qw2KrL3CY/TvnwATFPX1NxFH2g== X-Google-Smtp-Source: ADUXVKI4ddPxEhhbaCAmNb+puFHlGB9+fP7xBItjRuShQpwV9vuNvPYyeZyCiHwxCzrhaxPeOvnU0Q== X-Received: by 2002:a17:902:4c88:: with SMTP id b8-v6mr2104421ple.285.1527673459683; Wed, 30 May 2018 02:44:19 -0700 (PDT) Received: from VM_93_245_centos.localdomain ([150.109.57.149]) by smtp.gmail.com with ESMTPSA id 29-v6sm60565257pfj.14.2018.05.30.02.44.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 May 2018 02:44:19 -0700 (PDT) From: Lidong Chen X-Google-Original-From: Lidong Chen To: zhang.zhanghailiang@huawei.com, quintela@redhat.com, dgilbert@redhat.com, berrange@redhat.com, aviadye@mellanox.com, pbonzini@redhat.com Date: Wed, 30 May 2018 17:43:36 +0800 Message-Id: <1527673416-31268-13-git-send-email-lidongchen@tencent.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> References: <1527673416-31268-1-git-send-email-lidongchen@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v4 12/12] migration: implement the shutdown for RDMA QIOChannel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: adido@mellanox.com, qemu-devel@nongnu.org, Lidong Chen Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel"

Because the RDMA QIOChannel does not implement the shutdown function,
if to_dst_file is put into an error state, the return path thread waits
forever, and the migration thread in turn waits for the return path
thread to exit.

The backtrace of the return path thread is:

(gdb) bt
#0 0x00007f372a76bb0f in ppoll () from /lib64/libc.so.6
#1 0x000000000071dc24 in qemu_poll_ns (fds=0x7ef7091d0580, nfds=2, timeout=100000000) at qemu-timer.c:325
#2 0x00000000006b2fba in qemu_rdma_wait_comp_channel (rdma=0xd424000) at migration/rdma.c:1501
#3 0x00000000006b3191 in qemu_rdma_block_for_wrid (rdma=0xd424000, wrid_requested=4000, byte_len=0x7ef7091d0640) at migration/rdma.c:1580
#4 0x00000000006b3638 in qemu_rdma_exchange_get_response (rdma=0xd424000, head=0x7ef7091d0720, expecting=3, idx=0) at migration/rdma.c:1726
#5 0x00000000006b3ad6 in qemu_rdma_exchange_recv (rdma=0xd424000, head=0x7ef7091d0720, expecting=3) at migration/rdma.c:1903
#6 0x00000000006b5d03 in qemu_rdma_get_buffer (opaque=0x6a57dc0, buf=0x5c80030 "", pos=8, size=32768) at migration/rdma.c:2714
#7 0x00000000006a9635 in qemu_fill_buffer (f=0x5c80000) at migration/qemu-file.c:232
#8 0x00000000006a9ecd in qemu_peek_byte (f=0x5c80000, offset=0) at migration/qemu-file.c:502
#9 0x00000000006a9f1f in qemu_get_byte (f=0x5c80000) at migration/qemu-file.c:515
#10 0x00000000006aa162 in qemu_get_be16 (f=0x5c80000) at migration/qemu-file.c:591
#11 0x00000000006a46d3 in source_return_path_thread ( opaque=0xd826a0 ) at migration/migration.c:1331
#12 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f372a77635d in clone () from /lib64/libc.so.6

The backtrace of the migration thread is:

(gdb) bt
#0 0x00007f372aa4af57 in pthread_join () from /lib64/libpthread.so.0
#1 0x00000000007d5711 in qemu_thread_join (thread=0xd826f8 ) at util/qemu-thread-posix.c:504
#2 0x00000000006a4bc5 in await_return_path_close_on_source ( ms=0xd826a0 ) at migration/migration.c:1460
#3 0x00000000006a53e4 in migration_completion (s=0xd826a0 , current_active_state=4, old_vm_running=0x7ef7089cf976, start_time=0x7ef7089cf980) at migration/migration.c:1695
#4 0x00000000006a5c54 in migration_thread (opaque=0xd826a0 ) at migration/migration.c:1837
#5 0x00007f372aa49e25 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f372a77635d in clone () from /lib64/libc.so.6

Signed-off-by: Lidong Chen
Reviewed-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index d611a06..0912b6a 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3038,6 +3038,45 @@ static int qio_channel_rdma_close(QIOChannel *ioc,
     return 0;
 }
 
+static int
+qio_channel_rdma_shutdown(QIOChannel *ioc,
+                          QIOChannelShutdown how,
+                          Error **errp)
+{
+    QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
+    RDMAContext *rdmain, *rdmaout;
+
+    rcu_read_lock();
+
+    rdmain = atomic_rcu_read(&rioc->rdmain);
+    rdmaout = atomic_rcu_read(&rioc->rdmaout);
+
+    switch (how) {
+    case QIO_CHANNEL_SHUTDOWN_READ:
+        if (rdmain) {
+            rdmain->error_state = -1;
+        }
+        break;
+    case QIO_CHANNEL_SHUTDOWN_WRITE:
+        if (rdmaout) {
+            rdmaout->error_state = -1;
+        }
+        break;
+    case QIO_CHANNEL_SHUTDOWN_BOTH:
+    default:
+        if (rdmain) {
+            rdmain->error_state = -1;
+        }
+        if (rdmaout) {
+            rdmaout->error_state = -1;
+        }
+        break;
+    }
+
+    rcu_read_unlock();
+    return 0;
+}
+
 /*
  * Parameters:
  *    @offset == 0 :
@@ -3864,6 +3903,7 @@ static void qio_channel_rdma_class_init(ObjectClass *klass,
     ioc_klass->io_close = qio_channel_rdma_close;
     ioc_klass->io_create_watch = qio_channel_rdma_create_watch;
     ioc_klass->io_set_aio_fd_handler = qio_channel_rdma_set_aio_fd_handler;
+    ioc_klass->io_shutdown = qio_channel_rdma_shutdown;
 }
 
 static const TypeInfo qio_channel_rdma_info = {
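
The shutdown hook in patch 12/12 only sets error_state; it relies on the
wait loop waking up from its 0.1s poll and re-checking that flag. The
sketch below illustrates that interaction with plain POSIX poll(). It is
a miniature under stated assumptions, not QEMU code: struct mini_ctx,
mini_shutdown and mini_wait are hypothetical names, and a real
multi-threaded version would need atomic or otherwise synchronized
access to the flag.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical miniature of RDMAContext: only what the sketch needs. */
struct mini_ctx {
    int comp_fd;     /* stands in for comp_channel->fd */
    int error_state; /* non-zero once shutdown has been requested */
};

/* Stand-in for the shutdown hook: flag the context so waiters give up. */
static void mini_shutdown(struct mini_ctx *c)
{
    c->error_state = -1;
}

/*
 * Poll comp_fd in short slices and re-check error_state between slices.
 * Returns 0 when the fd is readable, -EPIPE after shutdown or on error.
 */
static int mini_wait(struct mini_ctx *c, int slice_ms)
{
    while (!c->error_state) {
        struct pollfd pfd = { .fd = c->comp_fd, .events = POLLIN, .revents = 0 };
        int n = poll(&pfd, 1, slice_ms);

        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal, go around again */
            }
            return -EPIPE;
        }
        if (n > 0) {
            /* readable means data; anything else is a hang-up or error */
            return (pfd.revents & POLLIN) ? 0 : -EPIPE;
        }
        /* n == 0: timeout, loop and look at error_state again */
    }
    return -EPIPE;
}
```

With a short slice the waiter notices a shutdown request within one
timeout period, which is the property the return path thread needs in
order to exit instead of blocking the migration thread in pthread_join.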