From patchwork Sat May 23 17:34:12 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 1296758
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Sat, 23 May 2020 19:34:12 +0200
Message-Id: <20200523173412.477681-1-i.maximets@ovn.org>
X-Mailer: git-send-email 2.25.4
Cc: Han Zhou, Ilya Maximets
Subject: [ovs-dev] [PATCH] raft: Avoid sending equal snapshots.

Snapshots are huge. In some cases we could receive several outdated
append replies from the remote server. This could happen in high-scale
cases if the remote server is overloaded and not able to process all the
raft requests in time. In response to each outdated append reply we send
a full database snapshot. While the remote server is already overloaded,
those snapshots will get stuck in the jsonrpc backlog for a long time,
making it grow up to a few GB. Since the remote server wasn't able to
process incoming messages in a timely manner, it will likely not be able
to process the snapshots either, leading to the same situation with low
chances to recover.
The remote server will likely get stuck in the 'candidate' state, while
other servers will grow their memory consumption due to growing jsonrpc
backlogs:

jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:192.16.0.3:6644,
             num of msgs: 3795, backlog: 8838994624.

This patch tries to avoid that situation by not sending equal snapshot
install requests. This helps maintain reasonable memory consumption and
allows the cluster to recover on a larger scale.

Signed-off-by: Ilya Maximets
Acked-by: Han Zhou
---

I'm not an expert in this code, so there might be a better way to track
equal snapshot installation requests. Suggestions are welcome.

 ovsdb/raft-private.c |  1 +
 ovsdb/raft-private.h |  4 ++++
 ovsdb/raft.c         | 39 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/ovsdb/raft-private.c b/ovsdb/raft-private.c
index 26d39a087..9468fdaf4 100644
--- a/ovsdb/raft-private.c
+++ b/ovsdb/raft-private.c
@@ -137,6 +137,7 @@ raft_server_destroy(struct raft_server *s)
     if (s) {
         free(s->address);
         free(s->nickname);
+        free(s->last_install_snapshot_request);
         free(s);
     }
 }
diff --git a/ovsdb/raft-private.h b/ovsdb/raft-private.h
index ac8656d42..1f366b4ab 100644
--- a/ovsdb/raft-private.h
+++ b/ovsdb/raft-private.h
@@ -27,6 +27,7 @@
 
 struct ds;
 struct ovsdb_parser;
+struct raft_install_snapshot_request;
 
 /* Formatting server IDs and cluster IDs for use in human-readable logs. Do
  * not use these in cases where the whole server or cluster ID is needed; use
@@ -83,6 +84,9 @@ struct raft_server {
     bool replied;            /* Reply to append_request was received from this
                                 node during current election_timeout interval.
                                 */
+    /* Copy of the last install_snapshot_request sent to this server. */
+    struct raft_install_snapshot_request *last_install_snapshot_request;
+
     /* For use in adding and removing servers: */
     struct uuid requester_sid;  /* Nonzero if requested via RPC. */
     struct unixctl_conn *requester_conn; /* Only if requested via unixctl. */
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index 515eadab3..708b0624c 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -1421,8 +1421,20 @@ raft_conn_run(struct raft *raft, struct raft_conn *conn)
     jsonrpc_session_run(conn->js);
 
     unsigned int new_seqno = jsonrpc_session_get_seqno(conn->js);
-    bool just_connected = (new_seqno != conn->js_seqno
+    bool reconnected = new_seqno != conn->js_seqno;
+    bool just_connected = (reconnected
                            && jsonrpc_session_is_connected(conn->js));
+
+    if (reconnected) {
+        /* Clear 'last_install_snapshot_request' since it might not reach the
+         * destination or server was restarted. */
+        struct raft_server *server = raft_find_server(raft, &conn->sid);
+        if (server) {
+            free(server->last_install_snapshot_request);
+            server->last_install_snapshot_request = NULL;
+        }
+    }
+
     conn->js_seqno = new_seqno;
     if (just_connected) {
         if (raft->joining) {
@@ -3296,6 +3308,31 @@ raft_send_install_snapshot_request(struct raft *raft,
             .election_timer = raft->election_timer, /* use latest value */
         }
     };
+
+    if (s->last_install_snapshot_request) {
+        struct raft_install_snapshot_request *old, *new;
+
+        old = s->last_install_snapshot_request;
+        new = &rpc.install_snapshot_request;
+        if (   old->term           == new->term
+            && old->last_index     == new->last_index
+            && old->last_term      == new->last_term
+            && old->last_servers   == new->last_servers
+            && old->data           == new->data
+            && old->election_timer == new->election_timer
+            && uuid_equals(&old->last_eid, &new->last_eid)) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
+
+            VLOG_WARN_RL(&rl, "not sending exact same install_snapshot_request"
+                              " to server %s again", s->nickname);
+            return;
+        }
+    }
+    free(s->last_install_snapshot_request);
+    CONST_CAST(struct raft_server *, s)->last_install_snapshot_request
+        = xmemdup(&rpc.install_snapshot_request,
+                  sizeof rpc.install_snapshot_request);
+
     raft_send(raft, &rpc);
 }
 
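
For readers who want to try the idea outside of the OVS tree, the
approach taken in the diff above can be sketched as a small standalone
C program. This is only an illustration under stated assumptions, not
the OVS implementation: 'struct snapshot_request', 'struct peer',
'peer_send_snapshot()' and 'peer_reconnected()' are hypothetical
stand-ins for the raft structures and functions, and "sending" is
reduced to bumping a counter. As in the patch, the cached copy is
shallow and the snapshot payload is compared by pointer, not by content.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an install-snapshot request.
 * 'data' points at the sender's current snapshot and is not owned here. */
struct snapshot_request {
    uint64_t term;
    uint64_t last_index;
    uint64_t last_term;
    const void *data;
};

/* Hypothetical stand-in for the per-server state kept by the sender. */
struct peer {
    struct snapshot_request *last_sent; /* Owned shallow copy, or NULL. */
    unsigned int n_sent;                /* Requests actually "sent". */
};

static bool
snapshot_request_equals(const struct snapshot_request *a,
                        const struct snapshot_request *b)
{
    return (a->term == b->term
            && a->last_index == b->last_index
            && a->last_term == b->last_term
            && a->data == b->data);     /* Pointer identity, not content. */
}

/* Returns true if the request was "sent", false if it was suppressed
 * because it equals the last request recorded for this peer. */
static bool
peer_send_snapshot(struct peer *peer, const struct snapshot_request *rq)
{
    assert(peer && rq);

    if (peer->last_sent && snapshot_request_equals(peer->last_sent, rq)) {
        return false;
    }

    struct snapshot_request *copy = malloc(sizeof *copy);
    if (!copy) {
        abort();
    }
    *copy = *rq;                        /* Shallow copy of the request. */

    free(peer->last_sent);
    peer->last_sent = copy;
    peer->n_sent++;                     /* A real sender would transmit here. */
    return true;
}

/* Forget the cached request: after a reconnect the previous request may
 * never have reached the peer, so the next one must not be suppressed. */
static void
peer_reconnected(struct peer *peer)
{
    free(peer->last_sent);
    peer->last_sent = NULL;
}
```

With this sketch, a second identical request to the same peer is
dropped, a request with a different 'last_index' goes through, and a
call to 'peer_reconnected()' makes the next request go through even if
it is identical to the previous one.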