From patchwork Tue Feb 23 13:15:00 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 1443477
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Tue, 23 Feb 2021 14:15:00 +0100
Message-Id: <20210223131500.53470-1-i.maximets@ovn.org>
X-Mailer: git-send-email 2.26.2
Cc: Zhen Wang, Ilya Maximets, Dumitru Ceara
Subject: [ovs-dev] [PATCH] raft: Reintroduce jsonrpc inactivity probes.

It's not enough to just have heartbeats.

RAFT heartbeats are unidirectional, i.e. the leader sends them to
followers but not the other way around. Missing heartbeats provoke
followers to start an election, but if the leader does not receive any
replies, it will not do anything as long as there is a quorum, i.e.
there are enough other servers to make decisions.

This leads to a situation where, while the TCP connection is
established, the leader will continue to blindly send messages to it.
In our case this leads to a growing send backlog. The connection will
be terminated eventually due to the excessive send backlog, but this
might take a lot of time and waste process memory. At the same time,
the 'candidate' will continue to send vote requests to the dead
connection on its side.

To fix that we need to reintroduce inactivity probes that will drop
the connection if there was no incoming traffic for a long time and
the remote server doesn't reply to the "echo" request. The probe
interval can be chosen based on the election timeout to avoid the
issues described in commit db5a066c17bd.

Fixes: db5a066c17bd ("raft: Disable RAFT jsonrpc inactivity probe.")
Signed-off-by: Ilya Maximets
Acked-by: Han Zhou
---
 ovsdb/raft.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index ea91d1fdb..0fb1420fb 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -940,6 +940,34 @@ raft_reset_ping_timer(struct raft *raft)
     raft->ping_timeout = time_msec() + raft->election_timer / 3;
 }
 
+static void
+raft_conn_update_probe_interval(struct raft *raft, struct raft_conn *r_conn)
+{
+    /* Inactivity probe will be sent if connection will remain idle for the
+     * time of an election timeout. Connection will be dropped if inactivity
+     * will last twice that time.
+     *
+     * It's not enough to just have heartbeats if connection is still
+     * established, but no packets received from the other side. Without
+     * inactivity probe follower will just try to initiate election
+     * indefinitely staying in 'candidate' role. And the leader will continue
+     * to send heartbeats to the dead connection thinking that remote server
+     * is still part of the cluster. */
+    int probe_interval = raft->election_timer + ELECTION_RANGE_MSEC;
+
+    jsonrpc_session_set_probe_interval(r_conn->js, probe_interval);
+}
+
+static void
+raft_update_probe_intervals(struct raft *raft)
+{
+    struct raft_conn *r_conn;
+
+    LIST_FOR_EACH (r_conn, list_node, &raft->conns) {
+        raft_conn_update_probe_interval(raft, r_conn);
+    }
+}
+
 static void
 raft_add_conn(struct raft *raft, struct jsonrpc_session *js,
               const struct uuid *sid, bool incoming)
@@ -954,7 +982,7 @@ raft_add_conn(struct raft *raft, struct jsonrpc_session *js,
                                               &conn->sid);
     conn->incoming = incoming;
     conn->js_seqno = jsonrpc_session_get_seqno(conn->js);
-    jsonrpc_session_set_probe_interval(js, 0);
+    raft_conn_update_probe_interval(raft, conn);
     jsonrpc_session_set_backlog_threshold(js, raft->conn_backlog_max_n_msgs,
                                               raft->conn_backlog_max_n_bytes);
 }
@@ -2804,6 +2832,7 @@ raft_update_commit_index(struct raft *raft, uint64_t new_commit_index)
                               raft->election_timer, e->election_timer);
                     raft->election_timer = e->election_timer;
                     raft->election_timer_new = 0;
+                    raft_update_probe_intervals(raft);
                 }
                 if (e->servers) {
                     /* raft_run_reconfigure() can write a new Raft entry, which can
@@ -2820,6 +2849,7 @@ raft_update_commit_index(struct raft *raft, uint64_t new_commit_index)
                 VLOG_INFO("Election timer changed from %"PRIu64" to %"PRIu64,
                           raft->election_timer, e->election_timer);
                 raft->election_timer = e->election_timer;
+                raft_update_probe_intervals(raft);
             }
         }
         /* Check if any pending command can be completed, and complete it.
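
For illustration only, the timing described in the comment of
raft_conn_update_probe_interval() can be sketched as a small standalone
model. It is not part of the patch and does not use the OVS jsonrpc or
reconnect code. All names prefixed with sketch_ are hypothetical, and
the 1000 ms range value is an assumption standing in for
ELECTION_RANGE_MSEC as defined in ovsdb/raft.c.

```c
#include <assert.h>

/* Assumed stand-in for ELECTION_RANGE_MSEC; the real constant is
 * defined in ovsdb/raft.c and may differ. */
#define SKETCH_ELECTION_RANGE_MSEC 1000

/* What the model says should happen to an idle connection. */
enum sketch_probe_action {
    SKETCH_PROBE_NONE,       /* Connection is considered healthy. */
    SKETCH_PROBE_SEND_ECHO,  /* Idle for one interval: send an "echo". */
    SKETCH_PROBE_DROP        /* Idle for two intervals: drop it. */
};

/* Mirrors the arithmetic in the patch: election timer plus range. */
static long long
sketch_probe_interval(long long election_timer_msec)
{
    assert(election_timer_msec >= 0);
    return election_timer_msec + SKETCH_ELECTION_RANGE_MSEC;
}

/* Maps the time since the last received message to an action.
 * A probe interval of 0 models disabled probes, which is what
 * jsonrpc_session_set_probe_interval(js, 0) requested before the patch. */
static enum sketch_probe_action
sketch_probe_action(long long idle_msec, long long probe_interval_msec)
{
    assert(idle_msec >= 0);
    if (probe_interval_msec <= 0) {
        return SKETCH_PROBE_NONE;
    }
    if (idle_msec >= 2 * probe_interval_msec) {
        return SKETCH_PROBE_DROP;
    }
    if (idle_msec >= probe_interval_msec) {
        return SKETCH_PROBE_SEND_ECHO;
    }
    return SKETCH_PROBE_NONE;
}
```

With an election timer of 1000 ms and the assumed range, the model
sends an echo after 2000 ms of silence and drops the connection after
4000 ms. With the interval set to 0 it never acts, which corresponds to
the stuck-connection behaviour the commit message describes.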