From patchwork Fri Mar 6 07:48:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Han Zhou X-Patchwork-Id: 1250116 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=silver.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=ovn.org Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48Yfs018x1z9sRY for ; Fri, 6 Mar 2020 18:49:24 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id AB756204EF; Fri, 6 Mar 2020 07:49:21 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id MTDFYrwDGu-m; Fri, 6 Mar 2020 07:49:20 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by silver.osuosl.org (Postfix) with ESMTP id 71D17204CA; Fri, 6 Mar 2020 07:49:20 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 5B2D7C1D8E; Fri, 6 Mar 2020 07:49:20 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 94305C013E for ; Fri, 6 Mar 2020 07:49:17 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id 8334D203DE for ; Fri, 6 Mar 2020 07:49:17 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CKokVxobMqZM for ; Fri, 6 Mar 2020 07:49:15 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from relay12.mail.gandi.net (relay12.mail.gandi.net [217.70.178.232]) by silver.osuosl.org (Postfix) with ESMTPS id 56CC720380 for ; Fri, 6 Mar 2020 07:49:14 +0000 (UTC) Received: from localhost.localdomain.localdomain (unknown [216.113.160.71]) (Authenticated sender: hzhou@ovn.org) by relay12.mail.gandi.net (Postfix) with ESMTPSA id 4A40B200004; Fri, 6 Mar 2020 07:49:10 +0000 (UTC) From: Han Zhou To: dev@openvswitch.org Date: Thu, 5 Mar 2020 23:48:45 -0800 Message-Id: <1583480926-99714-1-git-send-email-hzhou@ovn.org> X-Mailer: git-send-email 2.1.0 MIME-Version: 1.0 Cc: Han Zhou Subject: [ovs-dev] [PATCH 1/2] raft: Fix the problem of stuck in candidate role forever. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Sometimes a server can stay in candidate role forever, even if the server already see the new leader and handles append-requests normally. However, because of the wrong role, it appears as disconnected from cluster and so the clients are disconnected. This problem happens when 2 servers become candidates in the same term, and one of them is elected as leader in that term. It can be reproduced by the test cases added in this patch. The root cause is that the current implementation only changes role to follower when a bigger term is observed (in raft_receive_term__()). According to the RAFT paper, if another candidate becomes leader with the same term, the candidate should change to follower. This patch fixes it by changing the role to follower when leader is being updated in raft_update_leader(). Signed-off-by: Han Zhou --- ovsdb/raft.c | 19 +++++++++++++++-- tests/ovsdb-cluster.at | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 72 insertions(+), 2 deletions(-) diff --git a/ovsdb/raft.c b/ovsdb/raft.c index 4789bc4..2e1144c 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -73,7 +73,8 @@ enum raft_failure_test { FT_CRASH_BEFORE_SEND_EXEC_REQ, FT_CRASH_AFTER_SEND_EXEC_REQ, FT_CRASH_AFTER_RECV_APPEND_REQ_UPDATE, - FT_DELAY_ELECTION + FT_DELAY_ELECTION, + FT_DONT_SEND_VOTE_REQUEST }; static enum raft_failure_test failure_test; @@ -1641,6 +1642,7 @@ raft_start_election(struct raft *raft, bool leadership_transfer) } ovs_assert(raft->role != RAFT_LEADER); + raft->role = RAFT_CANDIDATE; /* If there was no leader elected since last election, we know we are * retrying now. */ @@ -1684,7 +1686,9 @@ raft_start_election(struct raft *raft, bool leadership_transfer) .leadership_transfer = leadership_transfer, }, }; - raft_send(raft, &rq); + if (failure_test != FT_DONT_SEND_VOTE_REQUEST) { + raft_send(raft, &rq); + } } /* Vote for ourselves. */ @@ -2960,6 +2964,15 @@ raft_update_leader(struct raft *raft, const struct uuid *sid) }; ignore(ovsdb_log_write_and_free(raft->log, raft_record_to_json(&r))); } + if (raft->role == RAFT_CANDIDATE) { + /* Section 3.4: While waiting for votes, a candidate may + * receive an AppendEntries RPC from another server claiming to + * be leader. If the leader’s term (included in its RPC) is at + * least as large as the candidate’s current term, then the + * candidate recognizes the leader as legitimate and returns to + * follower state. */ + raft->role = RAFT_FOLLOWER; + } return true; } @@ -4667,6 +4680,8 @@ raft_unixctl_failure_test(struct unixctl_conn *conn OVS_UNUSED, raft_reset_election_timer(raft); } } + } else if (!strcmp(test, "dont-send-vote-request")) { + failure_test = FT_DONT_SEND_VOTE_REQUEST; } else if (!strcmp(test, "clear")) { failure_test = FT_NO_TEST; unixctl_command_reply(conn, "test dismissed"); diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at index 3a0bd45..8a049bf 100644 --- a/tests/ovsdb-cluster.at +++ b/tests/ovsdb-cluster.at @@ -437,6 +437,61 @@ ovsdb_cluster_failure_test 2 2 3 crash-after-receiving-append-request-update AT_CLEANUP +AT_SETUP([OVSDB cluster - competing candidates]) +AT_KEYWORDS([ovsdb server negative unix cluster competing-candidates]) + +n=3 +schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema` +ordinal_schema > schema +AT_CHECK([ovsdb-tool '-vPATTERN:console:%c|%p|%m' create-cluster s1.db $abs_srcdir/idltest.ovsschema unix:s1.raft], [0], [], [stderr]) +cid=`ovsdb-tool db-cid s1.db` +schema_name=`ovsdb-tool schema-name $abs_srcdir/idltest.ovsschema` +for i in `seq 2 $n`; do + AT_CHECK([ovsdb-tool join-cluster s$i.db $schema_name unix:s$i.raft unix:s1.raft]) +done + +on_exit 'kill `cat *.pid`' +for i in `seq $n`; do + AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach --no-chdir --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i --remote=punix:s$i.ovsdb s$i.db]) +done +for i in `seq $n`; do + AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected]) +done + +# We need to simulate the situation when 2 candidates starts election with same +# term. +# +# Before triggering leader election, tell follower s2 don't send vote request (simulating +# vote-request lost or not handled in time), and tell follower s3 to delay +# election timer to make sure s3 doesn't send vote-request before s2 enters +# term 2. +AT_CHECK([ovs-appctl -t "`pwd`"/s2 cluster/failure-test dont-send-vote-request], [0], [ignore]) +AT_CHECK([ovs-appctl -t "`pwd`"/s3 cluster/failure-test delay-election], [0], [ignore]) + +# Restart leader, which will become follower, and both old followers will start +# election as candidate. The new follower (old leader) will vote one of them, +# and the other candidate should step back as follower as again. +kill -9 `cat s1.pid` +AT_CHECK([ovsdb-server -v -vconsole:off -vsyslog:off --detach --no-chdir --log-file=s1.log --pidfile=s1.pid --unixctl=s1 --remote=punix:s1.ovsdb s1.db]) + +# Tell s1 to delay election timer so that it won't start election before s3 +# becomes candidate. +AT_CHECK([ovs-appctl -t "`pwd`"/s1 cluster/failure-test delay-election], [0], [ignore]) + +OVS_WAIT_UNTIL([ovs-appctl -t "`pwd`"/s1 cluster/status $schema_name | grep "Term: 2"]) + +for i in `seq $n`; do + OVS_WAIT_WHILE([ovs-appctl -t "`pwd`"/s$i cluster/status $schema_name | grep "candidate"]) + AT_CHECK([ovsdb_client_wait unix:s$i.ovsdb $schema_name connected]) +done + +for i in `seq $n`; do + OVS_APP_EXIT_AND_WAIT_BY_TARGET([`pwd`/s$i], [s$i.pid]) +done + +AT_CLEANUP + + AT_BANNER([OVSDB - cluster tests]) # Torture test.