From patchwork Mon Dec 13 19:46:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dumitru Ceara X-Patchwork-Id: 1567470 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Q6NnWwtX; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JCX7h0gRlz9s5P for ; Tue, 14 Dec 2021 06:46:22 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id D458360B21; Mon, 13 Dec 2021 19:46:19 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pEOldVEFJnww; Mon, 13 Dec 2021 19:46:19 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp3.osuosl.org (Postfix) with ESMTPS id 1245960B1C; Mon, 13 Dec 2021 19:46:18 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id DCE50C001E; Mon, 13 Dec 2021 19:46:17 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 0DB81C0012 for ; Mon, 13 Dec 2021 19:46:17 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id DC36B60B1D for ; Mon, 13 Dec 2021 19:46:16 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jfIvmyIUNk11 for ; Mon, 13 Dec 2021 19:46:16 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by smtp3.osuosl.org (Postfix) with ESMTPS id 3146460B1C for ; Mon, 13 Dec 2021 19:46:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1639424774; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=gfCIK2TB+LBvjTMf/zb+c7wtFEV2I7BptDf+WuCR7xM=; b=Q6NnWwtX+5sQXwd9hO80RkQFplqqECCzE6fySHM+QKzAlavJh1kPYkugHEu5YO5HVRAnfl roas2p7R93SWPrxQKHsJWeguMj0CLYTnSq77LBMZkotD/2+Btz6ii+9k5U1w9BE40oiNbP Ech8rFBXXVROTetl3kqmMXgmK5jCvg0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-151-PIk1ynPeNdCETVArx_zrmg-1; Mon, 13 Dec 2021 14:46:09 -0500 X-MC-Unique: PIk1ynPeNdCETVArx_zrmg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 50B89100CCC0; Mon, 13 Dec 2021 19:46:08 +0000 (UTC) Received: from dceara.remote.csb (unknown [10.39.194.126]) by smtp.corp.redhat.com (Postfix) with ESMTP id 53CE460C9F; Mon, 13 Dec 2021 19:46:07 +0000 (UTC) From: Dumitru Ceara To: dev@openvswitch.org Date: Mon, 13 Dec 2021 20:46:03 +0100 Message-Id: <20211213194603.32487-1-dceara@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dceara@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Cc: i.maximets@ovn.org Subject: [ovs-dev] [PATCH v2] raft: Only allow followers to snapshot. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Commit 3c2d6274bcee ("raft: Transfer leadership before creating snapshots.") made it such that raft leaders transfer leadership before snapshotting. However, there's still the case when the next leader to be is in the process of snapshotting. To avoid delays in that case too, we now explicitly allow snapshots only on followers. Cluster members will have to wait until the current election is settled before snapshotting. Given the following logs taken from an OVN_Southbound 3-server cluster during a scale test: S1 (old leader): 2021-12-10T19:07:51.226Z|00823|raft|INFO|Transferring leadership to write a snapshot. 2021-12-10T19:08:03.830Z|00824|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms 2021-12-10T19:08:03.833Z|00825|timeval|WARN|Unreasonably long 12604ms poll interval (10632ms user, 1924ms system) 2021-12-10T19:08:03.940Z|00838|raft|INFO|server 8b8d is leader for term 43 S2 (follower): 2021-12-10T19:08:00.870Z|00481|raft|INFO|server 8b8d is leader for term 43 S3 (new leader): 2021-12-10T19:07:51.242Z|01083|raft|INFO|received leadership transfer from f5c9 in term 42 2021-12-10T19:07:51.244Z|01084|raft|INFO|term 43: starting election 2021-12-10T19:08:00.805Z|01085|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms 2021-12-10T19:08:00.869Z|01100|raft|INFO|term 43: elected leader by 2+ of 3 servers We see that the leader to be (S3) receives the leadership transfer, initiates the election and immediately after starts a snapshot that takes ~9.5 seconds. During this time, S2 votes for S3 electing it as cluster leader but S3 doesn't effectively become leader until it finishes snapshotting, essentially keeping the cluster without a leader for up to ~9.5 seconds. With the current change, S3 will delay compaction and snapshotting until the election is finished. The only exception is the case of single-node clusters for which we allow the node to snapshot regardless of role. Acked-by: Han Zhou Signed-off-by: Dumitru Ceara --- V2: - Added Han's ack and his suggestion to allow single-node clusters to snapshot. --- ovsdb/raft.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ovsdb/raft.c b/ovsdb/raft.c index ce40c5bc075c..1a3447a8dd4f 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -4226,7 +4226,7 @@ raft_may_snapshot(const struct raft *raft) && !raft->leaving && !raft->left && !raft->failed - && raft->role != RAFT_LEADER + && (raft->role == RAFT_FOLLOWER || hmap_count(&raft->servers) == 1) && raft->last_applied >= raft->log_start); }