From patchwork Mon Dec 13 11:03:14 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dumitru Ceara
X-Patchwork-Id: 1567244
From: Dumitru Ceara
To: dev@openvswitch.org
Date: Mon, 13 Dec 2021 12:03:14 +0100
Message-Id: <20211213110314.13721-1-dceara@redhat.com>
Cc: i.maximets@ovn.org
Subject: [ovs-dev] [PATCH] raft: Only allow followers to snapshot.

Commit 3c2d6274bcee ("raft: Transfer leadership before creating
snapshots.") made it such that raft leaders transfer leadership before
snapshotting.  However, there's still the case when the next leader to
be is in the process of snapshotting.  To avoid delays in that case too,
we now explicitly allow snapshots only on followers.  Cluster members
will have to wait until the current election is settled before
snapshotting.

Given the following logs taken from an OVN_Southbound 3-server cluster
during a scale test:

S1 (old leader):
  2021-12-10T19:07:51.226Z|00823|raft|INFO|Transferring leadership to write a snapshot.
  2021-12-10T19:08:03.830Z|00824|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms
  2021-12-10T19:08:03.833Z|00825|timeval|WARN|Unreasonably long 12604ms poll interval (10632ms user, 1924ms system)
  2021-12-10T19:08:03.940Z|00838|raft|INFO|server 8b8d is leader for term 43

S2 (follower):
  2021-12-10T19:08:00.870Z|00481|raft|INFO|server 8b8d is leader for term 43

S3 (new leader):
  2021-12-10T19:07:51.242Z|01083|raft|INFO|received leadership transfer from f5c9 in term 42
  2021-12-10T19:07:51.244Z|01084|raft|INFO|term 43: starting election
  2021-12-10T19:08:00.805Z|01085|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms
  2021-12-10T19:08:00.869Z|01100|raft|INFO|term 43: elected leader by 2+ of 3 servers

We see that the leader to be (S3) receives the leadership transfer,
initiates the election, and immediately afterwards starts a snapshot
that takes ~9.5 seconds.  During this time, S2 votes for S3, electing
it as cluster leader, but S3 doesn't effectively become leader until it
finishes snapshotting, essentially keeping the cluster without a leader
for up to ~9.5 seconds.

With the current change, S3 will delay compaction and snapshotting until
the election is finished.

Signed-off-by: Dumitru Ceara
Acked-by: Han Zhou
---
 ovsdb/raft.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index ce40c5bc075c..6ffcb21db1e2 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -4226,7 +4226,7 @@ raft_may_snapshot(const struct raft *raft)
             && !raft->leaving
             && !raft->left
             && !raft->failed
-            && raft->role != RAFT_LEADER
+            && raft->role == RAFT_FOLLOWER
             && raft->last_applied >= raft->log_start);
 }
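
For illustration only, the effect of the one-line change to
raft_may_snapshot() can be sketched outside of ovsdb/raft.c.  The struct
below is a hypothetical stand-in that models only the fields visible in
the hunk (any conditions above the hunk are omitted), and it assumes a
third role, RAFT_CANDIDATE, for a server that has started an election,
as S3 did in the logs.  The names raft_sketch, may_snapshot_before,
may_snapshot_after and snapshot_gate_example are made up for this
sketch and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed set of roles; only RAFT_LEADER and RAFT_FOLLOWER appear in
 * the diff, RAFT_CANDIDATE is an assumption of this sketch. */
enum raft_role {
    RAFT_FOLLOWER,
    RAFT_CANDIDATE,
    RAFT_LEADER
};

/* Hypothetical, reduced stand-in for 'struct raft': just the members
 * that the lines shown in the hunk read. */
struct raft_sketch {
    bool leaving;
    bool left;
    bool failed;
    enum raft_role role;
    uint64_t last_applied;
    uint64_t log_start;
};

/* Role check as it was before the patch: any non-leader may snapshot. */
bool
may_snapshot_before(const struct raft_sketch *raft)
{
    return (!raft->leaving
            && !raft->left
            && !raft->failed
            && raft->role != RAFT_LEADER
            && raft->last_applied >= raft->log_start);
}

/* Role check as it is after the patch: only followers may snapshot. */
bool
may_snapshot_after(const struct raft_sketch *raft)
{
    return (!raft->leaving
            && !raft->left
            && !raft->failed
            && raft->role == RAFT_FOLLOWER
            && raft->last_applied >= raft->log_start);
}

/* Usage: a server in the middle of an election passed the old check
 * and is rejected by the new one until it is a follower again. */
void
snapshot_gate_example(void)
{
    struct raft_sketch s3 = {
        .role = RAFT_CANDIDATE,
        .last_applied = 100,    /* Arbitrary illustrative indexes. */
        .log_start = 10,
    };

    assert(may_snapshot_before(&s3));
    assert(!may_snapshot_after(&s3));

    s3.role = RAFT_FOLLOWER;
    assert(may_snapshot_after(&s3));
}
```

Under these assumptions the two checks differ only for a server that is
neither leader nor follower, which is the window described in the
commit message.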