Message ID | 20211213194603.32487-1-dceara@redhat.com |
---|---|
State | Accepted |
Series | [ovs-dev,v2] raft: Only allow followers to snapshot. |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On 12/13/21 20:46, Dumitru Ceara wrote:
> Commit 3c2d6274bcee ("raft: Transfer leadership before creating
> snapshots.") made it such that raft leaders transfer leadership before
> snapshotting. However, there's still the case when the next leader to
> be is in the process of snapshotting. To avoid delays in that case too,
> we now explicitly allow snapshots only on followers. Cluster members
> will have to wait until the current election is settled before
> snapshotting.
>
> Given the following logs taken from an OVN_Southbound 3-server cluster
> during a scale test:
>
> S1 (old leader):
> 2021-12-10T19:07:51.226Z|00823|raft|INFO|Transferring leadership to write a snapshot.
> 2021-12-10T19:08:03.830Z|00824|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms
> 2021-12-10T19:08:03.833Z|00825|timeval|WARN|Unreasonably long 12604ms poll interval (10632ms user, 1924ms system)
> 2021-12-10T19:08:03.940Z|00838|raft|INFO|server 8b8d is leader for term 43
>
> S2 (follower):
> 2021-12-10T19:08:00.870Z|00481|raft|INFO|server 8b8d is leader for term 43
>
> S3 (new leader):
> 2021-12-10T19:07:51.242Z|01083|raft|INFO|received leadership transfer from f5c9 in term 42
> 2021-12-10T19:07:51.244Z|01084|raft|INFO|term 43: starting election
> 2021-12-10T19:08:00.805Z|01085|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms
> 2021-12-10T19:08:00.869Z|01100|raft|INFO|term 43: elected leader by 2+ of 3 servers
>
> We see that the leader to be (S3) receives the leadership transfer,
> initiates the election and immediately after starts a snapshot that
> takes ~9.5 seconds. During this time, S2 votes for S3 electing it
> as cluster leader but S3 doesn't effectively become leader until it
> finishes snapshotting, essentially keeping the cluster without a
> leader for up to ~9.5 seconds.
>
> With the current change, S3 will delay compaction and snapshotting until
> the election is finished.
>
> The only exception is the case of single-node clusters for which we
> allow the node to snapshot regardless of role.
>
> Acked-by: Han Zhou <hzhou@ovn.org>
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

Thanks, Han and Dumitru! Applied.

Best regards, Ilya Maximets.
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index ce40c5bc075c..1a3447a8dd4f 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -4226,7 +4226,7 @@ raft_may_snapshot(const struct raft *raft)
             && !raft->leaving
             && !raft->left
             && !raft->failed
-            && raft->role != RAFT_LEADER
+            && (raft->role == RAFT_FOLLOWER || hmap_count(&raft->servers) == 1)
             && raft->last_applied >= raft->log_start);
 }
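
To make the effect of the one-line change easier to see, here is a minimal standalone sketch that evaluates the old and new conditions side by side. It is not the code in ovsdb/raft.c: `struct sketch_raft`, the `SKETCH_*` role names, the `n_servers` field (standing in for `hmap_count(&raft->servers)`), and both function names are hypothetical, and only the conditions visible in the hunk above are modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real types in ovsdb/raft.c. */
enum sketch_role { SKETCH_FOLLOWER, SKETCH_CANDIDATE, SKETCH_LEADER };

struct sketch_raft {
    bool leaving, left, failed;
    enum sketch_role role;
    size_t n_servers;       /* Assumption: models hmap_count(&raft->servers). */
    uint64_t last_applied;
    uint64_t log_start;
};

/* Conditions shared by both versions, limited to those shown in the hunk.
 * Any checks above the hunk in the real function are not modeled here. */
static bool
sketch_common_ok(const struct sketch_raft *r)
{
    return !r->leaving && !r->left && !r->failed
           && r->last_applied >= r->log_start;
}

/* Before the patch: every role except leader may snapshot, so a candidate
 * in the middle of an election is allowed to start one. */
bool
sketch_may_snapshot_old(const struct sketch_raft *r)
{
    return sketch_common_ok(r) && r->role != SKETCH_LEADER;
}

/* After the patch: only followers may snapshot, except in a single-server
 * cluster, where the role is ignored. */
bool
sketch_may_snapshot_new(const struct sketch_raft *r)
{
    return sketch_common_ok(r)
           && (r->role == SKETCH_FOLLOWER || r->n_servers == 1);
}
```

In this model, a candidate in a 3-server cluster (the situation S3 is in after "term 43: starting election") passes the old check but fails the new one, while a follower passes both and a lone server passes the new check in any role.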