Message ID | 20240411234504.2913083-2-i.maximets@ovn.org |
---|---|
State | Accepted |
Commit | d7f2150ea83a02336adbf0bf48f160664f11ce8f |
Delegated to: | Ilya Maximets |
Headers | show |
Series | ovsdb: raft: Fixes for probe interval updates. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/intel-ovs-compilation | fail | test: fail |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On Thu, Apr 11, 2024 at 7:44 PM Ilya Maximets <i.maximets@ovn.org> wrote: > > Inactivity probe interval on RAFT connections depend on a value of the > election timer. However, the actual value is not known until the > database snapshot with the RAFT information is received by a joining > server. New joining server is using a default 1 second until then. > > In case a new joining server is trying to join an existing cluster > with a large database, it may take more than a second to generate and > send an initial database snapshot. This is causing an inability to > actually join this cluster. Joining server sends ADD_SERVER request, > waits 1 second, sends a probe, doesn't get a reply within another > second, because the leader is busy preparing and sending an initial > snapshot to it, disconnects, repeat. > > This is not an issue for the servers that did already join, since > their probe intervals are larger than election timeout. > Cooperative multitasking also doesn't fully solve this issue, since > it depends on election timer, which is likely higher in the existing > cluster with a very big database. > > Fix that by using the maximum election timer value for inactivity > probes until the actual value is known. We still shouldn't completely > disable the probes, because in the rare event the connection is > established but the other side silently goes away, we still want to > disconnect and try to re-establish the connection eventually. > > Since probe intervals also depend on the joining state now, update > them when the server joins the cluster. > > Fixes: 14b2b0aad7ae ("raft: Reintroduce jsonrpc inactivity probes.") > Reported-by: Terry Wilson <twilson@redhat.com> > Reported-at: https://issues.redhat.com/browse/FDP-144 > Signed-off-by: Ilya Maximets <i.maximets@ovn.org> > --- Acked-by: Mike Pattrick <mkp@redhat.com>
diff --git a/ovsdb/raft.c b/ovsdb/raft.c index d81a1758a..083ebf66a 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -1018,8 +1018,13 @@ raft_conn_update_probe_interval(struct raft *raft, struct raft_conn *r_conn) * inactivity probe follower will just try to initiate election * indefinitely staying in 'candidate' role. And the leader will continue * to send heartbeats to the dead connection thinking that remote server - * is still part of the cluster. */ - int probe_interval = raft->election_timer + ELECTION_RANGE_MSEC; + * is still part of the cluster. + * + * While joining, the real value of the election timeout is not known to + * this server, so using the maximum. */ + int probe_interval = (raft->joining ? ELECTION_MAX_MSEC + : raft->election_timer) + + ELECTION_RANGE_MSEC; jsonrpc_session_set_probe_interval(r_conn->js, probe_interval); } @@ -2820,6 +2825,13 @@ raft_send_heartbeats(struct raft *raft) raft_reset_ping_timer(raft); } +static void +raft_join_complete(struct raft *raft) +{ + raft->joining = false; + raft_update_probe_intervals(raft); +} + /* Initializes the fields in 's' that represent the leader's view of the * server. */ static void @@ -2866,7 +2878,7 @@ raft_become_leader(struct raft *raft) * we're becoming a cluster leader without receiving reply for a * join request and will commit addition of this server ourselves. */ VLOG_INFO_RL(&rl, "elected as leader while joining"); - raft->joining = false; + raft_join_complete(raft); } struct raft_server *s; @@ -3101,7 +3113,7 @@ raft_update_commit_index(struct raft *raft, uint64_t new_commit_index) "added to configuration without reply " "(eid: "UUID_FMT", commit index: %"PRIu64")", UUID_ARGS(&e->eid), index); - raft->joining = false; + raft_join_complete(raft); } } raft_servers_destroy(&servers); @@ -4049,7 +4061,7 @@ raft_handle_add_server_reply(struct raft *raft, } if (rpy->success) { - raft->joining = false; + raft_join_complete(raft); /* It is tempting, at this point, to check that this server is part of * the current configuration. However, this is not necessarily the
Inactivity probe interval on RAFT connections depend on a value of the election timer. However, the actual value is not known until the database snapshot with the RAFT information is received by a joining server. New joining server is using a default 1 second until then. In case a new joining server is trying to join an existing cluster with a large database, it may take more than a second to generate and send an initial database snapshot. This is causing an inability to actually join this cluster. Joining server sends ADD_SERVER request, waits 1 second, sends a probe, doesn't get a reply within another second, because the leader is busy preparing and sending an initial snapshot to it, disconnects, repeat. This is not an issue for the servers that did already join, since their probe intervals are larger than election timeout. Cooperative multitasking also doesn't fully solve this issue, since it depends on election timer, which is likely higher in the existing cluster with a very big database. Fix that by using the maximum election timer value for inactivity probes until the actual value is known. We still shouldn't completely disable the probes, because in the rare event the connection is established but the other side silently goes away, we still want to disconnect and try to re-establish the connection eventually. Since probe intervals also depend on the joining state now, update them when the server joins the cluster. Fixes: 14b2b0aad7ae ("raft: Reintroduce jsonrpc inactivity probes.") Reported-by: Terry Wilson <twilson@redhat.com> Reported-at: https://issues.redhat.com/browse/FDP-144 Signed-off-by: Ilya Maximets <i.maximets@ovn.org> --- ovsdb/raft.c | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-)