Message ID | 20240110192939.15220-4-frode.nordahl@canonical.com |
---|---|
State | Changes Requested |
Delegated to: | Ilya Maximets |
Headers | show |
Series | Introduce cooperative multitasking to improve OVSDB RAFT cluster operation. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 1/10/24 20:29, Frode Nordahl wrote: > The OVSDB server is mostly synchronous and single threaded. The > OVSDB RAFT storage engine operate under strict deadlines with > operational impact should the deadline be overrun. > > Register for cooperative multitasking so that long running > processing elsewhere in the program may yield to allow stable > maintenance of the cluster. > > Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> > --- > ovsdb/raft.c | 19 ++++++++++++++++++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/ovsdb/raft.c b/ovsdb/raft.c > index 8effd9ad1..6d2d9afda 100644 > --- a/ovsdb/raft.c > +++ b/ovsdb/raft.c > @@ -22,6 +22,7 @@ > #include <errno.h> > #include <unistd.h> > > +#include "cooperative-multitasking.h" > #include "hash.h" > #include "jsonrpc.h" > #include "lockfile.h" > @@ -423,6 +424,8 @@ raft_make_address_passive(const char *address_) > } > } > > +#define RAFT_TIMER_THRESHOLD(t) t / 3 > + > static struct raft * > raft_alloc(void) > { > @@ -446,6 +449,10 @@ raft_alloc(void) > raft->conn_backlog_max_n_msgs = DEFAULT_MAX_BACKLOG_N_MSGS; > raft->conn_backlog_max_n_bytes = DEFAULT_MAX_BACKLOG_N_BYTES; > > + COOPERATIVE_MULTITASKING_REGISTER( > + &raft_run, raft, RAFT_TIMER_THRESHOLD(raft->election_timer), > + "consider increasing the election timer."); We will not need to register here if we combine it with update. > + > return raft; > } > > @@ -991,12 +998,22 @@ raft_reset_election_timer(struct raft *raft) > duration += raft->election_timer; > } > raft->election_timeout = raft->election_base + duration; > + > + COOPERATIVE_MULTITASKING_UPDATE( > + &raft_run, raft, > + raft->election_base, > + RAFT_TIMER_THRESHOLD(raft->election_base + duration)); > } > > static void > raft_reset_ping_timer(struct raft *raft) > { > - raft->ping_timeout = time_msec() + raft->election_timer / 3; > + long long int now = time_msec(); > + > + raft->ping_timeout = now + RAFT_TIMER_THRESHOLD(raft->election_timer); > + > + COOPERATIVE_MULTITASKING_UPDATE( > + &raft_run, raft, now, RAFT_TIMER_THRESHOLD(raft->election_timer)); > } > > static void I'm not sure we need to update whenever the timer is firing or reset. We need to update when the raft_run() is called (to update the last_run). And we need to update when election_timer changes the value, but this is not necessary, since raft_run() is the only thing that changes the election timer, so we will always have an updated value at the end of a function. Updating when the timers are reset is not very intuitive, also these timers will constantly modify the value on every RAFT communication round, which is a little hard to track. We just need to ensure that raft_run() is called a few times within the span of election_timer, we do not care much about exact timestamps. Does that make sense? Best regards, Ilya Maximets.
diff --git a/ovsdb/raft.c b/ovsdb/raft.c index 8effd9ad1..6d2d9afda 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -22,6 +22,7 @@ #include <errno.h> #include <unistd.h> +#include "cooperative-multitasking.h" #include "hash.h" #include "jsonrpc.h" #include "lockfile.h" @@ -423,6 +424,8 @@ raft_make_address_passive(const char *address_) } } +#define RAFT_TIMER_THRESHOLD(t) t / 3 + static struct raft * raft_alloc(void) { @@ -446,6 +449,10 @@ raft_alloc(void) raft->conn_backlog_max_n_msgs = DEFAULT_MAX_BACKLOG_N_MSGS; raft->conn_backlog_max_n_bytes = DEFAULT_MAX_BACKLOG_N_BYTES; + COOPERATIVE_MULTITASKING_REGISTER( + &raft_run, raft, RAFT_TIMER_THRESHOLD(raft->election_timer), + "consider increasing the election timer."); + return raft; } @@ -991,12 +998,22 @@ raft_reset_election_timer(struct raft *raft) duration += raft->election_timer; } raft->election_timeout = raft->election_base + duration; + + COOPERATIVE_MULTITASKING_UPDATE( + &raft_run, raft, + raft->election_base, + RAFT_TIMER_THRESHOLD(raft->election_base + duration)); } static void raft_reset_ping_timer(struct raft *raft) { - raft->ping_timeout = time_msec() + raft->election_timer / 3; + long long int now = time_msec(); + + raft->ping_timeout = now + RAFT_TIMER_THRESHOLD(raft->election_timer); + + COOPERATIVE_MULTITASKING_UPDATE( + &raft_run, raft, now, RAFT_TIMER_THRESHOLD(raft->election_timer)); } static void
The OVSDB server is mostly synchronous and single threaded. The OVSDB RAFT storage engine operate under strict deadlines with operational impact should the deadline be overrun. Register for cooperative multitasking so that long running processing elsewhere in the program may yield to allow stable maintenance of the cluster. Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> --- ovsdb/raft.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)