Message ID | 1582942030-31096-7-git-send-email-hzhou@ovn.org |
---|---|
State | Accepted |
Commit | 315e88cb4dd9c524ac111323f9d064678cf06a5e |
Headers | show |
Series | [ovs-dev,1/7] raft-rpc.c: Fix message format. | expand |
On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote: > When a leader handles install_snapshot reply, the next_index for > the follower should be log_start instead of log_end, because there > can be new entries added in leader's log after initiating the > install_snapshot procedure. Also, it should send all the accumulated > entries to follower in the following append-request message, instead > of sending 0 entries, to speed up the converge. > > Without this fix, there is no functional problem, but it takes > uncessary extra rounds of append-requests responsed with "inconsistency" > by follower, although finally will be converged. > > Signed-off-by: Han Zhou <hzhou@ovn.org> Thanks so much for all the fixes! I applied these to master. Let me know if you want me to backport any of them.
On Fri, Mar 6, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote: > > On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote: > > When a leader handles install_snapshot reply, the next_index for > > the follower should be log_start instead of log_end, because there > > can be new entries added in leader's log after initiating the > > install_snapshot procedure. Also, it should send all the accumulated > > entries to follower in the following append-request message, instead > > of sending 0 entries, to speed up the converge. > > > > Without this fix, there is no functional problem, but it takes > > uncessary extra rounds of append-requests responsed with "inconsistency" > > by follower, although finally will be converged. > > > > Signed-off-by: Han Zhou <hzhou@ovn.org> > > Thanks so much for all the fixes! I applied these to master. Let me > know if you want me to backport any of them. Thanks Ben! Could you help backport at least 1/7 - 4/7 to branch-2.12 and 2.13? 5/7 - 7/7 are not major problems, so I think it is ok not backporting them.
On Fri, Mar 06, 2020 at 02:47:50PM -0800, Han Zhou wrote: > On Fri, Mar 6, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote: > > > > On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote: > > > When a leader handles install_snapshot reply, the next_index for > > > the follower should be log_start instead of log_end, because there > > > can be new entries added in leader's log after initiating the > > > install_snapshot procedure. Also, it should send all the accumulated > > > entries to follower in the following append-request message, instead > > > of sending 0 entries, to speed up the converge. > > > > > > Without this fix, there is no functional problem, but it takes > > > uncessary extra rounds of append-requests responsed with "inconsistency" > > > by follower, although finally will be converged. > > > > > > Signed-off-by: Han Zhou <hzhou@ovn.org> > > > > Thanks so much for all the fixes! I applied these to master. Let me > > know if you want me to backport any of them. > > Thanks Ben! Could you help backport at least 1/7 - 4/7 to branch-2.12 and > 2.13? 5/7 - 7/7 are not major problems, so I think it is ok not backporting > them. I backported the first 4 patches to 2.13 and 2.12.
diff --git a/ovsdb/raft.c b/ovsdb/raft.c index c5c1d49..1612734 100644 --- a/ovsdb/raft.c +++ b/ovsdb/raft.c @@ -3997,8 +3997,9 @@ raft_handle_install_snapshot_reply( VLOG_INFO_RL(&rl, "cluster "CID_FMT": installed snapshot on server %s " " up to %"PRIu64":%"PRIu64, CID_ARGS(&raft->cid), s->nickname, rpy->last_term, rpy->last_index); - s->next_index = raft->log_end; - raft_send_append_request(raft, s, 0, "snapshot installed"); + s->next_index = raft->log_start; + raft_send_append_request(raft, s, raft->log_end - s->next_index, + "snapshot installed"); } /* Returns true if 'raft' has grown enough since the last snapshot that
When a leader handles install_snapshot reply, the next_index for the follower should be log_start instead of log_end, because there can be new entries added in leader's log after initiating the install_snapshot procedure. Also, it should send all the accumulated entries to follower in the following append-request message, instead of sending 0 entries, to speed up the converge. Without this fix, there is no functional problem, but it takes uncessary extra rounds of append-requests responsed with "inconsistency" by follower, although finally will be converged. Signed-off-by: Han Zhou <hzhou@ovn.org> --- ovsdb/raft.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)