diff mbox series

[ovs-dev,7/7] raft: Fix next_index in install_snapshot reply handling.

Message ID 1582942030-31096-7-git-send-email-hzhou@ovn.org
State Accepted
Commit 315e88cb4dd9c524ac111323f9d064678cf06a5e
Headers show
Series [ovs-dev,1/7] raft-rpc.c: Fix message format. | expand

Commit Message

Han Zhou Feb. 29, 2020, 2:07 a.m. UTC
When a leader handles install_snapshot reply, the next_index for
the follower should be log_start instead of log_end, because there
can be new entries added in leader's log after initiating the
install_snapshot procedure.  Also, it should send all the accumulated
entries to follower in the following append-request message, instead
of sending 0 entries, to speed up the converge.

Without this fix, there is no functional problem, but it takes
uncessary extra rounds of append-requests responsed with "inconsistency"
by follower, although finally will be converged.

Signed-off-by: Han Zhou <hzhou@ovn.org>
---
 ovsdb/raft.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Ben Pfaff March 6, 2020, 10:31 p.m. UTC | #1
On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote:
> When a leader handles install_snapshot reply, the next_index for
> the follower should be log_start instead of log_end, because there
> can be new entries added in leader's log after initiating the
> install_snapshot procedure.  Also, it should send all the accumulated
> entries to follower in the following append-request message, instead
> of sending 0 entries, to speed up the converge.
> 
> Without this fix, there is no functional problem, but it takes
> uncessary extra rounds of append-requests responsed with "inconsistency"
> by follower, although finally will be converged.
> 
> Signed-off-by: Han Zhou <hzhou@ovn.org>

Thanks so much for all the fixes!  I applied these to master.  Let me
know if you want me to backport any of them.
Han Zhou March 6, 2020, 10:47 p.m. UTC | #2
On Fri, Mar 6, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote:
>
> On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote:
> > When a leader handles install_snapshot reply, the next_index for
> > the follower should be log_start instead of log_end, because there
> > can be new entries added in leader's log after initiating the
> > install_snapshot procedure.  Also, it should send all the accumulated
> > entries to follower in the following append-request message, instead
> > of sending 0 entries, to speed up the converge.
> >
> > Without this fix, there is no functional problem, but it takes
> > uncessary extra rounds of append-requests responsed with "inconsistency"
> > by follower, although finally will be converged.
> >
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
>
> Thanks so much for all the fixes!  I applied these to master.  Let me
> know if you want me to backport any of them.

Thanks Ben! Could you help backport at least 1/7 - 4/7 to branch-2.12 and
2.13? 5/7 - 7/7 are not major problems, so I think it is ok not backporting
them.
Ben Pfaff March 6, 2020, 11:01 p.m. UTC | #3
On Fri, Mar 06, 2020 at 02:47:50PM -0800, Han Zhou wrote:
> On Fri, Mar 6, 2020 at 2:32 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Fri, Feb 28, 2020 at 06:07:10PM -0800, Han Zhou wrote:
> > > When a leader handles install_snapshot reply, the next_index for
> > > the follower should be log_start instead of log_end, because there
> > > can be new entries added in leader's log after initiating the
> > > install_snapshot procedure.  Also, it should send all the accumulated
> > > entries to follower in the following append-request message, instead
> > > of sending 0 entries, to speed up the converge.
> > >
> > > Without this fix, there is no functional problem, but it takes
> > > uncessary extra rounds of append-requests responsed with "inconsistency"
> > > by follower, although finally will be converged.
> > >
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> > Thanks so much for all the fixes!  I applied these to master.  Let me
> > know if you want me to backport any of them.
> 
> Thanks Ben! Could you help backport at least 1/7 - 4/7 to branch-2.12 and
> 2.13? 5/7 - 7/7 are not major problems, so I think it is ok not backporting
> them.

I backported the first 4 patches to 2.13 and 2.12.
diff mbox series

Patch

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index c5c1d49..1612734 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -3997,8 +3997,9 @@  raft_handle_install_snapshot_reply(
     VLOG_INFO_RL(&rl, "cluster "CID_FMT": installed snapshot on server %s "
                  " up to %"PRIu64":%"PRIu64, CID_ARGS(&raft->cid),
                  s->nickname, rpy->last_term, rpy->last_index);
-    s->next_index = raft->log_end;
-    raft_send_append_request(raft, s, 0, "snapshot installed");
+    s->next_index = raft->log_start;
+    raft_send_append_request(raft, s, raft->log_end - s->next_index,
+                             "snapshot installed");
 }
 
 /* Returns true if 'raft' has grown enough since the last snapshot that