| Message ID | 20210629112035.17402-1-dceara@redhat.com |
|---|---|
| State | Changes Requested |
| Series | [ovs-dev] reconnect: Add graceful reconnect. |

| Context | Check | Description |
|---|---|---|
| ovsrobot/apply-robot | success | apply and check: success |
| ovsrobot/github-robot | success | github build: passed |
On Tue, Jun 29, 2021 at 01:20:35PM +0200, Dumitru Ceara wrote:
> Until now clients that needed to reconnect immediately could only use
> reconnect_force_reconnect().  However, reconnect_force_reconnect()
> doesn't reset the backoff for connections that were alive long enough
> (more than backoff seconds).
>
> Moreover, the reconnect library cannot determine the exact reason why a
> client wishes to initiate a reconnection.  In most cases reconnection
> happens because of a fatal error when communicating with the remote,
> e.g., in the ovsdb-cs layer, when invalid messages are received from
> ovsdb-server.  In such cases it makes sense to not reset the backoff
> because the remote seems to be unhealthy.
>
> There are however cases when reconnection is needed for other reasons.
> One such example is when ovsdb-clients require "leader-only" connections
> to clustered ovsdb-server databases.  Whenever the client determines
> that the remote is not a leader anymore, it decides to reconnect to a
> new remote from its list, searching for the new leader.  Using
> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will
> not reset backoff even though the former leader is still likely in good
> shape.
>
> Since 3c2d6274bcee ("raft: Transfer leadership before creating
> snapshots.") leadership changes inside the clustered database happen
> more often and therefore "leader-only" clients need to reconnect more
> often too.  Not resetting the backoff every time a leadership change
> happens will cause all reconnections to happen with the maximum backoff
> (8 seconds) resulting in significant latency.
>
> This commit also updates the Python reconnect and IDL implementations
> and adds tests for force-reconnect and graceful-reconnect.
>
> Reported-at: https://bugzilla.redhat.com/1977264
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>

I only glanced over this, but my reaction is good.  Thank you for adding
tests and writing such a thorough rationale!
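The latency problem described in the commit message can be sketched in a few lines of Python (illustrative only; the constants mirror the 1s minimum and 8s maximum backoff mentioned above, not the actual reconnect code):

```python
# Exponential backoff that is never reset: after a few disconnects, every
# later reconnection waits the maximum backoff.
MIN_BACKOFF, MAX_BACKOFF = 1, 8  # seconds

def next_backoff(backoff):
    # Double on each disconnect, capped at the maximum.
    return min(backoff * 2, MAX_BACKOFF) if backoff else MIN_BACKOFF

waits, backoff = [], 0
for _ in range(5):               # five leadership changes in a row
    backoff = next_backoff(backoff)
    waits.append(backoff)
print(waits)                     # [1, 2, 4, 8, 8]: settles at 8 s per reconnect
```

Once the backoff saturates, every "leader-only" reconnection pays the full 8-second wait, which is the latency the patch aims to avoid.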
On 6/29/21 1:20 PM, Dumitru Ceara wrote:
> Until now clients that needed to reconnect immediately could only use
> reconnect_force_reconnect().  However, reconnect_force_reconnect()
> doesn't reset the backoff for connections that were alive long enough
> (more than backoff seconds).
>
> Moreover, the reconnect library cannot determine the exact reason why a
> client wishes to initiate a reconnection.  In most cases reconnection
> happens because of a fatal error when communicating with the remote,
> e.g., in the ovsdb-cs layer, when invalid messages are received from
> ovsdb-server.  In such cases it makes sense to not reset the backoff
> because the remote seems to be unhealthy.
>
> There are however cases when reconnection is needed for other reasons.
> One such example is when ovsdb-clients require "leader-only" connections
> to clustered ovsdb-server databases.  Whenever the client determines
> that the remote is not a leader anymore, it decides to reconnect to a
> new remote from its list, searching for the new leader.  Using
> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will
> not reset backoff even though the former leader is still likely in good
> shape.
>
> Since 3c2d6274bcee ("raft: Transfer leadership before creating
> snapshots.") leadership changes inside the clustered database happen
> more often and therefore "leader-only" clients need to reconnect more
> often too.  Not resetting the backoff every time a leadership change
> happens will cause all reconnections to happen with the maximum backoff
> (8 seconds) resulting in significant latency.
>
> This commit also updates the Python reconnect and IDL implementations
> and adds tests for force-reconnect and graceful-reconnect.
>
> Reported-at: https://bugzilla.redhat.com/1977264
> Signed-off-by: Dumitru Ceara <dceara@redhat.com>
> ---

Hi, Dumitru.

Thanks for working on this issue.  I've seen it in practice while running
OVN tests, but I still don't quite understand why it happens.
Could you, please, describe how the state transitioning works here for the ovsdb-idl case?

> +# Forcefully reconnect.
> +force-reconnect
> +  in RECONNECT for 0 ms (2000 ms backoff)
> +  1 successful connections out of 3 attempts, seqno 2
> +  disconnected
> +run
> +  should disconnect
> +connecting
> +  in CONNECTING for 0 ms (2000 ms backoff)

Especially this part seems wrong to me, because after 'should disconnect' there should be 'disconnect' or 'connect-fail', but not 'connecting'.  We literally should disconnect here, otherwise it's a violation of the reconnect API.  And my concern is that ovsdb-cs or jsonrpc violates the API somewhere by not calling reconnect_disconnected() when it is required, or there is some other bug that makes the 'reconnect' module jump over a few states in the fsm.

The logical workflow for the force-reconnect, from what I see in the code, should be:

1. force-reconnect --> transition to S_RECONNECT
2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT
3. disconnect -> check the state, update backoff and transition to S_BACKOFF
4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT
5. connected ....

Something is fishy here, because ovsdb-cs somehow jumps over step #3 and maybe also #4.

Best regards, Ilya Maximets.
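The five-step workflow above can be modeled with a toy FSM (a hypothetical simplification for illustration, not the actual lib/reconnect.c or python/ovs/reconnect.py code; state and action names mirror OVS's):

```python
# Toy model of the reconnect FSM call sequence: force-reconnect, run,
# disconnected (the step the quoted test case skipped), run again.
S_ACTIVE, S_RECONNECT, S_BACKOFF = "ACTIVE", "RECONNECT", "BACKOFF"
DISCONNECT, CONNECT, NOOP = "DISCONNECT", "CONNECT", "NOOP"

class Fsm:
    def __init__(self):
        self.state = S_ACTIVE
        self.backoff = 1000  # ms

    def force_reconnect(self):            # step 1
        self.state = S_RECONNECT

    def run(self):                        # steps 2 and 4
        if self.state == S_RECONNECT:
            return DISCONNECT
        if self.state == S_BACKOFF:
            return CONNECT
        return NOOP

    def disconnected(self):               # step 3: the client must call this
        if self.state != S_ACTIVE:        # not ACTIVE/IDLE: backoff doubles
            self.backoff *= 2
        self.state = S_BACKOFF

fsm = Fsm()
fsm.force_reconnect()
assert fsm.run() == DISCONNECT            # "should disconnect"
fsm.disconnected()                        # skipping this violates the API
assert fsm.run() == CONNECT
print(fsm.backoff)                        # 2000: doubled, since state was RECONNECT
```

The point of the model: if the client never calls disconnected() after run() returns DISCONNECT, the machine cannot legally reach the CONNECT step, which is exactly the API violation described above.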
On 7/12/21 10:36 PM, Ilya Maximets wrote: > On 6/29/21 1:20 PM, Dumitru Ceara wrote: >> Until now clients that needed to reconnect immediately could only use >> reconnect_force_reconnect(). However, reconnect_force_reconnect() >> doesn't reset the backoff for connections that were alive long enough >> (more than backoff seconds). >> >> Moreover, the reconnect library cannot determine the exact reason why a >> client wishes to initiate a reconnection. In most cases reconnection >> happens because of a fatal error when communicating with the remote, >> e.g., in the ovsdb-cs layer, when invalid messages are received from >> ovsdb-server. In such cases it makes sense to not reset the backoff >> because the remote seems to be unhealthy. >> >> There are however cases when reconnection is needed for other reasons. >> One such example is when ovsdb-clients require "leader-only" connections >> to clustered ovsdb-server databases. Whenever the client determines >> that the remote is not a leader anymore, it decides to reconnect to a >> new remote from its list, searching for the new leader. Using >> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will >> not reset backoff even though the former leader is still likely in good >> shape. >> >> Since 3c2d6274bcee ("raft: Transfer leadership before creating >> snapshots.") leadership changes inside the clustered database happen >> more often and therefore "leader-only" clients need to reconnect more >> often too. Not resetting the backoff every time a leadership change >> happens will cause all reconnections to happen with the maximum backoff >> (8 seconds) resulting in significant latency. >> >> This commit also updates the Python reconnect and IDL implementations >> and adds tests for force-reconnect and graceful-reconnect. >> >> Reported-at: https://bugzilla.redhat.com/1977264 >> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >> --- > > Hi, Dumitru. > Hi Ilya, > Thanks for working on this issue. 
> I've seen it in practice while running OVN tests, but I still don't quite
> understand why it happens.  Could you, please, describe how the state
> transitioning works here for the ovsdb-idl case?

Without the patch, assuming a current backoff of X seconds, the sequence of
events (even for a connection that has seen activity after backoff) is:

- reconnect_force_reconnect() -> move to S_RECONNECT
- ovsdb_cs_run()
  -> jsonrpc_session_run()
    -> reconnect_run()
      -> reconnect_disconnected()
        -> because state is not S_ACTIVE | S_IDLE backoff is not changed,
           stays X.

>> +# Forcefully reconnect.
>> +force-reconnect
>> +  in RECONNECT for 0 ms (2000 ms backoff)
>> +  1 successful connections out of 3 attempts, seqno 2
>> +  disconnected
>> +run
>> +  should disconnect
>> +connecting
>> +  in CONNECTING for 0 ms (2000 ms backoff)
>
> Especially this part seems wrong to me, because after 'should disconnect'
> there should be 'disconnect' or 'connect-fail', but not 'connecting'.  We
> literally should disconnect here, otherwise it's a violation of the
> reconnect API.  And my concern is that ovsdb-cs or jsonrpc violates the
> API somewhere by not calling reconnect_disconnected() when it is required,
> or there is some other bug that makes the 'reconnect' module jump over a
> few states in the fsm.
>
> The logical workflow for the force-reconnect, from what I see in the code,
> should be:
>
> 1. force-reconnect --> transition to S_RECONNECT
> 2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT
> 3. disconnect -> check the state, update backoff and transition to S_BACKOFF
> 4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT
> 5. connected ....

This is just a bug in the test case I added; I need to issue "disconnect" in
the test case to trigger reconnect_disconnected() to be called (like
ovsdb-cs does).

> Something is fishy here, because ovsdb-cs somehow jumps over step #3 and
> maybe also #4.

As mentioned above, it's not the case for ovsdb-cs.
However, while checking the test case I realized that the patch will cause
all "graceful" reconnects to happen with an initial backoff of 2 seconds
for long-lived sessions (instead of 1s).  That's because:

reconnect_graceful_reconnect()
  -> reconnect_reset_backoff__()
     -> session was active after the initial 'backoff' seconds, so reset
        backoff to minimum.
  -> reconnect_transition__(..., S_RECONNECT);
ovsdb_cs_run()
  -> jsonrpc_session_run()
    -> reconnect_run()
      -> reconnect_disconnected():
        -> if (!reconnect_reset_backoff__(..)) then double backoff.

I see a couple of potential ways to fix that:

1. reconnect_graceful_reconnect() could bump fsm->backoff_free_tries to
   allow a single backoff-free reconnect.
2. reconnect_graceful_reconnect() could reset fsm->backoff to 0 if it was
   active recently (instead of calling reconnect_reset_backoff__()).

However, this is all under the assumption that we want to add support for
two types of reconnect "semantics":

a. forced, e.g., when the client detects inconsistencies in the data
   received from the server (the server is in "bad shape"), in which case
   we should never reset the backoff.

b. graceful, e.g., when the client reconnects because it needs a
   leader-only connection and a leadership transfer happened on the server
   side (the server is likely in "good shape", just not a leader anymore).

An alternative to all this is to change reconnect_disconnected() to reset
backoff for connections in states S_ACTIVE | S_IDLE | S_RECONNECT (it used
to be just active and idle).

I guess we could treat this as a bug fix and continue the discussion in a
separate follow-up patch to add the "graceful vs force" semantics, if they
turn out to make sense in the future.
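The double-backoff problem can be reproduced with a small model (a hypothetical simplification of the patched logic, not the real reconnect code):

```python
# Graceful reconnect resets the backoff to the minimum, but the
# transition to RECONNECT makes the subsequent reconnect_disconnected()
# fail its own reset attempt and double the freshly reset value.
MIN_BACKOFF = 1000  # ms

class Fsm:
    def __init__(self):
        self.state = "ACTIVE"
        self.backoff = 8000  # long-lived session, backoff grew to the max

    def _reset_backoff(self):
        # Only resets while the session is active or idle.
        if self.state in ("ACTIVE", "IDLE"):
            self.backoff = MIN_BACKOFF
            return True
        return False

    def graceful_reconnect(self):
        self._reset_backoff()          # backoff -> 1000 ms
        self.state = "RECONNECT"       # reconnect_transition__(..., S_RECONNECT)

    def disconnected(self):
        # reconnect_disconnected(): state is RECONNECT now, so the reset
        # fails and the backoff doubles.
        if not self._reset_backoff():
            self.backoff *= 2

fsm = Fsm()
fsm.graceful_reconnect()
fsm.disconnected()
print(fsm.backoff)   # 2000 ms: the reconnect starts at 2s instead of 1s
```

This is why both proposed fixes act in graceful_reconnect() itself, before the state changes to RECONNECT.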
In essence (excluding potential Python and test changes) the patch would
become:

diff --git a/lib/reconnect.c b/lib/reconnect.c
index a929ddfd2d..3a6d93f9d1 100644
--- a/lib/reconnect.c
+++ b/lib/reconnect.c
@@ -385,7 +385,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error)
     if (fsm->backoff_free_tries > 1) {
         fsm->backoff_free_tries--;
         fsm->backoff = 0;
-    } else if (fsm->state & (S_ACTIVE | S_IDLE)
+    } else if (fsm->state & (S_ACTIVE | S_IDLE | S_RECONNECT)
                && (fsm->last_activity - fsm->last_connected >= fsm->backoff
                    || fsm->passive)) {
         fsm->backoff = fsm->passive ? 0 : fsm->min_backoff;
---

What do you think?

> Best regards, Ilya Maximets.

Regards, Dumitru
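The effect of that one-line change can be sketched as follows (a hypothetical simplification: the `passive` case and the free-tries branch interactions are omitted, so this is not the real lib/reconnect.c logic):

```python
# With S_RECONNECT added, a forcefully interrupted but previously healthy
# session resets to the minimum backoff instead of doubling.
def new_backoff(state, last_activity, last_connected, backoff,
                min_backoff=1000, max_backoff=8000):
    if (state in ("ACTIVE", "IDLE", "RECONNECT")          # S_RECONNECT added
            and last_activity - last_connected >= backoff):
        return min_backoff                                 # healthy: reset
    return min(backoff * 2, max_backoff) if backoff else min_backoff

# Forced reconnect of a long-lived healthy session: backoff drops to 1s
# instead of staying near the 8s maximum.
print(new_backoff("RECONNECT", last_activity=60000, last_connected=0,
                  backoff=8000))
# A session interrupted while still CONNECTING keeps backing off.
print(new_backoff("CONNECTING", last_activity=0, last_connected=0,
                  backoff=2000))
```

The sketch also makes Ilya's later objection visible: for a session interrupted while CONNECTING, `last_activity` and `last_connected` belong to the previous connection, so the activity test is computed from stale data.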
On 7/13/21 5:03 PM, Dumitru Ceara wrote: > On 7/12/21 10:36 PM, Ilya Maximets wrote: >> On 6/29/21 1:20 PM, Dumitru Ceara wrote: >>> Until now clients that needed to reconnect immediately could only use >>> reconnect_force_reconnect(). However, reconnect_force_reconnect() >>> doesn't reset the backoff for connections that were alive long enough >>> (more than backoff seconds). >>> >>> Moreover, the reconnect library cannot determine the exact reason why a >>> client wishes to initiate a reconnection. In most cases reconnection >>> happens because of a fatal error when communicating with the remote, >>> e.g., in the ovsdb-cs layer, when invalid messages are received from >>> ovsdb-server. In such cases it makes sense to not reset the backoff >>> because the remote seems to be unhealthy. >>> >>> There are however cases when reconnection is needed for other reasons. >>> One such example is when ovsdb-clients require "leader-only" connections >>> to clustered ovsdb-server databases. Whenever the client determines >>> that the remote is not a leader anymore, it decides to reconnect to a >>> new remote from its list, searching for the new leader. Using >>> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will >>> not reset backoff even though the former leader is still likely in good >>> shape. >>> >>> Since 3c2d6274bcee ("raft: Transfer leadership before creating >>> snapshots.") leadership changes inside the clustered database happen >>> more often and therefore "leader-only" clients need to reconnect more >>> often too. Not resetting the backoff every time a leadership change >>> happens will cause all reconnections to happen with the maximum backoff >>> (8 seconds) resulting in significant latency. >>> >>> This commit also updates the Python reconnect and IDL implementations >>> and adds tests for force-reconnect and graceful-reconnect. 
>>> >>> Reported-at: https://bugzilla.redhat.com/1977264 >>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>> --- >> >> Hi, Dumitru. >> > > Hi Ilya, > >> Thanks for working on this issue. I've seen it in practice while running >> OVN tests, but I still don't quiet understand why it happens. Could you, >> please, describe how state transitioning work here for the ovsdb-idl case? >> > > Without the patch, assuming a current backoff of X seconds, the sequence > of events (even for a connection that has seen activity after backoff) > is: > > - reconnect_force_reconnect() -> move to S_RECONNECT > - ovsdb_cs_run() > -> jsonrpc_session_run() > -> reconnect_run() > -> reconnect_disconnected() > -> because state is not S_ACTIVE | S_IDLE backoff is not > changed, stays X. Hmm, I see. Thanks for explanation! > >>> +# Forcefully reconnect. >>> +force-reconnect >>> + in RECONNECT for 0 ms (2000 ms backoff) >>> + 1 successful connections out of 3 attempts, seqno 2 >>> + disconnected >>> +run >>> + should disconnect >>> +connecting >>> + in CONNECTING for 0 ms (2000 ms backoff) >> >> Especially this part seems wrong to me. Because after 'should disconnect' >> there should be 'disconnect' of 'connect-fail', but not 'connecting'. We >> literally should disconnect here, otherwise it's a violation of the reconnect >> API. And my concern is that ovsdb-cs or jsonrpc violates the API somewhere >> by not calling reconnect_disconnectd() when it is required, or there is some >> other bug that makes 'reconnect' module to jump over few states in a fsm. >> >> The logical workflow for the force-reconnect, from what I see in the code >> should be: >> >> 1. force-reconnect --> transition to S_RECONNECT >> 2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT >> 3. disconnect -> check the state, update backoff and transition to S_BACKOFF >> 4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT >> 5. connected .... 
> > This is just a bug in the test case I added, I need to issue > "disconnect" in the test case to trigger reconnect_disconnected() > to be called (like ovsdb-cs does). > >> >> Something is fishy here, because ovsdb-cs somehow jumps over step #3 and >> maybe also #4. > > As mentioned above, it's not the case for ovsdb-cs. > > However, while checking the test case I realized that the patch will > cause all "graceful" reconnects to happen with an initial backoff of 2 > seconds for long lived sessions (instead of 1s). > > That's because: > > reconnect_graceful_reconnect() > -> reconnect_reset_backoff__() > -> session was active after the initial 'backoff' seconds, so > reset backoff to minimum. > -> reconnect_transition__(..., S_RECONNECT); > ovsdb_cs_run() > -> jsonrpc_session_run() > -> reconnect_run() > -> reconnect_disconnected(): > -> if (!reconnect_reset_backoff__(..)) then double backoff. > > I see a couple of potential ways to fix that: > 1. reconnect_graceful_reconnect() could bump fsm->backoff_free_tries to > allow a single backoff free reconnect. > 2. reconnect_graceful_reconnect() could reset fsm->backoff to 0 if it > was active recently (instead of calling reconnect_reset_backoff__()). > > However, this is all under the assumption that we want to add support > for two types of reconnect "semantics": > > a. forced, e.g., when the client detects inconsistencies in the data > received from the server (the server is in "bad shape") in which we > should never reset the backoff. > > b. graceful, e.g., when the client reconnects because it needs a > leader-only connection and a leadership transfer happened on the > server side (the server is likely in "good shape" just not a leader > anymore). > > An alternative to all this is to change reconnect_disconnected() to > reset backoff for connections in states S_ACTIVE | S_IDLE | S_RECONNECT > (used to be just active and idle). 
> I guess we could treat this as a bug fix and continue the discussion in a
> separate follow-up patch to add the "graceful vs force" semantics, if
> they turn out to make sense in the future.
>
> In essence (excluding potential Python and test changes) the patch would
> become:
>
> diff --git a/lib/reconnect.c b/lib/reconnect.c
> index a929ddfd2d..3a6d93f9d1 100644
> --- a/lib/reconnect.c
> +++ b/lib/reconnect.c
> @@ -385,7 +385,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error)
>      if (fsm->backoff_free_tries > 1) {
>          fsm->backoff_free_tries--;
>          fsm->backoff = 0;
> -    } else if (fsm->state & (S_ACTIVE | S_IDLE)
> +    } else if (fsm->state & (S_ACTIVE | S_IDLE | S_RECONNECT)
>                 && (fsm->last_activity - fsm->last_connected >= fsm->backoff
>                     || fsm->passive)) {
>          fsm->backoff = fsm->passive ? 0 : fsm->min_backoff;
> ---
>
> What do you think?

The problem I see with adding S_RECONNECT to this condition is that
'last_activity' and 'last_connected' don't belong to the current connection
in case we're forcefully interrupting a connection in the S_CONNECTING
state.  So, we will reset backoff based on outdated information.  Also, I
think, in this case, 'last_connected' might be larger than 'last_activity'
and we will have a negative value here.  All values are signed, so it
should not be an issue, but it's not a clean solution.

Bumping the number of free tries seems like a better solution to me,
because:

1. I'm not sure that we need two types of reconnect.  I mean, backoff is
   intended to avoid overloading an already overloaded server with
   connection attempts.  In case of a forceful re-connection the server is
   not overloaded, so we should not increase the backoff.  IMO, backoff
   should only be involved in cases where we have problems on the server
   side related to load, not the leader-only stuff or even database
   inconsistency problems.

2. force-reconnect should not consume free tries, for exactly the same
   reasons as in point 1.  Free tries are there to avoid backoff, but
   backoff exists only to avoid storming the remote server with
   connections while it's overloaded.  No overload -> no need for backoff
   -> no need to consume free tries.

3. With bumping of free tries we're avoiding logical issues around using
   'last_activity' and 'last_connected' from the old connection.

4. Faster reconnection, since backoff will be zero.  And that is fine,
   because, as far as we know, the server is not overloaded; it's just not
   suitable for our needs for some reason.  If it is overloaded, we will
   back off after the first failed connection attempt.

One potential problem I see with this solution is that if there are no
servers on the list that are suitable for ovsdb-cs, we will have a log
full of annoying reconnect messages.  Backoff kind of rate-limits ovsdb-cs
right now and we have no internal rate limit inside ovsdb-cs.  This looks
more like an issue of ovsdb-cs and not an issue of the reconnect module
itself.  But we need to have a solution for it.

And actually, the problem here is that from the jsonrpc point of view the
connection is perfectly fine and functional, but ovsdb-cs implements
high-level logic on top of it and decides that the connection is not good
at a much later stage, based on the actual received data.  So, we need to
somehow propagate this information from ovsdb-cs down to the reconnect
module or allow ovsdb-cs to control the state machine; otherwise we will
need two separate backoffs: one inside reconnect for the usual connection
problems and one in ovsdb-cs for high-level data inconsistencies and
leader changes.

Thinking this way led me to a different solution.  We could expose
something like jsonrpc_session_set_backoff_free_tries() and allow ovsdb-cs
to make a decision.
E.g.:

diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
index f13065c6c..900597b96 100644
--- a/lib/ovsdb-cs.c
+++ b/lib/ovsdb-cs.c
@@ -729,6 +729,17 @@ void
 ovsdb_cs_force_reconnect(struct ovsdb_cs *cs)
 {
     if (cs->session) {
+        if (cs->state == CS_S_MONITORING) {
+            /* The ovsdb-cs was in MONITORING state, so we either had a data
+             * inconsistency on this server, or it stopped being the cluster
+             * leader, or the user requested a re-connect.  Avoid backoff
+             * in these cases, as we need to re-connect as soon as possible.
+             * Connections that are not in MONITORING state should keep their
+             * backoff to avoid a constant flood of re-connection attempts in
+             * case there is no suitable database server. */
+            jsonrpc_session_set_backoff_free_tries(
+                cs->session, jsonrpc_session_get_n_remotes(cs->session));
+        }
         jsonrpc_session_force_reconnect(cs->session);
     }
 }
---

This way, if the server loses leadership or an inconsistency is detected,
the client will have 3 free attempts to find a new suitable server.  After
that it will start to back off as it does now.  No changes in the reconnect
module required.

Thoughts?

Best regards, Ilya Maximets.
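The free-tries idea can be illustrated with a small model (a hypothetical sketch mirroring the free-tries branch of reconnect_disconnected() quoted earlier in the thread, not the real jsonrpc/reconnect code):

```python
# With backoff_free_tries bumped to the number of remotes, the client can
# probe other servers with zero backoff before exponential backoff resumes.
MIN_BACKOFF, MAX_BACKOFF = 1000, 8000  # ms

class Session:
    def __init__(self, n_remotes):
        self.n_remotes = n_remotes
        self.backoff_free_tries = 0
        self.backoff = MAX_BACKOFF     # long-lived session at max backoff

    def set_backoff_free_tries(self, n):
        self.backoff_free_tries = n

    def disconnected(self):
        # Mirrors the "if (fsm->backoff_free_tries > 1)" branch.
        if self.backoff_free_tries > 1:
            self.backoff_free_tries -= 1
            self.backoff = 0
        else:
            self.backoff = min(max(MIN_BACKOFF, self.backoff * 2), MAX_BACKOFF)

s = Session(n_remotes=3)
s.set_backoff_free_tries(s.n_remotes)   # as in the ovsdb-cs diff above
waits = []
for _ in range(4):                       # four consecutive failed attempts
    s.disconnected()
    waits.append(s.backoff)
print(waits)                             # [0, 0, 1000, 2000]
```

If no suitable server exists, the first attempts are free, after which backoff resumes from the minimum, which is the rate-limiting behavior discussed above.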
On 7/20/21 3:18 PM, Ilya Maximets wrote: > On 7/13/21 5:03 PM, Dumitru Ceara wrote: >> On 7/12/21 10:36 PM, Ilya Maximets wrote: >>> On 6/29/21 1:20 PM, Dumitru Ceara wrote: >>>> Until now clients that needed to reconnect immediately could only use >>>> reconnect_force_reconnect(). However, reconnect_force_reconnect() >>>> doesn't reset the backoff for connections that were alive long enough >>>> (more than backoff seconds). >>>> >>>> Moreover, the reconnect library cannot determine the exact reason why a >>>> client wishes to initiate a reconnection. In most cases reconnection >>>> happens because of a fatal error when communicating with the remote, >>>> e.g., in the ovsdb-cs layer, when invalid messages are received from >>>> ovsdb-server. In such cases it makes sense to not reset the backoff >>>> because the remote seems to be unhealthy. >>>> >>>> There are however cases when reconnection is needed for other reasons. >>>> One such example is when ovsdb-clients require "leader-only" connections >>>> to clustered ovsdb-server databases. Whenever the client determines >>>> that the remote is not a leader anymore, it decides to reconnect to a >>>> new remote from its list, searching for the new leader. Using >>>> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will >>>> not reset backoff even though the former leader is still likely in good >>>> shape. >>>> >>>> Since 3c2d6274bcee ("raft: Transfer leadership before creating >>>> snapshots.") leadership changes inside the clustered database happen >>>> more often and therefore "leader-only" clients need to reconnect more >>>> often too. Not resetting the backoff every time a leadership change >>>> happens will cause all reconnections to happen with the maximum backoff >>>> (8 seconds) resulting in significant latency. >>>> >>>> This commit also updates the Python reconnect and IDL implementations >>>> and adds tests for force-reconnect and graceful-reconnect. 
>>>> >>>> Reported-at: https://bugzilla.redhat.com/1977264 >>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>>> --- >>> >>> Hi, Dumitru. >>> >> >> Hi Ilya, >> >>> Thanks for working on this issue. I've seen it in practice while running >>> OVN tests, but I still don't quiet understand why it happens. Could you, >>> please, describe how state transitioning work here for the ovsdb-idl case? >>> >> >> Without the patch, assuming a current backoff of X seconds, the sequence >> of events (even for a connection that has seen activity after backoff) >> is: >> >> - reconnect_force_reconnect() -> move to S_RECONNECT >> - ovsdb_cs_run() >> -> jsonrpc_session_run() >> -> reconnect_run() >> -> reconnect_disconnected() >> -> because state is not S_ACTIVE | S_IDLE backoff is not >> changed, stays X. > > Hmm, I see. Thanks for explanation! > >> >>>> +# Forcefully reconnect. >>>> +force-reconnect >>>> + in RECONNECT for 0 ms (2000 ms backoff) >>>> + 1 successful connections out of 3 attempts, seqno 2 >>>> + disconnected >>>> +run >>>> + should disconnect >>>> +connecting >>>> + in CONNECTING for 0 ms (2000 ms backoff) >>> >>> Especially this part seems wrong to me. Because after 'should disconnect' >>> there should be 'disconnect' of 'connect-fail', but not 'connecting'. We >>> literally should disconnect here, otherwise it's a violation of the reconnect >>> API. And my concern is that ovsdb-cs or jsonrpc violates the API somewhere >>> by not calling reconnect_disconnectd() when it is required, or there is some >>> other bug that makes 'reconnect' module to jump over few states in a fsm. >>> >>> The logical workflow for the force-reconnect, from what I see in the code >>> should be: >>> >>> 1. force-reconnect --> transition to S_RECONNECT >>> 2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT >>> 3. disconnect -> check the state, update backoff and transition to S_BACKOFF >>> 4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT >>> 5. connected .... 
>> >> This is just a bug in the test case I added, I need to issue >> "disconnect" in the test case to trigger reconnect_disconnected() >> to be called (like ovsdb-cs does). >> >>> >>> Something is fishy here, because ovsdb-cs somehow jumps over step #3 and >>> maybe also #4. >> >> As mentioned above, it's not the case for ovsdb-cs. >> >> However, while checking the test case I realized that the patch will >> cause all "graceful" reconnects to happen with an initial backoff of 2 >> seconds for long lived sessions (instead of 1s). >> >> That's because: >> >> reconnect_graceful_reconnect() >> -> reconnect_reset_backoff__() >> -> session was active after the initial 'backoff' seconds, so >> reset backoff to minimum. >> -> reconnect_transition__(..., S_RECONNECT); >> ovsdb_cs_run() >> -> jsonrpc_session_run() >> -> reconnect_run() >> -> reconnect_disconnected(): >> -> if (!reconnect_reset_backoff__(..)) then double backoff. >> >> I see a couple of potential ways to fix that: >> 1. reconnect_graceful_reconnect() could bump fsm->backoff_free_tries to >> allow a single backoff free reconnect. >> 2. reconnect_graceful_reconnect() could reset fsm->backoff to 0 if it >> was active recently (instead of calling reconnect_reset_backoff__()). >> >> However, this is all under the assumption that we want to add support >> for two types of reconnect "semantics": >> >> a. forced, e.g., when the client detects inconsistencies in the data >> received from the server (the server is in "bad shape") in which we >> should never reset the backoff. >> >> b. graceful, e.g., when the client reconnects because it needs a >> leader-only connection and a leadership transfer happened on the >> server side (the server is likely in "good shape" just not a leader >> anymore). >> >> An alternative to all this is to change reconnect_disconnected() to >> reset backoff for connections in states S_ACTIVE | S_IDLE | S_RECONNECT >> (used to be just active and idle). 
>> I guess we could treat this as a bug fix and continue discussion for a
>> separate follow up patch to add the "graceful vs force" semantics if
>> they turn out to make sense for the future.
>>
>> In essence (excluding potential python and test changes) the patch would
>> become:
>>
>> diff --git a/lib/reconnect.c b/lib/reconnect.c
>> index a929ddfd2d..3a6d93f9d1 100644
>> --- a/lib/reconnect.c
>> +++ b/lib/reconnect.c
>> @@ -385,7 +385,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error)
>>          if (fsm->backoff_free_tries > 1) {
>>              fsm->backoff_free_tries--;
>>              fsm->backoff = 0;
>> -        } else if (fsm->state & (S_ACTIVE | S_IDLE)
>> +        } else if (fsm->state & (S_ACTIVE | S_IDLE | S_RECONNECT)
>>                     && (fsm->last_activity - fsm->last_connected >= fsm->backoff
>>                         || fsm->passive)) {
>>              fsm->backoff = fsm->passive ? 0 : fsm->min_backoff;
>> ---
>>
>> What do you think?
>
> The problem I see with adding S_RECONNECT to this condition is that
> 'last_activity' and 'last_connected' don't belong to the current
> connection in case we're forcefully interrupting a connection in the
> S_CONNECTING state.  So, we will reset backoff based on outdated
> information.  Also, I think, in this case, 'last_connected' might
> be larger than 'last_activity' and we will have a negative value here.

You're right, I missed this.

> All values are signed, so it should not be an issue, but it's not
> a clean solution.

It's not, I agree.

> Bumping the number of free tries seems like a better solution to me,
> because:
>
> 1. I'm not sure that we need 2 types of reconnect.  I mean, backoff
>    is intended to avoid overloading the already overloaded server with
>    connection attempts.  In case of a forceful re-connection the server
>    is not overloaded, so we should not increase the backoff.
>    IMO, backoff should only be involved in cases where we have
>    problems on the server side related to load, not the leader-only
>    stuff or even database inconsistency problems.
>
> 2. force-reconnect should not consume free tries for exactly the same
>    reasons as in point 1.  Free tries are to avoid backoff, but
>    backoff exists only to avoid storming the remote server with
>    connections while it's overloaded.
>    No overload -> No need for backoff -> No need to consume free tries.
>
> 3. With bumping of free tries we're avoiding logical issues around
>    using 'last_activity' and 'last_connected' from the old connection.
>
> 4. Faster reconnection, since backoff will be zero.  And that is
>    fine, because as far as we know, the server is not overloaded,
>    it's just not suitable for our needs for some reason.
>    If it is overloaded, we will back off after the first failed
>    connection attempt.

This is even better (instead of min_backoff, I mean).

> One potential problem I see with this solution is that if there
> are no servers on the list that are suitable for ovsdb-cs, we will
> have a log full of annoying reconnect logs.  Backoff kind of
> rate-limits ovsdb-cs right now and we have no internal rate-limit
> inside ovsdb-cs.  This looks more like an issue of ovsdb-cs and
> not an issue of the reconnect module itself.  But we need to have a
> solution for this.
>
> And actually, the problem here is that from the jsonrpc point of
> view the connection is perfectly fine and functional, but ovsdb-cs
> implements high-level logic on top of it and decides that the
> connection is not good at a much later stage based on the actual
> received data.  So, we need to somehow propagate this information
> from ovsdb-cs down to the reconnect module or allow ovsdb-cs to
> control the state machine, otherwise we will need two separate backoffs:

That's what I was trying to implement with "graceful reconnect":
allow ovsdb-cs to decide how to reconnect.  But there are corner cases
like the ones pointed out during the review of this patch.

> one inside the reconnect for usual connection problems and one
> in ovsdb-cs for high-level data inconsistencies and leader changes.
>
> Thinking this way led me to a different solution.  We could expose
> something like jsonrpc_session_set_backoff_free_tries() and allow
> ovsdb-cs to make a decision.  E.g.:
>
> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c
> index f13065c6c..900597b96 100644
> --- a/lib/ovsdb-cs.c
> +++ b/lib/ovsdb-cs.c
> @@ -729,6 +729,17 @@ void
>  ovsdb_cs_force_reconnect(struct ovsdb_cs *cs)
>  {
>      if (cs->session) {
> +        if (cs->state == CS_S_MONITORING) {
> +            /* The ovsdb-cs was in MONITORING state, so we either had data
> +             * inconsistency on this server, or it stopped being the cluster
> +             * leader, or the user requested to re-connect.  Avoiding backoff
> +             * in these cases, as we need to re-connect as soon as possible.
> +             * Connections that are not in MONITORING state should keep their
> +             * backoff to avoid a constant flood of re-connection attempts in
> +             * case there is no suitable database server. */
> +            jsonrpc_session_set_backoff_free_tries(
> +                cs->session, jsonrpc_session_get_n_remotes(cs->session));
> +        }
>          jsonrpc_session_force_reconnect(cs->session);
>      }
>  }
> ---
>
> This way, if the server loses leadership or an inconsistency is detected,
> the client will have 3 free attempts to find a new suitable server.
> After that it will start to back off as it does now.  No changes in the
> reconnect module required.
>
> Thoughts?

This works for me.  I just have a question regarding the new API: should
we allow jsonrpc users to set the free tries to any value or shall we
make it more strict, e.g., jsonrpc_session_reset_backoff_free_tries(),
which would reset the number of free tries to 'n_remotes'?

Will you be sending a patch or shall I add your "Suggested-by"?

>
> Best regards,
> Ilya Maximets.

Thanks,
Dumitru
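To make the backoff discussion above concrete, here is a small Python model of the behavior being debated (a sketch only, with invented names; the real logic lives in C in lib/reconnect.c and differs in detail): backoff doubles on each failed attempt up to the maximum, and backoff-free tries short-circuit it.

```python
MIN_BACKOFF = 1000  # ms, the default minimum backoff mentioned in the thread
MAX_BACKOFF = 8000  # ms, the default maximum backoff mentioned in the thread


class Backoff:
    """Toy model of the backoff bookkeeping in reconnect_disconnected()."""

    def __init__(self, free_tries=0):
        self.free_tries = free_tries
        self.backoff = 0  # ms to wait before the next connection attempt

    def disconnected(self, was_active_long_enough=False):
        if self.free_tries > 1:
            # Backoff-free try: reconnect immediately, consume one try.
            self.free_tries -= 1
            self.backoff = 0
        elif was_active_long_enough:
            # Connection was healthy for at least 'backoff' ms:
            # restart from the minimum backoff.
            self.backoff = MIN_BACKOFF
        else:
            # Unhealthy connection: exponential backoff, capped.
            self.backoff = max(MIN_BACKOFF,
                               min(MAX_BACKOFF, self.backoff * 2))


# Without a backoff reset or free tries, repeated reconnections quickly
# reach the 8 s cap, which is the latency problem from the commit message:
b = Backoff()
delays = []
for _ in range(5):
    b.disconnected()
    delays.append(b.backoff)
assert delays == [1000, 2000, 4000, 8000, 8000]
```

This is why a "leader-only" client that reconnects on every leadership change, without ever resetting the backoff, ends up waiting the full maximum between attempts.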
On 7/20/21 3:39 PM, Dumitru Ceara wrote:
> On 7/20/21 3:18 PM, Ilya Maximets wrote:
>> On 7/13/21 5:03 PM, Dumitru Ceara wrote:
>>> On 7/12/21 10:36 PM, Ilya Maximets wrote:
>>>> On 6/29/21 1:20 PM, Dumitru Ceara wrote:
>>>>> [...]
>>>>
>>>> Hi, Dumitru.
>>>>
>>>> Thanks for working on this issue.  I've seen it in practice while
>>>> running OVN tests, but I still don't quite understand why it happens.
>>>> Could you, please, describe how state transitioning works here for
>>>> the ovsdb-idl case?
>>>
>>> Hi Ilya,
>>>
>>> Without the patch, assuming a current backoff of X seconds, the sequence
>>> of events (even for a connection that has seen activity after backoff)
>>> is:
>>>
>>> - reconnect_force_reconnect() -> move to S_RECONNECT
>>> - ovsdb_cs_run()
>>>   -> jsonrpc_session_run()
>>>     -> reconnect_run()
>>>       -> reconnect_disconnected()
>>>         -> because state is not S_ACTIVE | S_IDLE backoff is not
>>>            changed, stays X.
>>
>> Hmm, I see.  Thanks for the explanation!
>>
>>>>> +# Forcefully reconnect.
>>>>> +force-reconnect
>>>>> +  in RECONNECT for 0 ms (2000 ms backoff)
>>>>> +  1 successful connections out of 3 attempts, seqno 2
>>>>> +  disconnected
>>>>> +run
>>>>> +  should disconnect
>>>>> +connecting
>>>>> +  in CONNECTING for 0 ms (2000 ms backoff)
>>>>
>>>> Especially this part seems wrong to me.  Because after 'should
>>>> disconnect' there should be 'disconnect' or 'connect-fail', but not
>>>> 'connecting'.  We literally should disconnect here, otherwise it's a
>>>> violation of the reconnect API.  And my concern is that ovsdb-cs or
>>>> jsonrpc violates the API somewhere by not calling
>>>> reconnect_disconnected() when it is required, or there is some other
>>>> bug that makes the 'reconnect' module jump over a few states in the fsm.
>>>>
>>>> The logical workflow for the force-reconnect, from what I see in the
>>>> code, should be:
>>>>
>>>> 1. force-reconnect --> transition to S_RECONNECT
>>>> 2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT
>>>> 3. disconnect -> check the state, update backoff and transition to S_BACKOFF
>>>> 4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT
>>>> 5. connected ....
>>>
>>> This is just a bug in the test case I added, I need to issue
>>> "disconnect" in the test case to trigger reconnect_disconnected()
>>> to be called (like ovsdb-cs does).
>>>
>>>> Something is fishy here, because ovsdb-cs somehow jumps over step #3
>>>> and maybe also #4.
>>>
>>> As mentioned above, it's not the case for ovsdb-cs.
>>>
>>> However, while checking the test case I realized that the patch will
>>> cause all "graceful" reconnects to happen with an initial backoff of 2
>>> seconds for long-lived sessions (instead of 1s).
>>>
>>> That's because:
>>>
>>> reconnect_graceful_reconnect()
>>> -> reconnect_reset_backoff__()
>>>    -> session was active after the initial 'backoff' seconds, so
>>>       reset backoff to minimum.
>>> -> reconnect_transition__(..., S_RECONNECT);
>>> ovsdb_cs_run()
>>> -> jsonrpc_session_run()
>>>   -> reconnect_run()
>>>     -> reconnect_disconnected():
>>>       -> if (!reconnect_reset_backoff__(..)) then double backoff.
>>>
>>> I see a couple of potential ways to fix that:
>>> 1. reconnect_graceful_reconnect() could bump fsm->backoff_free_tries to
>>>    allow a single backoff-free reconnect.
>>> 2. reconnect_graceful_reconnect() could reset fsm->backoff to 0 if it
>>>    was active recently (instead of calling reconnect_reset_backoff__()).
>>>
>>> However, this is all under the assumption that we want to add support
>>> for two types of reconnect "semantics":
>>>
>>> a. forced, e.g., when the client detects inconsistencies in the data
>>>    received from the server (the server is in "bad shape") in which we
>>>    should never reset the backoff.
>>>
>>> b. graceful, e.g., when the client reconnects because it needs a
>>>    leader-only connection and a leadership transfer happened on the
>>>    server side (the server is likely in "good shape", just not a leader
>>>    anymore).
>>>
>>> An alternative to all this is to change reconnect_disconnected() to
>>> reset backoff for connections in states S_ACTIVE | S_IDLE | S_RECONNECT
>>> (used to be just active and idle).
>>>
>>> [...]
>>
>> [...]
>
> [...]
>
> This works for me.  I just have a question regarding the new API: should
> we allow jsonrpc users to set the free tries to any value or shall we
> make it more strict, e.g., jsonrpc_session_reset_backoff_free_tries(),
> which would reset the number of free tries to 'n_remotes'?

jsonrpc_session_reset_backoff_free_tries() seems better, because ovsdb-cs
doesn't actually set the number of free tries for jsonrpc from the start.
Maybe we can name it just jsonrpc_session_reset_backoff()?  I'm not
sure, but I just don't like this very long name.

> Will you be sending a patch or shall I add your "Suggested-by"?

I'm OK with "Suggested-by".
Would be also great to have some type of a test for this functionality.
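The decision logic being proposed above can be sketched in a few lines of Python pseudocode (all names invented for illustration, only mirroring the ovsdb-cs diff quoted earlier in the thread, not the real C API):

```python
CS_S_MONITORING = "monitoring"  # fully up: receiving monitor updates


class Session:
    """Toy stand-in for a jsonrpc session with a list of remotes."""

    def __init__(self, remotes):
        self.remotes = remotes
        self.backoff_free_tries = 0

    def reset_backoff_free_tries(self):
        # The stricter API shape from the question above: the session
        # itself picks the value, one free try per configured remote.
        self.backoff_free_tries = len(self.remotes)


def force_reconnect(cs_state, session):
    # Mirrors the proposed ovsdb_cs_force_reconnect() change: only a
    # session that was fully up (MONITORING) earns backoff-free retries;
    # any other state keeps its backoff so an unusable server list does
    # not cause a constant flood of reconnection attempts.
    if cs_state == CS_S_MONITORING:
        session.reset_backoff_free_tries()


s = Session(["tcp:10.0.0.1:6641", "tcp:10.0.0.2:6641", "tcp:10.0.0.3:6641"])
force_reconnect(CS_S_MONITORING, s)
assert s.backoff_free_tries == 3   # one free attempt per remote

s2 = Session(s.remotes)
force_reconnect("connecting", s2)  # not MONITORING: no free tries granted
assert s2.backoff_free_tries == 0
```

The stricter variant keeps the policy (how many free tries are reasonable) inside the session rather than letting every caller pick an arbitrary number.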
On 7/20/21 4:25 PM, Ilya Maximets wrote:
> On 7/20/21 3:39 PM, Dumitru Ceara wrote:
>> [...]
>>
>> This works for me.  I just have a question regarding the new API: should
>> we allow jsonrpc users to set the free tries to any value or shall we
>> make it more strict, e.g., jsonrpc_session_reset_backoff_free_tries(),
>> which would reset the number of free tries to 'n_remotes'?

And it should, probably, set the number of free tries to n_remotes + 1,
because it will count the current disconnection as an attempt.

> jsonrpc_session_reset_backoff_free_tries() seems better, because ovsdb-cs
> doesn't actually set the number of free tries for jsonrpc from the start.
> Maybe we can name it just jsonrpc_session_reset_backoff()?  I'm not
> sure, but I just don't like this very long name.
>
>> Will you be sending a patch or shall I add your "Suggested-by"?
>
> I'm OK with "Suggested-by".
> Would be also great to have some type of a test for this functionality.
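The "+ 1" follows from the lib/reconnect.c snippet quoted earlier in the thread: because of the `backoff_free_tries > 1` test, n configured tries yield only n - 1 backoff-free disconnections. A minimal model (a hypothetical simplification, not the real code) makes the off-by-one visible:

```python
MIN_BACKOFF = 1000  # ms
MAX_BACKOFF = 8000  # ms


def zero_backoff_attempts(free_tries):
    """Count how many disconnections happen with zero backoff, mimicking
    the 'if (fsm->backoff_free_tries > 1)' branch of
    reconnect_disconnected()."""
    backoff = 0
    count = 0
    for _ in range(free_tries + 2):  # more disconnections than tries
        if free_tries > 1:
            free_tries -= 1
            backoff = 0
            count += 1
        else:
            backoff = max(MIN_BACKOFF, min(MAX_BACKOFF, backoff * 2))
    return count


assert zero_backoff_attempts(3) == 2  # n tries -> only n - 1 free attempts
assert zero_backoff_attempts(4) == 3  # hence n_remotes + 1 covers n_remotes
```

With 3 remotes, setting the free tries to 4 lets the client probe every remote once without backoff before the exponential delays kick back in.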
On 7/20/21 4:29 PM, Ilya Maximets wrote:
> On 7/20/21 4:25 PM, Ilya Maximets wrote:
>> [...]
Backoff kind of >>>> rate-limits ovsdb-cs right now and we have no internal rate-limit >>>> inside ovsdb-cs. This looks more like an issue of ovsdb-cs and >>>> not the issue of reconnect module itself. But we need to have a >>>> solution for this. >>>> >>>> And actually, the problem here is that from the jsonrpc point of >>>> view the connection is perfectly fine and functional, but ovsdb-cs >>>> implements high-level logic on top of it and decides that >>>> connection is not good on a much later stage based on the actual >>>> received data. So, we need to somehow propagate this information >>>> from ovsdb-cs down to reconnect module or allow ovsdb-cs to control >>>> the state machine, otherwise we will need two separate backoffs: >>> >>> That's what I was trying to implement with the "graceful reconnect: >>> allow ovsdb-cs to decide how to reconnect. But there are corner cases >>> like the ones pointed out during the review of this patch. >>> >>>> one inside the reconnect for usual connection problems and one >>>> in ovsdb-cs for high-level data inconsistencies and leader changes. >>>> >>>> Thinking this way led me to a different solution. We could expose >>>> something like jsonrpc_session_set_backoff_free_tries() and allow >>>> ovsdb-cs to make a decision. E.g.: >>>> >>>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c >>>> index f13065c6c..900597b96 100644 >>>> --- a/lib/ovsdb-cs.c >>>> +++ b/lib/ovsdb-cs.c >>>> @@ -729,6 +729,17 @@ void >>>> ovsdb_cs_force_reconnect(struct ovsdb_cs *cs) >>>> { >>>> if (cs->session) { >>>> + if (cs->state == CS_S_MONITORING) { >>>> + /* The ovsdb-cs was in MONITORING state, so we either had data >>>> + * inconsistency on this server, or it stopped being the cluster >>>> + * leader, or the user requested to re-connect. Avoiding backoff >>>> + * in these cases, as we need to re-connect as soon as possible. 
>>>> + * Connections that are not in MONITORING state should have their >>>> + * backoff to avoid constant flood of re-connection attempts in >>>> + * case there is no suitable database server. */ >>>> + jsonrpc_session_set_backoff_free_tries( >>>> + cs->session, jsonrpc_session_get_n_remotes(cs->session)); >>>> + } >>>> jsonrpc_session_force_reconnect(cs->session); >>>> } >>>> } >>>> --- >>>> >>>> This way, if the server looses leadership or inconsistency detected, >>>> the client will have 3 free attempts to find a new suitable server. >>>> After that it will start to backoff as it does now. No changes in >>>> reconnect module required. >>>> >>>> Thoughts? >>> >>> This works for me. I just have a question regarding the new API: should >>> we allow jsonrpc users to set the free tries to any value or shall we >>> make it more strict, e.g., jsonrpc_session_reset_backoff_free_tries(), >>> which would reset the number of free tries to 'n_remotes'? > > And it should, probably, set number of free tries to n_remotes + 1, > because it will count current disconnection as an attempt. > Makes sense. >> >> jsonrpc_session_reset_backoff_free_tries() seems better, because ovsdb-cs >> doesn't actually set number of free tries for jsonrpc from the start. >> Maybe we can name it just jsonrpc_session_reset_backoff() ? I'm not >> sure, but I just don't like this very long name. >> >>> >>> Will you be sending a patch or shall I add your "Suggested-by"? >> >> I'm OK with "Suggested-by". >> Would be also great to have some type of a test for this functionality. >> Let me try to come up with something. I'll post a v2 these days. Regards, Dumitru
On 7/20/21 4:32 PM, Dumitru Ceara wrote: > On 7/20/21 4:29 PM, Ilya Maximets wrote: >> On 7/20/21 4:25 PM, Ilya Maximets wrote: >>> On 7/20/21 3:39 PM, Dumitru Ceara wrote: >>>> On 7/20/21 3:18 PM, Ilya Maximets wrote: >>>>> On 7/13/21 5:03 PM, Dumitru Ceara wrote: >>>>>> On 7/12/21 10:36 PM, Ilya Maximets wrote: >>>>>>> On 6/29/21 1:20 PM, Dumitru Ceara wrote: >>>>>>>> Until now clients that needed to reconnect immediately could only use >>>>>>>> reconnect_force_reconnect(). However, reconnect_force_reconnect() >>>>>>>> doesn't reset the backoff for connections that were alive long enough >>>>>>>> (more than backoff seconds). >>>>>>>> >>>>>>>> Moreover, the reconnect library cannot determine the exact reason why a >>>>>>>> client wishes to initiate a reconnection. In most cases reconnection >>>>>>>> happens because of a fatal error when communicating with the remote, >>>>>>>> e.g., in the ovsdb-cs layer, when invalid messages are received from >>>>>>>> ovsdb-server. In such cases it makes sense to not reset the backoff >>>>>>>> because the remote seems to be unhealthy. >>>>>>>> >>>>>>>> There are however cases when reconnection is needed for other reasons. >>>>>>>> One such example is when ovsdb-clients require "leader-only" connections >>>>>>>> to clustered ovsdb-server databases. Whenever the client determines >>>>>>>> that the remote is not a leader anymore, it decides to reconnect to a >>>>>>>> new remote from its list, searching for the new leader. Using >>>>>>>> jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will >>>>>>>> not reset backoff even though the former leader is still likely in good >>>>>>>> shape. >>>>>>>> >>>>>>>> Since 3c2d6274bcee ("raft: Transfer leadership before creating >>>>>>>> snapshots.") leadership changes inside the clustered database happen >>>>>>>> more often and therefore "leader-only" clients need to reconnect more >>>>>>>> often too. 
Not resetting the backoff every time a leadership change >>>>>>>> happens will cause all reconnections to happen with the maximum backoff >>>>>>>> (8 seconds) resulting in significant latency. >>>>>>>> >>>>>>>> This commit also updates the Python reconnect and IDL implementations >>>>>>>> and adds tests for force-reconnect and graceful-reconnect. >>>>>>>> >>>>>>>> Reported-at: https://bugzilla.redhat.com/1977264 >>>>>>>> Signed-off-by: Dumitru Ceara <dceara@redhat.com> >>>>>>>> --- >>>>>>> >>>>>>> Hi, Dumitru. >>>>>>> >>>>>> >>>>>> Hi Ilya, >>>>>> >>>>>>> Thanks for working on this issue. I've seen it in practice while running >>>>>>> OVN tests, but I still don't quite understand why it happens. Could you, >>>>>>> please, describe how state transitioning works here for the ovsdb-idl case? >>>>>>> >>>>>> >>>>>> Without the patch, assuming a current backoff of X seconds, the sequence >>>>>> of events (even for a connection that has seen activity after backoff) >>>>>> is: >>>>>> >>>>>> - reconnect_force_reconnect() -> move to S_RECONNECT >>>>>> - ovsdb_cs_run() >>>>>> -> jsonrpc_session_run() >>>>>> -> reconnect_run() >>>>>> -> reconnect_disconnected() >>>>>> -> because state is not S_ACTIVE | S_IDLE backoff is not >>>>>> changed, stays X. >>>>> >>>>> Hmm, I see. Thanks for the explanation! >>>>> >>>>>> >>>>>>>> +# Forcefully reconnect. >>>>>>>> +force-reconnect >>>>>>>> + in RECONNECT for 0 ms (2000 ms backoff) >>>>>>>> + 1 successful connections out of 3 attempts, seqno 2 >>>>>>>> + disconnected >>>>>>>> +run >>>>>>>> + should disconnect >>>>>>>> +connecting >>>>>>>> + in CONNECTING for 0 ms (2000 ms backoff) >>>>>>> >>>>>>> Especially this part seems wrong to me. Because after 'should disconnect' >>>>>>> there should be 'disconnect' or 'connect-fail', but not 'connecting'. We >>>>>>> literally should disconnect here, otherwise it's a violation of the reconnect >>>>>>> API. 
And my concern is that ovsdb-cs or jsonrpc violates the API somewhere >>>>>>> by not calling reconnect_disconnected() when it is required, or there is some >>>>>>> other bug that makes the 'reconnect' module jump over a few states in the fsm. >>>>>>> >>>>>>> The logical workflow for the force-reconnect, from what I see in the code >>>>>>> should be: >>>>>>> >>>>>>> 1. force-reconnect --> transition to S_RECONNECT >>>>>>> 2. run -> in S_RECONNECT, so returning RECONNECT_DISCONNECT >>>>>>> 3. disconnect -> check the state, update backoff and transition to S_BACKOFF >>>>>>> 4. run -> in S_BACKOFF, so returning RECONNECT_CONNECT >>>>>>> 5. connected .... >>>>>> >>>>>> This is just a bug in the test case I added; I need to issue >>>>>> "disconnect" in the test case to trigger reconnect_disconnected() >>>>>> to be called (like ovsdb-cs does). >>>>>> >>>>>>> >>>>>>> Something is fishy here, because ovsdb-cs somehow jumps over step #3 and >>>>>>> maybe also #4. >>>>>> >>>>>> As mentioned above, it's not the case for ovsdb-cs. >>>>>> >>>>>> However, while checking the test case I realized that the patch will >>>>>> cause all "graceful" reconnects to happen with an initial backoff of 2 >>>>>> seconds for long-lived sessions (instead of 1s). >>>>>> >>>>>> That's because: >>>>>> >>>>>> reconnect_graceful_reconnect() >>>>>> -> reconnect_reset_backoff__() >>>>>> -> session was active after the initial 'backoff' seconds, so >>>>>> reset backoff to minimum. >>>>>> -> reconnect_transition__(..., S_RECONNECT); >>>>>> ovsdb_cs_run() >>>>>> -> jsonrpc_session_run() >>>>>> -> reconnect_run() >>>>>> -> reconnect_disconnected(): >>>>>> -> if (!reconnect_reset_backoff__(..)) then double backoff. >>>>>> >>>>>> I see a couple of potential ways to fix that: >>>>>> 1. reconnect_graceful_reconnect() could bump fsm->backoff_free_tries to >>>>>> allow a single backoff-free reconnect. >>>>>> 2. 
reconnect_graceful_reconnect() could reset fsm->backoff to 0 if it >>>>>> was active recently (instead of calling reconnect_reset_backoff__()). >>>>>> >>>>>> However, this is all under the assumption that we want to add support >>>>>> for two types of reconnect "semantics": >>>>>> >>>>>> a. forced, e.g., when the client detects inconsistencies in the data >>>>>> received from the server (the server is in "bad shape") in which we >>>>>> should never reset the backoff. >>>>>> >>>>>> b. graceful, e.g., when the client reconnects because it needs a >>>>>> leader-only connection and a leadership transfer happened on the >>>>>> server side (the server is likely in "good shape" just not a leader >>>>>> anymore). >>>>>> >>>>>> An alternative to all this is to change reconnect_disconnected() to >>>>>> reset backoff for connections in states S_ACTIVE | S_IDLE | S_RECONNECT >>>>>> (used to be just active and idle). >>>>>> >>>>>> I guess we could treat this as a bug fix and continue discussion for a >>>>>> separate follow up patch to add the "graceful vs force" semantics if >>>>>> they turn out to make sense for the future. >>>>>> >>>>>> In essence (excluding potential python and test changes) the patch would >>>>>> become: >>>>>> >>>>>> diff --git a/lib/reconnect.c b/lib/reconnect.c >>>>>> index a929ddfd2d..3a6d93f9d1 100644 >>>>>> --- a/lib/reconnect.c >>>>>> +++ b/lib/reconnect.c >>>>>> @@ -385,7 +385,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error) >>>>>> if (fsm->backoff_free_tries > 1) { >>>>>> fsm->backoff_free_tries--; >>>>>> fsm->backoff = 0; >>>>>> - } else if (fsm->state & (S_ACTIVE | S_IDLE) >>>>>> + } else if (fsm->state & (S_ACTIVE | S_IDLE | S_RECONNECT) >>>>>> && (fsm->last_activity - fsm->last_connected >= fsm->backoff >>>>>> || fsm->passive)) { >>>>>> fsm->backoff = fsm->passive ? 0 : fsm->min_backoff; >>>>>> --- >>>>>> >>>>>> What do you think? 
>>>>> >>>>> The problem I see with adding S_RECONNECT to this condition is that >>>>> 'last_activity' and 'last_connected' don't belong to the current >>>>> connection in case we're forcefully interrupting a connection in the >>>>> S_CONNECTING state. So, we will reset backoff based on outdated >>>>> information. Also, I think, in this case, 'last_connected' might >>>>> be larger than 'last_activity' and we will have a negative value here. >>>> >>>> You're right, I missed this. >>>> >>>>> All values are signed, so it should not be an issue, but it's not >>>>> a clean solution. >>>> >>>> It's not, I agree. >>>> >>>>> >>>>> Bumping the number of free tries seems like a better solution to me, >>>>> because: >>>>> >>>>> 1. I'm not sure that we need 2 types of reconnect. I mean, backoff >>>>> is intended to avoid overloading the already overloaded server with >>>>> connection attempts. In case of forceful re-connection the server >>>>> is not overloaded, so we should not increase the backoff. >>>>> IMO, backoff should only be involved in cases where we have >>>>> problems on the server side related to load, not the leader-only >>>>> stuff or even database inconsistency problems. >>>>> >>>>> 2. force-reconnect should not consume free tries for exactly the same >>>>> reasons as in point 1. Free tries are to avoid backoff, but >>>>> backoff exists only to avoid storming the remote server with >>>>> connections while it's overloaded. >>>>> No overload -> No need for backoff -> No need to consume free tries. >>>>> >>>>> 3. By bumping the free tries we're avoiding logical issues around >>>>> using 'last_activity' and 'last_connected' from the old connection. >>>>> >>>>> 4. Faster reconnection, since backoff will be zero. And that is >>>>> fine, because as far as we know, the server is not overloaded, >>>>> it's just not suitable for our needs for some reason. >>>>> If it is overloaded, we will back off after the first failed >>>>> connection attempt. 
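The reasoning in points 1-4 above (free tries exist to skip backoff when the remote is merely unsuitable, while genuine failures should keep doubling the delay) can be sketched with a toy model of the backoff arithmetic in reconnect_disconnected(). This is a simplified illustration, not the actual OVS reconnect code; the class name and constants are made up:

```python
class BackoffModel:
    """Toy model of the backoff bookkeeping discussed above."""

    def __init__(self, min_backoff=1000, max_backoff=8000):
        self.min_backoff = min_backoff   # ms
        self.max_backoff = max_backoff   # ms
        self.backoff = 0
        self.backoff_free_tries = 0

    def disconnected(self):
        """Return the delay (ms) before the next connection attempt."""
        if self.backoff_free_tries > 1:
            # A free try skips backoff entirely: the remote is presumed
            # healthy, just unsuitable (e.g., no longer the leader).
            self.backoff_free_tries -= 1
            self.backoff = 0
        elif self.backoff < self.min_backoff:
            self.backoff = self.min_backoff
        else:
            # Genuine connection failures keep doubling up to the maximum.
            self.backoff = min(self.backoff * 2, self.max_backoff)
        return self.backoff

fsm = BackoffModel()
fsm.backoff_free_tries = 4  # e.g., n_remotes (3) + 1, counting this drop
print([fsm.disconnected() for _ in range(6)])
# → [0, 0, 0, 1000, 2000, 4000]
```

Once the free tries run out, every further failure falls into the doubling branch, which is exactly the rate-limiting behavior the backoff is meant to provide.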
>>>> >>>> This is even better (instead of min_backoff, I mean). >>>> >>>>> >>>>> One potential problem I see with this solution is that if there >>>>> are no servers on the list that are suitable for ovsdb-cs, we will >>>>> have a log full of annoying reconnect logs. Backoff kind of >>>>> rate-limits ovsdb-cs right now and we have no internal rate limit >>>>> inside ovsdb-cs. This looks more like an issue of ovsdb-cs and >>>>> not an issue of the reconnect module itself. But we need to have a >>>>> solution for this. >>>>> >>>>> And actually, the problem here is that from the jsonrpc point of >>>>> view the connection is perfectly fine and functional, but ovsdb-cs >>>>> implements high-level logic on top of it and decides that the >>>>> connection is not good at a much later stage based on the actual >>>>> received data. So, we need to somehow propagate this information >>>>> from ovsdb-cs down to the reconnect module or allow ovsdb-cs to control >>>>> the state machine; otherwise we will need two separate backoffs: >>>> >>>> That's what I was trying to implement with the "graceful reconnect": >>>> allow ovsdb-cs to decide how to reconnect. But there are corner cases >>>> like the ones pointed out during the review of this patch. >>>> >>>>> one inside the reconnect for usual connection problems and one >>>>> in ovsdb-cs for high-level data inconsistencies and leader changes. >>>>> >>>>> Thinking this way led me to a different solution. We could expose >>>>> something like jsonrpc_session_set_backoff_free_tries() and allow >>>>> ovsdb-cs to make a decision. 
E.g.: >>>>> >>>>> diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c >>>>> index f13065c6c..900597b96 100644 >>>>> --- a/lib/ovsdb-cs.c >>>>> +++ b/lib/ovsdb-cs.c >>>>> @@ -729,6 +729,17 @@ void >>>>> ovsdb_cs_force_reconnect(struct ovsdb_cs *cs) >>>>> { >>>>> if (cs->session) { >>>>> + if (cs->state == CS_S_MONITORING) { >>>>> + /* The ovsdb-cs was in MONITORING state, so we either had data >>>>> + * inconsistency on this server, or it stopped being the cluster >>>>> + * leader, or the user requested to re-connect. Avoiding backoff >>>>> + * in these cases, as we need to re-connect as soon as possible. >>>>> + * Connections that are not in MONITORING state should have their >>>>> + * backoff to avoid constant flood of re-connection attempts in >>>>> + * case there is no suitable database server. */ >>>>> + jsonrpc_session_set_backoff_free_tries( >>>>> + cs->session, jsonrpc_session_get_n_remotes(cs->session)); >>>>> + } >>>>> jsonrpc_session_force_reconnect(cs->session); >>>>> } >>>>> } >>>>> --- >>>>> >>>>> This way, if the server loses leadership or an inconsistency is detected, >>>>> the client will have 3 free attempts to find a new suitable server. >>>>> After that it will start to back off as it does now. No changes in the >>>>> reconnect module are required. >>>>> >>>>> Thoughts? >>>> >>>> This works for me. I just have a question regarding the new API: should >>>> we allow jsonrpc users to set the free tries to any value or shall we >>>> make it more strict, e.g., jsonrpc_session_reset_backoff_free_tries(), >>>> which would reset the number of free tries to 'n_remotes'? >> >> And it should, probably, set the number of free tries to n_remotes + 1, >> because it will count the current disconnection as an attempt. >> > > Makes sense. > >>> >>> jsonrpc_session_reset_backoff_free_tries() seems better, because ovsdb-cs >>> doesn't actually set the number of free tries for jsonrpc from the start. >>> Maybe we can name it just jsonrpc_session_reset_backoff() ? 
I'm not >>> sure, but I just don't like this very long name. >>> >>>> >>>> Will you be sending a patch or shall I add your "Suggested-by"? >>> >>> I'm OK with "Suggested-by". >>> It would also be great to have some kind of test for this functionality. >>> > > Let me try to come up with something. I'll post a v2 these days. > v2 available at: https://patchwork.ozlabs.org/project/openvswitch/patch/20210721105614.25498-1-dceara@redhat.com/ Thanks, Dumitru
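To make the conclusion of the thread concrete, here is a minimal sketch of the direction settled on for v2: grant one backoff-free try per configured remote, plus one because the current disconnection itself counts as an attempt, whenever a reconnect is forced from the MONITORING state. All names here are illustrative stand-ins, not the actual ovsdb-cs/jsonrpc API:

```python
# Illustrative model only: Session and cs_force_reconnect() stand in for
# jsonrpc_session / ovsdb_cs_force_reconnect(); they are not the real API.

class Session:
    def __init__(self, remotes):
        self.remotes = remotes
        self.backoff_free_tries = 0

def cs_force_reconnect(session, cs_state):
    # A server we were successfully MONITORING is responsive, merely
    # unsuitable (e.g., it lost cluster leadership), so probing the other
    # remotes should not back off.  The +1 covers the current
    # disconnection, which is itself counted as an attempt.
    if cs_state == "MONITORING":
        session.backoff_free_tries = len(session.remotes) + 1

s = Session(["tcp:10.0.0.1:6641", "tcp:10.0.0.2:6641", "tcp:10.0.0.3:6641"])
cs_force_reconnect(s, "MONITORING")
print(s.backoff_free_tries)  # → 4
```

A client that was never in MONITORING state gets no free tries, so an unreachable or unsuitable server list still backs off normally, which addresses the log-flooding concern raised earlier in the thread.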
diff --git a/lib/jsonrpc.c b/lib/jsonrpc.c index 926cbcb86e..66b7f9ef4c 100644 --- a/lib/jsonrpc.c +++ b/lib/jsonrpc.c @@ -1267,6 +1267,13 @@ jsonrpc_session_force_reconnect(struct jsonrpc_session *s) reconnect_force_reconnect(s->reconnect, time_msec()); } +/* Makes 's' gracefully drop its connection (if any) and reconnect. */ +void +jsonrpc_session_graceful_reconnect(struct jsonrpc_session *s) +{ + reconnect_graceful_reconnect(s->reconnect, time_msec()); +} + /* Sets 'max_backoff' as the maximum time, in milliseconds, to wait after a * connection attempt fails before attempting to connect again. */ void diff --git a/lib/jsonrpc.h b/lib/jsonrpc.h index d75d66b863..3a34424e22 100644 --- a/lib/jsonrpc.h +++ b/lib/jsonrpc.h @@ -136,6 +136,7 @@ void jsonrpc_session_get_reconnect_stats(const struct jsonrpc_session *, void jsonrpc_session_enable_reconnect(struct jsonrpc_session *); void jsonrpc_session_force_reconnect(struct jsonrpc_session *); +void jsonrpc_session_graceful_reconnect(struct jsonrpc_session *); void jsonrpc_session_set_max_backoff(struct jsonrpc_session *, int max_backoff); diff --git a/lib/ovsdb-cs.c b/lib/ovsdb-cs.c index 911b71dd4f..6f16d29f6d 100644 --- a/lib/ovsdb-cs.c +++ b/lib/ovsdb-cs.c @@ -230,8 +230,10 @@ static void ovsdb_cs_transition_at(struct ovsdb_cs *, enum ovsdb_cs_state, #define ovsdb_cs_transition(CS, STATE) \ ovsdb_cs_transition_at(CS, STATE, OVS_SOURCE_LOCATOR) -static void ovsdb_cs_retry_at(struct ovsdb_cs *, const char *where); -#define ovsdb_cs_retry(CS) ovsdb_cs_retry_at(CS, OVS_SOURCE_LOCATOR) +static void ovsdb_cs_retry_at(struct ovsdb_cs *, bool graceful, + const char *where); +#define ovsdb_cs_retry(CS, GRACEFUL) \ + ovsdb_cs_retry_at((CS), (GRACEFUL), OVS_SOURCE_LOCATOR) static struct vlog_rate_limit syntax_rl = VLOG_RATE_LIMIT_INIT(1, 5); @@ -400,9 +402,21 @@ ovsdb_cs_send_request(struct ovsdb_cs *cs, struct jsonrpc_msg *request) } static void -ovsdb_cs_retry_at(struct ovsdb_cs *cs, const char *where) 
+ovsdb_cs_reconnect(struct ovsdb_cs *cs, bool graceful) { - ovsdb_cs_force_reconnect(cs); + if (cs->session) { + if (graceful) { + jsonrpc_session_graceful_reconnect(cs->session); + } else { + jsonrpc_session_force_reconnect(cs->session); + } + } +} + +static void +ovsdb_cs_retry_at(struct ovsdb_cs *cs, bool graceful, const char *where) +{ + ovsdb_cs_reconnect(cs, graceful); ovsdb_cs_transition_at(cs, CS_S_RETRY, where); } @@ -438,7 +452,7 @@ ovsdb_cs_process_response(struct ovsdb_cs *cs, struct jsonrpc_msg *msg) ovsdb_cs_state_to_string(cs->state), s); free(s); - ovsdb_cs_retry(cs); + ovsdb_cs_retry(cs, false); return; } @@ -711,9 +725,7 @@ ovsdb_cs_enable_reconnect(struct ovsdb_cs *cs) void ovsdb_cs_force_reconnect(struct ovsdb_cs *cs) { - if (cs->session) { - jsonrpc_session_force_reconnect(cs->session); - } + ovsdb_cs_reconnect(cs, false); } /* Drops 'cs''s current connection and the cached session. This is useful if @@ -722,7 +734,7 @@ void ovsdb_cs_flag_inconsistency(struct ovsdb_cs *cs) { cs->data.last_id = UUID_ZERO; - ovsdb_cs_retry(cs); + ovsdb_cs_retry(cs, false); } /* Returns true if 'cs' is currently connected or will eventually try to @@ -1951,7 +1963,7 @@ ovsdb_cs_check_server_db(struct ovsdb_cs *cs) { bool ok = ovsdb_cs_check_server_db__(cs); if (!ok) { - ovsdb_cs_retry(cs); + ovsdb_cs_retry(cs, true); } return ok; } diff --git a/lib/reconnect.c b/lib/reconnect.c index a929ddfd2d..633e77bdb2 100644 --- a/lib/reconnect.c +++ b/lib/reconnect.c @@ -77,6 +77,7 @@ static void reconnect_transition__(struct reconnect *, long long int now, enum state state); static long long int reconnect_deadline__(const struct reconnect *); static bool reconnect_may_retry(struct reconnect *); +static bool reconnect_reset_backoff__(struct reconnect *); static const char * reconnect_state_name__(enum state state) @@ -330,6 +331,23 @@ reconnect_force_reconnect(struct reconnect *fsm, long long int now) } } +/* If 'fsm' is enabled and currently connected (or attempting to 
connect), + * forces reconnect_run() for 'fsm' to return RECONNECT_DISCONNECT the next + * time it is called, which should cause the client to drop the connection (or + * attempt), back off, and then reconnect. + * + * Unlike reconnect_force_reconnect(), this also resets the backoff for + * connections that were active long enough. + * */ +void +reconnect_graceful_reconnect(struct reconnect *fsm, long long int now) +{ + if (fsm->state & (S_CONNECTING | S_ACTIVE | S_IDLE)) { + reconnect_reset_backoff__(fsm); + reconnect_transition__(fsm, now, S_RECONNECT); + } +} + /* Tell 'fsm' that the connection dropped or that a connection attempt failed. * 'error' specifies the reason: a positive value represents an errno value, * EOF indicates that the connection was closed by the peer (e.g. read() @@ -385,11 +403,7 @@ reconnect_disconnected(struct reconnect *fsm, long long int now, int error) if (fsm->backoff_free_tries > 1) { fsm->backoff_free_tries--; fsm->backoff = 0; - } else if (fsm->state & (S_ACTIVE | S_IDLE) - && (fsm->last_activity - fsm->last_connected >= fsm->backoff - || fsm->passive)) { - fsm->backoff = fsm->passive ? 0 : fsm->min_backoff; - } else { + } else if (!reconnect_reset_backoff__(fsm)) { if (fsm->backoff < fsm->min_backoff) { fsm->backoff = fsm->min_backoff; } else if (fsm->backoff < fsm->max_backoff / 2) { @@ -745,3 +759,15 @@ reconnect_may_retry(struct reconnect *fsm) } return may_retry; } + +static bool +reconnect_reset_backoff__(struct reconnect *fsm) +{ + if (fsm->state & (S_ACTIVE | S_IDLE) + && (fsm->last_activity - fsm->last_connected >= fsm->backoff + || fsm->passive)) { + fsm->backoff = fsm->passive ? 
0 : fsm->min_backoff; + return true; + } + return false; +} diff --git a/lib/reconnect.h b/lib/reconnect.h index 40cc569c42..25bc627d66 100644 --- a/lib/reconnect.h +++ b/lib/reconnect.h @@ -67,6 +67,7 @@ void reconnect_enable(struct reconnect *, long long int now); void reconnect_disable(struct reconnect *, long long int now); void reconnect_force_reconnect(struct reconnect *, long long int now); +void reconnect_graceful_reconnect(struct reconnect *, long long int now); void reconnect_skip_backoff(struct reconnect *); bool reconnect_is_connected(const struct reconnect *); diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py index 889cf3431b..49bb964217 100644 --- a/python/ovs/db/idl.py +++ b/python/ovs/db/idl.py @@ -276,7 +276,7 @@ class Idl(object): tables=self.server_tables) self.change_seqno = initial_change_seqno if not self.__check_server_db(): - self.force_reconnect() + self.graceful_reconnect() break else: self.__parse_update(msg.params[1], OVSDB_UPDATE) @@ -442,6 +442,12 @@ class Idl(object): In the meantime, the contents of the IDL will not change.""" self._session.force_reconnect() + def graceful_reconnect(self): + """Makes the IDL to gracefully drop its connection to the database and + reconnect. 
In the meantime, the contents of the IDL will not + change.""" + self._session.graceful_reconnect() + def session_name(self): return self._session.get_name() diff --git a/python/ovs/jsonrpc.py b/python/ovs/jsonrpc.py index bf32f8c87c..353a07fc49 100644 --- a/python/ovs/jsonrpc.py +++ b/python/ovs/jsonrpc.py @@ -612,5 +612,8 @@ class Session(object): def force_reconnect(self): self.reconnect.force_reconnect(ovs.timeval.msec()) + def graceful_reconnect(self): + self.reconnect.graceful_reconnect(ovs.timeval.msec()) + def get_num_of_remotes(self): return len(self.remotes) diff --git a/python/ovs/reconnect.py b/python/ovs/reconnect.py index c4c6c87e9f..0f4a8e18da 100644 --- a/python/ovs/reconnect.py +++ b/python/ovs/reconnect.py @@ -325,6 +325,20 @@ class Reconnect(object): Reconnect.Idle): self._transition(now, Reconnect.Reconnect) + def graceful_reconnect(self, now): + """If this FSM is enabled and currently connected (or attempting to + connect), forces self.run() to return ovs.reconnect.DISCONNECT the next + time it is called, which should cause the client to drop the connection + (or attempt), back off, and then reconnect. + + Unlike self.force_reconnect(), this also updates backoff for long + lived sessions.""" + if self.state in (Reconnect.ConnectInProgress, + Reconnect.Active, + Reconnect.Idle): + self._reset_backoff() + self._transition(now, Reconnect.Reconnect) + def disconnected(self, now, error): """Tell this FSM that the connection dropped or that a connection attempt failed. 
'error' specifies the reason: a positive value @@ -377,14 +391,7 @@ class Reconnect(object): if self.backoff_free_tries > 1: self.backoff_free_tries -= 1 self.backoff = 0 - elif (self.state in (Reconnect.Active, Reconnect.Idle) and - (self.last_activity - self.last_connected >= self.backoff or - self.passive)): - if self.passive: - self.backoff = 0 - else: - self.backoff = self.min_backoff - else: + elif not self._reset_backoff(): if self.backoff < self.min_backoff: self.backoff = self.min_backoff elif self.backoff < self.max_backoff / 2: @@ -623,3 +630,14 @@ class Reconnect(object): return True else: return False + + def _reset_backoff(self): + if (self.state in (Reconnect.Active, Reconnect.Idle) and + (self.last_activity - self.last_connected >= self.backoff or + self.passive)): + if self.passive: + self.backoff = 0 + else: + self.backoff = self.min_backoff + return True + return False diff --git a/tests/reconnect.at b/tests/reconnect.at index 0f74709f5a..884483df3d 100644 --- a/tests/reconnect.at +++ b/tests/reconnect.at @@ -1303,3 +1303,347 @@ run listening in LISTENING for 0 ms (0 ms backoff) ]) + +###################################################################### +RECONNECT_CHECK([long connection doesn't reset backoff (force-reconnect)], + [enable + +# First connection attempt fails after 1000 ms. +run +connecting +run +timeout +run +connect-failed + +# Back off for 1000 ms. +timeout +run + +# Second connection attempt fails after 1000 ms. +connecting +timeout +run +connect-failed + +# Back off for 2000 ms. +timeout +run + +# Third connection attempt succeeds after 500 ms. +connecting +advance 500 +run +connected + +# Connection receives 3 chunks of data spaced 2000 ms apart. +advance 2000 +run +activity +advance 2000 +run +activity +advance 2000 +run +activity + +# Forcefully reconnect. +force-reconnect +run +connecting + +# Back off for 2000 ms. +timeout + +# Connection succeeds. 
+connected +], + [### t=1000 ### +enable + in BACKOFF for 0 ms (0 ms backoff) + +# First connection attempt fails after 1000 ms. +run + should connect +connecting + in CONNECTING for 0 ms (0 ms backoff) +run +timeout + advance 1000 ms + +### t=2000 ### + in CONNECTING for 1000 ms (0 ms backoff) +run + should disconnect +connect-failed + in BACKOFF for 0 ms (1000 ms backoff) + 0 successful connections out of 1 attempts, seqno 0 + +# Back off for 1000 ms. +timeout + advance 1000 ms + +### t=3000 ### + in BACKOFF for 1000 ms (1000 ms backoff) +run + should connect + +# Second connection attempt fails after 1000 ms. +connecting + in CONNECTING for 0 ms (1000 ms backoff) +timeout + advance 1000 ms + +### t=4000 ### + in CONNECTING for 1000 ms (1000 ms backoff) +run + should disconnect +connect-failed + in BACKOFF for 0 ms (2000 ms backoff) + 0 successful connections out of 2 attempts, seqno 0 + +# Back off for 2000 ms. +timeout + advance 2000 ms + +### t=6000 ### + in BACKOFF for 2000 ms (2000 ms backoff) +run + should connect + +# Third connection attempt succeeds after 500 ms. +connecting + in CONNECTING for 0 ms (2000 ms backoff) +advance 500 + +### t=6500 ### + in CONNECTING for 500 ms (2000 ms backoff) +run +connected + in ACTIVE for 0 ms (2000 ms backoff) + created 1000, last activity 1000, last connected 6500 + 1 successful connections out of 3 attempts, seqno 1 + connected + last connected 0 ms ago, connected 0 ms total + +# Connection receives 3 chunks of data spaced 2000 ms apart. +advance 2000 + +### t=8500 ### + in ACTIVE for 2000 ms (2000 ms backoff) +run +activity + created 1000, last activity 8500, last connected 6500 +advance 2000 + +### t=10500 ### + in ACTIVE for 4000 ms (2000 ms backoff) +run +activity + created 1000, last activity 10500, last connected 6500 +advance 2000 + +### t=12500 ### + in ACTIVE for 6000 ms (2000 ms backoff) +run +activity + created 1000, last activity 12500, last connected 6500 + +# Forcefully reconnect. 
+force-reconnect
+  in RECONNECT for 0 ms (2000 ms backoff)
+  1 successful connections out of 3 attempts, seqno 2
+  disconnected
+run
+  should disconnect
+connecting
+  in CONNECTING for 0 ms (2000 ms backoff)
+
+# Back off for 2000 ms.
+timeout
+  advance 2000 ms
+
+### t=14500 ###
+  in CONNECTING for 2000 ms (2000 ms backoff)
+  last connected 8000 ms ago, connected 6000 ms total
+
+# Connection succeeds.
+connected
+  in ACTIVE for 0 ms (2000 ms backoff)
+  created 1000, last activity 12500, last connected 14500
+  2 successful connections out of 4 attempts, seqno 3
+  connected
+  last connected 0 ms ago, connected 6000 ms total
+])
+
+######################################################################
+RECONNECT_CHECK([long connection resets backoff (graceful-reconnect)],
+  [enable
+
+# First connection attempt fails after 1000 ms.
+run
+connecting
+run
+timeout
+run
+connect-failed
+
+# Back off for 1000 ms.
+timeout
+run
+
+# Second connection attempt fails after 1000 ms.
+connecting
+timeout
+run
+connect-failed
+
+# Back off for 2000 ms.
+timeout
+run
+
+# Third connection attempt succeeds after 500 ms.
+connecting
+advance 500
+run
+connected
+
+# Connection receives 3 chunks of data spaced 2000 ms apart.
+advance 2000
+run
+activity
+advance 2000
+run
+activity
+advance 2000
+run
+activity
+
+# Gracefully reconnect.
+graceful-reconnect
+run
+connecting
+
+# Back off for 1000 ms.
+timeout
+
+# Connection succeeds.
+connected
+],
+  [### t=1000 ###
+enable
+  in BACKOFF for 0 ms (0 ms backoff)
+
+# First connection attempt fails after 1000 ms.
+run
+  should connect
+connecting
+  in CONNECTING for 0 ms (0 ms backoff)
+run
+timeout
+  advance 1000 ms
+
+### t=2000 ###
+  in CONNECTING for 1000 ms (0 ms backoff)
+run
+  should disconnect
+connect-failed
+  in BACKOFF for 0 ms (1000 ms backoff)
+  0 successful connections out of 1 attempts, seqno 0
+
+# Back off for 1000 ms.
+timeout
+  advance 1000 ms
+
+### t=3000 ###
+  in BACKOFF for 1000 ms (1000 ms backoff)
+run
+  should connect
+
+# Second connection attempt fails after 1000 ms.
+connecting
+  in CONNECTING for 0 ms (1000 ms backoff)
+timeout
+  advance 1000 ms
+
+### t=4000 ###
+  in CONNECTING for 1000 ms (1000 ms backoff)
+run
+  should disconnect
+connect-failed
+  in BACKOFF for 0 ms (2000 ms backoff)
+  0 successful connections out of 2 attempts, seqno 0
+
+# Back off for 2000 ms.
+timeout
+  advance 2000 ms
+
+### t=6000 ###
+  in BACKOFF for 2000 ms (2000 ms backoff)
+run
+  should connect
+
+# Third connection attempt succeeds after 500 ms.
+connecting
+  in CONNECTING for 0 ms (2000 ms backoff)
+advance 500
+
+### t=6500 ###
+  in CONNECTING for 500 ms (2000 ms backoff)
+run
+connected
+  in ACTIVE for 0 ms (2000 ms backoff)
+  created 1000, last activity 1000, last connected 6500
+  1 successful connections out of 3 attempts, seqno 1
+  connected
+  last connected 0 ms ago, connected 0 ms total
+
+# Connection receives 3 chunks of data spaced 2000 ms apart.
+advance 2000
+
+### t=8500 ###
+  in ACTIVE for 2000 ms (2000 ms backoff)
+run
+activity
+  created 1000, last activity 8500, last connected 6500
+advance 2000
+
+### t=10500 ###
+  in ACTIVE for 4000 ms (2000 ms backoff)
+run
+activity
+  created 1000, last activity 10500, last connected 6500
+advance 2000
+
+### t=12500 ###
+  in ACTIVE for 6000 ms (2000 ms backoff)
+run
+activity
+  created 1000, last activity 12500, last connected 6500
+
+# Gracefully reconnect.
+graceful-reconnect
+  in RECONNECT for 0 ms (1000 ms backoff)
+  1 successful connections out of 3 attempts, seqno 2
+  disconnected
+run
+  should disconnect
+connecting
+  in CONNECTING for 0 ms (1000 ms backoff)
+
+# Back off for 1000 ms.
+timeout
+  advance 1000 ms
+
+### t=13500 ###
+  in CONNECTING for 1000 ms (1000 ms backoff)
+  last connected 7000 ms ago, connected 6000 ms total
+
+# Connection succeeds.
+connected
+  in ACTIVE for 0 ms (1000 ms backoff)
+  created 1000, last activity 12500, last connected 13500
+  2 successful connections out of 4 attempts, seqno 3
+  connected
+  last connected 0 ms ago, connected 6000 ms total
+])
diff --git a/tests/test-reconnect.c b/tests/test-reconnect.c
index c84bb1cdbf..cabf228f84 100644
--- a/tests/test-reconnect.c
+++ b/tests/test-reconnect.c
@@ -108,6 +108,12 @@ do_force_reconnect(struct ovs_cmdl_context *ctx OVS_UNUSED)
     reconnect_force_reconnect(reconnect, now);
 }
 
+static void
+do_graceful_reconnect(struct ovs_cmdl_context *ctx OVS_UNUSED)
+{
+    reconnect_graceful_reconnect(reconnect, now);
+}
+
 static int
 error_from_string(const char *s)
 {
@@ -293,6 +299,7 @@ static const struct ovs_cmdl_command all_commands[] = {
     { "enable", NULL, 0, 0, do_enable, OVS_RO },
     { "disable", NULL, 0, 0, do_disable, OVS_RO },
     { "force-reconnect", NULL, 0, 0, do_force_reconnect, OVS_RO },
+    { "graceful-reconnect", NULL, 0, 0, do_graceful_reconnect, OVS_RO },
     { "disconnected", NULL, 0, 1, do_disconnected, OVS_RO },
     { "connecting", NULL, 0, 0, do_connecting, OVS_RO },
     { "connect-failed", NULL, 0, 1, do_connect_failed, OVS_RO },
diff --git a/tests/test-reconnect.py b/tests/test-reconnect.py
index cea48eb527..b397eac656 100644
--- a/tests/test-reconnect.py
+++ b/tests/test-reconnect.py
@@ -33,6 +33,10 @@ def do_force_reconnect(_):
     r.force_reconnect(now)
 
 
+def do_graceful_reconnect(_):
+    r.graceful_reconnect(now)
+
+
 def error_from_string(s):
     if not s:
         return 0
@@ -176,6 +180,7 @@ def main():
         "enable": do_enable,
         "disable": do_disable,
         "force-reconnect": do_force_reconnect,
+        "graceful-reconnect": do_graceful_reconnect,
         "disconnected": do_disconnected,
         "connecting": do_connecting,
         "connect-failed": do_connect_failed,
Until now clients that needed to reconnect immediately could only use
reconnect_force_reconnect().  However, reconnect_force_reconnect()
doesn't reset the backoff for connections that were alive long enough
(more than backoff seconds).

Moreover, the reconnect library cannot determine the exact reason why a
client wishes to initiate a reconnection.  In most cases reconnection
happens because of a fatal error when communicating with the remote,
e.g., in the ovsdb-cs layer, when invalid messages are received from
ovsdb-server.  In such cases it makes sense to not reset the backoff
because the remote seems to be unhealthy.

There are however cases when reconnection is needed for other reasons.
One such example is when ovsdb-clients require "leader-only" connections
to clustered ovsdb-server databases.  Whenever the client determines
that the remote is not a leader anymore, it decides to reconnect to a
new remote from its list, searching for the new leader.  Using
jsonrpc_force_reconnect() (which calls reconnect_force_reconnect()) will
not reset backoff even though the former leader is still likely in good
shape.

Since 3c2d6274bcee ("raft: Transfer leadership before creating
snapshots.") leadership changes inside the clustered database happen
more often and therefore "leader-only" clients need to reconnect more
often too.  Not resetting the backoff every time a leadership change
happens will cause all reconnections to happen with the maximum backoff
(8 seconds) resulting in significant latency.

This commit also updates the Python reconnect and IDL implementations
and adds tests for force-reconnect and graceful-reconnect.
Reported-at: https://bugzilla.redhat.com/1977264
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
 lib/jsonrpc.c           |   7 +
 lib/jsonrpc.h           |   1 +
 lib/ovsdb-cs.c          |  32 ++--
 lib/reconnect.c         |  36 ++++-
 lib/reconnect.h         |   1 +
 python/ovs/db/idl.py    |   8 +-
 python/ovs/jsonrpc.py   |   3 +
 python/ovs/reconnect.py |  34 +++-
 tests/reconnect.at      | 344 ++++++++++++++++++++++++++++++++++++++++
 tests/test-reconnect.c  |   7 +
 tests/test-reconnect.py |   5 +
 11 files changed, 454 insertions(+), 24 deletions(-)
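For reviewers who want to sanity-check the backoff arithmetic in the tests above, here is a small standalone model of the reset-vs-double decision that the patch factors out into `_reset_backoff()`. This is an illustrative sketch only, not the `ovs.reconnect` module itself: the function name `next_backoff`, the string state names, and the hard-coded 1000/8000 ms bounds (the library's defaults) are all assumptions made for the example.

```python
MIN_BACKOFF = 1000  # ms, library default
MAX_BACKOFF = 8000  # ms, library default


def next_backoff(state, backoff, last_activity, last_connected,
                 passive=False):
    """Model the backoff update performed when a connection drops.

    A connection that was ACTIVE or IDLE and stayed up for at least
    `backoff` ms is considered healthy, so its backoff is reset;
    anything else doubles the backoff, capped at MAX_BACKOFF.
    """
    if (state in ("ACTIVE", "IDLE")
            and (last_activity - last_connected >= backoff or passive)):
        # Passive (listening) connections restart immediately.
        return 0 if passive else MIN_BACKOFF
    if backoff < MIN_BACKOFF:
        return MIN_BACKOFF
    if backoff < MAX_BACKOFF / 2:
        return backoff * 2
    return MAX_BACKOFF


# Mirrors the graceful-reconnect test above: 6000 ms of activity with a
# 2000 ms backoff resets to the 1000 ms minimum...
print(next_backoff("ACTIVE", 2000, 12500, 6500))  # 1000
# ...while a failed connection attempt keeps doubling instead.
print(next_backoff("CONNECTING", 1000, 0, 0))     # 2000
```

The point of the patch is which path a client-initiated reconnect takes: graceful-reconnect runs this reset check (so the healthy ex-leader case lands on the first branch), while force-reconnect skips it and keeps the current backoff.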