Message ID | ZKujCdicu0osQgrt@SIT-SLAP8639.int.lidl.net |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev,v2] relay: allow setting probe interval | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 7/10/23 08:19, Felix Huettner via dev wrote: > previously it was not possible to set the probe interval for the > connection from a relay to the backing ovsdb-server. With this change it > is now possible using the > `ovsdb-server/set-active-ovsdb-server-probe-interval` command. > > The command `ovsdb-server/set-active-ovsdb-server-probe-interval` is > already used to set the probe interval for active-backup replication. > However this is mutally exclusive with being a relay and the case for > using the command is the same. Therefor we decided to reuse it instead > of adding a new one. Hi, Felix. Thanks for the patch! It is important to be able to set the probe interval for this direction, I agree. And in the absence of the unified configuration method, the appctl seems to be as an acceptable solution. Though I don't think we should mix configuration for active-backup with relays. The 'active-ovsdb-server' part in the name is a direct reference to an active-backup replication. Terminology is a bit different: AB service model relay service model ---------------- ------------------- backup relay active relay source So, maybe something like 'set-relay-source-probe-interval'? And I'm not sure it's mutually exclusive to have active-backup and a relay in the same process. What prevents it? For the other part of the change, it's not mentioned in the commit message, but mentioned in the NEWS file that the default values is changed to 60s. I'm not sure that's a right thing to do. In the active-backup model, the usual use-case is to back up a standalone database. And it's important to keep the connection alive for long since there is no other source to replicate anyway. On the other hand, relays are most likely used in pair with a clustered service model, and supposed to be faster in reaction to cluster changes. There are likely other servers that we can re-connect to. I'd keep the default inactivity probe value for relays as is, as it might be important for smaller clusters to fail-over faster. Users who frequently see disconnections due to inactivity probes on relay connection may evaluate reasons and adjust the value for themselves, if necessary. What do you think? Best regards, Ilya Maximets.
On Wed, Jul 12, 2023 at 01:15:57PM +0200, Ilya Maximets wrote: > On 7/10/23 08:19, Felix Huettner via dev wrote: > > previously it was not possible to set the probe interval for the > > connection from a relay to the backing ovsdb-server. With this change it > > is now possible using the > > `ovsdb-server/set-active-ovsdb-server-probe-interval` command. > > > > The command `ovsdb-server/set-active-ovsdb-server-probe-interval` is > > already used to set the probe interval for active-backup replication. > > However this is mutally exclusive with being a relay and the case for > > using the command is the same. Therefor we decided to reuse it instead > > of adding a new one. > > Hi, Felix. Thanks for the patch! > > It is important to be able to set the probe interval for this direction, > I agree. And in the absence of the unified configuration method, the > appctl seems to be as an acceptable solution. > > Though I don't think we should mix configuration for active-backup with > relays. The 'active-ovsdb-server' part in the name is a direct reference > to an active-backup replication. Terminology is a bit different: > > AB service model relay service model > ---------------- ------------------- > backup relay > active relay source > > So, maybe something like 'set-relay-source-probe-interval'? > > And I'm not sure it's mutually exclusive to have active-backup and a > relay in the same process. What prevents it? > Hi Ilya, thanks for the feedback. Then i'll include a new appctl command and stay with the existing 5s default. Thanks Felix > > For the other part of the change, it's not mentioned in the commit message, > but mentioned in the NEWS file that the default values is changed to 60s. > I'm not sure that's a right thing to do. In the active-backup model, the > usual use-case is to back up a standalone database. And it's important to > keep the connection alive for long since there is no other source to > replicate anyway. On the other hand, relays are most likely used in pair > with a clustered service model, and supposed to be faster in reaction to > cluster changes. There are likely other servers that we can re-connect to. > > I'd keep the default inactivity probe value for relays as is, as it might > be important for smaller clusters to fail-over faster. Users who frequently > see disconnections due to inactivity probes on relay connection may evaluate > reasons and adjust the value for themselves, if necessary. > > What do you think? > > Best regards, Ilya Maximets. Diese E Mail enthält möglicherweise vertrauliche Inhalte und ist nur für die Verwertung durch den vorgesehenen Empfänger bestimmt. Sollten Sie nicht der vorgesehene Empfänger sein, setzen Sie den Absender bitte unverzüglich in Kenntnis und löschen diese E Mail. Hinweise zum Datenschutz finden Sie hier<https://www.datenschutz.schwarz>. This e-mail may contain confidential content and is intended only for the specified recipient/s. If you are not the intended recipient, please inform the sender immediately and delete this e-mail. Information on data protection can be found here<https://www.datenschutz.schwarz>.
diff --git a/NEWS b/NEWS index 6a990c921..1920830bd 100644 --- a/NEWS +++ b/NEWS @@ -10,6 +10,10 @@ Post-v3.1.0 conversion operation is present. For the cluster service model follow upgrade instructions in 'Upgrading from version 3.1 and earlier to 3.2 and later' section of ovsdb(7). + * When ovsdb-server is running in relay mode, the default value of probe + interval is increased to 60 seconds for the connection to the + backing server. This value is configurable with the unixctl + command - ovsdb-server/set-active-ovsdb-server-probe-interval. - IPFIX template and statistics intervals can now be configured through two new options in the IPFIX table: 'template_interval' and 'stats_interval'. - Linux kernel datapath: diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c index 9bad0c8dd..d29dde417 100644 --- a/ovsdb/ovsdb-server.c +++ b/ovsdb/ovsdb-server.c @@ -797,7 +797,8 @@ open_db(struct server_config *config, const char *filename) add_db(config, db); if (is_relay) { - ovsdb_relay_add_db(db->db, relay_remotes, update_schema, config); + ovsdb_relay_add_db(db->db, relay_remotes, update_schema, config, + *config->replication_probe_interval); } return NULL; } @@ -1473,6 +1474,7 @@ ovsdb_server_set_active_ovsdb_server_probe_interval(struct unixctl_conn *conn, if (*config->is_backup) { replication_set_probe_interval(probe_interval); } + ovsdb_relay_set_probe_interval(probe_interval); unixctl_command_reply(conn, NULL); } else { unixctl_command_reply( diff --git a/ovsdb/relay.c b/ovsdb/relay.c index 377f3285f..21720f62f 100644 --- a/ovsdb/relay.c +++ b/ovsdb/relay.c @@ -127,7 +127,7 @@ static struct ovsdb_cs_ops relay_cs_ops = { void ovsdb_relay_add_db(struct ovsdb *db, const char *remote, schema_change_callback schema_change_cb, - void *schema_change_aux) + void *schema_change_aux, int probe_interval) { struct relay_ctx *ctx; @@ -152,10 +152,23 @@ ovsdb_relay_add_db(struct ovsdb *db, const char *remote, shash_add(&relay_dbs, db->name, ctx); ovsdb_cs_set_leader_only(ctx->cs, false); ovsdb_cs_set_remote(ctx->cs, remote, true); + ovsdb_cs_set_probe_interval(ctx->cs, probe_interval); VLOG_DBG("added database: %s, %s", db->name, remote); } +/* Updates the probe interval for all relay connections to the specified + * value*/ +void +ovsdb_relay_set_probe_interval(int probe_interval) +{ + struct shash_node *node; + SHASH_FOR_EACH (node, &relay_dbs) { + struct relay_ctx *ctx = node->data; + ovsdb_cs_set_probe_interval(ctx->cs, probe_interval); + } +} + void ovsdb_relay_del_db(struct ovsdb *db) { diff --git a/ovsdb/relay.h b/ovsdb/relay.h index f841554ca..ccfaf3c93 100644 --- a/ovsdb/relay.h +++ b/ovsdb/relay.h @@ -33,11 +33,13 @@ typedef struct ovsdb_error *(*schema_change_callback)( void ovsdb_relay_add_db(struct ovsdb *, const char *remote, schema_change_callback schema_change_cb, - void *schema_change_aux); + void *schema_change_aux, int probe_interval); void ovsdb_relay_del_db(struct ovsdb *); void ovsdb_relay_run(void); void ovsdb_relay_wait(void); +void ovsdb_relay_set_probe_interval(int); + bool ovsdb_relay_is_connected(struct ovsdb *); #endif /* OVSDB_RELAY_H */