mbox series

[ovs-dev,0/6] ovsdb: conversion: Bug fixes & Optimizations.

Message ID 20230327194301.3773536-1-i.maximets@ovn.org
Headers show
Series ovsdb: conversion: Bug fixes & Optimizations. | expand

Message

Ilya Maximets March 27, 2023, 7:42 p.m. UTC
This patch set aims to solve the issue of database conversion for
reasonably sized databases in high-scale OVN setups discussed here:
  https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html

The general approach is described in this email:
  https://mail.openvswitch.org/pipermail/ovs-discuss/2023-January/052158.html

The first patch in the set is a pure bug fix that should be backported.
The second patch also can be considered as a bug fix, though not very
important one.

Patch 3 is a ground work for the database file format change.
Patch 4 is an actual file format change that allows performance
improvements.

Patch 5 is an additional performance improvement on top of 4.

The last patch is covering a bit different area and can generally be
considered outside of this patch set.  It is aimed to solve the
initial monitor request storm caused by the conversion.  Without it,
conversion is passing, but it may take up to an hour to catch up
all the clients, with ovsdb-server being non-responsive for tens
of minutes, depending on a scale.

More details are in commit messages of each patch.

We may consider backporting patch 3 to branch 3.1 (earlier?) in order
to simplify upgrades for OpenStack or ovn-kubernetes deployments.

The patch set allows to successfully perform database conversion
on a 250-node cluster loaded with ovn-heater's cluster-density test.

500-node version of the test can also succeed with a small hiccup
caused by a short RAFT disconnection due to busy-sending of 100 GB
of initial monitor replies to all the clients.  In my testing, cluster
recovers from it in just 30 seconds though (16 second election timer).
There are ways to solve this as well, but the issue is outside of
the scope for this patch set.  See the last patch for more details
on the issue.

Before this change, conversion with either 250 or 500 nodes in this
test is practically impossible.


Ilya Maximets (6):
  ovsdb-tool: Fix cluster-to-standalone for DB conversion records.
  ovsdb: Check for ephemeral columns before writing a new schema.
  ovsdb: Allow conversion records with no data in a clustered storage.
  ovsdb: Perform conversion with no data for clustered databases.
  ovsdb: Avoid converting database twice on an initiator.
  ovsdb: monitor: Keep and maintain the initial change set.

 Documentation/ref/ovsdb.7.rst | 63 +++++++++++++++++++++++++++
 NEWS                          | 10 +++++
 ovsdb/log.c                   | 17 ++++++++
 ovsdb/log.h                   |  3 ++
 ovsdb/monitor.c               |  5 ++-
 ovsdb/ovsdb-server.c          | 80 ++++++++++++++++++++++++++---------
 ovsdb/ovsdb-tool.c            | 43 +++++++++++++++++--
 ovsdb/ovsdb.c                 | 34 +++++++++++++++
 ovsdb/ovsdb.h                 |  3 ++
 ovsdb/relay.c                 | 20 +++++++--
 ovsdb/relay.h                 |  9 +++-
 ovsdb/storage.c               | 24 ++++++++---
 ovsdb/storage.h               |  2 +-
 ovsdb/transaction.c           |  8 ++--
 ovsdb/transaction.h           |  4 +-
 ovsdb/trigger.c               | 53 ++++++++++++++++++-----
 ovsdb/trigger.h               |  7 +++
 tests/ovsdb-tool.at           | 69 ++++++++++++++++++++++++++++++
 18 files changed, 399 insertions(+), 55 deletions(-)

Comments

Dumitru Ceara April 6, 2023, 2:31 p.m. UTC | #1
On 3/27/23 21:42, Ilya Maximets wrote:
> This patch set aims to solve the issue of database conversion for
> reasonably sized databases in high-scale OVN setups discussed here:
>   https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
> 
> The general approach is described in this email:
>   https://mail.openvswitch.org/pipermail/ovs-discuss/2023-January/052158.html
> 
> The first patch in the set is a pure bug fix that should be backported.
> The second patch also can be considered as a bug fix, though not very
> important one.

I think both patches can be safely backported.

> 
> Patch 3 is a ground work for the database file format change.
> Patch 4 is an actual file format change that allows performance
> improvements.
> 
> Patch 5 is an additional performance improvement on top of 4.
> 
> The last patch is covering a bit different area and can generally be
> considered outside of this patch set.  It is aimed to solve the
> initial monitor request storm caused by the conversion.  Without it,
> conversion is passing, but it may take up to an hour to catch up
> all the clients, with ovsdb-server being non-responsive for tens
> of minutes, depending on a scale.
> 
> More details are in commit messages of each patch.
> 
> We may consider backporting patch 3 to branch 3.1 (earlier?) in order
> to simplify upgrades for OpenStack or ovn-kubernetes deployments.
> 

+1 for backporting patch 3.

> The patch set allows to successfully perform database conversion
> on a 250-node cluster loaded with ovn-heater's cluster-density test.
> 
> 500-node version of the test can also succeed with a small hiccup
> caused by a short RAFT disconnection due to busy-sending of 100 GB
> of initial monitor replies to all the clients.  In my testing, cluster
> recovers from it in just 30 seconds though (16 second election timer).
> There are ways to solve this as well, but the issue is outside of
> the scope for this patch set.  See the last patch for more details
> on the issue.
> 
> Before this change, conversion with either 250 or 500 nodes in this
> test is practically impossible.
> 

Great work!

Regards,
Dumitru

> 
> Ilya Maximets (6):
>   ovsdb-tool: Fix cluster-to-standalone for DB conversion records.
>   ovsdb: Check for ephemeral columns before writing a new schema.
>   ovsdb: Allow conversion records with no data in a clustered storage.
>   ovsdb: Perform conversion with no data for clustered databases.
>   ovsdb: Avoid converting database twice on an initiator.
>   ovsdb: monitor: Keep and maintain the initial change set.
> 
>  Documentation/ref/ovsdb.7.rst | 63 +++++++++++++++++++++++++++
>  NEWS                          | 10 +++++
>  ovsdb/log.c                   | 17 ++++++++
>  ovsdb/log.h                   |  3 ++
>  ovsdb/monitor.c               |  5 ++-
>  ovsdb/ovsdb-server.c          | 80 ++++++++++++++++++++++++++---------
>  ovsdb/ovsdb-tool.c            | 43 +++++++++++++++++--
>  ovsdb/ovsdb.c                 | 34 +++++++++++++++
>  ovsdb/ovsdb.h                 |  3 ++
>  ovsdb/relay.c                 | 20 +++++++--
>  ovsdb/relay.h                 |  9 +++-
>  ovsdb/storage.c               | 24 ++++++++---
>  ovsdb/storage.h               |  2 +-
>  ovsdb/transaction.c           |  8 ++--
>  ovsdb/transaction.h           |  4 +-
>  ovsdb/trigger.c               | 53 ++++++++++++++++++-----
>  ovsdb/trigger.h               |  7 +++
>  tests/ovsdb-tool.at           | 69 ++++++++++++++++++++++++++++++
>  18 files changed, 399 insertions(+), 55 deletions(-)
>
Ilya Maximets April 6, 2023, 10:14 p.m. UTC | #2
On 4/6/23 16:31, Dumitru Ceara wrote:
> On 3/27/23 21:42, Ilya Maximets wrote:
>> This patch set aims to solve the issue of database conversion for
>> reasonably sized databases in high-scale OVN setups discussed here:
>>   https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
>>
>> The general approach is described in this email:
>>   https://mail.openvswitch.org/pipermail/ovs-discuss/2023-January/052158.html
>>
>> The first patch in the set is a pure bug fix that should be backported.
>> The second patch also can be considered as a bug fix, though not very
>> important one.
> 
> I think both patches can be safely backported.

Sure.

> 
>>
>> Patch 3 is a ground work for the database file format change.
>> Patch 4 is an actual file format change that allows performance
>> improvements.
>>
>> Patch 5 is an additional performance improvement on top of 4.
>>
>> The last patch is covering a bit different area and can generally be
>> considered outside of this patch set.  It is aimed to solve the
>> initial monitor request storm caused by the conversion.  Without it,
>> conversion is passing, but it may take up to an hour to catch up
>> all the clients, with ovsdb-server being non-responsive for tens
>> of minutes, depending on a scale.
>>
>> More details are in commit messages of each patch.
>>
>> We may consider backporting patch 3 to branch 3.1 (earlier?) in order
>> to simplify upgrades for OpenStack or ovn-kubernetes deployments.
>>
> 
> +1 for backporting patch 3.

Ack.

> 
>> The patch set allows to successfully perform database conversion
>> on a 250-node cluster loaded with ovn-heater's cluster-density test.
>>
>> 500-node version of the test can also succeed with a small hiccup
>> caused by a short RAFT disconnection due to busy-sending of 100 GB
>> of initial monitor replies to all the clients.  In my testing, cluster
>> recovers from it in just 30 seconds though (16 second election timer).
>> There are ways to solve this as well, but the issue is outside of
>> the scope for this patch set.  See the last patch for more details
>> on the issue.
>>
>> Before this change, conversion with either 250 or 500 nodes in this
>> test is practically impossible.
>>
> 
> Great work!

Thanks, Simon and Dumitru for review!

I can fix the couple of highlighted nits while applying the patch set,
but I'll wait a bit for Frode to test these changes on his data first.

FWIW, I'll be mostly out on PTO for the next 2 weeks.  So, there is
some time. :)

Best regards, Ilya Maximets.

> 
> Regards,
> Dumitru
> 
>>
>> Ilya Maximets (6):
>>   ovsdb-tool: Fix cluster-to-standalone for DB conversion records.
>>   ovsdb: Check for ephemeral columns before writing a new schema.
>>   ovsdb: Allow conversion records with no data in a clustered storage.
>>   ovsdb: Perform conversion with no data for clustered databases.
>>   ovsdb: Avoid converting database twice on an initiator.
>>   ovsdb: monitor: Keep and maintain the initial change set.
>>
>>  Documentation/ref/ovsdb.7.rst | 63 +++++++++++++++++++++++++++
>>  NEWS                          | 10 +++++
>>  ovsdb/log.c                   | 17 ++++++++
>>  ovsdb/log.h                   |  3 ++
>>  ovsdb/monitor.c               |  5 ++-
>>  ovsdb/ovsdb-server.c          | 80 ++++++++++++++++++++++++++---------
>>  ovsdb/ovsdb-tool.c            | 43 +++++++++++++++++--
>>  ovsdb/ovsdb.c                 | 34 +++++++++++++++
>>  ovsdb/ovsdb.h                 |  3 ++
>>  ovsdb/relay.c                 | 20 +++++++--
>>  ovsdb/relay.h                 |  9 +++-
>>  ovsdb/storage.c               | 24 ++++++++---
>>  ovsdb/storage.h               |  2 +-
>>  ovsdb/transaction.c           |  8 ++--
>>  ovsdb/transaction.h           |  4 +-
>>  ovsdb/trigger.c               | 53 ++++++++++++++++++-----
>>  ovsdb/trigger.h               |  7 +++
>>  tests/ovsdb-tool.at           | 69 ++++++++++++++++++++++++++++++
>>  18 files changed, 399 insertions(+), 55 deletions(-)
>>
>
Ilya Maximets April 24, 2023, 10:41 p.m. UTC | #3
On 4/7/23 00:14, Ilya Maximets wrote:
> On 4/6/23 16:31, Dumitru Ceara wrote:
>> On 3/27/23 21:42, Ilya Maximets wrote:
>>> This patch set aims to solve the issue of database conversion for
>>> reasonably sized databases in high-scale OVN setups discussed here:
>>>   https://mail.openvswitch.org/pipermail/ovs-discuss/2022-December/052140.html
>>>
>>> The general approach is described in this email:
>>>   https://mail.openvswitch.org/pipermail/ovs-discuss/2023-January/052158.html
>>>
>>> The first patch in the set is a pure bug fix that should be backported.
>>> The second patch also can be considered as a bug fix, though not very
>>> important one.
>>
>> I think both patches can be safely backported.
> 
> Sure.
> 
>>
>>>
>>> Patch 3 is a ground work for the database file format change.
>>> Patch 4 is an actual file format change that allows performance
>>> improvements.
>>>
>>> Patch 5 is an additional performance improvement on top of 4.
>>>
>>> The last patch is covering a bit different area and can generally be
>>> considered outside of this patch set.  It is aimed to solve the
>>> initial monitor request storm caused by the conversion.  Without it,
>>> conversion is passing, but it may take up to an hour to catch up
>>> all the clients, with ovsdb-server being non-responsive for tens
>>> of minutes, depending on a scale.
>>>
>>> More details are in commit messages of each patch.
>>>
>>> We may consider backporting patch 3 to branch 3.1 (earlier?) in order
>>> to simplify upgrades for OpenStack or ovn-kubernetes deployments.
>>>
>>
>> +1 for backporting patch 3.
> 
> Ack.
> 
>>
>>> The patch set allows to successfully perform database conversion
>>> on a 250-node cluster loaded with ovn-heater's cluster-density test.
>>>
>>> 500-node version of the test can also succeed with a small hiccup
>>> caused by a short RAFT disconnection due to busy-sending of 100 GB
>>> of initial monitor replies to all the clients.  In my testing, cluster
>>> recovers from it in just 30 seconds though (16 second election timer).
>>> There are ways to solve this as well, but the issue is outside of
>>> the scope for this patch set.  See the last patch for more details
>>> on the issue.
>>>
>>> Before this change, conversion with either 250 or 500 nodes in this
>>> test is practically impossible.
>>>
>>
>> Great work!
> 
> Thanks, Simon and Dumitru for review!
> 
> I can fix the couple of highlighted nits while applying the patch set,
> but I'll wait a bit for Frode to test these changes on his data first.
> 
> FWIW, I'll be mostly out on PTO for the next 2 weeks.  So, there is
> some time. :)

Since there is no real point in waiting any longer, I applied the set now.
Thanks again!

First 3 patches also backported down to 2.17.

Best regards, Ilya Maximets.

> 
> Best regards, Ilya Maximets.
> 
>>
>> Regards,
>> Dumitru
>>
>>>
>>> Ilya Maximets (6):
>>>   ovsdb-tool: Fix cluster-to-standalone for DB conversion records.
>>>   ovsdb: Check for ephemeral columns before writing a new schema.
>>>   ovsdb: Allow conversion records with no data in a clustered storage.
>>>   ovsdb: Perform conversion with no data for clustered databases.
>>>   ovsdb: Avoid converting database twice on an initiator.
>>>   ovsdb: monitor: Keep and maintain the initial change set.
>>>
>>>  Documentation/ref/ovsdb.7.rst | 63 +++++++++++++++++++++++++++
>>>  NEWS                          | 10 +++++
>>>  ovsdb/log.c                   | 17 ++++++++
>>>  ovsdb/log.h                   |  3 ++
>>>  ovsdb/monitor.c               |  5 ++-
>>>  ovsdb/ovsdb-server.c          | 80 ++++++++++++++++++++++++++---------
>>>  ovsdb/ovsdb-tool.c            | 43 +++++++++++++++++--
>>>  ovsdb/ovsdb.c                 | 34 +++++++++++++++
>>>  ovsdb/ovsdb.h                 |  3 ++
>>>  ovsdb/relay.c                 | 20 +++++++--
>>>  ovsdb/relay.h                 |  9 +++-
>>>  ovsdb/storage.c               | 24 ++++++++---
>>>  ovsdb/storage.h               |  2 +-
>>>  ovsdb/transaction.c           |  8 ++--
>>>  ovsdb/transaction.h           |  4 +-
>>>  ovsdb/trigger.c               | 53 ++++++++++++++++++-----
>>>  ovsdb/trigger.h               |  7 +++
>>>  tests/ovsdb-tool.at           | 69 ++++++++++++++++++++++++++++++
>>>  18 files changed, 399 insertions(+), 55 deletions(-)
>>>
>>
>