diff mbox series

[ovs-dev,v2] controller: Avoid unnecessary load balancer flow processing.

Message ID 20210712080810.26164-1-dceara@redhat.com
State Superseded
Headers show
Series [ovs-dev,v2] controller: Avoid unnecessary load balancer flow processing. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Dumitru Ceara July 12, 2021, 8:08 a.m. UTC
Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
sequence of events happens:

1. The Southbound Load_Balancer record is updated.
2. The Southbound Datapath_Binding records on which the Load_Balancer is
   applied are updated.
3. Southbound ovsdb-server sends updates about the Load_Balancer and
   Datapath_Binding records to ovn-controller.
4. The IDL layer in ovn-controller processes the updates at #3, but
   because of the SB schema references between tables [0] all logical
   flows referencing the updated Datapath_Binding are marked as
   "updated".  The same is true for Logical_DP_Group records
   referencing the Datapath_Binding, and also for all logical flows
   pointing to the new "updated" datapath groups.
5. ovn-controller ends up recomputing (removing/readding) all flows for
   all these tracked updates.

From the SB Schema:
        "Datapath_Binding": {
            "columns": {
                [...]
                "load_balancers": {"type": {"key": {"type": "uuid",
                                                   "refTable": "Load_Balancer",
                                                   "refType": "weak"},
                                            "min": 0,
                                            "max": "unlimited"}},
        [...]
        "Load_Balancer": {
            "columns": {
                "datapaths": {
                [...]
                    "type": {"key": {"type": "uuid",
                                     "refTable": "Datapath_Binding"},
                             "min": 0, "max": "unlimited"}},
        [...]
        "Logical_DP_Group": {
            "columns": {
                "datapaths":
                    {"type": {"key": {"type": "uuid",
                                      "refTable": "Datapath_Binding",
                                      "refType": "weak"},
                              "min": 0, "max": "unlimited"}}},
        [...]
        "Logical_Flow": {
            "columns": {
                "logical_datapath":
                    {"type": {"key": {"type": "uuid",
                                      "refTable": "Datapath_Binding"},
                              "min": 0, "max": 1}},
                "logical_dp_group":
                    {"type": {"key": {"type": "uuid",
                                      "refTable": "Logical_DP_Group"},

In order to avoid this unnecessary Logical_Flow notification storm we
now remove the explicit reference from Datapath_Binding to
Load_Balancer and instead store raw UUIDs.

This means that on the ovn-controller side we need to perform a
Load_Balancer table lookup by UUID whenever a new datapath is added,
but that doesn't happen too often and the cost of the lookup is
negligible compared to the huge cost of processing the unnecessary
logical flow updates.

This change is backwards compatible because the contents stored in the
database are not changed, just that the schema constraints are relaxed a
bit.

Some performance measurements, on a scale test deployment simulating an
ovn-kubernetes deployment with 120 nodes and a large load balancer
with 16K VIPs associated to each node's logical switch, the event
processing loop time in ovn-controller, when adding a new VIP, is
reduced from ~39 seconds to ~8 seconds.

There's no need to change the northd DDlog implementation.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1978605
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
---
 controller/lflow.c  |  6 ++++--
 northd/ovn-northd.c | 14 ++++++--------
 ovn-sb.ovsschema    |  6 ++----
 3 files changed, 12 insertions(+), 14 deletions(-)

Comments

Ben Pfaff July 16, 2021, 12:03 a.m. UTC | #1
On Mon, Jul 12, 2021 at 10:08:10AM +0200, Dumitru Ceara wrote:
> Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
> sequence of events happens:
> 
> 1. The Southbound Load_Balancer record is updated.
> 2. The Southbound Datapath_Binding records on which the Load_Balancer is
>    applied are updated.
> 3. Southbound ovsdb-server sends updates about the Load_Balancer and
>    Datapath_Binding records to ovn-controller.
> 4. The IDL layer in ovn-controller processes the updates at #3, but
>    because of the SB schema references between tables [0] all logical
>    flows referencing the updated Datapath_Binding are marked as
>    "updated".  The same is true for Logical_DP_Group records
>    referencing the Datapath_Binding, and also for all logical flows
>    pointing to the new "updated" datapath groups.
> 5. ovn-controller ends up recomputing (removing/readding) all flows for
>    all these tracked updates.

This is kind of a weird change from my perspective.  It allows for
broken referential integrity in the database to work around a
performance bug in the IDL.
Han Zhou July 16, 2021, 1:05 a.m. UTC | #2
On Thu, Jul 15, 2021 at 5:03 PM Ben Pfaff <blp@ovn.org> wrote:

> On Mon, Jul 12, 2021 at 10:08:10AM +0200, Dumitru Ceara wrote:
> > Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
> > sequence of events happens:
> >
> > 1. The Southbound Load_Balancer record is updated.
> > 2. The Southbound Datapath_Binding records on which the Load_Balancer is
> >    applied are updated.
> > 3. Southbound ovsdb-server sends updates about the Load_Balancer and
> >    Datapath_Binding records to ovn-controller.
> > 4. The IDL layer in ovn-controller processes the updates at #3, but
> >    because of the SB schema references between tables [0] all logical
> >    flows referencing the updated Datapath_Binding are marked as
> >    "updated".  The same is true for Logical_DP_Group records
> >    referencing the Datapath_Binding, and also for all logical flows
> >    pointing to the new "updated" datapath groups.
> > 5. ovn-controller ends up recomputing (removing/readding) all flows for
> >    all these tracked updates.
>
> This is kind of a weird change from my perspective.  It allows for
> broken referential integrity in the database to work around a
> performance bug in the IDL.


Yes, it did look weird and there were detailed discussions on it in the v1
reviews. Some options that require much bigger changes were discussed for
longer term, unless ddlog is used.


> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Dumitru Ceara July 16, 2021, 7:43 a.m. UTC | #3
On 7/16/21 3:05 AM, Han Zhou wrote:
> On Thu, Jul 15, 2021 at 5:03 PM Ben Pfaff <blp@ovn.org> wrote:
> 
>> On Mon, Jul 12, 2021 at 10:08:10AM +0200, Dumitru Ceara wrote:
>>> Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
>>> sequence of events happens:
>>>
>>> 1. The Southbound Load_Balancer record is updated.
>>> 2. The Southbound Datapath_Binding records on which the Load_Balancer is
>>>    applied are updated.
>>> 3. Southbound ovsdb-server sends updates about the Load_Balancer and
>>>    Datapath_Binding records to ovn-controller.
>>> 4. The IDL layer in ovn-controller processes the updates at #3, but
>>>    because of the SB schema references between tables [0] all logical
>>>    flows referencing the updated Datapath_Binding are marked as
>>>    "updated".  The same is true for Logical_DP_Group records
>>>    referencing the Datapath_Binding, and also for all logical flows
>>>    pointing to the new "updated" datapath groups.
>>> 5. ovn-controller ends up recomputing (removing/readding) all flows for
>>>    all these tracked updates.
>>
>> This is kind of a weird change from my perspective.  It allows for
>> broken referential integrity in the database to work around a
>> performance bug in the IDL.

Thanks for looking at this change!

I guess the description in the commit message is a bit unclear.  Let me
try to fix that here.  There is no bug in the IDL; it's doing what it's
supposed to do given the Southbound database schema.

The problem is with the schema itself.  When flow generation for load
balancers was moved to ovn-controller an additional reference (from
Datapath_Binding to Load_Balancer) was added in the SB schema to avoid
performance issues or extremely complex code in ovn-controller to handle
Load_Balancer updates incrementally.

My first attempt at the fix was to just remove that additional
reference.  That is fine, and doesn't affect correctness but:
1. makes the upgrade scenario complex because ovn-controllers running
old versions would expect that reference to be there.
2. adds lots of complicated code in ovn-controller incremental-processing.

> 
> 
> Yes, it did look weird and there were detailed discussions on it in the v1
> reviews. Some options that require much bigger changes were discussed for
> longer term, unless ddlog is used.
> 

This would be the longer term solution, along with removing the
reference from Datapath_Binding to Load_Balancers completely.

But until then, we have huge performance impact in ovn-controller due to
this.  Quoting from my original commit log:

"Some performance measurements, on a scale test deployment simulating an
ovn-kubernetes deployment with 120 nodes and a large load balancer
with 16K VIPs associated to each node's logical switch, the event
processing loop time in ovn-controller, when adding a new VIP, is
reduced from ~39 seconds to ~8 seconds."

Regards,
Dumitru
Ben Pfaff July 16, 2021, 3:42 p.m. UTC | #4
On Fri, Jul 16, 2021 at 09:43:17AM +0200, Dumitru Ceara wrote:
> On 7/16/21 3:05 AM, Han Zhou wrote:
> > On Thu, Jul 15, 2021 at 5:03 PM Ben Pfaff <blp@ovn.org> wrote:
> > 
> >> On Mon, Jul 12, 2021 at 10:08:10AM +0200, Dumitru Ceara wrote:
> >>> Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
> >>> sequence of events happens:
> >>>
> >>> 1. The Southbound Load_Balancer record is updated.
> >>> 2. The Southbound Datapath_Binding records on which the Load_Balancer is
> >>>    applied are updated.
> >>> 3. Southbound ovsdb-server sends updates about the Load_Balancer and
> >>>    Datapath_Binding records to ovn-controller.
> >>> 4. The IDL layer in ovn-controller processes the updates at #3, but
> >>>    because of the SB schema references between tables [0] all logical
> >>>    flows referencing the updated Datapath_Binding are marked as
> >>>    "updated".  The same is true for Logical_DP_Group records
> >>>    referencing the Datapath_Binding, and also for all logical flows
> >>>    pointing to the new "updated" datapath groups.
> >>> 5. ovn-controller ends up recomputing (removing/readding) all flows for
> >>>    all these tracked updates.
> >>
> >> This is kind of a weird change from my perspective.  It allows for
> >> broken referential integrity in the database to work around a
> >> performance bug in the IDL.
> 
> Thanks for looking at this change!
> 
> I guess the description in the commit message is a bit unclear.  Let me
> try to fix that here.  There is no bug in the IDL; it's doing what it's
> supposed to do given the Southbound database schema.
> 
> The problem is with the schema itself.  When flow generation for load
> balancers was moved to ovn-controller an additional reference (from
> Datapath_Binding to Load_Balancer) was added in the SB schema to avoid
> performance issues or extremely complex code in ovn-controller to handle
> Load_Balancer updates incrementally.
> 
> My first attempt at the fix was to just remove that additional
> reference.  That is fine, and doesn't affect correctness but:
> 1. makes the upgrade scenario complex because ovn-controllers running
> old versions would expect that reference to be there.
> 2. adds lots of complicated code in ovn-controller incremental-processing.
> 
> > 
> > 
> > Yes, it did look weird and there were detailed discussions on it in the v1
> > reviews. Some options that require much bigger changes were discussed for
> > longer term, unless ddlog is used.
> > 
> 
> This would be the longer term solution, along with removing the
> reference from Datapath_Binding to Load_Balancers completely.

OK.  In context, it makes more sense.
Dumitru Ceara July 23, 2021, 4:29 p.m. UTC | #5
On 7/16/21 5:42 PM, Ben Pfaff wrote:
> On Fri, Jul 16, 2021 at 09:43:17AM +0200, Dumitru Ceara wrote:
>> On 7/16/21 3:05 AM, Han Zhou wrote:
>>> On Thu, Jul 15, 2021 at 5:03 PM Ben Pfaff <blp@ovn.org> wrote:
>>>
>>>> On Mon, Jul 12, 2021 at 10:08:10AM +0200, Dumitru Ceara wrote:
>>>>> Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
>>>>> sequence of events happens:
>>>>>
>>>>> 1. The Southbound Load_Balancer record is updated.
>>>>> 2. The Southbound Datapath_Binding records on which the Load_Balancer is
>>>>>    applied are updated.
>>>>> 3. Southbound ovsdb-server sends updates about the Load_Balancer and
>>>>>    Datapath_Binding records to ovn-controller.
>>>>> 4. The IDL layer in ovn-controller processes the updates at #3, but
>>>>>    because of the SB schema references between tables [0] all logical
>>>>>    flows referencing the updated Datapath_Binding are marked as
>>>>>    "updated".  The same is true for Logical_DP_Group records
>>>>>    referencing the Datapath_Binding, and also for all logical flows
>>>>>    pointing to the new "updated" datapath groups.
>>>>> 5. ovn-controller ends up recomputing (removing/readding) all flows for
>>>>>    all these tracked updates.
>>>>
>>>> This is kind of a weird change from my perspective.  It allows for
>>>> broken referential integrity in the database to work around a
>>>> performance bug in the IDL.
>>
>> Thanks for looking at this change!
>>
>> I guess the description in the commit message is a bit unclear.  Let me
>> try to fix that here.  There is no bug in the IDL; it's doing what it's
>> supposed to do given the Southbound database schema.
>>
>> The problem is with the schema itself.  When flow generation for load
>> balancers was moved to ovn-controller an additional reference (from
>> Datapath_Binding to Load_Balancer) was added in the SB schema to avoid
>> performance issues or extremely complex code in ovn-controller to handle
>> Load_Balancer updates incrementally.
>>
>> My first attempt at the fix was to just remove that additional
>> reference.  That is fine, and doesn't affect correctness but:
>> 1. makes the upgrade scenario complex because ovn-controllers running
>> old versions would expect that reference to be there.
>> 2. adds lots of complicated code in ovn-controller incremental-processing.
>>
>>>
>>>
>>> Yes, it did look weird and there were detailed discussions on it in the v1
>>> reviews. Some options that require much bigger changes were discussed for
>>> longer term, unless ddlog is used.
>>>
>>
>> This would be the longer term solution, along with removing the
>> reference from Datapath_Binding to Load_Balancers completely.
> 
> OK.  In context, it makes more sense.
> 

Looks like the time to remove the reference completely arrived sooner
than I had thought.

I just posted a patch to remove the loose reference completely because
it's still creating issues when load balancers are added/removed:

http://patchwork.ozlabs.org/project/ovn/patch/20210723162757.32274-1-dceara@redhat.com/

Regards,
Dumitru
diff mbox series

Patch

diff --git a/controller/lflow.c b/controller/lflow.c
index 60aa011ff..c58c4f25c 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1744,8 +1744,10 @@  lflow_processing_end:
     /* Add load balancer hairpin flows if the datapath has any load balancers
      * associated. */
     for (size_t i = 0; i < dp->n_load_balancers; i++) {
-        consider_lb_hairpin_flows(dp->load_balancers[i],
-                                  l_ctx_in->local_datapaths,
+        const struct sbrec_load_balancer *lb =
+            sbrec_load_balancer_table_get_for_uuid(l_ctx_in->lb_table,
+                                                   &dp->load_balancers[i]);
+        consider_lb_hairpin_flows(lb, l_ctx_in->local_datapaths,
                                   l_ctx_out->flow_table);
     }
 
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 562dc62b2..999c3f482 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -3635,19 +3635,17 @@  build_ovn_lbs(struct northd_context *ctx, struct hmap *datapaths,
             continue;
         }
 
-        const struct sbrec_load_balancer **sbrec_lbs =
-            xmalloc(od->nbs->n_load_balancer * sizeof *sbrec_lbs);
+        struct uuid *lb_uuids =
+            xmalloc(od->nbs->n_load_balancer * sizeof *lb_uuids);
         for (size_t i = 0; i < od->nbs->n_load_balancer; i++) {
             const struct uuid *lb_uuid =
                 &od->nbs->load_balancer[i]->header_.uuid;
             lb = ovn_northd_lb_find(lbs, lb_uuid);
-            sbrec_lbs[i] = lb->slb;
+            lb_uuids[i] = lb->slb->header_.uuid;
         }
-
-        sbrec_datapath_binding_set_load_balancers(
-            od->sb, (struct sbrec_load_balancer **)sbrec_lbs,
-            od->nbs->n_load_balancer);
-        free(sbrec_lbs);
+        sbrec_datapath_binding_set_load_balancers(od->sb, lb_uuids,
+                                                  od->nbs->n_load_balancer);
+        free(lb_uuids);
     }
 }
 
diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
index bbf60781d..57fbc3ca4 100644
--- a/ovn-sb.ovsschema
+++ b/ovn-sb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Southbound",
     "version": "20.18.0",
-    "cksum": "1816525029 26536",
+    "cksum": "779623696 26386",
     "tables": {
         "SB_Global": {
             "columns": {
@@ -166,9 +166,7 @@ 
                      "type": {"key": {"type": "integer",
                                       "minInteger": 1,
                                       "maxInteger": 16777215}}},
-                "load_balancers": {"type": {"key": {"type": "uuid",
-                                                   "refTable": "Load_Balancer",
-                                                   "refType": "weak"},
+                "load_balancers": {"type": {"key": {"type": "uuid"},
                                             "min": 0,
                                             "max": "unlimited"}},
                 "external_ids": {