diff mbox series

[ovs-dev,v3,2/3] norhd: optimize build_lrouter_defrag_flows_for_lb

Message ID 2697542e168fb3e040650eedbf21aef8a0a4848c.1629996550.git.lorenzo.bianconi@redhat.com
State Accepted
Headers show
Series improve code efficiency for ovn-northd lb-unreachable ips | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Lorenzo Bianconi Aug. 26, 2021, 4:50 p.m. UTC
Similar to build_lb_rules(), precompute hash and lflow pointer in
build_lrouter_defrag_flows_for_lb routine since they do not depend on
datapath updating datapath group. Using this approach we can reduce
ovn-northd loop time of ~1s running an ovnnb_db from production
environment:

ovn-master:
-----------
Statistics for 'ovnnb_db_run'
  Total samples: 37
  Maximum: 12656 msec
  Minimum: 12276 msec
  95th percentile: 12557.000000 msec
  Short term average: 12475.213654 msec
  Long term average: 12336.498446 msec

build_lb_rules + build_lrouter_defrag_flows_for_lb:
---------------------------------------------------
Statistics for 'ovnnb_db_run'
  Total samples: 37
  Maximum: 11266 msec
  Minimum: 11039 msec
  95th percentile: 11194.000000 msec
  Short term average: 11145.945629 msec
  Long term average: 11075.628051 msec

ovn-nbctl lr-list | wc -l
121
ovn-nbctl ls-list | wc -l
241
ovn-nbctl lb-list | wc -l
47077
ovn-sbctl dump-flows |wc -l
9852935

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
---
 northd/ovn-northd.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

Comments

Mark Gray Aug. 27, 2021, 7:04 a.m. UTC | #1
On 26/08/2021 17:50, Lorenzo Bianconi wrote:
> Similar to build_lb_rules(), precompute hash and lflow pointer in
> build_lrouter_defrag_flows_for_lb routine since they do not depend on
> datapath updating datapath group. Using this approach we can reduce
> ovn-northd loop time of ~1s running an ovnnb_db from production
> environment:
> 
> ovn-master:
> -----------
> Statistics for 'ovnnb_db_run'
>   Total samples: 37
>   Maximum: 12656 msec
>   Minimum: 12276 msec
>   95th percentile: 12557.000000 msec
>   Short term average: 12475.213654 msec
>   Long term average: 12336.498446 msec
> 
> build_lb_rules + build_lrouter_defrag_flows_for_lb:
> ---------------------------------------------------
> Statistics for 'ovnnb_db_run'
>   Total samples: 37
>   Maximum: 11266 msec
>   Minimum: 11039 msec
>   95th percentile: 11194.000000 msec
>   Short term average: 11145.945629 msec
>   Long term average: 11075.628051 msec
> 
> ovn-nbctl lr-list | wc -l
> 121
> ovn-nbctl ls-list | wc -l
> 241
> ovn-nbctl lb-list | wc -l
> 47077
> ovn-sbctl dump-flows |wc -l
> 9852935
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> ---
>  northd/ovn-northd.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 5efa54ee5..61fb5b159 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -9476,11 +9476,21 @@ build_lrouter_defrag_flows_for_lb(struct ovn_northd_lb *lb,
>  
>          ds_put_format(&defrag_actions, "ct_dnat;");
>  
> +        struct ovn_lflow *lflow_ref = NULL;
> +        uint32_t hash = ovn_logical_flow_hash(
> +                ovn_stage_get_table(S_ROUTER_IN_DEFRAG),
> +                ovn_stage_get_pipeline_name(S_ROUTER_IN_DEFRAG), prio,
> +                ds_cstr(match), ds_cstr(&defrag_actions));
>          for (size_t j = 0; j < lb->n_nb_lr; j++) {
> -            ovn_lflow_add_with_hint(lflows, lb->nb_lr[j], S_ROUTER_IN_DEFRAG,
> -                                    prio, ds_cstr(match),
> -                                    ds_cstr(&defrag_actions),
> -                                    &lb->nlb->header_);
> +            struct ovn_datapath *od = lb->nb_lr[j];
> +            if (ovn_dp_group_add_with_reference(lflow_ref, od, hash)) {
> +                continue;
> +            }

For all of these patches, could something like the following change give
a further optimization? It would call ovn_lflow__xxxxx() only once and I
think it simplifies the code. Does it give a benefit? What do you think?

	/* Build datapath group */
	struct hmapx od_group;
	hmapx_init(&od_group);
        for (size_t j = 0; j < lb->n_nb_lr; j++) {
            struct ovn_datapath *od = lb->nb_lr[j];
            if (ovn_dp_group_add_with_reference(&od_group, od)) {
                continue;
            }
        }

	/* Add lfow specifying datapath group rather than datapath */
	ovn_lflow_add_at_with_dp_group(lflows,
                    S_SWITCH_IN_STATEFUL, priority,
                    ds_cstr(match), ds_cstr(action),
                    NULL, meter, &lb->nlb->header_);

> +            lflow_ref = ovn_lflow_add_at_with_hash(lflows, od,
> +                                    S_ROUTER_IN_DEFRAG, prio,
> +                                    ds_cstr(match), ds_cstr(&defrag_actions),
> +                                    NULL, NULL, &lb->nlb->header_,
> +                                    OVS_SOURCE_LOCATOR, hash);
>          }
>      }
>      ds_destroy(&defrag_actions);
>
Lorenzo Bianconi Aug. 27, 2021, 8:18 a.m. UTC | #2
> On 26/08/2021 17:50, Lorenzo Bianconi wrote:
> > Similar to build_lb_rules(), precompute hash and lflow pointer in
> > build_lrouter_defrag_flows_for_lb routine since they do not depend on
> > datapath updating datapath group. Using this approach we can reduce
> > ovn-northd loop time of ~1s running an ovnnb_db from production
> > environment:
> > 
> > ovn-master:
> > -----------
> > Statistics for 'ovnnb_db_run'
> >   Total samples: 37
> >   Maximum: 12656 msec
> >   Minimum: 12276 msec
> >   95th percentile: 12557.000000 msec
> >   Short term average: 12475.213654 msec
> >   Long term average: 12336.498446 msec
> > 
> > build_lb_rules + build_lrouter_defrag_flows_for_lb:
> > ---------------------------------------------------
> > Statistics for 'ovnnb_db_run'
> >   Total samples: 37
> >   Maximum: 11266 msec
> >   Minimum: 11039 msec
> >   95th percentile: 11194.000000 msec
> >   Short term average: 11145.945629 msec
> >   Long term average: 11075.628051 msec
> > 
> > ovn-nbctl lr-list | wc -l
> > 121
> > ovn-nbctl ls-list | wc -l
> > 241
> > ovn-nbctl lb-list | wc -l
> > 47077
> > ovn-sbctl dump-flows |wc -l
> > 9852935
> > 
> > Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > ---
> >  northd/ovn-northd.c | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index 5efa54ee5..61fb5b159 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -9476,11 +9476,21 @@ build_lrouter_defrag_flows_for_lb(struct ovn_northd_lb *lb,
> >  
> >          ds_put_format(&defrag_actions, "ct_dnat;");
> >  
> > +        struct ovn_lflow *lflow_ref = NULL;
> > +        uint32_t hash = ovn_logical_flow_hash(
> > +                ovn_stage_get_table(S_ROUTER_IN_DEFRAG),
> > +                ovn_stage_get_pipeline_name(S_ROUTER_IN_DEFRAG), prio,
> > +                ds_cstr(match), ds_cstr(&defrag_actions));
> >          for (size_t j = 0; j < lb->n_nb_lr; j++) {
> > -            ovn_lflow_add_with_hint(lflows, lb->nb_lr[j], S_ROUTER_IN_DEFRAG,
> > -                                    prio, ds_cstr(match),
> > -                                    ds_cstr(&defrag_actions),
> > -                                    &lb->nlb->header_);
> > +            struct ovn_datapath *od = lb->nb_lr[j];
> > +            if (ovn_dp_group_add_with_reference(lflow_ref, od, hash)) {
> > +                continue;
> > +            }
> 
> For all of these patches, could something like the following change give
> a further optimization? It would call ovn_lflow__xxxxx() only once and I
> think it simplifies the code. Does it give a benefit? What do you think?
> 
> 	/* Build datapath group */
> 	struct hmapx od_group;
> 	hmapx_init(&od_group);
>         for (size_t j = 0; j < lb->n_nb_lr; j++) {
>             struct ovn_datapath *od = lb->nb_lr[j];
>             if (ovn_dp_group_add_with_reference(&od_group, od)) {
>                 continue;
>             }
>         }
> 
> 	/* Add lfow specifying datapath group rather than datapath */
> 	ovn_lflow_add_at_with_dp_group(lflows,
>                     S_SWITCH_IN_STATEFUL, priority,
>                     ds_cstr(match), ds_cstr(action),
>                     NULL, meter, &lb->nlb->header_);

Hi Mark,

if I read correctly the code I guess we should pass od_group somewhere here,
right? Anyway it seems to me this approach will not work with parrallel
implementation, right?

Regards,
Lorenzo

> 
> > +            lflow_ref = ovn_lflow_add_at_with_hash(lflows, od,
> > +                                    S_ROUTER_IN_DEFRAG, prio,
> > +                                    ds_cstr(match), ds_cstr(&defrag_actions),
> > +                                    NULL, NULL, &lb->nlb->header_,
> > +                                    OVS_SOURCE_LOCATOR, hash);
> >          }
> >      }
> >      ds_destroy(&defrag_actions);
> > 
>
Mark Gray Aug. 27, 2021, 4:06 p.m. UTC | #3
On 27/08/2021 09:18, Lorenzo Bianconi wrote:
>> On 26/08/2021 17:50, Lorenzo Bianconi wrote:
>>> Similar to build_lb_rules(), precompute hash and lflow pointer in
>>> build_lrouter_defrag_flows_for_lb routine since they do not depend on
>>> datapath updating datapath group. Using this approach we can reduce
>>> ovn-northd loop time of ~1s running an ovnnb_db from production
>>> environment:
>>>
>>> ovn-master:
>>> -----------
>>> Statistics for 'ovnnb_db_run'
>>>   Total samples: 37
>>>   Maximum: 12656 msec
>>>   Minimum: 12276 msec
>>>   95th percentile: 12557.000000 msec
>>>   Short term average: 12475.213654 msec
>>>   Long term average: 12336.498446 msec
>>>
>>> build_lb_rules + build_lrouter_defrag_flows_for_lb:
>>> ---------------------------------------------------
>>> Statistics for 'ovnnb_db_run'
>>>   Total samples: 37
>>>   Maximum: 11266 msec
>>>   Minimum: 11039 msec
>>>   95th percentile: 11194.000000 msec
>>>   Short term average: 11145.945629 msec
>>>   Long term average: 11075.628051 msec
>>>
>>> ovn-nbctl lr-list | wc -l
>>> 121
>>> ovn-nbctl ls-list | wc -l
>>> 241
>>> ovn-nbctl lb-list | wc -l
>>> 47077
>>> ovn-sbctl dump-flows |wc -l
>>> 9852935
>>>
>>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
>>> ---
>>>  northd/ovn-northd.c | 18 ++++++++++++++----
>>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index 5efa54ee5..61fb5b159 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -9476,11 +9476,21 @@ build_lrouter_defrag_flows_for_lb(struct ovn_northd_lb *lb,
>>>  
>>>          ds_put_format(&defrag_actions, "ct_dnat;");
>>>  
>>> +        struct ovn_lflow *lflow_ref = NULL;
>>> +        uint32_t hash = ovn_logical_flow_hash(
>>> +                ovn_stage_get_table(S_ROUTER_IN_DEFRAG),
>>> +                ovn_stage_get_pipeline_name(S_ROUTER_IN_DEFRAG), prio,
>>> +                ds_cstr(match), ds_cstr(&defrag_actions));
>>>          for (size_t j = 0; j < lb->n_nb_lr; j++) {
>>> -            ovn_lflow_add_with_hint(lflows, lb->nb_lr[j], S_ROUTER_IN_DEFRAG,
>>> -                                    prio, ds_cstr(match),
>>> -                                    ds_cstr(&defrag_actions),
>>> -                                    &lb->nlb->header_);
>>> +            struct ovn_datapath *od = lb->nb_lr[j];
>>> +            if (ovn_dp_group_add_with_reference(lflow_ref, od, hash)) {
>>> +                continue;
>>> +            }
>>
>> For all of these patches, could something like the following change give
>> a further optimization? It would call ovn_lflow__xxxxx() only once and I
>> think it simplifies the code. Does it give a benefit? What do you think?
>>
>> 	/* Build datapath group */
>> 	struct hmapx od_group;
>> 	hmapx_init(&od_group);
>>         for (size_t j = 0; j < lb->n_nb_lr; j++) {
>>             struct ovn_datapath *od = lb->nb_lr[j];
>>             if (ovn_dp_group_add_with_reference(&od_group, od)) {
>>                 continue;
>>             }
>>         }
>>
>> 	/* Add lfow specifying datapath group rather than datapath */
>> 	ovn_lflow_add_at_with_dp_group(lflows,
>>                     S_SWITCH_IN_STATEFUL, priority,
>>                     ds_cstr(match), ds_cstr(action),
>>                     NULL, meter, &lb->nlb->header_);
> 
> Hi Mark,
> 
> if I read correctly the code I guess we should pass od_group somewhere here,
> right?

You are right, I was suggesting the following:

 	ovn_lflow_add_at_with_dp_group(lflows, od_group
                     S_SWITCH_IN_STATEFUL, priority,
                     ds_cstr(match), ds_cstr(action),
                     NULL, meter, &lb->nlb->header_);

I tried to implement this based on your code but I ran into some issues.
I spent some time debugging and then I ran out of time. I might look at
this again later and I will share the results if they are successful

Anyway it seems to me this approach will not work with parrallel
> implementation, right?

I don't think this is a problem but I haven't considered it too much
> 
> Regards,
> Lorenzo
> 
>>
>>> +            lflow_ref = ovn_lflow_add_at_with_hash(lflows, od,
>>> +                                    S_ROUTER_IN_DEFRAG, prio,
>>> +                                    ds_cstr(match), ds_cstr(&defrag_actions),
>>> +                                    NULL, NULL, &lb->nlb->header_,
>>> +                                    OVS_SOURCE_LOCATOR, hash);
>>>          }
>>>      }
>>>      ds_destroy(&defrag_actions);
>>>
>>
Mark Gray Aug. 27, 2021, 5:22 p.m. UTC | #4
On 26/08/2021 17:50, Lorenzo Bianconi wrote:
> Similar to build_lb_rules(), precompute hash and lflow pointer in
> build_lrouter_defrag_flows_for_lb routine since they do not depend on
> datapath updating datapath group. Using this approach we can reduce
> ovn-northd loop time of ~1s running an ovnnb_db from production
> environment:
> 
> ovn-master:
> -----------
> Statistics for 'ovnnb_db_run'
>   Total samples: 37
>   Maximum: 12656 msec
>   Minimum: 12276 msec
>   95th percentile: 12557.000000 msec
>   Short term average: 12475.213654 msec
>   Long term average: 12336.498446 msec
> 
> build_lb_rules + build_lrouter_defrag_flows_for_lb:
> ---------------------------------------------------
> Statistics for 'ovnnb_db_run'
>   Total samples: 37
>   Maximum: 11266 msec
>   Minimum: 11039 msec
>   95th percentile: 11194.000000 msec
>   Short term average: 11145.945629 msec
>   Long term average: 11075.628051 msec
> 
> ovn-nbctl lr-list | wc -l
> 121
> ovn-nbctl ls-list | wc -l
> 241
> ovn-nbctl lb-list | wc -l
> 47077
> ovn-sbctl dump-flows |wc -l
> 9852935
> 
> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> ---
>  northd/ovn-northd.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 5efa54ee5..61fb5b159 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -9476,11 +9476,21 @@ build_lrouter_defrag_flows_for_lb(struct ovn_northd_lb *lb,
>  
>          ds_put_format(&defrag_actions, "ct_dnat;");
>  
> +        struct ovn_lflow *lflow_ref = NULL;
> +        uint32_t hash = ovn_logical_flow_hash(
> +                ovn_stage_get_table(S_ROUTER_IN_DEFRAG),
> +                ovn_stage_get_pipeline_name(S_ROUTER_IN_DEFRAG), prio,
> +                ds_cstr(match), ds_cstr(&defrag_actions));
>          for (size_t j = 0; j < lb->n_nb_lr; j++) {
> -            ovn_lflow_add_with_hint(lflows, lb->nb_lr[j], S_ROUTER_IN_DEFRAG,
> -                                    prio, ds_cstr(match),
> -                                    ds_cstr(&defrag_actions),
> -                                    &lb->nlb->header_);
> +            struct ovn_datapath *od = lb->nb_lr[j];
> +            if (ovn_dp_group_add_with_reference(lflow_ref, od, hash)) {
> +                continue;
> +            }
> +            lflow_ref = ovn_lflow_add_at_with_hash(lflows, od,
> +                                    S_ROUTER_IN_DEFRAG, prio,
> +                                    ds_cstr(match), ds_cstr(&defrag_actions),
> +                                    NULL, NULL, &lb->nlb->header_,
> +                                    OVS_SOURCE_LOCATOR, hash);
>          }
>      }
>      ds_destroy(&defrag_actions);
> 

Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
diff mbox series

Patch

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 5efa54ee5..61fb5b159 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -9476,11 +9476,21 @@  build_lrouter_defrag_flows_for_lb(struct ovn_northd_lb *lb,
 
         ds_put_format(&defrag_actions, "ct_dnat;");
 
+        struct ovn_lflow *lflow_ref = NULL;
+        uint32_t hash = ovn_logical_flow_hash(
+                ovn_stage_get_table(S_ROUTER_IN_DEFRAG),
+                ovn_stage_get_pipeline_name(S_ROUTER_IN_DEFRAG), prio,
+                ds_cstr(match), ds_cstr(&defrag_actions));
         for (size_t j = 0; j < lb->n_nb_lr; j++) {
-            ovn_lflow_add_with_hint(lflows, lb->nb_lr[j], S_ROUTER_IN_DEFRAG,
-                                    prio, ds_cstr(match),
-                                    ds_cstr(&defrag_actions),
-                                    &lb->nlb->header_);
+            struct ovn_datapath *od = lb->nb_lr[j];
+            if (ovn_dp_group_add_with_reference(lflow_ref, od, hash)) {
+                continue;
+            }
+            lflow_ref = ovn_lflow_add_at_with_hash(lflows, od,
+                                    S_ROUTER_IN_DEFRAG, prio,
+                                    ds_cstr(match), ds_cstr(&defrag_actions),
+                                    NULL, NULL, &lb->nlb->header_,
+                                    OVS_SOURCE_LOCATOR, hash);
         }
     }
     ds_destroy(&defrag_actions);