diff mbox series

[ovs-dev,v2,1/1] controller: Check for tunnel change in multi-vtep case is incorrect

Message ID 20200924211704.24904-1-venugopali@nvidia.com
State Changes Requested
Headers show
Series [ovs-dev,v2,1/1] controller: Check for tunnel change in multi-vtep case is incorrect | expand

Commit Message

Venugopal Iyer Sept. 24, 2020, 9:17 p.m. UTC
From: venu iyer <venugopali@nvidia.com>

Prior to multi-vtep, there was one tunnel for each type, since
there was only one encap-ip. So, the check in
chassis_tunnels_changed():

	sset_count(encap_type_set) != encap_type_count

worked. However, with multiple IPs per tunnel type, the above
check won't work. So, once multiple encap-ips are configured,
ovn-controller will always keep updating the encap list in
the SB (due to the above check); this causes a lot of unnecessary
churn, including recomputing the flows etc. which will put
ovn-controller in overdrive thereby consuming lot of CPU cycles
(see almost 100%)

Verified ovn-controller cpu utilization with the fix (and also
that SB doesn't keep constantly updating). make check didn't
show any additional failures.

Fixes: Fixes: b520ca7 ("Support for multiple VTEP in OVN")
Signed-off-by: venu iyer <venugopali@nvidia.com>
---
 controller/chassis.c |  8 +++++---
 tests/ovn.at         | 49 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 3 deletions(-)

Comments

Numan Siddique Sept. 25, 2020, 9:18 a.m. UTC | #1
On Fri, Sep 25, 2020 at 2:46 AM <venugopali@nvidia.com> wrote:
>
> From: venu iyer <venugopali@nvidia.com>
>
> Prior to multi-vtep, there was one tunnel for each type, since
> there was only one encap-ip. So, the check in
> chassis_tunnels_changed():
>
>         sset_count(encap_type_set) != encap_type_count
>
> worked. However, with multiple IPs per tunnel type, the above
> check won't work. So, once multiple encap-ips are configured,
> ovn-controller will always keep updating the encap list in
> the SB (due to the above check); this causes a lot of unnecessary
> churn, including recomputing the flows etc. which will put
> ovn-controller in overdrive thereby consuming lot of CPU cycles
> (see almost 100%)
>
> Verified ovn-controller cpu utilization with the fix (and also
> that SB doesn't keep constantly updating). make check didn't
> show any additional failures.
>
> Fixes: Fixes: b520ca7 ("Support for multiple VTEP in OVN")
> Signed-off-by: venu iyer <venugopali@nvidia.com>


Hi Venu,

Thanks for the patch. Please see below for few comments.


>
> ---
>  controller/chassis.c |  8 +++++---
>  tests/ovn.at         | 49 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+), 3 deletions(-)
>
> diff --git a/controller/chassis.c b/controller/chassis.c
> index a365188e8..a6dfb92df 100644
> --- a/controller/chassis.c
> +++ b/controller/chassis.c
> @@ -415,14 +415,15 @@ chassis_tunnels_changed(const struct sset *encap_type_set,
>                          const char *encap_csum,
>                          const struct sbrec_chassis *chassis_rec)
>  {
> -    size_t encap_type_count = 0;
> +    struct sset chassis_rec_encap_type_set;
>
> +    sset_init(&chassis_rec_encap_type_set);


This can be done as
 struct sset chassis_rec_encap_type_set =
       SSET_INITIALIZER(&chassis_rec_encap_type_set);


>
>      for (size_t i = 0; i < chassis_rec->n_encaps; i++) {
>
>          if (!sset_contains(encap_type_set, chassis_rec->encaps[i]->type)) {
>              return true;
>          }
> -        encap_type_count++;
> +        sset_add(&chassis_rec_encap_type_set, chassis_rec->encaps[i]->type);
>
>          if (!sset_contains(encap_ip_set, chassis_rec->encaps[i]->ip)) {
>              return true;
> @@ -441,7 +442,8 @@ chassis_tunnels_changed(const struct sset *encap_type_set,
>          return true;
>      }
>
> -    if (sset_count(encap_type_set) != encap_type_count) {
> +    if (sset_count(encap_type_set) !=
> +            sset_count(&chassis_rec_encap_type_set)) {

There is a memory leak here because the sset is not destroyed.
Not that in the above for loop, there are multiple returns. So you
need to destroy
the sset in those cases too.

Thanks
Numan

>          return true;
>      }
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index b6c8622ba..47090fa71 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -22299,3 +22299,52 @@ ovn-sbctl --columns chassis --bare list port_binding sw0-p2
>
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
> +
> +#
> +# When multiple encap-ips are configued, make sure the
> +# ovn-controller doesn't get into a state of constantly
> +# updating the SB Chassis's encap information. The test
> +# configures multiple vteps and waits for a couple
> +# of seconds to makes sure the SB encap info remains
> +# unchanged.
> +#
> +#
> +AT_SETUP([ovn -- multi-vtep SB Chassis encap updates])
> +ovn_start
> +
> +net_add n1
> +sim_add hv1
> +as hv1
> +
> +ovs-vsctl add-br br-phys
> +# Just set the encap type to be geneve for this test.
> +ovn_attach n1 br-phys 192.168.0.1 24 geneve
> +
> +# Get the encap rec, should be just one - with geneve/192.168.0.1
> +encap_rec=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
> +#echo "Encap Rec before mvtep is $encap_rec"
> +
> +# Set multiple IPs
> +as hv1
> +ovs-vsctl \
> +    -- set Open_vSwitch . external-ids:ovn-encap-ip="192.168.0.1,192.168.1.1"
> +
> +# Check if the encap_rec changed - should have, no need to
> +# compare the exact values.
> +encap_rec_mvtep=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
> +#echo "Encap Rec after mvtep is $encap_rec_mvtep"
> +
> +AT_CHECK([test "$encap_rec" != "$encap_rec_mvtep"], [0], [])
> +
> +# now, wait for a couple of secs -  should be enough time, I suppose.
> +sleep 2
> +
> +# Check if the encap_rec changed (it should not have)
> +encap_rec_mvtep1=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
> +#echo "Encap Rec after wait is $encap_rec_mvtep1"
> +
> +AT_CHECK([test "$encap_rec_mvtep" == "$encap_rec_mvtep1"], [0], [])
> +
> +OVN_CLEANUP([hv1])
> +AT_CLEANUP
> +
> --
> 2.17.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Venugopal Iyer Sept. 25, 2020, 3:12 p.m. UTC | #2
Thanks for looking at this, Numan:

re: memleak, i didn't look at sset_add to see that it does an alloc; i'll fix it.

thanks,

-venu
diff mbox series

Patch

diff --git a/controller/chassis.c b/controller/chassis.c
index a365188e8..a6dfb92df 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -415,14 +415,15 @@  chassis_tunnels_changed(const struct sset *encap_type_set,
                         const char *encap_csum,
                         const struct sbrec_chassis *chassis_rec)
 {
-    size_t encap_type_count = 0;
+    struct sset chassis_rec_encap_type_set;
 
+    sset_init(&chassis_rec_encap_type_set);
     for (size_t i = 0; i < chassis_rec->n_encaps; i++) {
 
         if (!sset_contains(encap_type_set, chassis_rec->encaps[i]->type)) {
             return true;
         }
-        encap_type_count++;
+        sset_add(&chassis_rec_encap_type_set, chassis_rec->encaps[i]->type);
 
         if (!sset_contains(encap_ip_set, chassis_rec->encaps[i]->ip)) {
             return true;
@@ -441,7 +442,8 @@  chassis_tunnels_changed(const struct sset *encap_type_set,
         return true;
     }
 
-    if (sset_count(encap_type_set) != encap_type_count) {
+    if (sset_count(encap_type_set) !=
+            sset_count(&chassis_rec_encap_type_set)) {
         return true;
     }
 
diff --git a/tests/ovn.at b/tests/ovn.at
index b6c8622ba..47090fa71 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -22299,3 +22299,52 @@  ovn-sbctl --columns chassis --bare list port_binding sw0-p2
 
 OVN_CLEANUP([hv1])
 AT_CLEANUP
+
+#
+# When multiple encap-ips are configued, make sure the
+# ovn-controller doesn't get into a state of constantly
+# updating the SB Chassis's encap information. The test
+# configures multiple vteps and waits for a couple
+# of seconds to makes sure the SB encap info remains
+# unchanged.
+#
+#
+AT_SETUP([ovn -- multi-vtep SB Chassis encap updates])
+ovn_start
+
+net_add n1
+sim_add hv1
+as hv1
+
+ovs-vsctl add-br br-phys
+# Just set the encap type to be geneve for this test.
+ovn_attach n1 br-phys 192.168.0.1 24 geneve
+
+# Get the encap rec, should be just one - with geneve/192.168.0.1
+encap_rec=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
+#echo "Encap Rec before mvtep is $encap_rec"
+
+# Set multiple IPs
+as hv1
+ovs-vsctl \
+    -- set Open_vSwitch . external-ids:ovn-encap-ip="192.168.0.1,192.168.1.1"
+
+# Check if the encap_rec changed - should have, no need to
+# compare the exact values.
+encap_rec_mvtep=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
+#echo "Encap Rec after mvtep is $encap_rec_mvtep"
+
+AT_CHECK([test "$encap_rec" != "$encap_rec_mvtep"], [0], [])
+
+# now, wait for a couple of secs -  should be enough time, I suppose.
+sleep 2
+
+# Check if the encap_rec changed (it should not have)
+encap_rec_mvtep1=`ovn-sbctl --data=bare --no-heading --column encaps list chassis hv1`
+#echo "Encap Rec after wait is $encap_rec_mvtep1"
+
+AT_CHECK([test "$encap_rec_mvtep" == "$encap_rec_mvtep1"], [0], [])
+
+OVN_CLEANUP([hv1])
+AT_CLEANUP
+