diff mbox series

[ovs-dev] controller: Add config option per LB to enable/disable CT flush

Message ID 20230316182507.124733-1-amusil@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] controller: Add config option per LB to enable/disable CT flush | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Ales Musil March 16, 2023, 6:25 p.m. UTC
The CT flush was enabled by default for every LB, add
config option called "ct_flush_enabled" that allows
users to enable/disable the CT flush. The CT flush is
remaining enabled by default.

Reported-at: https://bugzilla.redhat.com/2178962
Signed-off-by: Ales Musil <amusil@redhat.com>
---
 NEWS                        |  2 ++
 controller/ovn-controller.c |  6 ++++--
 ovn-nb.xml                  |  7 +++++++
 tests/ovn.at                | 20 ++++++++++++++++++++
 tests/system-ovn.at         | 33 ++++++++++++++++++++++++++++-----
 5 files changed, 61 insertions(+), 7 deletions(-)

Comments

Ilya Maximets March 16, 2023, 6:36 p.m. UTC | #1
On 3/16/23 19:25, Ales Musil wrote:
> The CT flush was enabled by default for every LB, add
> config option called "ct_flush_enabled" that allows
> users to enable/disable the CT flush. The CT flush is
> remaining enabled by default.

Quick comment: '_enabled' part in the option name seems redundant.
'ct_flush_enabled=false' reads the same to me as 'ct_flush=false'.

But it's probably better to wait for a full review from someone
before re-spinning just for this.

Best regards, Ilya Maximets.
Dumitru Ceara March 16, 2023, 7:44 p.m. UTC | #2
Hi Ales,

On 3/16/23 19:25, Ales Musil wrote:
> The CT flush was enabled by default for every LB, add
> config option called "ct_flush_enabled" that allows
> users to enable/disable the CT flush. The CT flush is
> remaining enabled by default.
> 
> Reported-at: https://bugzilla.redhat.com/2178962

As mentioned in the bug report above: "OVN has started flushing
conntrack entries for services (vips->backends) whenever the 5 tuple
changes (so either LB is deleted OR VIP, VIP port, backend IP, backend
port, proto changes).".  I'd like to avoid this kind of problems in the
future by making the feature opt-in, so disabled by default.

I know that it's a bit of a sensitive subject because the feature was
already enabled by default when it was added in v23.03.0 so in theory
there might be people using it already.

However, given that (AFAIK) nobody else except ovn-kubernetes complained
about the lack of "CT flush on backend removal" before v23.03.0 it's
probably safe to assume that disabling the feature by default doesn't
affect other users.

In any case, because ovn-kubernetes is only interested in flushing UDP
entries and not other protocols (*), they'll have to change their code
anyway to either opt-out for some LBs or opt-in for the rest, depending
on the default we choose.

Another reason to make it opt-in is to avoid issues due to OVN upgrading
to a version that has flush enabled by default for all LB while the CMS
code is still unaware of the new option and thus cannot configure OVN to
(partially) disable flushing.

Regarding the code I have a few minor comments below.

Thanks,
Dumitru

(*) whether that's the right thing to do is not really relevant I guess.

> Signed-off-by: Ales Musil <amusil@redhat.com>
> ---
>  NEWS                        |  2 ++
>  controller/ovn-controller.c |  6 ++++--
>  ovn-nb.xml                  |  7 +++++++
>  tests/ovn.at                | 20 ++++++++++++++++++++
>  tests/system-ovn.at         | 33 ++++++++++++++++++++++++++++-----
>  5 files changed, 61 insertions(+), 7 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 637adcff3..0daac951a 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -2,6 +2,8 @@ Post v23.03.0
>  -------------
>    - Enhance LSP.options:arp_proxy to support IPv6, configurable MAC
>      addresses and CIDRs.
> +  - Add an option for LBs called "ct_flush_enabled" that allows CMS to specify
> +    if ovn-controller should flush related CT entries for removed LB backends.
>  
>  OVN v23.03.0 - 03 Mar 2023
>  --------------------------
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 7dcbfd252..8b85464c6 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -2697,7 +2697,8 @@ static void
>  lb_data_removed_five_tuples_add(struct ed_type_lb_data *lb_data,
>                                  const struct ovn_controller_lb *lb)
>  {
> -    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) {
> +    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) ||
> +        !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) {

Let's store this as a field in 'struct ovn_controller_lb' so we don't
lookup in the smap every time we need this info.  In some cases (since
we incrementally process the LBs) we already have the lb structure in
memory.

>          return;
>      }
>  
> @@ -2716,7 +2717,8 @@ static void
>  lb_data_removed_five_tuples_remove(struct ed_type_lb_data *lb_data,
>                                     const struct ovn_controller_lb *lb)
>  {
> -    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) {
> +    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) ||
> +        !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) {
>          return;
>      }
>  
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 73f707aa0..c5dbebd1d 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -2041,6 +2041,13 @@ or
>          the affinity timeslot. Max supported affinity_timeout is 65535
>          seconds.
>        </column>
> +
> +      <column name="options" key="ct_flush_enabled"
> +              type='{"type": "boolean"}'>

As Ilya mentioned, _enabled is probably redundant.

I wonder though whether we need a different name.  I'm not sure if it's
better to be honest but what do you think of "graceful_cleanup".  In the
end what we achieve with ct_flush_enabled=false is "graceful
termination" of the connection.

> +        The value indicates whether ovn-controller should flush CT entries
> +        that are related to this LB when the backends are removed. Being set
> +        to <code>true</code> by default.
> +      </column>
>      </group>
>    </table>
>  
> diff --git a/tests/ovn.at b/tests/ovn.at
> index fa786112c..724e8b6e5 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -35039,6 +35039,26 @@ AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.50.10:80, backend=192.16
>  
>  AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
>  
> +# Check if disable of the CT flush works
> +check ovn-nbctl lb-del lb1
> +check ovn-nbctl lb-add lb1 "192.168.70.10:80" "192.168.80.10:8080,192.168.90.10:8080"
> +check ovn-nbctl set load_balancer lb1 options:ct_flush_enabled="false"
> +check ovn-nbctl ls-lb-add sw lb1
> +check ovs-vsctl set interface p1 external_ids:iface-id=lsp1
> +check ovn-nbctl --wait=hv sync
> +
> +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
> +
> +# Remove one backend
> +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"192.168.70.10:80"="192.168.80.10:8080"'
> +
> +#AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.90.10:8080, protocol=6" hv1/ovn-controller.log], [1])
> +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
> +
> +check ovn-nbctl --wait=hv lb-del lb1
> +AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.80.10:8080, protocol=6" hv1/ovn-controller.log], [1])
> +AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
> +
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index ad1188078..afd10c15d 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -10013,16 +10013,18 @@ OVS_START_L7([bar1], [http])
>  OVS_START_L7([bar2], [http])
>  OVS_START_L7([bar3], [http])
>  
> -OVS_WAIT_FOR_OUTPUT([
> -    for i in `seq 1 20`; do
> -        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
> -    done
> -    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
> +m4_define([LB1_CT_ENTRIES], [dnl
>  tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
>  tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.3,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
>  tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.4,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
>  ])
>  
> +OVS_WAIT_FOR_OUTPUT([
> +    for i in `seq 1 20`; do
> +        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
> +    done
> +    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
> +
>  OVS_WAIT_FOR_OUTPUT([
>      for i in `seq 1 20`; do
>          ip netns exec foo1 wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
> @@ -10096,6 +10098,27 @@ check ovn-nbctl lb-del lb2
>  
>  OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.3) | wc -l)" = "0"])
>  
> +# Config OVN wih disabled CT flush.
> +check ovn-nbctl lb-add lb1 30.0.0.1 "172.16.1.2,172.16.1.3,172.16.1.4" \
> +-- set load_balancer lb1 options:ct_flush_enabled="false"
> +check ovn-nbctl ls-lb-add foo lb1
> +
> +OVS_WAIT_FOR_OUTPUT([
> +    for i in `seq 1 20`; do
> +        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
> +    done
> +    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
> +
> +# Remove one backend
> +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"30.0.0.1"="172.16.1.2,172.16.1.3"'
> +
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
> +
> +# Remove whole LB
> +check ovn-nbctl --wait=hv lb-del lb1
> +
> +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>  
>  as ovn-sb
Ilya Maximets March 16, 2023, 8 p.m. UTC | #3
On 3/16/23 20:44, Dumitru Ceara wrote:
>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>> index 73f707aa0..c5dbebd1d 100644
>> --- a/ovn-nb.xml
>> +++ b/ovn-nb.xml
>> @@ -2041,6 +2041,13 @@ or
>>          the affinity timeslot. Max supported affinity_timeout is 65535
>>          seconds.
>>        </column>
>> +
>> +      <column name="options" key="ct_flush_enabled"
>> +              type='{"type": "boolean"}'>
> 
> As Ilya mentioned, _enabled is probably redundant.
> 
> I wonder though whether we need a different name.  I'm not sure if it's
> better to be honest but what do you think of "graceful_cleanup".In the
> end what we achieve with ct_flush_enabled=false is "graceful
> termination" of the connection.

This is questionable.  I'd argue that flush is a cleaner option, so
it is more graceful. :)

Jokes aside, OVN doesn't know what CMS/user is going to do with the
backend/pod/VM.  Form pure OVN perspective we're just removing a
beckend/LB.  I would not expect an option named '*_cleanup=false' to
perform some extra actions, as well as '*_cleanup=true' to do nothing.
Unless, of course, it is a 'do_not_cleanup' option.

Best regards, Ilya Maximets.
Dumitru Ceara March 17, 2023, 9:04 a.m. UTC | #4
On 3/16/23 21:00, Ilya Maximets wrote:
> On 3/16/23 20:44, Dumitru Ceara wrote:
>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>> index 73f707aa0..c5dbebd1d 100644
>>> --- a/ovn-nb.xml
>>> +++ b/ovn-nb.xml
>>> @@ -2041,6 +2041,13 @@ or
>>>          the affinity timeslot. Max supported affinity_timeout is 65535
>>>          seconds.
>>>        </column>
>>> +
>>> +      <column name="options" key="ct_flush_enabled"
>>> +              type='{"type": "boolean"}'>
>>
>> As Ilya mentioned, _enabled is probably redundant.
>>
>> I wonder though whether we need a different name.  I'm not sure if it's
>> better to be honest but what do you think of "graceful_cleanup".In the
>> end what we achieve with ct_flush_enabled=false is "graceful
>> termination" of the connection.
> 
> This is questionable.  I'd argue that flush is a cleaner option, so
> it is more graceful. :)
> 
> Jokes aside, OVN doesn't know what CMS/user is going to do with the
> backend/pod/VM.  Form pure OVN perspective we're just removing a
> beckend/LB.  I would not expect an option named '*_cleanup=false' to
> perform some extra actions, as well as '*_cleanup=true' to do nothing.
> Unless, of course, it is a 'do_not_cleanup' option.
> 

Fair points above.  I'd say let's keep "ct_flush=true/false" then.

> Best regards, Ilya Maximets.
> 

Thanks,
Dumitru
diff mbox series

Patch

diff --git a/NEWS b/NEWS
index 637adcff3..0daac951a 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,8 @@  Post v23.03.0
 -------------
   - Enhance LSP.options:arp_proxy to support IPv6, configurable MAC
     addresses and CIDRs.
+  - Add an option for LBs called "ct_flush_enabled" that allows CMS to specify
+    if ovn-controller should flush related CT entries for removed LB backends.
 
 OVN v23.03.0 - 03 Mar 2023
 --------------------------
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 7dcbfd252..8b85464c6 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -2697,7 +2697,8 @@  static void
 lb_data_removed_five_tuples_add(struct ed_type_lb_data *lb_data,
                                 const struct ovn_controller_lb *lb)
 {
-    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) {
+    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) ||
+        !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) {
         return;
     }
 
@@ -2716,7 +2717,8 @@  static void
 lb_data_removed_five_tuples_remove(struct ed_type_lb_data *lb_data,
                                    const struct ovn_controller_lb *lb)
 {
-    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) {
+    if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) ||
+        !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) {
         return;
     }
 
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 73f707aa0..c5dbebd1d 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -2041,6 +2041,13 @@  or
         the affinity timeslot. Max supported affinity_timeout is 65535
         seconds.
       </column>
+
+      <column name="options" key="ct_flush_enabled"
+              type='{"type": "boolean"}'>
+        The value indicates whether ovn-controller should flush CT entries
+        that are related to this LB when the backends are removed. Being set
+        to <code>true</code> by default.
+      </column>
     </group>
   </table>
 
diff --git a/tests/ovn.at b/tests/ovn.at
index fa786112c..724e8b6e5 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -35039,6 +35039,26 @@  AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.50.10:80, backend=192.16
 
 AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
 
+# Check if disable of the CT flush works
+check ovn-nbctl lb-del lb1
+check ovn-nbctl lb-add lb1 "192.168.70.10:80" "192.168.80.10:8080,192.168.90.10:8080"
+check ovn-nbctl set load_balancer lb1 options:ct_flush_enabled="false"
+check ovn-nbctl ls-lb-add sw lb1
+check ovs-vsctl set interface p1 external_ids:iface-id=lsp1
+check ovn-nbctl --wait=hv sync
+
+#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
+
+# Remove one backend
+check ovn-nbctl --wait=hv set load_balancer lb1 vips='"192.168.70.10:80"="192.168.80.10:8080"'
+
+#AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.90.10:8080, protocol=6" hv1/ovn-controller.log], [1])
+#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
+
+check ovn-nbctl --wait=hv lb-del lb1
+AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.80.10:8080, protocol=6" hv1/ovn-controller.log], [1])
+AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0])
+
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index ad1188078..afd10c15d 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -10013,16 +10013,18 @@  OVS_START_L7([bar1], [http])
 OVS_START_L7([bar2], [http])
 OVS_START_L7([bar3], [http])
 
-OVS_WAIT_FOR_OUTPUT([
-    for i in `seq 1 20`; do
-        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
-    done
-    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl
+m4_define([LB1_CT_ENTRIES], [dnl
 tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
 tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.3,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
 tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.4,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>)
 ])
 
+OVS_WAIT_FOR_OUTPUT([
+    for i in `seq 1 20`; do
+        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
+    done
+    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
+
 OVS_WAIT_FOR_OUTPUT([
     for i in `seq 1 20`; do
         ip netns exec foo1 wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
@@ -10096,6 +10098,27 @@  check ovn-nbctl lb-del lb2
 
 OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.3) | wc -l)" = "0"])
 
+# Config OVN wih disabled CT flush.
+check ovn-nbctl lb-add lb1 30.0.0.1 "172.16.1.2,172.16.1.3,172.16.1.4" \
+-- set load_balancer lb1 options:ct_flush_enabled="false"
+check ovn-nbctl ls-lb-add foo lb1
+
+OVS_WAIT_FOR_OUTPUT([
+    for i in `seq 1 20`; do
+        ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log;
+    done
+    ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
+
+# Remove one backend
+check ovn-nbctl --wait=hv set load_balancer lb1 vips='"30.0.0.1"="172.16.1.2,172.16.1.3"'
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
+
+# Remove whole LB
+check ovn-nbctl --wait=hv lb-del lb1
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES])
+
 OVS_APP_EXIT_AND_WAIT([ovn-controller])
 
 as ovn-sb