Message ID | 20230316182507.124733-1-amusil@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] controller: Add config option per LB to enable/disable CT flush | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
On 3/16/23 19:25, Ales Musil wrote: > The CT flush was enabled by default for every LB, add > config option called "ct_flush_enabled" that allows > users to enable/disable the CT flush. The CT flush is > remaining enabled by default. Quick comment: '_enabled' part in the option name seems redundant. 'ct_flush_enabled=false' reads the same to me as 'ct_flush=false'. But it's probably better to wait for a full review from someone before re-spinning just for this. Best regards, Ilya Maximets.
Hi Ales, On 3/16/23 19:25, Ales Musil wrote: > The CT flush was enabled by default for every LB, add > config option called "ct_flush_enabled" that allows > users to enable/disable the CT flush. The CT flush is > remaining enabled by default. > > Reported-at: https://bugzilla.redhat.com/2178962 As mentioned in the bug report above: "OVN has started flushing conntrack entries for services (vips->backends) whenever the 5 tuple changes (so either LB is deleted OR VIP, VIP port, backend IP, backend port, proto changes).". I'd like to avoid this kind of problems in the future by making the feature opt-in, so disabled by default. I know that it's a bit of a sensitive subject because the feature was already enabled by default when it was added in v23.03.0 so in theory there might be people using it already. However, given that (AFAIK) nobody else except ovn-kubernetes complained about the lack of "CT flush on backend removal" before v23.03.0 it's probably safe to assume that disabling the feature by default doesn't affect other users. In any case, because ovn-kubernetes is only interested in flushing UDP entries and not other protocols (*), they'll have to change their code anyway to either opt-out for some LBs or opt-in for the rest, depending on the default we choose. Another reason to make it opt-in is to avoid issues due to OVN upgrading to a version that has flush enabled by default for all LB while the CMS code is still unaware of the new option and thus cannot configure OVN to (partially) disable flushing. Regarding the code I have a few minor comments below. Thanks, Dumitru (*) whether that's the right thing to do is not really relevant I guess. > Signed-off-by: Ales Musil <amusil@redhat.com> > --- > NEWS | 2 ++ > controller/ovn-controller.c | 6 ++++-- > ovn-nb.xml | 7 +++++++ > tests/ovn.at | 20 ++++++++++++++++++++ > tests/system-ovn.at | 33 ++++++++++++++++++++++++++++----- > 5 files changed, 61 insertions(+), 7 deletions(-) > > diff --git a/NEWS b/NEWS > index 637adcff3..0daac951a 100644 > --- a/NEWS > +++ b/NEWS > @@ -2,6 +2,8 @@ Post v23.03.0 > ------------- > - Enhance LSP.options:arp_proxy to support IPv6, configurable MAC > addresses and CIDRs. > + - Add an option for LBs called "ct_flush_enabled" that allows CMS to specify > + if ovn-controller should flush related CT entries for removed LB backends. > > OVN v23.03.0 - 03 Mar 2023 > -------------------------- > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index 7dcbfd252..8b85464c6 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -2697,7 +2697,8 @@ static void > lb_data_removed_five_tuples_add(struct ed_type_lb_data *lb_data, > const struct ovn_controller_lb *lb) > { > - if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) { > + if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) || > + !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) { Let's store this as a field in 'struct ovn_controller_lb' so we don't lookup in the smap every time we need this info. In some cases (since we incrementally process the LBs) we already have the lb structure in memory. > return; > } > > @@ -2716,7 +2717,8 @@ static void > lb_data_removed_five_tuples_remove(struct ed_type_lb_data *lb_data, > const struct ovn_controller_lb *lb) > { > - if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) { > + if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) || > + !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) { > return; > } > > diff --git a/ovn-nb.xml b/ovn-nb.xml > index 73f707aa0..c5dbebd1d 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -2041,6 +2041,13 @@ or > the affinity timeslot. Max supported affinity_timeout is 65535 > seconds. > </column> > + > + <column name="options" key="ct_flush_enabled" > + type='{"type": "boolean"}'> As Ilya mentioned, _enabled is probably redundant. I wonder though whether we need a different name. I'm not sure if it's better to be honest but what do you think of "graceful_cleanup". In the end what we achieve with ct_flush_enabled=false is "graceful termination" of the connection. > + The value indicates whether ovn-controller should flush CT entries > + that are related to this LB when the backends are removed. Being set > + to <code>true</code> by default. > + </column> > </group> > </table> > > diff --git a/tests/ovn.at b/tests/ovn.at > index fa786112c..724e8b6e5 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -35039,6 +35039,26 @@ AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.50.10:80, backend=192.16 > > AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) > > +# Check if disable of the CT flush works > +check ovn-nbctl lb-del lb1 > +check ovn-nbctl lb-add lb1 "192.168.70.10:80" "192.168.80.10:8080,192.168.90.10:8080" > +check ovn-nbctl set load_balancer lb1 options:ct_flush_enabled="false" > +check ovn-nbctl ls-lb-add sw lb1 > +check ovs-vsctl set interface p1 external_ids:iface-id=lsp1 > +check ovn-nbctl --wait=hv sync > + > +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) > + > +# Remove one backend > +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"192.168.70.10:80"="192.168.80.10:8080"' > + > +#AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.90.10:8080, protocol=6" hv1/ovn-controller.log], [1]) > +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) > + > +check ovn-nbctl --wait=hv lb-del lb1 > +AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.80.10:8080, protocol=6" hv1/ovn-controller.log], [1]) > +AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) > + > OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) > diff --git a/tests/system-ovn.at b/tests/system-ovn.at > index ad1188078..afd10c15d 100644 > --- a/tests/system-ovn.at > +++ b/tests/system-ovn.at > @@ -10013,16 +10013,18 @@ OVS_START_L7([bar1], [http]) > OVS_START_L7([bar2], [http]) > OVS_START_L7([bar3], [http]) > > -OVS_WAIT_FOR_OUTPUT([ > - for i in `seq 1 20`; do > - ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; > - done > - ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl > +m4_define([LB1_CT_ENTRIES], [dnl > tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) > tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.3,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) > tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.4,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) > ]) > > +OVS_WAIT_FOR_OUTPUT([ > + for i in `seq 1 20`; do > + ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; > + done > + ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) > + > OVS_WAIT_FOR_OUTPUT([ > for i in `seq 1 20`; do > ip netns exec foo1 wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; > @@ -10096,6 +10098,27 @@ check ovn-nbctl lb-del lb2 > > OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.3) | wc -l)" = "0"]) > > +# Config OVN wih disabled CT flush. > +check ovn-nbctl lb-add lb1 30.0.0.1 "172.16.1.2,172.16.1.3,172.16.1.4" \ > +-- set load_balancer lb1 options:ct_flush_enabled="false" > +check ovn-nbctl ls-lb-add foo lb1 > + > +OVS_WAIT_FOR_OUTPUT([ > + for i in `seq 1 20`; do > + ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; > + done > + ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) > + > +# Remove one backend > +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"30.0.0.1"="172.16.1.2,172.16.1.3"' > + > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) > + > +# Remove whole LB > +check ovn-nbctl --wait=hv lb-del lb1 > + > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) > + > OVS_APP_EXIT_AND_WAIT([ovn-controller]) > > as ovn-sb
On 3/16/23 20:44, Dumitru Ceara wrote: >> diff --git a/ovn-nb.xml b/ovn-nb.xml >> index 73f707aa0..c5dbebd1d 100644 >> --- a/ovn-nb.xml >> +++ b/ovn-nb.xml >> @@ -2041,6 +2041,13 @@ or >> the affinity timeslot. Max supported affinity_timeout is 65535 >> seconds. >> </column> >> + >> + <column name="options" key="ct_flush_enabled" >> + type='{"type": "boolean"}'> > > As Ilya mentioned, _enabled is probably redundant. > > I wonder though whether we need a different name. I'm not sure if it's > better to be honest but what do you think of "graceful_cleanup".In the > end what we achieve with ct_flush_enabled=false is "graceful > termination" of the connection. This is questionable. I'd argue that flush is a cleaner option, so it is more graceful. :) Jokes aside, OVN doesn't know what CMS/user is going to do with the backend/pod/VM. Form pure OVN perspective we're just removing a beckend/LB. I would not expect an option named '*_cleanup=false' to perform some extra actions, as well as '*_cleanup=true' to do nothing. Unless, of course, it is a 'do_not_cleanup' option. Best regards, Ilya Maximets.
On 3/16/23 21:00, Ilya Maximets wrote: > On 3/16/23 20:44, Dumitru Ceara wrote: >>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>> index 73f707aa0..c5dbebd1d 100644 >>> --- a/ovn-nb.xml >>> +++ b/ovn-nb.xml >>> @@ -2041,6 +2041,13 @@ or >>> the affinity timeslot. Max supported affinity_timeout is 65535 >>> seconds. >>> </column> >>> + >>> + <column name="options" key="ct_flush_enabled" >>> + type='{"type": "boolean"}'> >> >> As Ilya mentioned, _enabled is probably redundant. >> >> I wonder though whether we need a different name. I'm not sure if it's >> better to be honest but what do you think of "graceful_cleanup".In the >> end what we achieve with ct_flush_enabled=false is "graceful >> termination" of the connection. > > This is questionable. I'd argue that flush is a cleaner option, so > it is more graceful. :) > > Jokes aside, OVN doesn't know what CMS/user is going to do with the > backend/pod/VM. Form pure OVN perspective we're just removing a > beckend/LB. I would not expect an option named '*_cleanup=false' to > perform some extra actions, as well as '*_cleanup=true' to do nothing. > Unless, of course, it is a 'do_not_cleanup' option. > Fair points above. I'd say let's keep "ct_flush=true/false" then. > Best regards, Ilya Maximets. > Thanks, Dumitru
diff --git a/NEWS b/NEWS index 637adcff3..0daac951a 100644 --- a/NEWS +++ b/NEWS @@ -2,6 +2,8 @@ Post v23.03.0 ------------- - Enhance LSP.options:arp_proxy to support IPv6, configurable MAC addresses and CIDRs. + - Add an option for LBs called "ct_flush_enabled" that allows CMS to specify + if ovn-controller should flush related CT entries for removed LB backends. OVN v23.03.0 - 03 Mar 2023 -------------------------- diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 7dcbfd252..8b85464c6 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -2697,7 +2697,8 @@ static void lb_data_removed_five_tuples_add(struct ed_type_lb_data *lb_data, const struct ovn_controller_lb *lb) { - if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) { + if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) || + !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) { return; } @@ -2716,7 +2717,8 @@ static void lb_data_removed_five_tuples_remove(struct ed_type_lb_data *lb_data, const struct ovn_controller_lb *lb) { - if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT)) { + if (!ovs_feature_is_supported(OVS_CT_TUPLE_FLUSH_SUPPORT) || + !smap_get_bool(&lb->slb->options, "ct_flush_enabled", true)) { return; } diff --git a/ovn-nb.xml b/ovn-nb.xml index 73f707aa0..c5dbebd1d 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -2041,6 +2041,13 @@ or the affinity timeslot. Max supported affinity_timeout is 65535 seconds. </column> + + <column name="options" key="ct_flush_enabled" + type='{"type": "boolean"}'> + The value indicates whether ovn-controller should flush CT entries + that are related to this LB when the backends are removed. Being set + to <code>true</code> by default. + </column> </group> </table> diff --git a/tests/ovn.at b/tests/ovn.at index fa786112c..724e8b6e5 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -35039,6 +35039,26 @@ AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.50.10:80, backend=192.16 AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) +# Check if disable of the CT flush works +check ovn-nbctl lb-del lb1 +check ovn-nbctl lb-add lb1 "192.168.70.10:80" "192.168.80.10:8080,192.168.90.10:8080" +check ovn-nbctl set load_balancer lb1 options:ct_flush_enabled="false" +check ovn-nbctl ls-lb-add sw lb1 +check ovs-vsctl set interface p1 external_ids:iface-id=lsp1 +check ovn-nbctl --wait=hv sync + +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) + +# Remove one backend +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"192.168.70.10:80"="192.168.80.10:8080"' + +#AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.90.10:8080, protocol=6" hv1/ovn-controller.log], [1]) +#AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) + +check ovn-nbctl --wait=hv lb-del lb1 +AT_CHECK([grep -q "Flushing CT for 5-tuple: vip=192.168.70.10:80, backend=192.168.80.10:8080, protocol=6" hv1/ovn-controller.log], [1]) +AT_CHECK([test "$(grep -c "Flushing CT for 5-tuple" hv1/ovn-controller.log)" = "6"], [0]) + OVN_CLEANUP([hv1]) AT_CLEANUP ]) diff --git a/tests/system-ovn.at b/tests/system-ovn.at index ad1188078..afd10c15d 100644 --- a/tests/system-ovn.at +++ b/tests/system-ovn.at @@ -10013,16 +10013,18 @@ OVS_START_L7([bar1], [http]) OVS_START_L7([bar2], [http]) OVS_START_L7([bar3], [http]) -OVS_WAIT_FOR_OUTPUT([ - for i in `seq 1 20`; do - ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; - done - ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [dnl +m4_define([LB1_CT_ENTRIES], [dnl tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.2,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.3,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) tcp,orig=(src=192.168.1.2,dst=30.0.0.1,sport=<cleared>,dport=<cleared>),reply=(src=172.16.1.4,dst=192.168.1.2,sport=<cleared>,dport=<cleared>),zone=<cleared>,mark=2,protoinfo=(state=<cleared>) ]) +OVS_WAIT_FOR_OUTPUT([ + for i in `seq 1 20`; do + ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; + done + ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) + OVS_WAIT_FOR_OUTPUT([ for i in `seq 1 20`; do ip netns exec foo1 wget 30.0.0.2:8000 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; @@ -10096,6 +10098,27 @@ check ovn-nbctl lb-del lb2 OVS_WAIT_UNTIL([test "$(ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.3) | wc -l)" = "0"]) +# Config OVN wih disabled CT flush. +check ovn-nbctl lb-add lb1 30.0.0.1 "172.16.1.2,172.16.1.3,172.16.1.4" \ +-- set load_balancer lb1 options:ct_flush_enabled="false" +check ovn-nbctl ls-lb-add foo lb1 + +OVS_WAIT_FOR_OUTPUT([ + for i in `seq 1 20`; do + ip netns exec foo1 wget 30.0.0.1 -t 5 -T 1 --retry-connrefused -v -o wget$i.log; + done + ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) + +# Remove one backend +check ovn-nbctl --wait=hv set load_balancer lb1 vips='"30.0.0.1"="172.16.1.2,172.16.1.3"' + +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) + +# Remove whole LB +check ovn-nbctl --wait=hv lb-del lb1 + +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(30.0.0.1) | sed -e 's/zone=[[0-9]]*/zone=<cleared>/'], [0], [LB1_CT_ENTRIES]) + OVS_APP_EXIT_AND_WAIT([ovn-controller]) as ovn-sb
The CT flush was enabled by default for every LB, add config option called "ct_flush_enabled" that allows users to enable/disable the CT flush. The CT flush is remaining enabled by default. Reported-at: https://bugzilla.redhat.com/2178962 Signed-off-by: Ales Musil <amusil@redhat.com> --- NEWS | 2 ++ controller/ovn-controller.c | 6 ++++-- ovn-nb.xml | 7 +++++++ tests/ovn.at | 20 ++++++++++++++++++++ tests/system-ovn.at | 33 ++++++++++++++++++++++++++++----- 5 files changed, 61 insertions(+), 7 deletions(-)