Message ID | 20210510175119.1507098-1-ihrachys@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,v11] ovn-northd: introduce new allow-stateless ACL verb | expand |
On Mon, May 10, 2021 at 10:51 AM Ihar Hrachyshka <ihrachys@redhat.com> wrote: > > For allow-stateless ACLs, bypass connection tracking by avoiding > setting ct hints for matching traffic. Avoid sending all traffic to ct > when a stateful ACL is present. > > === > > Reusing an existing 'allow' verb for stateless matching would have its > drawbacks, specifically, when it comes to backwards incompatibility of > the new behavior with existing environments. When using "allow" ACLs > in mixed allow/allow-related environment, we still commit "allow" > traffic to conntrack. This unnecessarily hits performance when mixed > ACL action types were used for the same datapath. This is why we > introduce a new action verb to describe stateless behavior. > > Another complexity to consider is the fact that with stateless > matching, one would not be able to rely on 'related' magic that > guarantees that reply traffic is passed through. Instead, the user > would have to accurately define matching rules both for request and > reply directions of a protocol session. Specifically, when allowing > ICMP for a specific peer host, one has to define 'allow-stateless' > rules that would match against ip.dst for request direction and ip.src > for reply direction. Other protocols and scenarios will require their > own fine grained matching approaches implemented by the user. > > === > > For performance measurements, qperf was used. Tests were executed on two > setups: > > 1) ovn-fake-multinode virtual environment; > 2) Supermicro SYS-5039MS-H8TRF 3-node RH-OSP 16.x cluster with 2 compute > nodes, NIC: "Intel Corporation Ethernet Controller XXV710 for 25GbE > SFP28 (rev 02)", using fake_nova_driver to avoid qemu overhead. > > 1) ovn-fake-multinode: > > Performance measured between two virtual nodes, two ports > that belong to different LSs connected via router. Using qperf, > performance was measured for UDP, TCP, SCTP protocols (using > <proto>_lat and <proto>_bw tests). The qperf version used: > 0.4.9-16.fc31.x86_64. Each test scenario was executed five times and > averages compared. > > Tests were executed with `allow-stateless` rules for the tested > protocol and `allow-related` for another protocol set for both ports, > both directions, e.g. for TCP scenario, the following ACLs were > defined: > > ovn-nbctl acl-add sw0 to-lport 100 tcp allow-stateless > ovn-nbctl acl-add sw0 from-lport 100 tcp allow-stateless > ovn-nbctl acl-add sw1 to-lport 100 tcp allow-stateless > ovn-nbctl acl-add sw1 from-lport 100 tcp allow-stateless > > ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related > ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related > ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related > ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related > > In this particular environment, improvement was seen in send_bw, > latency, and msg_rate measurements, where applicable, for all three > protocols under test. > > for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%) > latency: 16 us => 14.08 us (-12%) > msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%) > > for TCP, latency: 18.6 us => 14.88 us (-20%) > msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%) > > for SCTP, latency: 21.98 us => 19.42 us (-11.65%) > msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%) > > 2) Supermicro RH-OSP cluster: > > Two scenarios executed: > - connectivity between two ports on the same switch > - connectivity between two ports on different switches connected via > router > > For the same switch, improvements are as follows: > - TCP latency: -5.8%, UDP latency: -13.9%, SCTP latency: -10.8% > - TCP bw: +0.8%, UDP bw, +9.5%, SCTP bw: +13.29% > > For different switches, improvements are as follows: > - TCP latency: -6.9%, UDP: -11.9%, SCTP: -4.2% > - TCP bw: -0.2%, UDP bw: +11.1%, SCTP bw: +12.9% > > The effect is somewhat more noticeable in same switch scenario. The > effect is less pronounced for TCP than for UDP. > > === > > The patch takes inspiration from a now abandoned patch: > > "ovn-northd: Support mixing stateless/stateful ACLs with > Stateless_Filter." by Dumitru Ceara. > > The original patch assumed CMS doesn't require full flexibility of > matching rules for stateless matching (for example, to be used by > OpenShift). But other CMS interfaces may require the same > customizability for stateless as well as stateful matching, like in > OpenStack Neutron API. Which is why this patch reuses existing ACL > object type to describe stateless rules. > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> > > --- > > v1: initial version. > v2: rebased after conflict. > v3: added ddlog implementation. > v3: apply stateless skip-hint-set rules to appropriate direction only. > v3: added more background as to implementation in commit message. > v3: test stateless scenarios with ddlog too. > v3: rebased after conflict. > v4: introduce a separate allow-stateless ACL match verb. > v5: rebased. > v6: updated docs for new allow-stateless approach. > v6: removed no longer valid comments. > v6: removed acl_is_stateless. > v7: bump north db schema version to 5.31.0 -> 5.32.0. > v8: fixed checkpatch failure on a too long line in dbschema. > v8: added perf data on baremetal lab testing. > v9: fixed ovsdb checksum. > v10: expanded documentation with more explicit distinction between acl > types. > v10: updated a somewhat obsolete comment. > v10: updated commit message with better numbers for scenarios after > re-running some tests. > v11: updated test scenarios to reflect: "northd: Optimize ct nat for > load balancer traffic". > v11: rebased. > --- > NEWS | 2 + > northd/ovn-northd.8.xml | 8 +- > northd/ovn-northd.c | 71 +++++++++-- > northd/ovn_northd.dl | 31 +++++ > ovn-nb.ovsschema | 9 +- > ovn-nb.xml | 15 ++- > tests/ovn-northd.at | 269 ++++++++++++++++++++++++++++++++++++++++ > utilities/ovn-nbctl.c | 6 +- > 8 files changed, 397 insertions(+), 14 deletions(-) > > diff --git a/NEWS b/NEWS > index c91bca0e2..1d3603269 100644 > --- a/NEWS > +++ b/NEWS > @@ -16,6 +16,8 @@ Post-v21.03.0 > be used in the logical flow matches. CMS can consider setting this to > false, if they want to use smart NICs which don't support offloading > datapath flows with this field used. > + - Introduce a new "allow-stateless" ACL verb to always bypass connection > + tracking. The existing "allow" verb behavior is left intact. > > OVN v21.03.0 - 12 Mar 2021 > ------------------------- > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml > index 1aef75b27..bbcbcd0e8 100644 > --- a/northd/ovn-northd.8.xml > +++ b/northd/ovn-northd.8.xml > @@ -419,7 +419,9 @@ > before eventually advancing to ingress table <code>ACLs</code>. If > special ports such as route ports or localnet ports can't use ct(), a > priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor > - Discovery and MLD traffic also skips stateful ACLs. > + Discovery and MLD traffic also skips stateful ACLs. For "allow-stateless" > + ACLs, a flow is added to bypass setting the hint for connection tracker > + processing. > </p> > > <p> > @@ -642,6 +644,10 @@ > for new connections and <code>reg0[1] = 1; next;</code> for existing > connections. > </li> > + <li> > + <code>allow-stateless</code> ACLs translate into logical > + flows with the <code>next;</code> action. > + </li> > <li> > <code>reject</code> ACLs translate into logical > flows with the > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 1d4c5b67b..f503ddd5e 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -4975,7 +4975,52 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, > } > > static void > -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > +build_stateless_filter(struct ovn_datapath *od, > + const struct nbrec_acl *acl, > + struct hmap *lflows) > +{ > + if (!strcmp(acl->direction, "from-lport")) { > + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_PRE_ACL, > + acl->priority + OVN_ACL_PRI_OFFSET, > + acl->match, > + "next;", > + &acl->header_); > + } else { > + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL, > + acl->priority + OVN_ACL_PRI_OFFSET, > + acl->match, > + "next;", > + &acl->header_); > + } > +} > + > +static void > +build_stateless_filters(struct ovn_datapath *od, struct hmap *port_groups, > + struct hmap *lflows) > +{ > + for (size_t i = 0; i < od->nbs->n_acls; i++) { > + const struct nbrec_acl *acl = od->nbs->acls[i]; > + if (!strcmp(acl->action, "allow-stateless")) { > + build_stateless_filter(od, acl, lflows); > + } > + } > + > + struct ovn_port_group *pg; > + HMAP_FOR_EACH (pg, key_node, port_groups) { > + if (ovn_port_group_ls_find(pg, &od->nbs->header_.uuid)) { > + for (size_t i = 0; i < pg->nb_pg->n_acls; i++) { > + const struct nbrec_acl *acl = pg->nb_pg->acls[i]; > + if (!strcmp(acl->action, "allow-stateless")) { > + build_stateless_filter(od, acl, lflows); > + } > + } > + } > + } > +} > + > +static void > +build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups, > + struct hmap *lflows) > { > /* Ingress and Egress Pre-ACL Table (Priority 0): Packets are > * allowed by default. */ > @@ -4988,9 +5033,9 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, > "eth.src == $svc_monitor_mac", "next;"); > > - /* If there are any stateful ACL rules in this datapath, we must > - * send all IP packets through the conntrack action, which handles > - * defragmentation, in order to match L4 headers. */ > + /* If there are any stateful ACL rules in this datapath, we may > + * send IP packets for some (allow) filters through the conntrack action, > + * which handles defragmentation, in order to match L4 headers. */ > if (od->has_stateful_acl) { > for (size_t i = 0; i < od->n_router_ports; i++) { > skip_port_from_conntrack(od, od->router_ports[i], > @@ -5003,6 +5048,8 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) > 110, lflows); > } > > + build_stateless_filters(od, port_groups, lflows); > + > /* Ingress and Egress Pre-ACL Table (Priority 110). > * > * Not to do conntrack on ND and ICMP destination > @@ -5406,7 +5453,8 @@ build_acl_log(struct ds *actions, const struct nbrec_acl *acl, > } else if (!strcmp(acl->action, "reject")) { > ds_put_cstr(actions, "verdict=reject, "); > } else if (!strcmp(acl->action, "allow") > - || !strcmp(acl->action, "allow-related")) { > + || !strcmp(acl->action, "allow-related") > + || !strcmp(acl->action, "allow-stateless")) { > ds_put_cstr(actions, "verdict=allow, "); > } > > @@ -5465,7 +5513,16 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, > bool ingress = !strcmp(acl->direction, "from-lport") ? true :false; > enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL; > > - if (!strcmp(acl->action, "allow") > + if (!strcmp(acl->action, "allow-stateless")) { > + struct ds actions = DS_EMPTY_INITIALIZER; > + build_acl_log(&actions, acl, meter_groups); > + ds_put_cstr(&actions, "next;"); > + ovn_lflow_add_with_hint(lflows, od, stage, > + acl->priority + OVN_ACL_PRI_OFFSET, > + acl->match, ds_cstr(&actions), > + &acl->header_); > + ds_destroy(&actions); > + } else if (!strcmp(acl->action, "allow") > || !strcmp(acl->action, "allow-related")) { > /* If there are any stateful flows, we must even commit "allow" > * actions. This is because, while the initiater's > @@ -6795,7 +6852,7 @@ build_lswitch_lflows_pre_acl_and_acl(struct ovn_datapath *od, > od->has_stateful_acl = ls_has_stateful_acl(od); > od->has_lb_vip = ls_has_lb_vip(od); > > - build_pre_acls(od, lflows); > + build_pre_acls(od, port_groups, lflows); > build_pre_lb(od, lflows, meter_groups, lbs); > build_pre_stateful(od, lflows); > build_acl_hints(od, lflows); > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > index ffd09c35f..d56ff8b22 100644 > --- a/northd/ovn_northd.dl > +++ b/northd/ovn_northd.dl > @@ -1822,6 +1822,27 @@ for (&Switch(.ls =ls)) { > .external_ids = map_empty()) > } > > +for (&SwitchACL(.sw = sw@&Switch{.ls = ls}, .acl = &acl, .has_fair_meter = fair_meter)) { > + if (sw.has_stateful_acl) { > + if (acl.action == "allow-stateless") { > + if (acl.direction == "from-lport") { > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_PRE_ACL(), > + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), > + .__match = acl.__match, > + .actions = "next;", > + .external_ids = stage_hint(acl._uuid)) > + } else { > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_PRE_ACL(), > + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), > + .__match = acl.__match, > + .actions = "next;", > + .external_ids = stage_hint(acl._uuid)) > + } > + } > + } > +} > > /* If there are any stateful ACL rules in this datapath, we must > * send all IP packets through the conntrack action, which handles > @@ -2211,6 +2232,9 @@ function build_acl_log(acl: nb::ACL, fair_meter: bool): string = > "allow-related" -> { > strs.push("verdict=allow") > }, > + "allow-stateless" -> { > + strs.push("verdict=allow") > + }, > _ -> () > }; > match (acl.meter) { > @@ -2595,6 +2619,13 @@ for (&SwitchACL(.sw = sw@&Switch{.ls = ls}, .acl = &acl, .has_fair_meter = fair_ > .actions = "${acl_log}next;", > .external_ids = stage_hint) > } > + } else if (acl.action == "allow-stateless") { > + Flow(.logical_datapath = ls._uuid, > + .stage = stage, > + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), > + .__match = acl.__match, > + .actions = "${acl_log}next;", > + .external_ids = stage_hint) > } else if (acl.action == "drop" or acl.action == "reject") { > /* The implementation of "drop" differs if stateful ACLs are in > * use for this datapath. In that case, the actions differ > diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema > index 29019809c..faf619a1c 100644 > --- a/ovn-nb.ovsschema > +++ b/ovn-nb.ovsschema > @@ -1,7 +1,7 @@ > { > "name": "OVN_Northbound", > - "version": "5.31.0", > - "cksum": "2352750632 28701", > + "version": "5.32.0", > + "cksum": "204590300 28863", > "tables": { > "NB_Global": { > "columns": { > @@ -221,7 +221,10 @@ > "enum": ["set", ["from-lport", "to-lport"]]}}}, > "match": {"type": "string"}, > "action": {"type": {"key": {"type": "string", > - "enum": ["set", ["allow", "allow-related", "drop", "reject"]]}}}, > + "enum": ["set", > + ["allow", "allow-related", > + "allow-stateless", "drop", > + "reject"]]}}}, > "log": {"type": "boolean"}, > "severity": {"type": {"key": {"type": "string", > "enum": ["set", > diff --git a/ovn-nb.xml b/ovn-nb.xml > index 6033d9ed9..ed271d8eb 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -1806,7 +1806,20 @@ > <p>The action to take when the ACL rule matches:</p> > <ul> > <li> > - <code>allow</code>: Forward the packet. > + <code>allow-stateless</code>: Always forward the packet in stateless > + manner, omitting connection tracking mechanism, regardless of other > + rules defined for the switch. May require defining additional rules > + for inbound replies. For example, if you define a rule to allow > + outgoing TCP traffic directed to an IP address, then you probably > + also want to define another rule to allow incoming TCP traffic coming > + from this same IP address. > + </li> > + > + <li> > + <code>allow</code>: Forward the packet. It will also send the > + packets through connection tracking when > + <code>allow-related</code> rules exist on the logical switch. > + Otherwise, it's equivalent to <code>allow-stateless</code>. > </li> > > <li> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at > index 10b204624..4fb0a195e 100644 > --- a/tests/ovn-northd.at > +++ b/tests/ovn-northd.at > @@ -2592,6 +2592,275 @@ sed 's/reg8\[[0..15\]] == [[0-9]]*/reg8\[[0..15\]] == <cleared>/' | sort], [0], > AT_CLEANUP > ]) > > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([ovn -- ACL allow-stateless omit conntrack - Logical_Switch]) > +ovn_start > + > +ovn-nbctl ls-add ls > +ovn-nbctl lsp-add ls lsp1 > +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01 > +ovn-nbctl lsp-add ls lsp2 > +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02 > + > +for direction in from to; do > + ovn-nbctl acl-add ls ${direction}-lport 3 "tcp" allow-related > + ovn-nbctl acl-add ls ${direction}-lport 2 "udp" allow-related > + ovn-nbctl acl-add ls ${direction}-lport 1 "ip" drop > +done > +ovn-nbctl --wait=sb sync > + > +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02' > +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66' > +flow_tcp='tcp && tcp.dst == 80' > +flow_udp='udp && udp.dst == 80' > + > +lsp1_inport=$(fetch_column Port_Binding tunnel_key logical_port=lsp1) > + > +# TCP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# Allow stateless for TCP. > +for direction in from to; do > + ovn-nbctl acl-add ls ${direction}-lport 1 tcp allow-stateless > +done > +ovn-nbctl --wait=sb sync > + > +# TCP packets should not go to conntrack anymore. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +output("lsp2"); > +]) > + > +# UDP packets still go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# Add a load balancer. > +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp > +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp > +ovn-nbctl ls-lb-add ls lb-tcp > +ovn-nbctl ls-lb-add ls lb-udp > + > +# Remove stateless for TCP. > +ovn-nbctl acl-del ls > +ovn-nbctl --wait=sb sync > + > +# TCP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# Allow stateless for TCP. > +for direction in from to; do > + ovn-nbctl acl-add ls ${direction}-lport 1 tcp allow-stateless > +done > +ovn-nbctl --wait=sb sync > + > +# TCP packets should go to conntrack for load balancing. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets still go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +AT_CLEANUP > +]) > + > +OVN_FOR_EACH_NORTHD([ > +AT_SETUP([ovn -- ACL allow-stateless omit conntrack - Port_Group]) > +ovn_start > + > +ovn-nbctl ls-add ls > +ovn-nbctl lsp-add ls lsp1 > +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01 > +ovn-nbctl lsp-add ls lsp2 > +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02 > + > +ovn-nbctl pg-add pg lsp1 lsp2 > + > +for direction in from to; do > + ovn-nbctl acl-add pg ${direction}-lport 3 "tcp" allow-related > + ovn-nbctl acl-add pg ${direction}-lport 2 "udp" allow-related > + ovn-nbctl acl-add pg ${direction}-lport 1 "ip" drop > +done > +ovn-nbctl --wait=sb sync > + > +lsp1_inport=$(fetch_column Port_Binding tunnel_key logical_port=lsp1) > +echo $lsp1_inport > + > +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02' > +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66' > +flow_tcp='tcp && tcp.dst == 80' > +flow_udp='udp && udp.dst == 80' > + > +# TCP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# Allow stateless for TCP. > +for direction in from to; do > + ovn-nbctl acl-add pg ${direction}-lport 1 tcp allow-stateless > +done > +ovn-nbctl --wait=sb sync > + > +# TCP packets should not go to conntrack anymore. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +output("lsp2"); > +]) > + > +# UDP packets still go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_next(ct_state=new|trk) { > + ct_next(ct_state=new|trk) { > + output("lsp2"); > + }; > +}; > +]) > + > +# Add a load balancer. > +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp > +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp > +ovn-nbctl ls-lb-add ls lb-tcp > +ovn-nbctl ls-lb-add ls lb-udp > + > +# Remove stateless for TCP. > +ovn-nbctl acl-del pg > +ovn-nbctl --wait=sb sync > + > +# TCP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets should go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# Allow stateless for TCP. > +for direction in from to; do > + ovn-nbctl acl-add pg ${direction}-lport 1 tcp allow-stateless > +done > +ovn-nbctl --wait=sb sync > + > +# TCP packets should go to conntrack for load balancing. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +# UDP packets still go to conntrack. > +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" > +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl > +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 > +ct_lb { > + ct_lb { > + output("lsp2"); > + }; > +}; > +]) > + > +AT_CLEANUP > +]) > + > OVN_FOR_EACH_NORTHD([ > AT_SETUP([ovn -- check BFD config propagation to SBDB]) > AT_KEYWORDS([northd-bfd]) > diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c > index 042c21002..48fd0b7ee 100644 > --- a/utilities/ovn-nbctl.c > +++ b/utilities/ovn-nbctl.c > @@ -2303,9 +2303,11 @@ nbctl_acl_add(struct ctl_context *ctx) > > /* Validate action. */ > if (strcmp(action, "allow") && strcmp(action, "allow-related") > - && strcmp(action, "drop") && strcmp(action, "reject")) { > + && strcmp(action, "allow-stateless") && strcmp(action, "drop") > + && strcmp(action, "reject")) { > ctl_error(ctx, "%s: action must be one of \"allow\", " > - "\"allow-related\", \"drop\", and \"reject\"", action); > + "\"allow-related\", \"allow-stateless\", \"drop\", " > + "and \"reject\"", action); > return; > } > > -- > 2.31.1 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev Thanks Ihar (and Dumitru for earlier reviews). I applied this to master. Looking forward to the follow-up patch that optimizes the use case when no stateful ACL exists. Thanks, Han
diff --git a/NEWS b/NEWS index c91bca0e2..1d3603269 100644 --- a/NEWS +++ b/NEWS @@ -16,6 +16,8 @@ Post-v21.03.0 be used in the logical flow matches. CMS can consider setting this to false, if they want to use smart NICs which don't support offloading datapath flows with this field used. + - Introduce a new "allow-stateless" ACL verb to always bypass connection + tracking. The existing "allow" verb behavior is left intact. OVN v21.03.0 - 12 Mar 2021 ------------------------- diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 1aef75b27..bbcbcd0e8 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -419,7 +419,9 @@ before eventually advancing to ingress table <code>ACLs</code>. If special ports such as route ports or localnet ports can't use ct(), a priority-110 flow is added to skip over stateful ACLs. IPv6 Neighbor - Discovery and MLD traffic also skips stateful ACLs. + Discovery and MLD traffic also skips stateful ACLs. For "allow-stateless" + ACLs, a flow is added to bypass setting the hint for connection tracker + processing. </p> <p> @@ -642,6 +644,10 @@ for new connections and <code>reg0[1] = 1; next;</code> for existing connections. </li> + <li> + <code>allow-stateless</code> ACLs translate into logical + flows with the <code>next;</code> action. + </li> <li> <code>reject</code> ACLs translate into logical flows with the diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 1d4c5b67b..f503ddd5e 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -4975,7 +4975,52 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, } static void -build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) +build_stateless_filter(struct ovn_datapath *od, + const struct nbrec_acl *acl, + struct hmap *lflows) +{ + if (!strcmp(acl->direction, "from-lport")) { + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_PRE_ACL, + acl->priority + OVN_ACL_PRI_OFFSET, + acl->match, + "next;", + &acl->header_); + } else { + ovn_lflow_add_with_hint(lflows, od, S_SWITCH_OUT_PRE_ACL, + acl->priority + OVN_ACL_PRI_OFFSET, + acl->match, + "next;", + &acl->header_); + } +} + +static void +build_stateless_filters(struct ovn_datapath *od, struct hmap *port_groups, + struct hmap *lflows) +{ + for (size_t i = 0; i < od->nbs->n_acls; i++) { + const struct nbrec_acl *acl = od->nbs->acls[i]; + if (!strcmp(acl->action, "allow-stateless")) { + build_stateless_filter(od, acl, lflows); + } + } + + struct ovn_port_group *pg; + HMAP_FOR_EACH (pg, key_node, port_groups) { + if (ovn_port_group_ls_find(pg, &od->nbs->header_.uuid)) { + for (size_t i = 0; i < pg->nb_pg->n_acls; i++) { + const struct nbrec_acl *acl = pg->nb_pg->acls[i]; + if (!strcmp(acl->action, "allow-stateless")) { + build_stateless_filter(od, acl, lflows); + } + } + } + } +} + +static void +build_pre_acls(struct ovn_datapath *od, struct hmap *port_groups, + struct hmap *lflows) { /* Ingress and Egress Pre-ACL Table (Priority 0): Packets are * allowed by default. */ @@ -4988,9 +5033,9 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, "eth.src == $svc_monitor_mac", "next;"); - /* If there are any stateful ACL rules in this datapath, we must - * send all IP packets through the conntrack action, which handles - * defragmentation, in order to match L4 headers. */ + /* If there are any stateful ACL rules in this datapath, we may + * send IP packets for some (allow) filters through the conntrack action, + * which handles defragmentation, in order to match L4 headers. */ if (od->has_stateful_acl) { for (size_t i = 0; i < od->n_router_ports; i++) { skip_port_from_conntrack(od, od->router_ports[i], @@ -5003,6 +5048,8 @@ build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) 110, lflows); } + build_stateless_filters(od, port_groups, lflows); + /* Ingress and Egress Pre-ACL Table (Priority 110). * * Not to do conntrack on ND and ICMP destination @@ -5406,7 +5453,8 @@ build_acl_log(struct ds *actions, const struct nbrec_acl *acl, } else if (!strcmp(acl->action, "reject")) { ds_put_cstr(actions, "verdict=reject, "); } else if (!strcmp(acl->action, "allow") - || !strcmp(acl->action, "allow-related")) { + || !strcmp(acl->action, "allow-related") + || !strcmp(acl->action, "allow-stateless")) { ds_put_cstr(actions, "verdict=allow, "); } @@ -5465,7 +5513,16 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, bool ingress = !strcmp(acl->direction, "from-lport") ? true :false; enum ovn_stage stage = ingress ? S_SWITCH_IN_ACL : S_SWITCH_OUT_ACL; - if (!strcmp(acl->action, "allow") + if (!strcmp(acl->action, "allow-stateless")) { + struct ds actions = DS_EMPTY_INITIALIZER; + build_acl_log(&actions, acl, meter_groups); + ds_put_cstr(&actions, "next;"); + ovn_lflow_add_with_hint(lflows, od, stage, + acl->priority + OVN_ACL_PRI_OFFSET, + acl->match, ds_cstr(&actions), + &acl->header_); + ds_destroy(&actions); + } else if (!strcmp(acl->action, "allow") || !strcmp(acl->action, "allow-related")) { /* If there are any stateful flows, we must even commit "allow" * actions. This is because, while the initiater's @@ -6795,7 +6852,7 @@ build_lswitch_lflows_pre_acl_and_acl(struct ovn_datapath *od, od->has_stateful_acl = ls_has_stateful_acl(od); od->has_lb_vip = ls_has_lb_vip(od); - build_pre_acls(od, lflows); + build_pre_acls(od, port_groups, lflows); build_pre_lb(od, lflows, meter_groups, lbs); build_pre_stateful(od, lflows); build_acl_hints(od, lflows); diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl index ffd09c35f..d56ff8b22 100644 --- a/northd/ovn_northd.dl +++ b/northd/ovn_northd.dl @@ -1822,6 +1822,27 @@ for (&Switch(.ls =ls)) { .external_ids = map_empty()) } +for (&SwitchACL(.sw = sw@&Switch{.ls = ls}, .acl = &acl, .has_fair_meter = fair_meter)) { + if (sw.has_stateful_acl) { + if (acl.action == "allow-stateless") { + if (acl.direction == "from-lport") { + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_PRE_ACL(), + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), + .__match = acl.__match, + .actions = "next;", + .external_ids = stage_hint(acl._uuid)) + } else { + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_PRE_ACL(), + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), + .__match = acl.__match, + .actions = "next;", + .external_ids = stage_hint(acl._uuid)) + } + } + } +} /* If there are any stateful ACL rules in this datapath, we must * send all IP packets through the conntrack action, which handles @@ -2211,6 +2232,9 @@ function build_acl_log(acl: nb::ACL, fair_meter: bool): string = "allow-related" -> { strs.push("verdict=allow") }, + "allow-stateless" -> { + strs.push("verdict=allow") + }, _ -> () }; match (acl.meter) { @@ -2595,6 +2619,13 @@ for (&SwitchACL(.sw = sw@&Switch{.ls = ls}, .acl = &acl, .has_fair_meter = fair_ .actions = "${acl_log}next;", .external_ids = stage_hint) } + } else if (acl.action == "allow-stateless") { + Flow(.logical_datapath = ls._uuid, + .stage = stage, + .priority = acl.priority + oVN_ACL_PRI_OFFSET(), + .__match = acl.__match, + .actions = "${acl_log}next;", + .external_ids = stage_hint) } else if (acl.action == "drop" or acl.action == "reject") { /* The implementation of "drop" differs if stateful ACLs are in * use for this datapath. In that case, the actions differ diff --git a/ovn-nb.ovsschema b/ovn-nb.ovsschema index 29019809c..faf619a1c 100644 --- a/ovn-nb.ovsschema +++ b/ovn-nb.ovsschema @@ -1,7 +1,7 @@ { "name": "OVN_Northbound", - "version": "5.31.0", - "cksum": "2352750632 28701", + "version": "5.32.0", + "cksum": "204590300 28863", "tables": { "NB_Global": { "columns": { @@ -221,7 +221,10 @@ "enum": ["set", ["from-lport", "to-lport"]]}}}, "match": {"type": "string"}, "action": {"type": {"key": {"type": "string", - "enum": ["set", ["allow", "allow-related", "drop", "reject"]]}}}, + "enum": ["set", + ["allow", "allow-related", + "allow-stateless", "drop", + "reject"]]}}}, "log": {"type": "boolean"}, "severity": {"type": {"key": {"type": "string", "enum": ["set", diff --git a/ovn-nb.xml b/ovn-nb.xml index 6033d9ed9..ed271d8eb 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -1806,7 +1806,20 @@ <p>The action to take when the ACL rule matches:</p> <ul> <li> - <code>allow</code>: Forward the packet. + <code>allow-stateless</code>: Always forward the packet in stateless + manner, omitting connection tracking mechanism, regardless of other + rules defined for the switch. May require defining additional rules + for inbound replies. For example, if you define a rule to allow + outgoing TCP traffic directed to an IP address, then you probably + also want to define another rule to allow incoming TCP traffic coming + from this same IP address. + </li> + + <li> + <code>allow</code>: Forward the packet. It will also send the + packets through connection tracking when + <code>allow-related</code> rules exist on the logical switch. + Otherwise, it's equivalent to <code>allow-stateless</code>. </li> <li> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at index 10b204624..4fb0a195e 100644 --- a/tests/ovn-northd.at +++ b/tests/ovn-northd.at @@ -2592,6 +2592,275 @@ sed 's/reg8\[[0..15\]] == [[0-9]]*/reg8\[[0..15\]] == <cleared>/' | sort], [0], AT_CLEANUP ]) +OVN_FOR_EACH_NORTHD([ +AT_SETUP([ovn -- ACL allow-stateless omit conntrack - Logical_Switch]) +ovn_start + +ovn-nbctl ls-add ls +ovn-nbctl lsp-add ls lsp1 +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01 +ovn-nbctl lsp-add ls lsp2 +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02 + +for direction in from to; do + ovn-nbctl acl-add ls ${direction}-lport 3 "tcp" allow-related + ovn-nbctl acl-add ls ${direction}-lport 2 "udp" allow-related + ovn-nbctl acl-add ls ${direction}-lport 1 "ip" drop +done +ovn-nbctl --wait=sb sync + +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02' +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66' +flow_tcp='tcp && tcp.dst == 80' +flow_udp='udp && udp.dst == 80' + +lsp1_inport=$(fetch_column Port_Binding tunnel_key logical_port=lsp1) + +# TCP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# UDP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# Allow stateless for TCP. +for direction in from to; do + ovn-nbctl acl-add ls ${direction}-lport 1 tcp allow-stateless +done +ovn-nbctl --wait=sb sync + +# TCP packets should not go to conntrack anymore. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +output("lsp2"); +]) + +# UDP packets still go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# Add a load balancer. +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp +ovn-nbctl ls-lb-add ls lb-tcp +ovn-nbctl ls-lb-add ls lb-udp + +# Remove stateless for TCP. +ovn-nbctl acl-del ls +ovn-nbctl --wait=sb sync + +# TCP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# UDP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# Allow stateless for TCP. +for direction in from to; do + ovn-nbctl acl-add ls ${direction}-lport 1 tcp allow-stateless +done +ovn-nbctl --wait=sb sync + +# TCP packets should go to conntrack for load balancing. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# UDP packets still go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +AT_CLEANUP +]) + +OVN_FOR_EACH_NORTHD([ +AT_SETUP([ovn -- ACL allow-stateless omit conntrack - Port_Group]) +ovn_start + +ovn-nbctl ls-add ls +ovn-nbctl lsp-add ls lsp1 +ovn-nbctl lsp-set-addresses lsp1 00:00:00:00:00:01 +ovn-nbctl lsp-add ls lsp2 +ovn-nbctl lsp-set-addresses lsp2 00:00:00:00:00:02 + +ovn-nbctl pg-add pg lsp1 lsp2 + +for direction in from to; do + ovn-nbctl acl-add pg ${direction}-lport 3 "tcp" allow-related + ovn-nbctl acl-add pg ${direction}-lport 2 "udp" allow-related + ovn-nbctl acl-add pg ${direction}-lport 1 "ip" drop +done +ovn-nbctl --wait=sb sync + +lsp1_inport=$(fetch_column Port_Binding tunnel_key logical_port=lsp1) +echo $lsp1_inport + +flow_eth='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02' +flow_ip='ip.ttl==64 && ip4.src == 42.42.42.1 && ip4.dst == 66.66.66.66' +flow_tcp='tcp && tcp.dst == 80' +flow_udp='udp && udp.dst == 80' + +# TCP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# UDP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# Allow stateless for TCP. +for direction in from to; do + ovn-nbctl acl-add pg ${direction}-lport 1 tcp allow-stateless +done +ovn-nbctl --wait=sb sync + +# TCP packets should not go to conntrack anymore. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +output("lsp2"); +]) + +# UDP packets still go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_next(ct_state=new|trk) { + ct_next(ct_state=new|trk) { + output("lsp2"); + }; +}; +]) + +# Add a load balancer. +ovn-nbctl lb-add lb-tcp 66.66.66.66:80 42.42.42.2:8080 tcp +ovn-nbctl lb-add lb-udp 66.66.66.66:80 42.42.42.2:8080 udp +ovn-nbctl ls-lb-add ls lb-tcp +ovn-nbctl ls-lb-add ls lb-udp + +# Remove stateless for TCP. +ovn-nbctl acl-del pg +ovn-nbctl --wait=sb sync + +# TCP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# UDP packets should go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# Allow stateless for TCP. +for direction in from to; do + ovn-nbctl acl-add pg ${direction}-lport 1 tcp allow-stateless +done +ovn-nbctl --wait=sb sync + +# TCP packets should go to conntrack for load balancing. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_tcp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# tcp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +# UDP packets still go to conntrack. +flow="inport == \"lsp1\" && ${flow_eth} && ${flow_ip} && ${flow_udp}" +AT_CHECK_UNQUOTED([ovn-trace --ct new --ct new --minimal ls "${flow}"], [0], [dnl +# udp,reg14=0x${lsp1_inport},vlan_tci=0x0000,dl_src=00:00:00:00:00:01,dl_dst=00:00:00:00:00:02,nw_src=42.42.42.1,nw_dst=66.66.66.66,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80 +ct_lb { + ct_lb { + output("lsp2"); + }; +}; +]) + +AT_CLEANUP +]) + OVN_FOR_EACH_NORTHD([ AT_SETUP([ovn -- check BFD config propagation to SBDB]) AT_KEYWORDS([northd-bfd]) diff --git a/utilities/ovn-nbctl.c b/utilities/ovn-nbctl.c index 042c21002..48fd0b7ee 100644 --- a/utilities/ovn-nbctl.c +++ b/utilities/ovn-nbctl.c @@ -2303,9 +2303,11 @@ nbctl_acl_add(struct ctl_context *ctx) /* Validate action. */ if (strcmp(action, "allow") && strcmp(action, "allow-related") - && strcmp(action, "drop") && strcmp(action, "reject")) { + && strcmp(action, "allow-stateless") && strcmp(action, "drop") + && strcmp(action, "reject")) { ctl_error(ctx, "%s: action must be one of \"allow\", " - "\"allow-related\", \"drop\", and \"reject\"", action); + "\"allow-related\", \"allow-stateless\", \"drop\", " + "and \"reject\"", action); return; }
For allow-stateless ACLs, bypass connection tracking by avoiding setting ct hints for matching traffic. Avoid sending all traffic to ct when a stateful ACL is present. === Reusing an existing 'allow' verb for stateless matching would have its drawbacks, specifically, when it comes to backwards incompatibility of the new behavior with existing environments. When using "allow" ACLs in mixed allow/allow-related environment, we still commit "allow" traffic to conntrack. This unnecessarily hits performance when mixed ACL action types were used for the same datapath. This is why we introduce a new action verb to describe stateless behavior. Another complexity to consider is the fact that with stateless matching, one would not be able to rely on 'related' magic that guarantees that reply traffic is passed through. Instead, the user would have to accurately define matching rules both for request and reply directions of a protocol session. Specifically, when allowing ICMP for a specific peer host, one has to define 'allow-stateless' rules that would match against ip.dst for request direction and ip.src for reply direction. Other protocols and scenarios will require their own fine grained matching approaches implemented by the user. === For performance measurements, qperf was used. Tests were executed on two setups: 1) ovn-fake-multinode virtual environment; 2) Supermicro SYS-5039MS-H8TRF 3-node RH-OSP 16.x cluster with 2 compute nodes, NIC: "Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)", using fake_nova_driver to avoid qemu overhead. 1) ovn-fake-multinode: Performance measured between two virtual nodes, two ports that belong to different LSs connected via router. Using qperf, performance was measured for UDP, TCP, SCTP protocols (using <proto>_lat and <proto>_bw tests). The qperf version used: 0.4.9-16.fc31.x86_64. Each test scenario was executed five times and averages compared. Tests were executed with `allow-stateless` rules for the tested protocol and `allow-related` for another protocol set for both ports, both directions, e.g. for TCP scenario, the following ACLs were defined: ovn-nbctl acl-add sw0 to-lport 100 tcp allow-stateless ovn-nbctl acl-add sw0 from-lport 100 tcp allow-stateless ovn-nbctl acl-add sw1 to-lport 100 tcp allow-stateless ovn-nbctl acl-add sw1 from-lport 100 tcp allow-stateless ovn-nbctl acl-add sw0 to-lport 100 sctp allow-related ovn-nbctl acl-add sw0 from-lport 100 sctp allow-related ovn-nbctl acl-add sw1 to-lport 100 sctp allow-related ovn-nbctl acl-add sw1 from-lport 100 sctp allow-related In this particular environment, improvement was seen in send_bw, latency, and msg_rate measurements, where applicable, for all three protocols under test. for UDP, send_bw: 293.6 MB/sec => 313.2 MB/sec (+6.68%) latency: 16 us => 14.08 us (-12%) msg_rate: 62.56 K/sec => 71.06 K/sec (+13.59%) for TCP, latency: 18.6 us => 14.88 us (-20%) msg_rate: 53.8 K/sec => 67.28 K/sec (+25.06%) for SCTP, latency: 21.98 us => 19.42 us (-11.65%) msg_rate: 45.58 K/sec => 51.54 K/sec (+13.08%) 2) Supermicro RH-OSP cluster: Two scenarios executed: - connectivity between two ports on the same switch - connectivity between two ports on different switches connected via router For the same switch, improvements are as follows: - TCP latency: -5.8%, UDP latency: -13.9%, SCTP latency: -10.8% - TCP bw: +0.8%, UDP bw, +9.5%, SCTP bw: +13.29% For different switches, improvements are as follows: - TCP latency: -6.9%, UDP: -11.9%, SCTP: -4.2% - TCP bw: -0.2%, UDP bw: +11.1%, SCTP bw: +12.9% The effect is somewhat more noticeable in same switch scenario. The effect is less pronounced for TCP than for UDP. === The patch takes inspiration from a now abandoned patch: "ovn-northd: Support mixing stateless/stateful ACLs with Stateless_Filter." by Dumitru Ceara. The original patch assumed CMS doesn't require full flexibility of matching rules for stateless matching (for example, to be used by OpenShift). But other CMS interfaces may require the same customizability for stateless as well as stateful matching, like in OpenStack Neutron API. Which is why this patch reuses existing ACL object type to describe stateless rules. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> --- v1: initial version. v2: rebased after conflict. v3: added ddlog implementation. v3: apply stateless skip-hint-set rules to appropriate direction only. v3: added more background as to implementation in commit message. v3: test stateless scenarios with ddlog too. v3: rebased after conflict. v4: introduce a separate allow-stateless ACL match verb. v5: rebased. v6: updated docs for new allow-stateless approach. v6: removed no longer valid comments. v6: removed acl_is_stateless. v7: bump north db schema version to 5.31.0 -> 5.32.0. v8: fixed checkpatch failure on a too long line in dbschema. v8: added perf data on baremetal lab testing. v9: fixed ovsdb checksum. v10: expanded documentation with more explicit distinction between acl types. v10: updated a somewhat obsolete comment. v10: updated commit message with better numbers for scenarios after re-running some tests. v11: updated test scenarios to reflect: "northd: Optimize ct nat for load balancer traffic". v11: rebased. --- NEWS | 2 + northd/ovn-northd.8.xml | 8 +- northd/ovn-northd.c | 71 +++++++++-- northd/ovn_northd.dl | 31 +++++ ovn-nb.ovsschema | 9 +- ovn-nb.xml | 15 ++- tests/ovn-northd.at | 269 ++++++++++++++++++++++++++++++++++++++++ utilities/ovn-nbctl.c | 6 +- 8 files changed, 397 insertions(+), 14 deletions(-)