Message ID | 20210422132502.2367928-1-numans@ovn.org |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,v4,1/2] northd: Optimize ct nat for load balancer traffic. | expand |
On 22/04/2021 14:25, numans@ovn.org wrote: > From: Numan Siddique <numans@ovn.org> > > Presently we add 65535 priority lflows in the stages - > 'ls_in_acl' and 'ls_out_acl' to drop packets which > match on 'ct.inv'. > > This patch provides an option to not use this field in the > logical flow matches for the following > reasons: > > • Some of the smart NICs which support offloading datapath > flows may not support this field. In such cases, almost > all the datapath flows (matching on the ct_state field +inv > or -inv cannot be offloaded if ACLs/load balancers > are configured on the logical switch datapath. > > • A recent commit in kernel ovs datapath sets the committed > connection tracking entry to be liberal for out-of-window > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > packets will not be marked as invalid. > > There are still certain scenarios where a packet can be marked > invalid. Like > • If the tcp flags are not correct. > > • If there are checksum errors. > > In such cases, the packet will be delivered to the destination > port. So CMS should evaluate their usecases before enabling > this option. > > Signed-off-by: Numan Siddique <numans@ovn.org> > Co-authored-by: Ben Pfaff <blp@ovn.org> > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- > v3 -> v4 > --- > * Moved the static variable 'use_ct_inv_match' declaration up in the > file. > > v2 -> v3 > ---- > * Updated the documentation in ovn-nb.xml as per Mark G's suggestion. > > v1 -> v2 > ---- > * Adopted the code changes shared by Ben Pfaff for the ddlog code. > > NEWS | 7 +- > northd/lswitch.dl | 7 + > northd/ovn-northd.c | 42 +++--- > northd/ovn_northd.dl | 304 ++++++++++++++++++++++--------------------- > ovn-nb.xml | 15 +++ > 5 files changed, 208 insertions(+), 167 deletions(-) > > diff --git a/NEWS b/NEWS > index 1ddde15f86..5f1365775d 100644 > --- a/NEWS > +++ b/NEWS > @@ -6,11 +6,16 @@ Post-v21.03.0 > expected to scale better than the C implementation, for large deployments. > (This may take testing and tuning to be effective.) This version of OVN > requires DDLog 0.36. > - - Introduce ovn-controller incremetal processing engine statistics > + - Introduce ovn-controller incremental processing engine statistics > - Introduced parallel processing in ovn-northd with the NB_Global config option > 'use_parallel_build' to enable it. It is disabled by default. > - Support vlan-passthru mode for tag=0 localnet ports. > - Support custom 802.11ad EthType for localnet ports. > + - Add a new NB Global option - use_ct_inv_match with the default value of true. > + If this is set to false, then the logical field - "ct.inv" will not be > + used in the logical flow matches. CMS can consider setting this to false, > + if they want to use smart NICs which don't support offloading datapath flows > + with this field used. > > OVN v21.03.0 - 12 Mar 2021 > ------------------------- > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > index 47c497e0cf..27ac3cadc5 100644 > --- a/northd/lswitch.dl > +++ b/northd/lswitch.dl > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: Map<string,string>, > relation SvcMonitorMac(mac: eth_addr) > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > nb::NB_Global(._uuid = uuid, .options = options). > + > +relation UseCtInvMatch[bool] > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > + nb::NB_Global(.options = options). > +UseCtInvMatch[true] :- > + Unit(), > + not nb in nb::NB_Global(). > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index de1aa896d3..5f6c0a6922 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > static struct eth_addr svc_monitor_mac_ea; > > +/* If this option is 'true' northd will make use of ct.inv match fields. > + * Otherwise, it will avoid using it. The default is true. */ > +static bool use_ct_inv_match = true; > + > /* Default probe interval for NB and SB DB connections. */ > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > static int northd_probe_interval_nb = 0; > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool shared, > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > } > > - > /* Adds a row with the specified contents to the Logical_Flow table. */ > static void > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * for deletion (bit 0 of ct_label is set). > * > * This is enforced at a higher priority than ACLs can be defined. */ > + char *match = xasprintf("%s(ct.est && ct.rpl && ct_label.blocked == 1)", > + use_ct_inv_match ? "ct.inv || " : ""); > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - "drop;"); > + match, "drop;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - "drop;"); > + match, "drop;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * direction to hit the currently defined policy from ACLs. > * > * This is enforced at a higher priority than ACLs can be defined. */ > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > + "ct.rpl && ct_label.blocked == 0", > + use_ct_inv_match ? " && !ct.inv" : ""); > + > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > - "next;"); > + match, "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > - "next;"); > + match, "next;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * a dynamically negotiated FTP data channel), but will allow > * related traffic such as an ICMP Port Unreachable through > * that's generated from a non-listening UDP port. */ > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > + "ct_label.blocked == 0", > + use_ct_inv_match ? " && !ct.inv" : ""); > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - "next;"); > + match, "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - "next;"); > + match, "next;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > > use_logical_dp_groups = smap_get_bool(&nb->options, > "use_logical_dp_groups", false); > + use_ct_inv_match = smap_get_bool(&nb->options, > + "use_ct_inv_match", true); > + > /* deprecated, use --event instead */ > controller_event_en = smap_get_bool(&nb->options, > "controller_event", false); > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > index 2f5d36c599..a0f7944103 100644 > --- a/northd/ovn_northd.dl > +++ b/northd/ovn_northd.dl > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, fair_meter, extra_match_, extra_action > } > > /* build_acls */ > -for (sw in &Switch(.ls = ls)) > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > -{ > - /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > - * default. A related rule at priority 1 is added below if there > - * are any stateful ACLs in this datapath. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 0, > - .__match = "1", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 0, > - .__match = "1", > - .actions = "next;", > - .external_ids = map_empty()); > - > - if (has_stateful) { > - /* Ingress and Egress ACL Table (Priority 1). > - * > - * By default, traffic is allowed. This is partially handled by > - * the Priority 0 ACL flows added earlier, but we also need to > - * commit IP flows. This is because, while the initiater's > - * direction may not have any stateful rules, the server's may > - * and then its return traffic would not have an associated > - * conntrack entry and would return "+invalid". > - * > - * We use "ct_commit" for a connection that is not already known > - * by the connection tracker. Once a connection is committed, > - * subsequent packets will hit the flow at priority 0 that just > - * uses "next;" > - * > - * We also check for established connections that have ct_label.blocked > - * set on them. That's a connection that was disallowed, but is > - * now allowed by policy again since it hit this default-allow flow. > - * We need to set ct_label.blocked=0 to let the connection continue, > - * which will be done by ct_commit() in the "stateful" stage. > - * Subsequent packets will hit the flow at priority 0 that just > - * uses "next;". */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 1, > - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 1, > - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > - .external_ids = map_empty()); > - > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Always drop traffic that's in an invalid state. Also drop > - * reply direction packets for connections that have been marked > - * for deletion (bit 0 of ct_label is set). > - * > - * This is enforced at a higher priority than ACLs can be defined. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - .actions = "drop;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - .actions = "drop;", > - .external_ids = map_empty()); > - > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Allow reply traffic that is part of an established > - * conntrack entry that has not been marked for deletion > - * (bit 0 of ct_label). We only match traffic in the > - * reply direction because we want traffic in the request > - * direction to hit the currently defined policy from ACLs. > - * > - * This is enforced at a higher priority than ACLs can be defined. */ > +for (UseCtInvMatch[use_ct_inv_match]) { > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { > + true -> ("ct.inv || ", "&& !ct.inv "), > + false -> ("", ""), > + } in > + for (sw in &Switch(.ls = ls)) > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > + { > + /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > + * default. A related rule at priority 1 is added below if there > + * are any stateful ACLs in this datapath. */ > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > + .priority = 0, > + .__match = "1", > .actions = "next;", > .external_ids = map_empty()); > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > + .priority = 0, > + .__match = "1", > .actions = "next;", > .external_ids = map_empty()); > > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Allow traffic that is related to an existing conntrack entry that > - * has not been marked for deletion (bit 0 of ct_label). > - * > - * This is enforced at a higher priority than ACLs can be defined. > - * > - * NOTE: This does not support related data sessions (eg, > - * a dynamically negotiated FTP data channel), but will allow > - * related traffic such as an ICMP Port Unreachable through > - * that's generated from a non-listening UDP port. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - .actions = "next;", > - .external_ids = map_empty()); > + if (has_stateful) { > + /* Ingress and Egress ACL Table (Priority 1). > + * > + * By default, traffic is allowed. This is partially handled by > + * the Priority 0 ACL flows added earlier, but we also need to > + * commit IP flows. This is because, while the initiater's > + * direction may not have any stateful rules, the server's may > + * and then its return traffic would not have an associated > + * conntrack entry and would return "+invalid". > + * > + * We use "ct_commit" for a connection that is not already known > + * by the connection tracker. Once a connection is committed, > + * subsequent packets will hit the flow at priority 0 that just > + * uses "next;" > + * > + * We also check for established connections that have ct_label.blocked > + * set on them. That's a connection that was disallowed, but is > + * now allowed by policy again since it hit this default-allow flow. > + * We need to set ct_label.blocked=0 to let the connection continue, > + * which will be done by ct_commit() in the "stateful" stage. > + * Subsequent packets will hit the flow at priority 0 that just > + * uses "next;". */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 1, > + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 1, > + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > + .external_ids = map_empty()); > > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Not to do conntrack on ND packets. */ > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Always drop traffic that's in an invalid state. Also drop > + * reply direction packets for connections that have been marked > + * for deletion (bit 0 of ct_label is set). > + * > + * This is enforced at a higher priority than ACLs can be defined. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", > + .actions = "drop;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", > + .actions = "drop;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Allow reply traffic that is part of an established > + * conntrack entry that has not been marked for deletion > + * (bit 0 of ct_label). We only match traffic in the > + * reply direction because we want traffic in the request > + * direction to hit the currently defined policy from ACLs. > + * > + * This is enforced at a higher priority than ACLs can be defined. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct.rpl && ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct.rpl && ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Allow traffic that is related to an existing conntrack entry that > + * has not been marked for deletion (bit 0 of ct_label). > + * > + * This is enforced at a higher priority than ACLs can be defined. > + * > + * NOTE: This does not support related data sessions (eg, > + * a dynamically negotiated FTP data channel), but will allow > + * related traffic such as an ICMP Port Unreachable through > + * that's generated from a non-listening UDP port. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Not to do conntrack on ND packets. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .actions = "next;", > + .external_ids = map_empty()) > + }; > + > + /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, > + * if the CMS has configured DNS records for the datapath. > + */ > + if (sw.has_dns_records) { > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 34000, > + .__match = "udp.src == 53", > + .actions = if has_stateful "ct_commit; next;" else "next;", > + .external_ids = map_empty()) > + }; > + > + /* Add a 34000 priority flow to advance the service monitor reply > + * packets to skip applying ingress ACLs. */ > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .priority = 34000, > + .__match = "eth.dst == $svc_monitor_mac", > .actions = "next;", > .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > - .actions = "next;", > - .external_ids = map_empty()) > - }; > - > - /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, > - * if the CMS has configured DNS records for the datapath. > - */ > - if (sw.has_dns_records) { > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_OUT_ACL(), > .priority = 34000, > - .__match = "udp.src == 53", > - .actions = if has_stateful "ct_commit; next;" else "next;", > + .__match = "eth.src == $svc_monitor_mac", > + .actions = "next;", > .external_ids = map_empty()) > - }; > - > - /* Add a 34000 priority flow to advance the service monitor reply > - * packets to skip applying ingress ACLs. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 34000, > - .__match = "eth.dst == $svc_monitor_mac", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 34000, > - .__match = "eth.src == $svc_monitor_mac", > - .actions = "next;", > - .external_ids = map_empty()) > + } > } > > /* This stage builds hints for the IN/OUT_ACL stage. Based on various > diff --git a/ovn-nb.xml b/ovn-nb.xml > index feb38b5d31..e337be4eb3 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -240,6 +240,21 @@ > </p> > </column> > > + <column name="options" key="use_ct_inv_match"> > + <p> > + If set to false, <code>ovn-northd</code> will not use the > + <code>ct.inv</code> field in any of the logical flow matches. > + The default value is true. If the NIC supports offloading > + OVS datapath flows but doesn't support offloading ct_state > + <code>inv</code> flag, then the datapath flows matching on this flag > + (either <code>+inv</code> or <code>-inv</code>) will not be > + offloaded. CMS should consider setting <code>use_ct_inv_match</code> > + to <code>false</code> in such cases. This results in a side effect > + of the invalid packets getting delivered to the destination VIF, which > + otherwise would have been dropped by <code>OVN</code>. > + </p> > + </column> > + > <group title="Options for configuring interconnection route advertisement"> > <p> > These options control how routes are advertised between OVN > Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: > > From: Numan Siddique <numans@ovn.org> > > Presently we add 65535 priority lflows in the stages - > 'ls_in_acl' and 'ls_out_acl' to drop packets which > match on 'ct.inv'. > > This patch provides an option to not use this field in the > logical flow matches for the following > reasons: > > • Some of the smart NICs which support offloading datapath > flows may not support this field. In such cases, almost > all the datapath flows (matching on the ct_state field +inv > or -inv cannot be offloaded if ACLs/load balancers > are configured on the logical switch datapath. > > • A recent commit in kernel ovs datapath sets the committed > connection tracking entry to be liberal for out-of-window > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > packets will not be marked as invalid. > > There are still certain scenarios where a packet can be marked > invalid. Like > • If the tcp flags are not correct. > > • If there are checksum errors. > > In such cases, the packet will be delivered to the destination > port. So CMS should evaluate their usecases before enabling > this option. > > Signed-off-by: Numan Siddique <numans@ovn.org> > Co-authored-by: Ben Pfaff <blp@ovn.org> > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- > v3 -> v4 > --- > * Moved the static variable 'use_ct_inv_match' declaration up in the > file. > > v2 -> v3 > ---- > * Updated the documentation in ovn-nb.xml as per Mark G's suggestion. > > v1 -> v2 > ---- > * Adopted the code changes shared by Ben Pfaff for the ddlog code. > > NEWS | 7 +- > northd/lswitch.dl | 7 + > northd/ovn-northd.c | 42 +++--- > northd/ovn_northd.dl | 304 ++++++++++++++++++++++--------------------- > ovn-nb.xml | 15 +++ > 5 files changed, 208 insertions(+), 167 deletions(-) > > diff --git a/NEWS b/NEWS > index 1ddde15f86..5f1365775d 100644 > --- a/NEWS > +++ b/NEWS > @@ -6,11 +6,16 @@ Post-v21.03.0 > expected to scale better than the C implementation, for large deployments. > (This may take testing and tuning to be effective.) This version of OVN > requires DDLog 0.36. > - - Introduce ovn-controller incremetal processing engine statistics > + - Introduce ovn-controller incremental processing engine statistics > - Introduced parallel processing in ovn-northd with the NB_Global config option > 'use_parallel_build' to enable it. It is disabled by default. > - Support vlan-passthru mode for tag=0 localnet ports. > - Support custom 802.11ad EthType for localnet ports. > + - Add a new NB Global option - use_ct_inv_match with the default value of true. > + If this is set to false, then the logical field - "ct.inv" will not be > + used in the logical flow matches. CMS can consider setting this to false, > + if they want to use smart NICs which don't support offloading datapath flows > + with this field used. > > OVN v21.03.0 - 12 Mar 2021 > ------------------------- > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > index 47c497e0cf..27ac3cadc5 100644 > --- a/northd/lswitch.dl > +++ b/northd/lswitch.dl > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: Map<string,string>, > relation SvcMonitorMac(mac: eth_addr) > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > nb::NB_Global(._uuid = uuid, .options = options). > + > +relation UseCtInvMatch[bool] > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > + nb::NB_Global(.options = options). > +UseCtInvMatch[true] :- > + Unit(), > + not nb in nb::NB_Global(). > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index de1aa896d3..5f6c0a6922 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > static struct eth_addr svc_monitor_mac_ea; > > +/* If this option is 'true' northd will make use of ct.inv match fields. > + * Otherwise, it will avoid using it. The default is true. */ > +static bool use_ct_inv_match = true; > + > /* Default probe interval for NB and SB DB connections. */ > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > static int northd_probe_interval_nb = 0; > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool shared, > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > } > > - > /* Adds a row with the specified contents to the Logical_Flow table. */ > static void > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * for deletion (bit 0 of ct_label is set). > * > * This is enforced at a higher priority than ACLs can be defined. */ > + char *match = xasprintf("%s(ct.est && ct.rpl && ct_label.blocked == 1)", > + use_ct_inv_match ? "ct.inv || " : ""); > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - "drop;"); > + match, "drop;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - "drop;"); > + match, "drop;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * direction to hit the currently defined policy from ACLs. > * > * This is enforced at a higher priority than ACLs can be defined. */ > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > + "ct.rpl && ct_label.blocked == 0", > + use_ct_inv_match ? " && !ct.inv" : ""); > + > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > - "next;"); > + match, "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > - "next;"); > + match, "next;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, > * a dynamically negotiated FTP data channel), but will allow > * related traffic such as an ICMP Port Unreachable through > * that's generated from a non-listening UDP port. */ > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > + "ct_label.blocked == 0", > + use_ct_inv_match ? " && !ct.inv" : ""); > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > - "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - "next;"); > + match, "next;"); > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > - "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - "next;"); > + match, "next;"); > + free(match); > > /* Ingress and Egress ACL Table (Priority 65535). > * > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > > use_logical_dp_groups = smap_get_bool(&nb->options, > "use_logical_dp_groups", false); > + use_ct_inv_match = smap_get_bool(&nb->options, > + "use_ct_inv_match", true); > + > /* deprecated, use --event instead */ > controller_event_en = smap_get_bool(&nb->options, > "controller_event", false); > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > index 2f5d36c599..a0f7944103 100644 > --- a/northd/ovn_northd.dl > +++ b/northd/ovn_northd.dl > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, fair_meter, extra_match_, extra_action > } > > /* build_acls */ > -for (sw in &Switch(.ls = ls)) > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > -{ > - /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > - * default. A related rule at priority 1 is added below if there > - * are any stateful ACLs in this datapath. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 0, > - .__match = "1", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 0, > - .__match = "1", > - .actions = "next;", > - .external_ids = map_empty()); > - > - if (has_stateful) { > - /* Ingress and Egress ACL Table (Priority 1). > - * > - * By default, traffic is allowed. This is partially handled by > - * the Priority 0 ACL flows added earlier, but we also need to > - * commit IP flows. This is because, while the initiater's > - * direction may not have any stateful rules, the server's may > - * and then its return traffic would not have an associated > - * conntrack entry and would return "+invalid". > - * > - * We use "ct_commit" for a connection that is not already known > - * by the connection tracker. Once a connection is committed, > - * subsequent packets will hit the flow at priority 0 that just > - * uses "next;" > - * > - * We also check for established connections that have ct_label.blocked > - * set on them. That's a connection that was disallowed, but is > - * now allowed by policy again since it hit this default-allow flow. > - * We need to set ct_label.blocked=0 to let the connection continue, > - * which will be done by ct_commit() in the "stateful" stage. > - * Subsequent packets will hit the flow at priority 0 that just > - * uses "next;". */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 1, > - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 1, > - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > - .external_ids = map_empty()); > - > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Always drop traffic that's in an invalid state. Also drop > - * reply direction packets for connections that have been marked > - * for deletion (bit 0 of ct_label is set). > - * > - * This is enforced at a higher priority than ACLs can be defined. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - .actions = "drop;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", > - .actions = "drop;", > - .external_ids = map_empty()); > - > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Allow reply traffic that is part of an established > - * conntrack entry that has not been marked for deletion > - * (bit 0 of ct_label). We only match traffic in the > - * reply direction because we want traffic in the request > - * direction to hit the currently defined policy from ACLs. > - * > - * This is enforced at a higher priority than ACLs can be defined. */ > +for (UseCtInvMatch[use_ct_inv_match]) { > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { > + true -> ("ct.inv || ", "&& !ct.inv "), > + false -> ("", ""), > + } in > + for (sw in &Switch(.ls = ls)) > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > + { > + /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > + * default. A related rule at priority 1 is added below if there > + * are any stateful ACLs in this datapath. */ > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > + .priority = 0, > + .__match = "1", > .actions = "next;", > .external_ids = map_empty()); > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " > - "&& ct.rpl && ct_label.blocked == 0", > + .priority = 0, > + .__match = "1", > .actions = "next;", > .external_ids = map_empty()); > > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Allow traffic that is related to an existing conntrack entry that > - * has not been marked for deletion (bit 0 of ct_label). > - * > - * This is enforced at a higher priority than ACLs can be defined. > - * > - * NOTE: This does not support related data sessions (eg, > - * a dynamically negotiated FTP data channel), but will allow > - * related traffic such as an ICMP Port Unreachable through > - * that's generated from a non-listening UDP port. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " > - "&& ct_label.blocked == 0", > - .actions = "next;", > - .external_ids = map_empty()); > + if (has_stateful) { > + /* Ingress and Egress ACL Table (Priority 1). > + * > + * By default, traffic is allowed. This is partially handled by > + * the Priority 0 ACL flows added earlier, but we also need to > + * commit IP flows. This is because, while the initiater's > + * direction may not have any stateful rules, the server's may > + * and then its return traffic would not have an associated > + * conntrack entry and would return "+invalid". > + * > + * We use "ct_commit" for a connection that is not already known > + * by the connection tracker. Once a connection is committed, > + * subsequent packets will hit the flow at priority 0 that just > + * uses "next;" > + * > + * We also check for established connections that have ct_label.blocked > + * set on them. That's a connection that was disallowed, but is > + * now allowed by policy again since it hit this default-allow flow. > + * We need to set ct_label.blocked=0 to let the connection continue, > + * which will be done by ct_commit() in the "stateful" stage. > + * Subsequent packets will hit the flow at priority 0 that just > + * uses "next;". */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 1, > + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 1, > + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", > + .external_ids = map_empty()); > > - /* Ingress and Egress ACL Table (Priority 65535). > - * > - * Not to do conntrack on ND packets. */ > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Always drop traffic that's in an invalid state. Also drop > + * reply direction packets for connections that have been marked > + * for deletion (bit 0 of ct_label is set). > + * > + * This is enforced at a higher priority than ACLs can be defined. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", > + .actions = "drop;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", > + .actions = "drop;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Allow reply traffic that is part of an established > + * conntrack entry that has not been marked for deletion > + * (bit 0 of ct_label). We only match traffic in the > + * reply direction because we want traffic in the request > + * direction to hit the currently defined policy from ACLs. > + * > + * This is enforced at a higher priority than ACLs can be defined. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct.rpl && ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct.rpl && ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Allow traffic that is related to an existing conntrack entry that > + * has not been marked for deletion (bit 0 of ct_label). > + * > + * This is enforced at a higher priority than ACLs can be defined. > + * > + * NOTE: This does not support related data sessions (eg, > + * a dynamically negotiated FTP data channel), but will allow > + * related traffic such as an ICMP Port Unreachable through > + * that's generated from a non-listening UDP port. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ > + "&& ct_label.blocked == 0", > + .actions = "next;", > + .external_ids = map_empty()); > + > + /* Ingress and Egress ACL Table (Priority 65535). > + * > + * Not to do conntrack on ND packets. */ > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_IN_ACL(), > + .priority = 65535, > + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .actions = "next;", > + .external_ids = map_empty()); > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 65535, > + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .actions = "next;", > + .external_ids = map_empty()) > + }; > + > + /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, > + * if the CMS has configured DNS records for the datapath. > + */ > + if (sw.has_dns_records) { > + Flow(.logical_datapath = ls._uuid, > + .stage = s_SWITCH_OUT_ACL(), > + .priority = 34000, > + .__match = "udp.src == 53", > + .actions = if has_stateful "ct_commit; next;" else "next;", > + .external_ids = map_empty()) > + }; > + > + /* Add a 34000 priority flow to advance the service monitor reply > + * packets to skip applying ingress ACLs. */ > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_IN_ACL(), > - .priority = 65535, > - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > + .priority = 34000, > + .__match = "eth.dst == $svc_monitor_mac", > .actions = "next;", > .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 65535, > - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", > - .actions = "next;", > - .external_ids = map_empty()) > - }; > - > - /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, > - * if the CMS has configured DNS records for the datapath. > - */ > - if (sw.has_dns_records) { > Flow(.logical_datapath = ls._uuid, > .stage = s_SWITCH_OUT_ACL(), > .priority = 34000, > - .__match = "udp.src == 53", > - .actions = if has_stateful "ct_commit; next;" else "next;", > + .__match = "eth.src == $svc_monitor_mac", > + .actions = "next;", > .external_ids = map_empty()) > - }; > - > - /* Add a 34000 priority flow to advance the service monitor reply > - * packets to skip applying ingress ACLs. */ > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_IN_ACL(), > - .priority = 34000, > - .__match = "eth.dst == $svc_monitor_mac", > - .actions = "next;", > - .external_ids = map_empty()); > - Flow(.logical_datapath = ls._uuid, > - .stage = s_SWITCH_OUT_ACL(), > - .priority = 34000, > - .__match = "eth.src == $svc_monitor_mac", > - .actions = "next;", > - .external_ids = map_empty()) > + } > } > > /* This stage builds hints for the IN/OUT_ACL stage. Based on various > diff --git a/ovn-nb.xml b/ovn-nb.xml > index feb38b5d31..e337be4eb3 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -240,6 +240,21 @@ > </p> > </column> > > + <column name="options" key="use_ct_inv_match"> > + <p> > + If set to false, <code>ovn-northd</code> will not use the > + <code>ct.inv</code> field in any of the logical flow matches. > + The default value is true. If the NIC supports offloading > + OVS datapath flows but doesn't support offloading ct_state > + <code>inv</code> flag, then the datapath flows matching on this flag > + (either <code>+inv</code> or <code>-inv</code>) will not be > + offloaded. CMS should consider setting <code>use_ct_inv_match</code> > + to <code>false</code> in such cases. This results in a side effect > + of the invalid packets getting delivered to the destination VIF, which > + otherwise would have been dropped by <code>OVN</code>. > + </p> > + </column> > + > <group title="Options for configuring interconnection route advertisement"> > <p> > These options control how routes are advertised between OVN > -- > 2.29.2 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev Acked-by: Han Zhou <hzhou@ovn.org>
On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: > > > > From: Numan Siddique <numans@ovn.org> > > > > Presently we add 65535 priority lflows in the stages - > > 'ls_in_acl' and 'ls_out_acl' to drop packets which > > match on 'ct.inv'. > > > > This patch provides an option to not use this field in the > > logical flow matches for the following > > reasons: > > > > • Some of the smart NICs which support offloading datapath > > flows may not support this field. In such cases, almost > > all the datapath flows (matching on the ct_state field +inv > > or -inv cannot be offloaded if ACLs/load balancers > > are configured on the logical switch datapath. > > > > • A recent commit in kernel ovs datapath sets the committed > > connection tracking entry to be liberal for out-of-window > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > > packets will not be marked as invalid. > > > > There are still certain scenarios where a packet can be marked > > invalid. Like > > • If the tcp flags are not correct. > > > > • If there are checksum errors. > > > > In such cases, the packet will be delivered to the destination > > port. So CMS should evaluate their usecases before enabling > > this option. > > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > Co-authored-by: Ben Pfaff <blp@ovn.org> > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > --- > > v3 -> v4 > > --- > > * Moved the static variable 'use_ct_inv_match' declaration up in the > > file. > > > > v2 -> v3 > > ---- > > * Updated the documentation in ovn-nb.xml as per Mark G's suggestion. > > > > v1 -> v2 > > ---- > > * Adopted the code changes shared by Ben Pfaff for the ddlog code. > > > > NEWS | 7 +- > > northd/lswitch.dl | 7 + > > northd/ovn-northd.c | 42 +++--- > > northd/ovn_northd.dl | 304 ++++++++++++++++++++++--------------------- > > ovn-nb.xml | 15 +++ > > 5 files changed, 208 insertions(+), 167 deletions(-) > > > > diff --git a/NEWS b/NEWS > > index 1ddde15f86..5f1365775d 100644 > > --- a/NEWS > > +++ b/NEWS > > @@ -6,11 +6,16 @@ Post-v21.03.0 > > expected to scale better than the C implementation, for large > deployments. > > (This may take testing and tuning to be effective.) This version of > OVN > > requires DDLog 0.36. > > - - Introduce ovn-controller incremetal processing engine statistics > > + - Introduce ovn-controller incremental processing engine statistics > > - Introduced parallel processing in ovn-northd with the NB_Global > config option > > 'use_parallel_build' to enable it. It is disabled by default. > > - Support vlan-passthru mode for tag=0 localnet ports. > > - Support custom 802.11ad EthType for localnet ports. > > + - Add a new NB Global option - use_ct_inv_match with the default value > of true. > > + If this is set to false, then the logical field - "ct.inv" will not > be > > + used in the logical flow matches. CMS can consider setting this to > false, > > + if they want to use smart NICs which don't support offloading > datapath flows > > + with this field used. > > > > OVN v21.03.0 - 12 Mar 2021 > > ------------------------- > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > > index 47c497e0cf..27ac3cadc5 100644 > > --- a/northd/lswitch.dl > > +++ b/northd/lswitch.dl > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: > Map<string,string>, > > relation SvcMonitorMac(mac: eth_addr) > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > > nb::NB_Global(._uuid = uuid, .options = options). > > + > > +relation UseCtInvMatch[bool] > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > > + nb::NB_Global(.options = options). > > +UseCtInvMatch[true] :- > > + Unit(), > > + not nb in nb::NB_Global(). > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > index de1aa896d3..5f6c0a6922 100644 > > --- a/northd/ovn-northd.c > > +++ b/northd/ovn-northd.c > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > > static struct eth_addr svc_monitor_mac_ea; > > > > +/* If this option is 'true' northd will make use of ct.inv match fields. > > + * Otherwise, it will avoid using it. The default is true. */ > > +static bool use_ct_inv_match = true; > > + > > /* Default probe interval for NB and SB DB connections. */ > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > > static int northd_probe_interval_nb = 0; > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool > shared, > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > > } > > > > - > > /* Adds a row with the specified contents to the Logical_Flow table. */ > > static void > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct hmap > *lflows, > > * for deletion (bit 0 of ct_label is set). > > * > > * This is enforced at a higher priority than ACLs can be > defined. */ > > + char *match = xasprintf("%s(ct.est && ct.rpl && ct_label.blocked > == 1)", > > + use_ct_inv_match ? "ct.inv || " : ""); > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked > == 1)", > > - "drop;"); > > + match, "drop;"); > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked > == 1)", > > - "drop;"); > > + match, "drop;"); > > + free(match); > > > > /* Ingress and Egress ACL Table (Priority 65535). > > * > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct hmap > *lflows, > > * direction to hit the currently defined policy from ACLs. > > * > > * This is enforced at a higher priority than ACLs can be > defined. */ > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > > + "ct.rpl && ct_label.blocked == 0", > > + use_ct_inv_match ? " && !ct.inv" : ""); > > + > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > - "&& ct.rpl && ct_label.blocked == 0", > > - "next;"); > > + match, "next;"); > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > - "&& ct.rpl && ct_label.blocked == 0", > > - "next;"); > > + match, "next;"); > > + free(match); > > > > /* Ingress and Egress ACL Table (Priority 65535). > > * > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct hmap > *lflows, > > * a dynamically negotiated FTP data channel), but will allow > > * related traffic such as an ICMP Port Unreachable through > > * that's generated from a non-listening UDP port. */ > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > > + "ct_label.blocked == 0", > > + use_ct_inv_match ? " && !ct.inv" : ""); > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > - "&& ct_label.blocked == 0", > > - "next;"); > > + match, "next;"); > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > - "&& ct_label.blocked == 0", > > - "next;"); > > + match, "next;"); > > + free(match); > > > > /* Ingress and Egress ACL Table (Priority 65535). > > * > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > > > > use_logical_dp_groups = smap_get_bool(&nb->options, > > "use_logical_dp_groups", > false); > > + use_ct_inv_match = smap_get_bool(&nb->options, > > + "use_ct_inv_match", true); > > + > > /* deprecated, use --event instead */ > > controller_event_en = smap_get_bool(&nb->options, > > "controller_event", false); > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > > index 2f5d36c599..a0f7944103 100644 > > --- a/northd/ovn_northd.dl > > +++ b/northd/ovn_northd.dl > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, > fair_meter, extra_match_, extra_action > > } > > > > /* build_acls */ > > -for (sw in &Switch(.ls = ls)) > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > -{ > > - /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > > - * default. A related rule at priority 1 is added below if there > > - * are any stateful ACLs in this datapath. */ > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_IN_ACL(), > > - .priority = 0, > > - .__match = "1", > > - .actions = "next;", > > - .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 0, > > - .__match = "1", > > - .actions = "next;", > > - .external_ids = map_empty()); > > - > > - if (has_stateful) { > > - /* Ingress and Egress ACL Table (Priority 1). > > - * > > - * By default, traffic is allowed. This is partially handled by > > - * the Priority 0 ACL flows added earlier, but we also need to > > - * commit IP flows. This is because, while the initiater's > > - * direction may not have any stateful rules, the server's may > > - * and then its return traffic would not have an associated > > - * conntrack entry and would return "+invalid". > > - * > > - * We use "ct_commit" for a connection that is not already known > > - * by the connection tracker. Once a connection is committed, > > - * subsequent packets will hit the flow at priority 0 that just > > - * uses "next;" > > - * > > - * We also check for established connections that have > ct_label.blocked > > - * set on them. That's a connection that was disallowed, but is > > - * now allowed by policy again since it hit this default-allow > flow. > > - * We need to set ct_label.blocked=0 to let the connection > continue, > > - * which will be done by ct_commit() in the "stateful" stage. > > - * Subsequent packets will hit the flow at priority 0 that just > > - * uses "next;". */ > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_IN_ACL(), > > - .priority = 1, > > - .__match = "ip && (!ct.est || (ct.est && > ct_label.blocked == 1))", > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > next;", > > - .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 1, > > - .__match = "ip && (!ct.est || (ct.est && > ct_label.blocked == 1))", > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > next;", > > - .external_ids = map_empty()); > > - > > - /* Ingress and Egress ACL Table (Priority 65535). > > - * > > - * Always drop traffic that's in an invalid state. Also drop > > - * reply direction packets for connections that have been marked > > - * for deletion (bit 0 of ct_label is set). > > - * > > - * This is enforced at a higher priority than ACLs can be > defined. */ > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_IN_ACL(), > > - .priority = 65535, > > - .__match = "ct.inv || (ct.est && ct.rpl && > ct_label.blocked == 1)", > > - .actions = "drop;", > > - .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 65535, > > - .__match = "ct.inv || (ct.est && ct.rpl && > ct_label.blocked == 1)", > > - .actions = "drop;", > > - .external_ids = map_empty()); > > - > > - /* Ingress and Egress ACL Table (Priority 65535). > > - * > > - * Allow reply traffic that is part of an established > > - * conntrack entry that has not been marked for deletion > > - * (bit 0 of ct_label). We only match traffic in the > > - * reply direction because we want traffic in the request > > - * direction to hit the currently defined policy from ACLs. > > - * > > - * This is enforced at a higher priority than ACLs can be > defined. */ > > +for (UseCtInvMatch[use_ct_inv_match]) { > > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { > > + true -> ("ct.inv || ", "&& !ct.inv "), > > + false -> ("", ""), > > + } in > > + for (sw in &Switch(.ls = ls)) > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > + { > > + /* Ingress and Egress ACL Table (Priority 0): Packets are > allowed by > > + * default. A related rule at priority 1 is added below if there > > + * are any stateful ACLs in this datapath. */ > > Flow(.logical_datapath = ls._uuid, > > .stage = s_SWITCH_IN_ACL(), > > - .priority = 65535, > > - .__match = "ct.est && !ct.rel && !ct.new && > !ct.inv " > > - "&& ct.rpl && ct_label.blocked == 0", > > + .priority = 0, > > + .__match = "1", > > .actions = "next;", > > .external_ids = map_empty()); > > Flow(.logical_datapath = ls._uuid, > > .stage = s_SWITCH_OUT_ACL(), > > - .priority = 65535, > > - .__match = "ct.est && !ct.rel && !ct.new && > !ct.inv " > > - "&& ct.rpl && ct_label.blocked == 0", > > + .priority = 0, > > + .__match = "1", > > .actions = "next;", > > .external_ids = map_empty()); > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > - * > > - * Allow traffic that is related to an existing conntrack entry > that > > - * has not been marked for deletion (bit 0 of ct_label). > > - * > > - * This is enforced at a higher priority than ACLs can be > defined. > > - * > > - * NOTE: This does not support related data sessions (eg, > > - * a dynamically negotiated FTP data channel), but will allow > > - * related traffic such as an ICMP Port Unreachable through > > - * that's generated from a non-listening UDP port. */ > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_IN_ACL(), > > - .priority = 65535, > > - .__match = "!ct.est && ct.rel && !ct.new && > !ct.inv " > > - "&& ct_label.blocked == 0", > > - .actions = "next;", > > - .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 65535, > > - .__match = "!ct.est && ct.rel && !ct.new && > !ct.inv " > > - "&& ct_label.blocked == 0", > > - .actions = "next;", > > - .external_ids = map_empty()); > > + if (has_stateful) { > > + /* Ingress and Egress ACL Table (Priority 1). > > + * > > + * By default, traffic is allowed. This is partially > handled by > > + * the Priority 0 ACL flows added earlier, but we also need > to > > + * commit IP flows. This is because, while the initiater's > > + * direction may not have any stateful rules, the server's > may > > + * and then its return traffic would not have an associated > > + * conntrack entry and would return "+invalid". > > + * > > + * We use "ct_commit" for a connection that is not already > known > > + * by the connection tracker. Once a connection is > committed, > > + * subsequent packets will hit the flow at priority 0 that > just > > + * uses "next;" > > + * > > + * We also check for established connections that have > ct_label.blocked > > + * set on them. That's a connection that was disallowed, > but is > > + * now allowed by policy again since it hit this > default-allow flow. > > + * We need to set ct_label.blocked=0 to let the connection > continue, > > + * which will be done by ct_commit() in the "stateful" stage. > > + * Subsequent packets will hit the flow at priority 0 that > just > > + * uses "next;". */ > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_IN_ACL(), > > + .priority = 1, > > + .__match = "ip && (!ct.est || (ct.est && > ct_label.blocked == 1))", > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > next;", > > + .external_ids = map_empty()); > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 1, > > + .__match = "ip && (!ct.est || (ct.est && > ct_label.blocked == 1))", > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > next;", > > + .external_ids = map_empty()); > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > - * > > - * Not to do conntrack on ND packets. */ > > + /* Ingress and Egress ACL Table (Priority 65535). > > + * > > + * Always drop traffic that's in an invalid state. Also drop > > + * reply direction packets for connections that have been > marked > > + * for deletion (bit 0 of ct_label is set). > > + * > > + * This is enforced at a higher priority than ACLs can be > defined. */ > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_IN_ACL(), > > + .priority = 65535, > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && > ct_label.blocked == 1)", > > + .actions = "drop;", > > + .external_ids = map_empty()); > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 65535, > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && > ct_label.blocked == 1)", > > + .actions = "drop;", > > + .external_ids = map_empty()); > > + > > + /* Ingress and Egress ACL Table (Priority 65535). > > + * > > + * Allow reply traffic that is part of an established > > + * conntrack entry that has not been marked for deletion > > + * (bit 0 of ct_label). We only match traffic in the > > + * reply direction because we want traffic in the request > > + * direction to hit the currently defined policy from ACLs. > > + * > > + * This is enforced at a higher priority than ACLs can be > defined. */ > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_IN_ACL(), > > + .priority = 65535, > > + .__match = "ct.est && !ct.rel && !ct.new " ++ > and_not_ct_inv ++ > > + "&& ct.rpl && ct_label.blocked == > 0", > > + .actions = "next;", > > + .external_ids = map_empty()); > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 65535, > > + .__match = "ct.est && !ct.rel && !ct.new " ++ > and_not_ct_inv ++ > > + "&& ct.rpl && ct_label.blocked == > 0", > > + .actions = "next;", > > + .external_ids = map_empty()); > > + > > + /* Ingress and Egress ACL Table (Priority 65535). > > + * > > + * Allow traffic that is related to an existing conntrack > entry that > > + * has not been marked for deletion (bit 0 of ct_label). > > + * > > + * This is enforced at a higher priority than ACLs can be > defined. > > + * > > + * NOTE: This does not support related data sessions (eg, > > + * a dynamically negotiated FTP data channel), but will allow > > + * related traffic such as an ICMP Port Unreachable through > > + * that's generated from a non-listening UDP port. */ > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_IN_ACL(), > > + .priority = 65535, > > + .__match = "!ct.est && ct.rel && !ct.new " ++ > and_not_ct_inv ++ > > + "&& ct_label.blocked == 0", > > + .actions = "next;", > > + .external_ids = map_empty()); > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 65535, > > + .__match = "!ct.est && ct.rel && !ct.new " ++ > and_not_ct_inv ++ > > + "&& ct_label.blocked == 0", > > + .actions = "next;", > > + .external_ids = map_empty()); > > + > > + /* Ingress and Egress ACL Table (Priority 65535). > > + * > > + * Not to do conntrack on ND packets. */ > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_IN_ACL(), > > + .priority = 65535, > > + .__match = "nd || nd_ra || nd_rs || mldv1 || > mldv2", > > + .actions = "next;", > > + .external_ids = map_empty()); > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 65535, > > + .__match = "nd || nd_ra || nd_rs || mldv1 || > mldv2", > > + .actions = "next;", > > + .external_ids = map_empty()) > > + }; > > + > > + /* Add a 34000 priority flow to advance the DNS reply from > ovn-controller, > > + * if the CMS has configured DNS records for the datapath. > > + */ > > + if (sw.has_dns_records) { > > + Flow(.logical_datapath = ls._uuid, > > + .stage = s_SWITCH_OUT_ACL(), > > + .priority = 34000, > > + .__match = "udp.src == 53", > > + .actions = if has_stateful "ct_commit; next;" > else "next;", > > + .external_ids = map_empty()) > > + }; > > + > > + /* Add a 34000 priority flow to advance the service monitor reply > > + * packets to skip applying ingress ACLs. */ > > Flow(.logical_datapath = ls._uuid, > > .stage = s_SWITCH_IN_ACL(), > > - .priority = 65535, > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > mldv2", > > + .priority = 34000, > > + .__match = "eth.dst == $svc_monitor_mac", > > .actions = "next;", > > .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 65535, > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > mldv2", > > - .actions = "next;", > > - .external_ids = map_empty()) > > - }; > > - > > - /* Add a 34000 priority flow to advance the DNS reply from > ovn-controller, > > - * if the CMS has configured DNS records for the datapath. > > - */ > > - if (sw.has_dns_records) { > > Flow(.logical_datapath = ls._uuid, > > .stage = s_SWITCH_OUT_ACL(), > > .priority = 34000, > > - .__match = "udp.src == 53", > > - .actions = if has_stateful "ct_commit; next;" else > "next;", > > + .__match = "eth.src == $svc_monitor_mac", > > + .actions = "next;", > > .external_ids = map_empty()) > > - }; > > - > > - /* Add a 34000 priority flow to advance the service monitor reply > > - * packets to skip applying ingress ACLs. */ > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_IN_ACL(), > > - .priority = 34000, > > - .__match = "eth.dst == $svc_monitor_mac", > > - .actions = "next;", > > - .external_ids = map_empty()); > > - Flow(.logical_datapath = ls._uuid, > > - .stage = s_SWITCH_OUT_ACL(), > > - .priority = 34000, > > - .__match = "eth.src == $svc_monitor_mac", > > - .actions = "next;", > > - .external_ids = map_empty()) > > + } > > } > > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on various > > diff --git a/ovn-nb.xml b/ovn-nb.xml > > index feb38b5d31..e337be4eb3 100644 > > --- a/ovn-nb.xml > > +++ b/ovn-nb.xml > > @@ -240,6 +240,21 @@ > > </p> > > </column> > > > > + <column name="options" key="use_ct_inv_match"> > > + <p> > > + If set to false, <code>ovn-northd</code> will not use the > > + <code>ct.inv</code> field in any of the logical flow matches. > > + The default value is true. If the NIC supports offloading > > + OVS datapath flows but doesn't support offloading ct_state > > + <code>inv</code> flag, then the datapath flows matching on > this flag > > + (either <code>+inv</code> or <code>-inv</code>) will not be > > + offloaded. CMS should consider setting > <code>use_ct_inv_match</code> > > + to <code>false</code> in such cases. This results in a side > effect > > + of the invalid packets getting delivered to the destination > VIF, which > > + otherwise would have been dropped by <code>OVN</code>. > > + </p> > > + </column> > > + > > <group title="Options for configuring interconnection route > advertisement"> > > <p> > > These options control how routes are advertised between OVN > > -- > > 2.29.2 > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Acked-by: Han Zhou <hzhou@ovn.org> Thanks for the reviews Mark G and Han. I applied both the patches to the main branch addressing Han's comment in p1. When I submitted this patch, I had a test case which somehow got removed in the later versions. I added the test case before applying. Regards Numan > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Thu, May 6, 2021 at 11:41 AM Numan Siddique <numans@ovn.org> wrote: > > On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: > > > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: > > > > > > From: Numan Siddique <numans@ovn.org> > > > > > > Presently we add 65535 priority lflows in the stages - > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which > > > match on 'ct.inv'. > > > > > > This patch provides an option to not use this field in the > > > logical flow matches for the following > > > reasons: > > > > > > • Some of the smart NICs which support offloading datapath > > > flows may not support this field. In such cases, almost > > > all the datapath flows (matching on the ct_state field +inv > > > or -inv cannot be offloaded if ACLs/load balancers > > > are configured on the logical switch datapath. > > > > > > • A recent commit in kernel ovs datapath sets the committed > > > connection tracking entry to be liberal for out-of-window > > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > > > packets will not be marked as invalid. > > > > > > There are still certain scenarios where a packet can be marked > > > invalid. Like > > > • If the tcp flags are not correct. > > > > > > • If there are checksum errors. > > > > > > In such cases, the packet will be delivered to the destination > > > port. So CMS should evaluate their usecases before enabling > > > this option. > > > > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > > Co-authored-by: Ben Pfaff <blp@ovn.org> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > --- > > > v3 -> v4 > > > --- > > > * Moved the static variable 'use_ct_inv_match' declaration up in the > > > file. > > > > > > v2 -> v3 > > > ---- > > > * Updated the documentation in ovn-nb.xml as per Mark G's suggestion. > > > > > > v1 -> v2 > > > ---- > > > * Adopted the code changes shared by Ben Pfaff for the ddlog code. > > > > > > NEWS | 7 +- > > > northd/lswitch.dl | 7 + > > > northd/ovn-northd.c | 42 +++--- > > > northd/ovn_northd.dl | 304 ++++++++++++++++++++++--------------------- > > > ovn-nb.xml | 15 +++ > > > 5 files changed, 208 insertions(+), 167 deletions(-) > > > > > > diff --git a/NEWS b/NEWS > > > index 1ddde15f86..5f1365775d 100644 > > > --- a/NEWS > > > +++ b/NEWS > > > @@ -6,11 +6,16 @@ Post-v21.03.0 > > > expected to scale better than the C implementation, for large > > deployments. > > > (This may take testing and tuning to be effective.) This version of > > OVN > > > requires DDLog 0.36. > > > - - Introduce ovn-controller incremetal processing engine statistics > > > + - Introduce ovn-controller incremental processing engine statistics > > > - Introduced parallel processing in ovn-northd with the NB_Global > > config option > > > 'use_parallel_build' to enable it. It is disabled by default. > > > - Support vlan-passthru mode for tag=0 localnet ports. > > > - Support custom 802.11ad EthType for localnet ports. > > > + - Add a new NB Global option - use_ct_inv_match with the default value > > of true. > > > + If this is set to false, then the logical field - "ct.inv" will not > > be > > > + used in the logical flow matches. CMS can consider setting this to > > false, > > > + if they want to use smart NICs which don't support offloading > > datapath flows > > > + with this field used. > > > > > > OVN v21.03.0 - 12 Mar 2021 > > > ------------------------- > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > > > index 47c497e0cf..27ac3cadc5 100644 > > > --- a/northd/lswitch.dl > > > +++ b/northd/lswitch.dl > > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: > > Map<string,string>, > > > relation SvcMonitorMac(mac: eth_addr) > > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > > > nb::NB_Global(._uuid = uuid, .options = options). > > > + > > > +relation UseCtInvMatch[bool] > > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > > > + nb::NB_Global(.options = options). > > > +UseCtInvMatch[true] :- > > > + Unit(), > > > + not nb in nb::NB_Global(). > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > index de1aa896d3..5f6c0a6922 100644 > > > --- a/northd/ovn-northd.c > > > +++ b/northd/ovn-northd.c > > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > > > static struct eth_addr svc_monitor_mac_ea; > > > > > > +/* If this option is 'true' northd will make use of ct.inv match fields. > > > + * Otherwise, it will avoid using it. The default is true. */ > > > +static bool use_ct_inv_match = true; > > > + > > > /* Default probe interval for NB and SB DB connections. */ > > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > > > static int northd_probe_interval_nb = 0; > > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool > > shared, > > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > > > } > > > > > > - > > > /* Adds a row with the specified contents to the Logical_Flow table. */ > > > static void > > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct hmap > > *lflows, > > > * for deletion (bit 0 of ct_label is set). > > > * > > > * This is enforced at a higher priority than ACLs can be > > defined. */ > > > + char *match = xasprintf("%s(ct.est && ct.rpl && ct_label.blocked > > == 1)", > > > + use_ct_inv_match ? "ct.inv || " : ""); > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked > > == 1)", > > > - "drop;"); > > > + match, "drop;"); > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > - "ct.inv || (ct.est && ct.rpl && ct_label.blocked > > == 1)", > > > - "drop;"); > > > + match, "drop;"); > > > + free(match); > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > * > > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct hmap > > *lflows, > > > * direction to hit the currently defined policy from ACLs. > > > * > > > * This is enforced at a higher priority than ACLs can be > > defined. */ > > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > > > + "ct.rpl && ct_label.blocked == 0", > > > + use_ct_inv_match ? " && !ct.inv" : ""); > > > + > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > > - "&& ct.rpl && ct_label.blocked == 0", > > > - "next;"); > > > + match, "next;"); > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > > - "&& ct.rpl && ct_label.blocked == 0", > > > - "next;"); > > > + match, "next;"); > > > + free(match); > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > * > > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct hmap > > *lflows, > > > * a dynamically negotiated FTP data channel), but will allow > > > * related traffic such as an ICMP Port Unreachable through > > > * that's generated from a non-listening UDP port. */ > > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > > > + "ct_label.blocked == 0", > > > + use_ct_inv_match ? " && !ct.inv" : ""); > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > > - "&& ct_label.blocked == 0", > > > - "next;"); > > > + match, "next;"); > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > > - "&& ct_label.blocked == 0", > > > - "next;"); > > > + match, "next;"); > > > + free(match); > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > * > > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > > > > > > use_logical_dp_groups = smap_get_bool(&nb->options, > > > "use_logical_dp_groups", > > false); > > > + use_ct_inv_match = smap_get_bool(&nb->options, > > > + "use_ct_inv_match", true); > > > + > > > /* deprecated, use --event instead */ > > > controller_event_en = smap_get_bool(&nb->options, > > > "controller_event", false); > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > > > index 2f5d36c599..a0f7944103 100644 > > > --- a/northd/ovn_northd.dl > > > +++ b/northd/ovn_northd.dl > > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, > > fair_meter, extra_match_, extra_action > > > } > > > > > > /* build_acls */ > > > -for (sw in &Switch(.ls = ls)) > > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > > -{ > > > - /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by > > > - * default. A related rule at priority 1 is added below if there > > > - * are any stateful ACLs in this datapath. */ > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_IN_ACL(), > > > - .priority = 0, > > > - .__match = "1", > > > - .actions = "next;", > > > - .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 0, > > > - .__match = "1", > > > - .actions = "next;", > > > - .external_ids = map_empty()); > > > - > > > - if (has_stateful) { > > > - /* Ingress and Egress ACL Table (Priority 1). > > > - * > > > - * By default, traffic is allowed. This is partially handled by > > > - * the Priority 0 ACL flows added earlier, but we also need to > > > - * commit IP flows. This is because, while the initiater's > > > - * direction may not have any stateful rules, the server's may > > > - * and then its return traffic would not have an associated > > > - * conntrack entry and would return "+invalid". > > > - * > > > - * We use "ct_commit" for a connection that is not already known > > > - * by the connection tracker. Once a connection is committed, > > > - * subsequent packets will hit the flow at priority 0 that just > > > - * uses "next;" > > > - * > > > - * We also check for established connections that have > > ct_label.blocked > > > - * set on them. That's a connection that was disallowed, but is > > > - * now allowed by policy again since it hit this default-allow > > flow. > > > - * We need to set ct_label.blocked=0 to let the connection > > continue, > > > - * which will be done by ct_commit() in the "stateful" stage. > > > - * Subsequent packets will hit the flow at priority 0 that just > > > - * uses "next;". */ > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_IN_ACL(), > > > - .priority = 1, > > > - .__match = "ip && (!ct.est || (ct.est && > > ct_label.blocked == 1))", > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > next;", > > > - .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 1, > > > - .__match = "ip && (!ct.est || (ct.est && > > ct_label.blocked == 1))", > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > next;", > > > - .external_ids = map_empty()); > > > - > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > - * > > > - * Always drop traffic that's in an invalid state. Also drop > > > - * reply direction packets for connections that have been marked > > > - * for deletion (bit 0 of ct_label is set). > > > - * > > > - * This is enforced at a higher priority than ACLs can be > > defined. */ > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_IN_ACL(), > > > - .priority = 65535, > > > - .__match = "ct.inv || (ct.est && ct.rpl && > > ct_label.blocked == 1)", > > > - .actions = "drop;", > > > - .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 65535, > > > - .__match = "ct.inv || (ct.est && ct.rpl && > > ct_label.blocked == 1)", > > > - .actions = "drop;", > > > - .external_ids = map_empty()); > > > - > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > - * > > > - * Allow reply traffic that is part of an established > > > - * conntrack entry that has not been marked for deletion > > > - * (bit 0 of ct_label). We only match traffic in the > > > - * reply direction because we want traffic in the request > > > - * direction to hit the currently defined policy from ACLs. > > > - * > > > - * This is enforced at a higher priority than ACLs can be > > defined. */ > > > +for (UseCtInvMatch[use_ct_inv_match]) { > > > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { > > > + true -> ("ct.inv || ", "&& !ct.inv "), > > > + false -> ("", ""), > > > + } in > > > + for (sw in &Switch(.ls = ls)) > > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > > + { > > > + /* Ingress and Egress ACL Table (Priority 0): Packets are > > allowed by > > > + * default. A related rule at priority 1 is added below if there > > > + * are any stateful ACLs in this datapath. */ > > > Flow(.logical_datapath = ls._uuid, > > > .stage = s_SWITCH_IN_ACL(), > > > - .priority = 65535, > > > - .__match = "ct.est && !ct.rel && !ct.new && > > !ct.inv " > > > - "&& ct.rpl && ct_label.blocked == 0", > > > + .priority = 0, > > > + .__match = "1", > > > .actions = "next;", > > > .external_ids = map_empty()); > > > Flow(.logical_datapath = ls._uuid, > > > .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 65535, > > > - .__match = "ct.est && !ct.rel && !ct.new && > > !ct.inv " > > > - "&& ct.rpl && ct_label.blocked == 0", > > > + .priority = 0, > > > + .__match = "1", > > > .actions = "next;", > > > .external_ids = map_empty()); > > > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > - * > > > - * Allow traffic that is related to an existing conntrack entry > > that > > > - * has not been marked for deletion (bit 0 of ct_label). > > > - * > > > - * This is enforced at a higher priority than ACLs can be > > defined. > > > - * > > > - * NOTE: This does not support related data sessions (eg, > > > - * a dynamically negotiated FTP data channel), but will allow > > > - * related traffic such as an ICMP Port Unreachable through > > > - * that's generated from a non-listening UDP port. */ > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_IN_ACL(), > > > - .priority = 65535, > > > - .__match = "!ct.est && ct.rel && !ct.new && > > !ct.inv " > > > - "&& ct_label.blocked == 0", > > > - .actions = "next;", > > > - .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 65535, > > > - .__match = "!ct.est && ct.rel && !ct.new && > > !ct.inv " > > > - "&& ct_label.blocked == 0", > > > - .actions = "next;", > > > - .external_ids = map_empty()); > > > + if (has_stateful) { > > > + /* Ingress and Egress ACL Table (Priority 1). > > > + * > > > + * By default, traffic is allowed. This is partially > > handled by > > > + * the Priority 0 ACL flows added earlier, but we also need > > to > > > + * commit IP flows. This is because, while the initiater's > > > + * direction may not have any stateful rules, the server's > > may > > > + * and then its return traffic would not have an associated > > > + * conntrack entry and would return "+invalid". > > > + * > > > + * We use "ct_commit" for a connection that is not already > > known > > > + * by the connection tracker. Once a connection is > > committed, > > > + * subsequent packets will hit the flow at priority 0 that > > just > > > + * uses "next;" > > > + * > > > + * We also check for established connections that have > > ct_label.blocked > > > + * set on them. That's a connection that was disallowed, > > but is > > > + * now allowed by policy again since it hit this > > default-allow flow. > > > + * We need to set ct_label.blocked=0 to let the connection > > continue, > > > + * which will be done by ct_commit() in the "stateful" stage. > > > + * Subsequent packets will hit the flow at priority 0 that > > just > > > + * uses "next;". */ > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_IN_ACL(), > > > + .priority = 1, > > > + .__match = "ip && (!ct.est || (ct.est && > > ct_label.blocked == 1))", > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > next;", > > > + .external_ids = map_empty()); > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 1, > > > + .__match = "ip && (!ct.est || (ct.est && > > ct_label.blocked == 1))", > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > next;", > > > + .external_ids = map_empty()); > > > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > - * > > > - * Not to do conntrack on ND packets. */ > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > + * > > > + * Always drop traffic that's in an invalid state. Also drop > > > + * reply direction packets for connections that have been > > marked > > > + * for deletion (bit 0 of ct_label is set). > > > + * > > > + * This is enforced at a higher priority than ACLs can be > > defined. */ > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_IN_ACL(), > > > + .priority = 65535, > > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && > > ct_label.blocked == 1)", > > > + .actions = "drop;", > > > + .external_ids = map_empty()); > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 65535, > > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl && > > ct_label.blocked == 1)", > > > + .actions = "drop;", > > > + .external_ids = map_empty()); > > > + > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > + * > > > + * Allow reply traffic that is part of an established > > > + * conntrack entry that has not been marked for deletion > > > + * (bit 0 of ct_label). We only match traffic in the > > > + * reply direction because we want traffic in the request > > > + * direction to hit the currently defined policy from ACLs. > > > + * > > > + * This is enforced at a higher priority than ACLs can be > > defined. */ > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_IN_ACL(), > > > + .priority = 65535, > > > + .__match = "ct.est && !ct.rel && !ct.new " ++ > > and_not_ct_inv ++ > > > + "&& ct.rpl && ct_label.blocked == > > 0", > > > + .actions = "next;", > > > + .external_ids = map_empty()); > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 65535, > > > + .__match = "ct.est && !ct.rel && !ct.new " ++ > > and_not_ct_inv ++ > > > + "&& ct.rpl && ct_label.blocked == > > 0", > > > + .actions = "next;", > > > + .external_ids = map_empty()); > > > + > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > + * > > > + * Allow traffic that is related to an existing conntrack > > entry that > > > + * has not been marked for deletion (bit 0 of ct_label). > > > + * > > > + * This is enforced at a higher priority than ACLs can be > > defined. > > > + * > > > + * NOTE: This does not support related data sessions (eg, > > > + * a dynamically negotiated FTP data channel), but will allow > > > + * related traffic such as an ICMP Port Unreachable through > > > + * that's generated from a non-listening UDP port. */ > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_IN_ACL(), > > > + .priority = 65535, > > > + .__match = "!ct.est && ct.rel && !ct.new " ++ > > and_not_ct_inv ++ > > > + "&& ct_label.blocked == 0", > > > + .actions = "next;", > > > + .external_ids = map_empty()); > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 65535, > > > + .__match = "!ct.est && ct.rel && !ct.new " ++ > > and_not_ct_inv ++ > > > + "&& ct_label.blocked == 0", > > > + .actions = "next;", > > > + .external_ids = map_empty()); > > > + > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > + * > > > + * Not to do conntrack on ND packets. */ > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_IN_ACL(), > > > + .priority = 65535, > > > + .__match = "nd || nd_ra || nd_rs || mldv1 || > > mldv2", > > > + .actions = "next;", > > > + .external_ids = map_empty()); > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 65535, > > > + .__match = "nd || nd_ra || nd_rs || mldv1 || > > mldv2", > > > + .actions = "next;", > > > + .external_ids = map_empty()) > > > + }; > > > + > > > + /* Add a 34000 priority flow to advance the DNS reply from > > ovn-controller, > > > + * if the CMS has configured DNS records for the datapath. > > > + */ > > > + if (sw.has_dns_records) { > > > + Flow(.logical_datapath = ls._uuid, > > > + .stage = s_SWITCH_OUT_ACL(), > > > + .priority = 34000, > > > + .__match = "udp.src == 53", > > > + .actions = if has_stateful "ct_commit; next;" > > else "next;", > > > + .external_ids = map_empty()) > > > + }; > > > + > > > + /* Add a 34000 priority flow to advance the service monitor reply > > > + * packets to skip applying ingress ACLs. */ > > > Flow(.logical_datapath = ls._uuid, > > > .stage = s_SWITCH_IN_ACL(), > > > - .priority = 65535, > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > > mldv2", > > > + .priority = 34000, > > > + .__match = "eth.dst == $svc_monitor_mac", > > > .actions = "next;", > > > .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 65535, > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > > mldv2", > > > - .actions = "next;", > > > - .external_ids = map_empty()) > > > - }; > > > - > > > - /* Add a 34000 priority flow to advance the DNS reply from > > ovn-controller, > > > - * if the CMS has configured DNS records for the datapath. > > > - */ > > > - if (sw.has_dns_records) { > > > Flow(.logical_datapath = ls._uuid, > > > .stage = s_SWITCH_OUT_ACL(), > > > .priority = 34000, > > > - .__match = "udp.src == 53", > > > - .actions = if has_stateful "ct_commit; next;" else > > "next;", > > > + .__match = "eth.src == $svc_monitor_mac", > > > + .actions = "next;", > > > .external_ids = map_empty()) > > > - }; > > > - > > > - /* Add a 34000 priority flow to advance the service monitor reply > > > - * packets to skip applying ingress ACLs. */ > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_IN_ACL(), > > > - .priority = 34000, > > > - .__match = "eth.dst == $svc_monitor_mac", > > > - .actions = "next;", > > > - .external_ids = map_empty()); > > > - Flow(.logical_datapath = ls._uuid, > > > - .stage = s_SWITCH_OUT_ACL(), > > > - .priority = 34000, > > > - .__match = "eth.src == $svc_monitor_mac", > > > - .actions = "next;", > > > - .external_ids = map_empty()) > > > + } > > > } > > > > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on various > > > diff --git a/ovn-nb.xml b/ovn-nb.xml > > > index feb38b5d31..e337be4eb3 100644 > > > --- a/ovn-nb.xml > > > +++ b/ovn-nb.xml > > > @@ -240,6 +240,21 @@ > > > </p> > > > </column> > > > > > > + <column name="options" key="use_ct_inv_match"> > > > + <p> > > > + If set to false, <code>ovn-northd</code> will not use the > > > + <code>ct.inv</code> field in any of the logical flow matches. > > > + The default value is true. If the NIC supports offloading > > > + OVS datapath flows but doesn't support offloading ct_state > > > + <code>inv</code> flag, then the datapath flows matching on > > this flag > > > + (either <code>+inv</code> or <code>-inv</code>) will not be > > > + offloaded. CMS should consider setting > > <code>use_ct_inv_match</code> > > > + to <code>false</code> in such cases. This results in a side > > effect > > > + of the invalid packets getting delivered to the destination > > VIF, which > > > + otherwise would have been dropped by <code>OVN</code>. > > > + </p> > > > + </column> > > > + > > > <group title="Options for configuring interconnection route > > advertisement"> > > > <p> > > > These options control how routes are advertised between OVN > > > -- > > > 2.29.2 > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > Acked-by: Han Zhou <hzhou@ovn.org> > > Thanks for the reviews Mark G and Han. > > I applied both the patches to the main branch addressing Han's comment in p1. > When I submitted this patch, I had a test case which somehow got removed > in the later versions. I added the test case before applying. > > Regards > Numan Hi Numan, Seems this merge breaks the CI. All these vtep related cases are failing: 898: ovn-controller-vtep - chassis FAILED (ovn-controller-vtep.at:110) 899: ovn-controller-vtep - binding 1 FAILED (ovn-controller-vtep.at:178) 900: ovn-controller-vtep - binding 2 FAILED (ovn-controller-vtep.at:252) 901: ovn-controller-vtep - vtep-lswitch FAILED (ovn-controller-vtep.at:291) 902: ovn-controller-vtep - vtep-macs 1 FAILED (ovn-controller-vtep.at:343) 903: ovn-controller-vtep - vtep-macs 2 FAILED (ovn-controller-vtep.at:431) All these tests are successful when I run them locally. I haven't checked more details yet. Thanks, Han
On Thu, May 6, 2021, 7:00 PM Han Zhou <hzhou@ovn.org> wrote: > On Thu, May 6, 2021 at 11:41 AM Numan Siddique <numans@ovn.org> wrote: > > > > On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: > > > > > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: > > > > > > > > From: Numan Siddique <numans@ovn.org> > > > > > > > > Presently we add 65535 priority lflows in the stages - > > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which > > > > match on 'ct.inv'. > > > > > > > > This patch provides an option to not use this field in the > > > > logical flow matches for the following > > > > reasons: > > > > > > > > • Some of the smart NICs which support offloading datapath > > > > flows may not support this field. In such cases, almost > > > > all the datapath flows (matching on the ct_state field +inv > > > > or -inv cannot be offloaded if ACLs/load balancers > > > > are configured on the logical switch datapath. > > > > > > > > • A recent commit in kernel ovs datapath sets the committed > > > > connection tracking entry to be liberal for out-of-window > > > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > > > > packets will not be marked as invalid. > > > > > > > > There are still certain scenarios where a packet can be marked > > > > invalid. Like > > > > • If the tcp flags are not correct. > > > > > > > > • If there are checksum errors. > > > > > > > > In such cases, the packet will be delivered to the destination > > > > port. So CMS should evaluate their usecases before enabling > > > > this option. > > > > > > > > Signed-off-by: Numan Siddique <numans@ovn.org> > > > > Co-authored-by: Ben Pfaff <blp@ovn.org> > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > --- > > > > v3 -> v4 > > > > --- > > > > * Moved the static variable 'use_ct_inv_match' declaration up in > the > > > > file. > > > > > > > > v2 -> v3 > > > > ---- > > > > * Updated the documentation in ovn-nb.xml as per Mark G's > suggestion. > > > > > > > > v1 -> v2 > > > > ---- > > > > * Adopted the code changes shared by Ben Pfaff for the ddlog code. > > > > > > > > NEWS | 7 +- > > > > northd/lswitch.dl | 7 + > > > > northd/ovn-northd.c | 42 +++--- > > > > northd/ovn_northd.dl | 304 > ++++++++++++++++++++++--------------------- > > > > ovn-nb.xml | 15 +++ > > > > 5 files changed, 208 insertions(+), 167 deletions(-) > > > > > > > > diff --git a/NEWS b/NEWS > > > > index 1ddde15f86..5f1365775d 100644 > > > > --- a/NEWS > > > > +++ b/NEWS > > > > @@ -6,11 +6,16 @@ Post-v21.03.0 > > > > expected to scale better than the C implementation, for large > > > deployments. > > > > (This may take testing and tuning to be effective.) This > version of > > > OVN > > > > requires DDLog 0.36. > > > > - - Introduce ovn-controller incremetal processing engine statistics > > > > + - Introduce ovn-controller incremental processing engine > statistics > > > > - Introduced parallel processing in ovn-northd with the NB_Global > > > config option > > > > 'use_parallel_build' to enable it. It is disabled by default. > > > > - Support vlan-passthru mode for tag=0 localnet ports. > > > > - Support custom 802.11ad EthType for localnet ports. > > > > + - Add a new NB Global option - use_ct_inv_match with the default > value > > > of true. > > > > + If this is set to false, then the logical field - "ct.inv" will > not > > > be > > > > + used in the logical flow matches. CMS can consider setting this > to > > > false, > > > > + if they want to use smart NICs which don't support offloading > > > datapath flows > > > > + with this field used. > > > > > > > > OVN v21.03.0 - 12 Mar 2021 > > > > ------------------------- > > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > > > > index 47c497e0cf..27ac3cadc5 100644 > > > > --- a/northd/lswitch.dl > > > > +++ b/northd/lswitch.dl > > > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: > > > Map<string,string>, > > > > relation SvcMonitorMac(mac: eth_addr) > > > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > > > > nb::NB_Global(._uuid = uuid, .options = options). > > > > + > > > > +relation UseCtInvMatch[bool] > > > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > > > > + nb::NB_Global(.options = options). > > > > +UseCtInvMatch[true] :- > > > > + Unit(), > > > > + not nb in nb::NB_Global(). > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > > > index de1aa896d3..5f6c0a6922 100644 > > > > --- a/northd/ovn-northd.c > > > > +++ b/northd/ovn-northd.c > > > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > > > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > > > > static struct eth_addr svc_monitor_mac_ea; > > > > > > > > +/* If this option is 'true' northd will make use of ct.inv match > fields. > > > > + * Otherwise, it will avoid using it. The default is true. */ > > > > +static bool use_ct_inv_match = true; > > > > + > > > > /* Default probe interval for NB and SB DB connections. */ > > > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > > > > static int northd_probe_interval_nb = 0; > > > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool > > > shared, > > > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > > > > } > > > > > > > > - > > > > /* Adds a row with the specified contents to the Logical_Flow table. > */ > > > > static void > > > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > > > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct > hmap > > > *lflows, > > > > * for deletion (bit 0 of ct_label is set). > > > > * > > > > * This is enforced at a higher priority than ACLs can be > > > defined. */ > > > > + char *match = xasprintf("%s(ct.est && ct.rpl && > ct_label.blocked > > > == 1)", > > > > + use_ct_inv_match ? "ct.inv || " : > ""); > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > > - "ct.inv || (ct.est && ct.rpl && > ct_label.blocked > > > == 1)", > > > > - "drop;"); > > > > + match, "drop;"); > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > > - "ct.inv || (ct.est && ct.rpl && > ct_label.blocked > > > == 1)", > > > > - "drop;"); > > > > + match, "drop;"); > > > > + free(match); > > > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > > * > > > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct > hmap > > > *lflows, > > > > * direction to hit the currently defined policy from ACLs. > > > > * > > > > * This is enforced at a higher priority than ACLs can be > > > defined. */ > > > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > > > > + "ct.rpl && ct_label.blocked == 0", > > > > + use_ct_inv_match ? " && !ct.inv" : ""); > > > > + > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > > > - "&& ct.rpl && ct_label.blocked == 0", > > > > - "next;"); > > > > + match, "next;"); > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > > > > - "&& ct.rpl && ct_label.blocked == 0", > > > > - "next;"); > > > > + match, "next;"); > > > > + free(match); > > > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > > * > > > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct > hmap > > > *lflows, > > > > * a dynamically negotiated FTP data channel), but will > allow > > > > * related traffic such as an ICMP Port Unreachable through > > > > * that's generated from a non-listening UDP port. */ > > > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > > > > + "ct_label.blocked == 0", > > > > + use_ct_inv_match ? " && !ct.inv" : ""); > > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > > > - "&& ct_label.blocked == 0", > > > > - "next;"); > > > > + match, "next;"); > > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > > > > - "&& ct_label.blocked == 0", > > > > - "next;"); > > > > + match, "next;"); > > > > + free(match); > > > > > > > > /* Ingress and Egress ACL Table (Priority 65535). > > > > * > > > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > > > > > > > > use_logical_dp_groups = smap_get_bool(&nb->options, > > > > "use_logical_dp_groups", > > > false); > > > > + use_ct_inv_match = smap_get_bool(&nb->options, > > > > + "use_ct_inv_match", true); > > > > + > > > > /* deprecated, use --event instead */ > > > > controller_event_en = smap_get_bool(&nb->options, > > > > "controller_event", false); > > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > > > > index 2f5d36c599..a0f7944103 100644 > > > > --- a/northd/ovn_northd.dl > > > > +++ b/northd/ovn_northd.dl > > > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, > > > fair_meter, extra_match_, extra_action > > > > } > > > > > > > > /* build_acls */ > > > > -for (sw in &Switch(.ls = ls)) > > > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > > > -{ > > > > - /* Ingress and Egress ACL Table (Priority 0): Packets are > allowed by > > > > - * default. A related rule at priority 1 is added below if > there > > > > - * are any stateful ACLs in this datapath. */ > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 0, > > > > - .__match = "1", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 0, > > > > - .__match = "1", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()); > > > > - > > > > - if (has_stateful) { > > > > - /* Ingress and Egress ACL Table (Priority 1). > > > > - * > > > > - * By default, traffic is allowed. This is partially > handled by > > > > - * the Priority 0 ACL flows added earlier, but we also need > to > > > > - * commit IP flows. This is because, while the initiater's > > > > - * direction may not have any stateful rules, the server's > may > > > > - * and then its return traffic would not have an associated > > > > - * conntrack entry and would return "+invalid". > > > > - * > > > > - * We use "ct_commit" for a connection that is not already > known > > > > - * by the connection tracker. Once a connection is > committed, > > > > - * subsequent packets will hit the flow at priority 0 that > just > > > > - * uses "next;" > > > > - * > > > > - * We also check for established connections that have > > > ct_label.blocked > > > > - * set on them. That's a connection that was disallowed, > but is > > > > - * now allowed by policy again since it hit this > default-allow > > > flow. > > > > - * We need to set ct_label.blocked=0 to let the connection > > > continue, > > > > - * which will be done by ct_commit() in the "stateful" > stage. > > > > - * Subsequent packets will hit the flow at priority 0 that > just > > > > - * uses "next;". */ > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 1, > > > > - .__match = "ip && (!ct.est || (ct.est && > > > ct_label.blocked == 1))", > > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > > next;", > > > > - .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 1, > > > > - .__match = "ip && (!ct.est || (ct.est && > > > ct_label.blocked == 1))", > > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; > > > next;", > > > > - .external_ids = map_empty()); > > > > - > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > > - * > > > > - * Always drop traffic that's in an invalid state. Also > drop > > > > - * reply direction packets for connections that have been > marked > > > > - * for deletion (bit 0 of ct_label is set). > > > > - * > > > > - * This is enforced at a higher priority than ACLs can be > > > defined. */ > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 65535, > > > > - .__match = "ct.inv || (ct.est && ct.rpl && > > > ct_label.blocked == 1)", > > > > - .actions = "drop;", > > > > - .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 65535, > > > > - .__match = "ct.inv || (ct.est && ct.rpl && > > > ct_label.blocked == 1)", > > > > - .actions = "drop;", > > > > - .external_ids = map_empty()); > > > > - > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > > - * > > > > - * Allow reply traffic that is part of an established > > > > - * conntrack entry that has not been marked for deletion > > > > - * (bit 0 of ct_label). We only match traffic in the > > > > - * reply direction because we want traffic in the request > > > > - * direction to hit the currently defined policy from ACLs. > > > > - * > > > > - * This is enforced at a higher priority than ACLs can be > > > defined. */ > > > > +for (UseCtInvMatch[use_ct_inv_match]) { > > > > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { > > > > + true -> ("ct.inv || ", "&& !ct.inv "), > > > > + false -> ("", ""), > > > > + } in > > > > + for (sw in &Switch(.ls = ls)) > > > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > > > > + { > > > > + /* Ingress and Egress ACL Table (Priority 0): Packets are > > > allowed by > > > > + * default. A related rule at priority 1 is added below if > there > > > > + * are any stateful ACLs in this datapath. */ > > > > Flow(.logical_datapath = ls._uuid, > > > > .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 65535, > > > > - .__match = "ct.est && !ct.rel && !ct.new && > > > !ct.inv " > > > > - "&& ct.rpl && ct_label.blocked == > 0", > > > > + .priority = 0, > > > > + .__match = "1", > > > > .actions = "next;", > > > > .external_ids = map_empty()); > > > > Flow(.logical_datapath = ls._uuid, > > > > .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 65535, > > > > - .__match = "ct.est && !ct.rel && !ct.new && > > > !ct.inv " > > > > - "&& ct.rpl && ct_label.blocked == > 0", > > > > + .priority = 0, > > > > + .__match = "1", > > > > .actions = "next;", > > > > .external_ids = map_empty()); > > > > > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > > - * > > > > - * Allow traffic that is related to an existing conntrack > entry > > > that > > > > - * has not been marked for deletion (bit 0 of ct_label). > > > > - * > > > > - * This is enforced at a higher priority than ACLs can be > > > defined. > > > > - * > > > > - * NOTE: This does not support related data sessions (eg, > > > > - * a dynamically negotiated FTP data channel), but will > allow > > > > - * related traffic such as an ICMP Port Unreachable through > > > > - * that's generated from a non-listening UDP port. */ > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 65535, > > > > - .__match = "!ct.est && ct.rel && !ct.new && > > > !ct.inv " > > > > - "&& ct_label.blocked == 0", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 65535, > > > > - .__match = "!ct.est && ct.rel && !ct.new && > > > !ct.inv " > > > > - "&& ct_label.blocked == 0", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()); > > > > + if (has_stateful) { > > > > + /* Ingress and Egress ACL Table (Priority 1). > > > > + * > > > > + * By default, traffic is allowed. This is partially > > > handled by > > > > + * the Priority 0 ACL flows added earlier, but we also > need > > > to > > > > + * commit IP flows. This is because, while the > initiater's > > > > + * direction may not have any stateful rules, the > server's > > > may > > > > + * and then its return traffic would not have an > associated > > > > + * conntrack entry and would return "+invalid". > > > > + * > > > > + * We use "ct_commit" for a connection that is not > already > > > known > > > > + * by the connection tracker. Once a connection is > > > committed, > > > > + * subsequent packets will hit the flow at priority 0 > that > > > just > > > > + * uses "next;" > > > > + * > > > > + * We also check for established connections that have > > > ct_label.blocked > > > > + * set on them. That's a connection that was > disallowed, > > > but is > > > > + * now allowed by policy again since it hit this > > > default-allow flow. > > > > + * We need to set ct_label.blocked=0 to let the > connection > > > continue, > > > > + * which will be done by ct_commit() in the "stateful" > stage. > > > > + * Subsequent packets will hit the flow at priority 0 > that > > > just > > > > + * uses "next;". */ > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_IN_ACL(), > > > > + .priority = 1, > > > > + .__match = "ip && (!ct.est || (ct.est && > > > ct_label.blocked == 1))", > > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = > 1; > > > next;", > > > > + .external_ids = map_empty()); > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 1, > > > > + .__match = "ip && (!ct.est || (ct.est && > > > ct_label.blocked == 1))", > > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = > 1; > > > next;", > > > > + .external_ids = map_empty()); > > > > > > > > - /* Ingress and Egress ACL Table (Priority 65535). > > > > - * > > > > - * Not to do conntrack on ND packets. */ > > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > > + * > > > > + * Always drop traffic that's in an invalid state. Also > drop > > > > + * reply direction packets for connections that have > been > > > marked > > > > + * for deletion (bit 0 of ct_label is set). > > > > + * > > > > + * This is enforced at a higher priority than ACLs can > be > > > defined. */ > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_IN_ACL(), > > > > + .priority = 65535, > > > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl > && > > > ct_label.blocked == 1)", > > > > + .actions = "drop;", > > > > + .external_ids = map_empty()); > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 65535, > > > > + .__match = ct_inv_or ++ "(ct.est && ct.rpl > && > > > ct_label.blocked == 1)", > > > > + .actions = "drop;", > > > > + .external_ids = map_empty()); > > > > + > > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > > + * > > > > + * Allow reply traffic that is part of an established > > > > + * conntrack entry that has not been marked for deletion > > > > + * (bit 0 of ct_label). We only match traffic in the > > > > + * reply direction because we want traffic in the > request > > > > + * direction to hit the currently defined policy from > ACLs. > > > > + * > > > > + * This is enforced at a higher priority than ACLs can > be > > > defined. */ > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_IN_ACL(), > > > > + .priority = 65535, > > > > + .__match = "ct.est && !ct.rel && !ct.new " > ++ > > > and_not_ct_inv ++ > > > > + "&& ct.rpl && ct_label.blocked > == > > > 0", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()); > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 65535, > > > > + .__match = "ct.est && !ct.rel && !ct.new " > ++ > > > and_not_ct_inv ++ > > > > + "&& ct.rpl && ct_label.blocked > == > > > 0", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()); > > > > + > > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > > + * > > > > + * Allow traffic that is related to an existing > conntrack > > > entry that > > > > + * has not been marked for deletion (bit 0 of ct_label). > > > > + * > > > > + * This is enforced at a higher priority than ACLs can > be > > > defined. > > > > + * > > > > + * NOTE: This does not support related data sessions > (eg, > > > > + * a dynamically negotiated FTP data channel), but will > allow > > > > + * related traffic such as an ICMP Port Unreachable > through > > > > + * that's generated from a non-listening UDP port. */ > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_IN_ACL(), > > > > + .priority = 65535, > > > > + .__match = "!ct.est && ct.rel && !ct.new " > ++ > > > and_not_ct_inv ++ > > > > + "&& ct_label.blocked == 0", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()); > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 65535, > > > > + .__match = "!ct.est && ct.rel && !ct.new " > ++ > > > and_not_ct_inv ++ > > > > + "&& ct_label.blocked == 0", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()); > > > > + > > > > + /* Ingress and Egress ACL Table (Priority 65535). > > > > + * > > > > + * Not to do conntrack on ND packets. */ > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_IN_ACL(), > > > > + .priority = 65535, > > > > + .__match = "nd || nd_ra || nd_rs || mldv1 > || > > > mldv2", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()); > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 65535, > > > > + .__match = "nd || nd_ra || nd_rs || mldv1 > || > > > mldv2", > > > > + .actions = "next;", > > > > + .external_ids = map_empty()) > > > > + }; > > > > + > > > > + /* Add a 34000 priority flow to advance the DNS reply from > > > ovn-controller, > > > > + * if the CMS has configured DNS records for the datapath. > > > > + */ > > > > + if (sw.has_dns_records) { > > > > + Flow(.logical_datapath = ls._uuid, > > > > + .stage = s_SWITCH_OUT_ACL(), > > > > + .priority = 34000, > > > > + .__match = "udp.src == 53", > > > > + .actions = if has_stateful "ct_commit; > next;" > > > else "next;", > > > > + .external_ids = map_empty()) > > > > + }; > > > > + > > > > + /* Add a 34000 priority flow to advance the service monitor > reply > > > > + * packets to skip applying ingress ACLs. */ > > > > Flow(.logical_datapath = ls._uuid, > > > > .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 65535, > > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > > > mldv2", > > > > + .priority = 34000, > > > > + .__match = "eth.dst == $svc_monitor_mac", > > > > .actions = "next;", > > > > .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 65535, > > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || > > > mldv2", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()) > > > > - }; > > > > - > > > > - /* Add a 34000 priority flow to advance the DNS reply from > > > ovn-controller, > > > > - * if the CMS has configured DNS records for the datapath. > > > > - */ > > > > - if (sw.has_dns_records) { > > > > Flow(.logical_datapath = ls._uuid, > > > > .stage = s_SWITCH_OUT_ACL(), > > > > .priority = 34000, > > > > - .__match = "udp.src == 53", > > > > - .actions = if has_stateful "ct_commit; next;" > else > > > "next;", > > > > + .__match = "eth.src == $svc_monitor_mac", > > > > + .actions = "next;", > > > > .external_ids = map_empty()) > > > > - }; > > > > - > > > > - /* Add a 34000 priority flow to advance the service monitor > reply > > > > - * packets to skip applying ingress ACLs. */ > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_IN_ACL(), > > > > - .priority = 34000, > > > > - .__match = "eth.dst == $svc_monitor_mac", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()); > > > > - Flow(.logical_datapath = ls._uuid, > > > > - .stage = s_SWITCH_OUT_ACL(), > > > > - .priority = 34000, > > > > - .__match = "eth.src == $svc_monitor_mac", > > > > - .actions = "next;", > > > > - .external_ids = map_empty()) > > > > + } > > > > } > > > > > > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on > various > > > > diff --git a/ovn-nb.xml b/ovn-nb.xml > > > > index feb38b5d31..e337be4eb3 100644 > > > > --- a/ovn-nb.xml > > > > +++ b/ovn-nb.xml > > > > @@ -240,6 +240,21 @@ > > > > </p> > > > > </column> > > > > > > > > + <column name="options" key="use_ct_inv_match"> > > > > + <p> > > > > + If set to false, <code>ovn-northd</code> will not use the > > > > + <code>ct.inv</code> field in any of the logical flow > matches. > > > > + The default value is true. If the NIC supports offloading > > > > + OVS datapath flows but doesn't support offloading ct_state > > > > + <code>inv</code> flag, then the datapath flows matching on > > > this flag > > > > + (either <code>+inv</code> or <code>-inv</code>) will not > be > > > > + offloaded. CMS should consider setting > > > <code>use_ct_inv_match</code> > > > > + to <code>false</code> in such cases. This results in a > side > > > effect > > > > + of the invalid packets getting delivered to the > destination > > > VIF, which > > > > + otherwise would have been dropped by <code>OVN</code>. > > > > + </p> > > > > + </column> > > > > + > > > > <group title="Options for configuring interconnection route > > > advertisement"> > > > > <p> > > > > These options control how routes are advertised between > OVN > > > > -- > > > > 2.29.2 > > > > > > > > _______________________________________________ > > > > dev mailing list > > > > dev@openvswitch.org > > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > Acked-by: Han Zhou <hzhou@ovn.org> > > > > Thanks for the reviews Mark G and Han. > > > > I applied both the patches to the main branch addressing Han's comment in > p1. > > When I submitted this patch, I had a test case which somehow got removed > > in the later versions. I added the test case before applying. > > > > Regards > > Numan > > Hi Numan, > > Seems this merge breaks the CI. All these vtep related cases are failing: > 898: ovn-controller-vtep - chassis FAILED (ovn-controller-vtep.at:110) > 899: ovn-controller-vtep - binding 1 FAILED (ovn-controller-vtep.at:178) > 900: ovn-controller-vtep - binding 2 FAILED (ovn-controller-vtep.at:252) > 901: ovn-controller-vtep - vtep-lswitch FAILED (ovn-controller-vtep.at:291 > ) > 902: ovn-controller-vtep - vtep-macs 1 FAILED (ovn-controller-vtep.at:343) > 903: ovn-controller-vtep - vtep-macs 2 FAILED (ovn-controller-vtep.at:431) > > All these tests are successful when I run them locally. I haven't checked > more details yet. > Hi Han, I noticed that. In fact it's failing without these patches too. I quickly looked into it and looks like it's failing because of some warning message related to dns_resolve in ovsdb-server log. I think wen need to white list this warning message in vtep tests. Thanks Numan > > > Thanks, > Han > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Thu, May 6, 2021 at 6:13 PM Numan Siddique <numans@ovn.org> wrote: > > > On Thu, May 6, 2021, 7:00 PM Han Zhou <hzhou@ovn.org> wrote: > >> On Thu, May 6, 2021 at 11:41 AM Numan Siddique <numans@ovn.org> wrote: >> > >> > On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: >> > > >> > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: >> > > > >> > > > From: Numan Siddique <numans@ovn.org> >> > > > >> > > > Presently we add 65535 priority lflows in the stages - >> > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which >> > > > match on 'ct.inv'. >> > > > >> > > > This patch provides an option to not use this field in the >> > > > logical flow matches for the following >> > > > reasons: >> > > > >> > > > • Some of the smart NICs which support offloading datapath >> > > > flows may not support this field. In such cases, almost >> > > > all the datapath flows (matching on the ct_state field +inv >> > > > or -inv cannot be offloaded if ACLs/load balancers >> > > > are configured on the logical switch datapath. >> > > > >> > > > • A recent commit in kernel ovs datapath sets the committed >> > > > connection tracking entry to be liberal for out-of-window >> > > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP >> > > > packets will not be marked as invalid. >> > > > >> > > > There are still certain scenarios where a packet can be marked >> > > > invalid. Like >> > > > • If the tcp flags are not correct. >> > > > >> > > > • If there are checksum errors. >> > > > >> > > > In such cases, the packet will be delivered to the destination >> > > > port. So CMS should evaluate their usecases before enabling >> > > > this option. >> > > > >> > > > Signed-off-by: Numan Siddique <numans@ovn.org> >> > > > Co-authored-by: Ben Pfaff <blp@ovn.org> >> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> >> > > > --- >> > > > v3 -> v4 >> > > > --- >> > > > * Moved the static variable 'use_ct_inv_match' declaration up in >> the >> > > > file. >> > > > >> > > > v2 -> v3 >> > > > ---- >> > > > * Updated the documentation in ovn-nb.xml as per Mark G's >> suggestion. >> > > > >> > > > v1 -> v2 >> > > > ---- >> > > > * Adopted the code changes shared by Ben Pfaff for the ddlog code. >> > > > >> > > > NEWS | 7 +- >> > > > northd/lswitch.dl | 7 + >> > > > northd/ovn-northd.c | 42 +++--- >> > > > northd/ovn_northd.dl | 304 >> ++++++++++++++++++++++--------------------- >> > > > ovn-nb.xml | 15 +++ >> > > > 5 files changed, 208 insertions(+), 167 deletions(-) >> > > > >> > > > diff --git a/NEWS b/NEWS >> > > > index 1ddde15f86..5f1365775d 100644 >> > > > --- a/NEWS >> > > > +++ b/NEWS >> > > > @@ -6,11 +6,16 @@ Post-v21.03.0 >> > > > expected to scale better than the C implementation, for large >> > > deployments. >> > > > (This may take testing and tuning to be effective.) This >> version of >> > > OVN >> > > > requires DDLog 0.36. >> > > > - - Introduce ovn-controller incremetal processing engine >> statistics >> > > > + - Introduce ovn-controller incremental processing engine >> statistics >> > > > - Introduced parallel processing in ovn-northd with the NB_Global >> > > config option >> > > > 'use_parallel_build' to enable it. It is disabled by default. >> > > > - Support vlan-passthru mode for tag=0 localnet ports. >> > > > - Support custom 802.11ad EthType for localnet ports. >> > > > + - Add a new NB Global option - use_ct_inv_match with the default >> value >> > > of true. >> > > > + If this is set to false, then the logical field - "ct.inv" will >> not >> > > be >> > > > + used in the logical flow matches. CMS can consider setting >> this >> to >> > > false, >> > > > + if they want to use smart NICs which don't support offloading >> > > datapath flows >> > > > + with this field used. >> > > > >> > > > OVN v21.03.0 - 12 Mar 2021 >> > > > ------------------------- >> > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl >> > > > index 47c497e0cf..27ac3cadc5 100644 >> > > > --- a/northd/lswitch.dl >> > > > +++ b/northd/lswitch.dl >> > > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: >> > > Map<string,string>, >> > > > relation SvcMonitorMac(mac: eth_addr) >> > > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- >> > > > nb::NB_Global(._uuid = uuid, .options = options). >> > > > + >> > > > +relation UseCtInvMatch[bool] >> > > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- >> > > > + nb::NB_Global(.options = options). >> > > > +UseCtInvMatch[true] :- >> > > > + Unit(), >> > > > + not nb in nb::NB_Global(). >> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >> > > > index de1aa896d3..5f6c0a6922 100644 >> > > > --- a/northd/ovn-northd.c >> > > > +++ b/northd/ovn-northd.c >> > > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; >> > > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; >> > > > static struct eth_addr svc_monitor_mac_ea; >> > > > >> > > > +/* If this option is 'true' northd will make use of ct.inv match >> fields. >> > > > + * Otherwise, it will avoid using it. The default is true. */ >> > > > +static bool use_ct_inv_match = true; >> > > > + >> > > > /* Default probe interval for NB and SB DB connections. */ >> > > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 >> > > > static int northd_probe_interval_nb = 0; >> > > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool >> > > shared, >> > > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); >> > > > } >> > > > >> > > > - >> > > > /* Adds a row with the specified contents to the Logical_Flow >> table. >> */ >> > > > static void >> > > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, >> > > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct >> hmap >> > > *lflows, >> > > > * for deletion (bit 0 of ct_label is set). >> > > > * >> > > > * This is enforced at a higher priority than ACLs can be >> > > defined. */ >> > > > + char *match = xasprintf("%s(ct.est && ct.rpl && >> ct_label.blocked >> > > == 1)", >> > > > + use_ct_inv_match ? "ct.inv || " : >> ""); >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> > > > - "ct.inv || (ct.est && ct.rpl && >> ct_label.blocked >> > > == 1)", >> > > > - "drop;"); >> > > > + match, "drop;"); >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> > > > - "ct.inv || (ct.est && ct.rpl && >> ct_label.blocked >> > > == 1)", >> > > > - "drop;"); >> > > > + match, "drop;"); >> > > > + free(match); >> > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> > > > * >> > > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct >> hmap >> > > *lflows, >> > > > * direction to hit the currently defined policy from ACLs. >> > > > * >> > > > * This is enforced at a higher priority than ACLs can be >> > > defined. */ >> > > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " >> > > > + "ct.rpl && ct_label.blocked == 0", >> > > > + use_ct_inv_match ? " && !ct.inv" : ""); >> > > > + >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " >> > > > - "&& ct.rpl && ct_label.blocked == 0", >> > > > - "next;"); >> > > > + match, "next;"); >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " >> > > > - "&& ct.rpl && ct_label.blocked == 0", >> > > > - "next;"); >> > > > + match, "next;"); >> > > > + free(match); >> > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> > > > * >> > > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct >> hmap >> > > *lflows, >> > > > * a dynamically negotiated FTP data channel), but will >> allow >> > > > * related traffic such as an ICMP Port Unreachable through >> > > > * that's generated from a non-listening UDP port. */ >> > > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " >> > > > + "ct_label.blocked == 0", >> > > > + use_ct_inv_match ? " && !ct.inv" : ""); >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " >> > > > - "&& ct_label.blocked == 0", >> > > > - "next;"); >> > > > + match, "next;"); >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " >> > > > - "&& ct_label.blocked == 0", >> > > > - "next;"); >> > > > + match, "next;"); >> > > > + free(match); >> > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> > > > * >> > > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, >> > > > >> > > > use_logical_dp_groups = smap_get_bool(&nb->options, >> > > > "use_logical_dp_groups", >> > > false); >> > > > + use_ct_inv_match = smap_get_bool(&nb->options, >> > > > + "use_ct_inv_match", true); >> > > > + >> > > > /* deprecated, use --event instead */ >> > > > controller_event_en = smap_get_bool(&nb->options, >> > > > "controller_event", false); >> > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl >> > > > index 2f5d36c599..a0f7944103 100644 >> > > > --- a/northd/ovn_northd.dl >> > > > +++ b/northd/ovn_northd.dl >> > > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, >> > > fair_meter, extra_match_, extra_action >> > > > } >> > > > >> > > > /* build_acls */ >> > > > -for (sw in &Switch(.ls = ls)) >> > > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in >> > > > -{ >> > > > - /* Ingress and Egress ACL Table (Priority 0): Packets are >> allowed by >> > > > - * default. A related rule at priority 1 is added below if >> there >> > > > - * are any stateful ACLs in this datapath. */ >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 0, >> > > > - .__match = "1", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 0, >> > > > - .__match = "1", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()); >> > > > - >> > > > - if (has_stateful) { >> > > > - /* Ingress and Egress ACL Table (Priority 1). >> > > > - * >> > > > - * By default, traffic is allowed. This is partially >> handled by >> > > > - * the Priority 0 ACL flows added earlier, but we also need >> to >> > > > - * commit IP flows. This is because, while the initiater's >> > > > - * direction may not have any stateful rules, the server's >> may >> > > > - * and then its return traffic would not have an associated >> > > > - * conntrack entry and would return "+invalid". >> > > > - * >> > > > - * We use "ct_commit" for a connection that is not already >> known >> > > > - * by the connection tracker. Once a connection is >> committed, >> > > > - * subsequent packets will hit the flow at priority 0 that >> just >> > > > - * uses "next;" >> > > > - * >> > > > - * We also check for established connections that have >> > > ct_label.blocked >> > > > - * set on them. That's a connection that was disallowed, >> but is >> > > > - * now allowed by policy again since it hit this >> default-allow >> > > flow. >> > > > - * We need to set ct_label.blocked=0 to let the connection >> > > continue, >> > > > - * which will be done by ct_commit() in the "stateful" >> stage. >> > > > - * Subsequent packets will hit the flow at priority 0 that >> just >> > > > - * uses "next;". */ >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 1, >> > > > - .__match = "ip && (!ct.est || (ct.est && >> > > ct_label.blocked == 1))", >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; >> > > next;", >> > > > - .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 1, >> > > > - .__match = "ip && (!ct.est || (ct.est && >> > > ct_label.blocked == 1))", >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; >> > > next;", >> > > > - .external_ids = map_empty()); >> > > > - >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> > > > - * >> > > > - * Always drop traffic that's in an invalid state. Also >> drop >> > > > - * reply direction packets for connections that have been >> marked >> > > > - * for deletion (bit 0 of ct_label is set). >> > > > - * >> > > > - * This is enforced at a higher priority than ACLs can be >> > > defined. */ >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && >> > > ct_label.blocked == 1)", >> > > > - .actions = "drop;", >> > > > - .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && >> > > ct_label.blocked == 1)", >> > > > - .actions = "drop;", >> > > > - .external_ids = map_empty()); >> > > > - >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> > > > - * >> > > > - * Allow reply traffic that is part of an established >> > > > - * conntrack entry that has not been marked for deletion >> > > > - * (bit 0 of ct_label). We only match traffic in the >> > > > - * reply direction because we want traffic in the request >> > > > - * direction to hit the currently defined policy from ACLs. >> > > > - * >> > > > - * This is enforced at a higher priority than ACLs can be >> > > defined. */ >> > > > +for (UseCtInvMatch[use_ct_inv_match]) { >> > > > + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) >> { >> > > > + true -> ("ct.inv || ", "&& !ct.inv "), >> > > > + false -> ("", ""), >> > > > + } in >> > > > + for (sw in &Switch(.ls = ls)) >> > > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in >> > > > + { >> > > > + /* Ingress and Egress ACL Table (Priority 0): Packets are >> > > allowed by >> > > > + * default. A related rule at priority 1 is added below if >> there >> > > > + * are any stateful ACLs in this datapath. */ >> > > > Flow(.logical_datapath = ls._uuid, >> > > > .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "ct.est && !ct.rel && !ct.new && >> > > !ct.inv " >> > > > - "&& ct.rpl && ct_label.blocked == >> 0", >> > > > + .priority = 0, >> > > > + .__match = "1", >> > > > .actions = "next;", >> > > > .external_ids = map_empty()); >> > > > Flow(.logical_datapath = ls._uuid, >> > > > .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "ct.est && !ct.rel && !ct.new && >> > > !ct.inv " >> > > > - "&& ct.rpl && ct_label.blocked == >> 0", >> > > > + .priority = 0, >> > > > + .__match = "1", >> > > > .actions = "next;", >> > > > .external_ids = map_empty()); >> > > > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> > > > - * >> > > > - * Allow traffic that is related to an existing conntrack >> entry >> > > that >> > > > - * has not been marked for deletion (bit 0 of ct_label). >> > > > - * >> > > > - * This is enforced at a higher priority than ACLs can be >> > > defined. >> > > > - * >> > > > - * NOTE: This does not support related data sessions (eg, >> > > > - * a dynamically negotiated FTP data channel), but will >> allow >> > > > - * related traffic such as an ICMP Port Unreachable through >> > > > - * that's generated from a non-listening UDP port. */ >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "!ct.est && ct.rel && !ct.new && >> > > !ct.inv " >> > > > - "&& ct_label.blocked == 0", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "!ct.est && ct.rel && !ct.new && >> > > !ct.inv " >> > > > - "&& ct_label.blocked == 0", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()); >> > > > + if (has_stateful) { >> > > > + /* Ingress and Egress ACL Table (Priority 1). >> > > > + * >> > > > + * By default, traffic is allowed. This is partially >> > > handled by >> > > > + * the Priority 0 ACL flows added earlier, but we also >> need >> > > to >> > > > + * commit IP flows. This is because, while the >> initiater's >> > > > + * direction may not have any stateful rules, the >> server's >> > > may >> > > > + * and then its return traffic would not have an >> associated >> > > > + * conntrack entry and would return "+invalid". >> > > > + * >> > > > + * We use "ct_commit" for a connection that is not >> already >> > > known >> > > > + * by the connection tracker. Once a connection is >> > > committed, >> > > > + * subsequent packets will hit the flow at priority 0 >> that >> > > just >> > > > + * uses "next;" >> > > > + * >> > > > + * We also check for established connections that have >> > > ct_label.blocked >> > > > + * set on them. That's a connection that was >> disallowed, >> > > but is >> > > > + * now allowed by policy again since it hit this >> > > default-allow flow. >> > > > + * We need to set ct_label.blocked=0 to let the >> connection >> > > continue, >> > > > + * which will be done by ct_commit() in the "stateful" >> stage. >> > > > + * Subsequent packets will hit the flow at priority 0 >> that >> > > just >> > > > + * uses "next;". */ >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_IN_ACL(), >> > > > + .priority = 1, >> > > > + .__match = "ip && (!ct.est || (ct.est && >> > > ct_label.blocked == 1))", >> > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} >> = >> 1; >> > > next;", >> > > > + .external_ids = map_empty()); >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 1, >> > > > + .__match = "ip && (!ct.est || (ct.est && >> > > ct_label.blocked == 1))", >> > > > + .actions = "${rEGBIT_CONNTRACK_COMMIT()} >> = >> 1; >> > > next;", >> > > > + .external_ids = map_empty()); >> > > > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> > > > - * >> > > > - * Not to do conntrack on ND packets. */ >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> > > > + * >> > > > + * Always drop traffic that's in an invalid state. >> Also >> drop >> > > > + * reply direction packets for connections that have >> been >> > > marked >> > > > + * for deletion (bit 0 of ct_label is set). >> > > > + * >> > > > + * This is enforced at a higher priority than ACLs can >> be >> > > defined. */ >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_IN_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = ct_inv_or ++ "(ct.est && >> ct.rpl >> && >> > > ct_label.blocked == 1)", >> > > > + .actions = "drop;", >> > > > + .external_ids = map_empty()); >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = ct_inv_or ++ "(ct.est && >> ct.rpl >> && >> > > ct_label.blocked == 1)", >> > > > + .actions = "drop;", >> > > > + .external_ids = map_empty()); >> > > > + >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> > > > + * >> > > > + * Allow reply traffic that is part of an established >> > > > + * conntrack entry that has not been marked for >> deletion >> > > > + * (bit 0 of ct_label). We only match traffic in the >> > > > + * reply direction because we want traffic in the >> request >> > > > + * direction to hit the currently defined policy from >> ACLs. >> > > > + * >> > > > + * This is enforced at a higher priority than ACLs can >> be >> > > defined. */ >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_IN_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "ct.est && !ct.rel && !ct.new >> " >> ++ >> > > and_not_ct_inv ++ >> > > > + "&& ct.rpl && ct_label.blocked >> == >> > > 0", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()); >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "ct.est && !ct.rel && !ct.new >> " >> ++ >> > > and_not_ct_inv ++ >> > > > + "&& ct.rpl && ct_label.blocked >> == >> > > 0", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()); >> > > > + >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> > > > + * >> > > > + * Allow traffic that is related to an existing >> conntrack >> > > entry that >> > > > + * has not been marked for deletion (bit 0 of >> ct_label). >> > > > + * >> > > > + * This is enforced at a higher priority than ACLs can >> be >> > > defined. >> > > > + * >> > > > + * NOTE: This does not support related data sessions >> (eg, >> > > > + * a dynamically negotiated FTP data channel), but will >> allow >> > > > + * related traffic such as an ICMP Port Unreachable >> through >> > > > + * that's generated from a non-listening UDP port. */ >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_IN_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "!ct.est && ct.rel && !ct.new >> " >> ++ >> > > and_not_ct_inv ++ >> > > > + "&& ct_label.blocked == 0", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()); >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "!ct.est && ct.rel && !ct.new >> " >> ++ >> > > and_not_ct_inv ++ >> > > > + "&& ct_label.blocked == 0", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()); >> > > > + >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> > > > + * >> > > > + * Not to do conntrack on ND packets. */ >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_IN_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "nd || nd_ra || nd_rs || mldv1 >> || >> > > mldv2", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()); >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 65535, >> > > > + .__match = "nd || nd_ra || nd_rs || mldv1 >> || >> > > mldv2", >> > > > + .actions = "next;", >> > > > + .external_ids = map_empty()) >> > > > + }; >> > > > + >> > > > + /* Add a 34000 priority flow to advance the DNS reply from >> > > ovn-controller, >> > > > + * if the CMS has configured DNS records for the datapath. >> > > > + */ >> > > > + if (sw.has_dns_records) { >> > > > + Flow(.logical_datapath = ls._uuid, >> > > > + .stage = s_SWITCH_OUT_ACL(), >> > > > + .priority = 34000, >> > > > + .__match = "udp.src == 53", >> > > > + .actions = if has_stateful "ct_commit; >> next;" >> > > else "next;", >> > > > + .external_ids = map_empty()) >> > > > + }; >> > > > + >> > > > + /* Add a 34000 priority flow to advance the service monitor >> reply >> > > > + * packets to skip applying ingress ACLs. */ >> > > > Flow(.logical_datapath = ls._uuid, >> > > > .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || >> > > mldv2", >> > > > + .priority = 34000, >> > > > + .__match = "eth.dst == $svc_monitor_mac", >> > > > .actions = "next;", >> > > > .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 65535, >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 || >> > > mldv2", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()) >> > > > - }; >> > > > - >> > > > - /* Add a 34000 priority flow to advance the DNS reply from >> > > ovn-controller, >> > > > - * if the CMS has configured DNS records for the datapath. >> > > > - */ >> > > > - if (sw.has_dns_records) { >> > > > Flow(.logical_datapath = ls._uuid, >> > > > .stage = s_SWITCH_OUT_ACL(), >> > > > .priority = 34000, >> > > > - .__match = "udp.src == 53", >> > > > - .actions = if has_stateful "ct_commit; next;" >> else >> > > "next;", >> > > > + .__match = "eth.src == $svc_monitor_mac", >> > > > + .actions = "next;", >> > > > .external_ids = map_empty()) >> > > > - }; >> > > > - >> > > > - /* Add a 34000 priority flow to advance the service monitor >> reply >> > > > - * packets to skip applying ingress ACLs. */ >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_IN_ACL(), >> > > > - .priority = 34000, >> > > > - .__match = "eth.dst == $svc_monitor_mac", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()); >> > > > - Flow(.logical_datapath = ls._uuid, >> > > > - .stage = s_SWITCH_OUT_ACL(), >> > > > - .priority = 34000, >> > > > - .__match = "eth.src == $svc_monitor_mac", >> > > > - .actions = "next;", >> > > > - .external_ids = map_empty()) >> > > > + } >> > > > } >> > > > >> > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on >> various >> > > > diff --git a/ovn-nb.xml b/ovn-nb.xml >> > > > index feb38b5d31..e337be4eb3 100644 >> > > > --- a/ovn-nb.xml >> > > > +++ b/ovn-nb.xml >> > > > @@ -240,6 +240,21 @@ >> > > > </p> >> > > > </column> >> > > > >> > > > + <column name="options" key="use_ct_inv_match"> >> > > > + <p> >> > > > + If set to false, <code>ovn-northd</code> will not use the >> > > > + <code>ct.inv</code> field in any of the logical flow >> matches. >> > > > + The default value is true. If the NIC supports >> offloading >> > > > + OVS datapath flows but doesn't support offloading >> ct_state >> > > > + <code>inv</code> flag, then the datapath flows matching >> on >> > > this flag >> > > > + (either <code>+inv</code> or <code>-inv</code>) will not >> be >> > > > + offloaded. CMS should consider setting >> > > <code>use_ct_inv_match</code> >> > > > + to <code>false</code> in such cases. This results in a >> side >> > > effect >> > > > + of the invalid packets getting delivered to the >> destination >> > > VIF, which >> > > > + otherwise would have been dropped by <code>OVN</code>. >> > > > + </p> >> > > > + </column> >> > > > + >> > > > <group title="Options for configuring interconnection route >> > > advertisement"> >> > > > <p> >> > > > These options control how routes are advertised between >> OVN >> > > > -- >> > > > 2.29.2 >> > > > >> > > > _______________________________________________ >> > > > dev mailing list >> > > > dev@openvswitch.org >> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > > >> > > Acked-by: Han Zhou <hzhou@ovn.org> >> > >> > Thanks for the reviews Mark G and Han. >> > >> > I applied both the patches to the main branch addressing Han's comment >> in >> p1. >> > When I submitted this patch, I had a test case which somehow got removed >> > in the later versions. I added the test case before applying. >> > >> > Regards >> > Numan >> >> Hi Numan, >> >> Seems this merge breaks the CI. All these vtep related cases are failing: >> 898: ovn-controller-vtep - chassis FAILED (ovn-controller-vtep.at:110) >> 899: ovn-controller-vtep - binding 1 FAILED (ovn-controller-vtep.at:178) >> 900: ovn-controller-vtep - binding 2 FAILED (ovn-controller-vtep.at:252) >> 901: ovn-controller-vtep - vtep-lswitch FAILED ( >> ovn-controller-vtep.at:291) >> 902: ovn-controller-vtep - vtep-macs 1 FAILED (ovn-controller-vtep.at:343 >> ) >> 903: ovn-controller-vtep - vtep-macs 2 FAILED (ovn-controller-vtep.at:431 >> ) >> >> All these tests are successful when I run them locally. I haven't checked >> more details yet. >> > > Hi Han, > > I noticed that. In fact it's failing without these patches too. > > I quickly looked into it and looks like it's failing because of some > warning message related to dns_resolve in ovsdb-server log. > > I think wen need to white list this warning message in vtep tests. > > > Thanks > Numan > > ok. Thanks for checking! Do you happen to know what change introduced this? Also I wonder why I don't see these failures locally. Is it test server's problem (regarding DNS?) > > >> >> >> Thanks, >> Han >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >
On Thu, May 6, 2021, 9:17 PM Han Zhou <hzhou@ovn.org> wrote: > On Thu, May 6, 2021 at 6:13 PM Numan Siddique <numans@ovn.org> wrote: > > > > > > > On Thu, May 6, 2021, 7:00 PM Han Zhou <hzhou@ovn.org> wrote: > > > >> On Thu, May 6, 2021 at 11:41 AM Numan Siddique <numans@ovn.org> wrote: > >> > > >> > On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: > >> > > > >> > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: > >> > > > > >> > > > From: Numan Siddique <numans@ovn.org> > >> > > > > >> > > > Presently we add 65535 priority lflows in the stages - > >> > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which > >> > > > match on 'ct.inv'. > >> > > > > >> > > > This patch provides an option to not use this field in the > >> > > > logical flow matches for the following > >> > > > reasons: > >> > > > > >> > > > • Some of the smart NICs which support offloading datapath > >> > > > flows may not support this field. In such cases, almost > >> > > > all the datapath flows (matching on the ct_state field +inv > >> > > > or -inv cannot be offloaded if ACLs/load balancers > >> > > > are configured on the logical switch datapath. > >> > > > > >> > > > • A recent commit in kernel ovs datapath sets the committed > >> > > > connection tracking entry to be liberal for out-of-window > >> > > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP > >> > > > packets will not be marked as invalid. > >> > > > > >> > > > There are still certain scenarios where a packet can be marked > >> > > > invalid. Like > >> > > > • If the tcp flags are not correct. > >> > > > > >> > > > • If there are checksum errors. > >> > > > > >> > > > In such cases, the packet will be delivered to the destination > >> > > > port. So CMS should evaluate their usecases before enabling > >> > > > this option. > >> > > > > >> > > > Signed-off-by: Numan Siddique <numans@ovn.org> > >> > > > Co-authored-by: Ben Pfaff <blp@ovn.org> > >> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > >> > > > --- > >> > > > v3 -> v4 > >> > > > --- > >> > > > * Moved the static variable 'use_ct_inv_match' declaration up in > >> the > >> > > > file. > >> > > > > >> > > > v2 -> v3 > >> > > > ---- > >> > > > * Updated the documentation in ovn-nb.xml as per Mark G's > >> suggestion. > >> > > > > >> > > > v1 -> v2 > >> > > > ---- > >> > > > * Adopted the code changes shared by Ben Pfaff for the ddlog > code. > >> > > > > >> > > > NEWS | 7 +- > >> > > > northd/lswitch.dl | 7 + > >> > > > northd/ovn-northd.c | 42 +++--- > >> > > > northd/ovn_northd.dl | 304 > >> ++++++++++++++++++++++--------------------- > >> > > > ovn-nb.xml | 15 +++ > >> > > > 5 files changed, 208 insertions(+), 167 deletions(-) > >> > > > > >> > > > diff --git a/NEWS b/NEWS > >> > > > index 1ddde15f86..5f1365775d 100644 > >> > > > --- a/NEWS > >> > > > +++ b/NEWS > >> > > > @@ -6,11 +6,16 @@ Post-v21.03.0 > >> > > > expected to scale better than the C implementation, for large > >> > > deployments. > >> > > > (This may take testing and tuning to be effective.) This > >> version of > >> > > OVN > >> > > > requires DDLog 0.36. > >> > > > - - Introduce ovn-controller incremetal processing engine > >> statistics > >> > > > + - Introduce ovn-controller incremental processing engine > >> statistics > >> > > > - Introduced parallel processing in ovn-northd with the > NB_Global > >> > > config option > >> > > > 'use_parallel_build' to enable it. It is disabled by > default. > >> > > > - Support vlan-passthru mode for tag=0 localnet ports. > >> > > > - Support custom 802.11ad EthType for localnet ports. > >> > > > + - Add a new NB Global option - use_ct_inv_match with the > default > >> value > >> > > of true. > >> > > > + If this is set to false, then the logical field - "ct.inv" > will > >> not > >> > > be > >> > > > + used in the logical flow matches. CMS can consider setting > >> this > >> to > >> > > false, > >> > > > + if they want to use smart NICs which don't support offloading > >> > > datapath flows > >> > > > + with this field used. > >> > > > > >> > > > OVN v21.03.0 - 12 Mar 2021 > >> > > > ------------------------- > >> > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl > >> > > > index 47c497e0cf..27ac3cadc5 100644 > >> > > > --- a/northd/lswitch.dl > >> > > > +++ b/northd/lswitch.dl > >> > > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: > >> > > Map<string,string>, > >> > > > relation SvcMonitorMac(mac: eth_addr) > >> > > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- > >> > > > nb::NB_Global(._uuid = uuid, .options = options). > >> > > > + > >> > > > +relation UseCtInvMatch[bool] > >> > > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- > >> > > > + nb::NB_Global(.options = options). > >> > > > +UseCtInvMatch[true] :- > >> > > > + Unit(), > >> > > > + not nb in nb::NB_Global(). > >> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > >> > > > index de1aa896d3..5f6c0a6922 100644 > >> > > > --- a/northd/ovn-northd.c > >> > > > +++ b/northd/ovn-northd.c > >> > > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; > >> > > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > >> > > > static struct eth_addr svc_monitor_mac_ea; > >> > > > > >> > > > +/* If this option is 'true' northd will make use of ct.inv match > >> fields. > >> > > > + * Otherwise, it will avoid using it. The default is true. */ > >> > > > +static bool use_ct_inv_match = true; > >> > > > + > >> > > > /* Default probe interval for NB and SB DB connections. */ > >> > > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > >> > > > static int northd_probe_interval_nb = 0; > >> > > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, > bool > >> > > shared, > >> > > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); > >> > > > } > >> > > > > >> > > > - > >> > > > /* Adds a row with the specified contents to the Logical_Flow > >> table. > >> */ > >> > > > static void > >> > > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, > >> > > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct > >> hmap > >> > > *lflows, > >> > > > * for deletion (bit 0 of ct_label is set). > >> > > > * > >> > > > * This is enforced at a higher priority than ACLs can be > >> > > defined. */ > >> > > > + char *match = xasprintf("%s(ct.est && ct.rpl && > >> ct_label.blocked > >> > > == 1)", > >> > > > + use_ct_inv_match ? "ct.inv || " : > >> ""); > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > >> > > > - "ct.inv || (ct.est && ct.rpl && > >> ct_label.blocked > >> > > == 1)", > >> > > > - "drop;"); > >> > > > + match, "drop;"); > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > >> > > > - "ct.inv || (ct.est && ct.rpl && > >> ct_label.blocked > >> > > == 1)", > >> > > > - "drop;"); > >> > > > + match, "drop;"); > >> > > > + free(match); > >> > > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). > >> > > > * > >> > > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct > >> hmap > >> > > *lflows, > >> > > > * direction to hit the currently defined policy from > ACLs. > >> > > > * > >> > > > * This is enforced at a higher priority than ACLs can be > >> > > defined. */ > >> > > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " > >> > > > + "ct.rpl && ct_label.blocked == 0", > >> > > > + use_ct_inv_match ? " && !ct.inv" : ""); > >> > > > + > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > >> > > > - "&& ct.rpl && ct_label.blocked == 0", > >> > > > - "next;"); > >> > > > + match, "next;"); > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " > >> > > > - "&& ct.rpl && ct_label.blocked == 0", > >> > > > - "next;"); > >> > > > + match, "next;"); > >> > > > + free(match); > >> > > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). > >> > > > * > >> > > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct > >> hmap > >> > > *lflows, > >> > > > * a dynamically negotiated FTP data channel), but will > >> allow > >> > > > * related traffic such as an ICMP Port Unreachable > through > >> > > > * that's generated from a non-listening UDP port. */ > >> > > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " > >> > > > + "ct_label.blocked == 0", > >> > > > + use_ct_inv_match ? " && !ct.inv" : ""); > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, > >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > >> > > > - "&& ct_label.blocked == 0", > >> > > > - "next;"); > >> > > > + match, "next;"); > >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, > >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " > >> > > > - "&& ct_label.blocked == 0", > >> > > > - "next;"); > >> > > > + match, "next;"); > >> > > > + free(match); > >> > > > > >> > > > /* Ingress and Egress ACL Table (Priority 65535). > >> > > > * > >> > > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, > >> > > > > >> > > > use_logical_dp_groups = smap_get_bool(&nb->options, > >> > > > > "use_logical_dp_groups", > >> > > false); > >> > > > + use_ct_inv_match = smap_get_bool(&nb->options, > >> > > > + "use_ct_inv_match", true); > >> > > > + > >> > > > /* deprecated, use --event instead */ > >> > > > controller_event_en = smap_get_bool(&nb->options, > >> > > > "controller_event", > false); > >> > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl > >> > > > index 2f5d36c599..a0f7944103 100644 > >> > > > --- a/northd/ovn_northd.dl > >> > > > +++ b/northd/ovn_northd.dl > >> > > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, > acl, > >> > > fair_meter, extra_match_, extra_action > >> > > > } > >> > > > > >> > > > /* build_acls */ > >> > > > -for (sw in &Switch(.ls = ls)) > >> > > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > >> > > > -{ > >> > > > - /* Ingress and Egress ACL Table (Priority 0): Packets are > >> allowed by > >> > > > - * default. A related rule at priority 1 is added below if > >> there > >> > > > - * are any stateful ACLs in this datapath. */ > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 0, > >> > > > - .__match = "1", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 0, > >> > > > - .__match = "1", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()); > >> > > > - > >> > > > - if (has_stateful) { > >> > > > - /* Ingress and Egress ACL Table (Priority 1). > >> > > > - * > >> > > > - * By default, traffic is allowed. This is partially > >> handled by > >> > > > - * the Priority 0 ACL flows added earlier, but we also > need > >> to > >> > > > - * commit IP flows. This is because, while the > initiater's > >> > > > - * direction may not have any stateful rules, the > server's > >> may > >> > > > - * and then its return traffic would not have an > associated > >> > > > - * conntrack entry and would return "+invalid". > >> > > > - * > >> > > > - * We use "ct_commit" for a connection that is not > already > >> known > >> > > > - * by the connection tracker. Once a connection is > >> committed, > >> > > > - * subsequent packets will hit the flow at priority 0 > that > >> just > >> > > > - * uses "next;" > >> > > > - * > >> > > > - * We also check for established connections that have > >> > > ct_label.blocked > >> > > > - * set on them. That's a connection that was disallowed, > >> but is > >> > > > - * now allowed by policy again since it hit this > >> default-allow > >> > > flow. > >> > > > - * We need to set ct_label.blocked=0 to let the > connection > >> > > continue, > >> > > > - * which will be done by ct_commit() in the "stateful" > >> stage. > >> > > > - * Subsequent packets will hit the flow at priority 0 > that > >> just > >> > > > - * uses "next;". */ > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 1, > >> > > > - .__match = "ip && (!ct.est || (ct.est && > >> > > ct_label.blocked == 1))", > >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = > 1; > >> > > next;", > >> > > > - .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 1, > >> > > > - .__match = "ip && (!ct.est || (ct.est && > >> > > ct_label.blocked == 1))", > >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = > 1; > >> > > next;", > >> > > > - .external_ids = map_empty()); > >> > > > - > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). > >> > > > - * > >> > > > - * Always drop traffic that's in an invalid state. Also > >> drop > >> > > > - * reply direction packets for connections that have been > >> marked > >> > > > - * for deletion (bit 0 of ct_label is set). > >> > > > - * > >> > > > - * This is enforced at a higher priority than ACLs can be > >> > > defined. */ > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && > >> > > ct_label.blocked == 1)", > >> > > > - .actions = "drop;", > >> > > > - .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && > >> > > ct_label.blocked == 1)", > >> > > > - .actions = "drop;", > >> > > > - .external_ids = map_empty()); > >> > > > - > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). > >> > > > - * > >> > > > - * Allow reply traffic that is part of an established > >> > > > - * conntrack entry that has not been marked for deletion > >> > > > - * (bit 0 of ct_label). We only match traffic in the > >> > > > - * reply direction because we want traffic in the request > >> > > > - * direction to hit the currently defined policy from > ACLs. > >> > > > - * > >> > > > - * This is enforced at a higher priority than ACLs can be > >> > > defined. */ > >> > > > +for (UseCtInvMatch[use_ct_inv_match]) { > >> > > > + (var ct_inv_or, var and_not_ct_inv) = match > (use_ct_inv_match) > >> { > >> > > > + true -> ("ct.inv || ", "&& !ct.inv "), > >> > > > + false -> ("", ""), > >> > > > + } in > >> > > > + for (sw in &Switch(.ls = ls)) > >> > > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in > >> > > > + { > >> > > > + /* Ingress and Egress ACL Table (Priority 0): Packets are > >> > > allowed by > >> > > > + * default. A related rule at priority 1 is added below > if > >> there > >> > > > + * are any stateful ACLs in this datapath. */ > >> > > > Flow(.logical_datapath = ls._uuid, > >> > > > .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "ct.est && !ct.rel && !ct.new && > >> > > !ct.inv " > >> > > > - "&& ct.rpl && ct_label.blocked > == > >> 0", > >> > > > + .priority = 0, > >> > > > + .__match = "1", > >> > > > .actions = "next;", > >> > > > .external_ids = map_empty()); > >> > > > Flow(.logical_datapath = ls._uuid, > >> > > > .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "ct.est && !ct.rel && !ct.new && > >> > > !ct.inv " > >> > > > - "&& ct.rpl && ct_label.blocked > == > >> 0", > >> > > > + .priority = 0, > >> > > > + .__match = "1", > >> > > > .actions = "next;", > >> > > > .external_ids = map_empty()); > >> > > > > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). > >> > > > - * > >> > > > - * Allow traffic that is related to an existing conntrack > >> entry > >> > > that > >> > > > - * has not been marked for deletion (bit 0 of ct_label). > >> > > > - * > >> > > > - * This is enforced at a higher priority than ACLs can be > >> > > defined. > >> > > > - * > >> > > > - * NOTE: This does not support related data sessions (eg, > >> > > > - * a dynamically negotiated FTP data channel), but will > >> allow > >> > > > - * related traffic such as an ICMP Port Unreachable > through > >> > > > - * that's generated from a non-listening UDP port. */ > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "!ct.est && ct.rel && !ct.new && > >> > > !ct.inv " > >> > > > - "&& ct_label.blocked == 0", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "!ct.est && ct.rel && !ct.new && > >> > > !ct.inv " > >> > > > - "&& ct_label.blocked == 0", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()); > >> > > > + if (has_stateful) { > >> > > > + /* Ingress and Egress ACL Table (Priority 1). > >> > > > + * > >> > > > + * By default, traffic is allowed. This is partially > >> > > handled by > >> > > > + * the Priority 0 ACL flows added earlier, but we > also > >> need > >> > > to > >> > > > + * commit IP flows. This is because, while the > >> initiater's > >> > > > + * direction may not have any stateful rules, the > >> server's > >> > > may > >> > > > + * and then its return traffic would not have an > >> associated > >> > > > + * conntrack entry and would return "+invalid". > >> > > > + * > >> > > > + * We use "ct_commit" for a connection that is not > >> already > >> > > known > >> > > > + * by the connection tracker. Once a connection is > >> > > committed, > >> > > > + * subsequent packets will hit the flow at priority 0 > >> that > >> > > just > >> > > > + * uses "next;" > >> > > > + * > >> > > > + * We also check for established connections that > have > >> > > ct_label.blocked > >> > > > + * set on them. That's a connection that was > >> disallowed, > >> > > but is > >> > > > + * now allowed by policy again since it hit this > >> > > default-allow flow. > >> > > > + * We need to set ct_label.blocked=0 to let the > >> connection > >> > > continue, > >> > > > + * which will be done by ct_commit() in the > "stateful" > >> stage. > >> > > > + * Subsequent packets will hit the flow at priority 0 > >> that > >> > > just > >> > > > + * uses "next;". */ > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_IN_ACL(), > >> > > > + .priority = 1, > >> > > > + .__match = "ip && (!ct.est || (ct.est > && > >> > > ct_label.blocked == 1))", > >> > > > + .actions = > "${rEGBIT_CONNTRACK_COMMIT()} > >> = > >> 1; > >> > > next;", > >> > > > + .external_ids = map_empty()); > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 1, > >> > > > + .__match = "ip && (!ct.est || (ct.est > && > >> > > ct_label.blocked == 1))", > >> > > > + .actions = > "${rEGBIT_CONNTRACK_COMMIT()} > >> = > >> 1; > >> > > next;", > >> > > > + .external_ids = map_empty()); > >> > > > > >> > > > - /* Ingress and Egress ACL Table (Priority 65535). > >> > > > - * > >> > > > - * Not to do conntrack on ND packets. */ > >> > > > + /* Ingress and Egress ACL Table (Priority 65535). > >> > > > + * > >> > > > + * Always drop traffic that's in an invalid state. > >> Also > >> drop > >> > > > + * reply direction packets for connections that have > >> been > >> > > marked > >> > > > + * for deletion (bit 0 of ct_label is set). > >> > > > + * > >> > > > + * This is enforced at a higher priority than ACLs > can > >> be > >> > > defined. */ > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_IN_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = ct_inv_or ++ "(ct.est && > >> ct.rpl > >> && > >> > > ct_label.blocked == 1)", > >> > > > + .actions = "drop;", > >> > > > + .external_ids = map_empty()); > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = ct_inv_or ++ "(ct.est && > >> ct.rpl > >> && > >> > > ct_label.blocked == 1)", > >> > > > + .actions = "drop;", > >> > > > + .external_ids = map_empty()); > >> > > > + > >> > > > + /* Ingress and Egress ACL Table (Priority 65535). > >> > > > + * > >> > > > + * Allow reply traffic that is part of an established > >> > > > + * conntrack entry that has not been marked for > >> deletion > >> > > > + * (bit 0 of ct_label). We only match traffic in the > >> > > > + * reply direction because we want traffic in the > >> request > >> > > > + * direction to hit the currently defined policy from > >> ACLs. > >> > > > + * > >> > > > + * This is enforced at a higher priority than ACLs > can > >> be > >> > > defined. */ > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_IN_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "ct.est && !ct.rel && > !ct.new > >> " > >> ++ > >> > > and_not_ct_inv ++ > >> > > > + "&& ct.rpl && > ct_label.blocked > >> == > >> > > 0", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()); > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "ct.est && !ct.rel && > !ct.new > >> " > >> ++ > >> > > and_not_ct_inv ++ > >> > > > + "&& ct.rpl && > ct_label.blocked > >> == > >> > > 0", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()); > >> > > > + > >> > > > + /* Ingress and Egress ACL Table (Priority 65535). > >> > > > + * > >> > > > + * Allow traffic that is related to an existing > >> conntrack > >> > > entry that > >> > > > + * has not been marked for deletion (bit 0 of > >> ct_label). > >> > > > + * > >> > > > + * This is enforced at a higher priority than ACLs > can > >> be > >> > > defined. > >> > > > + * > >> > > > + * NOTE: This does not support related data sessions > >> (eg, > >> > > > + * a dynamically negotiated FTP data channel), but > will > >> allow > >> > > > + * related traffic such as an ICMP Port Unreachable > >> through > >> > > > + * that's generated from a non-listening UDP port. > */ > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_IN_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "!ct.est && ct.rel && > !ct.new > >> " > >> ++ > >> > > and_not_ct_inv ++ > >> > > > + "&& ct_label.blocked == 0", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()); > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "!ct.est && ct.rel && > !ct.new > >> " > >> ++ > >> > > and_not_ct_inv ++ > >> > > > + "&& ct_label.blocked == 0", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()); > >> > > > + > >> > > > + /* Ingress and Egress ACL Table (Priority 65535). > >> > > > + * > >> > > > + * Not to do conntrack on ND packets. */ > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_IN_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "nd || nd_ra || nd_rs || > mldv1 > >> || > >> > > mldv2", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()); > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 65535, > >> > > > + .__match = "nd || nd_ra || nd_rs || > mldv1 > >> || > >> > > mldv2", > >> > > > + .actions = "next;", > >> > > > + .external_ids = map_empty()) > >> > > > + }; > >> > > > + > >> > > > + /* Add a 34000 priority flow to advance the DNS reply > from > >> > > ovn-controller, > >> > > > + * if the CMS has configured DNS records for the > datapath. > >> > > > + */ > >> > > > + if (sw.has_dns_records) { > >> > > > + Flow(.logical_datapath = ls._uuid, > >> > > > + .stage = s_SWITCH_OUT_ACL(), > >> > > > + .priority = 34000, > >> > > > + .__match = "udp.src == 53", > >> > > > + .actions = if has_stateful "ct_commit; > >> next;" > >> > > else "next;", > >> > > > + .external_ids = map_empty()) > >> > > > + }; > >> > > > + > >> > > > + /* Add a 34000 priority flow to advance the service > monitor > >> reply > >> > > > + * packets to skip applying ingress ACLs. */ > >> > > > Flow(.logical_datapath = ls._uuid, > >> > > > .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 > || > >> > > mldv2", > >> > > > + .priority = 34000, > >> > > > + .__match = "eth.dst == $svc_monitor_mac", > >> > > > .actions = "next;", > >> > > > .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 65535, > >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 > || > >> > > mldv2", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()) > >> > > > - }; > >> > > > - > >> > > > - /* Add a 34000 priority flow to advance the DNS reply from > >> > > ovn-controller, > >> > > > - * if the CMS has configured DNS records for the datapath. > >> > > > - */ > >> > > > - if (sw.has_dns_records) { > >> > > > Flow(.logical_datapath = ls._uuid, > >> > > > .stage = s_SWITCH_OUT_ACL(), > >> > > > .priority = 34000, > >> > > > - .__match = "udp.src == 53", > >> > > > - .actions = if has_stateful "ct_commit; > next;" > >> else > >> > > "next;", > >> > > > + .__match = "eth.src == $svc_monitor_mac", > >> > > > + .actions = "next;", > >> > > > .external_ids = map_empty()) > >> > > > - }; > >> > > > - > >> > > > - /* Add a 34000 priority flow to advance the service monitor > >> reply > >> > > > - * packets to skip applying ingress ACLs. */ > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_IN_ACL(), > >> > > > - .priority = 34000, > >> > > > - .__match = "eth.dst == $svc_monitor_mac", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()); > >> > > > - Flow(.logical_datapath = ls._uuid, > >> > > > - .stage = s_SWITCH_OUT_ACL(), > >> > > > - .priority = 34000, > >> > > > - .__match = "eth.src == $svc_monitor_mac", > >> > > > - .actions = "next;", > >> > > > - .external_ids = map_empty()) > >> > > > + } > >> > > > } > >> > > > > >> > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on > >> various > >> > > > diff --git a/ovn-nb.xml b/ovn-nb.xml > >> > > > index feb38b5d31..e337be4eb3 100644 > >> > > > --- a/ovn-nb.xml > >> > > > +++ b/ovn-nb.xml > >> > > > @@ -240,6 +240,21 @@ > >> > > > </p> > >> > > > </column> > >> > > > > >> > > > + <column name="options" key="use_ct_inv_match"> > >> > > > + <p> > >> > > > + If set to false, <code>ovn-northd</code> will not use > the > >> > > > + <code>ct.inv</code> field in any of the logical flow > >> matches. > >> > > > + The default value is true. If the NIC supports > >> offloading > >> > > > + OVS datapath flows but doesn't support offloading > >> ct_state > >> > > > + <code>inv</code> flag, then the datapath flows matching > >> on > >> > > this flag > >> > > > + (either <code>+inv</code> or <code>-inv</code>) will > not > >> be > >> > > > + offloaded. CMS should consider setting > >> > > <code>use_ct_inv_match</code> > >> > > > + to <code>false</code> in such cases. This results in a > >> side > >> > > effect > >> > > > + of the invalid packets getting delivered to the > >> destination > >> > > VIF, which > >> > > > + otherwise would have been dropped by <code>OVN</code>. > >> > > > + </p> > >> > > > + </column> > >> > > > + > >> > > > <group title="Options for configuring interconnection route > >> > > advertisement"> > >> > > > <p> > >> > > > These options control how routes are advertised between > >> OVN > >> > > > -- > >> > > > 2.29.2 > >> > > > > >> > > > _______________________________________________ > >> > > > dev mailing list > >> > > > dev@openvswitch.org > >> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > >> > > Acked-by: Han Zhou <hzhou@ovn.org> > >> > > >> > Thanks for the reviews Mark G and Han. > >> > > >> > I applied both the patches to the main branch addressing Han's comment > >> in > >> p1. > >> > When I submitted this patch, I had a test case which somehow got > removed > >> > in the later versions. I added the test case before applying. > >> > > >> > Regards > >> > Numan > >> > >> Hi Numan, > >> > >> Seems this merge breaks the CI. All these vtep related cases are > failing: > >> 898: ovn-controller-vtep - chassis FAILED (ovn-controller-vtep.at:110) > >> 899: ovn-controller-vtep - binding 1 FAILED (ovn-controller-vtep.at:178 > ) > >> 900: ovn-controller-vtep - binding 2 FAILED (ovn-controller-vtep.at:252 > ) > >> 901: ovn-controller-vtep - vtep-lswitch FAILED ( > >> ovn-controller-vtep.at:291) > >> 902: ovn-controller-vtep - vtep-macs 1 FAILED ( > ovn-controller-vtep.at:343 > >> ) > >> 903: ovn-controller-vtep - vtep-macs 2 FAILED ( > ovn-controller-vtep.at:431 > >> ) > >> > >> All these tests are successful when I run them locally. I haven't > checked > >> more details yet. > >> > > > > Hi Han, > > > > I noticed that. In fact it's failing without these patches too. > > > > I quickly looked into it and looks like it's failing because of some > > warning message related to dns_resolve in ovsdb-server log. > > > > I think wen need to white list this warning message in vtep tests. > > > > > > Thanks > > Numan > > > > ok. Thanks for checking! Do you happen to know what change introduced > this? Also I wonder why I don't see these failures locally. Is it test > server's problem (regarding DNS?) > I didn't get a chance to look further. If you're planning to look into it then please do so. Thanks Numan > > > > > >> > >> > >> Thanks, > >> Han > >> _______________________________________________ > >> dev mailing list > >> dev@openvswitch.org > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Thu, May 6, 2021 at 7:21 PM Numan Siddique <numans@ovn.org> wrote: > > > On Thu, May 6, 2021, 9:17 PM Han Zhou <hzhou@ovn.org> wrote: > >> On Thu, May 6, 2021 at 6:13 PM Numan Siddique <numans@ovn.org> wrote: >> >> > >> > >> > On Thu, May 6, 2021, 7:00 PM Han Zhou <hzhou@ovn.org> wrote: >> > >> >> On Thu, May 6, 2021 at 11:41 AM Numan Siddique <numans@ovn.org> wrote: >> >> > >> >> > On Thu, May 6, 2021 at 2:35 AM Han Zhou <hzhou@ovn.org> wrote: >> >> > > >> >> > > On Thu, Apr 22, 2021 at 6:25 AM <numans@ovn.org> wrote: >> >> > > > >> >> > > > From: Numan Siddique <numans@ovn.org> >> >> > > > >> >> > > > Presently we add 65535 priority lflows in the stages - >> >> > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which >> >> > > > match on 'ct.inv'. >> >> > > > >> >> > > > This patch provides an option to not use this field in the >> >> > > > logical flow matches for the following >> >> > > > reasons: >> >> > > > >> >> > > > • Some of the smart NICs which support offloading datapath >> >> > > > flows may not support this field. In such cases, almost >> >> > > > all the datapath flows (matching on the ct_state field +inv >> >> > > > or -inv cannot be offloaded if ACLs/load balancers >> >> > > > are configured on the logical switch datapath. >> >> > > > >> >> > > > • A recent commit in kernel ovs datapath sets the committed >> >> > > > connection tracking entry to be liberal for out-of-window >> >> > > > tcp packets (nf_ct_set_tcp_be_liberal()). Such TCP >> >> > > > packets will not be marked as invalid. >> >> > > > >> >> > > > There are still certain scenarios where a packet can be marked >> >> > > > invalid. Like >> >> > > > • If the tcp flags are not correct. >> >> > > > >> >> > > > • If there are checksum errors. >> >> > > > >> >> > > > In such cases, the packet will be delivered to the destination >> >> > > > port. So CMS should evaluate their usecases before enabling >> >> > > > this option. >> >> > > > >> >> > > > Signed-off-by: Numan Siddique <numans@ovn.org> >> >> > > > Co-authored-by: Ben Pfaff <blp@ovn.org> >> >> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> >> >> > > > --- >> >> > > > v3 -> v4 >> >> > > > --- >> >> > > > * Moved the static variable 'use_ct_inv_match' declaration up >> in >> >> the >> >> > > > file. >> >> > > > >> >> > > > v2 -> v3 >> >> > > > ---- >> >> > > > * Updated the documentation in ovn-nb.xml as per Mark G's >> >> suggestion. >> >> > > > >> >> > > > v1 -> v2 >> >> > > > ---- >> >> > > > * Adopted the code changes shared by Ben Pfaff for the ddlog >> code. >> >> > > > >> >> > > > NEWS | 7 +- >> >> > > > northd/lswitch.dl | 7 + >> >> > > > northd/ovn-northd.c | 42 +++--- >> >> > > > northd/ovn_northd.dl | 304 >> >> ++++++++++++++++++++++--------------------- >> >> > > > ovn-nb.xml | 15 +++ >> >> > > > 5 files changed, 208 insertions(+), 167 deletions(-) >> >> > > > >> >> > > > diff --git a/NEWS b/NEWS >> >> > > > index 1ddde15f86..5f1365775d 100644 >> >> > > > --- a/NEWS >> >> > > > +++ b/NEWS >> >> > > > @@ -6,11 +6,16 @@ Post-v21.03.0 >> >> > > > expected to scale better than the C implementation, for >> large >> >> > > deployments. >> >> > > > (This may take testing and tuning to be effective.) This >> >> version of >> >> > > OVN >> >> > > > requires DDLog 0.36. >> >> > > > - - Introduce ovn-controller incremetal processing engine >> >> statistics >> >> > > > + - Introduce ovn-controller incremental processing engine >> >> statistics >> >> > > > - Introduced parallel processing in ovn-northd with the >> NB_Global >> >> > > config option >> >> > > > 'use_parallel_build' to enable it. It is disabled by >> default. >> >> > > > - Support vlan-passthru mode for tag=0 localnet ports. >> >> > > > - Support custom 802.11ad EthType for localnet ports. >> >> > > > + - Add a new NB Global option - use_ct_inv_match with the >> default >> >> value >> >> > > of true. >> >> > > > + If this is set to false, then the logical field - "ct.inv" >> will >> >> not >> >> > > be >> >> > > > + used in the logical flow matches. CMS can consider setting >> >> this >> >> to >> >> > > false, >> >> > > > + if they want to use smart NICs which don't support >> offloading >> >> > > datapath flows >> >> > > > + with this field used. >> >> > > > >> >> > > > OVN v21.03.0 - 12 Mar 2021 >> >> > > > ------------------------- >> >> > > > diff --git a/northd/lswitch.dl b/northd/lswitch.dl >> >> > > > index 47c497e0cf..27ac3cadc5 100644 >> >> > > > --- a/northd/lswitch.dl >> >> > > > +++ b/northd/lswitch.dl >> >> > > > @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: >> >> > > Map<string,string>, >> >> > > > relation SvcMonitorMac(mac: eth_addr) >> >> > > > SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- >> >> > > > nb::NB_Global(._uuid = uuid, .options = options). >> >> > > > + >> >> > > > +relation UseCtInvMatch[bool] >> >> > > > +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- >> >> > > > + nb::NB_Global(.options = options). >> >> > > > +UseCtInvMatch[true] :- >> >> > > > + Unit(), >> >> > > > + not nb in nb::NB_Global(). >> >> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >> >> > > > index de1aa896d3..5f6c0a6922 100644 >> >> > > > --- a/northd/ovn-northd.c >> >> > > > +++ b/northd/ovn-northd.c >> >> > > > @@ -98,6 +98,10 @@ static bool check_lsp_is_up; >> >> > > > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; >> >> > > > static struct eth_addr svc_monitor_mac_ea; >> >> > > > >> >> > > > +/* If this option is 'true' northd will make use of ct.inv match >> >> fields. >> >> > > > + * Otherwise, it will avoid using it. The default is true. */ >> >> > > > +static bool use_ct_inv_match = true; >> >> > > > + >> >> > > > /* Default probe interval for NB and SB DB connections. */ >> >> > > > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 >> >> > > > static int northd_probe_interval_nb = 0; >> >> > > > @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, >> bool >> >> > > shared, >> >> > > > hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); >> >> > > > } >> >> > > > >> >> > > > - >> >> > > > /* Adds a row with the specified contents to the Logical_Flow >> >> table. >> >> */ >> >> > > > static void >> >> > > > ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath >> *od, >> >> > > > @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, >> struct >> >> hmap >> >> > > *lflows, >> >> > > > * for deletion (bit 0 of ct_label is set). >> >> > > > * >> >> > > > * This is enforced at a higher priority than ACLs can >> be >> >> > > defined. */ >> >> > > > + char *match = xasprintf("%s(ct.est && ct.rpl && >> >> ct_label.blocked >> >> > > == 1)", >> >> > > > + use_ct_inv_match ? "ct.inv || " >> : >> >> ""); >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> >> > > > - "ct.inv || (ct.est && ct.rpl && >> >> ct_label.blocked >> >> > > == 1)", >> >> > > > - "drop;"); >> >> > > > + match, "drop;"); >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> >> > > > - "ct.inv || (ct.est && ct.rpl && >> >> ct_label.blocked >> >> > > == 1)", >> >> > > > - "drop;"); >> >> > > > + match, "drop;"); >> >> > > > + free(match); >> >> > > > >> >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > * >> >> > > > @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, >> struct >> >> hmap >> >> > > *lflows, >> >> > > > * direction to hit the currently defined policy from >> ACLs. >> >> > > > * >> >> > > > * This is enforced at a higher priority than ACLs can >> be >> >> > > defined. */ >> >> > > > + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " >> >> > > > + "ct.rpl && ct_label.blocked == 0", >> >> > > > + use_ct_inv_match ? " && !ct.inv" : >> ""); >> >> > > > + >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " >> >> > > > - "&& ct.rpl && ct_label.blocked == 0", >> >> > > > - "next;"); >> >> > > > + match, "next;"); >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> >> > > > - "ct.est && !ct.rel && !ct.new && !ct.inv " >> >> > > > - "&& ct.rpl && ct_label.blocked == 0", >> >> > > > - "next;"); >> >> > > > + match, "next;"); >> >> > > > + free(match); >> >> > > > >> >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > * >> >> > > > @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, >> struct >> >> hmap >> >> > > *lflows, >> >> > > > * a dynamically negotiated FTP data channel), but will >> >> allow >> >> > > > * related traffic such as an ICMP Port Unreachable >> through >> >> > > > * that's generated from a non-listening UDP port. */ >> >> > > > + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " >> >> > > > + "ct_label.blocked == 0", >> >> > > > + use_ct_inv_match ? " && !ct.inv" : >> ""); >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, >> >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " >> >> > > > - "&& ct_label.blocked == 0", >> >> > > > - "next;"); >> >> > > > + match, "next;"); >> >> > > > ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, >> >> > > > - "!ct.est && ct.rel && !ct.new && !ct.inv " >> >> > > > - "&& ct_label.blocked == 0", >> >> > > > - "next;"); >> >> > > > + match, "next;"); >> >> > > > + free(match); >> >> > > > >> >> > > > /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > * >> >> > > > @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, >> >> > > > >> >> > > > use_logical_dp_groups = smap_get_bool(&nb->options, >> >> > > > >> "use_logical_dp_groups", >> >> > > false); >> >> > > > + use_ct_inv_match = smap_get_bool(&nb->options, >> >> > > > + "use_ct_inv_match", true); >> >> > > > + >> >> > > > /* deprecated, use --event instead */ >> >> > > > controller_event_en = smap_get_bool(&nb->options, >> >> > > > "controller_event", >> false); >> >> > > > diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl >> >> > > > index 2f5d36c599..a0f7944103 100644 >> >> > > > --- a/northd/ovn_northd.dl >> >> > > > +++ b/northd/ovn_northd.dl >> >> > > > @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, >> acl, >> >> > > fair_meter, extra_match_, extra_action >> >> > > > } >> >> > > > >> >> > > > /* build_acls */ >> >> > > > -for (sw in &Switch(.ls = ls)) >> >> > > > -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in >> >> > > > -{ >> >> > > > - /* Ingress and Egress ACL Table (Priority 0): Packets are >> >> allowed by >> >> > > > - * default. A related rule at priority 1 is added below if >> >> there >> >> > > > - * are any stateful ACLs in this datapath. */ >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 0, >> >> > > > - .__match = "1", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 0, >> >> > > > - .__match = "1", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - >> >> > > > - if (has_stateful) { >> >> > > > - /* Ingress and Egress ACL Table (Priority 1). >> >> > > > - * >> >> > > > - * By default, traffic is allowed. This is partially >> >> handled by >> >> > > > - * the Priority 0 ACL flows added earlier, but we also >> need >> >> to >> >> > > > - * commit IP flows. This is because, while the >> initiater's >> >> > > > - * direction may not have any stateful rules, the >> server's >> >> may >> >> > > > - * and then its return traffic would not have an >> associated >> >> > > > - * conntrack entry and would return "+invalid". >> >> > > > - * >> >> > > > - * We use "ct_commit" for a connection that is not >> already >> >> known >> >> > > > - * by the connection tracker. Once a connection is >> >> committed, >> >> > > > - * subsequent packets will hit the flow at priority 0 >> that >> >> just >> >> > > > - * uses "next;" >> >> > > > - * >> >> > > > - * We also check for established connections that have >> >> > > ct_label.blocked >> >> > > > - * set on them. That's a connection that was >> disallowed, >> >> but is >> >> > > > - * now allowed by policy again since it hit this >> >> default-allow >> >> > > flow. >> >> > > > - * We need to set ct_label.blocked=0 to let the >> connection >> >> > > continue, >> >> > > > - * which will be done by ct_commit() in the "stateful" >> >> stage. >> >> > > > - * Subsequent packets will hit the flow at priority 0 >> that >> >> just >> >> > > > - * uses "next;". */ >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 1, >> >> > > > - .__match = "ip && (!ct.est || (ct.est && >> >> > > ct_label.blocked == 1))", >> >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} >> = 1; >> >> > > next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 1, >> >> > > > - .__match = "ip && (!ct.est || (ct.est && >> >> > > ct_label.blocked == 1))", >> >> > > > - .actions = "${rEGBIT_CONNTRACK_COMMIT()} >> = 1; >> >> > > next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - >> >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > - * >> >> > > > - * Always drop traffic that's in an invalid state. Also >> >> drop >> >> > > > - * reply direction packets for connections that have >> been >> >> marked >> >> > > > - * for deletion (bit 0 of ct_label is set). >> >> > > > - * >> >> > > > - * This is enforced at a higher priority than ACLs can >> be >> >> > > defined. */ >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && >> >> > > ct_label.blocked == 1)", >> >> > > > - .actions = "drop;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "ct.inv || (ct.est && ct.rpl && >> >> > > ct_label.blocked == 1)", >> >> > > > - .actions = "drop;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - >> >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > - * >> >> > > > - * Allow reply traffic that is part of an established >> >> > > > - * conntrack entry that has not been marked for deletion >> >> > > > - * (bit 0 of ct_label). We only match traffic in the >> >> > > > - * reply direction because we want traffic in the >> request >> >> > > > - * direction to hit the currently defined policy from >> ACLs. >> >> > > > - * >> >> > > > - * This is enforced at a higher priority than ACLs can >> be >> >> > > defined. */ >> >> > > > +for (UseCtInvMatch[use_ct_inv_match]) { >> >> > > > + (var ct_inv_or, var and_not_ct_inv) = match >> (use_ct_inv_match) >> >> { >> >> > > > + true -> ("ct.inv || ", "&& !ct.inv "), >> >> > > > + false -> ("", ""), >> >> > > > + } in >> >> > > > + for (sw in &Switch(.ls = ls)) >> >> > > > + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in >> >> > > > + { >> >> > > > + /* Ingress and Egress ACL Table (Priority 0): Packets >> are >> >> > > allowed by >> >> > > > + * default. A related rule at priority 1 is added >> below if >> >> there >> >> > > > + * are any stateful ACLs in this datapath. */ >> >> > > > Flow(.logical_datapath = ls._uuid, >> >> > > > .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "ct.est && !ct.rel && !ct.new >> && >> >> > > !ct.inv " >> >> > > > - "&& ct.rpl && ct_label.blocked >> == >> >> 0", >> >> > > > + .priority = 0, >> >> > > > + .__match = "1", >> >> > > > .actions = "next;", >> >> > > > .external_ids = map_empty()); >> >> > > > Flow(.logical_datapath = ls._uuid, >> >> > > > .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "ct.est && !ct.rel && !ct.new >> && >> >> > > !ct.inv " >> >> > > > - "&& ct.rpl && ct_label.blocked >> == >> >> 0", >> >> > > > + .priority = 0, >> >> > > > + .__match = "1", >> >> > > > .actions = "next;", >> >> > > > .external_ids = map_empty()); >> >> > > > >> >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > - * >> >> > > > - * Allow traffic that is related to an existing >> conntrack >> >> entry >> >> > > that >> >> > > > - * has not been marked for deletion (bit 0 of ct_label). >> >> > > > - * >> >> > > > - * This is enforced at a higher priority than ACLs can >> be >> >> > > defined. >> >> > > > - * >> >> > > > - * NOTE: This does not support related data sessions >> (eg, >> >> > > > - * a dynamically negotiated FTP data channel), but will >> >> allow >> >> > > > - * related traffic such as an ICMP Port Unreachable >> through >> >> > > > - * that's generated from a non-listening UDP port. */ >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "!ct.est && ct.rel && !ct.new >> && >> >> > > !ct.inv " >> >> > > > - "&& ct_label.blocked == 0", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "!ct.est && ct.rel && !ct.new >> && >> >> > > !ct.inv " >> >> > > > - "&& ct_label.blocked == 0", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > + if (has_stateful) { >> >> > > > + /* Ingress and Egress ACL Table (Priority 1). >> >> > > > + * >> >> > > > + * By default, traffic is allowed. This is >> partially >> >> > > handled by >> >> > > > + * the Priority 0 ACL flows added earlier, but we >> also >> >> need >> >> > > to >> >> > > > + * commit IP flows. This is because, while the >> >> initiater's >> >> > > > + * direction may not have any stateful rules, the >> >> server's >> >> > > may >> >> > > > + * and then its return traffic would not have an >> >> associated >> >> > > > + * conntrack entry and would return "+invalid". >> >> > > > + * >> >> > > > + * We use "ct_commit" for a connection that is not >> >> already >> >> > > known >> >> > > > + * by the connection tracker. Once a connection is >> >> > > committed, >> >> > > > + * subsequent packets will hit the flow at priority >> 0 >> >> that >> >> > > just >> >> > > > + * uses "next;" >> >> > > > + * >> >> > > > + * We also check for established connections that >> have >> >> > > ct_label.blocked >> >> > > > + * set on them. That's a connection that was >> >> disallowed, >> >> > > but is >> >> > > > + * now allowed by policy again since it hit this >> >> > > default-allow flow. >> >> > > > + * We need to set ct_label.blocked=0 to let the >> >> connection >> >> > > continue, >> >> > > > + * which will be done by ct_commit() in the >> "stateful" >> >> stage. >> >> > > > + * Subsequent packets will hit the flow at priority >> 0 >> >> that >> >> > > just >> >> > > > + * uses "next;". */ >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_IN_ACL(), >> >> > > > + .priority = 1, >> >> > > > + .__match = "ip && (!ct.est || (ct.est >> && >> >> > > ct_label.blocked == 1))", >> >> > > > + .actions = >> "${rEGBIT_CONNTRACK_COMMIT()} >> >> = >> >> 1; >> >> > > next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 1, >> >> > > > + .__match = "ip && (!ct.est || (ct.est >> && >> >> > > ct_label.blocked == 1))", >> >> > > > + .actions = >> "${rEGBIT_CONNTRACK_COMMIT()} >> >> = >> >> 1; >> >> > > next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > >> >> > > > - /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > - * >> >> > > > - * Not to do conntrack on ND packets. */ >> >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > + * >> >> > > > + * Always drop traffic that's in an invalid state. >> >> Also >> >> drop >> >> > > > + * reply direction packets for connections that have >> >> been >> >> > > marked >> >> > > > + * for deletion (bit 0 of ct_label is set). >> >> > > > + * >> >> > > > + * This is enforced at a higher priority than ACLs >> can >> >> be >> >> > > defined. */ >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_IN_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = ct_inv_or ++ "(ct.est && >> >> ct.rpl >> >> && >> >> > > ct_label.blocked == 1)", >> >> > > > + .actions = "drop;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = ct_inv_or ++ "(ct.est && >> >> ct.rpl >> >> && >> >> > > ct_label.blocked == 1)", >> >> > > > + .actions = "drop;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + >> >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > + * >> >> > > > + * Allow reply traffic that is part of an >> established >> >> > > > + * conntrack entry that has not been marked for >> >> deletion >> >> > > > + * (bit 0 of ct_label). We only match traffic in >> the >> >> > > > + * reply direction because we want traffic in the >> >> request >> >> > > > + * direction to hit the currently defined policy >> from >> >> ACLs. >> >> > > > + * >> >> > > > + * This is enforced at a higher priority than ACLs >> can >> >> be >> >> > > defined. */ >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_IN_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "ct.est && !ct.rel && >> !ct.new >> >> " >> >> ++ >> >> > > and_not_ct_inv ++ >> >> > > > + "&& ct.rpl && >> ct_label.blocked >> >> == >> >> > > 0", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "ct.est && !ct.rel && >> !ct.new >> >> " >> >> ++ >> >> > > and_not_ct_inv ++ >> >> > > > + "&& ct.rpl && >> ct_label.blocked >> >> == >> >> > > 0", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + >> >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > + * >> >> > > > + * Allow traffic that is related to an existing >> >> conntrack >> >> > > entry that >> >> > > > + * has not been marked for deletion (bit 0 of >> >> ct_label). >> >> > > > + * >> >> > > > + * This is enforced at a higher priority than ACLs >> can >> >> be >> >> > > defined. >> >> > > > + * >> >> > > > + * NOTE: This does not support related data sessions >> >> (eg, >> >> > > > + * a dynamically negotiated FTP data channel), but >> will >> >> allow >> >> > > > + * related traffic such as an ICMP Port Unreachable >> >> through >> >> > > > + * that's generated from a non-listening UDP port. >> */ >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_IN_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "!ct.est && ct.rel && >> !ct.new >> >> " >> >> ++ >> >> > > and_not_ct_inv ++ >> >> > > > + "&& ct_label.blocked == 0", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "!ct.est && ct.rel && >> !ct.new >> >> " >> >> ++ >> >> > > and_not_ct_inv ++ >> >> > > > + "&& ct_label.blocked == 0", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + >> >> > > > + /* Ingress and Egress ACL Table (Priority 65535). >> >> > > > + * >> >> > > > + * Not to do conntrack on ND packets. */ >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_IN_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "nd || nd_ra || nd_rs || >> mldv1 >> >> || >> >> > > mldv2", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()); >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 65535, >> >> > > > + .__match = "nd || nd_ra || nd_rs || >> mldv1 >> >> || >> >> > > mldv2", >> >> > > > + .actions = "next;", >> >> > > > + .external_ids = map_empty()) >> >> > > > + }; >> >> > > > + >> >> > > > + /* Add a 34000 priority flow to advance the DNS reply >> from >> >> > > ovn-controller, >> >> > > > + * if the CMS has configured DNS records for the >> datapath. >> >> > > > + */ >> >> > > > + if (sw.has_dns_records) { >> >> > > > + Flow(.logical_datapath = ls._uuid, >> >> > > > + .stage = s_SWITCH_OUT_ACL(), >> >> > > > + .priority = 34000, >> >> > > > + .__match = "udp.src == 53", >> >> > > > + .actions = if has_stateful "ct_commit; >> >> next;" >> >> > > else "next;", >> >> > > > + .external_ids = map_empty()) >> >> > > > + }; >> >> > > > + >> >> > > > + /* Add a 34000 priority flow to advance the service >> monitor >> >> reply >> >> > > > + * packets to skip applying ingress ACLs. */ >> >> > > > Flow(.logical_datapath = ls._uuid, >> >> > > > .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 >> || >> >> > > mldv2", >> >> > > > + .priority = 34000, >> >> > > > + .__match = "eth.dst == $svc_monitor_mac", >> >> > > > .actions = "next;", >> >> > > > .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 65535, >> >> > > > - .__match = "nd || nd_ra || nd_rs || mldv1 >> || >> >> > > mldv2", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()) >> >> > > > - }; >> >> > > > - >> >> > > > - /* Add a 34000 priority flow to advance the DNS reply from >> >> > > ovn-controller, >> >> > > > - * if the CMS has configured DNS records for the datapath. >> >> > > > - */ >> >> > > > - if (sw.has_dns_records) { >> >> > > > Flow(.logical_datapath = ls._uuid, >> >> > > > .stage = s_SWITCH_OUT_ACL(), >> >> > > > .priority = 34000, >> >> > > > - .__match = "udp.src == 53", >> >> > > > - .actions = if has_stateful "ct_commit; >> next;" >> >> else >> >> > > "next;", >> >> > > > + .__match = "eth.src == $svc_monitor_mac", >> >> > > > + .actions = "next;", >> >> > > > .external_ids = map_empty()) >> >> > > > - }; >> >> > > > - >> >> > > > - /* Add a 34000 priority flow to advance the service monitor >> >> reply >> >> > > > - * packets to skip applying ingress ACLs. */ >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_IN_ACL(), >> >> > > > - .priority = 34000, >> >> > > > - .__match = "eth.dst == $svc_monitor_mac", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()); >> >> > > > - Flow(.logical_datapath = ls._uuid, >> >> > > > - .stage = s_SWITCH_OUT_ACL(), >> >> > > > - .priority = 34000, >> >> > > > - .__match = "eth.src == $svc_monitor_mac", >> >> > > > - .actions = "next;", >> >> > > > - .external_ids = map_empty()) >> >> > > > + } >> >> > > > } >> >> > > > >> >> > > > /* This stage builds hints for the IN/OUT_ACL stage. Based on >> >> various >> >> > > > diff --git a/ovn-nb.xml b/ovn-nb.xml >> >> > > > index feb38b5d31..e337be4eb3 100644 >> >> > > > --- a/ovn-nb.xml >> >> > > > +++ b/ovn-nb.xml >> >> > > > @@ -240,6 +240,21 @@ >> >> > > > </p> >> >> > > > </column> >> >> > > > >> >> > > > + <column name="options" key="use_ct_inv_match"> >> >> > > > + <p> >> >> > > > + If set to false, <code>ovn-northd</code> will not use >> the >> >> > > > + <code>ct.inv</code> field in any of the logical flow >> >> matches. >> >> > > > + The default value is true. If the NIC supports >> >> offloading >> >> > > > + OVS datapath flows but doesn't support offloading >> >> ct_state >> >> > > > + <code>inv</code> flag, then the datapath flows >> matching >> >> on >> >> > > this flag >> >> > > > + (either <code>+inv</code> or <code>-inv</code>) will >> not >> >> be >> >> > > > + offloaded. CMS should consider setting >> >> > > <code>use_ct_inv_match</code> >> >> > > > + to <code>false</code> in such cases. This results in >> a >> >> side >> >> > > effect >> >> > > > + of the invalid packets getting delivered to the >> >> destination >> >> > > VIF, which >> >> > > > + otherwise would have been dropped by <code>OVN</code>. >> >> > > > + </p> >> >> > > > + </column> >> >> > > > + >> >> > > > <group title="Options for configuring interconnection >> route >> >> > > advertisement"> >> >> > > > <p> >> >> > > > These options control how routes are advertised >> between >> >> OVN >> >> > > > -- >> >> > > > 2.29.2 >> >> > > > >> >> > > > _______________________________________________ >> >> > > > dev mailing list >> >> > > > dev@openvswitch.org >> >> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> >> > > >> >> > > Acked-by: Han Zhou <hzhou@ovn.org> >> >> > >> >> > Thanks for the reviews Mark G and Han. >> >> > >> >> > I applied both the patches to the main branch addressing Han's >> comment >> >> in >> >> p1. >> >> > When I submitted this patch, I had a test case which somehow got >> removed >> >> > in the later versions. I added the test case before applying. >> >> > >> >> > Regards >> >> > Numan >> >> >> >> Hi Numan, >> >> >> >> Seems this merge breaks the CI. All these vtep related cases are >> failing: >> >> 898: ovn-controller-vtep - chassis FAILED (ovn-controller-vtep.at:110) >> >> 899: ovn-controller-vtep - binding 1 FAILED ( >> ovn-controller-vtep.at:178) >> >> 900: ovn-controller-vtep - binding 2 FAILED ( >> ovn-controller-vtep.at:252) >> >> 901: ovn-controller-vtep - vtep-lswitch FAILED ( >> >> ovn-controller-vtep.at:291) >> >> 902: ovn-controller-vtep - vtep-macs 1 FAILED ( >> ovn-controller-vtep.at:343 >> >> ) >> >> 903: ovn-controller-vtep - vtep-macs 2 FAILED ( >> ovn-controller-vtep.at:431 >> >> ) >> >> >> >> All these tests are successful when I run them locally. I haven't >> checked >> >> more details yet. >> >> >> > >> > Hi Han, >> > >> > I noticed that. In fact it's failing without these patches too. >> > >> > I quickly looked into it and looks like it's failing because of some >> > warning message related to dns_resolve in ovsdb-server log. >> > >> > I think wen need to white list this warning message in vtep tests. >> > >> > >> > Thanks >> > Numan >> > >> > ok. Thanks for checking! Do you happen to know what change introduced >> this? Also I wonder why I don't see these failures locally. Is it test >> server's problem (regarding DNS?) >> > > I didn't get a chance to look further. If you're planning to look into it > then please do so. > > I simply tried pushing the same branch to my repo, and the jobs all passed without any issues, but retrying the job on ovn-org repo always failed. It is likely related to the git servers (at least not related to any of the recent commits).
diff --git a/NEWS b/NEWS index 1ddde15f86..5f1365775d 100644 --- a/NEWS +++ b/NEWS @@ -6,11 +6,16 @@ Post-v21.03.0 expected to scale better than the C implementation, for large deployments. (This may take testing and tuning to be effective.) This version of OVN requires DDLog 0.36. - - Introduce ovn-controller incremetal processing engine statistics + - Introduce ovn-controller incremental processing engine statistics - Introduced parallel processing in ovn-northd with the NB_Global config option 'use_parallel_build' to enable it. It is disabled by default. - Support vlan-passthru mode for tag=0 localnet ports. - Support custom 802.11ad EthType for localnet ports. + - Add a new NB Global option - use_ct_inv_match with the default value of true. + If this is set to false, then the logical field - "ct.inv" will not be + used in the logical flow matches. CMS can consider setting this to false, + if they want to use smart NICs which don't support offloading datapath flows + with this field used. OVN v21.03.0 - 12 Mar 2021 ------------------------- diff --git a/northd/lswitch.dl b/northd/lswitch.dl index 47c497e0cf..27ac3cadc5 100644 --- a/northd/lswitch.dl +++ b/northd/lswitch.dl @@ -742,3 +742,10 @@ function put_svc_monitor_mac(options: Map<string,string>, relation SvcMonitorMac(mac: eth_addr) SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :- nb::NB_Global(._uuid = uuid, .options = options). + +relation UseCtInvMatch[bool] +UseCtInvMatch[options.get_bool_def("use_ct_inv_match", true)] :- + nb::NB_Global(.options = options). +UseCtInvMatch[true] :- + Unit(), + not nb in nb::NB_Global(). diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index de1aa896d3..5f6c0a6922 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -98,6 +98,10 @@ static bool check_lsp_is_up; static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; static struct eth_addr svc_monitor_mac_ea; +/* If this option is 'true' northd will make use of ct.inv match fields. + * Otherwise, it will avoid using it. The default is true. */ +static bool use_ct_inv_match = true; + /* Default probe interval for NB and SB DB connections. */ #define DEFAULT_PROBE_INTERVAL_MSEC 5000 static int northd_probe_interval_nb = 0; @@ -4099,7 +4103,6 @@ do_ovn_lflow_add(struct hmap *lflow_map, bool shared, hmap_insert_fast(lflow_map, &lflow->hmap_node, hash); } - /* Adds a row with the specified contents to the Logical_Flow table. */ static void ovn_lflow_add_at(struct hmap *lflow_map, struct ovn_datapath *od, @@ -5712,12 +5715,13 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * for deletion (bit 0 of ct_label is set). * * This is enforced at a higher priority than ACLs can be defined. */ + char *match = xasprintf("%s(ct.est && ct.rpl && ct_label.blocked == 1)", + use_ct_inv_match ? "ct.inv || " : ""); ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", - "drop;"); + match, "drop;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", - "drop;"); + match, "drop;"); + free(match); /* Ingress and Egress ACL Table (Priority 65535). * @@ -5728,14 +5732,15 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * direction to hit the currently defined policy from ACLs. * * This is enforced at a higher priority than ACLs can be defined. */ + match = xasprintf("ct.est && !ct.rel && !ct.new%s && " + "ct.rpl && ct_label.blocked == 0", + use_ct_inv_match ? " && !ct.inv" : ""); + ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "ct.est && !ct.rel && !ct.new && !ct.inv " - "&& ct.rpl && ct_label.blocked == 0", - "next;"); + match, "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "ct.est && !ct.rel && !ct.new && !ct.inv " - "&& ct.rpl && ct_label.blocked == 0", - "next;"); + match, "next;"); + free(match); /* Ingress and Egress ACL Table (Priority 65535). * @@ -5748,14 +5753,14 @@ build_acls(struct ovn_datapath *od, struct hmap *lflows, * a dynamically negotiated FTP data channel), but will allow * related traffic such as an ICMP Port Unreachable through * that's generated from a non-listening UDP port. */ + match = xasprintf("!ct.est && ct.rel && !ct.new%s && " + "ct_label.blocked == 0", + use_ct_inv_match ? " && !ct.inv" : ""); ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL, UINT16_MAX, - "!ct.est && ct.rel && !ct.new && !ct.inv " - "&& ct_label.blocked == 0", - "next;"); + match, "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL, UINT16_MAX, - "!ct.est && ct.rel && !ct.new && !ct.inv " - "&& ct_label.blocked == 0", - "next;"); + match, "next;"); + free(match); /* Ingress and Egress ACL Table (Priority 65535). * @@ -13195,6 +13200,9 @@ ovnnb_db_run(struct northd_context *ctx, use_logical_dp_groups = smap_get_bool(&nb->options, "use_logical_dp_groups", false); + use_ct_inv_match = smap_get_bool(&nb->options, + "use_ct_inv_match", true); + /* deprecated, use --event instead */ controller_event_en = smap_get_bool(&nb->options, "controller_event", false); diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl index 2f5d36c599..a0f7944103 100644 --- a/northd/ovn_northd.dl +++ b/northd/ovn_northd.dl @@ -2280,173 +2280,179 @@ for (Reject(lsuuid, pipeline, stage, acl, fair_meter, extra_match_, extra_action } /* build_acls */ -for (sw in &Switch(.ls = ls)) -var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in -{ - /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by - * default. A related rule at priority 1 is added below if there - * are any stateful ACLs in this datapath. */ - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_IN_ACL(), - .priority = 0, - .__match = "1", - .actions = "next;", - .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 0, - .__match = "1", - .actions = "next;", - .external_ids = map_empty()); - - if (has_stateful) { - /* Ingress and Egress ACL Table (Priority 1). - * - * By default, traffic is allowed. This is partially handled by - * the Priority 0 ACL flows added earlier, but we also need to - * commit IP flows. This is because, while the initiater's - * direction may not have any stateful rules, the server's may - * and then its return traffic would not have an associated - * conntrack entry and would return "+invalid". - * - * We use "ct_commit" for a connection that is not already known - * by the connection tracker. Once a connection is committed, - * subsequent packets will hit the flow at priority 0 that just - * uses "next;" - * - * We also check for established connections that have ct_label.blocked - * set on them. That's a connection that was disallowed, but is - * now allowed by policy again since it hit this default-allow flow. - * We need to set ct_label.blocked=0 to let the connection continue, - * which will be done by ct_commit() in the "stateful" stage. - * Subsequent packets will hit the flow at priority 0 that just - * uses "next;". */ - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_IN_ACL(), - .priority = 1, - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", - .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 1, - .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", - .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", - .external_ids = map_empty()); - - /* Ingress and Egress ACL Table (Priority 65535). - * - * Always drop traffic that's in an invalid state. Also drop - * reply direction packets for connections that have been marked - * for deletion (bit 0 of ct_label is set). - * - * This is enforced at a higher priority than ACLs can be defined. */ - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_IN_ACL(), - .priority = 65535, - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", - .actions = "drop;", - .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 65535, - .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)", - .actions = "drop;", - .external_ids = map_empty()); - - /* Ingress and Egress ACL Table (Priority 65535). - * - * Allow reply traffic that is part of an established - * conntrack entry that has not been marked for deletion - * (bit 0 of ct_label). We only match traffic in the - * reply direction because we want traffic in the request - * direction to hit the currently defined policy from ACLs. - * - * This is enforced at a higher priority than ACLs can be defined. */ +for (UseCtInvMatch[use_ct_inv_match]) { + (var ct_inv_or, var and_not_ct_inv) = match (use_ct_inv_match) { + true -> ("ct.inv || ", "&& !ct.inv "), + false -> ("", ""), + } in + for (sw in &Switch(.ls = ls)) + var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in + { + /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by + * default. A related rule at priority 1 is added below if there + * are any stateful ACLs in this datapath. */ Flow(.logical_datapath = ls._uuid, .stage = s_SWITCH_IN_ACL(), - .priority = 65535, - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " - "&& ct.rpl && ct_label.blocked == 0", + .priority = 0, + .__match = "1", .actions = "next;", .external_ids = map_empty()); Flow(.logical_datapath = ls._uuid, .stage = s_SWITCH_OUT_ACL(), - .priority = 65535, - .__match = "ct.est && !ct.rel && !ct.new && !ct.inv " - "&& ct.rpl && ct_label.blocked == 0", + .priority = 0, + .__match = "1", .actions = "next;", .external_ids = map_empty()); - /* Ingress and Egress ACL Table (Priority 65535). - * - * Allow traffic that is related to an existing conntrack entry that - * has not been marked for deletion (bit 0 of ct_label). - * - * This is enforced at a higher priority than ACLs can be defined. - * - * NOTE: This does not support related data sessions (eg, - * a dynamically negotiated FTP data channel), but will allow - * related traffic such as an ICMP Port Unreachable through - * that's generated from a non-listening UDP port. */ - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_IN_ACL(), - .priority = 65535, - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " - "&& ct_label.blocked == 0", - .actions = "next;", - .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 65535, - .__match = "!ct.est && ct.rel && !ct.new && !ct.inv " - "&& ct_label.blocked == 0", - .actions = "next;", - .external_ids = map_empty()); + if (has_stateful) { + /* Ingress and Egress ACL Table (Priority 1). + * + * By default, traffic is allowed. This is partially handled by + * the Priority 0 ACL flows added earlier, but we also need to + * commit IP flows. This is because, while the initiater's + * direction may not have any stateful rules, the server's may + * and then its return traffic would not have an associated + * conntrack entry and would return "+invalid". + * + * We use "ct_commit" for a connection that is not already known + * by the connection tracker. Once a connection is committed, + * subsequent packets will hit the flow at priority 0 that just + * uses "next;" + * + * We also check for established connections that have ct_label.blocked + * set on them. That's a connection that was disallowed, but is + * now allowed by policy again since it hit this default-allow flow. + * We need to set ct_label.blocked=0 to let the connection continue, + * which will be done by ct_commit() in the "stateful" stage. + * Subsequent packets will hit the flow at priority 0 that just + * uses "next;". */ + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_ACL(), + .priority = 1, + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", + .external_ids = map_empty()); + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 1, + .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))", + .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;", + .external_ids = map_empty()); - /* Ingress and Egress ACL Table (Priority 65535). - * - * Not to do conntrack on ND packets. */ + /* Ingress and Egress ACL Table (Priority 65535). + * + * Always drop traffic that's in an invalid state. Also drop + * reply direction packets for connections that have been marked + * for deletion (bit 0 of ct_label is set). + * + * This is enforced at a higher priority than ACLs can be defined. */ + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_ACL(), + .priority = 65535, + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", + .actions = "drop;", + .external_ids = map_empty()); + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 65535, + .__match = ct_inv_or ++ "(ct.est && ct.rpl && ct_label.blocked == 1)", + .actions = "drop;", + .external_ids = map_empty()); + + /* Ingress and Egress ACL Table (Priority 65535). + * + * Allow reply traffic that is part of an established + * conntrack entry that has not been marked for deletion + * (bit 0 of ct_label). We only match traffic in the + * reply direction because we want traffic in the request + * direction to hit the currently defined policy from ACLs. + * + * This is enforced at a higher priority than ACLs can be defined. */ + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_ACL(), + .priority = 65535, + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ + "&& ct.rpl && ct_label.blocked == 0", + .actions = "next;", + .external_ids = map_empty()); + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 65535, + .__match = "ct.est && !ct.rel && !ct.new " ++ and_not_ct_inv ++ + "&& ct.rpl && ct_label.blocked == 0", + .actions = "next;", + .external_ids = map_empty()); + + /* Ingress and Egress ACL Table (Priority 65535). + * + * Allow traffic that is related to an existing conntrack entry that + * has not been marked for deletion (bit 0 of ct_label). + * + * This is enforced at a higher priority than ACLs can be defined. + * + * NOTE: This does not support related data sessions (eg, + * a dynamically negotiated FTP data channel), but will allow + * related traffic such as an ICMP Port Unreachable through + * that's generated from a non-listening UDP port. */ + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_ACL(), + .priority = 65535, + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ + "&& ct_label.blocked == 0", + .actions = "next;", + .external_ids = map_empty()); + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 65535, + .__match = "!ct.est && ct.rel && !ct.new " ++ and_not_ct_inv ++ + "&& ct_label.blocked == 0", + .actions = "next;", + .external_ids = map_empty()); + + /* Ingress and Egress ACL Table (Priority 65535). + * + * Not to do conntrack on ND packets. */ + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_IN_ACL(), + .priority = 65535, + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", + .actions = "next;", + .external_ids = map_empty()); + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 65535, + .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", + .actions = "next;", + .external_ids = map_empty()) + }; + + /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, + * if the CMS has configured DNS records for the datapath. + */ + if (sw.has_dns_records) { + Flow(.logical_datapath = ls._uuid, + .stage = s_SWITCH_OUT_ACL(), + .priority = 34000, + .__match = "udp.src == 53", + .actions = if has_stateful "ct_commit; next;" else "next;", + .external_ids = map_empty()) + }; + + /* Add a 34000 priority flow to advance the service monitor reply + * packets to skip applying ingress ACLs. */ Flow(.logical_datapath = ls._uuid, .stage = s_SWITCH_IN_ACL(), - .priority = 65535, - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", + .priority = 34000, + .__match = "eth.dst == $svc_monitor_mac", .actions = "next;", .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 65535, - .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2", - .actions = "next;", - .external_ids = map_empty()) - }; - - /* Add a 34000 priority flow to advance the DNS reply from ovn-controller, - * if the CMS has configured DNS records for the datapath. - */ - if (sw.has_dns_records) { Flow(.logical_datapath = ls._uuid, .stage = s_SWITCH_OUT_ACL(), .priority = 34000, - .__match = "udp.src == 53", - .actions = if has_stateful "ct_commit; next;" else "next;", + .__match = "eth.src == $svc_monitor_mac", + .actions = "next;", .external_ids = map_empty()) - }; - - /* Add a 34000 priority flow to advance the service monitor reply - * packets to skip applying ingress ACLs. */ - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_IN_ACL(), - .priority = 34000, - .__match = "eth.dst == $svc_monitor_mac", - .actions = "next;", - .external_ids = map_empty()); - Flow(.logical_datapath = ls._uuid, - .stage = s_SWITCH_OUT_ACL(), - .priority = 34000, - .__match = "eth.src == $svc_monitor_mac", - .actions = "next;", - .external_ids = map_empty()) + } } /* This stage builds hints for the IN/OUT_ACL stage. Based on various diff --git a/ovn-nb.xml b/ovn-nb.xml index feb38b5d31..e337be4eb3 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -240,6 +240,21 @@ </p> </column> + <column name="options" key="use_ct_inv_match"> + <p> + If set to false, <code>ovn-northd</code> will not use the + <code>ct.inv</code> field in any of the logical flow matches. + The default value is true. If the NIC supports offloading + OVS datapath flows but doesn't support offloading ct_state + <code>inv</code> flag, then the datapath flows matching on this flag + (either <code>+inv</code> or <code>-inv</code>) will not be + offloaded. CMS should consider setting <code>use_ct_inv_match</code> + to <code>false</code> in such cases. This results in a side effect + of the invalid packets getting delivered to the destination VIF, which + otherwise would have been dropped by <code>OVN</code>. + </p> + </column> + <group title="Options for configuring interconnection route advertisement"> <p> These options control how routes are advertised between OVN