@@ -20,12 +20,14 @@ DOC_SOURCE = \
Documentation/tutorials/ovn-ipsec.rst \
Documentation/tutorials/ovn-rbac.rst \
Documentation/tutorials/ovn-interconnection.rst \
+ Documentation/tutorials/ddlog-new-feature.rst \
Documentation/topics/index.rst \
Documentation/topics/testing.rst \
Documentation/topics/high-availability.rst \
Documentation/topics/integration.rst \
Documentation/topics/ovn-news-2.8.rst \
Documentation/topics/role-based-access-control.rst \
+ Documentation/topics/debugging-ddlog.rst \
Documentation/howto/index.rst \
Documentation/howto/docker.rst \
Documentation/howto/firewalld.rst \
@@ -89,6 +89,13 @@ need the following software:
The environment variable OVS_RESOLV_CONF can be used to specify DNS server
configuration file (the default file on Linux is /etc/resolv.conf).
+- `DDlog <https://github.com/vmware/differential-datalog>`_, if you
+ want to build ``ovn-northd-ddlog``, an alternate implementation of
+ ``ovn-northd`` that scales better to large deployments. The NEWS
+ file specifies the right version of DDlog to use with this release.
+ Building with DDlog support requires Rust to be installed (see
+ https://www.rust-lang.org/tools/install).
+
If you are working from a Git tree or snapshot (instead of from a distribution
tarball), or if you modify the OVN build system or the database
schema, you will also need the following software:
@@ -176,6 +183,14 @@ the default database directory, add options as shown here::
``yum install`` or ``rpm -ivh``) and .deb (e.g. via
``apt-get install`` or ``dpkg -i``) use the above configure options.
+To build with DDlog support, add ``--with-ddlog=<path to ddlog>/lib``
+to the ``configure`` command line. Building with DDlog adds a few
+minutes to the build because the Rust compiler is slow. To speed this
+up by about 2x, also add ``--enable-ddlog-fast-build``. This disables
+some Rust compiler optimizations, making a much slower
+``ovn-northd-ddlog`` executable, so it should not be used for
+production builds or for profiling.
+
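+For example, a development build might be configured like this (the
+DDlog checkout path is illustrative)::
+
+ $ ./configure --with-ddlog=$HOME/differential-datalog/lib \
+     --enable-ddlog-fast-build
+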
By default, static libraries are built and linked against. If you want to use
shared libraries instead::
@@ -353,6 +368,14 @@ An example after install might be::
$ ovn-ctl start_northd
$ ovn-ctl start_controller
+If you built with DDlog support, then you can start
+``ovn-northd-ddlog`` instead of ``ovn-northd`` by adding
+``--ovn-northd-ddlog=yes``, e.g.::
+
+ $ export PATH=$PATH:/usr/local/share/ovn/scripts
+ $ ovn-ctl --ovn-northd-ddlog=yes start_northd
+ $ ovn-ctl start_controller
+
Starting OVN Central services
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -403,11 +426,15 @@ it at any time is harmless::
$ ovn-nbctl --no-wait init
$ ovn-sbctl --no-wait init
-Start the ovn-northd, telling it to connect to the OVN db servers same Unix
-domain socket::
+Start ``ovn-northd``, telling it to connect to the OVN db servers'
+Unix domain sockets::
$ ovn-northd --pidfile --detach --log-file
+If you built with DDlog support, you can start ``ovn-northd-ddlog``
+instead, the same way::
+
+ $ ovn-northd-ddlog --pidfile --detach --log-file
Starting OVN Central services in containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
new file mode 100644
@@ -0,0 +1,280 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License"); you may
+ not use this file except in compliance with the License. You may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.
+
+ Convention for heading levels in OVN documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+=========================================
+Debugging the DDlog version of ovn-northd
+=========================================
+
+This document gives some tips for debugging correctness issues in the
+DDlog implementation of ``ovn-northd``. To keep things concrete, we
+assume here that a failure occurred in one of the test cases in
+``ovn-e2e.at``, but the same methodology applies in any other
+environment. If none of these methods helps, ask for assistance or
+submit a bug report.
+
+Before trying these methods, you may want to check the northd log
+file, ``tests/testsuite.dir/<test_number>/northd/ovn-northd.log``, for
+error messages that might explain the failure.
+
+Compare OVSDB tables generated by DDlog vs C
+--------------------------------------------
+
+The first thing I typically want to check when ``ovn-northd-ddlog``
+does not behave as expected is how the OVSDB tables computed by DDlog
+differ from what the C implementation produces. Fortunately, all the
+infrastructure needed to do this already exists in OVN.
+
+First, let's modify the test script, e.g., ``ovn.at`` to dump the
+contents of OVSDB right before the failure. The most common issue is
+a difference between the logical flows generated by the two
+implementations. To make it easy to compare the generated flows, make
+sure that the test contains something like this in the right place::
+
+ ovn-sbctl dump-flows > sbflows
+ AT_CAPTURE_FILE([sbflows])
+
+The first line above dumps the OVN logical flow table to a file named
+``sbflows``. The second line ensures that, if the test fails,
+``sbflows`` gets logged to ``testsuite.log``. That is not particularly
+useful for us right now, but it means that if someone later submits a
+bug report, that's one more piece of data that we don't have to ask
+them to submit along with it.
+
+Next, we want to run the test twice, with the C and DDlog versions of
+northd, e.g., ``make check -j6 TESTSUITEFLAGS="-d 111 112"`` if 111
+and 112 are the C and DDlog versions of the same test. The ``-d`` in
+this command line makes the test driver keep test directories around
+even for tests that succeed, since by default it deletes them.
+
+Now you can look at ``sbflows`` in each test log directory. The
+``ovn-northd-ddlog`` developers have gone to some trouble to make the
+DDlog flows as similar as possible to the C ones, right down to white
+space and other formatting. Thus, the DDlog output is often identical
+to C aside from logical datapath UUIDs.
+
+Usually, this means that one can get informative results by running
+``diff``, e.g.::
+
+ diff -u tests/testsuite.dir/111/sbflows tests/testsuite.dir/112/sbflows
+
+Running the input through the ``uuidfilt`` utility from OVS will
+generally get rid of the logical datapath UUID differences as well::
+
+ diff -u <(uuidfilt tests/testsuite.dir/111/sbflows) <(uuidfilt tests/testsuite.dir/112/sbflows)
+
+If there are nontrivial differences, this often identifies your bug.
+
+Often, once you have identified the difference between the two OVSDB
+dumps, this will immediately lead you to the root cause of the bug,
+but if you are not so lucky, the next method may help.
+
+Record and replay DDlog execution
+---------------------------------
+
+DDlog offers a way to record all input table updates throughout the
+execution of northd and replay them against DDlog running as a
+standalone executable without all other OVN components. This has two
+advantages. First, this allows one to easily tweak the inputs, e.g.
+to simplify the test scenario. Second, the recorded execution can be
+easily replayed anywhere without having to reproduce your OVN setup.
+
+Use the ``--ddlog-record`` option to record updates,
+e.g. ``--ddlog-record=replay.dat`` to record to ``replay.dat``.
+(OVN's built-in tests automatically do this.) The file contains the
+log of transactions in the DDlog command format (see
+https://github.com/vmware/differential-datalog/blob/master/doc/command_reference/command_reference.md).
+
+To replay the log, you will need the standalone DDlog executable. By
+default, the build system does not compile this program, because it
+increases the already long Rust compilation time. To build it, add
+``NORTHD_CLI=1`` to the ``make`` command line, e.g. ``make
+NORTHD_CLI=1``.
+
+You can modify the log before replaying it, e.g., adding ``dump
+<table>`` commands to dump the contents of relations at various points
+during execution. The <table> name must be fully qualified based on
+the file in which it is declared, e.g. ``OVN_Southbound::<table>`` for
+southbound tables or ``lrouter::<table>`` for ``lrouter.dl``. You
+can also use ``dump`` without an argument to dump the contents of all
+tables.
+
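+For example, appending these commands to the log dumps a single table
+and then every table (the table name here is illustrative)::
+
+ dump OVN_Southbound::Logical_Flow;
+ dump;
+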
+The following command replays the log generated by OVN test number
+112 and dumps the output of DDlog to ``replay.dump``::
+
+ ovn/northd/ovn_northd_ddlog/target/release/ovn_northd_cli < tests/testsuite.dir/112/northd/replay.dat > replay.dump
+
+Or, to dump table contents following the run, without having to edit
+``replay.dat``::
+
+ (cat tests/testsuite.dir/112/northd/replay.dat; echo 'dump;') | ovn/northd/ovn_northd_ddlog/target/release/ovn_northd_cli --no-init-snapshot > replay.dump
+
+Depending on whether and how you installed OVS and OVN, you might need
+to point ``LD_LIBRARY_PATH`` to library build directories to get the
+CLI to run, e.g.::
+
+ export LD_LIBRARY_PATH=$HOME/ovn/_build/lib/.libs:$HOME/ovs/_build/lib/.libs
+
+.. note::
+
+ The replay output may be less informative than you expect because
+ DDlog does not, by default, keep around enough information to
+ include input relations and intermediate relations in the output.
+ These relations are often critical to understanding what is going
+ on. To include them, add the options
+ ``--output-internal-relations --output-input-relations=In_`` to
+ ``DDLOG_EXTRA_FLAGS`` for building ``ovn-northd-ddlog``. For
+ example, ``configure`` as::
+
+ ./configure DDLOG_EXTRA_FLAGS='--output-internal-relations --output-input-relations=In_'
+
+Debugging by Logging
+--------------------
+
+One limitation of the previous method is that it allows one to inspect
+inputs and outputs of a rule, but not the (sometimes fairly
+complicated) computation that goes on inside the rule. You can of
+course break up the rule into several rules and dump the intermediate
+outputs.
+
+There are at least two alternatives for generating log messages.
+First, you can write rules to add strings to the ``Warning`` relation
+declared in ``ovn_northd.dl``. Code in ``ovn-northd-ddlog.c`` will log
+any given string in this relation just once, when it is first added to
+the relation. (If it is removed from the relation and then added back
+later, it will be logged again.)
+
+Second, you can call the ``warn()`` function declared in
+``ovn.dl`` from a DDlog rule. It's not straightforward to know
+exactly when this function will be called, like it would be in an
+imperative language like C, since DDlog is a declarative language
+where the user doesn't directly control when rules are triggered. You
+might, for example, see the rule being triggered multiple times with
+the same input. Nevertheless, this debugging technique is useful in
+practice.
+
+You will find many examples of the use of ``Warning`` and ``warn`` in
+``ovn_northd.dl``, where it is frequently used to report non-critical
+errors.
+
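+As an illustrative sketch (the exact declaration of ``Warning`` is in
+the OVN sources), a rule that warns about a logical switch with an
+empty name might look like::
+
+ Warning["found logical switch with empty name"] :-
+     nb::Logical_Switch(.name = "").
+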
+Debugging panics
+----------------
+
+**TODO**: update these instructions as DDlog's internal handling of panics
+is improved.
+
+DDlog is a safe language, so DDlog programs normally do not crash,
+except for the following three cases:
+
+- A panic in a Rust function imported to DDlog as ``extern function``.
+
+- A panic in a C function imported to DDlog as ``extern function``.
+
+- A bug in the DDlog runtime or libraries.
+
+Below we walk through the steps involved in debugging such failures.
+In this scenario, there is an array-index-out-of-bounds error in the
+``ovn_scan_static_dynamic_ip6()`` function, which is written in Rust
+and imported to DDlog as an ``extern function``. When invoked from a
+DDlog rule, this function causes a panic in one of DDlog worker
+threads.
+
+**Step 1: Check for error messages in the northd log.** A panic can
+generally lead to unpredictable outcomes, so one cannot count on a
+clean error message showing up in the log (other outcomes include
+crashing the entire process and even deadlocks; we are working to
+eliminate the latter possibility). In this case we are lucky to
+observe a bunch of error messages like the following in the ``northd``
+log:
+
+ ``2019-09-23T16:23:24.549Z|00011|ovn_northd|ERR|ddlog_transaction_commit():
+ error: failed to receive flush ack message from timely dataflow
+ thread``
+
+These messages are telling us that something is broken inside the
+DDlog runtime.
+
+**Step 2: Record and replay the failing scenario.** We use DDlog's
+record/replay capabilities (see above) to capture the faulty scenario.
+We replay the recorded trace::
+
+ northd/ovn_northd_ddlog/target/release/ovn_northd_cli < tests/testsuite.dir/117/northd/replay.dat
+
+This generates a bunch of output ending with::
+
+ thread 'worker thread 2' panicked at 'index out of bounds: the len is 1 but the index is 1', /rustc/eae3437dfe991621e8afdc82734f4a172d7ddf9b/src/libcore/slice/mod.rs:2681:10
+ note: run with RUST_BACKTRACE=1 environment variable to display a backtrace.
+
+We re-run the CLI again with backtrace enabled (as suggested by the
+error message)::
+
+ RUST_BACKTRACE=1 northd/ovn_northd_ddlog/target/release/ovn_northd_cli < tests/testsuite.dir/117/northd/replay.dat
+
+This finally yields the following stack trace, which suggests an
+array bounds violation in ``ovn_scan_static_dynamic_ip6``::
+
+ 0: backtrace::backtrace::libunwind::trace
+     at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.29
+ [SKIPPED]
+ 10: core::panicking::panic_bounds_check
+     at src/libcore/panicking.rs:61
+ 11: ovn_northd_ddlog::__ovn::ovn_scan_static_dynamic_ip6
+ 12: ovn_northd_ddlog::prog::__f
+ [SKIPPED]
+
+Finally, looking at the source code of
+``ovn_scan_static_dynamic_ip6``, we identify the following line,
+containing an unsafe array indexing operator, as the culprit::
+
+ ovn_ipv6_parse(&f[1].to_string())
+
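+A typical fix is to replace the unchecked indexing with a checked
+access that returns an option instead of panicking, along these lines
+(a sketch, not the actual OVN patch)::
+
+ // f.get(1) yields None, rather than panicking, when the index
+ // is out of range, so the caller can handle the error.
+ match f.get(1) {
+     Some(s) => ovn_ipv6_parse(&s.to_string()),
+     None => None,
+ }
+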
+Clean build
+~~~~~~~~~~~
+
+Occasionally it's desirable to do a full and complete build of the
+DDlog-generated code. To trigger that, delete the generated
+``ovn_northd_ddlog`` directory and the ``ddlog.stamp`` witness file,
+like this::
+
+ rm -rf northd/ovn_northd_ddlog northd/ddlog.stamp
+
+or::
+
+ make clean-ddlog
+
+Submitting a bug report
+-----------------------
+
+If you are having trouble with DDlog and the above methods do not
+help, please submit a bug report to ``bugs@openvswitch.org``, CC
+``ryzhyk@gmail.com``.
+
+In addition to a problem description, please provide as many of the
+following as possible:
+
+- The version of DDlog that you are using. OVN and DDlog are both
+ evolving, and OVN needs to build against a specific version of
+ DDlog.
+
+- ``replay.dat`` file generated as described above
+
+- Logs: ``ovn-northd.log`` and ``testsuite.log``, if you are running
+ the OVN test suite
@@ -36,6 +36,7 @@ OVN
.. toctree::
:maxdepth: 2
+ debugging-ddlog
integration.rst
high-availability
role-based-access-control
new file mode 100644
@@ -0,0 +1,362 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License"); you may
+ not use this file except in compliance with the License. You may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.
+
+ Convention for heading levels in OVN documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+===========================================================
+Adding a new OVN feature to the DDlog version of ovn-northd
+===========================================================
+
+This document describes the usual steps an OVN developer should go
+through when adding a new feature to ``ovn-northd-ddlog``. In order to
+make things less abstract we will use the IP Multicast
+``ovn-northd-ddlog`` implementation as an example. Even though the
+document is structured as a tutorial there might still exist
+feature-specific aspects that are not covered here.
+
+Overview
+--------
+
+DDlog is a dataflow system: it receives data from a data source (a set
+of "input relations"), processes it through "intermediate relations"
+according to the rules specified in the DDlog program, and sends the
+processed "output relations" to a data sink. In OVN, the input
+relations primarily come from the OVN Northbound database and the
+output relations primarily go to the OVN Southbound database. The
+process looks like this::
+
+ from NBDB +----------+ +-----------------+ +-----------+ to SBDB
+ ---------->|Input rels|-->|Intermediate rels|-->|Output rels|---------->
+ +----------+ +-----------------+ +-----------+
+
+Adding a new feature to ``ovn-northd-ddlog`` usually involves the
+following steps:
+
+1. Update northbound and/or southbound OVSDB schemas.
+
+2. Configure DDlog/OVSDB bindings.
+
+3. Define intermediate DDlog relations and rules to compute them.
+
+4. Write rules to update output relations.
+
+5. Generate ``Logical_Flow``s and/or other forwarding records (e.g.,
+ ``Multicast_Group``) that will control the dataplane operations.
+
+Update NB and/or SB OVSDB schemas
+---------------------------------
+
+This step is no different from the normal development flow in C.
+
+Most of the time, a developer chooses between two ways of configuring
+a new feature:
+
+1. Adding a set of columns to tables in the NB and/or SB database (or
+ adding key-value pairs to existing columns).
+
+2. Adding new tables to the NB and/or SB database.
+
+Looking at IP Multicast, there are two ``OVN Northbound`` tables where
+configuration information is stored:
+
+- ``Logical_Switch``, column ``other_config``, keys ``mcast_*``.
+
+- ``Logical_Router``, column ``options``, keys ``mcast_*``.
+
+These tables become inputs to the DDlog pipeline.
+
+In addition, we add a new table ``IP_Multicast`` to the SB database.
+DDlog will update this table, that is, ``IP_Multicast`` receives
+output from the above pipeline.
+
+Configuring DDlog/OVSDB bindings
+--------------------------------
+
+Configuring ``northd/automake.mk``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The OVN build process uses DDlog's ``ovsdb2ddlog`` utility to parse
+``ovn-nb.ovsschema`` and ``ovn-sb.ovsschema`` and then automatically
+populate ``OVN_Northbound.dl`` and ``OVN_Southbound.dl``. For each
+OVN Northbound and Southbound table, it generates one or more
+corresponding DDlog relations.
+
+We need to supply ``ovsdb2ddlog`` with some information that it can't
+infer from the OVSDB schemas. This information must be specified as
+``ovsdb2ddlog`` arguments, which are read from
+``northd/ovn-nb.dlopts`` and ``northd/ovn-sb.dlopts``.
+
+The main choice for each new table is whether it is used for output.
+Output tables can also be used for input, but the converse is not
+true. If the table is used for output at all, we add ``-o <table>``
+to the option file. Our new table ``IP_Multicast`` is an output
+table, so we add ``-o IP_Multicast`` to ``ovn-sb.dlopts``.
+
+For input-only tables, ``ovsdb2ddlog`` generates a DDlog input
+relation with the same name. For output tables, it generates this
+input relation plus an output relation named ``Out_<table>``. Thus,
+``OVN_Southbound.dl`` has two relations for ``IP_Multicast``::
+
+ input relation IP_Multicast (
+ _uuid: uuid,
+ datapath: string,
+ enabled: Set<bool>,
+ querier: Set<bool>
+ )
+ output relation Out_IP_Multicast (
+ _uuid: uuid,
+ datapath: string,
+ enabled: Set<bool>,
+ querier: Set<bool>
+ )
+
+For an output table, consider whether only some of the columns are
+used for output, that is, some of the columns are effectively
+input-only. This is common in OVN for OVSDB columns that are managed
+externally (e.g. by a CMS). For each input-only column, we add ``--ro
+<table>.<column>``. Alternatively, if most of the columns are
+input-only but a few are output columns, add ``--rw <table>.<column>``
+for each of the output columns. In our case, all of the columns are
+used for output, so we do not need to add anything.
+
+Finally, in some cases ``ovn-northd-ddlog`` shouldn't change values in
+a particular column of an output table. One such case is the
+``seq_no`` column in the ``IP_Multicast`` table. To handle this, we
+instruct ``ovsdb2ddlog`` to treat the column as read-only by using the
+``--ro`` switch.
+
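+With these options, the relevant lines in ``ovn-sb.dlopts`` might look
+like this::
+
+ -o IP_Multicast
+ --ro IP_Multicast.seq_no
+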
+``ovsdb2ddlog`` generates a number of additional DDlog relations, for
+use by auto-generated OVSDB adapter logic. These are irrelevant to
+most DDlog developers, although sometimes they can be handy for
+debugging. See the appendix_ for details.
+
+Define intermediate DDlog relations and rules to compute them.
+--------------------------------------------------------------
+
+Obviously there will be a one-to-one relationship between logical
+switches/routers and IP multicast configuration. One way to represent
+this relationship is to create multicast configuration DDlog relations
+to be referenced by ``&Switch`` and ``&Router`` DDlog records::
+
+ /* IP Multicast per switch configuration. */
+ relation &McastSwitchCfg(
+ datapath : uuid,
+ enabled : bool,
+ querier : bool
+ )
+
+ &McastSwitchCfg(
+ .datapath = ls_uuid,
+ .enabled = map_get_bool_def(other_config, "mcast_snoop", false),
+ .querier = map_get_bool_def(other_config, "mcast_querier", true)) :-
+ nb.Logical_Switch(._uuid = ls_uuid,
+ .other_config = other_config).
+
+Then reference these relations in ``&Switch`` and ``&Router``. For
+example, in ``lswitch.dl``, the ``&Switch`` relation definition now
+contains::
+
+ relation &Switch(
+ ls: nb.Logical_Switch,
+ [...]
+ mcast_cfg: Ref<McastSwitchCfg>
+ )
+
+It is populated by the following rule, which references the correct
+``McastSwitchCfg`` based on the logical switch uuid::
+
+ &Switch(.ls = ls,
+ [...]
+ .mcast_cfg = mcast_cfg) :-
+ nb.Logical_Switch[ls],
+ [...]
+ mcast_cfg in &McastSwitchCfg(.datapath = ls._uuid).
+
+Build state based on information dynamically updated by ``ovn-controller``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some OVN features rely on information learned by ``ovn-controller`` to
+generate ``Logical_Flow`` or other records that control the dataplane.
+In the case of IP Multicast, ``ovn-controller`` uses IGMP to learn
+multicast groups that are joined by hosts.
+
+Each ``ovn-controller`` maintains its own set of records to avoid
+ownership and concurrency issues with other controllers. If two hosts that
+are connected to the same logical switch but reside on different
+hypervisors (different ``ovn-controller`` processes) join the same
+multicast group G, each of the controllers will create an
+``IGMP_Group`` record in the ``OVN Southbound`` database which will
+contain a set of ports to which the interested hosts are connected.
+
+At this point ``ovn-northd-ddlog`` needs to aggregate the per-chassis
+IGMP records to generate a single ``Logical_Flow`` for group G.
+Moreover, the ports on which the hosts are connected are represented
+as references to ``Port_Binding`` records in the database. These also
+need to be translated to ``&SwitchPort`` DDlog relations. The
+corresponding DDlog operations that need to be performed are:
+
+- Flatten the ``<IGMP group, ports>`` mapping in order to be able to
+ do the translation from ``Port_Binding`` to ``&SwitchPort``. For
+ each ``IGMP_Group`` record in the ``OVN Southbound`` database
+ generate an individual record of type ``IgmpSwitchGroupPort`` for
+ each ``Port_Binding`` in the set of ports that joined the
+ group. Also, translate the ``Port_Binding`` uuid to the
+ corresponding ``Logical_Switch_Port`` uuid::
+
+ relation IgmpSwitchGroupPort(
+ address: string,
+ switch : Ref<Switch>,
+ port : uuid
+ )
+
+ IgmpSwitchGroupPort(address, switch, lsp_uuid) :-
+ sb::IGMP_Group(.address = address, .datapath = igmp_dp_set,
+ .ports = pb_ports),
+ var pb_port_uuid = FlatMap(pb_ports),
+ sb::Port_Binding(._uuid = pb_port_uuid, .logical_port = lsp_name),
+ &SwitchPort(
+ .lsp = nb.Logical_Switch_Port{._uuid = lsp_uuid, .name = lsp_name},
+ .sw = switch).
+
+- Aggregate the flattened IgmpSwitchGroupPort (implicitly from all
+ ``ovn-controller`` instances) grouping by address and logical
+ switch::
+
+ relation IgmpSwitchMulticastGroup(
+ address: string,
+ switch : Ref<Switch>,
+ ports : Set<uuid>
+ )
+
+ IgmpSwitchMulticastGroup(address, switch, ports) :-
+ IgmpSwitchGroupPort(address, switch, port),
+ var ports = port.group_by((address, switch)).to_set().
+
+At this point we have all the configuration information relevant to
+the feature stored in DDlog relations in ``ovn-northd-ddlog`` memory.
+
+Write rules to update output relations
+--------------------------------------
+
+The developer updates output tables by writing rules that generate
+``Out_*`` relations. For IP Multicast this means::
+
+ /* IP_Multicast table (only applicable for Switches). */
+ sb::Out_IP_Multicast(._uuid = hash128(cfg.datapath),
+ .datapath = cfg.datapath,
+ .enabled = set_singleton(cfg.enabled),
+ .querier = set_singleton(cfg.querier)) :-
+ &McastSwitchCfg[cfg].
+
+.. note:: ``OVN_Southbound.dl`` also contains an ``IP_Multicast``
+ relation with ``input`` qualifier. This relation stores the
+ current snapshot of the OVSDB table and cannot be written to.
+
+Generate ``Logical_Flow`` and/or other forwarding records
+---------------------------------------------------------
+
+At this point we have defined all DDlog relations required to generate
+``Logical_Flow``s. All we have to do is write the rules to do so.
+For each ``IgmpSwitchMulticastGroup`` we generate a ``Flow`` that has
+as action ``"outport = <Multicast_Group>; output;"``::
+
+ /* Ingress table 17: Add IP multicast flows learnt from IGMP (priority 90). */
+ for (IgmpSwitchMulticastGroup(.address = address, .switch = &sw)) {
+ Flow(.logical_datapath = sw.dpname,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 90,
+ .__match = "eth.mcast && ip4 && ip4.dst == ${address}",
+ .actions = "outport = \"${address}\"; output;",
+ .external_ids = map_empty())
+ }
+
+In some cases generating a logical flow is not enough. For IGMP we
+also need to maintain OVN southbound ``Multicast_Group`` records,
+one per IGMP group storing the corresponding ``Port_Binding`` uuids of
+ports where multicast traffic should be sent. This is also relatively
+straightforward::
+
+ /* Create a multicast group for each IGMP group learned by a Switch.
+ * 'tunnel_key' == 0 triggers an ID allocation later.
+ */
+ sb::Out_Multicast_Group (.datapath = switch.dpname,
+ .name = address,
+ .tunnel_key = 0,
+ .ports = set_map_uuid2name(port_ids)) :-
+ IgmpSwitchMulticastGroup(address, &switch, port_ids).
+
+We must also define DDlog relations that will allocate ``tunnel_key``
+values. There are two cases: tunnel keys for records that already
+existed in the database are preserved to implement stable id
+allocation; new multicast groups need new keys. This kind of
+allocation can be tricky, especially to new users of DDlog. OVN
+contains multiple instances of allocation, so it's probably worth
+reading through the existing cases and following their pattern, and,
+if it's still tricky, asking for assistance.
+
+.. _appendix:
+
+Appendix A. Additional relations generated by ``ovsdb2ddlog``
+-------------------------------------------------------------
+
+``ovsdb2ddlog`` generates some extra relations to manage communication
+with the OVSDB server. It generates records in the following
+relations when rows in OVSDB output tables need to be added, deleted,
+or updated.
+
+In the steady state, when everything is working well, a given record
+stays in any one of these relations only briefly: just long enough for
+``ovn-northd-ddlog`` to send a transaction to the OVSDB server. When
+the OVSDB server applies the update and sends an acknowledgement, this
+ordinarily means that these relations become empty, because there are
+no longer any further changes to send.
+
+Thus, a record that persists in one of these relations is a sign of a
+problem. One example of such a problem is the database server
+rejecting the transactions sent by ``ovn-northd-ddlog``, which might
+happen if, for example, a bug in a ``.dl`` file causes some OVSDB
+constraint or referential integrity rule to be violated. (Such a
+problem can often be diagnosed by looking in the OVSDB server's log.)
+
+- ``DeltaPlus_IP_Multicast``, used by the DDlog program to track new
+ records that are not yet added to the database::
+
+ output relation DeltaPlus_IP_Multicast (
+ datapath: uuid_or_string_t,
+ enabled: Set<bool>,
+ querier: Set<bool>
+ )
+
+- ``DeltaMinus_IP_Multicast``, used by the DDlog program to track
+ records that are no longer needed in the database and need to be
+ removed::
+
+ output relation DeltaMinus_IP_Multicast (
+ _uuid: uuid
+ )
+
+- ``Update_IP_Multicast``, used by the DDlog program to track records
+ whose fields need to be updated in the database::
+
+ output relation Update_IP_Multicast (
+ _uuid: uuid,
+ enabled: Set<bool>,
+ querier: Set<bool>
+ )
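+
+For example, when replaying a recorded run with the standalone CLI, a
+``dump`` command along these lines shows any rows stuck waiting to be
+inserted (assuming the relation is declared in ``OVN_Southbound.dl``)::
+
+ dump OVN_Southbound::DeltaPlus_IP_Multicast;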
@@ -44,3 +44,4 @@ vSwitch.
ovn-rbac
ovn-ipsec
ovn-interconnection
+ ddlog-new-feature
@@ -1,5 +1,11 @@
Post-v20.09.0
---------------------
+ - ovn-northd-ddlog: New implementation of northd, based on DDlog. This
+ implementation is incremental, meaning that it only recalculates what is
+ needed for the southbound database when northbound changes occur. It is
+ expected to scale better than the C implementation, for large deployments.
+ (This may take testing and tuning to be effective.) This version of OVN
+ requires DDlog 0.30.
- The "datapath" argument to ovn-trace is now optional, since the
datapath can be inferred from the inport (which is required).
- The obsolete "redirect-chassis" way to configure gateways has been
@@ -42,6 +42,49 @@ AC_DEFUN([OVS_ENABLE_WERROR],
fi
AC_SUBST([SPARSE_WERROR])])
+dnl OVS_CHECK_DDLOG
+dnl
+dnl Configure ddlog source tree
+AC_DEFUN([OVS_CHECK_DDLOG], [
+ AC_ARG_WITH([ddlog],
+ [AC_HELP_STRING([--with-ddlog=.../differential-datalog/lib],
+ [Enables DDlog by pointing to its library dir])],
+ [DDLOGLIBDIR=$withval], [DDLOGLIBDIR=no])
+
+ AC_MSG_CHECKING([for DDlog library directory])
+ if test "$DDLOGLIBDIR" != no; then
+ if test ! -d "$DDLOGLIBDIR"; then
+ AC_MSG_ERROR([ddlog library dir "$DDLOGLIBDIR" doesn't exist])
+ elif test ! -f "$DDLOGLIBDIR"/ddlog_std.dl; then
+ AC_MSG_ERROR([ddlog library dir "$DDLOGLIBDIR" lacks ddlog_std.dl])
+ fi
+
+ AC_ARG_VAR([DDLOG])
+ AC_CHECK_PROGS([DDLOG], [ddlog], [none])
+ if test X"$DDLOG" = X"none"; then
+ AC_MSG_ERROR([ddlog is required to build with DDlog])
+ fi
+
+ AC_ARG_VAR([CARGO])
+ AC_CHECK_PROGS([CARGO], [cargo], [none])
+ if test X"$CARGO" = X"none"; then
+ AC_MSG_ERROR([cargo is required to build with DDlog])
+ fi
+
+ AC_ARG_VAR([RUSTC])
+ AC_CHECK_PROGS([RUSTC], [rustc], [none])
+ if test X"$RUSTC" = X"none"; then
+ AC_MSG_ERROR([rustc is required to build with DDlog])
+ fi
+
+ AC_SUBST([DDLOGLIBDIR])
+ AC_DEFINE([DDLOG], [1], [Build OVN daemons with ddlog.])
+ fi
+ AC_MSG_RESULT([$DDLOGLIBDIR])
+
+ AM_CONDITIONAL([DDLOG], [test "$DDLOGLIBDIR" != no])
+])
+
dnl Checks for net/if_dl.h.
dnl
dnl (We use this as a proxy for checking whether we're building on FreeBSD
@@ -131,6 +131,7 @@ OVS_LIBTOOL_VERSIONS
OVS_CHECK_CXX
AX_FUNC_POSIX_MEMALIGN
OVN_CHECK_UNBOUND
+OVS_CHECK_DDLOG_FAST_BUILD
OVS_CHECK_INCLUDE_NEXT([stdio.h string.h])
AC_CONFIG_FILES([lib/libovn.sym])
@@ -167,11 +168,15 @@ OVS_CONDITIONAL_CC_OPTION([-Wno-unused-parameter], [HAVE_WNO_UNUSED_PARAMETER])
OVS_ENABLE_WERROR
OVS_ENABLE_SPARSE
+OVS_CHECK_DDLOG
OVS_CHECK_PRAGMA_MESSAGE
OVN_CHECK_OVS
OVS_CTAGS_IDENTIFIERS
AC_SUBST([OVS_CFLAGS])
AC_SUBST([OVS_LDFLAGS])
+AC_SUBST([DDLOG_EXTRA_FLAGS])
+AC_SUBST([DDLOG_EXTRA_RUSTFLAGS])
+AC_SUBST([DDLOG_NORTHD_LIB_ONLY])
AC_SUBST([ovs_srcdir], ['${OVSDIR}'])
AC_SUBST([ovs_builddir], ['${OVSBUILDDIR}'])
@@ -576,3 +576,19 @@ AC_DEFUN([OVN_CHECK_UNBOUND],
fi
AM_CONDITIONAL([HAVE_UNBOUND], [test "$HAVE_UNBOUND" = yes])
AC_SUBST([HAVE_UNBOUND])])
+
+dnl Checks for --enable-ddlog-fast-build and updates DDLOG_EXTRA_RUSTFLAGS.
+AC_DEFUN([OVS_CHECK_DDLOG_FAST_BUILD],
+ [AC_ARG_ENABLE(
+ [ddlog_fast_build],
+ [AC_HELP_STRING([--enable-ddlog-fast-build],
+ [Build ddlog programs faster, but generate slower code])],
+ [case "${enableval}" in
+ (yes) ddlog_fast_build=true ;;
+ (no) ddlog_fast_build=false ;;
+ (*) AC_MSG_ERROR([bad value ${enableval} for --enable-ddlog-fast-build]) ;;
+ esac],
+ [ddlog_fast_build=false])
+ if $ddlog_fast_build; then
+ DDLOG_EXTRA_RUSTFLAGS="-C opt-level=z"
+ fi])
@@ -1,2 +1,6 @@
/ovn-northd
+/ovn-northd-ddlog
/ovn-northd.8
+/OVN_Northbound.dl
+/OVN_Southbound.dl
+/ovn_northd_ddlog/
@@ -8,3 +8,107 @@ northd_ovn_northd_LDADD = \
man_MANS += northd/ovn-northd.8
EXTRA_DIST += northd/ovn-northd.8.xml
CLEANFILES += northd/ovn-northd.8
+
+EXTRA_DIST += \
+ northd/ovn-northd northd/ovn-northd.8.xml \
+ northd/ovn_northd.dl northd/ovn.dl northd/ovn.rs \
+ northd/ovn.toml northd/lswitch.dl northd/lrouter.dl \
+ northd/helpers.dl northd/ipam.dl northd/multicast.dl \
+ northd/ovn-nb.dlopts northd/ovn-sb.dlopts \
+ northd/ovsdb2ddlog2c
+
+if DDLOG
+bin_PROGRAMS += northd/ovn-northd-ddlog
+northd_ovn_northd_ddlog_SOURCES = northd/ovn-northd-ddlog.c
+nodist_northd_ovn_northd_ddlog_SOURCES = \
+ northd/ovn-northd-ddlog-sb.inc \
+ northd/ovn-northd-ddlog-nb.inc \
+ northd/ovn_northd_ddlog/ddlog.h
+northd_ovn_northd_ddlog_LDADD = \
+ northd/ovn_northd_ddlog/target/release/libovn_northd_ddlog.la \
+ lib/libovn.la \
+ $(OVSDB_LIBDIR)/libovsdb.la \
+ $(OVS_LIBDIR)/libopenvswitch.la
+
+nb_opts = $$(cat $(srcdir)/northd/ovn-nb.dlopts)
+northd/OVN_Northbound.dl: ovn-nb.ovsschema northd/ovn-nb.dlopts
+ $(AM_V_GEN)ovsdb2ddlog -f $< --output-file $@ $(nb_opts)
+northd/ovn-northd-ddlog-nb.inc: ovn-nb.ovsschema northd/ovn-nb.dlopts northd/ovsdb2ddlog2c
+ $(AM_V_GEN)$(run_python) $(srcdir)/northd/ovsdb2ddlog2c -p nb_ -f $< --output-file $@ $(nb_opts)
+
+sb_opts = $$(cat $(srcdir)/northd/ovn-sb.dlopts)
+northd/OVN_Southbound.dl: ovn-sb.ovsschema northd/ovn-sb.dlopts
+ $(AM_V_GEN)ovsdb2ddlog -f $< --output-file $@ $(sb_opts)
+northd/ovn-northd-ddlog-sb.inc: ovn-sb.ovsschema northd/ovn-sb.dlopts northd/ovsdb2ddlog2c
+ $(AM_V_GEN)$(run_python) $(srcdir)/northd/ovsdb2ddlog2c -p sb_ -f $< --output-file $@ $(sb_opts)
+
+BUILT_SOURCES += \
+ northd/ovn-northd-ddlog-sb.inc \
+ northd/ovn-northd-ddlog-nb.inc
+
+northd/ovn_northd_ddlog/ddlog.h: northd/ddlog.stamp
+
+CARGO_VERBOSE = $(cargo_verbose_$(V))
+cargo_verbose_ = $(cargo_verbose_$(AM_DEFAULT_VERBOSITY))
+cargo_verbose_0 =
+cargo_verbose_1 = --verbose
+
+DDLOGFLAGS = -L $(DDLOGLIBDIR) -L $(builddir)/northd $(DDLOG_EXTRA_FLAGS)
+
+RUSTFLAGS = \
+ -L ../../lib/.libs \
+ -L $(OVS_LIBDIR)/.libs \
+ $$LIBOPENVSWITCH_DEPS \
+ $$LIBOVN_DEPS \
+ -Awarnings $(DDLOG_EXTRA_RUSTFLAGS)
+
+ddlog_sources = \
+ northd/ovn_northd.dl \
+ northd/lswitch.dl \
+ northd/lrouter.dl \
+ northd/ipam.dl \
+ northd/multicast.dl \
+ northd/ovn.dl \
+ northd/ovn.rs \
+ northd/helpers.dl \
+ northd/OVN_Northbound.dl \
+ northd/OVN_Southbound.dl
+northd/ddlog.stamp: $(ddlog_sources)
+ $(AM_V_GEN)$(DDLOG) -i $< -o $(builddir)/northd $(DDLOGFLAGS)
+ $(AM_V_at)touch $@
+
+NORTHD_LIB = 1
+NORTHD_CLI = 0
+
+ddlog_targets = $(northd_lib_$(NORTHD_LIB)) $(northd_cli_$(NORTHD_CLI))
+northd_lib_1 = northd/ovn_northd_ddlog/target/release/libovn_%_ddlog.la
+northd_cli_1 = northd/ovn_northd_ddlog/target/release/ovn_%_cli
+EXTRA_northd_ovn_northd_DEPENDENCIES = $(northd_cli_$(NORTHD_CLI))
+
+cargo_build = $(cargo_build_$(NORTHD_LIB)$(NORTHD_CLI))
+cargo_build_01 = --features command-line --bin ovn_northd_cli
+cargo_build_10 = --lib
+cargo_build_11 = --features command-line
+
+$(ddlog_targets): northd/ddlog.stamp lib/libovn.la $(OVS_LIBDIR)/libopenvswitch.la
+ $(AM_V_GEN)LIBOVN_DEPS=`. lib/libovn.la && echo "$$dependency_libs"` && \
+ LIBOPENVSWITCH_DEPS=`. $(OVS_LIBDIR)/libopenvswitch.la && echo "$$dependency_libs"` && \
+ cd northd/ovn_northd_ddlog && \
+ RUSTC='$(RUSTC)' RUSTFLAGS="$(RUSTFLAGS)" \
+ cargo build --release $(CARGO_VERBOSE) $(cargo_build) --no-default-features --features ovsdb
+endif
+
+CLEAN_LOCAL += clean-ddlog
+clean-ddlog:
+ rm -rf northd/ovn_northd_ddlog northd/ddlog.stamp
+
+CLEANFILES += \
+ northd/ddlog.stamp \
+ northd/ovn_northd_ddlog/ddlog.h \
+ northd/ovn_northd_ddlog/target/release/libovn_northd_ddlog.a \
+ northd/ovn_northd_ddlog/target/release/libovn_northd_ddlog.la \
+ northd/ovn_northd_ddlog/target/release/ovn_northd_cli \
+ northd/OVN_Northbound.dl \
+ northd/OVN_Southbound.dl \
+ northd/ovn-northd-ddlog-nb.inc \
+ northd/ovn-northd-ddlog-sb.inc
new file mode 100644
@@ -0,0 +1,93 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import OVN_Northbound as nb
+import OVN_Southbound as sb
+import ovsdb
+import ovn
+
+/* ACLRef: reference to nb::ACL */
+relation &ACLRef[nb::ACL]
+&ACLRef[acl] :- nb::ACL[acl].
+
+/* DHCP_Options: reference to nb::DHCP_Options */
+relation &DHCP_OptionsRef[nb::DHCP_Options]
+&DHCP_OptionsRef[options] :- nb::DHCP_Options[options].
+
+/* QoS: reference to nb::QoS */
+relation &QoSRef[nb::QoS]
+&QoSRef[qos] :- nb::QoS[qos].
+
+/* LoadBalancerRef: reference to nb::Load_Balancer */
+relation &LoadBalancerRef[nb::Load_Balancer]
+&LoadBalancerRef[lb] :- nb::Load_Balancer[lb].
+
+/* LoadBalancerHealthCheckRef: reference to nb::Load_Balancer_Health_Check */
+relation &LoadBalancerHealthCheckRef[nb::Load_Balancer_Health_Check]
+&LoadBalancerHealthCheckRef[lbhc] :- nb::Load_Balancer_Health_Check[lbhc].
+
+/* MeterRef: reference to nb::Meter */
+relation &MeterRef[nb::Meter]
+&MeterRef[meter] :- nb::Meter[meter].
+
+/* NATRef: reference to nb::NAT */
+relation &NATRef[nb::NAT]
+&NATRef[nat] :- nb::NAT[nat].
+
+/* AddressSetRef: reference to nb::Address_Set */
+relation &AddressSetRef[nb::Address_Set]
+&AddressSetRef[__as] :- nb::Address_Set[__as].
+
+/* ServiceMonitor: reference to sb::Service_Monitor */
+relation &ServiceMonitorRef[sb::Service_Monitor]
+&ServiceMonitorRef[sm] :- sb::Service_Monitor[sm].
+
+/* Switch-to-router logical port connections */
+relation SwitchRouterPeer(lsp: uuid, lsp_name: string, lrp: uuid)
+SwitchRouterPeer(lsp, lsp_name, lrp) :-
+ nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = "router", .options = options),
+ Some{var router_port} = map_get(options, "router-port"),
+ nb::Logical_Router_Port(.name = router_port, ._uuid = lrp).
+
+function map_get_bool_def(m: Map<string, string>,
+ k: string, def: bool): bool = {
+ m.get(k)
+ .and_then(|x| match (str_to_lower(x)) {
+ "false" -> Some{false},
+ "true" -> Some{true},
+ _ -> None
+ })
+ .unwrap_or(def)
+}
+
+function map_get_int_def(m: Map<string, string>, k: string,
+ def: integer): integer = {
+ m.get(k).and_then(parse_dec_u64).unwrap_or(def)
+}
+
+function map_get_int_def_limit(m: Map<string, string>, k: string, def: integer,
+ min: integer, max: integer): integer = {
+ var v = map_get_int_def(m, k, def);
+ var v1 = {
+ if (v < min) min else v
+ };
+ if (v1 > max) max else v1
+}
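The three map helpers above share one pattern: look up a key, parse the value, and fall back to a default (with clamping in the last case, where the default itself is also clamped).  As a rough illustration of their semantics, here is an equivalent Python sketch (names mirror the DDlog functions; this is not part of the build, and unlike DDlog's `parse_dec_u64` the `int()` call below also accepts signed values):

```python
def map_get_bool_def(m, k, default):
    """m[k] interpreted as "true"/"false" (case-insensitive), else default."""
    v = m.get(k)
    if v is None:
        return default
    return {"true": True, "false": False}.get(v.lower(), default)

def map_get_int_def(m, k, default):
    """m[k] parsed as a decimal integer, or default if absent or unparsable."""
    v = m.get(k)
    if v is None:
        return default
    try:
        return int(v, 10)
    except ValueError:
        return default

def map_get_int_def_limit(m, k, default, lo, hi):
    """Like map_get_int_def(), but clamp the result into [lo, hi]."""
    return min(max(map_get_int_def(m, k, default), lo), hi)
```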
+
+function ha_chassis_group_uuid(uuid: uuid): uuid { hash128("hacg" ++ uuid) }
+function ha_chassis_uuid(chassis_name: string, nb_chassis_uuid: uuid): uuid { hash128("hac" ++ chassis_name ++ nb_chassis_uuid) }
+
+/* Dummy relation with one empty row, useful for putting into antijoins. */
+relation Unit()
+Unit().
new file mode 100644
@@ -0,0 +1,506 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * IPAM (IP address management) and MACAM (MAC address management)
+ *
+ * IPAM stands for IP address management.  In the non-virtualized world,
+ * MAC addresses come with the hardware, but with virtualized workloads
+ * they need to be assigned and managed.  This module does both IP address
+ * management (IPAM) and MAC address management (MACAM).
+ */
+
+import OVN_Northbound as nb
+import ovsdb
+import allocate
+import helpers
+import ovn
+import ovn_northd
+import lswitch
+import lrouter
+
+function mAC_ADDR_SPACE(): bit<64> = 64'hffffff
+
+/*
+ * IPv4 dynamic address allocation.
+ */
+
+/*
+ * The fixed portions of a request for a dynamic LSP address.
+ */
+typedef dynamic_address_request = DynamicAddressRequest{
+ mac: Option<eth_addr>,
+ ip4: Option<in_addr>,
+ ip6: Option<in6_addr>
+}
+function parse_dynamic_address_request(s: string): Option<dynamic_address_request> {
+ var tokens = string_split(s, " ");
+ var n = vec_len(tokens);
+ if (n < 1 or n > 3) {
+ return None
+ };
+
+ var t0 = tokens.nth(0).unwrap_or("");
+ var t1 = tokens.nth(1).unwrap_or("");
+ var t2 = tokens.nth(2).unwrap_or("");
+ if (t0 == "dynamic") {
+ if (n == 1) {
+ Some{DynamicAddressRequest{None, None, None}}
+ } else if (n == 2) {
+ match (ip46_parse(t1)) {
+ Some{IPv4{ipv4}} -> Some{DynamicAddressRequest{None, Some{ipv4}, None}},
+ Some{IPv6{ipv6}} -> Some{DynamicAddressRequest{None, None, Some{ipv6}}},
+ _ -> None
+ }
+ } else if (n == 3) {
+ match ((ip_parse(t1), ipv6_parse(t2))) {
+ (Some{ipv4}, Some{ipv6}) -> Some{DynamicAddressRequest{None, Some{ipv4}, Some{ipv6}}},
+ _ -> None
+ }
+ } else {
+ None
+ }
+ } else if (n == 2 and t1 == "dynamic") {
+ match (eth_addr_from_string(t0)) {
+ Some{mac} -> Some{DynamicAddressRequest{Some{mac}, None, None}},
+ _ -> None
+ }
+ } else {
+ None
+ }
+}
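The accepted request forms, then, are "dynamic", "dynamic <IP>", "dynamic <IPv4> <IPv6>", and "<MAC> dynamic".  A loose Python paraphrase of the parser, using the stdlib ipaddress module and returning a (mac, ipv4, ipv6) tuple or None (illustration only, not part of the build):

```python
import ipaddress
import re

MAC_RE = re.compile(r"^([0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$")

def parse_dynamic_address_request(s):
    """Return (mac, ipv4, ipv6) with None for unset parts, or None if invalid."""
    tokens = s.split()
    n = len(tokens)
    if n < 1 or n > 3:
        return None
    if tokens[0] == "dynamic":
        if n == 1:
            return (None, None, None)          # "dynamic"
        if n == 2:                             # "dynamic <IPv4-or-IPv6>"
            try:
                ip = ipaddress.ip_address(tokens[1])
            except ValueError:
                return None
            return (None, ip, None) if ip.version == 4 else (None, None, ip)
        try:                                   # "dynamic <IPv4> <IPv6>"
            ip4 = ipaddress.IPv4Address(tokens[1])
            ip6 = ipaddress.IPv6Address(tokens[2])
        except ValueError:
            return None
        return (None, ip4, ip6)
    if n == 2 and tokens[1] == "dynamic":      # "<MAC> dynamic"
        return (tokens[0], None, None) if MAC_RE.match(tokens[0]) else None
    return None
```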
+
+/* SwitchIPv4ReservedAddress - keeps track of statically reserved IPv4 addresses
+ * for each switch whose subnet option is set, including:
+ * (1) first and last (multicast) address in the subnet range
+ * (2) addresses from `other_config.exclude_ips`
+ * (3) port addresses in lsp.addresses, except "unknown" addresses, addresses of
+ * "router" ports, dynamic addresses
+ * (4) addresses associated with router ports peered with the switch.
+ * (5) static IP component of "dynamic" `lsp.addresses`.
+ *
+ * Addresses are kept in host-endian format (i.e., bit<32> vs in_addr).
+ */
+relation SwitchIPv4ReservedAddress(lswitch: uuid, addr: bit<32>)
+
+/* Add reserved address groups (1) and (2). */
+SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+ .addr = addr) :-
+ &Switch(.ls = ls,
+ .subnet = Some{(_, _, start_ipv4, total_ipv4s)}),
+ var exclude_ips = {
+ var exclude_ips = set_singleton(start_ipv4);
+ set_insert(exclude_ips, start_ipv4 + total_ipv4s - 1);
+ match (map_get(ls.other_config, "exclude_ips")) {
+ None -> exclude_ips,
+ Some{exclude_ip_list} -> match (parse_ip_list(exclude_ip_list)) {
+ Left{err} -> {
+ warn("logical switch ${uuid2str(ls._uuid)}: bad exclude_ips (${err})");
+ exclude_ips
+ },
+ Right{ranges} -> {
+ for (range in ranges) {
+ (var ip_start, var ip_end) = range;
+ var start = iptohl(ip_start);
+ var end = match (ip_end) {
+ None -> start,
+ Some{ip} -> iptohl(ip)
+ };
+ start = max(start_ipv4, start);
+ end = min(start_ipv4 + total_ipv4s - 1, end);
+ if (end >= start) {
+ for (addr in range_vec(start, end+1, 1)) {
+ set_insert(exclude_ips, addr)
+ }
+ } else {
+ warn("logical switch ${uuid2str(ls._uuid)}: excluded addresses not in subnet")
+ }
+ };
+ exclude_ips
+ }
+ }
+ }
+ },
+ var addr = FlatMap(exclude_ips).
+
+/* Add reserved address group (3). */
+SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+ .addr = addr) :-
+ SwitchPortStaticAddresses(
+ .port = &SwitchPort{
+ .sw = &Switch{.ls = ls,
+ .subnet = Some{(_, _, start_ipv4, total_ipv4s)}},
+ .peer = None},
+ .addrs = lport_addrs
+ ),
+ var addrs = {
+ var addrs = set_empty();
+ for (addr in lport_addrs.ipv4_addrs) {
+ var addr_host_endian = iptohl(addr.addr);
+ if (addr_host_endian >= start_ipv4 and addr_host_endian < start_ipv4 + total_ipv4s) {
+ set_insert(addrs, addr_host_endian)
+ } else ()
+ };
+ addrs
+ },
+ var addr = FlatMap(addrs).
+
+/* Add reserved address group (4) */
+SwitchIPv4ReservedAddress(.lswitch = ls._uuid,
+ .addr = addr) :-
+ &SwitchPort(
+ .sw = &Switch{.ls = ls,
+ .subnet = Some{(_, _, start_ipv4, total_ipv4s)}},
+ .peer = Some{&rport}),
+ var addrs = {
+ var addrs = set_empty();
+ for (addr in rport.networks.ipv4_addrs) {
+ var addr_host_endian = iptohl(addr.addr);
+ if (addr_host_endian >= start_ipv4 and addr_host_endian < start_ipv4 + total_ipv4s) {
+ set_insert(addrs, addr_host_endian)
+ } else ()
+ };
+ addrs
+ },
+ var addr = FlatMap(addrs).
+
+/* Add reserved address group (5) */
+SwitchIPv4ReservedAddress(.lswitch = sw.ls._uuid,
+ .addr = iptohl(ip_addr)) :-
+ &SwitchPort(.sw = &sw, .lsp = lsp, .static_dynamic_ipv4 = Some{ip_addr}).
+
+/* Aggregate all reserved addresses for each switch. */
+relation SwitchIPv4ReservedAddresses(lswitch: uuid, addrs: Set<bit<32>>)
+
+SwitchIPv4ReservedAddresses(lswitch, addrs) :-
+ SwitchIPv4ReservedAddress(lswitch, addr),
+ var addrs = addr.group_by(lswitch).to_set().
+
+SwitchIPv4ReservedAddresses(lswitch_uuid, set_empty()) :-
+ nb::Logical_Switch(._uuid = lswitch_uuid),
+ not SwitchIPv4ReservedAddress(lswitch_uuid, _).
+
+/* Allocate dynamic IP addresses for ports that require them:
+ */
+relation SwitchPortAllocatedIPv4DynAddress(lsport: uuid, dyn_addr: Option<in_addr>)
+
+SwitchPortAllocatedIPv4DynAddress(lsport, dyn_addr) :-
+ /* Aggregate all ports of a switch that need a dynamic IP address */
+ port in &SwitchPort(.needs_dynamic_ipv4address = true,
+ .sw = &sw),
+ var switch_id = sw.ls._uuid,
+ var ports = port.group_by(switch_id).to_vec(),
+ SwitchIPv4ReservedAddresses(switch_id, reserved_addrs),
+ /* Allocate dynamic addresses only for ports that don't have a dynamic address
+ * or have one that is no longer valid. */
+ var dyn_addresses = {
+ var used_addrs = reserved_addrs;
+ var assigned_addrs = vec_empty();
+ var need_addr = vec_empty();
+ (var start_ipv4, var total_ipv4s) = match (vec_nth(ports, 0)) {
+ None -> { (0, 0) } /* no ports with dynamic addresses */,
+ Some{port0} -> {
+ match (port0.sw.subnet) {
+ None -> {
+ abort("needs_dynamic_ipv4address is true, but subnet is undefined in port ${uuid2str(deref(port0).lsp._uuid)}");
+ (0, 0)
+ },
+ Some{(_, _, start_ipv4, total_ipv4s)} -> (start_ipv4, total_ipv4s)
+ }
+ }
+ };
+ for (port in ports) {
+ match (deref(port).dynamic_address) {
+ None -> {
+ /* no dynamic address yet -- allocate one now */
+ vec_push(need_addr, deref(port).lsp._uuid)
+ },
+ Some{dynaddr} -> {
+ match (vec_nth(dynaddr.ipv4_addrs, 0)) {
+ None -> {
+ /* dynamic address does not have IPv4 component -- allocate one now */
+ vec_push(need_addr, deref(port).lsp._uuid)
+ },
+ Some{addr} -> {
+ var haddr = iptohl(addr.addr);
+ if (haddr < start_ipv4 or haddr >= start_ipv4 + total_ipv4s) {
+ vec_push(need_addr, deref(port).lsp._uuid)
+ } else if (set_contains(used_addrs, haddr)) {
+ vec_push(need_addr, deref(port).lsp._uuid);
+ warn("Duplicate IP set on switch ${deref(port).lsp.name}: ${addr.addr}")
+ } else {
+ /* has valid dynamic address -- record it in used_addrs */
+ set_insert(used_addrs, haddr);
+ assigned_addrs.push((port.lsp._uuid, Some{haddr}))
+ }
+ }
+ }
+ }
+ }
+ };
+ assigned_addrs.append(allocate_opt(used_addrs, need_addr, start_ipv4, start_ipv4 + total_ipv4s - 1));
+ assigned_addrs
+ },
+ var port_address = FlatMap(dyn_addresses),
+ (var lsport, var dyn_addr_bits) = port_address,
+ var dyn_addr = dyn_addr_bits.map(hltoip).
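The aggregation above is a two-pass allocation per switch: first keep every port whose existing dynamic address is still inside the subnet and not a duplicate, then hand the remaining ports to the allocator.  A simplified Python sketch of that logic, where `allocate_opt` is a lowest-free-address stand-in for DDlog's allocate library and addresses are plain host-endian integers (illustration only):

```python
def allocate_opt(used, need, lo, hi):
    """Assign each id in `need` the lowest free address in [lo, hi]; pair it
    with None when the pool is exhausted (stand-in for DDlog allocate_opt)."""
    out = []
    nxt = lo
    for ident in need:
        while nxt <= hi and nxt in used:
            nxt += 1
        if nxt > hi:
            out.append((ident, None))
        else:
            used.add(nxt)
            out.append((ident, nxt))
            nxt += 1
    return out

def assign_dynamic_ipv4(ports, reserved, start_ipv4, total_ipv4s):
    """ports: list of (port_id, current_addr_or_None).  Keep valid existing
    addresses, then allocate fresh ones for the rest."""
    used = set(reserved)
    assigned, need = [], []
    for pid, cur in ports:
        ok = cur is not None and start_ipv4 <= cur < start_ipv4 + total_ipv4s
        if ok and cur not in used:
            used.add(cur)            # existing address still valid: keep it
            assigned.append((pid, cur))
        else:
            need.append(pid)         # missing, out of subnet, or duplicate
    assigned.extend(allocate_opt(used, need, start_ipv4,
                                 start_ipv4 + total_ipv4s - 1))
    return assigned
```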
+
+/* Compute new dynamic IPv4 address assignment:
+ * - port does not need dynamic IP - use static_dynamic_ip if any
+ * - a new address has been allocated for port - use this address
+ * - otherwise, use existing dynamic IP
+ */
+relation SwitchPortNewIPv4DynAddress(lsport: uuid, dyn_addr: Option<in_addr>)
+
+SwitchPortNewIPv4DynAddress(lsp._uuid, ip_addr) :-
+ &SwitchPort(.sw = &sw,
+ .needs_dynamic_ipv4address = false,
+ .static_dynamic_ipv4 = static_dynamic_ipv4,
+ .lsp = lsp),
+ var ip_addr = {
+ match (static_dynamic_ipv4) {
+ None -> { None },
+ Some{addr} -> {
+ match (sw.subnet) {
+ None -> { None },
+ Some{(_, _, start_ipv4, total_ipv4s)} -> {
+ var haddr = iptohl(addr);
+ if (haddr < start_ipv4 or haddr >= start_ipv4 + total_ipv4s) {
+ /* new static ip is not valid */
+ None
+ } else {
+ Some{addr}
+ }
+ }
+ }
+ }
+ }
+ }.
+
+SwitchPortNewIPv4DynAddress(lsport, addr) :-
+ SwitchPortAllocatedIPv4DynAddress(lsport, addr).
+
+/*
+ * Dynamic MAC address allocation.
+ */
+
+function get_mac_prefix(options: Map<string,string>, uuid: uuid) : bit<64> =
+{
+ var existing_prefix = match (map_get(options, "mac_prefix")) {
+ Some{prefix} -> scan_eth_addr_prefix(prefix),
+ None -> None
+ };
+ match (existing_prefix) {
+ Some{prefix} -> prefix,
+ None -> pseudorandom_mac(uuid, 16'h1234) & 64'hffffff000000
+ }
+}
+function put_mac_prefix(options: Map<string,string>, mac_prefix: bit<64>)
+ : Map<string,string> =
+{
+ map_insert_imm(options, "mac_prefix",
+ string_substr(to_string(eth_addr_from_uint64(mac_prefix)), 0, 8))
+}
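`get_mac_prefix()` and `put_mac_prefix()` round-trip a 24-bit OUI-style prefix through the northbound options column as a string like "aa:bb:cc" (the first 8 characters of a formatted MAC).  A Python sketch of that round trip; `pseudorandom_mac()` here is a hash-based stand-in for the DDlog builtin, so the derived prefixes will differ from real ovn-northd-ddlog output:

```python
import hashlib

def pseudorandom_mac(uuid_int, seed):
    """Stand-in for DDlog's pseudorandom_mac(): hash uuid+seed to 48 bits."""
    h = hashlib.sha1(f"{uuid_int}:{seed}".encode()).digest()
    return int.from_bytes(h[:6], "big")

def get_mac_prefix(options, uuid_int):
    """Parse options['mac_prefix'] ("aa:bb:cc"); else derive one from uuid."""
    prefix = options.get("mac_prefix")
    if prefix is not None:
        parts = prefix.split(":")
        if len(parts) == 3 and all(len(p) == 2 for p in parts):
            try:
                return int("".join(parts), 16) << 24
            except ValueError:
                pass
    return pseudorandom_mac(uuid_int, 0x1234) & 0xFFFFFF000000

def put_mac_prefix(options, mac_prefix):
    """Store the prefix back as its first three octets, "aa:bb:cc"."""
    octets = mac_prefix.to_bytes(6, "big")[:3]
    out = dict(options)
    out["mac_prefix"] = ":".join(f"{o:02x}" for o in octets)
    return out
```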
+relation MacPrefix(mac_prefix: bit<64>)
+MacPrefix(get_mac_prefix(options, uuid)) :-
+ nb::NB_Global(._uuid = uuid, .options = options).
+
+/* ReservedMACAddress - keeps track of statically reserved MAC addresses.
+ * (1) static addresses in `lsp.addresses`
+ * (2) static MAC component of "dynamic" `lsp.addresses`.
+ * (3) addresses associated with router ports peered with the switch.
+ *
+ * Addresses are kept in 64-bit host-endian format.
+ */
+relation ReservedMACAddress(addr: bit<64>)
+
+/* Add reserved address group (1). */
+ReservedMACAddress(.addr = eth_addr_to_uint64(lport_addrs.ea)) :-
+ SwitchPortStaticAddresses(.addrs = lport_addrs).
+
+/* Add reserved address group (2). */
+ReservedMACAddress(.addr = eth_addr_to_uint64(mac_addr)) :-
+ &SwitchPort(.lsp = lsp, .static_dynamic_mac = Some{mac_addr}).
+
+/* Add reserved address group (3). */
+ReservedMACAddress(.addr = eth_addr_to_uint64(rport.networks.ea)) :-
+ &SwitchPort(.peer = Some{&rport}).
+
+/* Aggregate all reserved MAC addresses. */
+relation ReservedMACAddresses(addrs: Set<bit<64>>)
+
+ReservedMACAddresses(addrs) :-
+ ReservedMACAddress(addr),
+ var addrs = addr.group_by(()).to_set().
+
+/* Handle case when `ReservedMACAddress` is empty */
+ReservedMACAddresses(set_empty()) :-
+ // NB_Global should have exactly one record, so we can
+ // use it as a base for antijoin.
+ nb::NB_Global(),
+ not ReservedMACAddress(_).
+
+/* Allocate dynamic MAC addresses for ports that require them:
+ * Case 1: port doesn't need dynamic MAC (i.e., does not have dynamic address or
+ * has a dynamic address with a static MAC).
+ * Case 2: needs dynamic MAC, has dynamic MAC, has existing dynamic MAC with the right prefix
+ * needs dynamic MAC, does not have fixed dynamic MAC, doesn't have existing dynamic MAC with correct prefix
+ */
+relation SwitchPortAllocatedMACDynAddress(lsport: uuid, dyn_addr: bit<64>)
+
+SwitchPortAllocatedMACDynAddress(lsport, dyn_addr),
+SwitchPortDuplicateMACAddress(dup_addrs) :-
+    /* Group all ports that need a dynamic MAC address */
+ port in &SwitchPort(.needs_dynamic_macaddress = true, .lsp = lsp),
+ SwitchPortNewIPv4DynAddress(lsp._uuid, ipv4_addr),
+ var ports = (port, ipv4_addr).group_by(()).to_vec(),
+ ReservedMACAddresses(reserved_addrs),
+ MacPrefix(mac_prefix),
+ (var dyn_addresses, var dup_addrs) = {
+ var used_addrs = reserved_addrs;
+ var need_addr = vec_empty();
+ var dup_addrs = set_empty();
+ for (port_with_addr in ports) {
+ (var port, var ipv4_addr) = port_with_addr;
+ var hint = match (ipv4_addr) {
+ None -> Some { mac_prefix | 1 },
+ Some{addr} -> {
+ /* The tentative MAC's suffix will be in the interval (1, 0xfffffe). */
+ var mac_suffix: bit<24> = iptohl(addr)[23:0] % ((mAC_ADDR_SPACE() - 1)[23:0]) + 1;
+ Some{ mac_prefix | (40'd0 ++ mac_suffix) }
+ }
+ };
+ match (port.dynamic_address) {
+ None -> {
+ /* no dynamic address yet -- allocate one now */
+ vec_push(need_addr, (port.lsp._uuid, hint))
+ },
+ Some{dynaddr} -> {
+ var haddr = eth_addr_to_uint64(dynaddr.ea);
+ if ((haddr ^ mac_prefix) >> 24 != 0) {
+ /* existing dynamic address is no longer valid */
+ vec_push(need_addr, (port.lsp._uuid, hint))
+ } else if (set_contains(used_addrs, haddr)) {
+ set_insert(dup_addrs, dynaddr.ea);
+ } else {
+ /* has valid dynamic address -- record it in used_addrs */
+ set_insert(used_addrs, haddr)
+ }
+ }
+ }
+ };
+ // FIXME: if a port has a dynamic address that is no longer valid, and
+ // we are unable to allocate a new address, the current behavior is to
+ // keep the old invalid address. It should probably be changed to
+ // removing the old address.
+ // FIXME: OVN allocates MAC addresses by seeding them with IPv4 address.
+ // Implement a custom allocation function that simulates this behavior.
+ var res = allocate_with_hint(used_addrs, need_addr, mac_prefix + 1, mac_prefix + mAC_ADDR_SPACE() - 1);
+ var res_strs = vec_empty();
+ for (x in res) {
+ (var uuid, var addr) = x;
+ vec_push(res_strs, "${uuid2str(uuid)}: ${eth_addr_from_uint64(addr)}")
+ };
+ (res, dup_addrs)
+ },
+ var port_address = FlatMap(dyn_addresses),
+ (var lsport, var dyn_addr) = port_address.
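When a port does need a new MAC, the rule above seeds the allocator with a hint derived from the port's IPv4 address, so the MAC suffix tends to track the IP and allocations stay stable across runs.  The suffix computation in Python form (illustration only):

```python
MAC_ADDR_SPACE = 0xFFFFFF   # 24 bits of suffix space under one MAC prefix

def mac_hint(mac_prefix, ipv4_host):
    """Tentative MAC for a port: prefix | suffix.  The suffix is derived from
    the host-endian IPv4 address and always lands in [1, 0xfffffe], so the
    all-zeros and all-ones suffixes are never hinted."""
    if ipv4_host is None:
        return mac_prefix | 1
    suffix = (ipv4_host & MAC_ADDR_SPACE) % (MAC_ADDR_SPACE - 1) + 1
    return mac_prefix | suffix
```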
+
+relation SwitchPortDuplicateMACAddress(dup_addrs: Set<eth_addr>)
+Warning["Duplicate MAC set: ${ea}"] :-
+ SwitchPortDuplicateMACAddress(dup_addrs),
+ var ea = FlatMap(dup_addrs).
+
+/* Compute new dynamic MAC address assignment:
+ * - port does not need dynamic MAC - use `static_dynamic_mac`
+ * - a new address has been allocated for port - use this address
+ * - otherwise, use existing dynamic MAC
+ */
+relation SwitchPortNewMACDynAddress(lsport: uuid, dyn_addr: Option<eth_addr>)
+
+SwitchPortNewMACDynAddress(lsp._uuid, mac_addr) :-
+ &SwitchPort(.needs_dynamic_macaddress = false,
+ .lsp = lsp,
+ .sw = &sw,
+ .static_dynamic_mac = static_dynamic_mac),
+ var mac_addr = match (static_dynamic_mac) {
+ None -> None,
+ Some{addr} -> {
+ if (is_some(sw.subnet) or is_some(sw.ipv6_prefix) or
+ map_get(sw.ls.other_config, "mac_only") == Some{"true"}) {
+ Some{addr}
+ } else {
+ None
+ }
+ }
+ }.
+
+SwitchPortNewMACDynAddress(lsport, Some{eth_addr_from_uint64(addr)}) :-
+ SwitchPortAllocatedMACDynAddress(lsport, addr).
+
+SwitchPortNewMACDynAddress(lsp._uuid, addr) :-
+ &SwitchPort(.needs_dynamic_macaddress = true, .lsp = lsp, .dynamic_address = cur_address),
+ not SwitchPortAllocatedMACDynAddress(lsp._uuid, _),
+ var addr = match (cur_address) {
+ None -> None,
+ Some{dynaddr} -> Some{dynaddr.ea}
+ }.
+
+/*
+ * Dynamic IPv6 address allocation.
+ * `needs_dynamic_ipv6address` -> in6_generate_eui64(mac, ipv6_prefix)
+ */
+relation SwitchPortNewDynamicAddress(port: Ref<SwitchPort>, address: Option<lport_addresses>)
+
+SwitchPortNewDynamicAddress(port, None) :-
+ port in &SwitchPort(.lsp = lsp),
+ SwitchPortNewMACDynAddress(lsp._uuid, None).
+
+SwitchPortNewDynamicAddress(port, lport_address) :-
+ port in &SwitchPort(.lsp = lsp,
+ .sw = &sw,
+ .needs_dynamic_ipv6address = needs_dynamic_ipv6address,
+ .static_dynamic_ipv6 = static_dynamic_ipv6),
+ SwitchPortNewMACDynAddress(lsp._uuid, Some{mac_addr}),
+ SwitchPortNewIPv4DynAddress(lsp._uuid, opt_ip4_addr),
+ var ip6_addr = match ((static_dynamic_ipv6, needs_dynamic_ipv6address, sw.ipv6_prefix)) {
+ (Some{ipv6}, _, _) -> " ${ipv6}",
+ (_, true, Some{prefix}) -> " ${in6_generate_eui64(mac_addr, prefix)}",
+ _ -> ""
+ },
+ var ip4_addr = match (opt_ip4_addr) {
+ None -> "",
+ Some{ip4} -> " ${ip4}"
+ },
+ var addr_string = "${mac_addr}${ip6_addr}${ip4_addr}",
+ var lport_address = extract_addresses(addr_string).
+
+
+// FIXME: If there is more than one dynamic address in port->addresses, the
+// C implementation logs a warning:
+//
+//     VLOG_WARN_RL(&rl, "More than one dynamic address "
+//                  "configured for logical switch port '%s'",
+//                  nbsp->name);
+//
+// and allocates only the first dynamic address.
new file mode 100644
@@ -0,0 +1,715 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import OVN_Northbound as nb
+import OVN_Southbound as sb
+import multicast
+import ovsdb
+import ovn
+import helpers
+import lswitch
+import ovn_northd
+
+function is_enabled(lr: nb::Logical_Router): bool { is_enabled(lr.enabled) }
+function is_enabled(lrp: nb::Logical_Router_Port): bool { is_enabled(lrp.enabled) }
+function is_enabled(rp: RouterPort): bool { rp.lrp.is_enabled() }
+function is_enabled(rp: Ref<RouterPort>): bool { rp.lrp.is_enabled() }
+
+/* Default logical flow priority for distributed routes. */
+function dROUTE_PRIO(): bit<32> = 400
+
+/* LogicalRouterPortCandidate.
+ *
+ * Each row pairs a logical router port with its logical router, but without
+ * checking that the logical router port is on only one logical router.
+ *
+ * (Use LogicalRouterPort instead, which guarantees uniqueness.) */
+relation LogicalRouterPortCandidate(lrp_uuid: uuid, lr_uuid: uuid)
+LogicalRouterPortCandidate(lrp_uuid, lr_uuid) :-
+ nb::Logical_Router(._uuid = lr_uuid, .ports = ports),
+ var lrp_uuid = FlatMap(ports).
+Warning[message] :-
+ LogicalRouterPortCandidate(lrp_uuid, lr_uuid),
+ var lrs = lr_uuid.group_by(lrp_uuid).to_set(),
+ set_size(lrs) > 1,
+ lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+ var message = "Bad configuration: logical router port ${lrp.name} belongs "
+ "to more than one logical router".
+
+/* Each row means 'lport' is in 'lrouter' (and only that lrouter). */
+relation LogicalRouterPort(lport: uuid, lrouter: uuid)
+LogicalRouterPort(lrp_uuid, lr_uuid) :-
+ LogicalRouterPortCandidate(lrp_uuid, lr_uuid),
+ var lrs = lr_uuid.group_by(lrp_uuid).to_set(),
+ set_size(lrs) == 1,
+ Some{var lr_uuid} = set_nth(lrs, 0).
+
+/*
+ * Peer routers.
+ *
+ * Each row in the relation indicates that routers 'a' and 'b' can reach
+ * each other directly through router ports.
+ *
+ * This relation is symmetric: if (a,b) then (b,a).
+ * This relation is antireflexive: if (a,b) then a != b.
+ *
+ * Routers aren't peers if they can reach each other only through logical
+ * switch ports (that's the ReachableLogicalRouter table).
+ */
+relation PeerLogicalRouter(a: uuid, b: uuid)
+PeerLogicalRouter(lrp_uuid, peer._uuid) :-
+ LogicalRouterPort(lrp_uuid, _),
+ lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+ Some{var peer_name} = lrp.peer,
+ peer in nb::Logical_Router_Port(.name = peer_name),
+ peer.peer == Some{lrp.name}, // 'peer' must point back to 'lrp'
+ lrp_uuid != peer._uuid. // No reflexive pointers.
+
+/*
+ * First-hop routers.
+ *
+ * Each row indicates that 'lrouter' is a first-hop logical router for
+ * 'lswitch', that is, that a "cable" directly connects 'lrouter' and
+ * 'lswitch'.
+ *
+ * A switch can have multiple first-hop routers. */
+relation FirstHopLogicalRouter(lrouter: uuid, lswitch: uuid)
+FirstHopLogicalRouter(lrouter, lswitch) :-
+ LogicalRouterPort(lrp_uuid, lrouter),
+ lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+ LogicalSwitchPort(lsp_uuid, lswitch),
+ lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+ lsp.__type == "router",
+ map_get(lsp.options, "router-port") == Some{lrp.name},
+ is_none(lrp.peer).
+
+/*
+ * Reachable routers.
+ *
+ * Each row in the relation indicates that routers 'a' and 'b' can reach each
+ * other directly or indirectly through any chain of logical routers and
+ * switches.
+ *
+ * This relation is symmetric: if (a,b) then (b,a).
+ * This relation is reflexive: (a,a) is always true.
+ */
+relation ReachableLogicalRouter(a: uuid, b: uuid)
+ReachableLogicalRouter(a, b) :-
+ PeerLogicalRouter(a, c),
+ ReachableLogicalRouter(c, b).
+ReachableLogicalRouter(a, b) :-
+ FirstHopLogicalRouter(a, ls),
+ FirstHopLogicalRouter(b, ls).
+ReachableLogicalRouter(a, b) :-
+ ReachableLogicalRouter(a, c),
+ ReachableLogicalRouter(c, b).
+ReachableLogicalRouter(a, a) :- ReachableLogicalRouter(a, _).
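These four rules amount to a reflexive, symmetric, transitive closure over two base relations: router-port peerings and shared first-hop switches.  A naive Python fixpoint computing a similar closure (illustrative only; the DDlog rules derive peer reachability through ReachableLogicalRouter itself rather than adding peer pairs directly, so isolated peer pairs can behave differently):

```python
def reachable_routers(peers, first_hops):
    """peers: set of (a, b) router-port peerings; first_hops: set of
    (router, switch) attachments.  Returns a set of ordered (a, b) pairs."""
    reach = set()
    # Routers attached to the same switch reach each other, including (a, a).
    for r1, s1 in first_hops:
        for r2, s2 in first_hops:
            if s1 == s2:
                reach.add((r1, r2))
    # Peered routers reach each other (symmetric).
    for a, b in peers:
        reach.add((a, b))
        reach.add((b, a))
    # Transitive closure to a fixpoint.
    changed = True
    while changed:
        changed = False
        for a, c in list(reach):
            for c2, b in list(reach):
                if c == c2 and (a, b) not in reach:
                    reach.add((a, b))
                    changed = True
    return reach
```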
+
+// ha_chassis_group and gateway_chassis may not both be present.
+Warning[message] :-
+ lrp in nb::Logical_Router_Port(),
+ is_some(lrp.ha_chassis_group),
+ not set_is_empty(lrp.gateway_chassis),
+ var message = "Both ha_chassis_group and gateway_chassis configured on "
+ "port ${lrp.name}; ignoring the latter".
+
+// A distributed gateway port cannot also be an L3 gateway router.
+Warning[message] :-
+ lrp in nb::Logical_Router_Port(),
+ is_some(lrp.ha_chassis_group)
+ or not set_is_empty(lrp.gateway_chassis),
+ map_contains_key(lrp.options, "chassis"),
+ var message = "Bad configuration: distributed gateway port configured on "
+ "port ${lrp.name} on L3 gateway router".
+
+/* DistributedGatewayPortCandidate.
+ *
+ * Each row pairs a logical router with its distributed gateway port,
+ * but without checking that there is at most one DGP per LR.
+ *
+ * (Use DistributedGatewayPort instead, since it guarantees uniqueness.) */
+relation DistributedGatewayPortCandidate(lr_uuid: uuid, lrp_uuid: uuid)
+DistributedGatewayPortCandidate(lr_uuid, lrp_uuid) :-
+ lr in nb::Logical_Router(._uuid = lr_uuid),
+ LogicalRouterPort(lrp_uuid, lr._uuid),
+ lrp in nb::Logical_Router_Port(._uuid = lrp_uuid),
+ not map_contains_key(lrp.options, "chassis"),
+ var has_hcg = is_some(lrp.ha_chassis_group),
+ var has_gc = not set_is_empty(lrp.gateway_chassis),
+ has_hcg or has_gc.
+Warning[message] :-
+ DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
+ var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
+ set_size(lrps) > 1,
+ lr in nb::Logical_Router(._uuid = lr_uuid),
+ var message = "Bad configuration: multiple distributed gateway ports on "
+ "logical router ${lr.name}; ignoring all of them".
+
+/* Distributed gateway ports.
+ *
+ * Each row means 'lrp' is the distributed gateway port on 'lr_uuid'.
+ *
+ * There is at most one distributed gateway port per logical router. */
+relation DistributedGatewayPort(lrp: nb::Logical_Router_Port, lr_uuid: uuid)
+DistributedGatewayPort(lrp, lr_uuid) :-
+ DistributedGatewayPortCandidate(lr_uuid, lrp_uuid),
+ var lrps = lrp_uuid.group_by(lr_uuid).to_set(),
+ set_size(lrps) == 1,
+ Some{var lrp_uuid} = set_nth(lrps, 0),
+ lrp in nb::Logical_Router_Port(._uuid = lrp_uuid).
+
+/* HAChassis is an abstraction over nb::Gateway_Chassis and nb::HA_Chassis, which
+ * are different ways to represent the same configuration.  Each row is
+ * effectively one HA_Chassis record.  (Normally we would associate each
+ * row with a particular 'lr_uuid', but it's permissible for more than one
+ * logical router to use a HA chassis group, so we omit it so that multiple
+ * references get merged.)
+ *
+ * nb::Gateway_Chassis has an "options" column that this omits because
+ * nb::HA_Chassis doesn't have anything similar. That's OK because no options
+ * were ever defined. */
+relation HAChassis(hacg_uuid: uuid,
+ hac_uuid: uuid,
+ chassis_name: string,
+ priority: integer,
+ external_ids: Map<string,string>)
+HAChassis(ha_chassis_group_uuid(lrp._uuid), gw_chassis_uuid,
+ chassis_name, priority, external_ids) :-
+ DistributedGatewayPort(.lrp = lrp),
+ is_none(lrp.ha_chassis_group),
+ var gw_chassis_uuid = FlatMap(lrp.gateway_chassis),
+ nb::Gateway_Chassis(._uuid = gw_chassis_uuid,
+ .chassis_name = chassis_name,
+ .priority = priority,
+ .external_ids = eids),
+ var external_ids = map_insert_imm(eids, "chassis-name", chassis_name).
+HAChassis(ha_chassis_group_uuid(ha_chassis_group._uuid), ha_chassis_uuid,
+ chassis_name, priority, external_ids) :-
+ DistributedGatewayPort(.lrp = lrp),
+ Some{var hac_group_uuid} = lrp.ha_chassis_group,
+ ha_chassis_group in nb::HA_Chassis_Group(._uuid = hac_group_uuid),
+ var ha_chassis_uuid = FlatMap(ha_chassis_group.ha_chassis),
+ nb::HA_Chassis(._uuid = ha_chassis_uuid,
+ .chassis_name = chassis_name,
+ .priority = priority,
+ .external_ids = eids),
+ var external_ids = map_insert_imm(eids, "chassis-name", chassis_name).
+
+/* HAChassisGroup is an abstraction for sb::HA_Chassis_Group that papers over
+ * the two northbound ways to configure it, via nb::Gateway_Chassis and
+ * nb::HA_Chassis.  The former configuration method does not provide a name or
+ * external_ids for the group (only for individual chassis), so we generate
+ * them.
+ *
+ * (Normally we would associate each row with a particular 'lr_uuid', but it's
+ * permissible for more than one logical router to use a HA chassis group, so
+ * we omit it so that multiple references get merged.)
+ */
+relation HAChassisGroup(uuid: uuid,
+ name: string,
+ external_ids: Map<string,string>)
+HAChassisGroup(ha_chassis_group_uuid(lrp._uuid), lrp.name, map_empty()) :-
+ DistributedGatewayPort(.lrp = lrp),
+ is_none(lrp.ha_chassis_group),
+ not set_is_empty(lrp.gateway_chassis).
+HAChassisGroup(ha_chassis_group_uuid(hac_group_uuid),
+ name, external_ids) :-
+ DistributedGatewayPort(.lrp = lrp),
+ Some{var hac_group_uuid} = lrp.ha_chassis_group,
+    nb::HA_Chassis_Group(._uuid = hac_group_uuid,
+ .name = name,
+ .external_ids = external_ids).
+
+/* Each row maps from a logical router to the UUID of its HAChassisGroup.
+ * This level of indirection is needed because multiple logical routers
+ * are allowed to reference a given HAChassisGroup. */
+relation LogicalRouterHAChassisGroup(lr_uuid: uuid,
+ hacg_uuid: uuid)
+LogicalRouterHAChassisGroup(lr_uuid, ha_chassis_group_uuid(lrp._uuid)) :-
+ DistributedGatewayPort(lrp, lr_uuid),
+ is_none(lrp.ha_chassis_group),
+ set_size(lrp.gateway_chassis) > 0.
+LogicalRouterHAChassisGroup(lr_uuid,
+ ha_chassis_group_uuid(hac_group_uuid)) :-
+ DistributedGatewayPort(lrp, lr_uuid),
+ Some{var hac_group_uuid} = lrp.ha_chassis_group,
+ nb::HA_Chassis_Group(._uuid = hac_group_uuid).
+
+
+/* For each router port, tracks whether it's a redirect port of its router */
+relation RouterPortIsRedirect(lrp: uuid, is_redirect: bool)
+RouterPortIsRedirect(lrp, true) :- DistributedGatewayPort(nb::Logical_Router_Port{._uuid = lrp}, _).
+RouterPortIsRedirect(lrp, false) :-
+ nb::Logical_Router_Port(._uuid = lrp),
+ not DistributedGatewayPort(nb::Logical_Router_Port{._uuid = lrp}, _).
+
+relation LogicalRouterRedirectPort(lr: uuid, has_redirect_port: Option<nb::Logical_Router_Port>)
+
+LogicalRouterRedirectPort(lr, Some{lrp}) :-
+ DistributedGatewayPort(lrp, lr).
+
+LogicalRouterRedirectPort(lr, None) :-
+ nb::Logical_Router(._uuid = lr),
+ not DistributedGatewayPort(_, lr).
+
+typedef ExceptionalExtIps = AllowedExtIps{ips: Ref<nb::Address_Set>}
+ | ExemptedExtIps{ips: Ref<nb::Address_Set>}
+
+typedef NAT = NAT{
+ nat: Ref<nb::NAT>,
+ external_ip: v46_ip,
+ external_mac: Option<eth_addr>,
+ exceptional_ext_ips: Option<ExceptionalExtIps>
+}
+
+relation LogicalRouterNAT0(
+ lr: uuid,
+ nat: Ref<nb::NAT>,
+ external_ip: v46_ip,
+ external_mac: Option<eth_addr>)
+LogicalRouterNAT0(lr, nat, external_ip, external_mac) :-
+ nb::Logical_Router(._uuid = lr, .nat = nats),
+ var nat_uuid = FlatMap(nats),
+ nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+ Some{var external_ip} = ip46_parse(nat.external_ip),
+ var external_mac = match (nat.external_mac) {
+ Some{s} -> eth_addr_from_string(s),
+ None -> None
+ }.
+Warning["Bad ip address ${nat.external_ip} in nat configuration for router ${lr_name}."] :-
+ nb::Logical_Router(._uuid = lr, .nat = nats, .name = lr_name),
+ var nat_uuid = FlatMap(nats),
+ nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+ None = ip46_parse(nat.external_ip).
+Warning["Bad MAC address ${s} in nat configuration for router ${lr_name}."] :-
+ nb::Logical_Router(._uuid = lr, .nat = nats, .name = lr_name),
+ var nat_uuid = FlatMap(nats),
+ nat in &NATRef[nb::NAT{._uuid = nat_uuid}],
+ Some{var s} = nat.external_mac,
+ None = eth_addr_from_string(s).
+
+relation LogicalRouterNAT(lr: uuid, nat: NAT)
+LogicalRouterNAT(lr, NAT{nat, external_ip, external_mac, None}) :-
+ LogicalRouterNAT0(lr, nat, external_ip, external_mac),
+ nat.allowed_ext_ips.is_none(),
+ nat.exempted_ext_ips.is_none().
+LogicalRouterNAT(lr, NAT{nat, external_ip, external_mac, Some{AllowedExtIps{__as}}}) :-
+ LogicalRouterNAT0(lr, nat, external_ip, external_mac),
+ nat.exempted_ext_ips.is_none(),
+ Some{var __as_uuid} = nat.allowed_ext_ips,
+ __as in &AddressSetRef[nb::Address_Set{._uuid = __as_uuid}].
+LogicalRouterNAT(lr, NAT{nat, external_ip, external_mac, Some{ExemptedExtIps{__as}}}) :-
+ LogicalRouterNAT0(lr, nat, external_ip, external_mac),
+ nat.allowed_ext_ips.is_none(),
+ Some{var __as_uuid} = nat.exempted_ext_ips,
+ __as in &AddressSetRef[nb::Address_Set{._uuid = __as_uuid}].
+Warning["NAT rule: ${nat._uuid} not applied, since "
+        "both allowed and exempt external ips set"] :-
+ LogicalRouterNAT0(lr, nat, _, _),
+ nat.allowed_ext_ips.is_some() and nat.exempted_ext_ips.is_some().
+
+relation LogicalRouterNATs(lr: uuid, nat: Vec<NAT>)
+
+LogicalRouterNATs(lr, nats) :-
+ LogicalRouterNAT(lr, nat),
+ var nats = nat.group_by(lr).to_vec().
+
+LogicalRouterNATs(lr, vec_empty()) :-
+ nb::Logical_Router(._uuid = lr),
+ not LogicalRouterNAT(lr, _).
+
+/* For each router, collect the set of IPv4 and IPv6 addresses used for SNAT,
+ * which includes:
+ *
+ * - dnat_force_snat_addrs
+ * - lb_force_snat_addrs
+ * - IP addresses used in the router's attached NAT rules
+ *
+ * This is like init_nat_entries() in ovn-northd.c. */
+relation LogicalRouterSnatIP(lr: uuid, snat_ip: v46_ip, nat: Option<NAT>)
+LogicalRouterSnatIP(lr._uuid, force_snat_ip, None) :-
+ lr in nb::Logical_Router(),
+ var dnat_force_snat_ips = get_force_snat_ip(lr, "dnat"),
+ var lb_force_snat_ips = get_force_snat_ip(lr, "lb"),
+ var force_snat_ip = FlatMap(dnat_force_snat_ips.union(lb_force_snat_ips)).
+LogicalRouterSnatIP(lr, snat_ip, Some{nat}) :-
+ LogicalRouterNAT(lr, nat@NAT{.nat = &nb::NAT{.__type = "snat"}, .external_ip = snat_ip}).
+
+function group_to_setunionmap(g: Group<'K1, ('K2,Set<'V>)>): Map<'K2,Set<'V>> {
+ var map = map_empty();
+ for (entry in g) {
+ (var key, var value) = entry;
+ match (map.get(key)) {
+ None -> map.insert(key, value),
+ Some{old_value} -> map.insert(key, old_value.union(value))
+ }
+ };
+ map
+}
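+/* Illustrative example (hypothetical data): grouping the pairs
+ * (lr1, (ip_a, {nat1})) and (lr1, (ip_a, {nat2})) by 'lr1' and applying
+ * group_to_setunionmap() yields {ip_a -> {nat1, nat2}}: when the same
+ * key recurs within a group, its sets are merged with union() rather
+ * than the later value overwriting the earlier one. */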
+relation LogicalRouterSnatIPs(lr: uuid, snat_ips: Map<v46_ip, Set<NAT>>)
+LogicalRouterSnatIPs(lr, snat_ips) :-
+ LogicalRouterSnatIP(lr, snat_ip, nat),
+ var snat_ips = (snat_ip, nat.to_set()).group_by(lr).group_to_setunionmap().
+LogicalRouterSnatIPs(lr._uuid, map_empty()) :-
+ lr in nb::Logical_Router(),
+ not LogicalRouterSnatIP(.lr = lr._uuid).
+
+relation LogicalRouterLB(lr: uuid, lb: Ref<nb::Load_Balancer>)
+
+LogicalRouterLB(lr, lb) :-
+ nb::Logical_Router(._uuid = lr, .load_balancer = lbs),
+ var lb_uuid = FlatMap(lbs),
+ lb in &LoadBalancerRef[nb::Load_Balancer{._uuid = lb_uuid}].
+
+relation LogicalRouterLBs(lr: uuid, lbs: Vec<Ref<nb::Load_Balancer>>)
+
+LogicalRouterLBs(lr, lbs) :-
+ LogicalRouterLB(lr, lb),
+ var lbs = lb.group_by(lr).to_vec().
+
+LogicalRouterLBs(lr, vec_empty()) :-
+ nb::Logical_Router(._uuid = lr),
+ not LogicalRouterLB(lr, _).
+
+/* Router relation collects all attributes of a logical router.
+ *
+ * `lr` - Logical_Router record from the NB database
+ * `l3dgw_port` - optional redirect port (see `DistributedGatewayPort`)
+ * `redirect_port_name` - derived redirect port name (or empty string if
+ * router does not have a redirect port)
+ * `is_gateway` - true iff the router is a gateway router. Together with
+ * `l3dgw_port`, this flag affects the generation of various flows
+ * related to NAT and load balancing.
+ * `learn_from_arp_request` - whether ARP requests to addresses on the router
+ * should always be learned
+ */
+
+function chassis_redirect_name(port_name: string): string = "cr-${port_name}"
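+/* For example, chassis_redirect_name("lrp0") is "cr-lrp0": the "cr-"
+ * prefix marks the chassisredirect port derived from a distributed
+ * gateway port. */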
+
+relation &Router(
+ lr: nb::Logical_Router,
+ l3dgw_port: Option<nb::Logical_Router_Port>,
+ redirect_port_name: string,
+ is_gateway: bool,
+ nats: Vec<NAT>,
+ snat_ips: Map<v46_ip, Set<NAT>>,
+ lbs: Vec<Ref<nb::Load_Balancer>>,
+ mcast_cfg: Ref<McastRouterCfg>,
+ learn_from_arp_request: bool
+)
+
+&Router(.lr = lr,
+ .l3dgw_port = l3dgw_port,
+ .redirect_port_name =
+ match (l3dgw_port) {
+ Some{rport} -> json_string_escape(chassis_redirect_name(rport.name)),
+ _ -> ""
+ },
+ .is_gateway = is_some(map_get(lr.options, "chassis")),
+ .nats = nats,
+ .snat_ips = snat_ips,
+ .lbs = lbs,
+ .mcast_cfg = mcast_cfg,
+ .learn_from_arp_request = learn_from_arp_request) :-
+ lr in nb::Logical_Router(),
+ lr.is_enabled(),
+ LogicalRouterRedirectPort(lr._uuid, l3dgw_port),
+ LogicalRouterNATs(lr._uuid, nats),
+ LogicalRouterLBs(lr._uuid, lbs),
+ LogicalRouterSnatIPs(lr._uuid, snat_ips),
+ mcast_cfg in &McastRouterCfg(.datapath = lr._uuid),
+ var learn_from_arp_request = map_get_bool_def(lr.options, "always_learn_from_arp_request", true).
+
+/* RouterLB: many-to-many relation between logical routers and nb::LB */
+relation RouterLB(router: Ref<Router>, lb: Ref<nb::Load_Balancer>)
+
+RouterLB(router, lb) :-
+ router in &Router(.lbs = lbs),
+ var lb = FlatMap(lbs).
+
+/* Load balancer VIPs associated with routers */
+relation RouterLBVIP(
+ router: Ref<Router>,
+ lb: Ref<nb::Load_Balancer>,
+ vip: string,
+ backends: string)
+
+RouterLBVIP(router, lb, vip, backends) :-
+ RouterLB(router, lb@(&nb::Load_Balancer{.vips = vips})),
+ var kv = FlatMap(vips),
+ (var vip, var backends) = kv.
+
+/* Router-to-router logical port connections */
+relation RouterRouterPeer(rport1: uuid, rport2: uuid, rport2_name: string)
+
+RouterRouterPeer(rport1, rport2, peer_name) :-
+ nb::Logical_Router_Port(._uuid = rport1, .peer = peer),
+ Some{var peer_name} = peer,
+ nb::Logical_Router_Port(._uuid = rport2, .name = peer_name).
+
+/* A router port can peer with another router port, with a switch port, or
+ * have no peer.
+ */
+typedef RouterPeer = PeerRouter{rport: uuid, name: string}
+ | PeerSwitch{sport: uuid, name: string}
+ | PeerNone
+
+function router_peer_name(peer: RouterPeer): Option<string> = {
+ match (peer) {
+ PeerRouter{_, n} -> Some{n},
+ PeerSwitch{_, n} -> Some{n},
+ PeerNone -> None
+ }
+}
+
+relation RouterPortPeer(rport: uuid, peer: RouterPeer)
+
+/* Router-to-router logical port connections */
+RouterPortPeer(rport, PeerSwitch{sport, sport_name}) :-
+ SwitchRouterPeer(sport, sport_name, rport).
+
+RouterPortPeer(rport1, PeerRouter{rport2, rport2_name}) :-
+ RouterRouterPeer(rport1, rport2, rport2_name).
+
+RouterPortPeer(rport, PeerNone) :-
+ nb::Logical_Router_Port(._uuid = rport),
+ not SwitchRouterPeer(_, _, rport),
+ not RouterRouterPeer(rport, _, _).
+
+/* Each row maps from a Logical_Router_Port to the options in its
+ * corresponding Port_Binding (if any).  This is because northd preserves
+ * most of the options in that column.  (northd unconditionally sets the
+ * ipv6_prefix_delegation and ipv6_prefix options, so we remove them for
+ * faster convergence.) */
+relation RouterPortSbOptions(lrp_uuid: uuid, options: Map<string,string>)
+RouterPortSbOptions(lrp._uuid, options) :-
+ lrp in nb::Logical_Router_Port(),
+ pb in sb::Port_Binding(._uuid = lrp._uuid),
+ var options = {
+ var options = pb.options;
+ map_remove(options, "ipv6_prefix");
+ map_remove(options, "ipv6_prefix_delegation");
+ options
+ }.
+RouterPortSbOptions(lrp._uuid, map_empty()) :-
+ lrp in nb::Logical_Router_Port(),
+ not sb::Port_Binding(._uuid = lrp._uuid).
+
+/* FIXME: what should happen when extract_lrp_networks fails? */
+/* RouterPort relation collects all attributes of a logical router port */
+relation &RouterPort(
+ lrp: nb::Logical_Router_Port,
+ json_name: string,
+ networks: lport_addresses,
+ router: Ref<Router>,
+ is_redirect: bool,
+ peer: RouterPeer,
+ mcast_cfg: Ref<McastPortCfg>,
+ sb_options: Map<string,string>)
+
+&RouterPort(.lrp = lrp,
+ .json_name = json_string_escape(lrp.name),
+ .networks = networks,
+ .router = router,
+ .is_redirect = is_redirect,
+ .peer = peer,
+ .mcast_cfg = mcast_cfg,
+ .sb_options = sb_options) :-
+ nb::Logical_Router_Port[lrp],
+ Some{var networks} = extract_lrp_networks(lrp.mac, lrp.networks),
+ LogicalRouterPort(lrp._uuid, lrouter_uuid),
+ router in &Router(.lr = nb::Logical_Router{._uuid = lrouter_uuid}),
+ RouterPortIsRedirect(lrp._uuid, is_redirect),
+ RouterPortPeer(lrp._uuid, peer),
+ mcast_cfg in &McastPortCfg(.port = lrp._uuid, .router_port = true),
+ RouterPortSbOptions(lrp._uuid, sb_options).
+
+relation RouterPortNetworksIPv4Addr(port: Ref<RouterPort>, addr: ipv4_netaddr)
+
+RouterPortNetworksIPv4Addr(port, addr) :-
+ port in &RouterPort(.networks = networks),
+ var addr = FlatMap(networks.ipv4_addrs).
+
+relation RouterPortNetworksIPv6Addr(port: Ref<RouterPort>, addr: ipv6_netaddr)
+
+RouterPortNetworksIPv6Addr(port, addr) :-
+ port in &RouterPort(.networks = networks),
+ var addr = FlatMap(networks.ipv6_addrs).
+
+/* StaticRoute: Collects and parses attributes of a static route. */
+typedef route_policy = SrcIp | DstIp
+function route_policy_from_string(s: Option<string>): route_policy = {
+ match (s) {
+ Some{"src-ip"} -> SrcIp,
+ _ -> DstIp
+ }
+}
+function to_string(policy: route_policy): string = {
+ match (policy) {
+ SrcIp -> "src-ip",
+ DstIp -> "dst-ip"
+ }
+}
+
+typedef route_key = RouteKey {
+ policy: route_policy,
+ ip_prefix: v46_ip,
+ plen: bit<32>
+}
+
+relation &StaticRoute(lrsr: nb::Logical_Router_Static_Route,
+ key: route_key,
+ nexthop: v46_ip,
+ output_port: Option<string>,
+ ecmp_symmetric_reply: bool)
+
+&StaticRoute(.lrsr = lrsr,
+ .key = RouteKey{policy, ip_prefix, plen},
+ .nexthop = nexthop,
+ .output_port = lrsr.output_port,
+ .ecmp_symmetric_reply = esr) :-
+ lrsr in nb::Logical_Router_Static_Route(),
+ var policy = route_policy_from_string(lrsr.policy),
+ Some{(var nexthop, var nexthop_plen)} = ip46_parse_cidr(lrsr.nexthop),
+ match (nexthop) {
+ IPv4{_} -> nexthop_plen == 32,
+ IPv6{_} -> nexthop_plen == 128
+ },
+ Some{(var ip_prefix, var plen)} = ip46_parse_cidr(lrsr.ip_prefix),
+ match ((nexthop, ip_prefix)) {
+ (IPv4{_}, IPv4{_}) -> true,
+ (IPv6{_}, IPv6{_}) -> true,
+ _ -> false
+ },
+ var esr = map_get_bool_def(lrsr.options, "ecmp_symmetric_reply", false).
+
+/* Returns the IP address among 'networks' that is in the same subnet
+ * as 'ip'.  Returns None if there is none. */
+function find_lrp_member_ip(networks: lport_addresses, ip: v46_ip): Option<v46_ip> =
+{
+ match (ip) {
+ IPv4{ip4} -> {
+ for (na in networks.ipv4_addrs) {
+ if (ip_same_network((na.addr, ip4), ipv4_netaddr_mask(na))) {
+ /* There should be only 1 interface that matches the
+ * supplied IP. Otherwise, it's a configuration error,
+ * because subnets of a router's interfaces should NOT
+ * overlap. */
+ return Some{IPv4{na.addr}}
+ }
+ };
+ return None
+ },
+ IPv6{ip6} -> {
+ for (na in networks.ipv6_addrs) {
+ if (ipv6_same_network((na.addr, ip6), ipv6_netaddr_mask(na))) {
+ /* There should be only 1 interface that matches the
+ * supplied IP. Otherwise, it's a configuration error,
+ * because subnets of a router's interfaces should NOT
+ * overlap. */
+ return Some{IPv6{na.addr}}
+ }
+ };
+ return None
+ }
+ }
+}
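+/* Illustrative example (hypothetical data): if 'networks' includes the
+ * IPv4 address 192.168.1.1/24, then
+ * find_lrp_member_ip(networks, IPv4{192.168.1.50}) returns
+ * Some{IPv4{192.168.1.1}}, since 192.168.1.50 falls within
+ * 192.168.1.0/24. */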
+
+
+/* Step 1: compute router-route pairs */
+relation RouterStaticRoute_(
+ router : Ref<Router>,
+ key : route_key,
+ nexthop : v46_ip,
+ output_port : Option<string>,
+ ecmp_symmetric_reply : bool)
+
+RouterStaticRoute_(.router = router,
+ .key = route.key,
+ .nexthop = route.nexthop,
+ .output_port = route.output_port,
+ .ecmp_symmetric_reply = route.ecmp_symmetric_reply) :-
+ router in &Router(.lr = nb::Logical_Router{.static_routes = routes}),
+ var route_id = FlatMap(routes),
+ route in &StaticRoute(.lrsr = nb::Logical_Router_Static_Route{._uuid = route_id}).
+
+/* Step 2: compute output_port for each pair */
+typedef route_dst = RouteDst {
+ nexthop: v46_ip,
+ src_ip: v46_ip,
+ port: Ref<RouterPort>,
+ ecmp_symmetric_reply: bool
+}
+
+relation RouterStaticRoute(
+ router : Ref<Router>,
+ key : route_key,
+ dsts : Set<route_dst>)
+
+RouterStaticRoute(router, key, dsts) :-
+ RouterStaticRoute_(.router = router,
+ .key = key,
+ .nexthop = nexthop,
+ .output_port = None,
+ .ecmp_symmetric_reply = ecmp_symmetric_reply),
+ /* output_port is not specified, find the
+ * router port matching the next hop. */
+ port in &RouterPort(.router = &Router{.lr = nb::Logical_Router{._uuid = router.lr._uuid}},
+ .networks = networks),
+ Some{var src_ip} = find_lrp_member_ip(networks, nexthop),
+ var dst = RouteDst{nexthop, src_ip, port, ecmp_symmetric_reply},
+ var dsts = dst.group_by((router, key)).to_set().
+
+RouterStaticRoute(router, key, dsts) :-
+ RouterStaticRoute_(.router = router,
+ .key = key,
+ .nexthop = nexthop,
+ .output_port = Some{oport},
+ .ecmp_symmetric_reply = ecmp_symmetric_reply),
+ /* output_port specified */
+ port in &RouterPort(.lrp = nb::Logical_Router_Port{.name = oport},
+ .networks = networks),
+ Some{var src_ip} = match (find_lrp_member_ip(networks, nexthop)) {
+ Some{src_ip} -> Some{src_ip},
+ None -> {
+ /* There are no IP networks configured on the router's port via
+ * which 'route->nexthop' is theoretically reachable. But since
+ * 'out_port' has been specified, we honor it by trying to reach
+ * 'route->nexthop' via the first IP address of 'out_port'.
+ * (There are cases, e.g in GCE, where each VM gets a /32 IP
+ * address and the default gateway is still reachable from it.) */
+ match (key.ip_prefix) {
+ IPv4{_} -> match (vec_nth(networks.ipv4_addrs, 0)) {
+ Some{addr} -> Some{IPv4{addr.addr}},
+ None -> {
+ warn("No path for static route ${key.ip_prefix}; next hop ${nexthop}");
+ None
+ }
+ },
+ IPv6{_} -> match (vec_nth(networks.ipv6_addrs, 0)) {
+ Some{addr} -> Some{IPv6{addr.addr}},
+ None -> {
+ warn("No path for static route ${key.ip_prefix}; next hop ${nexthop}");
+ None
+ }
+ }
+ }
+ }
+ },
+ var dsts = set_singleton(RouteDst{nexthop, src_ip, port, ecmp_symmetric_reply}).
+
+Warning[message] :-
+ RouterStaticRoute_(.router = router, .key = key, .nexthop = nexthop),
+ not RouterStaticRoute(.router = router, .key = key),
+ var message = "No path for ${key.policy} static route ${key.ip_prefix}/${key.plen} with next hop ${nexthop}".
new file mode 100644
@@ -0,0 +1,663 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import OVN_Northbound as nb
+import OVN_Southbound as sb
+import ovsdb
+import ovn
+import lrouter
+import multicast
+import helpers
+import ipam
+
+function is_enabled(lsp: nb::Logical_Switch_Port): bool { is_enabled(lsp.enabled) }
+function is_enabled(lsp: Ref<nb::Logical_Switch_Port>): bool { lsp.deref().is_enabled() }
+function is_enabled(sp: SwitchPort): bool { sp.lsp.is_enabled() }
+function is_enabled(sp: Ref<SwitchPort>): bool { sp.lsp.is_enabled() }
+
+relation SwitchRouterPeerRef(lsp: uuid, rport: Option<Ref<RouterPort>>)
+
+SwitchRouterPeerRef(lsp, Some{rport}) :-
+ SwitchRouterPeer(lsp, _, lrp),
+ rport in &RouterPort(.lrp = nb::Logical_Router_Port{._uuid = lrp}).
+
+SwitchRouterPeerRef(lsp, None) :-
+ nb::Logical_Switch_Port(._uuid = lsp),
+ not SwitchRouterPeer(lsp, _, _).
+
+/* map logical ports to logical switches */
+relation LogicalSwitchPort(lport: uuid, lswitch: uuid)
+
+LogicalSwitchPort(lport, lswitch) :-
+ nb::Logical_Switch(._uuid = lswitch, .ports = ports),
+ var lport = FlatMap(ports).
+
+/* Logical switches that have enabled ports with "unknown" address */
+relation LogicalSwitchUnknownPorts(ls: uuid, port_ids: Set<uuid>)
+
+LogicalSwitchUnknownPorts(ls_uuid, port_ids) :-
+ &SwitchPort(.lsp = lsp, .sw = &Switch{.ls = ls}),
+ lsp.is_enabled() and set_contains(lsp.addresses, "unknown"),
+ var ls_uuid = ls._uuid,
+ var port_ids = lsp._uuid.group_by(ls_uuid).to_set().
+
+/* PortStaticAddresses: static IP addresses associated with each Logical_Switch_Port */
+relation PortStaticAddresses(lsport: uuid, ip4addrs: Set<string>, ip6addrs: Set<string>)
+
+PortStaticAddresses(.lsport = port_uuid,
+ .ip4addrs = set_unions(ip4_addrs),
+ .ip6addrs = set_unions(ip6_addrs)) :-
+ nb::Logical_Switch_Port(._uuid = port_uuid, .addresses = addresses),
+ var address = FlatMap(if (set_is_empty(addresses)) { set_singleton("") } else { addresses }),
+ (var ip4addrs, var ip6addrs) = if (not is_dynamic_lsp_address(address)) {
+ split_addresses(address)
+ } else { (set_empty(), set_empty()) },
+ var static_addrs = (ip4addrs, ip6addrs).group_by(port_uuid).group_unzip(),
+ (var ip4_addrs, var ip6_addrs) = static_addrs.
+
+relation PortInGroup(port: uuid, group: uuid)
+
+PortInGroup(port, group) :-
+ nb::Port_Group(._uuid = group, .ports = ports),
+ var port = FlatMap(ports).
+
+/* All ACLs associated with logical switch */
+relation LogicalSwitchACL(ls: uuid, acl: uuid)
+
+LogicalSwitchACL(ls, acl) :-
+ nb::Logical_Switch(._uuid = ls, .acls = acls),
+ var acl = FlatMap(acls).
+
+LogicalSwitchACL(ls, acl) :-
+ nb::Logical_Switch(._uuid = ls, .ports = ports),
+ var port_id = FlatMap(ports),
+ PortInGroup(port_id, group_id),
+ nb::Port_Group(._uuid = group_id, .acls = acls),
+ var acl = FlatMap(acls).
+
+relation LogicalSwitchStatefulACL(ls: uuid, acl: uuid)
+
+LogicalSwitchStatefulACL(ls, acl) :-
+ LogicalSwitchACL(ls, acl),
+ nb::ACL(._uuid = acl, .action = "allow-related").
+
+relation LogicalSwitchHasStatefulACL(ls: uuid, has_stateful_acl: bool)
+
+LogicalSwitchHasStatefulACL(ls, true) :-
+ LogicalSwitchStatefulACL(ls, _).
+
+LogicalSwitchHasStatefulACL(ls, false) :-
+ nb::Logical_Switch(._uuid = ls),
+ not LogicalSwitchStatefulACL(ls, _).
+
+relation LogicalSwitchLocalnetPort0(ls_uuid: uuid, lsp_name: string)
+LogicalSwitchLocalnetPort0(ls_uuid, lsp_name) :-
+ ls in nb::Logical_Switch(._uuid = ls_uuid),
+ var lsp_uuid = FlatMap(ls.ports),
+ lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+ lsp.__type == "localnet",
+ var lsp_name = lsp.name.
+
+relation LogicalSwitchLocalnetPorts(ls_uuid: uuid, localnet_port_names: Vec<string>)
+LogicalSwitchLocalnetPorts(ls_uuid, localnet_port_names) :-
+ LogicalSwitchLocalnetPort0(ls_uuid, lsp_name),
+ var localnet_port_names = lsp_name.group_by(ls_uuid).to_vec().
+LogicalSwitchLocalnetPorts(ls_uuid, vec_empty()) :-
+ ls in nb::Logical_Switch(),
+ var ls_uuid = ls._uuid,
+ not LogicalSwitchLocalnetPort0(ls_uuid, _).
+
+/* Flatten the list of dns_records in Logical_Switch */
+relation LogicalSwitchDNS(ls_uuid: uuid, dns_uuid: uuid)
+
+LogicalSwitchDNS(ls._uuid, dns_uuid) :-
+ nb::Logical_Switch[ls],
+ var dns_uuid = FlatMap(ls.dns_records),
+ nb::DNS(._uuid = dns_uuid).
+
+relation LogicalSwitchWithDNSRecords(ls: uuid)
+
+LogicalSwitchWithDNSRecords(ls) :-
+ LogicalSwitchDNS(ls, dns_uuid),
+ nb::DNS(._uuid = dns_uuid, .records = records),
+ not map_is_empty(records).
+
+relation LogicalSwitchHasDNSRecords(ls: uuid, has_dns_records: bool)
+
+LogicalSwitchHasDNSRecords(ls, true) :-
+ LogicalSwitchWithDNSRecords(ls).
+
+LogicalSwitchHasDNSRecords(ls, false) :-
+ nb::Logical_Switch(._uuid = ls),
+ not LogicalSwitchWithDNSRecords(ls).
+
+relation LogicalSwitchHasNonRouterPort0(ls: uuid)
+LogicalSwitchHasNonRouterPort0(ls_uuid) :-
+ ls in nb::Logical_Switch(._uuid = ls_uuid),
+ var lsp_uuid = FlatMap(ls.ports),
+ lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+ lsp.__type != "router".
+
+relation LogicalSwitchHasNonRouterPort(ls: uuid, has_non_router_port: bool)
+LogicalSwitchHasNonRouterPort(ls, true) :-
+ LogicalSwitchHasNonRouterPort0(ls).
+LogicalSwitchHasNonRouterPort(ls, false) :-
+ nb::Logical_Switch(._uuid = ls),
+ not LogicalSwitchHasNonRouterPort0(ls).
+
+/* Switch relation collects all attributes of a logical switch */
+
+relation &Switch(
+ ls: nb::Logical_Switch,
+ has_stateful_acl: bool,
+ has_lb_vip: bool,
+ has_dns_records: bool,
+ localnet_port_names: Vec<string>,
+ subnet: Option<(in_addr/*subnet*/, in_addr/*mask*/, bit<32>/*start_ipv4*/, bit<32>/*total_ipv4s*/)>,
+ ipv6_prefix: Option<in6_addr>,
+ mcast_cfg: Ref<McastSwitchCfg>,
+ is_vlan_transparent: bool,
+
+ /* Does this switch have at least one port with type != "router"? */
+ has_non_router_port: bool
+)
+
+function ipv6_parse_prefix(s: string): Option<in6_addr> {
+ if (string_contains(s, "/")) {
+ match (ipv6_parse_cidr(s)) {
+ Right{(addr, 64)} -> Some{addr},
+ _ -> None
+ }
+ } else {
+ ipv6_parse(s)
+ }
+}
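+/* For example, ipv6_parse_prefix("aef0::/64") and ipv6_parse_prefix("aef0::")
+ * both yield Some{aef0::}; a CIDR string with any other prefix length,
+ * e.g. "aef0::/48", yields None, because only /64 prefixes are accepted
+ * in CIDR form. */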
+
+&Switch(.ls = ls,
+ .has_stateful_acl = has_stateful_acl,
+ .has_lb_vip = has_lb_vip,
+ .has_dns_records = has_dns_records,
+ .localnet_port_names = localnet_port_names,
+ .subnet = subnet,
+ .ipv6_prefix = ipv6_prefix,
+ .mcast_cfg = mcast_cfg,
+ .has_non_router_port = has_non_router_port,
+ .is_vlan_transparent = is_vlan_transparent) :-
+ nb::Logical_Switch[ls],
+ LogicalSwitchHasStatefulACL(ls._uuid, has_stateful_acl),
+ LogicalSwitchHasLBVIP(ls._uuid, has_lb_vip),
+ LogicalSwitchHasDNSRecords(ls._uuid, has_dns_records),
+ LogicalSwitchLocalnetPorts(ls._uuid, localnet_port_names),
+ LogicalSwitchHasNonRouterPort(ls._uuid, has_non_router_port),
+ mcast_cfg in &McastSwitchCfg(.datapath = ls._uuid),
+ var subnet =
+ match (map_get(ls.other_config, "subnet")) {
+ None -> None,
+ Some{subnet_str} -> {
+ match (ip_parse_masked(subnet_str)) {
+ Left{err} -> {
+ warn("bad 'subnet' ${subnet_str}");
+ None
+ },
+ Right{(subnet, mask)} -> {
+ if (ip_count_cidr_bits(mask) == Some{32}
+ or not ip_is_cidr(mask)) {
+ warn("bad 'subnet' ${subnet_str}");
+ None
+ } else {
+ Some{(subnet, mask, (iptohl(subnet) & iptohl(mask)) + 1, ~iptohl(mask))}
+ }
+ }
+ }
+ }
+ },
+ var ipv6_prefix =
+ match (map_get(ls.other_config, "ipv6_prefix")) {
+ None -> None,
+ Some{prefix} -> ipv6_parse_prefix(prefix)
+ },
+ var is_vlan_transparent = map_get_bool_def(ls.other_config, "vlan-passthru", false).
+
+/* SwitchLB: many-to-many relation between logical switches and nb::LB */
+relation SwitchLB(sw_uuid: uuid, lb: Ref<nb::Load_Balancer>)
+SwitchLB(sw_uuid, lb) :-
+ nb::Logical_Switch(._uuid = sw_uuid, .load_balancer = lb_ids),
+ var lb_id = FlatMap(lb_ids),
+ lb in &LoadBalancerRef[nb::Load_Balancer{._uuid = lb_id}].
+
+/* Load balancer VIPs associated with switch */
+relation SwitchLBVIP(sw_uuid: uuid, lb: Ref<nb::Load_Balancer>, vip: string, backends: string)
+SwitchLBVIP(sw_uuid, lb, vip, backends) :-
+ SwitchLB(sw_uuid, lb@(&nb::Load_Balancer{.vips = vips})),
+ var kv = FlatMap(vips),
+ (var vip, var backends) = kv.
+
+relation LogicalSwitchHasLBVIP(sw_uuid: uuid, has_lb_vip: bool)
+LogicalSwitchHasLBVIP(sw_uuid, true) :-
+ SwitchLBVIP(.sw_uuid = sw_uuid).
+LogicalSwitchHasLBVIP(sw_uuid, false) :-
+ nb::Logical_Switch(._uuid = sw_uuid),
+ not SwitchLBVIP(.sw_uuid = sw_uuid).
+
+relation &LBVIP(
+ lb: Ref<nb::Load_Balancer>,
+ vip_key: string,
+ vip_addr: v46_ip,
+ vip_port: bit<16>,
+ backend_ips: string)
+
+&LBVIP(.lb = lb,
+ .vip_key = vip_key,
+ .vip_addr = vip_addr,
+ .vip_port = vip_port,
+ .backend_ips = backend_ips) :-
+ LoadBalancerRef[lb],
+ var vip = FlatMap(lb.vips),
+ (var vip_key, var backend_ips) = vip,
+ Some{(var vip_addr, var vip_port)} = ip_address_and_port_from_lb_key(vip_key).
+
+typedef svc_monitor = SvcMonitor{
+ port_name: string, // Might name a switch or router port.
+ src_ip: string
+}
+
+relation &LBVIPBackend(
+ lbvip: Ref<LBVIP>,
+ ip: v46_ip,
+ port: bit<16>,
+ svc_monitor: Option<svc_monitor>)
+
+function parse_ip_port_mapping(mappings: Map<string,string>, ip: v46_ip)
+ : Option<svc_monitor> {
+ for (kv in mappings) {
+ (var key, var value) = kv;
+ if (ip46_parse(key) == Some{ip}) {
+ var strs = string_split(value, ":");
+ if (vec_len(strs) != 2) {
+ return None
+ };
+
+ return match ((vec_nth(strs, 0), vec_nth(strs, 1))) {
+ (Some{port_name}, Some{src_ip}) -> Some{SvcMonitor{port_name, src_ip}},
+ _ -> None
+ }
+ }
+ };
+ return None
+}
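+/* Illustrative example (hypothetical data): with ip_port_mappings
+ * {"10.0.0.2" -> "sw0-p1:10.0.0.254"}, looking up backend IP 10.0.0.2
+ * yields Some{SvcMonitor{"sw0-p1", "10.0.0.254"}}; a mapping value not
+ * of the form "port_name:src_ip" yields None. */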
+
+&LBVIPBackend(.lbvip = lbvip,
+ .ip = ip,
+ .port = port,
+ .svc_monitor = svc_monitor) :-
+ LBVIP[lbvip],
+ var backend = FlatMap(string_split(lbvip.backend_ips, ",")),
+ Some{(var ip, var port)} = ip_address_and_port_from_lb_key(backend),
+ (var svc_monitor) = parse_ip_port_mapping(lbvip.lb.ip_port_mappings, ip).
+
+function is_online(status: Option<string>): bool = {
+ match (status) {
+ Some{s} -> s == "online",
+ _ -> true
+ }
+}
+function default_protocol(protocol: Option<string>): string = {
+ match (protocol) {
+ Some{x} -> x,
+ None -> "tcp"
+ }
+}
+relation &LBVIPBackendStatus(
+ port: bit<16>,
+ ip: v46_ip,
+ protocol: string,
+ logical_port: string,
+ up: bool)
+&LBVIPBackendStatus(port, ip, protocol, logical_port, up) :-
+ sm in sb::Service_Monitor(),
+ var port = sm.port as bit<16>,
+ Some{var ip} = ip46_parse(sm.ip),
+ var protocol = default_protocol(sm.protocol),
+ var logical_port = sm.logical_port,
+ var up = is_online(sm.status).
+&LBVIPBackendStatus(port, ip, protocol, logical_port, true) :-
+ LBVIPBackend[lbvipbackend],
+ var port = lbvipbackend.port as bit<16>,
+ var ip = lbvipbackend.ip,
+ var protocol = default_protocol(lbvipbackend.lbvip.lb.protocol),
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ var logical_port = svc_monitor.port_name,
+ not sb::Service_Monitor(.port = port as bit<64>,
+ .ip = "${ip}",
+ .protocol = Some{protocol},
+ .logical_port = logical_port).
+
+/* SwitchPortDHCPv4Options: many-to-one relation between logical switch ports and DHCPv4 options */
+relation SwitchPortDHCPv4Options(
+ port: Ref<SwitchPort>,
+ dhcpv4_options: Ref<nb::DHCP_Options>)
+
+SwitchPortDHCPv4Options(port, options) :-
+ port in &SwitchPort(.lsp = lsp),
+ port.lsp.__type != "external",
+ Some{var dhcpv4_uuid} = lsp.dhcpv4_options,
+ options in &DHCP_OptionsRef[nb::DHCP_Options{._uuid = dhcpv4_uuid}].
+
+/* SwitchPortDHCPv6Options: many-to-one relation between logical switch ports and DHCPv6 options */
+relation SwitchPortDHCPv6Options(
+ port: Ref<SwitchPort>,
+ dhcpv6_options: Ref<nb::DHCP_Options>)
+
+SwitchPortDHCPv6Options(port, options) :-
+ port in &SwitchPort(.lsp = lsp),
+ port.lsp.__type != "external",
+ Some{var dhcpv6_uuid} = lsp.dhcpv6_options,
+ options in &DHCP_OptionsRef[nb::DHCP_Options{._uuid = dhcpv6_uuid}].
+
+/* SwitchQoS: many-to-one relation between logical switches and nb::QoS */
+relation SwitchQoS(sw: Ref<Switch>, qos: Ref<nb::QoS>)
+
+SwitchQoS(sw, qos) :-
+ sw in &Switch(.ls = nb::Logical_Switch{.qos_rules = qos_rules}),
+ var qos_rule = FlatMap(qos_rules),
+ qos in &QoSRef[nb::QoS{._uuid = qos_rule}].
+
+/* Reports whether a given ACL is associated with a fair meter.
+ * 'has_fair_meter' is false if 'acl' has no meter, or if it has a meter
+ * that isn't a fair meter.  (The latter case has two subcases: the
+ * case where the meter name in the ACL corresponds to an nb::Meter
+ * with that name, and the case where it doesn't.) */
+relation ACLHasFairMeter(acl: Ref<nb::ACL>, has_fair_meter: bool)
+ACLHasFairMeter(acl, true) :-
+ ACLWithFairMeter(acl, _).
+ACLHasFairMeter(acl, false) :-
+ acl in &ACLRef[_],
+ not ACLWithFairMeter(acl, _).
+
+/* All the ACLs associated with a fair meter, with their fair meters. */
+relation ACLWithFairMeter(acl: Ref<nb::ACL>, meter: Ref<nb::Meter>)
+ACLWithFairMeter(acl, meter) :-
+ acl in &ACLRef[nb::ACL{.meter = Some{meter_name}}],
+ meter in &MeterRef[nb::Meter{.name = meter_name, .fair = Some{true}}].
+
+/* SwitchACL: many-to-many relation between logical switches and ACLs */
+relation &SwitchACL(sw: Ref<Switch>,
+ acl: Ref<nb::ACL>,
+ has_fair_meter: bool)
+
+&SwitchACL(.sw = sw, .acl = acl, .has_fair_meter = has_fair_meter) :-
+ LogicalSwitchACL(sw_uuid, acl_uuid),
+ sw in &Switch(.ls = nb::Logical_Switch{._uuid = sw_uuid}),
+ acl in &ACLRef[nb::ACL{._uuid = acl_uuid}],
+ ACLHasFairMeter(acl, has_fair_meter).
+
+relation SwitchPortUp(lsp: uuid, up: bool)
+
+SwitchPortUp(lsp, up) :-
+ nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = __type),
+ sb::Port_Binding(.logical_port = lsp_name, .chassis = chassis),
+ var up =
+ if (__type == "router") {
+ true
+ } else if (is_none(chassis)) {
+ false
+ } else {
+ true
+ }.
+
+SwitchPortUp(lsp, up) :-
+ nb::Logical_Switch_Port(._uuid = lsp, .name = lsp_name, .__type = __type),
+ not sb::Port_Binding(.logical_port = lsp_name),
+ var up = __type == "router".
+
+relation SwitchPortHAChassisGroup0(lsp_uuid: uuid, hac_group_uuid: uuid)
+SwitchPortHAChassisGroup0(lsp_uuid, ha_chassis_group_uuid(ls_uuid)) :-
+ lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+ lsp.__type == "external",
+ Some{var hac_group_uuid} = lsp.ha_chassis_group,
+ ha_chassis_group in nb::HA_Chassis_Group(._uuid = hac_group_uuid),
+    /* If the group is empty, then the HA_Chassis_Group record will not be
+     * created in the SB database, so we should not create a reference to the
+     * group in the Port_Binding table, to avoid an integrity violation. */
+ not set_is_empty(ha_chassis_group.ha_chassis),
+ LogicalSwitchPort(.lport = lsp_uuid, .lswitch = ls_uuid).
+relation SwitchPortHAChassisGroup(lsp_uuid: uuid, hac_group_uuid: Option<uuid>)
+SwitchPortHAChassisGroup(lsp_uuid, Some{hac_group_uuid}) :-
+ SwitchPortHAChassisGroup0(lsp_uuid, hac_group_uuid).
+SwitchPortHAChassisGroup(lsp_uuid, None) :-
+ lsp in nb::Logical_Switch_Port(._uuid = lsp_uuid),
+ not SwitchPortHAChassisGroup0(lsp_uuid, _).
+
+/* SwitchPort relation collects all attributes of a logical switch port
+ * - `peer` - peer router port, if any
+ * - `static_dynamic_mac` - port has a "dynamic" address that contains a static MAC,
+ * e.g., "80:fa:5b:06:72:b7 dynamic"
+ * - `static_dynamic_ipv4`, `static_dynamic_ipv6` - port has a "dynamic" address that contains a static IP,
+ * e.g., "dynamic 192.168.1.2"
+ * - `needs_dynamic_ipv4address` - port requires a dynamically allocated IPv4 address
+ * - `needs_dynamic_macaddress` - port requires a dynamically allocated MAC address
+ * - `needs_dynamic_tag` - port requires a dynamically allocated tag
+ * - `up` - true if the port is bound to a chassis or has type "router"
+ * - `hac_group_uuid` - uuid of sb::HA_Chassis_Group, only for "external" ports
+ */
+relation &SwitchPort(
+ lsp: nb::Logical_Switch_Port,
+ json_name: string,
+ sw: Ref<Switch>,
+ peer: Option<Ref<RouterPort>>,
+ static_addresses: Vec<lport_addresses>,
+ dynamic_address: Option<lport_addresses>,
+ static_dynamic_mac: Option<eth_addr>,
+ static_dynamic_ipv4: Option<in_addr>,
+ static_dynamic_ipv6: Option<in6_addr>,
+ ps_addresses: Vec<lport_addresses>,
+ ps_eth_addresses: Vec<string>,
+ parent_name: Option<string>,
+ needs_dynamic_ipv4address: bool,
+ needs_dynamic_macaddress: bool,
+ needs_dynamic_ipv6address: bool,
+ needs_dynamic_tag: bool,
+ up: bool,
+ mcast_cfg: Ref<McastPortCfg>,
+ hac_group_uuid: Option<uuid>
+)
+
+&SwitchPort(.lsp = lsp,
+ .json_name = json_string_escape(lsp.name),
+ .sw = sw,
+ .peer = peer,
+ .static_addresses = static_addresses,
+ .dynamic_address = dynamic_address,
+ .static_dynamic_mac = static_dynamic_mac,
+ .static_dynamic_ipv4 = static_dynamic_ipv4,
+ .static_dynamic_ipv6 = static_dynamic_ipv6,
+ .ps_addresses = ps_addresses,
+ .ps_eth_addresses = ps_eth_addresses,
+ .parent_name = parent_name,
+ .needs_dynamic_ipv4address = needs_dynamic_ipv4address,
+ .needs_dynamic_macaddress = needs_dynamic_macaddress,
+ .needs_dynamic_ipv6address = needs_dynamic_ipv6address,
+ .needs_dynamic_tag = needs_dynamic_tag,
+ .up = up,
+ .mcast_cfg = mcast_cfg,
+ .hac_group_uuid = hac_group_uuid) :-
+ nb::Logical_Switch_Port[lsp],
+ LogicalSwitchPort(lsp._uuid, lswitch_uuid),
+ sw in &Switch(.ls = nb::Logical_Switch{._uuid = lswitch_uuid, .other_config = other_config},
+ .subnet = subnet,
+ .ipv6_prefix = ipv6_prefix),
+ SwitchRouterPeerRef(lsp._uuid, peer),
+ SwitchPortUp(lsp._uuid, up),
+ mcast_cfg in &McastPortCfg(.port = lsp._uuid, .router_port = false),
+ var static_addresses = {
+ var static_addresses = vec_empty();
+ for (addr in lsp.addresses) {
+ if ((addr != "router") and (not is_dynamic_lsp_address(addr))) {
+ match (extract_lsp_addresses(addr)) {
+ None -> (),
+ Some{lport_addr} -> vec_push(static_addresses, lport_addr)
+ }
+ } else ()
+ };
+ static_addresses
+ },
+ var ps_addresses = {
+ var ps_addresses = vec_empty();
+ for (addr in lsp.port_security) {
+ match (extract_lsp_addresses(addr)) {
+ None -> (),
+ Some{lport_addr} -> vec_push(ps_addresses, lport_addr)
+ }
+ };
+ ps_addresses
+ },
+ var ps_eth_addresses = {
+ var ps_eth_addresses = vec_empty();
+ for (ps_addr in ps_addresses) {
+ vec_push(ps_eth_addresses, "${ps_addr.ea}")
+ };
+ ps_eth_addresses
+ },
+ var dynamic_address = match (lsp.dynamic_addresses) {
+ None -> None,
+ Some{lport_addr} -> extract_lsp_addresses(lport_addr)
+ },
+ (var static_dynamic_mac,
+ var static_dynamic_ipv4,
+ var static_dynamic_ipv6,
+ var has_dyn_lsp_addr) = {
+ var dynamic_address_request = None;
+ for (addr in lsp.addresses) {
+ dynamic_address_request = parse_dynamic_address_request(addr);
+ if (is_some(dynamic_address_request)) {
+ break
+ }
+ };
+
+ match (dynamic_address_request) {
+ Some{DynamicAddressRequest{mac, ipv4, ipv6}} -> (mac, ipv4, ipv6, true),
+ None -> (None, None, None, false)
+ }
+ },
+ var needs_dynamic_ipv4address = has_dyn_lsp_addr and is_none(peer) and is_some(subnet) and
+ is_none(static_dynamic_ipv4),
+ var needs_dynamic_macaddress = has_dyn_lsp_addr and is_none(peer) and is_none(static_dynamic_mac) and
+ (is_some(subnet) or is_some(ipv6_prefix) or
+ map_get(other_config, "mac_only") == Some{"true"}),
+ var needs_dynamic_ipv6address = has_dyn_lsp_addr and is_none(peer) and is_some(ipv6_prefix) and is_none(static_dynamic_ipv6),
+ var parent_name = match (lsp.parent_name) {
+ None -> None,
+ Some{pname} -> if (pname == "") { None } else { Some{pname} }
+ },
+ /* Port needs dynamic tag if it has a parent and its `tag_request` is 0. */
+ var needs_dynamic_tag = is_some(parent_name) and
+ lsp.tag_request == Some{0},
+ SwitchPortHAChassisGroup(.lsp_uuid = lsp._uuid,
+ .hac_group_uuid = hac_group_uuid).
+
+/* Switch port port security addresses */
+relation SwitchPortPSAddresses(port: Ref<SwitchPort>,
+ ps_addrs: lport_addresses)
+
+SwitchPortPSAddresses(port, ps_addrs) :-
+ port in &SwitchPort(.ps_addresses = ps_addresses),
+ var ps_addrs = FlatMap(ps_addresses).
+
+/* All static addresses associated with a port parsed into
+ * the lport_addresses data structure */
+relation SwitchPortStaticAddresses(port: Ref<SwitchPort>,
+ addrs: lport_addresses)
+SwitchPortStaticAddresses(port, addrs) :-
+ port in &SwitchPort(.static_addresses = static_addresses),
+ var addrs = FlatMap(static_addresses).
+
+/* All static and dynamic addresses associated with a port parsed into
+ * the lport_addresses data structure */
+relation SwitchPortAddresses(port: Ref<SwitchPort>,
+ addrs: lport_addresses)
+
+SwitchPortAddresses(port, addrs) :- SwitchPortStaticAddresses(port, addrs).
+
+SwitchPortAddresses(port, dynamic_address) :-
+ SwitchPortNewDynamicAddress(port, Some{dynamic_address}).
+
+/* "router" is a special Logical_Switch_Port address value that indicates that
+ * the Ethernet, IPv4, and IPv6 addresses for this port should be obtained from
+ * the connected logical router port, as specified by router-port in options.
+ *
+ * The resulting addresses are used to populate the logical switch's destination
+ * lookup, and also for the logical switch to generate ARP and ND replies.
+ *
+ * If the connected logical router port is a distributed gateway port and the
+ * logical router has rules specified in nat with external_mac, then those
+ * addresses are also used to populate the switch's destination lookup. */
+SwitchPortAddresses(port, addrs) :-
+ port in &SwitchPort(.lsp = lsp, .peer = Some{&rport}),
+ Some{var addrs} = {
+ var opt_addrs = None;
+ for (addr in lsp.addresses) {
+ if (addr == "router") {
+ opt_addrs = Some{rport.networks}
+ } else ()
+ };
+ opt_addrs
+ }.
+
+/* All static and dynamic IPv4 addresses associated with a port */
+relation SwitchPortIPv4Address(port: Ref<SwitchPort>,
+ ea: eth_addr,
+ addr: ipv4_netaddr)
+
+SwitchPortIPv4Address(port, ea, addr) :-
+ SwitchPortAddresses(port, LPortAddress{.ea = ea, .ipv4_addrs = addrs}),
+ var addr = FlatMap(addrs).
+
+/* All static and dynamic IPv6 addresses associated with a port */
+relation SwitchPortIPv6Address(port: Ref<SwitchPort>,
+ ea: eth_addr,
+ addr: ipv6_netaddr)
+
+SwitchPortIPv6Address(port, ea, addr) :-
+ SwitchPortAddresses(port, LPortAddress{.ea = ea, .ipv6_addrs = addrs}),
+ var addr = FlatMap(addrs).
+
+/* Service monitoring. */
+
+/* MAC allocated for service monitor usage.  Just one MAC is allocated
+ * for this purpose, and ovn-controller on each chassis will make use
+ * of this MAC when sending out the packets to monitor the services
+ * defined in the Service_Monitor southbound table.  Since these packets
+ * are all handled locally, having just one MAC is good enough. */
+function get_svc_monitor_mac(options: Map<string,string>, uuid: uuid)
+ : eth_addr =
+{
+ var existing_mac = match (
+ map_get(options, "svc_monitor_mac"))
+ {
+ Some{mac} -> scan_eth_addr(mac),
+ None -> None
+ };
+ match (existing_mac) {
+ Some{mac} -> mac,
+ None -> eth_addr_from_uint64(pseudorandom_mac(uuid, 'h5678))
+ }
+}
+function put_svc_monitor_mac(options: Map<string,string>,
+ svc_monitor_mac: eth_addr) : Map<string,string> =
+{
+ map_insert_imm(options, "svc_monitor_mac", to_string(svc_monitor_mac))
+}
+relation SvcMonitorMac(mac: eth_addr)
+SvcMonitorMac(get_svc_monitor_mac(options, uuid)) :-
+ nb::NB_Global(._uuid = uuid, .options = options).
new file mode 100644
@@ -0,0 +1,259 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import OVN_Northbound as nb
+import OVN_Southbound as sb
+import ovn
+import ovsdb
+import helpers
+import lswitch
+import lrouter
+
+function mCAST_DEFAULT_MAX_ENTRIES(): integer = 2048
+
+function mCAST_DEFAULT_IDLE_TIMEOUT_S(): integer = 300
+function mCAST_DEFAULT_MIN_IDLE_TIMEOUT_S(): integer = 15
+function mCAST_DEFAULT_MAX_IDLE_TIMEOUT_S(): integer = 3600
+
+function mCAST_DEFAULT_MIN_QUERY_INTERVAL_S(): integer = 1
+function mCAST_DEFAULT_MAX_QUERY_INTERVAL_S(): integer =
+ mCAST_DEFAULT_MAX_IDLE_TIMEOUT_S()
+
+function mCAST_DEFAULT_QUERY_MAX_RESPONSE_S(): integer = 1
+
+/* IP Multicast per switch configuration. */
+relation &McastSwitchCfg(
+ datapath : uuid,
+ enabled : bool,
+ querier : bool,
+ flood_unreg : bool,
+ eth_src : string,
+ ip4_src : string,
+ ip6_src : string,
+ table_size : integer,
+ idle_timeout : integer,
+ query_interval: integer,
+ query_max_resp: integer
+)
+
+ /* FIXME: Right now table_size is enforced only in ovn-controller but in
+ * the ovn-northd C version we enforce it on the aggregate groups too.
+ */
+
+&McastSwitchCfg(
+ .datapath = ls_uuid,
+ .enabled = map_get_bool_def(other_config, "mcast_snoop",
+ false),
+ .querier = map_get_bool_def(other_config, "mcast_querier",
+ true),
+ .flood_unreg = map_get_bool_def(other_config,
+ "mcast_flood_unregistered",
+ false),
+ .eth_src = other_config.get("mcast_eth_src").unwrap_or(""),
+ .ip4_src = other_config.get("mcast_ip4_src").unwrap_or(""),
+ .ip6_src = other_config.get("mcast_ip6_src").unwrap_or(""),
+ .table_size = map_get_int_def(other_config,
+ "mcast_table_size",
+ mCAST_DEFAULT_MAX_ENTRIES()),
+ .idle_timeout = idle_timeout,
+ .query_interval = query_interval,
+ .query_max_resp = query_max_resp) :-
+ nb::Logical_Switch(._uuid = ls_uuid,
+ .other_config = other_config),
+ var idle_timeout =
+ map_get_int_def_limit(other_config, "mcast_idle_timeout",
+ mCAST_DEFAULT_IDLE_TIMEOUT_S(),
+ mCAST_DEFAULT_MIN_IDLE_TIMEOUT_S(),
+ mCAST_DEFAULT_MAX_IDLE_TIMEOUT_S()),
+ var query_interval =
+ map_get_int_def_limit(other_config, "mcast_query_interval",
+ idle_timeout / 2,
+ mCAST_DEFAULT_MIN_QUERY_INTERVAL_S(),
+ mCAST_DEFAULT_MAX_QUERY_INTERVAL_S()),
+ var query_max_resp =
+ map_get_int_def(other_config, "mcast_query_max_response",
+ mCAST_DEFAULT_QUERY_MAX_RESPONSE_S()).
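+/* Example (hypothetical values): a logical switch whose other_config is
+ * {"mcast_snoop"="true", "mcast_idle_timeout"="10000"} would derive
+ * enabled=true, with map_get_int_def_limit() presumably clamping
+ * idle_timeout to mCAST_DEFAULT_MAX_IDLE_TIMEOUT_S() (3600 s). */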
+
+/* IP Multicast per router configuration. */
+relation &McastRouterCfg(
+ datapath: uuid,
+ relay : bool
+)
+
+&McastRouterCfg(lr_uuid, mcast_relay) :-
+ nb::Logical_Router(._uuid = lr_uuid, .options = options),
+ var mcast_relay = map_get_bool_def(options, "mcast_relay", false).
+
+/* IP Multicast port configuration. */
+relation &McastPortCfg(
+ port : uuid,
+ router_port : bool,
+ flood : bool,
+ flood_reports : bool
+)
+
+&McastPortCfg(lsp_uuid, false, flood, flood_reports) :-
+ nb::Logical_Switch_Port(._uuid = lsp_uuid, .options = options),
+ var flood = map_get_bool_def(options, "mcast_flood", false),
+ var flood_reports = map_get_bool_def(options, "mcast_flood_reports",
+ false).
+
+&McastPortCfg(lrp_uuid, true, flood, flood) :-
+ nb::Logical_Router_Port(._uuid = lrp_uuid, .options = options),
+ var flood = map_get_bool_def(options, "mcast_flood", false).
+
+/* Mapping between Switch and the set of router port uuids on which to flood
+ * IP multicast for relay.
+ */
+relation SwitchMcastFloodRelayPorts(sw: Ref<Switch>, ports: Set<uuid>)
+
+SwitchMcastFloodRelayPorts(switch, relay_ports) :-
+ &SwitchPort(
+ .lsp = lsp,
+ .sw = switch,
+ .peer = Some{&RouterPort{.router = &Router{.mcast_cfg = &mcast_cfg}}}
+ ), mcast_cfg.relay,
+ var relay_ports = lsp._uuid.group_by(switch).to_set().
+
+SwitchMcastFloodRelayPorts(switch, set_empty()) :-
+ Switch[switch],
+ not &SwitchPort(
+ .sw = switch,
+ .peer = Some{
+ &RouterPort{
+ .router = &Router{.mcast_cfg = &McastRouterCfg{.relay=true}}
+ }
+ }
+ ).
+
+/* Mapping between Switch and the set of port uuids on which to
+ * flood IP multicast statically.
+ */
+relation SwitchMcastFloodPorts(sw: Ref<Switch>, ports: Set<uuid>)
+
+SwitchMcastFloodPorts(switch, flood_ports) :-
+ &SwitchPort(
+ .lsp = lsp,
+ .sw = switch,
+ .mcast_cfg = &McastPortCfg{.flood = true}),
+ var flood_ports = lsp._uuid.group_by(switch).to_set().
+
+SwitchMcastFloodPorts(switch, set_empty()) :-
+ Switch[switch],
+ not &SwitchPort(
+ .sw = switch,
+ .mcast_cfg = &McastPortCfg{.flood = true}).
+
+/* Mapping between Switch and the set of port uuids on which to
+ * flood IP multicast reports statically.
+ */
+relation SwitchMcastFloodReportPorts(sw: Ref<Switch>, ports: Set<uuid>)
+
+SwitchMcastFloodReportPorts(switch, flood_ports) :-
+ &SwitchPort(
+ .lsp = lsp,
+ .sw = switch,
+ .mcast_cfg = &McastPortCfg{.flood_reports = true}),
+ var flood_ports = lsp._uuid.group_by(switch).to_set().
+
+SwitchMcastFloodReportPorts(switch, set_empty()) :-
+ Switch[switch],
+ not &SwitchPort(
+ .sw = switch,
+ .mcast_cfg = &McastPortCfg{.flood_reports = true}).
+
+/* Mapping between Router and the set of port uuids on which to
+ * flood IP multicast statically.
+ */
+relation RouterMcastFloodPorts(sw: Ref<Router>, ports: Set<uuid>)
+
+RouterMcastFloodPorts(router, flood_ports) :-
+ &RouterPort(
+ .lrp = lrp,
+ .router = router,
+ .mcast_cfg = &McastPortCfg{.flood = true}
+ ),
+ var flood_ports = lrp._uuid.group_by(router).to_set().
+
+RouterMcastFloodPorts(router, set_empty()) :-
+ Router[router],
+ not &RouterPort(
+ .router = router,
+ .mcast_cfg = &McastPortCfg{.flood = true}).
+
+/* Flattened IGMP group. One record per address-port tuple. */
+relation IgmpSwitchGroupPort(
+ address: string,
+ switch : Ref<Switch>,
+ port : uuid
+)
+
+IgmpSwitchGroupPort(address, switch, lsp_uuid) :-
+ sb::IGMP_Group(.address = address, .datapath = igmp_dp_set,
+ .ports = pb_ports),
+ var pb_port_uuid = FlatMap(pb_ports),
+ sb::Port_Binding(._uuid = pb_port_uuid, .logical_port = lsp_name),
+ &SwitchPort(
+ .lsp = nb::Logical_Switch_Port{._uuid = lsp_uuid, .name = lsp_name},
+ .sw = switch).
+
+/* Aggregated IGMP group: merges all IgmpSwitchGroupPort for a given
+ * address-switch tuple from all chassis.
+ */
+relation IgmpSwitchMulticastGroup(
+ address: string,
+ switch : Ref<Switch>,
+ ports : Set<uuid>
+)
+
+IgmpSwitchMulticastGroup(address, switch, ports) :-
+ IgmpSwitchGroupPort(address, switch, port),
+ var ports = port.group_by((address, switch)).to_set().
+
+/* Flattened IGMP group representation for routers with relay enabled. One
+ * record per address-port tuple for all IGMP groups learned by switches
+ * connected to the router.
+ */
+relation IgmpRouterGroupPort(
+ address: string,
+ router : Ref<Router>,
+ port : uuid
+)
+
+IgmpRouterGroupPort(address, rtr_port.router, rtr_port.lrp._uuid) :-
+ SwitchMcastFloodRelayPorts(switch, sw_flood_ports),
+ IgmpSwitchMulticastGroup(address, switch, _),
+ /* For IPv6 only relay routable multicast groups
+ * (RFC 4291 2.7).
+ */
+ match (ipv6_parse(address)) {
+ Some{ipv6} -> ipv6_is_routable_multicast(ipv6),
+ None -> true
+ },
+ var flood_port = FlatMap(sw_flood_ports),
+ &SwitchPort(.lsp = nb::Logical_Switch_Port{._uuid = flood_port},
+ .peer = Some{&rtr_port}).
+
+/* Aggregated IGMP group for routers: merges all IgmpRouterGroupPort for
+ * a given address-router tuple from all connected switches.
+ */
+relation IgmpRouterMulticastGroup(
+ address: string,
+ router : Ref<Router>,
+ ports : Set<uuid>
+)
+
+IgmpRouterMulticastGroup(address, router, ports) :-
+ IgmpRouterGroupPort(address, router, port),
+ var ports = port.group_by((address, router)).to_set().
new file mode 100644
@@ -0,0 +1,13 @@
+-o Logical_Router_Port
+--rw Logical_Router_Port.ipv6_prefix
+-o Logical_Switch_Port
+--rw Logical_Switch_Port.tag
+--rw Logical_Switch_Port.dynamic_addresses
+--rw Logical_Switch_Port.up
+-o NB_Global
+--rw NB_Global.sb_cfg
+--rw NB_Global.hv_cfg
+--rw NB_Global.options
+--rw NB_Global.ipsec
+--rw NB_Global.nb_cfg_timestamp
+--rw NB_Global.hv_cfg_timestamp
new file mode 100644
@@ -0,0 +1,1274 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <config.h>
+
+#include <getopt.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "command-line.h"
+#include "daemon.h"
+#include "fatal-signal.h"
+#include "hash.h"
+#include "jsonrpc.h"
+#include "lib/ovn-util.h"
+#include "openvswitch/hmap.h"
+#include "openvswitch/json.h"
+#include "openvswitch/poll-loop.h"
+#include "openvswitch/vlog.h"
+#include "ovsdb-cs.h"
+#include "ovsdb-data.h"
+#include "ovsdb-error.h"
+#include "ovsdb-parser.h"
+#include "ovsdb-types.h"
+#include "stream-ssl.h"
+#include "stream.h"
+#include "unixctl.h"
+#include "util.h"
+#include "uuid.h"
+
+#include "northd/ovn_northd_ddlog/ddlog.h"
+
+VLOG_DEFINE_THIS_MODULE(ovn_northd);
+
+#include "northd/ovn-northd-ddlog-nb.inc"
+#include "northd/ovn-northd-ddlog-sb.inc"
+
+struct northd_status {
+ bool locked;
+ bool pause;
+};
+
+static unixctl_cb_func ovn_northd_exit;
+static unixctl_cb_func ovn_northd_pause;
+static unixctl_cb_func ovn_northd_resume;
+static unixctl_cb_func ovn_northd_is_paused;
+static unixctl_cb_func ovn_northd_status;
+
+/* --ddlog-record: The name of a file to which to record DDlog commands for
+ * later replay.  Useful for debugging.  If NULL (the default), DDlog commands
+ * are not recorded. */
+static const char *record_file;
+
+static const char *ovnnb_db;
+static const char *ovnsb_db;
+static const char *unixctl_path;
+
+/* Frequently used table ids. */
+static table_id WARNING_TABLE_ID;
+static table_id NB_CFG_TIMESTAMP_ID;
+
+/* Initialize frequently used table ids. */
+static void init_table_ids(void)
+{
+ WARNING_TABLE_ID = ddlog_get_table_id("Warning");
+ NB_CFG_TIMESTAMP_ID = ddlog_get_table_id("NbCfgTimestamp");
+}
+
+/*
+ * Accumulates DDlog delta to be sent to OVSDB.
+ *
+ * FIXME: There is currently no global northd state descriptor shared by NB and
+ * SB connections. We should probably introduce it and move this variable there
+ * instead of declaring it as a global variable.
+ */
+static ddlog_delta *delta;
+
+
+struct northd_ctx {
+ ddlog_prog ddlog;
+ char *prefix;
+ const char **input_relations;
+ const char **output_relations;
+ const char **output_only_relations;
+
+ bool has_timestamp_columns;
+
+ struct ovsdb_cs *cs;
+ struct json *request_id;
+ enum {
+ /* Initial state, before the output-only data (if any) has been
+ * requested. */
+ S_INITIAL,
+
+ /* Output-only data has been requested. Waiting for reply. */
+ S_OUTPUT_ONLY_DATA_REQUESTED,
+
+ /* Output-only data (if any) has been received. Any request sent out
+ * now would be to update data. */
+ S_UPDATE,
+ } state;
+
+ /* Database info. */
+ const char *db_name;
+ struct json *output_only_data;
+ const char *lock_name; /* Name of lock we need, NULL if none. */
+ bool paused;
+};
+
+static struct ovsdb_cs_ops northd_cs_ops;
+
+static struct json *get_database_ops(struct northd_ctx *);
+static int ddlog_clear(struct northd_ctx *);
+
+static void
+northd_ctx_connection_status(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *ctx_)
+{
+ const struct northd_ctx *ctx = ctx_;
+ bool connected = ovsdb_cs_is_connected(ctx->cs);
+ unixctl_command_reply(conn, connected ? "connected" : "not connected");
+}
+
+static void
+northd_ctx_cluster_state_reset(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *ctx_)
+{
+ const struct northd_ctx *ctx = ctx_;
+ VLOG_INFO("Resetting %s database cluster state", ctx->db_name);
+ ovsdb_cs_reset_min_index(ctx->cs);
+ unixctl_command_reply(conn, NULL);
+}
+
+static struct northd_ctx *
+northd_ctx_create(const char *server, const char *database,
+ const char *unixctl_command_prefix,
+ const char *lock_name,
+ ddlog_prog ddlog,
+ const char **input_relations,
+ const char **output_relations,
+ const char **output_only_relations)
+{
+ struct northd_ctx *ctx = xmalloc(sizeof *ctx);
+ *ctx = (struct northd_ctx) {
+ .ddlog = ddlog,
+ .prefix = xasprintf("%s::", database),
+ .input_relations = input_relations,
+ .output_relations = output_relations,
+ .output_only_relations = output_only_relations,
+ /* 'has_timestamp_columns' will get filled in later. */
+ .cs = ovsdb_cs_create(database, 1 /* XXX */, &northd_cs_ops, ctx),
+ .state = S_INITIAL,
+ .db_name = database,
+        /* 'output_only_data' will get filled in later. */
+ .lock_name = lock_name,
+ .paused = false,
+ };
+
+ ovsdb_cs_set_remote(ctx->cs, server, true);
+ ovsdb_cs_set_lock(ctx->cs, lock_name);
+
+ char *cmd = xasprintf("%s-connection-status", unixctl_command_prefix);
+ unixctl_command_register(cmd, "", 0, 0,
+ northd_ctx_connection_status, ctx);
+ free(cmd);
+
+ cmd = xasprintf("%s-cluster-state-reset", unixctl_command_prefix);
+ unixctl_command_register(cmd, "", 0, 0,
+ northd_ctx_cluster_state_reset, ctx);
+ free(cmd);
+
+ return ctx;
+}
+
+static void
+northd_ctx_destroy(struct northd_ctx *ctx)
+{
+ if (ctx) {
+ ovsdb_cs_destroy(ctx->cs);
+ json_destroy(ctx->output_only_data);
+ free(ctx);
+ }
+}
+
+static struct json *
+northd_compose_monitor_request(const struct json *schema_json, void *ctx_)
+{
+ struct northd_ctx *ctx = ctx_;
+
+ struct shash *schema = ovsdb_cs_parse_schema(schema_json);
+
+ const struct sset *nb_global = shash_find_data(
+ schema, "NB_Global");
+ ctx->has_timestamp_columns
+ = (nb_global
+ && sset_contains(nb_global, "nb_cfg_timestamp")
+ && sset_contains(nb_global, "sb_cfg_timestamp"));
+
+ struct json *monitor_requests = json_object_create();
+
+    /* This should be smarter about ignoring tables and columns that we don't
+     * need.  There's a lot more logic for this in
+     * ovsdb_idl_compose_monitor_request(). */
+ const struct shash_node *node;
+ SHASH_FOR_EACH (node, schema) {
+ const char *table_name = node->name;
+
+ /* Only subscribe to input relations we care about. */
+ for (const char **p = ctx->input_relations; *p; p++) {
+ if (!strcmp(table_name, *p)) {
+ const struct sset *schema_columns = node->data;
+ struct json *subscribed_columns = json_array_create_empty();
+
+ const char *column;
+ SSET_FOR_EACH (column, schema_columns) {
+ if (strcmp(column, "_version")) {
+ json_array_add(subscribed_columns,
+ json_string_create(column));
+ }
+ }
+
+ struct json *monitor_request = json_object_create();
+ json_object_put(monitor_request, "columns",
+ subscribed_columns);
+ json_object_put(monitor_requests, table_name,
+ json_array_create_1(monitor_request));
+ break;
+ }
+ }
+ }
+ ovsdb_cs_free_schema(schema);
+
+ return monitor_requests;
+}
+
+static struct ovsdb_cs_ops northd_cs_ops = { northd_compose_monitor_request };
+
+/* Sends the database server a request for all the row UUIDs in output-only
+ * tables. */
+static void
+northd_send_output_only_data_request(struct northd_ctx *ctx)
+{
+ if (ctx->output_only_relations[0]) {
+ json_destroy(ctx->output_only_data);
+ ctx->output_only_data = NULL;
+
+ struct json *ops = json_array_create_1(
+ json_string_create(ctx->db_name));
+ for (size_t i = 0; ctx->output_only_relations[i]; i++) {
+ const char *table = ctx->output_only_relations[i];
+ struct json *op = json_object_create();
+ json_object_put_string(op, "op", "select");
+ json_object_put_string(op, "table", table);
+ json_object_put(op, "columns",
+ json_array_create_1(json_string_create("_uuid")));
+ json_object_put(op, "where", json_array_create_empty());
+ json_array_add(ops, op);
+ }
+
+ ctx->state = S_OUTPUT_ONLY_DATA_REQUESTED;
+ ctx->request_id = ovsdb_cs_send_transaction(ctx->cs, ops);
+ } else {
+ ctx->state = S_UPDATE;
+ }
+}
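+/* Each op built above is a plain OVSDB "select" of just the _uuid column,
+ * e.g. (hypothetical table name):
+ *   ["OVN_Southbound",
+ *    {"op": "select", "table": "Multicast_Group",
+ *     "columns": ["_uuid"], "where": []}] */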
+
+static void
+northd_pause(struct northd_ctx *ctx)
+{
+ if (!ctx->paused && ctx->lock_name) {
+ ctx->paused = true;
+ VLOG_INFO("This ovn-northd instance is now paused.");
+ ovsdb_cs_set_lock(ctx->cs, NULL);
+ }
+}
+
+static void
+northd_unpause(struct northd_ctx *ctx)
+{
+ if (ctx->paused) {
+ ovsdb_cs_set_lock(ctx->cs, ctx->lock_name);
+ ctx->paused = false;
+ }
+}
+
+static void
+warning_cb(uintptr_t arg OVS_UNUSED,
+ table_id table OVS_UNUSED,
+ const ddlog_record *rec,
+ ssize_t weight)
+{
+ size_t len;
+ const char *s = ddlog_get_str_with_length(rec, &len);
+ if (weight > 0) {
+ VLOG_WARN("New warning: %.*s", (int)len, s);
+ } else {
+ VLOG_WARN("Warning cleared: %.*s", (int)len, s);
+ }
+}
+
+static int
+ddlog_commit(ddlog_prog ddlog)
+{
+ ddlog_delta *new_delta = ddlog_transaction_commit_dump_changes(ddlog);
+    if (!new_delta) {
+ VLOG_WARN("Transaction commit failed");
+ return -1;
+ }
+
+ /* Remove warnings from delta and output them straight away. */
+ ddlog_delta *warnings = ddlog_delta_remove_table(new_delta, WARNING_TABLE_ID);
+ ddlog_delta_enumerate(warnings, warning_cb, 0);
+ ddlog_free_delta(warnings);
+
+ /* Merge changes into `delta`. */
+ ddlog_delta_union(delta, new_delta);
+
+ return 0;
+}
+
+static int
+ddlog_clear(struct northd_ctx *ctx)
+{
+ int n_failures = 0;
+ for (int i = 0; ctx->input_relations[i]; i++) {
+ char *table = xasprintf("%s%s", ctx->prefix, ctx->input_relations[i]);
+ if (ddlog_clear_relation(ctx->ddlog, ddlog_get_table_id(table))) {
+ n_failures++;
+ }
+ free(table);
+ }
+ if (n_failures) {
+ VLOG_WARN("failed to clear %d tables in %s database",
+ n_failures, ctx->db_name);
+ }
+ return n_failures;
+}
+
+static const struct json *
+json_object_get(const struct json *json, const char *member_name)
+{
+ return (json && json->type == JSON_OBJECT
+ ? shash_find_data(json_object(json), member_name)
+ : NULL);
+}
+
+/* Returns the new value of NB_Global::nb_cfg, if any, from the updates in
+ * <table-updates> provided by the caller, or INT64_MIN if none is present. */
+static int64_t
+get_nb_cfg(const struct json *table_updates)
+{
+ const struct json *nb_global = json_object_get(table_updates, "NB_Global");
+ if (nb_global) {
+ struct shash_node *row;
+ SHASH_FOR_EACH (row, json_object(nb_global)) {
+ const struct json *value = row->data;
+ const struct json *new = json_object_get(value, "new");
+ const struct json *nb_cfg = json_object_get(new, "nb_cfg");
+ if (nb_cfg && nb_cfg->type == JSON_INTEGER) {
+ return json_integer(nb_cfg);
+ }
+ }
+ }
+ return INT64_MIN;
+}
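+/* Sketch of the (hypothetical) table_updates shape this walks:
+ *   {"NB_Global": {"<row-uuid>": {"new": {"nb_cfg": 5, ...}}}}
+ * for which get_nb_cfg() returns 5. */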
+
+static void
+northd_parse_update(struct northd_ctx *ctx,
+ const struct ovsdb_cs_update_event *update)
+{
+ if (ddlog_transaction_start(ctx->ddlog)) {
+ VLOG_WARN("DDlog failed to start transaction");
+ return;
+ }
+
+ if (update->clear && ddlog_clear(ctx)) {
+ goto error;
+ }
+ char *updates_s = json_to_string(update->table_updates, 0);
+ if (ddlog_apply_ovsdb_updates(ctx->ddlog, ctx->prefix, updates_s)) {
+ VLOG_WARN("DDlog failed to apply updates %s", updates_s);
+ free(updates_s);
+ goto error;
+ }
+ free(updates_s);
+
+ /* Whenever a new 'nb_cfg' value comes in, take the current time and push
+ * it into the NbCfgTimestamp relation for the DDlog program to put into
+ * nb::NB_Global.nb_cfg_timestamp. */
+ static int64_t old_nb_cfg = INT64_MIN;
+ static int64_t old_nb_cfg_timestamp = INT64_MIN;
+ int64_t new_nb_cfg = old_nb_cfg;
+ int64_t new_nb_cfg_timestamp = old_nb_cfg_timestamp;
+ if (ctx->has_timestamp_columns) {
+ new_nb_cfg = get_nb_cfg(update->table_updates);
+ if (new_nb_cfg == INT64_MIN) {
+ new_nb_cfg = old_nb_cfg == INT64_MIN ? 0 : old_nb_cfg;
+ }
+ if (new_nb_cfg != old_nb_cfg) {
+ new_nb_cfg_timestamp = time_wall_msec();
+
+ ddlog_cmd *updates[2];
+ int n_updates = 0;
+ if (old_nb_cfg_timestamp != INT64_MIN) {
+ updates[n_updates++] = ddlog_delete_val_cmd(
+ NB_CFG_TIMESTAMP_ID, ddlog_i64(old_nb_cfg_timestamp));
+ }
+ updates[n_updates++] = ddlog_insert_cmd(
+ NB_CFG_TIMESTAMP_ID, ddlog_i64(new_nb_cfg_timestamp));
+ if (ddlog_apply_updates(ctx->ddlog, updates, n_updates) < 0) {
+ goto error;
+ }
+ }
+ }
+
+ /* Commit changes to DDlog. */
+ if (ddlog_commit(ctx->ddlog)) {
+ goto error;
+ }
+ old_nb_cfg = new_nb_cfg;
+ old_nb_cfg_timestamp = new_nb_cfg_timestamp;
+
+ /* This update may have implications for the other side, so
+ * immediately wake to check for more changes to be applied. */
+ poll_immediate_wake();
+
+ return;
+
+error:
+ ddlog_transaction_rollback(ctx->ddlog);
+}
+
+static void
+northd_process_txn_reply(struct northd_ctx *ctx,
+ const struct jsonrpc_msg *reply)
+{
+ if (!json_equal(reply->id, ctx->request_id)) {
+ VLOG_WARN("unexpected transaction reply");
+ return;
+ }
+
+ json_destroy(ctx->request_id);
+ ctx->request_id = NULL;
+
+ if (reply->type == JSONRPC_ERROR) {
+ char *s = jsonrpc_msg_to_string(reply);
+ VLOG_WARN("received database error: %s", s);
+ free(s);
+
+ ovsdb_cs_force_reconnect(ctx->cs);
+ return;
+ }
+
+ switch (ctx->state) {
+ case S_INITIAL:
+ OVS_NOT_REACHED();
+ break;
+
+ case S_OUTPUT_ONLY_DATA_REQUESTED:
+ json_destroy(ctx->output_only_data);
+ ctx->output_only_data = json_clone(reply->result);
+
+ ctx->state = S_UPDATE;
+ break;
+
+ case S_UPDATE:
+ /* Nothing to do. */
+ break;
+
+ default:
+ OVS_NOT_REACHED();
+ }
+}
+
+/* Processes a batch of messages from the database server on 'ctx'. */
+static void
+northd_run(struct northd_ctx *ctx)
+{
+ struct ovs_list events;
+ ovsdb_cs_run(ctx->cs, &events);
+
+ struct ovsdb_cs_event *event;
+ LIST_FOR_EACH_POP (event, list_node, &events) {
+ switch (event->type) {
+ case OVSDB_CS_EVENT_TYPE_RECONNECT:
+ json_destroy(ctx->request_id);
+ ctx->state = S_INITIAL;
+ break;
+
+ case OVSDB_CS_EVENT_TYPE_LOCKED:
+ break;
+
+ case OVSDB_CS_EVENT_TYPE_UPDATE:
+ northd_parse_update(ctx, &event->update);
+ break;
+
+ case OVSDB_CS_EVENT_TYPE_TXN_REPLY:
+ northd_process_txn_reply(ctx, event->txn_reply);
+ break;
+ }
+ ovsdb_cs_event_destroy(event);
+ }
+
+ if (ctx->state == S_INITIAL && ovsdb_cs_may_send_transaction(ctx->cs)) {
+ northd_send_output_only_data_request(ctx);
+ }
+}
+
+/* Pass the changes for 'ctx' to its database server. */
+static void
+northd_send_deltas(struct northd_ctx *ctx)
+{
+ if (ctx->request_id || !ovsdb_cs_may_send_transaction(ctx->cs)) {
+ return;
+ }
+
+ struct json *ops = get_database_ops(ctx);
+ if (!ops) {
+ return;
+ }
+
+ struct json *comment = json_object_create();
+ json_object_put_string(comment, "op", "comment");
+ json_object_put_string(comment, "comment", "ovn-northd-ddlog");
+ json_array_add(ops, comment);
+
+ ctx->request_id = ovsdb_cs_send_transaction(ctx->cs, ops);
+}
+
+static void
+northd_update_probe_interval_cb(
+ uintptr_t probe_intervalp_,
+ table_id table OVS_UNUSED,
+ const ddlog_record *rec,
+ ssize_t weight OVS_UNUSED)
+{
+ int *probe_intervalp = (int *) probe_intervalp_;
+
+ int64_t x = ddlog_get_i64(rec);
+ *probe_intervalp = (x > 0 && x < 1000 ? 1000
+ : x > INT_MAX ? INT_MAX
+ : x);
+}
+
+static void
+northd_update_probe_interval(struct northd_ctx *nb, struct northd_ctx *sb)
+{
+ /* 0 means that Northd_Probe_Interval is empty. That means that we haven't
+ * connected to the database and retrieved an initial snapshot. Thus, we
+ * set an infinite probe interval to allow for retrieving and stabilizing
+     * an initial snapshot of the database, which can take a long time.
+ *
+ * -1 means that Northd_Probe_Interval is nonempty but the database doesn't
+ * set a probe interval. Thus, we use the default probe interval.
+ *
+ * Any other value is an explicit probe interval request from the
+ * database. */
+ int probe_interval = 0;
+ table_id tid = ddlog_get_table_id("Northd_Probe_Interval");
+ ddlog_delta *probe_delta = ddlog_delta_get_table(delta, tid);
+    ddlog_delta_enumerate(probe_delta, northd_update_probe_interval_cb,
+                          (uintptr_t) &probe_interval);
+
+ ovsdb_cs_set_probe_interval(nb->cs, probe_interval);
+ ovsdb_cs_set_probe_interval(sb->cs, probe_interval);
+}
+
+/* Arranges for poll_block() to wake up when northd_run() has something to
+ * do or when activity occurs on a transaction on 'ctx'. */
+static void
+northd_wait(struct northd_ctx *ctx)
+{
+ ovsdb_cs_wait(ctx->cs);
+}
+
+/* ddlog-specific actions. */
+
+/* Generate OVSDB update command for delta-plus, delta-minus, and delta-update
+ * tables. */
+static void
+ddlog_table_update_deltas(struct ds *ds, ddlog_prog ddlog,
+ const char *db, const char *table)
+{
+ int error;
+ char *updates;
+
+ error = ddlog_dump_ovsdb_delta_tables(ddlog, delta, db, table, &updates);
+ if (error) {
+ VLOG_INFO("DDlog error %d dumping delta for table %s", error, table);
+ return;
+ }
+
+ if (!updates[0]) {
+ ddlog_free_json(updates);
+ return;
+ }
+
+ ds_put_cstr(ds, updates);
+ ds_put_char(ds, ',');
+ ddlog_free_json(updates);
+}
+
+/* Generate OVSDB update command for an output-only table. */
+static void
+ddlog_table_update_output(struct ds *ds, ddlog_prog ddlog,
+ const char *db, const char *table)
+{
+ int error;
+ char *updates;
+
+ error = ddlog_dump_ovsdb_output_table(ddlog, delta, db, table, &updates);
+ if (error) {
+ VLOG_WARN("%s: failed to generate update commands for "
+ "output-only table (error %d)", table, error);
+ return;
+ }
+ char *table_name = xasprintf("%s::Out_%s", db, table);
+ ddlog_delta_clear_table(delta, ddlog_get_table_id(table_name));
+ free(table_name);
+
+ if (!updates[0]) {
+ ddlog_free_json(updates);
+ return;
+ }
+
+ ds_put_cstr(ds, updates);
+ ds_put_char(ds, ',');
+ ddlog_free_json(updates);
+}
+
+/* A set of UUIDs.
+ *
+ * Not fully abstracted: the client still uses plain struct hmap, for
+ * example. */
+
+/* A node within a set of uuids. */
+struct uuidset_node {
+ struct hmap_node hmap_node;
+ struct uuid uuid;
+};
+
+static void uuidset_delete(struct hmap *uuidset, struct uuidset_node *);
+
+static void
+uuidset_destroy(struct hmap *uuidset)
+{
+ if (uuidset) {
+ struct uuidset_node *node, *next;
+
+ HMAP_FOR_EACH_SAFE (node, next, hmap_node, uuidset) {
+ uuidset_delete(uuidset, node);
+ }
+ hmap_destroy(uuidset);
+ }
+}
+
+static struct uuidset_node *
+uuidset_find(struct hmap *uuidset, const struct uuid *uuid)
+{
+ struct uuidset_node *node;
+
+ HMAP_FOR_EACH_WITH_HASH (node, hmap_node, uuid_hash(uuid), uuidset) {
+ if (uuid_equals(uuid, &node->uuid)) {
+ return node;
+ }
+ }
+
+ return NULL;
+}
+
+static void
+uuidset_insert(struct hmap *uuidset, const struct uuid *uuid)
+{
+ if (!uuidset_find(uuidset, uuid)) {
+ struct uuidset_node *node = xmalloc(sizeof *node);
+ node->uuid = *uuid;
+ hmap_insert(uuidset, &node->hmap_node, uuid_hash(&node->uuid));
+ }
+}
+
+static void
+uuidset_delete(struct hmap *uuidset, struct uuidset_node *node)
+{
+ hmap_remove(uuidset, &node->hmap_node);
+ free(node);
+}
+
+static struct ovsdb_error *
+parse_output_only_data(const struct json *txn_result, size_t index,
+ struct hmap *uuidset)
+{
+ if (txn_result->type != JSON_ARRAY || txn_result->array.n <= index) {
+ return ovsdb_syntax_error(txn_result, NULL,
+ "transaction result missing for "
+ "output-only relation %"PRIuSIZE, index);
+ }
+
+ struct ovsdb_parser p;
+    ovsdb_parser_init(&p, txn_result->array.elems[index], "select result");
+ const struct json *rows = ovsdb_parser_member(&p, "rows", OP_ARRAY);
+ struct ovsdb_error *error = ovsdb_parser_finish(&p);
+ if (error) {
+ return error;
+ }
+
+ for (size_t i = 0; i < rows->array.n; i++) {
+ const struct json *row = rows->array.elems[i];
+
+ ovsdb_parser_init(&p, row, "row");
+ const struct json *uuid = ovsdb_parser_member(&p, "_uuid", OP_ARRAY);
+ error = ovsdb_parser_finish(&p);
+ if (error) {
+ return error;
+ }
+
+ struct ovsdb_base_type base_type = OVSDB_BASE_UUID_INIT;
+ union ovsdb_atom atom;
+ error = ovsdb_atom_from_json(&atom, &base_type, uuid, NULL);
+ if (error) {
+ return error;
+ }
+ uuidset_insert(uuidset, &atom.uuid);
+ }
+
+ return NULL;
+}
+
+static bool
+get_ddlog_uuid(const ddlog_record *rec, struct uuid *uuid)
+{
+ if (!ddlog_is_int(rec)) {
+ return false;
+ }
+
+ __uint128_t u128 = ddlog_get_u128(rec);
+ uuid->parts[0] = u128 >> 96;
+ uuid->parts[1] = u128 >> 64;
+ uuid->parts[2] = u128 >> 32;
+ uuid->parts[3] = u128;
+ return true;
+}
+
+struct dump_index_data {
+ ddlog_prog prog;
+ struct hmap *rows_present;
+ const char *table;
+ struct ds *ops_s;
+};
+
+static void OVS_UNUSED
+index_cb(uintptr_t data_, const ddlog_record *rec)
+{
+ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
+ struct dump_index_data *data = (struct dump_index_data *) data_;
+
+ /* Extract the rec's row UUID as 'uuid'. */
+ const ddlog_record *rec_uuid = ddlog_get_named_struct_field(rec, "_uuid");
+ if (!rec_uuid) {
+ VLOG_WARN_RL(&rl, "%s: row has no _uuid column", data->table);
+ return;
+ }
+ struct uuid uuid;
+ if (!get_ddlog_uuid(rec_uuid, &uuid)) {
+ VLOG_WARN_RL(&rl, "%s: _uuid column has unexpected type", data->table);
+ return;
+ }
+
+    /* If a row with the given UUID was already in the database, then
+     * send an operation to update it; otherwise, send an operation to
+     * insert it. */
+ struct uuidset_node *node = uuidset_find(data->rows_present, &uuid);
+ char *s = NULL;
+ int ret;
+ if (node) {
+ uuidset_delete(data->rows_present, node);
+ ret = ddlog_into_ovsdb_update_str(data->prog, data->table, rec, &s);
+ } else {
+ ret = ddlog_into_ovsdb_insert_str(data->prog, data->table, rec, &s);
+ }
+ if (ret) {
+ VLOG_WARN_RL(&rl, "%s: ddlog could not convert row into database op",
+ data->table);
+ return;
+ }
+ ds_put_format(data->ops_s, "%s,", s);
+ ddlog_free_json(s);
+}
+
+static struct json *
+where_uuid_equals(const struct uuid *uuid)
+{
+ return
+ json_array_create_1(
+ json_array_create_3(
+ json_string_create("_uuid"),
+ json_string_create("=="),
+ json_array_create_2(
+ json_string_create("uuid"),
+ json_string_create_nocopy(
+ xasprintf(UUID_FMT, UUID_ARGS(uuid))))));
+}
+
+static void
+add_delete_row_op(const char *table, const struct uuid *uuid, struct ds *ops_s)
+{
+ struct json *op = json_object_create();
+ json_object_put_string(op, "op", "delete");
+ json_object_put_string(op, "table", table);
+ json_object_put(op, "where", where_uuid_equals(uuid));
+ json_to_ds(op, 0, ops_s);
+ json_destroy(op);
+ ds_put_char(ops_s, ',');
+}
+
+static void
+northd_update_sb_cfg_cb(
+ uintptr_t new_sb_cfgp_,
+ table_id table OVS_UNUSED,
+ const ddlog_record *rec,
+ ssize_t weight)
+{
+ int64_t *new_sb_cfgp = (int64_t *) new_sb_cfgp_;
+
+ if (weight < 0) {
+ return;
+ }
+
+ if (ddlog_get_int(rec, NULL, 0) <= sizeof *new_sb_cfgp) {
+ *new_sb_cfgp = ddlog_get_i64(rec);
+ }
+}
+
+static struct json *
+get_database_ops(struct northd_ctx *ctx)
+{
+ struct ds ops_s = DS_EMPTY_INITIALIZER;
+ ds_put_char(&ops_s, '[');
+ json_string_escape(ctx->db_name, &ops_s);
+ ds_put_char(&ops_s, ',');
+ size_t start_len = ops_s.length;
+
+ for (const char **p = ctx->output_relations; *p; p++) {
+ ddlog_table_update_deltas(&ops_s, ctx->ddlog, ctx->db_name, *p);
+ }
+
+ if (ctx->output_only_data) {
+ /*
+ * We just reconnected to the database (or connected for the first time
+ * in this execution). We assume that the contents of the output-only
+         * tables might have changed (this is especially true the first time we
+         * connect to the database in a given execution, of course; we can't
+         * assume that the tables have any particular contents in this case).
+ *
+ * ctx->output_only_data is a database reply that tells us the
+ * UUIDs of the rows that exist in the database. Our strategy is to
+ * compare these UUIDs to the UUIDs of the rows that exist in the DDlog
+ * analogues of these tables, and then add, delete, or update rows as
+ * necessary.
+ *
+ * (ctx->output_only_data only gives row UUIDs, not full row
+ * contents. That means that for rows that exist in OVSDB and in
+         * DDlog, we always send an update to set all the columns. It wouldn't
+ * save bandwidth to do anything else, since we'd always have to send
+ * the full row contents in one direction and if there were differences
+ * we'd have to send the contents in both directions. With this
+ * strategy we only send them in one direction even in the worst case.)
+ *
+         * (We can't just send an operation to delete all the rows and then
+         * re-add them all in the same transaction, because ovsdb-server
+         * rejects deleting a row with a given UUID and then adding the same
+         * UUID back within a single transaction.)
+ */
+ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 2);
+
+ for (size_t i = 0; ctx->output_only_relations[i]; i++) {
+ const char *table = ctx->output_only_relations[i];
+
+ /* Parse the list of row UUIDs received from OVSDB. */
+ struct hmap rows_present = HMAP_INITIALIZER(&rows_present);
+ struct ovsdb_error *error = parse_output_only_data(
+ ctx->output_only_data, i, &rows_present);
+ if (error) {
+ char *s = ovsdb_error_to_string_free(error);
+ VLOG_WARN_RL(&rl, "%s", s);
+ free(s);
+ uuidset_destroy(&rows_present);
+ continue;
+ }
+
+ /* Get the index_id for the DDlog table.
+ *
+ * We require output-only tables to have an accompanying index
+ * named <table>_Index. */
+ char *index = xasprintf("%s_Index", table);
+ index_id idxid = ddlog_get_index_id(index);
+ if (idxid == -1) {
+ VLOG_WARN_RL(&rl, "%s: unknown index", index);
+ free(index);
+ uuidset_destroy(&rows_present);
+ continue;
+ }
+ free(index);
+
+        /* For each row in the index, update the corresponding OVSDB row if
+         * there is one; otherwise, insert a new row. */
+ struct dump_index_data cbdata = {
+ ctx->ddlog, &rows_present, table, &ops_s
+ };
+ ddlog_dump_index(ctx->ddlog, idxid, index_cb, (uintptr_t) &cbdata);
+
+ /* Any uuids remaining in 'rows_present' are rows that are in OVSDB
+ * but not DDlog. Delete them from OVSDB. */
+ struct uuidset_node *node;
+ HMAP_FOR_EACH (node, hmap_node, &rows_present) {
+ add_delete_row_op(table, &node->uuid, &ops_s);
+ }
+ uuidset_destroy(&rows_present);
+
+ /* Discard any queued output to this table, since we just
+ * did a full sync to it. */
+ struct ds tmp = DS_EMPTY_INITIALIZER;
+ ddlog_table_update_output(&tmp, ctx->ddlog, ctx->db_name, table);
+ ds_destroy(&tmp);
+ }
+
+ json_destroy(ctx->output_only_data);
+ ctx->output_only_data = NULL;
+ } else {
+ for (const char **p = ctx->output_only_relations; *p; p++) {
+ ddlog_table_update_output(&ops_s, ctx->ddlog, ctx->db_name, *p);
+ }
+ }
+
+ /* If we're updating nb::NB_Global.sb_cfg, then also update
+ * sb_cfg_timestamp.
+ *
+ * XXX If the transaction we're sending to the database fails, then
+ * currently as written we'll never find out about it and sb_cfg_timestamp
+ * will not be updated.
+ */
+ static int64_t old_sb_cfg = INT64_MIN;
+ static int64_t old_sb_cfg_timestamp = INT64_MIN;
+ int64_t new_sb_cfg = old_sb_cfg;
+ if (ctx->has_timestamp_columns) {
+ table_id sb_cfg_tid = ddlog_get_table_id("SbCfg");
+ ddlog_delta *sb_cfg_delta = ddlog_delta_get_table(delta, sb_cfg_tid);
+ ddlog_delta_enumerate(sb_cfg_delta, northd_update_sb_cfg_cb,
+ (uintptr_t) &new_sb_cfg);
+ ddlog_free_delta(sb_cfg_delta);
+
+ if (new_sb_cfg != old_sb_cfg) {
+ old_sb_cfg = new_sb_cfg;
+ old_sb_cfg_timestamp = time_wall_msec();
+            ds_put_format(&ops_s,
+                          "{\"op\":\"update\",\"table\":\"NB_Global\","
+                          "\"where\":[],"
+                          "\"row\":{\"sb_cfg_timestamp\":%"PRId64"}},",
+                          old_sb_cfg_timestamp);
+ }
+ }
+
+ struct json *ops;
+ if (ops_s.length > start_len) {
+ ds_chomp(&ops_s, ',');
+ ds_put_char(&ops_s, ']');
+ ops = json_from_string(ds_cstr(&ops_s));
+ } else {
+ ops = NULL;
+ }
+
+ ds_destroy(&ops_s);
+
+ return ops;
+}
+
+/* Callback used by the ddlog engine to print error messages. Note that
+ * this is only used by the ddlog runtime, as opposed to the application
+ * code in ovn_northd.dl, which uses the vlog facility directly. */
+static void
+ddlog_print_error(const char *msg)
+{
+ VLOG_ERR("%s", msg);
+}
+
+static void
+usage(void)
+{
+ printf("\
+%s: OVN northbound management daemon\n\
+usage: %s [OPTIONS]\n\
+\n\
+Options:\n\
+ --ovnnb-db=DATABASE connect to ovn-nb database at DATABASE\n\
+ (default: %s)\n\
+ --ovnsb-db=DATABASE connect to ovn-sb database at DATABASE\n\
+ (default: %s)\n\
+ --unixctl=SOCKET override default control socket name\n\
+ -h, --help display this help message\n\
+ -o, --options list available options\n\
+ -V, --version display version information\n\
+", program_name, program_name, default_nb_db(), default_sb_db());
+ daemon_usage();
+ vlog_usage();
+ stream_usage("database", true, true, false);
+}
+
+static void
+parse_options(int argc OVS_UNUSED, char *argv[] OVS_UNUSED)
+{
+ enum {
+ OVN_DAEMON_OPTION_ENUMS,
+ VLOG_OPTION_ENUMS,
+ SSL_OPTION_ENUMS,
+ OPT_DDLOG_RECORD
+ };
+ static const struct option long_options[] = {
+ {"ddlog-record", required_argument, NULL, OPT_DDLOG_RECORD},
+ {"ovnsb-db", required_argument, NULL, 'd'},
+ {"ovnnb-db", required_argument, NULL, 'D'},
+ {"unixctl", required_argument, NULL, 'u'},
+ {"help", no_argument, NULL, 'h'},
+ {"options", no_argument, NULL, 'o'},
+ {"version", no_argument, NULL, 'V'},
+ OVN_DAEMON_LONG_OPTIONS,
+ VLOG_LONG_OPTIONS,
+ STREAM_SSL_LONG_OPTIONS,
+ {NULL, 0, NULL, 0},
+ };
+ char *short_options = ovs_cmdl_long_options_to_short_options(long_options);
+
+ for (;;) {
+ int c;
+
+ c = getopt_long(argc, argv, short_options, long_options, NULL);
+ if (c == -1) {
+ break;
+ }
+
+ switch (c) {
+ OVN_DAEMON_OPTION_HANDLERS;
+ VLOG_OPTION_HANDLERS;
+ STREAM_SSL_OPTION_HANDLERS;
+
+ case OPT_DDLOG_RECORD:
+ record_file = optarg;
+ break;
+
+ case 'd':
+ ovnsb_db = optarg;
+ break;
+
+ case 'D':
+ ovnnb_db = optarg;
+ break;
+
+ case 'u':
+ unixctl_path = optarg;
+ break;
+
+ case 'h':
+ usage();
+ exit(EXIT_SUCCESS);
+
+ case 'o':
+ ovs_cmdl_print_options(long_options);
+ exit(EXIT_SUCCESS);
+
+ case 'V':
+ ovs_print_version(0, 0);
+ exit(EXIT_SUCCESS);
+
+ default:
+ break;
+ }
+ }
+
+ if (!ovnsb_db || !ovnsb_db[0]) {
+ ovnsb_db = default_sb_db();
+ }
+
+ if (!ovnnb_db || !ovnnb_db[0]) {
+ ovnnb_db = default_nb_db();
+ }
+
+ free(short_options);
+}
+
+int
+main(int argc, char *argv[])
+{
+ int res = EXIT_SUCCESS;
+ struct unixctl_server *unixctl;
+ int retval;
+ bool exiting;
+
+ init_table_ids();
+
+ fatal_ignore_sigpipe();
+ ovs_cmdl_proctitle_init(argc, argv);
+ set_program_name(argv[0]);
+ service_start(&argc, &argv);
+ parse_options(argc, argv);
+
+ daemonize_start(false);
+
+ char *abs_unixctl_path = get_abs_unix_ctl_path(unixctl_path);
+ retval = unixctl_server_create(abs_unixctl_path, &unixctl);
+ free(abs_unixctl_path);
+
+ if (retval) {
+ exit(EXIT_FAILURE);
+ }
+
+ struct northd_status status = {
+ .locked = false,
+ .pause = false,
+ };
+ unixctl_command_register("exit", "", 0, 0, ovn_northd_exit, &exiting);
+ unixctl_command_register("status", "", 0, 0, ovn_northd_status, &status);
+
+ ddlog_prog ddlog;
+ ddlog = ddlog_run(1, false, NULL, 0, ddlog_print_error, &delta);
+ if (!ddlog) {
+ ovs_fatal(0, "DDlog instance could not be created");
+ }
+
+ int replay_fd = -1;
+ if (record_file) {
+ replay_fd = open(record_file, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (replay_fd < 0) {
+ ovs_fatal(errno, "%s: could not create DDlog record file",
+ record_file);
+ }
+
+ if (ddlog_record_commands(ddlog, replay_fd)) {
+ ovs_fatal(0, "could not enable DDlog command recording");
+ }
+ }
+
+ struct northd_ctx *nb_ctx = northd_ctx_create(
+ ovnnb_db, "OVN_Northbound", "nb", NULL, ddlog,
+ nb_input_relations, nb_output_relations, nb_output_only_relations);
+ struct northd_ctx *sb_ctx = northd_ctx_create(
+ ovnsb_db, "OVN_Southbound", "sb", "ovn_northd", ddlog,
+ sb_input_relations, sb_output_relations, sb_output_only_relations);
+
+ unixctl_command_register("pause", "", 0, 0, ovn_northd_pause, sb_ctx);
+ unixctl_command_register("resume", "", 0, 0, ovn_northd_resume, sb_ctx);
+ unixctl_command_register("is-paused", "", 0, 0, ovn_northd_is_paused,
+ sb_ctx);
+
+    char *ovn_internal_version = ovn_get_internal_version();
+    VLOG_INFO("OVN internal version is : [%s]", ovn_internal_version);
+    free(ovn_internal_version);
+
+ daemonize_complete();
+
+ /* Main loop. */
+ exiting = false;
+ while (!exiting) {
+ bool has_lock = ovsdb_cs_has_lock(sb_ctx->cs);
+ if (!sb_ctx->paused) {
+ if (has_lock && !status.locked) {
+ VLOG_INFO("ovn-northd lock acquired. "
+ "This ovn-northd instance is now active.");
+ } else if (!has_lock && status.locked) {
+ VLOG_INFO("ovn-northd lock lost. "
+ "This ovn-northd instance is now on standby.");
+ }
+ }
+ status.locked = has_lock;
+ status.pause = sb_ctx->paused;
+
+ northd_run(nb_ctx);
+ northd_run(sb_ctx);
+ northd_update_probe_interval(nb_ctx, sb_ctx);
+ if (ovsdb_cs_has_lock(sb_ctx->cs) &&
+ sb_ctx->state == S_UPDATE &&
+ nb_ctx->state == S_UPDATE &&
+ ovsdb_cs_may_send_transaction(sb_ctx->cs) &&
+ ovsdb_cs_may_send_transaction(nb_ctx->cs)) {
+ northd_send_deltas(nb_ctx);
+ northd_send_deltas(sb_ctx);
+ }
+
+ unixctl_server_run(unixctl);
+
+ northd_wait(nb_ctx);
+ northd_wait(sb_ctx);
+ unixctl_server_wait(unixctl);
+ if (exiting) {
+ poll_immediate_wake();
+ }
+ poll_block();
+ if (should_service_stop()) {
+ exiting = true;
+ }
+ }
+
+ northd_ctx_destroy(nb_ctx);
+ northd_ctx_destroy(sb_ctx);
+
+ ddlog_stop(ddlog);
+
+ if (replay_fd >= 0) {
+ fsync(replay_fd);
+ close(replay_fd);
+ }
+
+ unixctl_server_destroy(unixctl);
+ service_stop();
+
+ exit(res);
+}
+
+static void
+ovn_northd_exit(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *exiting_)
+{
+ bool *exiting = exiting_;
+ *exiting = true;
+
+ unixctl_command_reply(conn, NULL);
+}
+
+static void
+ovn_northd_pause(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *sb_ctx_)
+{
+ struct northd_ctx *sb_ctx = sb_ctx_;
+ northd_pause(sb_ctx);
+ unixctl_command_reply(conn, NULL);
+}
+
+static void
+ovn_northd_resume(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *sb_ctx_)
+{
+ struct northd_ctx *sb_ctx = sb_ctx_;
+ northd_unpause(sb_ctx);
+ unixctl_command_reply(conn, NULL);
+}
+
+static void
+ovn_northd_is_paused(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *sb_ctx_)
+{
+ struct northd_ctx *sb_ctx = sb_ctx_;
+ unixctl_command_reply(conn, sb_ctx->paused ? "true" : "false");
+}
+
+static void
+ovn_northd_status(struct unixctl_conn *conn, int argc OVS_UNUSED,
+ const char *argv[] OVS_UNUSED, void *status_)
+{
+ struct northd_status *status = status_;
+
+ /* Use a labeled formatted output so we can add more to the status command
+ * later without breaking any consuming scripts. */
+ char *s = xasprintf("Status: %s\n",
+ status->pause ? "paused"
+ : status->locked ? "active"
+ : "standby");
+ unixctl_command_reply(conn, s);
+ free(s);
+}
new file mode 100644
@@ -0,0 +1,29 @@
+-o Address_Set
+-o DHCP_Options
+-o DHCPv6_Options
+-o DNS
+-o Datapath_Binding
+-o Gateway_Chassis
+-o HA_Chassis
+-o HA_Chassis_Group
+-o IP_Multicast
+-o Load_Balancer
+-o MAC_Binding
+-o Meter
+-o Meter_Band
+-o Multicast_Group
+-o Port_Binding
+-o Port_Group
+-o RBAC_Permission
+-o RBAC_Role
+-o SB_Global
+-o Service_Monitor
+--output-only Logical_Flow
+--ro IP_Multicast.seq_no
+--ro Port_Binding.chassis
+--ro Port_Binding.encap
+--ro Port_Binding.virtual_parent
+--ro SB_Global.connections
+--ro SB_Global.external_ids
+--ro SB_Global.ssl
+--ro Service_Monitor.status
new file mode 100644
@@ -0,0 +1,386 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import ovsdb
+
+
+/* A logical port is enabled if its 'enabled' flag is absent or true. */
+function is_enabled(s: Option<bool>): bool = {
+ s != Some{false}
+}
+
+/*
+ * Ethernet addresses
+ */
+extern type eth_addr
+
+extern function eth_addr_zero(): eth_addr
+extern function eth_addr2string(addr: eth_addr): string
+function to_string(addr: eth_addr): string {
+ eth_addr2string(addr)
+}
+extern function scan_eth_addr(s: string): Option<eth_addr>
+extern function scan_eth_addr_prefix(s: string): Option<bit<64>>
+extern function eth_addr_from_string(s: string): Option<eth_addr>
+extern function eth_addr_to_uint64(ea: eth_addr): bit<64>
+extern function eth_addr_from_uint64(x: bit<64>): eth_addr
+extern function eth_addr_mark_random(ea: eth_addr): eth_addr
+
+function pseudorandom_mac(seed: uuid, variant: bit<16>) : bit<64> = {
+ eth_addr_to_uint64(eth_addr_mark_random(eth_addr_from_uint64(hash64(seed ++ variant))))
+}
+
+/*
+ * IPv4 addresses
+ */
+
+extern type in_addr
+
+function to_string(ip: in_addr): string = {
+ var x = iptohl(ip);
+ "${x >> 24}.${(x >> 16) & 'hff}.${(x >> 8) & 'hff}.${x & 'hff}"
+}
+
+function ip_is_cidr(netmask: in_addr): bool {
+ var x = ~iptohl(netmask);
+ (x & (x + 1)) == 0
+}
+function ip_is_local_multicast(ip: in_addr): bool {
+ (iptohl(ip) & 32'hffffff00) == 32'he0000000
+}
+
+function ip_create_mask(plen: bit<32>): in_addr {
+ hltoip((64'h00000000ffffffff << (32 - plen))[31:0])
+}
+
+function ip_bitxor(a: in_addr, b: in_addr): in_addr {
+ hltoip(iptohl(a) ^ iptohl(b))
+}
+
+function ip_bitand(a: in_addr, b: in_addr): in_addr {
+ hltoip(iptohl(a) & iptohl(b))
+}
+
+function ip_network(addr: in_addr, mask: in_addr): in_addr {
+ hltoip(iptohl(addr) & iptohl(mask))
+}
+
+function ip_host(addr: in_addr, mask: in_addr): in_addr {
+ hltoip(iptohl(addr) & ~iptohl(mask))
+}
+
+function ip_host_is_zero(addr: in_addr, mask: in_addr): bool {
+ ip_is_zero(ip_host(addr, mask))
+}
+
+function ip_is_zero(a: in_addr): bool {
+ iptohl(a) == 0
+}
+
+function ip_bcast(addr: in_addr, mask: in_addr): in_addr {
+ hltoip(iptohl(addr) | ~iptohl(mask))
+}
+
+extern function ip_parse(s: string): Option<in_addr>
+extern function ip_parse_masked(s: string): Either<string/*err*/, (in_addr/*host_ip*/, in_addr/*mask*/)>
+extern function ip_parse_cidr(s: string): Either<string/*err*/, (in_addr/*ip*/, bit<32>/*plen*/)>
+extern function ip_count_cidr_bits(ip: in_addr): Option<bit<8>>
+
+/* True if both 'ips' are in the same network as defined by netmask 'mask',
+ * false otherwise. */
+function ip_same_network(ips: (in_addr, in_addr), mask: in_addr): bool {
+ ((iptohl(ips.0) ^ iptohl(ips.1)) & iptohl(mask)) == 0
+}
+
+extern function iptohl(addr: in_addr): bit<32>
+extern function hltoip(addr: bit<32>): in_addr
+extern function scan_static_dynamic_ip(s: string): Option<in_addr>
+
+/*
+ * parse IPv4 address list of the form:
+ * "10.0.0.4 10.0.0.10 10.0.0.20..10.0.0.50 10.0.0.100..10.0.0.110"
+ */
+extern function parse_ip_list(ips: string): Either<string, Vec<(in_addr, Option<in_addr>)>>
+
+/*
+ * IPv6 addresses
+ */
+extern type in6_addr
+
+extern function in6_generate_lla(ea: eth_addr): in6_addr
+extern function in6_generate_eui64(ea: eth_addr, prefix: in6_addr): in6_addr
+extern function in6_is_lla(addr: in6_addr): bool
+extern function in6_addr_solicited_node(ip6: in6_addr): in6_addr
+
+extern function ipv6_string_mapped(addr: in6_addr): string
+extern function ipv6_parse_masked(s: string): Either<string/*err*/, (in6_addr/*ip*/, in6_addr/*mask*/)>
+extern function ipv6_parse(s: string): Option<in6_addr>
+extern function ipv6_parse_cidr(s: string): Either<string/*err*/, (in6_addr/*ip*/, bit<32>/*plen*/)>
+extern function ipv6_bitxor(a: in6_addr, b: in6_addr): in6_addr
+extern function ipv6_bitand(a: in6_addr, b: in6_addr): in6_addr
+extern function ipv6_bitnot(a: in6_addr): in6_addr
+extern function ipv6_create_mask(mask: bit<32>): in6_addr
+extern function ipv6_is_zero(a: in6_addr): bool
+extern function ipv6_is_v4mapped(a: in6_addr): bool
+extern function ipv6_is_routable_multicast(a: in6_addr): bool
+extern function ipv6_is_all_hosts(a: in6_addr): bool
+
+function ipv6_network(addr: in6_addr, mask: in6_addr): in6_addr {
+ ipv6_bitand(addr, mask)
+}
+
+function ipv6_host(addr: in6_addr, mask: in6_addr): in6_addr {
+ ipv6_bitand(addr, ipv6_bitnot(mask))
+}
+
+/* True if both 'ips' are in the same network as defined by netmask 'mask',
+ * false otherwise. */
+function ipv6_same_network(ips: (in6_addr, in6_addr), mask: in6_addr): bool {
+ ipv6_network(ips.0, mask) == ipv6_network(ips.1, mask)
+}
+
+extern function ipv6_host_is_zero(addr: in6_addr, mask: in6_addr): bool
+extern function ipv6_multicast_to_ethernet(ip6: in6_addr): eth_addr
+extern function ipv6_is_cidr(ip6: in6_addr): bool
+extern function ipv6_count_cidr_bits(ip6: in6_addr): Option<bit<8>>
+
+extern function inet6_ntop(addr: in6_addr): string
+function to_string(addr: in6_addr): string = {
+ inet6_ntop(addr)
+}
+
+/*
+ * IPv4 | IPv6 addresses
+ */
+
+typedef v46_ip = IPv4 { ipv4: in_addr } | IPv6 { ipv6: in6_addr }
+
+function ip46_parse_cidr(s: string) : Option<(v46_ip, bit<32>)> = {
+ match (ip_parse_cidr(s)) {
+ Right{(ipv4, plen)} -> return Some{(IPv4{ipv4}, plen)},
+ _ -> ()
+ };
+ match (ipv6_parse_cidr(s)) {
+ Right{(ipv6, plen)} -> return Some{(IPv6{ipv6}, plen)},
+ _ -> ()
+ };
+ None
+}
+function ip46_parse_masked(s: string) : Option<(v46_ip, v46_ip)> = {
+ match (ip_parse_masked(s)) {
+ Right{(ipv4, mask)} -> return Some{(IPv4{ipv4}, IPv4{mask})},
+ _ -> ()
+ };
+ match (ipv6_parse_masked(s)) {
+ Right{(ipv6, mask)} -> return Some{(IPv6{ipv6}, IPv6{mask})},
+ _ -> ()
+ };
+ None
+}
+function ip46_parse(s: string) : Option<v46_ip> = {
+ match (ip_parse(s)) {
+ Some{ipv4} -> return Some{IPv4{ipv4}},
+ _ -> ()
+ };
+ match (ipv6_parse(s)) {
+ Some{ipv6} -> return Some{IPv6{ipv6}},
+ _ -> ()
+ };
+ None
+}
+function to_string(ip46: v46_ip) : string = {
+ match (ip46) {
+ IPv4{ipv4} -> "${ipv4}",
+ IPv6{ipv6} -> "${ipv6}"
+ }
+}
+function to_bracketed_string(ip46: v46_ip) : string = {
+ match (ip46) {
+ IPv4{ipv4} -> "${ipv4}",
+ IPv6{ipv6} -> "[${ipv6}]"
+ }
+}
+
+function ip46_get_network(ip46: v46_ip, plen: bit<32>) : v46_ip {
+ match (ip46) {
+ IPv4{ipv4} -> IPv4{ip_bitand(ipv4, ip_create_mask(plen))},
+ IPv6{ipv6} -> IPv6{ipv6_bitand(ipv6, ipv6_create_mask(plen))}
+ }
+}
+
+function ip46_is_all_ones(ip46: v46_ip) : bool {
+ match (ip46) {
+ IPv4{ipv4} -> ipv4 == ip_create_mask(32),
+ IPv6{ipv6} -> ipv6 == ipv6_create_mask(128)
+ }
+}
+
+function ip46_count_cidr_bits(ip46: v46_ip) : Option<bit<8>> {
+ match (ip46) {
+ IPv4{ipv4} -> ip_count_cidr_bits(ipv4),
+ IPv6{ipv6} -> ipv6_count_cidr_bits(ipv6)
+ }
+}
+
+function ip46_ipX(ip46: v46_ip) : string {
+ match (ip46) {
+ IPv4{_} -> "ip4",
+ IPv6{_} -> "ip6"
+ }
+}
+
+function ip46_xxreg(ip46: v46_ip) : string {
+ match (ip46) {
+ IPv4{_} -> "",
+ IPv6{_} -> "xx"
+ }
+}
+
+typedef ipv4_netaddr = IPV4NetAddr {
+ addr: in_addr, /* 192.168.10.123 */
+ plen: bit<32> /* CIDR Prefix: 24. */
+}
+
+/* Returns the netmask. */
+function ipv4_netaddr_mask(na: ipv4_netaddr): in_addr {
+ ip_create_mask(na.plen)
+}
+
+/* Returns the broadcast address. */
+function ipv4_netaddr_bcast(na: ipv4_netaddr): in_addr {
+ ip_bcast(na.addr, ipv4_netaddr_mask(na))
+}
+
+/* Returns the network (with the host bits zeroed). */
+function ipv4_netaddr_network(na: ipv4_netaddr): in_addr {
+ ip_network(na.addr, ipv4_netaddr_mask(na))
+}
+
+/* Returns the host (with the network bits zeroed). */
+function ipv4_netaddr_host(na: ipv4_netaddr): in_addr {
+ ip_host(na.addr, ipv4_netaddr_mask(na))
+}
+
+/* Match on the host if the host part is nonzero, otherwise match on the
+ * network. */
+function ipv4_netaddr_match_host_or_network(na: ipv4_netaddr): string {
+ if (na.plen < 32 and ip_is_zero(ipv4_netaddr_host(na))) {
+ "${na.addr}/${na.plen}"
+ } else {
+ "${na.addr}"
+ }
+}
+
+/* Match on the network. */
+function ipv4_netaddr_match_network(na: ipv4_netaddr): string {
+ if (na.plen < 32) {
+ "${ipv4_netaddr_network(na)}/${na.plen}"
+ } else {
+ "${na.addr}"
+ }
+}
+
+typedef ipv6_netaddr = IPV6NetAddr {
+ addr: in6_addr, /* fc00::1 */
+ plen: bit<32> /* CIDR Prefix: 64 */
+}
+
+/* Returns the netmask. */
+function ipv6_netaddr_mask(na: ipv6_netaddr): in6_addr {
+ ipv6_create_mask(na.plen)
+}
+
+/* Returns the network (with the host bits zeroed). */
+function ipv6_netaddr_network(na: ipv6_netaddr): in6_addr {
+ ipv6_network(na.addr, ipv6_netaddr_mask(na))
+}
+
+/* Returns the host (with the network bits zeroed). */
+function ipv6_netaddr_host(na: ipv6_netaddr): in6_addr {
+ ipv6_host(na.addr, ipv6_netaddr_mask(na))
+}
+
+function ipv6_netaddr_solicited_node(na: ipv6_netaddr): in6_addr {
+ in6_addr_solicited_node(na.addr)
+}
+
+function ipv6_netaddr_is_lla(na: ipv6_netaddr): bool {
+ return in6_is_lla(ipv6_netaddr_network(na))
+}
+
+/* Match on the network. */
+function ipv6_netaddr_match_network(na: ipv6_netaddr): string {
+ if (na.plen < 128) {
+ "${ipv6_netaddr_network(na)}/${na.plen}"
+ } else {
+ "${na.addr}"
+ }
+}
+
+typedef lport_addresses = LPortAddress {
+ ea: eth_addr,
+ ipv4_addrs: Vec<ipv4_netaddr>,
+ ipv6_addrs: Vec<ipv6_netaddr>
+}
+
+function to_string(addr: lport_addresses): string = {
+ var addrs = ["${addr.ea}"];
+ for (ip4 in addr.ipv4_addrs) {
+ vec_push(addrs, "${ip4.addr}")
+ };
+
+ for (ip6 in addr.ipv6_addrs) {
+ vec_push(addrs, "${ip6.addr}")
+ };
+
+ string_join(addrs, " ")
+}
+
+/*
+ * Packet header lengths
+ */
+function eTH_HEADER_LEN(): integer = 14
+function vLAN_HEADER_LEN(): integer = 4
+function vLAN_ETH_HEADER_LEN(): integer = eTH_HEADER_LEN() + vLAN_HEADER_LEN()
+
+/*
+ * Logging
+ */
+extern function warn(msg: string): ()
+extern function err(msg: string): ()
+extern function abort(msg: string): ()
+
+/*
+ * C functions imported from OVN
+ */
+extern function is_dynamic_lsp_address(addr: string): bool
+extern function extract_lsp_addresses(address: string): Option<lport_addresses>
+extern function extract_addresses(address: string): Option<lport_addresses>
+extern function extract_lrp_networks(mac: string, networks: Set<string>): Option<lport_addresses>
+
+extern function split_addresses(addr: string): (Set<string>, Set<string>)
+
+extern function ovn_internal_version(): string
+
+/*
+ * C functions imported from OVS
+ */
+extern function json_string_escape(s: string): string
+
+/* Returns the number of 1-bits in `x`, between 0 and 64 inclusive */
+extern function count_1bits(x: bit<64>): bit<8>
+
+/* For a 'key' of the form "IP:port" or just "IP", returns
+ * (v46_ip, port) tuple. */
+extern function ip_address_and_port_from_lb_key(k: string): Option<(v46_ip, bit<16>)>
new file mode 100644
@@ -0,0 +1,867 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+use ::nom::*;
+use ::differential_datalog::record;
+use ::std::ffi;
+use ::std::ptr;
+use ::std::default;
+use ::std::process;
+use ::std::os::raw;
+use ::libc;
+
+use crate::ddlog_std;
+
+pub fn warn(msg: &String) {
+ warn_(msg.as_str())
+}
+
+pub fn warn_(msg: &str) {
+ unsafe {
+ ddlog_warn(ffi::CString::new(msg).unwrap().as_ptr());
+ }
+}
+
+pub fn err_(msg: &str) {
+ unsafe {
+ ddlog_err(ffi::CString::new(msg).unwrap().as_ptr());
+ }
+}
+
+pub fn abort(msg: &String) {
+ abort_(msg.as_str())
+}
+
+fn abort_(msg: &str) {
+ err_(format!("DDlog error: {}.", msg).as_ref());
+ process::abort();
+}
+
+const ETH_ADDR_SIZE: usize = 6;
+const IN6_ADDR_SIZE: usize = 16;
+const INET6_ADDRSTRLEN: usize = 46;
+const INET_ADDRSTRLEN: usize = 16;
+const ETH_ADDR_STRLEN: usize = 17;
+
+const AF_INET: usize = 2;
+const AF_INET6: usize = 10;
+
+/* Implementation for externs declared in ovn.dl */
+
+#[repr(C)]
+#[derive(Default, PartialEq, Eq, PartialOrd, Ord, Clone, Hash, Serialize, Deserialize, Debug)]
+pub struct eth_addr {
+ x: [u8; ETH_ADDR_SIZE]
+}
+
+pub fn eth_addr_zero() -> eth_addr {
+ eth_addr { x: [0; ETH_ADDR_SIZE] }
+}
+
+pub fn eth_addr2string(addr: &eth_addr) -> String {
+ format!("{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",
+ addr.x[0], addr.x[1], addr.x[2], addr.x[3], addr.x[4], addr.x[5])
+}
+
+pub fn eth_addr_from_string(s: &String) -> ddlog_std::Option<eth_addr> {
+ let mut ea: eth_addr = Default::default();
+ unsafe {
+ if ovs::eth_addr_from_string(string2cstr(s).as_ptr(), &mut ea as *mut eth_addr) {
+ ddlog_std::Option::Some{x: ea}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn eth_addr_from_uint64(x: &u64) -> eth_addr {
+ let mut ea: eth_addr = Default::default();
+ unsafe {
+ ovs::eth_addr_from_uint64(*x as libc::uint64_t, &mut ea as *mut eth_addr);
+ ea
+ }
+}
+
+pub fn eth_addr_mark_random(ea: &eth_addr) -> eth_addr {
+ unsafe {
+ let mut ea_new = ea.clone();
+ ovs::eth_addr_mark_random(&mut ea_new as *mut eth_addr);
+ ea_new
+ }
+}
+
+pub fn eth_addr_to_uint64(ea: &eth_addr) -> u64 {
+ unsafe {
+ ovs::eth_addr_to_uint64(ea.clone()) as u64
+ }
+}
+
+
+impl FromRecord for eth_addr {
+ fn from_record(val: &record::Record) -> Result<Self, String> {
+ Ok(eth_addr{x: <[u8; ETH_ADDR_SIZE]>::from_record(val)?})
+ }
+}
+
+::differential_datalog::decl_struct_into_record!(eth_addr, <>, x);
+::differential_datalog::decl_record_mutator_struct!(eth_addr, <>, x: [u8; ETH_ADDR_SIZE]);
+
+
+#[repr(C)]
+#[derive(Default, PartialEq, Eq, PartialOrd, Ord, Clone, Hash, Serialize, Deserialize, Debug)]
+pub struct in6_addr {
+ x: [u8; IN6_ADDR_SIZE]
+}
+
+pub const in6addr_any: in6_addr = in6_addr{x: [0; IN6_ADDR_SIZE]};
+pub const in6addr_all_hosts: in6_addr = in6_addr{x: [
+ 0xff,0x02,0x00,0x00,0x00,0x00,0x00,0x00,
+ 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01 ]};
+
+impl FromRecord for in6_addr {
+ fn from_record(val: &record::Record) -> Result<Self, String> {
+ Ok(in6_addr{x: <[u8; IN6_ADDR_SIZE]>::from_record(val)?})
+ }
+}
+
+::differential_datalog::decl_struct_into_record!(in6_addr, <>, x);
+::differential_datalog::decl_record_mutator_struct!(in6_addr, <>, x: [u8; IN6_ADDR_SIZE]);
+
+pub fn in6_generate_lla(ea: &eth_addr) -> in6_addr {
+ let mut addr: in6_addr = Default::default();
+ unsafe {ovs::in6_generate_lla(ea.clone(), &mut addr as *mut in6_addr)};
+ addr
+}
+
+pub fn in6_generate_eui64(ea: &eth_addr, prefix: &in6_addr) -> in6_addr {
+ let mut addr: in6_addr = Default::default();
+ unsafe {ovs::in6_generate_eui64(ea.clone(),
+ prefix as *const in6_addr,
+ &mut addr as *mut in6_addr)};
+ addr
+}
+
+pub fn in6_is_lla(addr: &in6_addr) -> bool {
+ unsafe {ovs::in6_is_lla(addr as *const in6_addr)}
+}
+
+pub fn in6_addr_solicited_node(ip6: &in6_addr) -> in6_addr
+{
+ let mut res: in6_addr = Default::default();
+ unsafe {
+ ovs::in6_addr_solicited_node(&mut res as *mut in6_addr, ip6 as *const in6_addr);
+ }
+ res
+}
+
+pub fn ipv6_bitand(a: &in6_addr, b: &in6_addr) -> in6_addr {
+ unsafe {
+ ovs::ipv6_addr_bitand(a as *const in6_addr, b as *const in6_addr)
+ }
+}
+
+pub fn ipv6_bitxor(a: &in6_addr, b: &in6_addr) -> in6_addr {
+ unsafe {
+ ovs::ipv6_addr_bitxor(a as *const in6_addr, b as *const in6_addr)
+ }
+}
+
+pub fn ipv6_bitnot(a: &in6_addr) -> in6_addr {
+ let mut result: in6_addr = Default::default();
+ for i in 0..16 {
+ result.x[i] = !a.x[i]
+ }
+ result
+}
+
+pub fn ipv6_string_mapped(addr: &in6_addr) -> String {
+ let mut addr_str = [0 as i8; INET6_ADDRSTRLEN];
+ unsafe {
+ ovs::ipv6_string_mapped(&mut addr_str[0] as *mut raw::c_char, addr as *const in6_addr);
+ cstr2string(&addr_str as *const raw::c_char)
+ }
+}
+
+pub fn ipv6_is_zero(addr: &in6_addr) -> bool {
+ *addr == in6addr_any
+}
+
+pub fn ipv6_count_cidr_bits(ip6: &in6_addr) -> ddlog_std::Option<u8> {
+ unsafe {
+ match (ipv6_is_cidr(ip6)) {
+ true => ddlog_std::Option::Some{x: ovs::ipv6_count_cidr_bits(ip6 as *const in6_addr) as u8},
+ false => ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn json_string_escape(s: &String) -> String {
+ let mut ds = ovs_ds::new();
+ unsafe {
+ ovs::json_string_escape(ffi::CString::new(s.as_str()).unwrap().as_ptr() as *const raw::c_char,
+ &mut ds as *mut ovs_ds);
+ };
+ unsafe{ds.into_string()}
+}
+
+pub fn extract_lsp_addresses(address: &String) -> ddlog_std::Option<lport_addresses> {
+ unsafe {
+ let mut laddrs: lport_addresses_c = Default::default();
+ if ovn_c::extract_lsp_addresses(string2cstr(address).as_ptr(),
+ &mut laddrs as *mut lport_addresses_c) {
+ ddlog_std::Option::Some{x: laddrs.into_ddlog()}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn extract_addresses(address: &String) -> ddlog_std::Option<lport_addresses> {
+ unsafe {
+ let mut laddrs: lport_addresses_c = Default::default();
+ let mut ofs: raw::c_int = 0;
+ if ovn_c::extract_addresses(string2cstr(address).as_ptr(),
+ &mut laddrs as *mut lport_addresses_c,
+ &mut ofs as *mut raw::c_int) {
+ ddlog_std::Option::Some{x: laddrs.into_ddlog()}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn extract_lrp_networks(mac: &String, networks: &ddlog_std::Set<String>) -> ddlog_std::Option<lport_addresses>
+{
+ unsafe {
+ let mut laddrs: lport_addresses_c = Default::default();
+ let mut networks_cstrs = Vec::with_capacity(networks.x.len());
+ let mut networks_ptrs = Vec::with_capacity(networks.x.len());
+ for net in networks.x.iter() {
+ networks_cstrs.push(string2cstr(net));
+ networks_ptrs.push(networks_cstrs.last().unwrap().as_ptr());
+ };
+ if ovn_c::extract_lrp_networks__(string2cstr(mac).as_ptr(), networks_ptrs.as_ptr() as *const *const raw::c_char,
+ networks_ptrs.len(), &mut laddrs as *mut lport_addresses_c) {
+ ddlog_std::Option::Some{x: laddrs.into_ddlog()}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn ovn_internal_version() -> String {
+ unsafe {
+ let s = ovn_c::ovn_get_internal_version();
+ let retval = cstr2string(s);
+ free(s as *mut raw::c_void);
+ retval
+ }
+}
+
+pub fn ipv6_parse_masked(s: &String) -> ddlog_std::Either<String, ddlog_std::tuple2<in6_addr, in6_addr>>
+{
+ unsafe {
+ let mut ip: in6_addr = Default::default();
+ let mut mask: in6_addr = Default::default();
+ let err = ovs::ipv6_parse_masked(string2cstr(s).as_ptr(), &mut ip as *mut in6_addr, &mut mask as *mut in6_addr);
+ if (err != ptr::null_mut()) {
+ let errstr = cstr2string(err);
+ free(err as *mut raw::c_void);
+ ddlog_std::Either::Left{l: errstr}
+ } else {
+ ddlog_std::Either::Right{r: ddlog_std::tuple2(ip, mask)}
+ }
+ }
+}
+
+pub fn ipv6_parse_cidr(s: &String) -> ddlog_std::Either<String, ddlog_std::tuple2<in6_addr, u32>>
+{
+ unsafe {
+ let mut ip: in6_addr = Default::default();
+ let mut plen: raw::c_uint = 0;
+ let err = ovs::ipv6_parse_cidr(string2cstr(s).as_ptr(), &mut ip as *mut in6_addr, &mut plen as *mut raw::c_uint);
+ if (err != ptr::null_mut()) {
+ let errstr = cstr2string(err);
+ free(err as *mut raw::c_void);
+ ddlog_std::Either::Left{l: errstr}
+ } else {
+ ddlog_std::Either::Right{r: ddlog_std::tuple2(ip, plen as u32)}
+ }
+ }
+}
+
+pub fn ipv6_parse(s: &String) -> ddlog_std::Option<in6_addr>
+{
+ unsafe {
+ let mut ip: in6_addr = Default::default();
+ let res = ovs::ipv6_parse(string2cstr(s).as_ptr(), &mut ip as *mut in6_addr);
+ if (res) {
+ ddlog_std::Option::Some{x: ip}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn ipv6_create_mask(mask: &u32) -> in6_addr
+{
+ unsafe {ovs::ipv6_create_mask(*mask as raw::c_uint)}
+}
+
+
+pub fn ipv6_is_routable_multicast(a: &in6_addr) -> bool
+{
+ unsafe{ovn_c::ipv6_addr_is_routable_multicast(a as *const in6_addr)}
+}
+
+pub fn ipv6_is_all_hosts(a: &in6_addr) -> bool
+{
+ return *a == in6addr_all_hosts;
+}
+
+pub fn ipv6_is_cidr(a: &in6_addr) -> bool
+{
+ unsafe{ovs::ipv6_is_cidr(a as *const in6_addr)}
+}
+
+pub fn ipv6_multicast_to_ethernet(ip6: &in6_addr) -> eth_addr
+{
+ let mut eth: eth_addr = Default::default();
+ unsafe{
+ ovs::ipv6_multicast_to_ethernet(&mut eth as *mut eth_addr, ip6 as *const in6_addr);
+ }
+ eth
+}
+
+pub type in_addr = u32;
+pub type ovs_be32 = u32;
+
+pub fn iptohl(addr: &in_addr) -> u32 {
+ ddlog_std::ntohl(addr)
+}
+pub fn hltoip(addr: &u32) -> in_addr {
+ ddlog_std::htonl(addr)
+}
+
+pub fn ip_parse_masked(s: &String) -> ddlog_std::Either<String, ddlog_std::tuple2<in_addr, in_addr>>
+{
+ unsafe {
+ let mut ip: ovs_be32 = 0;
+ let mut mask: ovs_be32 = 0;
+ let err = ovs::ip_parse_masked(string2cstr(s).as_ptr(), &mut ip as *mut ovs_be32, &mut mask as *mut ovs_be32);
+ if (err != ptr::null_mut()) {
+ let errstr = cstr2string(err);
+ free(err as *mut raw::c_void);
+ ddlog_std::Either::Left{l: errstr}
+ } else {
+ ddlog_std::Either::Right{r: ddlog_std::tuple2(ip, mask)}
+ }
+ }
+}
+
+pub fn ip_parse_cidr(s: &String) -> ddlog_std::Either<String, ddlog_std::tuple2<in_addr, u32>>
+{
+ unsafe {
+ let mut ip: ovs_be32 = 0;
+ let mut plen: raw::c_uint = 0;
+ let err = ovs::ip_parse_cidr(string2cstr(s).as_ptr(), &mut ip as *mut ovs_be32, &mut plen as *mut raw::c_uint);
+ if (err != ptr::null_mut()) {
+ let errstr = cstr2string(err);
+ free(err as *mut raw::c_void);
+ ddlog_std::Either::Left{l: errstr}
+ } else {
+ ddlog_std::Either::Right{r: ddlog_std::tuple2(ip, plen as u32)}
+ }
+ }
+}
+
+pub fn ip_parse(s: &String) -> ddlog_std::Option<in_addr>
+{
+ unsafe {
+ let mut ip: ovs_be32 = 0;
+ if (ovs::ip_parse(string2cstr(s).as_ptr(), &mut ip as *mut ovs_be32)) {
+ ddlog_std::Option::Some{x:ip}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn ip_count_cidr_bits(address: &in_addr) -> ddlog_std::Option<u8> {
+ unsafe {
+ match (ip_is_cidr(address)) {
+ true => ddlog_std::Option::Some{x: ovs::ip_count_cidr_bits(*address) as u8},
+ false => ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn is_dynamic_lsp_address(address: &String) -> bool {
+ unsafe {
+ ovn_c::is_dynamic_lsp_address(string2cstr(address).as_ptr())
+ }
+}
+
+pub fn split_addresses(addresses: &String) -> ddlog_std::tuple2<ddlog_std::Set<String>, ddlog_std::Set<String>> {
+ let mut ip4_addrs = ovs_svec::new();
+ let mut ip6_addrs = ovs_svec::new();
+ unsafe {
+ ovn_c::split_addresses(string2cstr(addresses).as_ptr(), &mut ip4_addrs as *mut ovs_svec, &mut ip6_addrs as *mut ovs_svec);
+ ddlog_std::tuple2(ip4_addrs.into_strings(), ip6_addrs.into_strings())
+ }
+}
+
+pub fn scan_eth_addr(s: &String) -> ddlog_std::Option<eth_addr> {
+ let mut ea = eth_addr_zero();
+ unsafe {
+ if ovs::ovs_scan(string2cstr(s).as_ptr(), b"%hhx:%hhx:%hhx:%hhx:%hhx:%hhx\0".as_ptr() as *const raw::c_char,
+ &mut ea.x[0] as *mut u8, &mut ea.x[1] as *mut u8,
+ &mut ea.x[2] as *mut u8, &mut ea.x[3] as *mut u8,
+ &mut ea.x[4] as *mut u8, &mut ea.x[5] as *mut u8)
+ {
+ ddlog_std::Option::Some{x: ea}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn scan_eth_addr_prefix(s: &String) -> ddlog_std::Option<u64> {
+ let mut b2: u8 = 0;
+ let mut b1: u8 = 0;
+ let mut b0: u8 = 0;
+ unsafe {
+ if ovs::ovs_scan(string2cstr(s).as_ptr(), b"%hhx:%hhx:%hhx\0".as_ptr() as *const raw::c_char,
+ &mut b2 as *mut u8, &mut b1 as *mut u8, &mut b0 as *mut u8)
+ {
+ ddlog_std::Option::Some{x: ((b2 as u64) << 40) | ((b1 as u64) << 32) | ((b0 as u64) << 24) }
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn scan_static_dynamic_ip(s: &String) -> ddlog_std::Option<in_addr> {
+ let mut ip0: u8 = 0;
+ let mut ip1: u8 = 0;
+ let mut ip2: u8 = 0;
+ let mut ip3: u8 = 0;
+ let mut n: raw::c_uint = 0;
+ unsafe {
+ if ovs::ovs_scan(string2cstr(s).as_ptr(), b"dynamic %hhu.%hhu.%hhu.%hhu%n\0".as_ptr() as *const raw::c_char,
+ &mut ip0 as *mut u8,
+ &mut ip1 as *mut u8,
+ &mut ip2 as *mut u8,
+ &mut ip3 as *mut u8,
+ &mut n) && s.len() == (n as usize)
+ {
+ ddlog_std::Option::Some{x: ddlog_std::htonl(&(((ip0 as u32) << 24) | ((ip1 as u32) << 16) | ((ip2 as u32) << 8) | (ip3 as u32)))}
+ } else {
+ ddlog_std::Option::None
+ }
+ }
+}
+
+pub fn ip_address_and_port_from_lb_key(k: &String) ->
+ ddlog_std::Option<ddlog_std::tuple2<v46_ip, u16>>
+{
+ unsafe {
+ let mut ip_address: *mut raw::c_char = ptr::null_mut();
+ let mut port: libc::uint16_t = 0;
+ let mut addr_family: raw::c_int = 0;
+
+ ovn_c::ip_address_and_port_from_lb_key(string2cstr(k).as_ptr(), &mut ip_address as *mut *mut raw::c_char,
+ &mut port as *mut libc::uint16_t, &mut addr_family as *mut raw::c_int);
+ if (ip_address != ptr::null_mut()) {
+ match (ip46_parse(&cstr2string(ip_address))) {
+ ddlog_std::Option::Some{x: ip46} => {
+ let res = ddlog_std::tuple2(ip46, port as u16);
+ free(ip_address as *mut raw::c_void);
+ return ddlog_std::Option::Some{x: res}
+ },
+ _ => ()
+ }
+ }
+ ddlog_std::Option::None
+ }
+}
+
+pub fn count_1bits(x: &u64) -> u8 {
+ x.count_ones() as u8
+}
+
+
+pub fn str_to_int(s: &String, base: &u16) -> ddlog_std::Option<u64> {
+ let mut i: raw::c_int = 0;
+ let ok = unsafe {
+ ovs::str_to_int(string2cstr(s).as_ptr(), *base as raw::c_int, &mut i as *mut raw::c_int)
+ };
+ if ok {
+ ddlog_std::Option::Some{x: i as u64}
+ } else {
+ ddlog_std::Option::None
+ }
+}
+
+pub fn str_to_uint(s: &String, base: &u16) -> ddlog_std::Option<u64> {
+ let mut i: raw::c_uint = 0;
+ let ok = unsafe {
+ ovs::str_to_uint(string2cstr(s).as_ptr(), *base as raw::c_int, &mut i as *mut raw::c_uint)
+ };
+ if ok {
+ ddlog_std::Option::Some{x: i as u64}
+ } else {
+ ddlog_std::Option::None
+ }
+}
+
+pub fn inet6_ntop(addr: &in6_addr) -> String {
+ let mut buf = [0 as i8; INET6_ADDRSTRLEN];
+ unsafe {
+ let res = inet_ntop(AF_INET6 as raw::c_int, addr as *const in6_addr as *const raw::c_void,
+ &mut buf[0] as *mut raw::c_char, INET6_ADDRSTRLEN as libc::socklen_t);
+ if res == ptr::null() {
+ warn(&format!("inet_ntop({:?}) failed", *addr));
+ "".to_owned()
+ } else {
+ cstr2string(&buf as *const raw::c_char)
+ }
+ }
+}
+
+/* Internals */
+
+unsafe fn cstr2string(s: *const raw::c_char) -> String {
+ ffi::CStr::from_ptr(s).to_owned().into_string().
+ unwrap_or_else(|e|{ warn(&format!("cstr2string: {}", e)); "".to_owned() })
+}
+
+fn string2cstr(s: &String) -> ffi::CString {
+ ffi::CString::new(s.as_str()).unwrap()
+}
+
+/* OVS dynamic string type */
+#[repr(C)]
+struct ovs_ds {
+ s: *mut raw::c_char, /* Null-terminated string. */
+ length: libc::size_t, /* Bytes used, not including null terminator. */
+ allocated: libc::size_t /* Bytes allocated, not including null terminator. */
+}
+
+impl ovs_ds {
+ pub fn new() -> ovs_ds {
+ ovs_ds{s: ptr::null_mut(), length: 0, allocated: 0}
+ }
+
+ pub unsafe fn into_string(mut self) -> String {
+ let res = cstr2string(ovs::ds_cstr(&self as *const ovs_ds));
+ ovs::ds_destroy(&mut self as *mut ovs_ds);
+ res
+ }
+}
+
+/* OVS string vector type */
+#[repr(C)]
+struct ovs_svec {
+ names: *mut *mut raw::c_char,
+ n: libc::size_t,
+ allocated: libc::size_t
+}
+
+impl ovs_svec {
+ pub fn new() -> ovs_svec {
+ ovs_svec{names: ptr::null_mut(), n: 0, allocated: 0}
+ }
+
+ pub unsafe fn into_strings(mut self) -> ddlog_std::Set<String> {
+ let mut res: ddlog_std::Set<String> = ddlog_std::Set::new();
+ unsafe {
+ for i in 0..self.n {
+ res.insert(cstr2string(*self.names.offset(i as isize)));
+ }
+ ovs::svec_destroy(&mut self as *mut ovs_svec);
+ }
+ res
+ }
+}
+
+
+// ovn/lib/ovn-util.h
+#[repr(C)]
+struct ipv4_netaddr_c {
+ addr: libc::uint32_t,
+ mask: libc::uint32_t,
+ network: libc::uint32_t,
+ plen: raw::c_uint,
+
+ addr_s: [raw::c_char; INET_ADDRSTRLEN + 1], /* "192.168.10.123" */
+ network_s: [raw::c_char; INET_ADDRSTRLEN + 1], /* "192.168.10.0" */
+ bcast_s: [raw::c_char; INET_ADDRSTRLEN + 1] /* "192.168.10.255" */
+}
+
+impl Default for ipv4_netaddr_c {
+ fn default() -> Self {
+ ipv4_netaddr_c {
+ addr: 0,
+ mask: 0,
+ network: 0,
+ plen: 0,
+ addr_s: [0; INET_ADDRSTRLEN + 1],
+ network_s: [0; INET_ADDRSTRLEN + 1],
+ bcast_s: [0; INET_ADDRSTRLEN + 1]
+ }
+ }
+}
+
+impl ipv4_netaddr_c {
+ pub unsafe fn to_ddlog(&self) -> ipv4_netaddr {
+ ipv4_netaddr{
+ addr: self.addr,
+ plen: self.plen,
+ }
+ }
+}
+
+#[repr(C)]
+struct ipv6_netaddr_c {
+ addr: in6_addr, /* fc00::1 */
+ mask: in6_addr, /* ffff:ffff:ffff:ffff:: */
+ sn_addr: in6_addr, /* ff02:1:ff00::1 */
+ network: in6_addr, /* fc00:: */
+ plen: raw::c_uint, /* CIDR Prefix: 64 */
+
+ addr_s: [raw::c_char; INET6_ADDRSTRLEN + 1], /* "fc00::1" */
+ sn_addr_s: [raw::c_char; INET6_ADDRSTRLEN + 1], /* "ff02:1:ff00::1" */
+ network_s: [raw::c_char; INET6_ADDRSTRLEN + 1] /* "fc00::" */
+}
+
+impl Default for ipv6_netaddr_c {
+ fn default() -> Self {
+ ipv6_netaddr_c {
+ addr: Default::default(),
+ mask: Default::default(),
+ sn_addr: Default::default(),
+ network: Default::default(),
+ plen: 0,
+ addr_s: [0; INET6_ADDRSTRLEN + 1],
+ sn_addr_s: [0; INET6_ADDRSTRLEN + 1],
+ network_s: [0; INET6_ADDRSTRLEN + 1]
+ }
+ }
+}
+
+impl ipv6_netaddr_c {
+ pub unsafe fn to_ddlog(&self) -> ipv6_netaddr {
+ ipv6_netaddr{
+ addr: self.addr.clone(),
+ plen: self.plen
+ }
+ }
+}
+
+
+// ovn-util.h
+#[repr(C)]
+struct lport_addresses_c {
+ ea_s: [raw::c_char; ETH_ADDR_STRLEN + 1],
+ ea: eth_addr,
+ n_ipv4_addrs: libc::size_t,
+ ipv4_addrs: *mut ipv4_netaddr_c,
+ n_ipv6_addrs: libc::size_t,
+ ipv6_addrs: *mut ipv6_netaddr_c
+}
+
+impl Default for lport_addresses_c {
+ fn default() -> Self {
+ lport_addresses_c {
+ ea_s: [0; ETH_ADDR_STRLEN + 1],
+ ea: Default::default(),
+ n_ipv4_addrs: 0,
+ ipv4_addrs: ptr::null_mut(),
+ n_ipv6_addrs: 0,
+ ipv6_addrs: ptr::null_mut()
+ }
+ }
+}
+
+impl lport_addresses_c {
+ pub unsafe fn into_ddlog(mut self) -> lport_addresses {
+ let mut ipv4_addrs = ddlog_std::Vec::with_capacity(self.n_ipv4_addrs);
+ for i in 0..self.n_ipv4_addrs {
+ ipv4_addrs.push((&*self.ipv4_addrs.offset(i as isize)).to_ddlog())
+ }
+ let mut ipv6_addrs = ddlog_std::Vec::with_capacity(self.n_ipv6_addrs);
+ for i in 0..self.n_ipv6_addrs {
+ ipv6_addrs.push((&*self.ipv6_addrs.offset(i as isize)).to_ddlog())
+ }
+ let res = lport_addresses {
+ ea: self.ea.clone(),
+ ipv4_addrs: ipv4_addrs,
+ ipv6_addrs: ipv6_addrs
+ };
+ ovn_c::destroy_lport_addresses(&mut self as *mut lport_addresses_c);
+ res
+ }
+}
+
+/* functions imported from ovn-northd.c */
+extern "C" {
+ fn ddlog_warn(msg: *const raw::c_char);
+ fn ddlog_err(msg: *const raw::c_char);
+}
+
+/* functions imported from libovn */
+mod ovn_c {
+ use ::std::os::raw;
+ use ::libc;
+ use super::lport_addresses_c;
+ use super::ovs_svec;
+ use super::in6_addr;
+
+ #[link(name = "ovn")]
+ extern "C" {
+ // ovn/lib/ovn-util.h
+ pub fn extract_lsp_addresses(address: *const raw::c_char, laddrs: *mut lport_addresses_c) -> bool;
+ pub fn extract_addresses(address: *const raw::c_char, laddrs: *mut lport_addresses_c, ofs: *mut raw::c_int) -> bool;
+ pub fn extract_lrp_networks__(mac: *const raw::c_char, networks: *const *const raw::c_char,
+ n_networks: libc::size_t, laddrs: *mut lport_addresses_c) -> bool;
+ pub fn destroy_lport_addresses(addrs: *mut lport_addresses_c);
+ pub fn is_dynamic_lsp_address(address: *const raw::c_char) -> bool;
+ pub fn split_addresses(addresses: *const raw::c_char, ip4_addrs: *mut ovs_svec, ipv6_addrs: *mut ovs_svec);
+ pub fn ip_address_and_port_from_lb_key(key: *const raw::c_char, ip_address: *mut *mut raw::c_char,
+ port: *mut libc::uint16_t, addr_family: *mut raw::c_int);
+ pub fn ipv6_addr_is_routable_multicast(ip: *const in6_addr) -> bool;
+ pub fn ovn_get_internal_version() -> *mut raw::c_char;
+ }
+}
+
+mod ovs {
+ use ::std::os::raw;
+ use ::libc;
+ use super::in6_addr;
+ use super::ovs_be32;
+ use super::ovs_ds;
+ use super::eth_addr;
+ use super::ovs_svec;
+
+ /* functions imported from libopenvswitch */
+ #[link(name = "openvswitch")]
+ extern "C" {
+ // lib/packets.h
+ pub fn ipv6_string_mapped(addr_str: *mut raw::c_char, addr: *const in6_addr) -> *const raw::c_char;
+ pub fn ipv6_parse_masked(s: *const raw::c_char, ip: *mut in6_addr, mask: *mut in6_addr) -> *mut raw::c_char;
+ pub fn ipv6_parse_cidr(s: *const raw::c_char, ip: *mut in6_addr, plen: *mut raw::c_uint) -> *mut raw::c_char;
+ pub fn ipv6_parse(s: *const raw::c_char, ip: *mut in6_addr) -> bool;
+ pub fn ipv6_mask_is_any(mask: *const in6_addr) -> bool;
+ pub fn ipv6_count_cidr_bits(mask: *const in6_addr) -> raw::c_int;
+ pub fn ipv6_is_cidr(mask: *const in6_addr) -> bool;
+ pub fn ipv6_addr_bitxor(a: *const in6_addr, b: *const in6_addr) -> in6_addr;
+ pub fn ipv6_addr_bitand(a: *const in6_addr, b: *const in6_addr) -> in6_addr;
+ pub fn ipv6_create_mask(mask: raw::c_uint) -> in6_addr;
+ pub fn ipv6_is_zero(a: *const in6_addr) -> bool;
+ pub fn ipv6_multicast_to_ethernet(eth: *mut eth_addr, ip6: *const in6_addr);
+ pub fn ip_parse_masked(s: *const raw::c_char, ip: *mut ovs_be32, mask: *mut ovs_be32) -> *mut raw::c_char;
+ pub fn ip_parse_cidr(s: *const raw::c_char, ip: *mut ovs_be32, plen: *mut raw::c_uint) -> *mut raw::c_char;
+ pub fn ip_parse(s: *const raw::c_char, ip: *mut ovs_be32) -> bool;
+ pub fn ip_count_cidr_bits(mask: ovs_be32) -> raw::c_int;
+ pub fn eth_addr_from_string(s: *const raw::c_char, ea: *mut eth_addr) -> bool;
+ pub fn eth_addr_to_uint64(ea: eth_addr) -> libc::uint64_t;
+ pub fn eth_addr_from_uint64(x: libc::uint64_t, ea: *mut eth_addr);
+ pub fn eth_addr_mark_random(ea: *mut eth_addr);
+ pub fn in6_generate_eui64(ea: eth_addr, prefix: *const in6_addr, lla: *mut in6_addr);
+ pub fn in6_generate_lla(ea: eth_addr, lla: *mut in6_addr);
+ pub fn in6_is_lla(addr: *const in6_addr) -> bool;
+ pub fn in6_addr_solicited_node(addr: *mut in6_addr, ip6: *const in6_addr);
+
+ // include/openvswitch/json.h
+ pub fn json_string_escape(str: *const raw::c_char, out: *mut ovs_ds);
+ // openvswitch/dynamic-string.h
+ pub fn ds_destroy(ds: *mut ovs_ds);
+ pub fn ds_cstr(ds: *const ovs_ds) -> *const raw::c_char;
+ pub fn svec_destroy(v: *mut ovs_svec);
+ pub fn ovs_scan(s: *const raw::c_char, format: *const raw::c_char, ...) -> bool;
+ pub fn str_to_int(s: *const raw::c_char, base: raw::c_int, i: *mut raw::c_int) -> bool;
+ pub fn str_to_uint(s: *const raw::c_char, base: raw::c_int, i: *mut raw::c_uint) -> bool;
+ }
+}
+
+/* functions imported from libc */
+#[link(name = "c")]
+extern "C" {
+ fn free(ptr: *mut raw::c_void);
+}
+
/* functions imported from arpa/inet.h */
+extern "C" {
+ fn inet_ntop(af: raw::c_int, cp: *const raw::c_void,
+ buf: *mut raw::c_char, len: libc::socklen_t) -> *const raw::c_char;
+}
+
+/*
+ * Parse IPv4 address list.
+ */
+
+named!(parse_spaces<nom::types::CompleteStr, ()>,
+ do_parse!(many1!(one_of!(&" \t\n\r\x0c\x0b")) >> (()) )
+);
+
+named!(parse_opt_spaces<nom::types::CompleteStr, ()>,
+ do_parse!(opt!(parse_spaces) >> (()))
+);
+
+named!(parse_ipv4_range<nom::types::CompleteStr, (String, Option<String>)>,
+ do_parse!(addr1: many_till!(complete!(nom::anychar), alt!(do_parse!(eof!() >> (nom::types::CompleteStr(""))) | peek!(tag!("..")) | tag!(" ") )) >>
+ parse_opt_spaces >>
+ addr2: opt!(do_parse!(tag!("..") >>
+ parse_opt_spaces >>
+ addr2: many_till!(complete!(nom::anychar), alt!(do_parse!(eof!() >> (' ')) | char!(' ')) ) >>
+ (addr2) )) >>
+ parse_opt_spaces >>
+ (addr1.0.into_iter().collect(), addr2.map(|x|x.0.into_iter().collect())) )
+);
+
+named!(parse_ipv4_address_list<nom::types::CompleteStr, Vec<(String, Option<String>)>>,
+ do_parse!(parse_opt_spaces >>
+ ranges: many0!(parse_ipv4_range) >>
+ (ranges)));
+
+pub fn parse_ip_list(ips: &String) -> ddlog_std::Either<String, ddlog_std::Vec<ddlog_std::tuple2<in_addr, ddlog_std::Option<in_addr>>>>
+{
+ match parse_ipv4_address_list(nom::types::CompleteStr(ips.as_str())) {
+        Err(_) => {
+ ddlog_std::Either::Left{l: format!("invalid IP list format: \"{}\"", ips.as_str())}
+ },
+ Ok((nom::types::CompleteStr(""), ranges)) => {
+ let mut res = vec![];
+ for (ip1, ip2) in ranges.iter() {
+ let start = match ip_parse(&ip1) {
+ ddlog_std::Option::None => return ddlog_std::Either::Left{l: format!("invalid IP address: \"{}\"", *ip1)},
+ ddlog_std::Option::Some{x: ip} => ip
+ };
+ let end = match ip2 {
+ None => ddlog_std::Option::None,
+ Some(ip_str) => match ip_parse(&ip_str.clone()) {
+ ddlog_std::Option::None => return ddlog_std::Either::Left{l: format!("invalid IP address: \"{}\"", *ip_str)},
+ x => x
+ }
+ };
+ res.push(ddlog_std::tuple2(start, end));
+ };
+ ddlog_std::Either::Right{r: ddlog_std::Vec{x: res}}
+ },
+ Ok((suffix, _)) => {
+ ddlog_std::Either::Left{l: format!("IP address list contains trailing characters: \"{}\"", suffix)}
+ }
+ }
+}
new file mode 100644
@@ -0,0 +1,2 @@
+[dependencies.nom]
+version = "4.0"
new file mode 100644
@@ -0,0 +1,7476 @@
+/*
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import OVN_Northbound as nb
+import OVN_Southbound as sb
+import ovsdb
+import allocate
+import ovn
+import lswitch
+import lrouter
+import multicast
+import helpers
+import ipam
+
+output relation Warning[string]
+
+index Logical_Flow_Index() on sb::Out_Logical_Flow()
+
+/* Meter_Band table */
+for (mb in nb::Meter_Band) {
+ sb::Out_Meter_Band(._uuid = mb._uuid,
+ .action = mb.action,
+ .rate = mb.rate,
+ .burst_size = mb.burst_size)
+}
+
+/* Meter table */
+for (meter in nb::Meter) {
+ sb::Out_Meter(._uuid = meter._uuid,
+ .name = meter.name,
+ .unit = meter.unit,
+ .bands = meter.bands)
+}
+sb::Out_Meter(._uuid = hash128(name),
+ .name = name,
+ .unit = meter.unit,
+ .bands = meter.bands) :-
+ ACLWithFairMeter(acl, meter),
+ var name = acl_log_meter_name(meter.name, acl._uuid).
+
+/* Proxy table for Out_Datapath_Binding: contains all Datapath_Binding fields,
+ * except tunnel id, which is allocated separately (see TunKeyAllocation). */
+relation OutProxy_Datapath_Binding (
+ _uuid: uuid,
+ load_balancers: Set<uuid>,
+ external_ids: Map<string,string>
+)
+
+/* Datapath_Binding table */
+OutProxy_Datapath_Binding(uuid, load_balancers, external_ids) :-
+ nb::Logical_Switch(._uuid = uuid, .name = name, .external_ids = ids,
+ .load_balancer = load_balancers,
+ .other_config = other_config),
+ var uuid_str = uuid2str(uuid),
+ var external_ids = {
+ var eids = ["logical-switch" -> uuid_str, "name" -> name];
+ match (ids.get("neutron:network_name")) {
+ None -> (),
+ Some{nnn} -> eids.insert("name2", nnn)
+ };
+ match (other_config.get("interconn-ts")) {
+ None -> (),
+ Some{value} -> eids.insert("interconn-ts", value)
+ };
+ eids
+ }.
+
+OutProxy_Datapath_Binding(uuid, set_empty(), external_ids) :-
+ lr in nb::Logical_Router(._uuid = uuid, .name = name, .external_ids = ids,
+ .options = options),
+ lr.is_enabled(),
+ var uuid_str = uuid2str(uuid),
+ var external_ids = {
+ var eids = ["logical-router" -> uuid_str, "name" -> name];
+ match (ids.get("neutron:router_name")) {
+ None -> (),
+ Some{nnn} -> eids.insert("name2", nnn)
+ };
+ match (options.get("snat-ct-zone").and_then(parse_dec_u64)) {
+ None -> (),
+ Some{zone} -> eids.insert("snat-ct-zone", "${zone}")
+ };
+ eids
+ }.
+
+sb::Out_Datapath_Binding(uuid, tunkey, load_balancers, external_ids) :-
+ OutProxy_Datapath_Binding(uuid, load_balancers, external_ids),
+ TunKeyAllocation(uuid, tunkey).
+
+
/* Proxy table for Out_Port_Binding: contains all Port_Binding fields,
 * except tunnel id, which is allocated separately (see PortTunKeyAllocation). */
+relation OutProxy_Port_Binding (
+ _uuid: uuid,
+ logical_port: string,
+ __type: string,
+ gateway_chassis: Set<uuid>,
+ ha_chassis_group: Option<uuid>,
+ options: Map<string,string>,
+ datapath: uuid,
+ parent_port: Option<string>,
+ tag: Option<integer>,
+ mac: Set<string>,
+ nat_addresses: Set<string>,
+ external_ids: Map<string,string>
+)
+
+/* Case 1: Create a Port_Binding per logical switch port that is not of type "router" */
+OutProxy_Port_Binding(._uuid = lsp._uuid,
+ .logical_port = lsp.name,
+ .__type = lsp.__type,
+ .gateway_chassis = set_empty(),
+ .ha_chassis_group = sp.hac_group_uuid,
+ .options = lsp.options,
+ .datapath = sw.ls._uuid,
+ .parent_port = lsp.parent_name,
+ .tag = tag,
+ .mac = lsp.addresses,
+ .nat_addresses = set_empty(),
+ .external_ids = eids) :-
+ sp in &SwitchPort(.lsp = lsp, .sw = &sw),
+ SwitchPortNewDynamicTag(lsp._uuid, opt_tag),
+ var tag = match (opt_tag) {
+ None -> lsp.tag,
+ Some{t} -> Some{t}
+ },
+ lsp.__type != "router",
+ var eids = {
+ var eids = lsp.external_ids;
+ match (lsp.external_ids.get("neutron:port_name")) {
+ None -> (),
+ Some{name} -> eids.insert("name", name)
+ };
+ eids
+ }.
+
+
+/* Case 2: Create a Port_Binding per logical switch port of type "router" */
+OutProxy_Port_Binding(._uuid = lsp._uuid,
+ .logical_port = lsp.name,
+ .__type = __type,
+ .gateway_chassis = set_empty(),
+ .ha_chassis_group = None,
+ .options = options,
+ .datapath = sw.ls._uuid,
+ .parent_port = lsp.parent_name,
+ .tag = None,
+ .mac = lsp.addresses,
+ .nat_addresses = nat_addresses,
+ .external_ids = eids) :-
+ &SwitchPort(.lsp = lsp, .sw = &sw, .peer = peer),
+ var eids = {
+ var eids = lsp.external_ids;
+ match (lsp.external_ids.get("neutron:port_name")) {
+ None -> (),
+ Some{name} -> eids.insert("name", name)
+ };
+ eids
+ },
+ Some{var router_port} = lsp.options.get("router-port"),
+ var opt_chassis = peer.and_then(|p| p.router.lr.options.get("chassis")),
+ var l3dgw_port = peer.and_then(|p| p.router.l3dgw_port),
+ (var __type, var options) = {
+ var options = ["peer" -> router_port];
+ match (opt_chassis) {
+ None -> {
+ ("patch", options)
+ },
+ Some{chassis} -> {
+ options.insert("l3gateway-chassis", chassis);
+ ("l3gateway", options)
+ }
+ }
+ },
+ var base_nat_addresses = {
+ match (lsp.options.get("nat-addresses")) {
+ None -> { set_empty() },
+ Some{"router"} -> match ((l3dgw_port, opt_chassis, peer)) {
+ (None, None, _) -> set_empty(),
+ (_, _, None) -> set_empty(),
+ (_, _, Some{rport}) -> get_nat_addresses(deref(rport))
+ },
+ Some{nat_addresses} -> {
+ /* Only accept manual specification of ethernet address
+ * followed by IPv4 addresses on type "l3gateway" ports. */
+ if (is_some(opt_chassis)) {
+ match (extract_lsp_addresses(nat_addresses)) {
+ None -> {
+ warn("Error extracting nat-addresses.");
+ set_empty()
+ },
+ Some{_} -> { set_singleton(nat_addresses) }
+ }
+ } else { set_empty() }
+ }
+ }
+ },
+ /* Add the router mac and IPv4 addresses to
+ * Port_Binding.nat_addresses so that GARP is sent for these
+ * IPs by the ovn-controller on which the distributed gateway
+ * router port resides if:
+ *
+     * 1. The peer has 'reside-on-redirect-chassis' set and the
+     *    logical router datapath has a distributed router port.
+ *
+ * 2. The peer is distributed gateway router port.
+ *
+ * 3. The peer's router is a gateway router and the port has a localnet
+ * port.
+ *
+ * Note: Port_Binding.nat_addresses column is also used for
+ * sending the GARPs for the router port IPs.
+     */
+ var garp_nat_addresses = match (peer) {
+ Some{rport} -> match (
+ (map_get_bool_def(rport.lrp.options, "reside-on-redirect-chassis",
+ false)
+ and is_some(l3dgw_port)) or
+ Some{rport.lrp} == l3dgw_port or
+ (is_some(rport.router.lr.options.get("chassis")) and
+ not sw.localnet_port_names.is_empty())) {
+ false -> set_empty(),
+ true -> set_singleton(get_garp_nat_addresses(deref(rport)))
+ },
+ None -> set_empty()
+ },
+ var nat_addresses = set_union(base_nat_addresses, garp_nat_addresses).
+
+/* Case 3: Port_Binding per logical router port */
+OutProxy_Port_Binding(._uuid = lrp._uuid,
+ .logical_port = lrp.name,
+ .__type = __type,
+ .gateway_chassis = set_empty(),
+ .ha_chassis_group = None,
+ .options = options,
+ .datapath = router.lr._uuid,
+ .parent_port = None,
+ .tag = None, // always empty for router ports
+ .mac = set_singleton("${lrp.mac} ${lrp.networks.join(\" \")}"),
+ .nat_addresses = set_empty(),
+ .external_ids = lrp.external_ids) :-
+ rp in &RouterPort(.lrp = lrp, .router = &router, .peer = peer),
+ RouterPortRAOptionsComplete(lrp._uuid, options0),
+ (var __type, var options1) = match (router.lr.options.get("chassis")) {
+ /* TODO: derived ports */
+ None -> ("patch", map_empty()),
+ Some{lrchassis} -> ("l3gateway", ["l3gateway-chassis" -> lrchassis])
+ },
+ var options2 = match (router_peer_name(peer)) {
+ None -> map_empty(),
+ Some{peer_name} -> ["peer" -> peer_name]
+ },
+ var options3 = match ((peer, rp.networks.ipv6_addrs.is_empty())) {
+ (PeerSwitch{_, _}, false) -> {
+ var enabled = lrp.is_enabled();
+ var pd = map_get_bool_def(lrp.options, "prefix_delegation", false);
+ var p = map_get_bool_def(lrp.options, "prefix", false);
+ ["ipv6_prefix_delegation" -> "${pd and enabled}",
+ "ipv6_prefix" -> "${p and enabled}"]
+ },
+ _ -> map_empty()
+ },
+ PreserveIPv6RAPDList(lrp._uuid, ipv6_ra_pd_list),
+ var options4 = match (ipv6_ra_pd_list) {
+ None -> map_empty(),
+ Some{value} -> ["ipv6_ra_pd_list" -> value]
+ },
+ var options = options0.union(options1).union(options2).union(options3).union(options4),
+ var eids = {
+ var eids = lrp.external_ids;
+ match (lrp.external_ids.get("neutron:port_name")) {
+ None -> (),
+ Some{name} -> eids.insert("name", name)
+ };
+ eids
+ }.
+function get_router_load_balancer_ips(router: Router) :
+ (Set<string>, Set<string>) =
+{
+ var all_ips_v4 = set_empty();
+ var all_ips_v6 = set_empty();
+ for (lb in router.lbs) {
+ for (kv in deref(lb).vips) {
+ (var vip, _) = kv;
+ /* node->key contains IP:port or just IP. */
+ match (ip_address_and_port_from_lb_key(vip)) {
+ None -> (),
+ Some{(IPv4{ipv4}, _)} -> all_ips_v4.insert("${ipv4}"),
+ Some{(IPv6{ipv6}, _)} -> all_ips_v6.insert("${ipv6}")
+ }
+ }
+ };
+ (all_ips_v4, all_ips_v6)
+}
+
+/* Returns a set of strings, each consisting of a MAC address followed
+ * by one or more IP addresses, and if the port is a distributed gateway
+ * port, followed by 'is_chassis_resident("LPORT_NAME")', where the
+ * LPORT_NAME is the name of the L3 redirect port or the name of the
+ * logical_port specified in a NAT rule. These strings include the
+ * external IP addresses of all NAT rules defined on that router, and all
+ * of the IP addresses used in load balancer VIPs defined on that router.
+ */
+function get_nat_addresses(rport: RouterPort): Set<string> =
+{
+ var addresses = set_empty();
+ var router = deref(rport.router);
+ var has_redirect = is_some(router.l3dgw_port);
+ match (eth_addr_from_string(rport.lrp.mac)) {
+ None -> addresses,
+ Some{mac} -> {
+ var c_addresses = "${mac}";
+ var central_ip_address = false;
+
+ /* Get NAT IP addresses. */
+ for (nat in router.nats) {
+ /* Determine whether this NAT rule satisfies the conditions for
+ * distributed NAT processing. */
+ if (has_redirect and nat.nat.__type == "dnat_and_snat" and
+ is_some(nat.nat.logical_port) and is_some(nat.external_mac)) {
+ /* Distributed NAT rule. */
+ var logical_port = option_unwrap_or_default(nat.nat.logical_port);
+ var external_mac = option_unwrap_or_default(nat.external_mac);
+ addresses.insert("${external_mac} ${nat.external_ip} "
+ "is_chassis_resident(${json_string_escape(logical_port)})")
+ } else {
+ /* Centralized NAT rule, either on gateway router or distributed
+ * router.
+ * Check if external_ip is same as router ip. If so, then there
+ * is no need to add this to the nat_addresses. The router IPs
+ * will be added separately. */
+ var is_router_ip = false;
+ match (nat.external_ip) {
+ IPv4{ei} -> {
+ for (ipv4 in rport.networks.ipv4_addrs) {
+ if (ei == ipv4.addr) {
+ is_router_ip = true;
+ break
+ }
+ }
+ },
+ IPv6{ei} -> {
+ for (ipv6 in rport.networks.ipv6_addrs) {
+ if (ei == ipv6.addr) {
+ is_router_ip = true;
+ break
+ }
+ }
+ }
+ };
+ if (not is_router_ip) {
+ c_addresses = c_addresses ++ " ${nat.external_ip}";
+ central_ip_address = true
+ }
+ }
+ };
+
+ /* A set to hold all load-balancer vips. */
+ (var all_ips_v4, var all_ips_v6) = get_router_load_balancer_ips(router);
+
+ for (ip_address in set_union(all_ips_v4, all_ips_v6)) {
+ c_addresses = c_addresses ++ " ${ip_address}";
+ central_ip_address = true
+ };
+
+ if (central_ip_address) {
+ /* Gratuitous ARP for centralized NAT rules on distributed gateway
+ * ports should be restricted to the gateway chassis. */
+ if (has_redirect) {
+ c_addresses = c_addresses ++ " is_chassis_resident(${router.redirect_port_name})"
+ } else ();
+
+ addresses.insert(c_addresses)
+ } else ();
+ addresses
+ }
+ }
+}
+
+function get_garp_nat_addresses(rport: RouterPort): string = {
+ var garp_info = ["${rport.networks.ea}"];
+ for (ipv4_addr in rport.networks.ipv4_addrs) {
+ garp_info.push("${ipv4_addr.addr}")
+ };
+ if (rport.router.redirect_port_name != "") {
+ garp_info.push("is_chassis_resident(${rport.router.redirect_port_name})")
+ };
+ garp_info.join(" ")
+}
+
+/* Extra options computed for router ports by the logical flow generation code */
+relation RouterPortRAOptions(lrp: uuid, options: Map<string, string>)
+
+relation RouterPortRAOptionsComplete(lrp: uuid, options: Map<string, string>)
+
+RouterPortRAOptionsComplete(lrp, options) :-
+ RouterPortRAOptions(lrp, options).
+RouterPortRAOptionsComplete(lrp, map_empty()) :-
+ nb::Logical_Router_Port(._uuid = lrp),
+ not RouterPortRAOptions(lrp, _).
+
+
+/*
+ * Create derived port for Logical_Router_Ports with non-empty 'gateway_chassis' column.
+ */
+
+/* Create derived ports */
+OutProxy_Port_Binding(// lrp._uuid is already in use; generate a new UUID by
+ // hashing it.
+ ._uuid = hash128(lrp._uuid),
+ .logical_port = chassis_redirect_name(lrp.name),
+ .__type = "chassisredirect",
+ .gateway_chassis = set_empty(),
+ .ha_chassis_group = Some{hacg_uuid},
+ .options = options,
+ .datapath = lr_uuid,
+ .parent_port = None,
+ .tag = None, //always empty for router ports
+ .mac = set_singleton("${lrp.mac} ${lrp.networks.join(\" \")}"),
+ .nat_addresses = set_empty(),
+ .external_ids = lrp.external_ids) :-
+ DistributedGatewayPort(lrp, lr_uuid),
+ LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid),
+ var redirect_type = match (lrp.options.get("redirect-type")) {
+ Some{var value} -> ["redirect-type" -> value],
+ _ -> map_empty()
+ },
+ var options = redirect_type.insert_imm("distributed-port", lrp.name).
+
+
+/* Add allocated qdisc_queue_id and tunnel key to Port_Binding.
+ */
+sb::Out_Port_Binding(._uuid = pbinding._uuid,
+ .logical_port = pbinding.logical_port,
+ .__type = pbinding.__type,
+ .gateway_chassis = pbinding.gateway_chassis,
+ .ha_chassis_group = pbinding.ha_chassis_group,
+ .options = options0,
+ .datapath = pbinding.datapath,
+ .tunnel_key = tunkey,
+ .parent_port = pbinding.parent_port,
+ .tag = pbinding.tag,
+ .mac = pbinding.mac,
+ .nat_addresses = pbinding.nat_addresses,
+ .external_ids = pbinding.external_ids) :-
+ pbinding in OutProxy_Port_Binding(),
+ PortTunKeyAllocation(pbinding._uuid, tunkey),
+ QueueIDAllocation(pbinding._uuid, qid),
+ var options0 = match (qid) {
+ None -> pbinding.options,
+ Some{id} -> pbinding.options.insert_imm("qdisc_queue_id", "${id}")
+ }.
+
+/* Referenced chassis.
+ *
+ * These tables track the sb::Chassis where a packet traversing logical
+ * router 'lr_uuid' can end up (or start from). This is used for
+ * sb::Out_HA_Chassis_Group's ref_chassis column.
+ *
+ * RefChassisSet0 has a row for each logical router that actually references a
+ * chassis. RefChassisSet has a row for every logical router. */
+relation RefChassis(lr_uuid: uuid, chassis_uuid: uuid)
+RefChassis(lr_uuid, chassis_uuid) :-
+ ReachableLogicalRouter(lr_uuid, lr2_uuid),
+ FirstHopLogicalRouter(lr2_uuid, ls_uuid),
+ LogicalSwitchPort(lsp_uuid, ls_uuid),
+ nb::Logical_Switch_Port(._uuid = lsp_uuid, .name = lsp_name),
+ sb::Port_Binding(.logical_port = lsp_name, .chassis = chassis_uuids),
+ Some{var chassis_uuid} = chassis_uuids.
+relation RefChassisSet0(lr_uuid: uuid, chassis_uuids: Set<uuid>)
+RefChassisSet0(lr_uuid, chassis_uuids) :-
+ RefChassis(lr_uuid, chassis_uuid),
+ var chassis_uuids = chassis_uuid.group_by(lr_uuid).to_set().
+relation RefChassisSet(lr_uuid: uuid, chassis_uuids: Set<uuid>)
+RefChassisSet(lr_uuid, chassis_uuids) :-
+ RefChassisSet0(lr_uuid, chassis_uuids).
+RefChassisSet(lr_uuid, set_empty()) :-
+ nb::Logical_Router(._uuid = lr_uuid),
+ not RefChassisSet0(lr_uuid, _).
+
+/* Referenced chassis for an HA chassis group.
+ *
+ * Multiple logical routers can reference an HA chassis group so we merge the
+ * referenced chassis across all of them.
+ */
+relation HAChassisGroupRefChassisSet(hacg_uuid: uuid,
+ chassis_uuids: Set<uuid>)
+HAChassisGroupRefChassisSet(hacg_uuid, chassis_uuids) :-
+ LogicalRouterHAChassisGroup(lr_uuid, hacg_uuid),
+ RefChassisSet(lr_uuid, chassis_uuids),
+ var chassis_uuids = chassis_uuids.group_by(hacg_uuid).union().
+
+/* HA_Chassis_Group and HA_Chassis. */
+sb::Out_HA_Chassis_Group(hacg_uuid, hacg_name, ha_chassis, ref_chassis, eids) :-
+ HAChassis(hacg_uuid, hac_uuid, chassis_name, _, _),
+ var chassis_uuid = ha_chassis_uuid(chassis_name, hac_uuid),
+ var ha_chassis = chassis_uuid.group_by(hacg_uuid).to_set(),
+ HAChassisGroup(hacg_uuid, hacg_name, eids),
+ HAChassisGroupRefChassisSet(hacg_uuid, ref_chassis).
+
+sb::Out_HA_Chassis(ha_chassis_uuid(chassis_name, hac_uuid), chassis, priority, eids) :-
+ HAChassis(_, hac_uuid, chassis_name, priority, eids),
+ chassis_rec in sb::Chassis(.name = chassis_name),
+ var chassis = Some{chassis_rec._uuid}.
+sb::Out_HA_Chassis(ha_chassis_uuid(chassis_name, hac_uuid), None, priority, eids) :-
+ HAChassis(_, hac_uuid, chassis_name, priority, eids),
+ not chassis_rec in sb::Chassis(.name = chassis_name).
+
+relation HAChassisToChassis(name: string, chassis: Option<uuid>)
+HAChassisToChassis(name, Some{chassis}) :-
+ sb::Chassis(._uuid = chassis, .name = name).
+HAChassisToChassis(name, None) :-
+ nb::HA_Chassis(.chassis_name = name),
+ not sb::Chassis(.name = name).
+sb::Out_HA_Chassis(ha_chassis_uuid(ha_chassis.chassis_name, hac_uuid), chassis, priority, eids) :-
+ sp in &SwitchPort(),
+ sp.lsp.__type == "external",
+ Some{var ha_chassis_group_uuid} = sp.lsp.ha_chassis_group,
+ ha_chassis_group in nb::HA_Chassis_Group(._uuid = ha_chassis_group_uuid),
+ var hac_uuid = FlatMap(ha_chassis_group.ha_chassis),
+ ha_chassis in nb::HA_Chassis(._uuid = hac_uuid, .priority = priority, .external_ids = eids),
+ HAChassisToChassis(ha_chassis.chassis_name, chassis).
+sb::Out_HA_Chassis_Group(_uuid, name, ha_chassis, set_empty() /* XXX? */, eids) :-
+ sp in &SwitchPort(),
+ sp.lsp.__type == "external",
+ var ls_uuid = sp.sw.ls._uuid,
+ Some{var ha_chassis_group_uuid} = sp.lsp.ha_chassis_group,
+ ha_chassis_group in nb::HA_Chassis_Group(._uuid = ha_chassis_group_uuid, .name = name,
+ .external_ids = eids),
+ var hac_uuid = FlatMap(ha_chassis_group.ha_chassis),
+ ha_chassis in nb::HA_Chassis(._uuid = hac_uuid),
+ var ha_chassis_uuid_name = ha_chassis_uuid(ha_chassis.chassis_name, hac_uuid),
+ var ha_chassis = ha_chassis_uuid_name.group_by((ls_uuid, name, eids)).to_set(),
+ var _uuid = ha_chassis_group_uuid(ls_uuid).
+
+/*
+ * SB_Global: copy nb_cfg and options from NB.
+ * If NB_Global does not exist yet, just keep the current value of SB_Global,
+ * if any.
+ */
+for (nb_global in nb::NB_Global) {
+ sb::Out_SB_Global(._uuid = nb_global._uuid,
+ .nb_cfg = nb_global.nb_cfg,
+ .options = nb_global.options,
+ .ipsec = nb_global.ipsec)
+}
+
+sb::Out_SB_Global(._uuid = sb_global._uuid,
+ .nb_cfg = sb_global.nb_cfg,
+ .options = sb_global.options,
+ .ipsec = sb_global.ipsec) :-
+ sb_global in sb::SB_Global(),
+ not nb::NB_Global().
+
+/* sb::Chassis_Private joined with is_remote from sb::Chassis,
+ * including a record even for a null Chassis ref. */
+relation ChassisPrivate(
+ cp: sb::Chassis_Private,
+ is_remote: bool)
+ChassisPrivate(cp, map_get_bool_def(c.other_config, "is-remote", false)) :-
+ cp in sb::Chassis_Private(.chassis = Some{uuid}),
+ c in sb::Chassis(._uuid = uuid).
+ChassisPrivate(cp, false),
+Warning["Chassis does not exist for Chassis_Private record, name: ${cp.name}"] :-
+ cp in sb::Chassis_Private(.chassis = Some{uuid}),
+ not sb::Chassis(._uuid = uuid).
+ChassisPrivate(cp, false),
+Warning["Chassis does not exist for Chassis_Private record, name: ${cp.name}"] :-
+ cp in sb::Chassis_Private(.chassis = None).
+
+/* Track minimum hv_cfg across all the (non-remote) chassis. */
+relation HvCfg0(hv_cfg: integer)
+HvCfg0(hv_cfg) :-
+ ChassisPrivate(.cp = sb::Chassis_Private{.nb_cfg = chassis_cfg}, .is_remote = false),
+ var hv_cfg = chassis_cfg.group_by(()).min().
+relation HvCfg(hv_cfg: integer)
+HvCfg(hv_cfg) :- HvCfg0(hv_cfg).
+HvCfg(hv_cfg) :-
+ nb::NB_Global(.nb_cfg = hv_cfg),
+ not HvCfg0().
+
+/* Track maximum nb_cfg_timestamp among all the (non-remote) chassis
+ * that have the minimum nb_cfg. */
+relation HvCfgTimestamp0(hv_cfg_timestamp: integer)
+HvCfgTimestamp0(hv_cfg_timestamp) :-
+ HvCfg(hv_cfg),
+ ChassisPrivate(.cp = sb::Chassis_Private{.nb_cfg = hv_cfg,
+ .nb_cfg_timestamp = chassis_cfg_timestamp},
+ .is_remote = false),
+ var hv_cfg_timestamp = chassis_cfg_timestamp.group_by(()).max().
+relation HvCfgTimestamp(hv_cfg_timestamp: integer)
+HvCfgTimestamp(hv_cfg_timestamp) :- HvCfgTimestamp0(hv_cfg_timestamp).
+HvCfgTimestamp(hv_cfg_timestamp) :-
+ nb::NB_Global(.hv_cfg_timestamp = hv_cfg_timestamp),
+ not HvCfgTimestamp0().
+
+/*
+ * NB_Global:
+ * - set `sb_cfg` to the value of `SB_Global.nb_cfg`.
+ * - set `hv_cfg` to the smallest value of `nb_cfg` across all `Chassis`
+ * - FIXME: we use ipsec as a unique key to make sure that we don't create multiple `NB_Global`
+ *   instances. There is a potential race condition if this field is modified at the same
+ * time northd is updating `sb_cfg` or `hv_cfg`.
+ */
+input relation NbCfgTimestamp[integer]
+nb::Out_NB_Global(._uuid = _uuid,
+ .sb_cfg = sb_cfg,
+ .hv_cfg = hv_cfg,
+ .nb_cfg_timestamp = nb_cfg_timestamp,
+ .hv_cfg_timestamp = hv_cfg_timestamp,
+ .ipsec = ipsec,
+ .options = options) :-
+ NbCfgTimestamp[nb_cfg_timestamp],
+ HvCfgTimestamp(hv_cfg_timestamp),
+ nbg in nb::NB_Global(._uuid = _uuid, .ipsec = ipsec),
+ sb::SB_Global(.nb_cfg = sb_cfg),
+ HvCfg(hv_cfg),
+ MacPrefix(mac_prefix),
+ SvcMonitorMac(svc_monitor_mac),
+ OvnMaxDpKeyLocal[max_tunid],
+ var options0 = put_mac_prefix(nbg.options, mac_prefix),
+ var options1 = put_svc_monitor_mac(options0, svc_monitor_mac),
+ var options2 = options1.insert_imm("max_tunid", "${max_tunid}"),
+ var options = options2.insert_imm("northd_internal_version", ovn_internal_version()).
+
+
+/* SB_Global does not exist yet -- just keep the old value of NB_Global */
+nb::Out_NB_Global(._uuid = nbg._uuid,
+ .sb_cfg = nbg.sb_cfg,
+ .hv_cfg = nbg.hv_cfg,
+ .ipsec = nbg.ipsec,
+ .options = nbg.options,
+ .nb_cfg_timestamp = nb_cfg_timestamp,
+ .hv_cfg_timestamp = hv_cfg_timestamp) :-
+ NbCfgTimestamp[nb_cfg_timestamp],
+ HvCfgTimestamp(hv_cfg_timestamp),
+ nbg in nb::NB_Global(),
+ not sb::SB_Global().
+
+output relation SbCfg[integer]
+SbCfg[sb_cfg] :- nb::Out_NB_Global(.sb_cfg = sb_cfg).
+
+output relation Northd_Probe_Interval[s64]
+Northd_Probe_Interval[interval] :-
+ nb in nb::NB_Global(),
+ var interval = nb.options.get("northd_probe_interval").and_then(parse_dec_i64).unwrap_or(-1).
+
+relation CheckLspIsUp[bool]
+CheckLspIsUp[check_lsp_is_up] :-
+ nb in nb::NB_Global(),
+ var check_lsp_is_up = not map_get_bool_def(nb.options, "ignore_lsp_down", false).
+CheckLspIsUp[true] :-
+ Unit(),
+ not nb in nb::NB_Global().
+
+/*
+ * Address_Set: copy from NB + additional records generated from NB Port_Group (two records for each
+ * Port_Group for IPv4 and IPv6 addresses).
+ *
+ * There can be name collisions between the two types of Address_Set records. User-defined records
+ * take precedence.
+ */
+sb::Out_Address_Set(._uuid = nb_as._uuid,
+ .name = nb_as.name,
+ .addresses = nb_as.addresses) :-
+ AddressSetRef[nb_as].
+
+sb::Out_Address_Set(._uuid = hash128("svc_monitor_mac"),
+ .name = "svc_monitor_mac",
+ .addresses = set_singleton("${svc_monitor_mac}")) :-
+ SvcMonitorMac(svc_monitor_mac).
+
+sb::Out_Address_Set(hash128(as_name), as_name, pg_ip4addrs.union()) :-
+ nb::Port_Group(.ports = pg_ports, .name = pg_name),
+ var as_name = pg_name ++ "_ip4",
+ // avoid name collisions with user-defined Address_Sets
+ not nb::Address_Set(.name = as_name),
+ var port_uuid = FlatMap(pg_ports),
+ PortStaticAddresses(.lsport = port_uuid, .ip4addrs = stat),
+ SwitchPortNewDynamicAddress(&SwitchPort{.lsp = nb::Logical_Switch_Port{._uuid = port_uuid}},
+ dyn_addr),
+ var dynamic = match (dyn_addr) {
+ None -> set_empty(),
+ Some{lpaddress} -> match (lpaddress.ipv4_addrs.nth(0)) {
+ None -> set_empty(),
+ Some{addr} -> set_singleton("${addr.addr}")
+ }
+ },
+ //PortDynamicAddresses(.lsport = port_uuid, .ip4addrs = dynamic),
+ var port_ip4addrs = stat.union(dynamic),
+ var pg_ip4addrs = port_ip4addrs.group_by(as_name).to_vec().
+
+sb::Out_Address_Set(hash128(as_name), as_name, set_empty()) :-
+ nb::Port_Group(.ports = set_empty(), .name = pg_name),
+ var as_name = pg_name ++ "_ip4",
+ // avoid name collisions with user-defined Address_Sets
+ not nb::Address_Set(.name = as_name).
+
+sb::Out_Address_Set(hash128(as_name), as_name, pg_ip6addrs.union()) :-
+ nb::Port_Group(.ports = pg_ports, .name = pg_name),
+ var as_name = pg_name ++ "_ip6",
+ // avoid name collisions with user-defined Address_Sets
+ not nb::Address_Set(.name = as_name),
+ var port_uuid = FlatMap(pg_ports),
+ PortStaticAddresses(.lsport = port_uuid, .ip6addrs = stat),
+ SwitchPortNewDynamicAddress(&SwitchPort{.lsp = nb::Logical_Switch_Port{._uuid = port_uuid}},
+ dyn_addr),
+ var dynamic = match (dyn_addr) {
+ None -> set_empty(),
+ Some{lpaddress} -> match (lpaddress.ipv6_addrs.nth(0)) {
+ None -> set_empty(),
+ Some{addr} -> set_singleton("${addr.addr}")
+ }
+ },
+ //PortDynamicAddresses(.lsport = port_uuid, .ip6addrs = dynamic),
+ var port_ip6addrs = stat.union(dynamic),
+ var pg_ip6addrs = port_ip6addrs.group_by(as_name).to_vec().
+
+sb::Out_Address_Set(hash128(as_name), as_name, set_empty()) :-
+ nb::Port_Group(.ports = set_empty(), .name = pg_name),
+ var as_name = pg_name ++ "_ip6",
+ // avoid name collisions with user-defined Address_Sets
+ not nb::Address_Set(.name = as_name).
+
+/*
+ * Port_Group
+ *
+ * Create one SB Port_Group record for every datapath that has ports
+ * referenced by the NB Port_Group.ports field. In order to maintain the
+ * SB Port_Group.name uniqueness constraint, ovn-northd populates the field
+ * with the value: <SB.Logical_Datapath.tunnel_key>_<NB.Port_Group.name>.
+ * For example, an NB Port_Group "pg1" whose ports span datapaths with
+ * tunnel keys 5 and 7 yields two SB Port_Groups, "5_pg1" and "7_pg1".
+ */
+sb::Out_Port_Group(._uuid = hash128(sb_name), .name = sb_name, .ports = port_names) :-
+ nb::Port_Group(._uuid = _uuid, .name = nb_name, .ports = pg_ports),
+ var port_uuid = FlatMap(pg_ports),
+ &SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{._uuid = port_uuid,
+ .name = port_name},
+ .sw = &Switch{.ls = nb::Logical_Switch{._uuid = ls_uuid}}),
+ TunKeyAllocation(.datapath = ls_uuid, .tunkey = tunkey),
+ var sb_name = "${tunkey}_${nb_name}",
+ var port_names = port_name.group_by((_uuid, sb_name)).to_set().
+
+/*
+ * Multicast_Group:
+ * - three static rows per logical switch: one for flooding, one for packets
+ * with unknown destinations, one for flooding IP multicast known traffic to
+ * mrouters.
+ * - dynamically created rows based on IGMP groups learned by controllers.
+ */
+
+function mC_FLOOD(): (string, integer) =
+ ("_MC_flood", 32768)
+
+function mC_UNKNOWN(): (string, integer) =
+ ("_MC_unknown", 32769)
+
+function mC_MROUTER_FLOOD(): (string, integer) =
+ ("_MC_mrouter_flood", 32770)
+
+function mC_MROUTER_STATIC(): (string, integer) =
+ ("_MC_mrouter_static", 32771)
+
+function mC_STATIC(): (string, integer) =
+ ("_MC_static", 32772)
+
+function mC_FLOOD_L2(): (string, integer) =
+ ("_MC_flood_l2", 32773)
+
+function mC_IP_MCAST_MIN(): (string, integer) =
+ ("_MC_ip_mcast_min", 32774)
+
+function mC_IP_MCAST_MAX(): (string, integer) =
+ ("_MC_ip_mcast_max", 65535)
+
+
+// TODO: check that Multicast_Group.ports should not include derived ports
+
+/* Proxy table for Out_Multicast_Group: contains all Multicast_Group fields,
+ * except `_uuid`, which will be computed by hashing the remaining fields,
+ * and the tunnel key, which is allocated separately (see
+ * MulticastGroupTunKeyAllocation). */
+relation OutProxy_Multicast_Group (
+ datapath: uuid,
+ name: string,
+ ports: Set<uuid>
+)
+
+/* Only create flood group if the switch has enabled ports */
+sb::Out_Multicast_Group (._uuid = hash128((datapath,name)),
+ .datapath = datapath,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ &SwitchPort(.lsp = lsp, .sw = &Switch{.ls = ls}),
+ lsp.is_enabled(),
+ var datapath = ls._uuid,
+ var port_ids = lsp._uuid.group_by((datapath)).to_set(),
+ (var name, var tunnel_key) = mC_FLOOD().
+
+/* Create a multicast group to flood to all switch ports except router ports.
+ */
+sb::Out_Multicast_Group (._uuid = hash128((datapath,name)),
+ .datapath = datapath,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ &SwitchPort(.lsp = lsp, .sw = &Switch{.ls = ls}),
+ lsp.is_enabled(),
+ lsp.__type != "router",
+ var datapath = ls._uuid,
+ var port_ids = lsp._uuid.group_by((datapath)).to_set(),
+ (var name, var tunnel_key) = mC_FLOOD_L2().
+
+/* Only create unknown group if the switch has ports with "unknown" address */
+sb::Out_Multicast_Group (._uuid = hash128((ls,name)),
+ .datapath = ls,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ LogicalSwitchUnknownPorts(ls, port_ids),
+ (var name, var tunnel_key) = mC_UNKNOWN().
+
+/* Create a multicast group to flood multicast traffic to routers with
+ * multicast relay enabled.
+ */
+sb::Out_Multicast_Group (._uuid = hash128((sw.ls._uuid,name)),
+ .datapath = sw.ls._uuid,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ SwitchMcastFloodRelayPorts(&sw, port_ids),
+ not port_ids.is_empty(),
+ (var name, var tunnel_key) = mC_MROUTER_FLOOD().
+
+/* Create a multicast group to flood traffic (no reports) to ports with
+ * multicast flood enabled.
+ */
+sb::Out_Multicast_Group (._uuid = hash128((sw.ls._uuid,name)),
+ .datapath = sw.ls._uuid,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ SwitchMcastFloodPorts(&sw, port_ids),
+ not port_ids.is_empty(),
+ (var name, var tunnel_key) = mC_STATIC().
+
+/* Create a multicast group to flood reports to ports with
+ * multicast flood_reports enabled.
+ */
+sb::Out_Multicast_Group (._uuid = hash128((sw.ls._uuid,name)),
+ .datapath = sw.ls._uuid,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ SwitchMcastFloodReportPorts(&sw, port_ids),
+ not port_ids.is_empty(),
+ (var name, var tunnel_key) = mC_MROUTER_STATIC().
+
+/* Create a multicast group to flood traffic and reports to router ports with
+ * multicast flood enabled.
+ */
+sb::Out_Multicast_Group (._uuid = hash128((rtr.lr._uuid,name)),
+ .datapath = rtr.lr._uuid,
+ .name = name,
+ .tunnel_key = tunnel_key,
+ .ports = port_ids) :-
+ RouterMcastFloodPorts(&rtr, port_ids),
+ not port_ids.is_empty(),
+ (var name, var tunnel_key) = mC_STATIC().
+
+/* Create a multicast group for each IGMP group learned by a Switch.
+ * 'tunnel_key' == 0 triggers an ID allocation later.
+ */
+OutProxy_Multicast_Group (.datapath = switch.ls._uuid,
+ .name = address,
+ .ports = port_ids) :-
+ IgmpSwitchMulticastGroup(address, &switch, port_ids).
+
+/* Create a multicast group for each IGMP group learned by a Router.
+ * 'tunnel_key' == 0 triggers an ID allocation later.
+ */
+OutProxy_Multicast_Group (.datapath = router.lr._uuid,
+ .name = address,
+ .ports = port_ids) :-
+ IgmpRouterMulticastGroup(address, &router, port_ids).
+
+/* Allocate a 'tunnel_key' for dynamic multicast groups. */
+sb::Out_Multicast_Group(._uuid = hash128((mcgroup.datapath,mcgroup.name)),
+ .datapath = mcgroup.datapath,
+ .name = mcgroup.name,
+ .tunnel_key = tunnel_key,
+ .ports = mcgroup.ports) :-
+ mcgroup in OutProxy_Multicast_Group(),
+ MulticastGroupTunKeyAllocation(mcgroup.datapath, mcgroup.name, tunnel_key).
+
+/*
+ * MAC binding: records inserted by hypervisors; northd removes records for deleted logical ports and datapaths.
+ */
+sb::Out_MAC_Binding (._uuid = mb._uuid,
+ .logical_port = mb.logical_port,
+ .ip = mb.ip,
+ .mac = mb.mac,
+ .datapath = mb.datapath) :-
+ sb::MAC_Binding[mb],
+ sb::Out_Port_Binding(.logical_port = mb.logical_port),
+ sb::Out_Datapath_Binding(._uuid = mb.datapath).
+
+/*
+ * DHCP options: fixed table
+ */
+sb::Out_DHCP_Options (
+ ._uuid = 128'h7d9d898a_179b_4898_8382_b73bec391f23,
+ .name = "offerip",
+ .code = 0,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hea5e7d14_fd97_491c_8004_a120bdbc4306,
+ .name = "netmask",
+ .code = 1,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hdab5e39b_6702_4245_9573_6c142aa3724c,
+ .name = "router",
+ .code = 3,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h340b4bc5_c5c3_43d1_ae77_564da69c8fcc,
+ .name = "dns_server",
+ .code = 6,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hcd1ab302_cbb2_4eab_9ec5_ec1c8541bd82,
+ .name = "log_server",
+ .code = 7,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h1c7ea6a0_fe6b_48c1_a920_302583c1ff08,
+ .name = "lpr_server",
+ .code = 9,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hae35e575_226a_4ab5_a1c4_166f426dd999,
+ .name = "domain_name",
+ .code = 15,
+ .__type = "str"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'had0ec3e0_8be9_4c77_bceb_f8954a34c7ba,
+ .name = "swap_server",
+ .code = 16,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h884c2e02_6e99_4d12_aef7_8454ebf8a3b7,
+ .name = "policy_filter",
+ .code = 21,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h57cc2c61_fd2a_41c6_b6b1_6ce9a8901f86,
+ .name = "router_solicitation",
+ .code = 32,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h48249097_03f0_46c1_a32a_2dd57cd4d0f8,
+ .name = "nis_server",
+ .code = 41,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h333fe07e_bdd1_4371_aa4f_a412bc60f3a2,
+ .name = "ntp_server",
+ .code = 42,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h6207109c_49d0_4348_8238_dd92afb69bf0,
+ .name = "server_id",
+ .code = 54,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h2090b783_26d3_4c1d_830c_54c1b6c5d846,
+ .name = "tftp_server",
+ .code = 66,
+ .__type = "host_id"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'ha18ff399_caea_406e_af7e_321c6f74e581,
+ .name = "classless_static_route",
+ .code = 121,
+ .__type = "static_routes"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hb81ad7b4_62f0_40c7_a9a3_f96677628767,
+ .name = "ms_classless_static_route",
+ .code = 249,
+ .__type = "static_routes"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h0c2e144e_4b5f_4e21_8978_0e20bac9a6ea,
+ .name = "ip_forward_enable",
+ .code = 19,
+ .__type = "bool"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h6feb1926_9469_4b40_bfbf_478b9888cd3a,
+ .name = "router_discovery",
+ .code = 31,
+ .__type = "bool"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hcb776249_e8b1_4502_b33b_fa294d44077d,
+ .name = "ethernet_encap",
+ .code = 36,
+ .__type = "bool"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'ha2df9eaa_aea9_497f_b339_0c8ec3e39a07,
+ .name = "default_ttl",
+ .code = 23,
+ .__type = "uint8"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hb44b45a9_5004_4ef5_8e6a_aa8629e1afb1,
+ .name = "tcp_ttl",
+ .code = 37,
+ .__type = "uint8"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h50f01ca7_c650_46f0_8f50_39a67ec657da,
+ .name = "mtu",
+ .code = 26,
+ .__type = "uint16"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h9d31c057_6085_4810_96af_eeac7d3c5308,
+ .name = "lease_time",
+ .code = 51,
+ .__type = "uint32"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hea1e2e7a_9585_46ee_ad49_adfdefc0c4ef,
+ .name = "T1",
+ .code = 58,
+ .__type = "uint32"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hbc83a233_554b_453a_afca_1eadf76810d2,
+ .name = "T2",
+ .code = 59,
+ .__type = "uint32"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h1ab3eeca_0523_4101_9076_eea77d0232f4,
+ .name = "bootfile_name",
+ .code = 67,
+ .__type = "str"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'ha5c20b69_f7f3_4fa8_b550_8697aec6cbb7,
+ .name = "wpad",
+ .code = 252,
+ .__type = "str"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h1516bcb6_cc93_4233_a63f_bd29c8601831,
+ .name = "path_prefix",
+ .code = 210,
+ .__type = "str"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hc98e13cd_f653_473c_85c1_850dcad685fc,
+ .name = "tftp_server_address",
+ .code = 150,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hfbe06e70_b43d_4dd9_9b21_2f27eb5da5df,
+ .name = "arp_cache_timeout",
+ .code = 35,
+ .__type = "uint32"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h2af54a3c_545c_4104_ae1c_432caa3e085e,
+ .name = "tcp_keepalive_interval",
+ .code = 38,
+ .__type = "uint32"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h4b2144e8_8d3f_4d96_9032_fe23c1866cd4,
+ .name = "domain_search_list",
+ .code = 119,
+ .__type = "domains"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'hb7236164_eea4_4bf2_9306_8619a9e3ad1d,
+ .name = "broadcast_address",
+ .code = 28,
+ .__type = "ipv4"
+).
+
+sb::Out_DHCP_Options (
+ ._uuid = 128'h2d738583_96f4_4a78_99a1_f8f7fe328f3f,
+ .name = "bootfile_name_alt",
+ .code = 254,
+ .__type = "str"
+).
+
+
+/*
+ * DHCPv6 options: fixed table
+ */
+sb::Out_DHCPv6_Options (
+ ._uuid = 128'h100b2659_0ec0_4da7_9ec3_25997f92dc00,
+ .name = "server_id",
+ .code = 2,
+ .__type = "mac"
+).
+
+sb::Out_DHCPv6_Options (
+ ._uuid = 128'h53f49b50_db75_4b0d_83df_50d31009ca9c,
+ .name = "ia_addr",
+ .code = 5,
+ .__type = "ipv6"
+).
+
+sb::Out_DHCPv6_Options (
+ ._uuid = 128'he3619685_d4f7_42ad_936b_4f4440b7eeb4,
+ .name = "dns_server",
+ .code = 23,
+ .__type = "ipv6"
+).
+
+sb::Out_DHCPv6_Options (
+ ._uuid = 128'hcb8a4e7f_a312_4cb1_a846_e474d9f0c531,
+ .name = "domain_search",
+ .code = 24,
+ .__type = "str"
+).
+
+
+/*
+ * DNS: copied from the NB database, plus a "datapaths" column that points
+ * to the logical switch datapaths that use each record.
+ */
+
+function map_to_lowercase(m_in: Map<string,string>): Map<string,string> {
+ var m_out = map_empty();
+ for (node in m_in) {
+ (var k, var v) = node;
+ m_out.insert(string_to_lowercase(k), string_to_lowercase(v))
+ };
+ m_out
+}
+
+sb::Out_DNS(._uuid = nbdns._uuid,
+ .records = map_to_lowercase(nbdns.records),
+ .datapaths = datapaths,
+ .external_ids = nbdns.external_ids.insert_imm("dns_id", uuid2str(nbdns._uuid))) :-
+ nb::DNS[nbdns],
+ LogicalSwitchDNS(ls_uuid, nbdns._uuid),
+ var datapaths = ls_uuid.group_by(nbdns).to_set().
+
+/*
+ * RBAC_Permission: fixed
+ */
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'h7df3749a_1754_4a78_afa4_3abf526fe510,
+ .table = "Chassis",
+ .authorization = set_singleton("name"),
+ .insert_delete = true,
+ .update = ["nb_cfg", "external_ids", "encaps",
+ "vtep_logical_switches", "other_config", "name"].to_set()
+).
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'h07e623f7_137c_4a11_9084_3b3f89cb4a54,
+ .table = "Chassis_Private",
+ .authorization = set_singleton("name"),
+ .insert_delete = true,
+ .update = ["nb_cfg", "nb_cfg_timestamp", "chassis", "name"].to_set()
+).
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'h94bec860_431e_4d95_82e7_3b75d8997241,
+ .table = "Encap",
+ .authorization = set_singleton("chassis_name"),
+ .insert_delete = true,
+ .update = ["type", "options", "ip", "chassis_name"].to_set()
+).
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'hd8ceff1a_2b11_48bd_802f_4a991aa4e908,
+ .table = "Port_Binding",
+ .authorization = set_singleton(""),
+ .insert_delete = false,
+ .update = set_singleton("chassis")
+).
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'h6ffdc696_8bfb_4d82_b620_a00d39270b2f,
+ .table = "MAC_Binding",
+ .authorization = set_singleton(""),
+ .insert_delete = true,
+ .update = ["logical_port", "ip", "mac", "datapath"].to_set()
+).
+
+sb::Out_RBAC_Permission (
+ ._uuid = 128'h39231c7e_4bf1_41d0_ada4_1d8a319c0da3,
+ .table = "Service_Monitor",
+ .authorization = set_singleton(""),
+ .insert_delete = false,
+ .update = set_singleton("status")
+).
+
+/*
+ * RBAC_Role: fixed
+ */
+sb::Out_RBAC_Role (
+ ._uuid = 128'ha406b472_5de8_4456_9f38_bf344c911b22,
+ .name = "ovn-controller",
+ .permissions = [
+ "Chassis" -> 128'h7df3749a_1754_4a78_afa4_3abf526fe510,
+ "Chassis_Private" -> 128'h07e623f7_137c_4a11_9084_3b3f89cb4a54,
+ "Encap" -> 128'h94bec860_431e_4d95_82e7_3b75d8997241,
+ "Port_Binding" -> 128'hd8ceff1a_2b11_48bd_802f_4a991aa4e908,
+ "MAC_Binding" -> 128'h6ffdc696_8bfb_4d82_b620_a00d39270b2f,
+        "Service_Monitor" -> 128'h39231c7e_4bf1_41d0_ada4_1d8a319c0da3]
+).
+
+/* Output modified Logical_Switch_Port table with dynamic address updated */
+nb::Out_Logical_Switch_Port(._uuid = lsp._uuid,
+ .tag = tag,
+ .dynamic_addresses = dynamic_addresses,
+ .up = Some{up}) :-
+ SwitchPortNewDynamicAddress(&SwitchPort{.lsp = lsp, .up = up}, opt_dyn_addr),
+ var dynamic_addresses = opt_dyn_addr.and_then(|a| Some{"${a}"}),
+ SwitchPortNewDynamicTag(lsp._uuid, opt_tag),
+ var tag = match (opt_tag) {
+ None -> lsp.tag,
+ Some{t} -> Some{t}
+ }.
+
+relation LRPIPv6Prefix0(lrp_uuid: uuid, ipv6_prefix: string)
+LRPIPv6Prefix0(lrp._uuid, ipv6_prefix) :-
+ lrp in nb::Logical_Router_Port(),
+ map_get_bool_def(lrp.options, "prefix", false),
+ sb::Port_Binding(.logical_port = lrp.name, .options = options),
+ Some{var ipv6_ra_pd_list} = options.get("ipv6_ra_pd_list"),
+ var parts = string_split(ipv6_ra_pd_list, ","),
+ Some{var ipv6_prefix} = parts.nth(1).
+
+relation LRPIPv6Prefix(lrp_uuid: uuid, ipv6_prefix: Option<string>)
+LRPIPv6Prefix(lrp_uuid, Some{ipv6_prefix}) :-
+ LRPIPv6Prefix0(lrp_uuid, ipv6_prefix).
+LRPIPv6Prefix(lrp_uuid, None) :-
+ nb::Logical_Router_Port(._uuid = lrp_uuid),
+ not LRPIPv6Prefix0(lrp_uuid, _).
+
+nb::Out_Logical_Router_Port(._uuid = _uuid,
+ .ipv6_prefix = to_set(ipv6_prefix)) :-
+    nb::Logical_Router_Port(._uuid = _uuid),
+ LRPIPv6Prefix(_uuid, ipv6_prefix).
+
+typedef Direction = IN | OUT
+
+typedef PipelineStage = PORT_SEC_L2
+ | PORT_SEC_IP
+ | PORT_SEC_ND
+ | PRE_ACL
+ | PRE_LB
+ | PRE_STATEFUL
+ | ACL_HINT
+ | ACL
+ | QOS_MARK
+ | QOS_METER
+ | LB
+ | STATEFUL
+ | PRE_HAIRPIN
+ | HAIRPIN
+ | NAT_HAIRPIN
+ | ARP_ND_RSP
+ | DHCP_OPTIONS
+ | DHCP_RESPONSE
+ | DNS_LOOKUP
+ | DNS_RESPONSE
+ | EXTERNAL_PORT
+ | L2_LKUP
+ | ADMISSION
+ | LOOKUP_NEIGHBOR
+ | LEARN_NEIGHBOR
+ | IP_INPUT
+ | DEFRAG
+ | UNSNAT
+ | DNAT
+ | ECMP_STATEFUL
+ | ND_RA_OPTIONS
+ | ND_RA_RESPONSE
+ | IP_ROUTING
+ | IP_ROUTING_ECMP
+ | POLICY
+ | ARP_RESOLVE
+ | CHK_PKT_LEN
+ | LARGER_PKTS
+ | GW_REDIRECT
+ | ARP_REQUEST
+ | UNDNAT
+ | SNAT
+ | EGR_LOOP
+ | DELIVERY
+
+typedef DatapathType = LSwitch | LRouter
+
+typedef Stage = Stage{
+ datapath : DatapathType,
+ direction : Direction,
+ stage : PipelineStage
+}
+
+function switch_stage(direction: Direction, stage: PipelineStage): Stage = {
+ Stage{LSwitch, direction, stage}
+}
+
+function router_stage(direction: Direction, stage: PipelineStage): Stage = {
+ Stage{LRouter, direction, stage}
+}
+
+function stage_id(stage: Stage): (integer, string) =
+{
+ match ((stage.datapath, stage.direction, stage.stage)) {
+ /* Logical switch ingress stages. */
+ (LSwitch, IN, PORT_SEC_L2) -> (0, "ls_in_port_sec_l2"),
+ (LSwitch, IN, PORT_SEC_IP) -> (1, "ls_in_port_sec_ip"),
+ (LSwitch, IN, PORT_SEC_ND) -> (2, "ls_in_port_sec_nd"),
+ (LSwitch, IN, PRE_ACL) -> (3, "ls_in_pre_acl"),
+ (LSwitch, IN, PRE_LB) -> (4, "ls_in_pre_lb"),
+ (LSwitch, IN, PRE_STATEFUL) -> (5, "ls_in_pre_stateful"),
+ (LSwitch, IN, ACL_HINT) -> (6, "ls_in_acl_hint"),
+ (LSwitch, IN, ACL) -> (7, "ls_in_acl"),
+ (LSwitch, IN, QOS_MARK) -> (8, "ls_in_qos_mark"),
+ (LSwitch, IN, QOS_METER) -> (9, "ls_in_qos_meter"),
+ (LSwitch, IN, LB) -> (10, "ls_in_lb"),
+ (LSwitch, IN, STATEFUL) -> (11, "ls_in_stateful"),
+ (LSwitch, IN, PRE_HAIRPIN) -> (12, "ls_in_pre_hairpin"),
+        (LSwitch, IN, NAT_HAIRPIN)  -> (13, "ls_in_nat_hairpin"),
+ (LSwitch, IN, HAIRPIN) -> (14, "ls_in_hairpin"),
+ (LSwitch, IN, ARP_ND_RSP) -> (15, "ls_in_arp_rsp"),
+ (LSwitch, IN, DHCP_OPTIONS) -> (16, "ls_in_dhcp_options"),
+ (LSwitch, IN, DHCP_RESPONSE) -> (17, "ls_in_dhcp_response"),
+ (LSwitch, IN, DNS_LOOKUP) -> (18, "ls_in_dns_lookup"),
+ (LSwitch, IN, DNS_RESPONSE) -> (19, "ls_in_dns_response"),
+ (LSwitch, IN, EXTERNAL_PORT) -> (20, "ls_in_external_port"),
+ (LSwitch, IN, L2_LKUP) -> (21, "ls_in_l2_lkup"),
+
+ /* Logical switch egress stages. */
+ (LSwitch, OUT, PRE_LB) -> (0, "ls_out_pre_lb"),
+ (LSwitch, OUT, PRE_ACL) -> (1, "ls_out_pre_acl"),
+ (LSwitch, OUT, PRE_STATEFUL) -> (2, "ls_out_pre_stateful"),
+ (LSwitch, OUT, LB) -> (3, "ls_out_lb"),
+ (LSwitch, OUT, ACL_HINT) -> (4, "ls_out_acl_hint"),
+ (LSwitch, OUT, ACL) -> (5, "ls_out_acl"),
+ (LSwitch, OUT, QOS_MARK) -> (6, "ls_out_qos_mark"),
+ (LSwitch, OUT, QOS_METER) -> (7, "ls_out_qos_meter"),
+ (LSwitch, OUT, STATEFUL) -> (8, "ls_out_stateful"),
+ (LSwitch, OUT, PORT_SEC_IP) -> (9, "ls_out_port_sec_ip"),
+ (LSwitch, OUT, PORT_SEC_L2) -> (10, "ls_out_port_sec_l2"),
+
+ /* Logical router ingress stages. */
+ (LRouter, IN, ADMISSION) -> (0, "lr_in_admission"),
+ (LRouter, IN, LOOKUP_NEIGHBOR) -> (1, "lr_in_lookup_neighbor"),
+ (LRouter, IN, LEARN_NEIGHBOR) -> (2, "lr_in_learn_neighbor"),
+ (LRouter, IN, IP_INPUT) -> (3, "lr_in_ip_input"),
+ (LRouter, IN, DEFRAG) -> (4, "lr_in_defrag"),
+ (LRouter, IN, UNSNAT) -> (5, "lr_in_unsnat"),
+ (LRouter, IN, DNAT) -> (6, "lr_in_dnat"),
+ (LRouter, IN, ECMP_STATEFUL) -> (7, "lr_in_ecmp_stateful"),
+ (LRouter, IN, ND_RA_OPTIONS) -> (8, "lr_in_nd_ra_options"),
+        (LRouter, IN, ND_RA_RESPONSE) -> (9, "lr_in_nd_ra_response"),
+ (LRouter, IN, IP_ROUTING) -> (10, "lr_in_ip_routing"),
+ (LRouter, IN, IP_ROUTING_ECMP) -> (11, "lr_in_ip_routing_ecmp"),
+ (LRouter, IN, POLICY) -> (12, "lr_in_policy"),
+ (LRouter, IN, ARP_RESOLVE) -> (13, "lr_in_arp_resolve"),
+ (LRouter, IN, CHK_PKT_LEN) -> (14, "lr_in_chk_pkt_len"),
+ (LRouter, IN, LARGER_PKTS) -> (15, "lr_in_larger_pkts"),
+ (LRouter, IN, GW_REDIRECT) -> (16, "lr_in_gw_redirect"),
+ (LRouter, IN, ARP_REQUEST) -> (17, "lr_in_arp_request"),
+
+ /* Logical router egress stages. */
+ (LRouter, OUT, UNDNAT) -> (0, "lr_out_undnat"),
+ (LRouter, OUT, SNAT) -> (1, "lr_out_snat"),
+ (LRouter, OUT, EGR_LOOP) -> (2, "lr_out_egr_loop"),
+ (LRouter, OUT, DELIVERY) -> (3, "lr_out_delivery"),
+
+ _ -> (64'hffffffffffffffff, "") /* alternatively crash? */
+ }
+}
+
+/*
+ * OVS register usage:
+ *
+ * Logical Switch pipeline:
+ * +---------+----------------------------------------------+
+ * | R0 | REGBIT_{CONNTRACK/DHCP/DNS/HAIRPIN} |
+ * | | REGBIT_ACL_HINT_{ALLOW_NEW/ALLOW/DROP/BLOCK} |
+ * +---------+----------------------------------------------+
+ * | R1 - R9 | UNUSED |
+ * +---------+----------------------------------------------+
+ *
+ * Logical Router pipeline:
+ * +-----+--------------------------+---+-----------------+---+---------------+
+ * | R0 | REGBIT_ND_RA_OPTS_RESULT | | | | |
+ * | | (= IN_ND_RA_OPTIONS) | X | | | |
+ * | | NEXT_HOP_IPV4 | R | | | |
+ * | | (>= IP_INPUT) | E | INPORT_ETH_ADDR | X | |
+ * +-----+--------------------------+ G | (< IP_INPUT) | X | |
+ * | R1 | SRC_IPV4 for ARP-REQ | 0 | | R | |
+ * | | (>= IP_INPUT) | | | E | NEXT_HOP_IPV6 |
+ * +-----+--------------------------+---+-----------------+ G | (>= IP_INPUT) |
+ * | R2 | UNUSED | X | | 0 | |
+ * | | | R | | | |
+ * +-----+--------------------------+ E | UNUSED | | |
+ * | R3 | UNUSED | G | | | |
+ * | | | 1 | | | |
+ * +-----+--------------------------+---+-----------------+---+---------------+
+ * | R4 | UNUSED | X | | | |
+ * | | | R | | | |
+ * +-----+--------------------------+ E | UNUSED | X | |
+ * | R5 | UNUSED | G | | X | |
+ * | | | 2 | | R |SRC_IPV6 for NS|
+ * +-----+--------------------------+---+-----------------+ E | (>= IP_INPUT) |
+ * | R6 | UNUSED | X | | G | |
+ * | | | R | | 1 | |
+ * +-----+--------------------------+ E | UNUSED | | |
+ * | R7 | UNUSED | G | | | |
+ * | | | 3 | | | |
+ * +-----+--------------------------+---+-----------------+---+---------------+
+ * | R8 | ECMP_GROUP_ID | | |
+ * | | ECMP_MEMBER_ID | X | |
+ * +-----+--------------------------+ R | |
+ * | | REGBIT_{ | E | |
+ * | | EGRESS_LOOPBACK/ | G | UNUSED |
+ * | R9 | PKT_LARGER/ | 4 | |
+ * | | LOOKUP_NEIGHBOR_RESULT/| | |
+ * | | SKIP_LOOKUP_NEIGHBOR} | | |
+ * +-----+--------------------------+---+-----------------+
+ *
+ */
+
+/* Register definitions specific to routers. */
+function rEG_NEXT_HOP(): string = "reg0" /* reg0 for IPv4, xxreg0 for IPv6 */
+function rEG_SRC(): string = "reg1" /* reg1 for IPv4, xxreg1 for IPv6 */
+
+/* Register definitions specific to switches. */
+function rEGBIT_CONNTRACK_DEFRAG() : string = "reg0[0]"
+function rEGBIT_CONNTRACK_COMMIT() : string = "reg0[1]"
+function rEGBIT_CONNTRACK_NAT() : string = "reg0[2]"
+function rEGBIT_DHCP_OPTS_RESULT() : string = "reg0[3]"
+function rEGBIT_DNS_LOOKUP_RESULT(): string = "reg0[4]"
+function rEGBIT_ND_RA_OPTS_RESULT(): string = "reg0[5]"
+function rEGBIT_HAIRPIN() : string = "reg0[6]"
+function rEGBIT_ACL_HINT_ALLOW_NEW(): string = "reg0[7]"
+function rEGBIT_ACL_HINT_ALLOW() : string = "reg0[8]"
+function rEGBIT_ACL_HINT_DROP() : string = "reg0[9]"
+function rEGBIT_ACL_HINT_BLOCK() : string = "reg0[10]"
+
+/* Register definitions for switches and routers. */
+
+/* Indicate that this packet has been recirculated using egress
+ * loopback.  This allows certain checks to be bypassed, such as a
+ * logical router dropping packets whose source IP address equals
+ * one of the logical router's own IP addresses. */
+function rEGBIT_EGRESS_LOOPBACK() : string = "reg9[0]"
+/* Register to store the result of check_pkt_larger action. */
+function rEGBIT_PKT_LARGER() : string = "reg9[1]"
+function rEGBIT_LOOKUP_NEIGHBOR_RESULT() : string = "reg9[2]"
+function rEGBIT_LOOKUP_NEIGHBOR_IP_RESULT() : string = "reg9[3]"
+
+/* Register to store the eth address associated to a router port for packets
+ * received in S_ROUTER_IN_ADMISSION.
+ */
+function rEG_INPORT_ETH_ADDR() : string = "xreg0[0..47]"
+
+/* Register for ECMP bucket selection. */
+function rEG_ECMP_GROUP_ID() : string = "reg8[0..15]"
+function rEG_ECMP_MEMBER_ID() : string = "reg8[16..31]"
+
+function fLAGBIT_NOT_VXLAN() : string = "flags[1] == 0"
+
+function mFF_N_LOG_REGS() : bit<32> = 10
+
+/*
+ * Logical_Flow
+ relation Out_Logical_Flow (
+ logical_datapath: string,
+ pipeline: string,
+ table_id: integer,
+ priority: integer,
+ __match: string,
+ actions: string,
+ external_ids: Map<string,string>)
+ */
+
+relation Flow (
+ logical_datapath: uuid,
+ stage: Stage,
+ priority: integer,
+ __match: string,
+ actions: string,
+ external_ids: Map<string,string>
+)
+
+sb::Out_Logical_Flow(._uuid = hash128((f.logical_datapath, f.stage, f.priority, f.__match, f.actions, f.external_ids)),
+ .logical_datapath = f.logical_datapath,
+ .pipeline = if (f.stage.direction == IN) "ingress" else "egress",
+ .table_id = table_id,
+ .priority = f.priority,
+ .__match = f.__match,
+ .actions = f.actions,
+ .external_ids = f.external_ids.insert_imm("stage-name", table_name)) :-
+ Flow[f],
+ (var table_id, var table_name) = stage_id(f.stage).
+
+/* Logical flows for forwarding groups. */
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(fg_uuid)) :-
+ sw in &Switch(),
+ var fg_uuid = FlatMap(sw.ls.forwarding_groups),
+ fg in nb::Forwarding_Group(._uuid = fg_uuid),
+ not fg.child_port.is_empty(),
+ var __match = "arp.tpa == ${fg.vip} && arp.op == 1",
+ var actions = "eth.dst = eth.src; "
+ "eth.src = ${fg.vmac}; "
+ "arp.op = 2; /* ARP reply */ "
+ "arp.tha = arp.sha; "
+ "arp.sha = ${fg.vmac}; "
+ "arp.tpa = arp.spa; "
+ "arp.spa = ${fg.vip}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output;".
+
+function escape_child_ports(child_port: Set<string>): string {
+ var escaped = vec_with_capacity(child_port.size());
+ for (s in child_port) {
+ escaped.push(json_string_escape(s))
+ };
+ escaped.join(",")
+}
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = map_empty()) :-
+ sw in &Switch(),
+ var fg_uuid = FlatMap(sw.ls.forwarding_groups),
+ fg in nb::Forwarding_Group(._uuid = fg_uuid),
+ not fg.child_port.is_empty(),
+ var __match = "eth.dst == ${fg.vmac}",
+ var actions = "fwd_group(" ++
+ if (fg.liveness) { "liveness=\"true\"," } else { "" } ++
+ "childports=" ++ escape_child_ports(fg.child_port) ++ ");".
+
+/* Logical switch ingress table PORT_SEC_L2: admission control framework
+ * (priority 100) */
+for (sw in &Switch()) {
+ if (not sw.is_vlan_transparent) {
+ /* Block logical VLANs. */
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_L2),
+ .priority = 100,
+ .__match = "vlan.present",
+ .actions = "drop;",
+ .external_ids = map_empty() /*TODO: check*/)
+ };
+
+ /* Broadcast/multicast source address is invalid */
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_L2),
+ .priority = 100,
+ .__match = "eth.src[40]",
+ .actions = "drop;",
+ .external_ids = map_empty() /*TODO: check*/)
+    /* Port security flows have priority 50 (see below) and will continue
+     * to the next table if the packet's source address is acceptable. */
+}
+
+// Join a set of strings, separated by 'sep'.
+function join(strings: Set<string>, sep: string): string {
+ strings.to_vec().join(sep)
+}
+
+function build_port_security_ipv6_flow(
+ pipeline: Direction,
+ ea: eth_addr,
+ ipv6_addrs: Vec<ipv6_netaddr>): string =
+{
+ var ip6_addrs = vec_empty();
+
+ /* Allow link-local address. */
+ ip6_addrs.push(ipv6_string_mapped(in6_generate_lla(ea)));
+
+ /* Allow ip6.dst=ff00::/8 for multicast packets */
+ if (pipeline == OUT) {
+ ip6_addrs.push("ff00::/8")
+ };
+ for (addr in ipv6_addrs) {
+ ip6_addrs.push(ipv6_netaddr_match_network(addr))
+ };
+
+ var dir = if (pipeline == IN) { "src" } else { "dst" };
+ " && ip6.${dir} == {" ++ ip6_addrs.join(", ") ++ "}"
+}
+
+function build_port_security_ipv6_nd_flow(
+ ea: eth_addr,
+ ipv6_addrs: Vec<ipv6_netaddr>): string =
+{
+ var __match = " && ip6 && nd && ((nd.sll == ${eth_addr_zero()} || "
+ "nd.sll == ${ea}) || ((nd.tll == ${eth_addr_zero()} || "
+ "nd.tll == ${ea})";
+ if (ipv6_addrs.is_empty()) {
+ __match ++ "))"
+ } else {
+ var ip6_str = ipv6_string_mapped(in6_generate_lla(ea));
+ __match = __match ++ " && (nd.target == ${ip6_str}";
+
+ for(addr in ipv6_addrs) {
+ ip6_str = ipv6_netaddr_match_network(addr);
+ __match = __match ++ " || nd.target == ${ip6_str}"
+ };
+ __match ++ ")))"
+ }
+}
+
+/* Pre-ACL */
+for (&Switch(.ls = ls)) {
+ /* Ingress and Egress Pre-ACL Table (Priority 0): Packets are
+ * allowed by default. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 110,
+ .__match = "eth.dst == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 110,
+ .__match = "eth.src == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+
+/* If there are any stateful ACL rules in this datapath, we must
+ * send all IP packets through the conntrack action, which handles
+ * defragmentation, in order to match L4 headers. */
+
+for (&SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "router"},
+ .json_name = lsp_name,
+ .sw = &Switch{.ls = ls, .has_stateful_acl = true})) {
+ /* Can't use ct() for router ports. Consider the
+ * following configuration: lp1(10.0.0.2) on
+     * hostA--ls1--lr0--ls2--lp2(10.0.1.2) on hostB. For a
+     * ping from lp1 to lp2, first the response will go
+ * through ct() with a zone for lp2 in the ls2 ingress
+ * pipeline on hostB. That ct zone knows about this
+ * connection. Next, it goes through ct() with the zone
+ * for the router port in the egress pipeline of ls2 on
+ * hostB. This zone does not know about the connection,
+ * as the icmp request went through the logical router
+ * on hostA, not hostB. This would only work with
+ * distributed conntrack state across all chassis. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 110,
+ .__match = "ip && inport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid));
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 110,
+ .__match = "ip && outport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+for (&SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "localnet"},
+ .json_name = lsp_name,
+ .sw = &Switch{.ls = ls, .has_stateful_acl = true})) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 110,
+ .__match = "ip && inport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid));
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 110,
+ .__match = "ip && outport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+for (&Switch(.ls = ls, .has_stateful_acl = true)) {
+ /* Ingress and Egress Pre-ACL Table (Priority 110).
+ *
+     * Do not send ND packets and ICMP destination unreachable
+     * packets to conntrack. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 110,
+ .__match = "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
+ "(udp && udp.src == 546 && udp.dst == 547)",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 110,
+ .__match = "nd || nd_rs || nd_ra || mldv1 || mldv2 || "
+ "(udp && udp.src == 546 && udp.dst == 547)",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* Ingress and Egress Pre-ACL Table (Priority 100).
+ *
+ * Regardless of whether the ACL is "from-lport" or "to-lport",
+ * we need rules in both the ingress and egress table, because
+ * the return traffic needs to be followed.
+ *
+ * 'REGBIT_CONNTRACK_DEFRAG' is set to let the pre-stateful table send
+ * it to conntrack for tracking and defragmentation. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_ACL),
+ .priority = 100,
+ .__match = "ip",
+ .actions = "${rEGBIT_CONNTRACK_DEFRAG()} = 1; next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_ACL),
+ .priority = 100,
+ .__match = "ip",
+ .actions = "${rEGBIT_CONNTRACK_DEFRAG()} = 1; next;",
+ .external_ids = map_empty())
+}
+
+/* Pre-LB */
+for (&Switch(.ls = ls)) {
+ /* Do not send ND packets to conntrack */
+ var __match = "nd || nd_rs || nd_ra || mldv1 || mldv2" in {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 110,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_LB),
+ .priority = 110,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = map_empty())
+ };
+
+ /* Do not send service monitor packets to conntrack. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 110,
+ .__match = "eth.dst == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_LB),
+ .priority = 110,
+ .__match = "eth.src == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* Allow all packets to go to next tables by default. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_LB),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+for (&SwitchPort(.lsp = lsp, .json_name = lsp_name, .sw = &Switch{.ls = ls}))
+if (lsp.__type == "router" or lsp.__type == "localnet") {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 110,
+ .__match = "ip && inport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid));
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_LB),
+ .priority = 110,
+ .__match = "ip && outport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+relation HasEventElbMeter(has_meter: bool)
+
+HasEventElbMeter(true) :-
+ nb::Meter(.name = "event-elb").
+
+HasEventElbMeter(false) :-
+ Unit(),
+ not nb::Meter(.name = "event-elb").
+
+/* Empty LoadBalancer Controller event */
+function build_empty_lb_event_flow(key: string, lb: nb::Load_Balancer,
+ meter: bool): Option<(string, string)> {
+ (var ip, var port) = match (ip_address_and_port_from_lb_key(key)) {
+ Some{(ip, port)} -> (ip, port),
+ _ -> return None
+ };
+
+ var protocol = match (lb.protocol) {
+ Some{"tcp"} -> "tcp",
+ _ -> "udp"
+ };
+ var meter = match (meter) {
+ true -> "event-elb",
+ _ -> ""
+ };
+ var vip = match (port) {
+ 0 -> "${ip}",
+ _ -> "${ip.to_bracketed_string()}:${port}"
+ };
+
+ var __match = vec_with_capacity(2);
+ __match.push("${ip46_ipX(ip)}.dst == ${ip}");
+ if (port != 0) {
+ __match.push("${protocol}.dst == ${port}");
+ };
+
+ var action = "trigger_event("
+ "event = \"empty_lb_backends\", "
+ "meter = \"${meter}\", "
+ "vip = \"${vip}\", "
+ "protocol = \"${protocol}\", "
+ "load_balancer = \"${uuid2str(lb._uuid)}\");";
+
+ Some{(__match.join(" && "), action)}
+}
+
+/* ControllerEventEn has exactly one row, either 'true' to enable controller
+ * events or 'false' to disable them. */
+relation ControllerEventEn(enable: bool)
+ControllerEventEn(map_get_bool_def(options, "controller_event", false)) :-
+ nb::NB_Global(.options = options).
+ControllerEventEn(false) :- Unit(), not nb::NB_Global().
+
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 130,
+ .__match = __match,
+ .actions = __action,
+ .external_ids = stage_hint(lb._uuid)) :-
+ ControllerEventEn(true),
+ SwitchLBVIP(.sw_uuid = sw_uuid, .lb = &lb, .vip = vip, .backends = backends),
+ sw in &Switch(.ls = nb::Logical_Switch{._uuid = sw_uuid}),
+ backends == "",
+ HasEventElbMeter(has_elb_meter),
+ Some {(var __match, var __action)} = build_empty_lb_event_flow(
+ vip, lb, has_elb_meter).
+
+/* 'REGBIT_CONNTRACK_DEFRAG' is set to let the pre-stateful table send
+ * packet to conntrack for defragmentation.
+ *
+ * Send all the packets to conntrack in the ingress pipeline if the
+ * logical switch has a load balancer with VIP configured. Earlier
+ * we used to set the REGBIT_CONNTRACK_DEFRAG flag in the ingress pipeline
+ * if the IP destination matches the VIP. But this causes a few issues when
+ * a logical switch has no ACLs configured with allow-related.
+ * To understand the issue, let's take a TCP load balancer -
+ * 10.0.0.10:80=10.0.0.3:80.
+ * If a logical port - p1 with IP - 10.0.0.5 opens a TCP connection with
+ * the VIP - 10.0.0.10, then the packet in the ingress pipeline of 'p1'
+ * is sent to p1's conntrack zone id and the packet is load balanced
+ * to the backend - 10.0.0.3. For the reply packet from the backend lport,
+ * it is not sent to the conntrack of the backend lport's zone id. This is fine
+ * as long as the packet is valid. Suppose the backend lport sends an
+ * invalid TCP packet (like incorrect sequence number), the packet gets
+ * delivered to the lport 'p1' without unDNATing the packet to the
+ * VIP - 10.0.0.10. And this causes the connection to be reset by the
+ * lport p1's VIF.
+ *
+ * We can't fix this issue by adding a logical flow to drop ct.inv packets
+ * in the egress pipeline since it will drop all other connections not
+ * destined to the load balancers.
+ *
+ * To fix this issue, we send all the packets to the conntrack in the
+ * ingress pipeline if a load balancer is configured. We can now
+ * add a lflow to drop ct.inv packets.
+ */
+for (sw in &Switch(.has_lb_vip = true)) {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PRE_LB),
+ .priority = 100,
+ .__match = "ip",
+ .actions = "${rEGBIT_CONNTRACK_DEFRAG()} = 1; next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PRE_LB),
+ .priority = 100,
+ .__match = "ip",
+ .actions = "${rEGBIT_CONNTRACK_DEFRAG()} = 1; next;",
+ .external_ids = map_empty())
+}
+
+/* Pre-stateful */
+for (&Switch(.ls = ls)) {
+ /* Ingress and Egress pre-stateful Table (Priority 0): Packets are
+ * allowed by default. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_STATEFUL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_STATEFUL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* If REGBIT_CONNTRACK_DEFRAG is set as 1, then the packets should be
+ * sent to conntrack for tracking and defragmentation. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PRE_STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_DEFRAG()} == 1",
+ .actions = "ct_next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PRE_STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_DEFRAG()} == 1",
+ .actions = "ct_next;",
+ .external_ids = map_empty())
+}
+
+function acl_log_meter_name(meter_name: string, acl_uuid: uuid): string =
+{
+ meter_name ++ "__" ++ uuid2str(acl_uuid)
+}
+
+function build_acl_log(acl: nb::ACL, fair_meter: bool): string =
+{
+ if (not acl.log) {
+ ""
+ } else {
+ var strs = vec_empty();
+ match (acl.name) {
+ None -> (),
+ Some{name} -> strs.push("name=${json_string_escape(name)}")
+ };
+ /* If a severity level isn't specified, default to "info". */
+ match (acl.severity) {
+ None -> strs.push("severity=info"),
+ Some{severity} -> strs.push("severity=${severity}")
+ };
+ match (acl.action) {
+ "drop" -> {
+ strs.push("verdict=drop")
+ },
+ "reject" -> {
+ strs.push("verdict=reject")
+ },
+ "allow" -> {
+ strs.push("verdict=allow")
+ },
+ "allow-related" -> {
+ strs.push("verdict=allow")
+ },
+ _ -> ()
+ };
+ match (acl.meter) {
+ Some{meter} -> {
+ var name = match (fair_meter) {
+ true -> acl_log_meter_name(meter, acl._uuid),
+ false -> meter
+ };
+ strs.push("meter=${json_string_escape(name)}")
+ },
+ None -> ()
+ };
+ "log(${strs.join(\", \")}); "
+ }
+}
+
+/* Due to various hard-coded priorities needed to implement ACLs, the
+ * northbound database supports a smaller range of ACL priorities than
+ * are available to logical flows. This value is added to an ACL
+ * priority to determine the ACL's logical flow priority. */
+function oVN_ACL_PRI_OFFSET(): integer = 1000
+
+/* Intermediate relation that stores reject ACLs.
+ * The following rules generate logical flows for these ACLs.
+ */
+relation Reject(
+ lsuuid: uuid,
+ pipeline: string,
+ stage: Stage,
+ acl: nb::ACL,
+ fair_meter: bool,
+ extra_match: string,
+ extra_actions: string)
+
+/* build_reject_acl_rules() */
+for (Reject(lsuuid, pipeline, stage, acl, fair_meter, extra_match_, extra_actions_)) {
+ var extra_match = match (extra_match_) {
+ "" -> "",
+ s -> "(${s}) && "
+ } in
+ var extra_actions = match (extra_actions_) {
+ "" -> "",
+ s -> "${s} "
+ } in
+ var next = match (pipeline == "ingress") {
+ true -> "next(pipeline=egress,table=${stage_id(switch_stage(OUT, QOS_MARK)).0})",
+ false -> "next(pipeline=ingress,table=${stage_id(switch_stage(IN, L2_LKUP)).0})"
+ } in
+ var acl_log = build_acl_log(acl, fair_meter) in
+ var __match = extra_match ++ acl.__match in
+ var actions = acl_log ++ extra_actions ++ "reg0 = 0; "
+ "reject { "
+ "/* eth.dst <-> eth.src; ip.dst <-> ip.src; is implicit. */ "
+ "outport <-> inport; ${next}; };" in
+ Flow(.logical_datapath = lsuuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(acl._uuid))
+}
+
+/* build_acls */
+for (sw in &Switch(.ls = ls))
+var has_stateful = sw.has_stateful_acl or sw.has_lb_vip in
+{
+ /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
+ * default. A related rule at priority 1 is added below if there
+ * are any stateful ACLs in this datapath. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ if (has_stateful) {
+ /* Ingress and Egress ACL Table (Priority 1).
+ *
+ * By default, traffic is allowed. This is partially handled by
+ * the Priority 0 ACL flows added earlier, but we also need to
+         * commit IP flows. This is because, while the initiator's
+ * direction may not have any stateful rules, the server's may
+ * and then its return traffic would not have an associated
+ * conntrack entry and would return "+invalid".
+ *
+ * We use "ct_commit" for a connection that is not already known
+ * by the connection tracker. Once a connection is committed,
+ * subsequent packets will hit the flow at priority 0 that just
+         * uses "next;".
+ *
+ * We also check for established connections that have ct_label.blocked
+ * set on them. That's a connection that was disallowed, but is
+ * now allowed by policy again since it hit this default-allow flow.
+ * We need to set ct_label.blocked=0 to let the connection continue,
+ * which will be done by ct_commit() in the "stateful" stage.
+ * Subsequent packets will hit the flow at priority 0 that just
+ * uses "next;". */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 1,
+ .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))",
+ .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 1,
+ .__match = "ip && (!ct.est || (ct.est && ct_label.blocked == 1))",
+ .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; next;",
+ .external_ids = map_empty());
+
+ /* Ingress and Egress ACL Table (Priority 65535).
+ *
+ * Always drop traffic that's in an invalid state. Also drop
+ * reply direction packets for connections that have been marked
+ * for deletion (bit 0 of ct_label is set).
+ *
+ * This is enforced at a higher priority than ACLs can be defined. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 65535,
+ .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)",
+ .actions = "drop;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 65535,
+ .__match = "ct.inv || (ct.est && ct.rpl && ct_label.blocked == 1)",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* Ingress and Egress ACL Table (Priority 65535).
+ *
+ * Allow reply traffic that is part of an established
+ * conntrack entry that has not been marked for deletion
+ * (bit 0 of ct_label). We only match traffic in the
+ * reply direction because we want traffic in the request
+ * direction to hit the currently defined policy from ACLs.
+ *
+ * This is enforced at a higher priority than ACLs can be defined. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 65535,
+ .__match = "ct.est && !ct.rel && !ct.new && !ct.inv "
+ "&& ct.rpl && ct_label.blocked == 0",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 65535,
+ .__match = "ct.est && !ct.rel && !ct.new && !ct.inv "
+ "&& ct.rpl && ct_label.blocked == 0",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* Ingress and Egress ACL Table (Priority 65535).
+ *
+ * Allow traffic that is related to an existing conntrack entry that
+ * has not been marked for deletion (bit 0 of ct_label).
+ *
+ * This is enforced at a higher priority than ACLs can be defined.
+ *
+         * NOTE: This does not support related data sessions (e.g.,
+ * a dynamically negotiated FTP data channel), but will allow
+ * related traffic such as an ICMP Port Unreachable through
+ * that's generated from a non-listening UDP port. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 65535,
+ .__match = "!ct.est && ct.rel && !ct.new && !ct.inv "
+ "&& ct_label.blocked == 0",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 65535,
+ .__match = "!ct.est && ct.rel && !ct.new && !ct.inv "
+ "&& ct_label.blocked == 0",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* Ingress and Egress ACL Table (Priority 65535).
+ *
+         * Don't do conntrack on ND or MLD packets. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 65535,
+ .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 65535,
+ .__match = "nd || nd_ra || nd_rs || mldv1 || mldv2",
+ .actions = "next;",
+ .external_ids = map_empty())
+ };
+
+ /* Add a 34000 priority flow to advance the DNS reply from ovn-controller,
+ * if the CMS has configured DNS records for the datapath.
+ */
+ if (sw.has_dns_records) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 34000,
+ .__match = "udp.src == 53",
+ .actions = if has_stateful "ct_commit; next;" else "next;",
+ .external_ids = map_empty())
+ };
+
+ /* Add a 34000 priority flow to advance the service monitor reply
+ * packets to skip applying ingress ACLs. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ACL),
+ .priority = 34000,
+ .__match = "eth.dst == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 34000,
+ .__match = "eth.src == $svc_monitor_mac",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* This stage builds hints for the IN/OUT_ACL stage. Based on various
+ * combinations of ct flags packets may hit only a subset of the logical
+ * flows in the IN/OUT_ACL stage.
+ *
+ * Populating ACL hints first and storing them in registers simplifies
+ * the logical flow match expressions in the IN/OUT_ACL stage and
+ * generates fewer OpenFlow flows.
+ *
+ * Certain combinations of ct flags might be valid matches for multiple
+ * types of ACL logical flows (e.g., allow/drop). In such cases hints
+ * corresponding to all potential matches are set.
+ */
+input relation AclHintStages[Stage]
+AclHintStages[switch_stage(IN, ACL_HINT)].
+AclHintStages[switch_stage(OUT, ACL_HINT)].
+for (&Switch(.ls = ls)) {
+ for (AclHintStages[stage]) {
+ /* New, not already established connections, may hit either allow
+ * or drop ACLs. For allow ACLs, the connection must also be committed
+ * to conntrack so we set REGBIT_ACL_HINT_ALLOW_NEW.
+ */
+ Flow(ls._uuid, stage, 7, "ct.new && !ct.est",
+ "${rEGBIT_ACL_HINT_ALLOW_NEW()} = 1; "
+ "${rEGBIT_ACL_HINT_DROP()} = 1; "
+ "next;", map_empty());
+
+ /* Already established connections in the "request" direction that
+ * are already marked as "blocked" may hit either:
+ * - allow ACLs for connections that were previously allowed by a
+         *   policy that was deleted and is being re-added now. In this case
+ * the connection should be recommitted so we set
+ * REGBIT_ACL_HINT_ALLOW_NEW.
+ * - drop ACLs.
+ */
+ Flow(ls._uuid, stage, 6, "!ct.new && ct.est && !ct.rpl && ct_label.blocked == 1",
+ "${rEGBIT_ACL_HINT_ALLOW_NEW()} = 1; "
+ "${rEGBIT_ACL_HINT_DROP()} = 1; "
+ "next;", map_empty());
+
+ /* Not tracked traffic can either be allowed or dropped. */
+ Flow(ls._uuid, stage, 5, "!ct.trk",
+ "${rEGBIT_ACL_HINT_ALLOW()} = 1; "
+ "${rEGBIT_ACL_HINT_DROP()} = 1; "
+ "next;", map_empty());
+
+ /* Already established connections in the "request" direction may hit
+ * either:
+ * - allow ACLs in which case the traffic should be allowed so we set
+ * REGBIT_ACL_HINT_ALLOW.
+ * - drop ACLs in which case the traffic should be blocked and the
+ * connection must be committed with ct_label.blocked set so we set
+ * REGBIT_ACL_HINT_BLOCK.
+ */
+ Flow(ls._uuid, stage, 4, "!ct.new && ct.est && !ct.rpl && ct_label.blocked == 0",
+ "${rEGBIT_ACL_HINT_ALLOW()} = 1; "
+ "${rEGBIT_ACL_HINT_BLOCK()} = 1; "
+ "next;", map_empty());
+
+ /* Not established or established and already blocked connections may
+ * hit drop ACLs.
+ */
+ Flow(ls._uuid, stage, 3, "!ct.est",
+ "${rEGBIT_ACL_HINT_DROP()} = 1; "
+ "next;", map_empty());
+ Flow(ls._uuid, stage, 2, "ct.est && ct_label.blocked == 1",
+ "${rEGBIT_ACL_HINT_DROP()} = 1; "
+ "next;", map_empty());
+
+ /* Established connections that were previously allowed might hit
+ * drop ACLs in which case the connection must be committed with
+ * ct_label.blocked set.
+ */
+ Flow(ls._uuid, stage, 1, "ct.est && ct_label.blocked == 0",
+ "${rEGBIT_ACL_HINT_BLOCK()} = 1; "
+ "next;", map_empty());
+
+ /* In any case, advance to the next stage. */
+ Flow(ls._uuid, stage, 0, "1", "next;", map_empty())
+ }
+}
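The hint ladder above can be condensed into a small Python model, useful for reasoning about which ACL flows a packet can hit. This is a sketch, not the implementation: given a packet's conntrack state, it returns the hint register bits set by the highest-priority matching flow (names like "allow_new" are shorthand for the rEGBIT_ACL_HINT_* registers).

```python
def acl_hints(ct_new, ct_est, ct_rpl, ct_trk, blocked):
    """Hint bits set by the highest-priority matching ACL_HINT flow."""
    if ct_new and not ct_est:                                  # priority 7
        return {"allow_new", "drop"}
    if not ct_new and ct_est and not ct_rpl and blocked:       # priority 6
        return {"allow_new", "drop"}
    if not ct_trk:                                             # priority 5
        return {"allow", "drop"}
    if not ct_new and ct_est and not ct_rpl and not blocked:   # priority 4
        return {"allow", "block"}
    if not ct_est:                                             # priority 3
        return {"drop"}
    if ct_est and blocked:                                     # priority 2
        return {"drop"}
    if ct_est and not blocked:                                 # priority 1
        return {"block"}
    return set()                                               # priority 0
```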
+
+/* Ingress or Egress ACL Table (Various priorities). */
+for (&SwitchACL(.sw = &Switch{.ls = ls, .has_stateful_acl = has_stateful},
+ .acl = &acl, .has_fair_meter = fair_meter)) {
+ /* consider_acl */
+ var ingress = acl.direction == "from-lport" in
+ var stage = if (ingress) { switch_stage(IN, ACL) } else { switch_stage(OUT, ACL) } in
+ var pipeline = if ingress "ingress" else "egress" in
+ var stage_hint = stage_hint(acl._uuid) in
+ var acl_log = build_acl_log(acl, fair_meter) in
+ if (acl.action == "allow" or acl.action == "allow-related") {
+ /* If there are any stateful flows, we must even commit "allow"
+         * actions. This is because, while the initiator's
+ * direction may not have any stateful rules, the server's
+ * may and then its return traffic would not have an
+ * associated conntrack entry and would return "+invalid". */
+ if (not has_stateful) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = acl.__match,
+ .actions = "${acl_log}next;",
+ .external_ids = stage_hint)
+ } else {
+ /* Commit the connection tracking entry if it's a new
+ * connection that matches this ACL. After this commit,
+ * the reply traffic is allowed by a flow we create at
+ * priority 65535, defined earlier.
+ *
+ * It's also possible that a known connection was marked for
+ * deletion after a policy was deleted, but the policy was
+ * re-added while that connection is still known. We catch
+ * that case here and un-set ct_label.blocked (which will be done
+ * by ct_commit in the "stateful" stage) to indicate that the
+ * connection should be allowed to resume.
+ */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = "${rEGBIT_ACL_HINT_ALLOW_NEW()} == 1 && (${acl.__match})",
+ .actions = "${rEGBIT_CONNTRACK_COMMIT()} = 1; ${acl_log}next;",
+ .external_ids = stage_hint);
+
+ /* Match on traffic in the request direction for an established
+ * connection tracking entry that has not been marked for
+ * deletion. There is no need to commit here, so we can just
+ * proceed to the next table. We use this to ensure that this
+ * connection is still allowed by the currently defined
+ * policy. Match untracked packets too. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = "${rEGBIT_ACL_HINT_ALLOW()} == 1 && (${acl.__match})",
+ .actions = "${acl_log}next;",
+ .external_ids = stage_hint)
+ }
+ } else if (acl.action == "drop" or acl.action == "reject") {
+ /* The implementation of "drop" differs if stateful ACLs are in
+ * use for this datapath. In that case, the actions differ
+ * depending on whether the connection was previously committed
+ * to the connection tracker with ct_commit. */
+ if (has_stateful) {
+ /* If the packet is not tracked or not part of an established
+ * connection, then we can simply reject/drop it. */
+ var __match = "${rEGBIT_ACL_HINT_DROP()} == 1" in
+ if (acl.action == "reject") {
+ Reject(ls._uuid, pipeline, stage, acl, fair_meter, __match, "")
+ } else {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = __match ++ " && (${acl.__match})",
+ .actions = "${acl_log}/* drop */",
+ .external_ids = stage_hint)
+ };
+ /* For an existing connection without ct_label set, we've
+ * encountered a policy change. ACLs previously allowed
+ * this connection and we committed the connection tracking
+ * entry. Current policy says that we should drop this
+ * connection. First, we set bit 0 of ct_label to indicate
+ * that this connection is set for deletion. By not
+ * specifying "next;", we implicitly drop the packet after
+ * updating conntrack state. We would normally defer
+ * ct_commit() to the "stateful" stage, but since we're
+ * rejecting/dropping the packet, we go ahead and do it here.
+ */
+ var __match = "${rEGBIT_ACL_HINT_BLOCK()} == 1" in
+ var actions = "ct_commit { ct_label.blocked = 1; }; " in
+ if (acl.action == "reject") {
+ Reject(ls._uuid, pipeline, stage, acl, fair_meter, __match, actions)
+ } else {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = __match ++ " && (${acl.__match})",
+ .actions = "${actions}${acl_log}/* drop */",
+ .external_ids = stage_hint)
+ }
+ } else {
+ /* There are no stateful ACLs in use on this datapath,
+ * so a "reject/drop" ACL is simply the "reject/drop"
+ * logical flow action in all cases. */
+ if (acl.action == "reject") {
+ Reject(ls._uuid, pipeline, stage, acl, fair_meter, "", "")
+ } else {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = stage,
+ .priority = acl.priority + oVN_ACL_PRI_OFFSET(),
+ .__match = acl.__match,
+ .actions = "${acl_log}/* drop */",
+ .external_ids = stage_hint)
+ }
+ }
+ }
+}
+
+/* Add 34000 priority flow to allow DHCP reply from ovn-controller to all
+ * logical ports of the datapath if the CMS has configured DHCPv4 options. */
+for (SwitchPortDHCPv4Options(.port = &SwitchPort{.lsp = lsp, .sw = &sw},
+ .dhcpv4_options = dhcpv4_options@&nb::DHCP_Options{.options = options})
+ if lsp.__type != "external") {
+ (Some{var server_id}, Some{var server_mac}, Some{var lease_time}) =
+ (options.get("server_id"), options.get("server_mac"), options.get("lease_time")) in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 34000,
+ .__match = "outport == ${json_string_escape(lsp.name)} "
+ "&& eth.src == ${server_mac} "
+ "&& ip4.src == ${server_id} && udp && udp.src == 67 "
+ "&& udp.dst == 68",
+ .actions = if (sw.has_stateful_acl) "ct_commit; next;" else "next;",
+ .external_ids = stage_hint(dhcpv4_options._uuid))
+}
+
+for (SwitchPortDHCPv6Options(.port = &SwitchPort{.lsp = lsp, .sw = &sw},
+ .dhcpv6_options = dhcpv6_options@&nb::DHCP_Options{.options=options} )
+ if lsp.__type != "external") {
+    /* Get the link-local IP of the DHCPv6 server from the
+     * server MAC. */
+    Some{var server_mac} = options.get("server_id") in
+    Some{var ea} = eth_addr_from_string(server_mac) in
+    var server_ip = ipv6_string_mapped(in6_generate_lla(ea)) in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, ACL),
+ .priority = 34000,
+ .__match = "outport == ${json_string_escape(lsp.name)} "
+ "&& eth.src == ${server_mac} "
+ "&& ip6.src == ${server_ip} && udp && udp.src == 547 "
+ "&& udp.dst == 546",
+ .actions = if (sw.has_stateful_acl) "ct_commit; next;" else "next;",
+ .external_ids = stage_hint(dhcpv6_options._uuid))
+}
+
+relation QoSAction(qos: uuid, key_action: string, value_action: integer)
+
+QoSAction(qos, k, v) :-
+ nb::QoS(._uuid = qos, .action = actions),
+ var action = FlatMap(actions),
+ (var k, var v) = action.
+
+/* QoS rules */
+for (&Switch(.ls = ls)) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, QOS_MARK),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, QOS_MARK),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, QOS_METER),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, QOS_METER),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+for (SwitchQoS(.sw = &sw, .qos = &qos)) {
+    var ingress = qos.direction == "from-lport" in
+ var pipeline = if ingress "ingress" else "egress" in {
+ var stage = if (ingress) { switch_stage(IN, QOS_MARK) } else { switch_stage(OUT, QOS_MARK) } in
+ /* FIXME: Can value_action be negative? */
+ for (QoSAction(qos._uuid, key_action, value_action)) {
+ if (key_action == "dscp") {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = stage,
+ .priority = qos.priority,
+ .__match = qos.__match,
+ .actions = "ip.dscp = ${value_action}; next;",
+ .external_ids = stage_hint(qos._uuid))
+ }
+ };
+
+ (var burst, var rate) = {
+ var rate = 0;
+ var burst = 0;
+ for (bw in qos.bandwidth) {
+ /* FIXME: Can value_bandwidth be negative? */
+ (var key_bandwidth, var value_bandwidth) = bw;
+ if (key_bandwidth == "rate") {
+ rate = value_bandwidth
+ } else if (key_bandwidth == "burst") {
+ burst = value_bandwidth
+ } else ()
+ };
+ (burst, rate)
+ } in
+ if (rate != 0) {
+ var stage = if (ingress) { switch_stage(IN, QOS_METER) } else { switch_stage(OUT, QOS_METER) } in
+ var meter_action = if (burst != 0) {
+ "set_meter(${rate}, ${burst}); next;"
+ } else {
+ "set_meter(${rate}); next;"
+ } in
+ /* Ingress and Egress QoS Meter Table.
+ *
+ * We limit the bandwidth of this flow by adding a meter table.
+ */
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = stage,
+ .priority = qos.priority,
+ .__match = qos.__match,
+ .actions = meter_action,
+ .external_ids = stage_hint(qos._uuid))
+ }
+ }
+}
+
+/* LB rules */
+for (&Switch(.ls = ls, .has_lb_vip = has_lb_vip)) {
+ /* Ingress and Egress LB Table (Priority 0): Packets are allowed by
+ * default. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, LB),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, LB),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ if (not ls.load_balancer.is_empty()) {
+ for (&SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "router"},
+ .json_name = lsp_name,
+ .sw = &Switch{.ls = ls})) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, LB),
+ .priority = 65535,
+ .__match = "ip && inport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid));
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, LB),
+ .priority = 65535,
+ .__match = "ip && outport == ${lsp_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+ }
+ };
+
+ if (has_lb_vip) {
+ /* Ingress and Egress LB Table (Priority 65534).
+ *
+ * Send established traffic through conntrack for just NAT. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, LB),
+ .priority = 65534,
+ .__match = "ct.est && !ct.rel && !ct.new && !ct.inv && ct_label.natted == 1",
+ .actions = "${rEGBIT_CONNTRACK_NAT()} = 1; next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, LB),
+ .priority = 65534,
+ .__match = "ct.est && !ct.rel && !ct.new && !ct.inv && ct_label.natted == 1",
+ .actions = "${rEGBIT_CONNTRACK_NAT()} = 1; next;",
+ .external_ids = map_empty())
+ }
+}
+
+/* stateful rules */
+for (&Switch(.ls = ls)) {
+ /* Ingress and Egress stateful Table (Priority 0): Packets are
+ * allowed by default. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, STATEFUL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, STATEFUL),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* If REGBIT_CONNTRACK_COMMIT is set as 1, then the packets should be
+ * committed to conntrack. We always set ct_label.blocked to 0 here as
+ * any packet that makes it this far is part of a connection we
+ * want to allow to continue. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_COMMIT()} == 1",
+ .actions = "ct_commit { ct_label.blocked = 0; }; next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_COMMIT()} == 1",
+ .actions = "ct_commit { ct_label.blocked = 0; }; next;",
+ .external_ids = map_empty());
+
+ /* If REGBIT_CONNTRACK_NAT is set as 1, then packets should just be sent
+ * through nat (without committing).
+ *
+ * REGBIT_CONNTRACK_COMMIT is set for new connections and
+ * REGBIT_CONNTRACK_NAT is set for established connections. So they
+ * don't overlap.
+ */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_NAT()} == 1",
+ .actions = "ct_lb;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, STATEFUL),
+ .priority = 100,
+ .__match = "${rEGBIT_CONNTRACK_NAT()} == 1",
+ .actions = "ct_lb;",
+ .external_ids = map_empty())
+}
+
+/* Load balancing rules for new connections get committed to the conntrack
+ * table. So even if REGBIT_CONNTRACK_COMMIT is set in a previous table
+ * a higher priority rule for load balancing below also commits the
+ * connection, so it is okay if we do not hit the above match on
+ * REGBIT_CONNTRACK_COMMIT. */
+function get_match_for_lb_key(ip_address: v46_ip,
+ port: bit<16>,
+ protocol: Option<string>,
+ redundancy: bool): string = {
+ var port_match = if (port != 0) {
+ var proto = if (protocol == Some{"udp"}) {
+ "udp"
+ } else {
+ "tcp"
+ };
+ if (redundancy) { " && ${proto}" } else { "" } ++
+ " && ${proto}.dst == ${port}"
+ } else {
+ ""
+ };
+
+ var ip_match = match (ip_address) {
+ IPv4{ipv4} -> "ip4.dst == ${ipv4}",
+ IPv6{ipv6} -> "ip6.dst == ${ipv6}"
+ };
+
+ if (redundancy) { "ip && " } else { "" } ++ ip_match ++ port_match
+}
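The match built by get_match_for_lb_key() can be sketched in Python. This is a hypothetical model: a plain string stands in for the v46_ip type, with ':' marking an IPv6 address.

```python
def get_match_for_lb_key(ip_address, port, protocol, redundancy):
    """Build the L3/L4 destination match for a load-balancer VIP."""
    port_match = ""
    if port != 0:
        # Any protocol other than "udp" matches as TCP.
        proto = "udp" if protocol == "udp" else "tcp"
        if redundancy:
            port_match += f" && {proto}"
        port_match += f" && {proto}.dst == {port}"
    # ':' in the address marks IPv6.
    field = "ip6.dst" if ":" in ip_address else "ip4.dst"
    prefix = "ip && " if redundancy else ""
    return f"{prefix}{field} == {ip_address}{port_match}"
```

For instance, a VIP of 10.0.0.1:80 over TCP without redundancy yields `ip4.dst == 10.0.0.1 && tcp.dst == 80`.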
+/* New connections in Ingress table. */
+
+function ct_lb(backends: string,
+ selection_fields: Set<string>, protocol: Option<string>): string {
+ var args = vec_with_capacity(2);
+ args.push("backends=${backends}");
+
+ if (not selection_fields.is_empty()) {
+ var hash_fields = vec_with_capacity(selection_fields.size());
+ for (sf in selection_fields) {
+ var hf = match ((sf, protocol)) {
+ ("tp_src", Some{p}) -> "${p}_src",
+ ("tp_dst", Some{p}) -> "${p}_dst",
+ _ -> sf
+ };
+ hash_fields.push(hf);
+ };
+ args.push("hash_fields=" ++ json_string_escape(hash_fields.join(",")));
+ };
+
+ "ct_lb(" ++ args.join("; ") ++ ");"
+}
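A Python sketch of the ct_lb() helper above (a model, not the implementation): selection_fields is an ordered list here for deterministic output, and json.dumps stands in for json_string_escape.

```python
import json

def ct_lb(backends, selection_fields, protocol):
    """Render a ct_lb(...) logical-flow action string."""
    args = [f"backends={backends}"]
    if selection_fields:
        hash_fields = []
        for sf in selection_fields:
            # tp_src/tp_dst are rewritten to the protocol-specific field.
            if sf in ("tp_src", "tp_dst") and protocol is not None:
                hash_fields.append(f"{protocol}_{sf.split('_')[1]}")
            else:
                hash_fields.append(sf)
        args.append("hash_fields=" + json.dumps(",".join(hash_fields)))
    return "ct_lb(" + "; ".join(args) + ");"
```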
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, STATEFUL),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lb._uuid)) :-
+ sw in &Switch(),
+ LBVIPBackend[lbvipbackend],
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ var lbvip = lbvipbackend.lbvip,
+ var lb = lbvip.lb,
+ sw.ls.load_balancer.contains(lb._uuid),
+ bs in &LBVIPBackendStatus(.port = lbvipbackend.port,
+ .ip = lbvipbackend.ip,
+ .protocol = default_protocol(lb.protocol),
+ .logical_port = svc_monitor.port_name),
+ var bses = bs.group_by((sw, lbvip, lb)).to_set(),
+ var __match = "ct.new && " ++ get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, false),
+ var priority = if (lbvip.vip_port != 0) { 120 } else { 110 },
+ var up_backends = {
+ var up_backends = set_empty();
+ for (bs in bses) {
+ if (bs.up) {
+ up_backends.insert("${bs.ip}:${bs.port}")
+ }
+ };
+ up_backends
+ },
+ var actions = if (up_backends.is_empty()) {
+ "drop;"
+ } else {
+ ct_lb(up_backends.to_vec().join(","),
+ lb.selection_fields, lb.protocol)
+ }.
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, STATEFUL),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lb._uuid)) :-
+ sw in &Switch(),
+ LBVIPBackend[lbvipbackend],
+ None = lbvipbackend.svc_monitor,
+ var lbvip = lbvipbackend.lbvip,
+ var lb = lbvip.lb,
+ sw.ls.load_balancer.contains(lb._uuid),
+ var __match = "ct.new && " ++ get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, false),
+ var priority = if (lbvip.vip_port != 0) { 120 } else { 110 },
+ var actions = ct_lb(lbvip.backend_ips, lb.selection_fields, lb.protocol).
+
+/* Ingress Pre-Hairpin/Nat-Hairpin/Hairpin tables (Priority 0).
+ * Packets that don't need hairpinning should continue processing.
+ */
+Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, stage),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty()) :-
+ &Switch(.ls = nb::Logical_Switch{._uuid = ls_uuid}),
+ var stages = [PRE_HAIRPIN, NAT_HAIRPIN, HAIRPIN],
+ var stage = FlatMap(stages).
+for (&Switch(.ls = nb::Logical_Switch{._uuid = ls_uuid}, .has_lb_vip = true)) {
+ /* Check if the packet needs to be hairpinned. */
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, PRE_HAIRPIN),
+ .priority = 100,
+ .__match = "ip && ct.trk && ct.dnat",
+ .actions = "${rEGBIT_HAIRPIN()} = chk_lb_hairpin(); next;",
+ .external_ids = stage_hint(ls_uuid));
+
+ /* Check if the packet is a reply of hairpinned traffic. */
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, PRE_HAIRPIN),
+ .priority = 90,
+ .__match = "ip",
+         .actions = "${rEGBIT_HAIRPIN()} = chk_lb_hairpin_reply(); next;",
+ .external_ids = stage_hint(ls_uuid));
+
+ /* If packet needs to be hairpinned, snat the src ip with the VIP. */
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, NAT_HAIRPIN),
+ .priority = 100,
+ .__match = "ip && (ct.new || ct.est) && ct.trk && ct.dnat"
+ " && ${rEGBIT_HAIRPIN()} == 1",
+ .actions = "ct_snat_to_vip; next;",
+ .external_ids = stage_hint(ls_uuid));
+
+ /* For the reply of hairpinned traffic, snat the src ip to the VIP. */
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, NAT_HAIRPIN),
+ .priority = 90,
+ .__match = "ip && ${rEGBIT_HAIRPIN()} == 1",
+ .actions = "ct_snat;",
+ .external_ids = stage_hint(ls_uuid));
+
+ /* Ingress Hairpin table.
+ * - Priority 1: Packets that were SNAT-ed for hairpinning should be
+ * looped back (i.e., swap ETH addresses and send back on inport).
+ */
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, HAIRPIN),
+ .priority = 1,
+ .__match = "${rEGBIT_HAIRPIN()} == 1",
+ .actions = "eth.dst <-> eth.src;"
+ "outport = inport;"
+ "flags.loopback = 1;"
+ "output;",
+ .external_ids = stage_hint(ls_uuid))
+}
+
+/* Logical switch ingress table PORT_SEC_L2: ingress port security - L2 (priority 50)
+ ingress table PORT_SEC_IP: ingress port security - IP (priority 90 and 80)
+ ingress table PORT_SEC_ND: ingress port security - ND (priority 90 and 80) */
+for (&SwitchPort(.lsp = lsp, .sw = &sw, .json_name = json_name, .ps_eth_addresses = ps_eth_addresses)
+ if lsp.is_enabled() and lsp.__type != "external") {
+ for (pbinding in sb::Out_Port_Binding(.logical_port = lsp.name)) {
+ var __match = if (ps_eth_addresses.is_empty()) {
+ "inport == ${json_name}"
+ } else {
+ "inport == ${json_name} && eth.src == {${ps_eth_addresses.join(\" \")}}"
+ } in
+ var actions = match (pbinding.options.get("qdisc_queue_id")) {
+ None -> "next;",
+ Some{id} -> "set_queue(${id}); next;"
+ } in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_L2),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lsp._uuid))
+ }
+}
+
+/**
+ * Build port security constraints on IPv4 and IPv6 src and dst fields
+ * and add logical flows to S_SWITCH_(IN/OUT)_PORT_SEC_IP stage.
+ *
+ * For each port security of the logical port, the following
+ * logical flows are added
+ *   - If the port security has IPv4 addresses,
+ *     - Priority 90 flow to allow IPv4 packets for known IPv4 addresses
+ *
+ *   - If the port security has IPv6 addresses,
+ *     - Priority 90 flow to allow IPv6 packets for known IPv6 addresses
+ *
+ *   - If the port security has IPv4 addresses or IPv6 addresses or both
+ *     - Priority 80 flow to drop all IPv4 and IPv6 traffic
+ */
+for (SwitchPortPSAddresses(.port = &port@SwitchPort{.sw = &sw}, .ps_addrs = ps)
+ if port.is_enabled() and
+ (ps.ipv4_addrs.len() > 0 or ps.ipv6_addrs.len() > 0) and
+ port.lsp.__type != "external")
+{
+ if (ps.ipv4_addrs.len() > 0) {
+ var dhcp_match = "inport == ${port.json_name}"
+ " && eth.src == ${ps.ea}"
+ " && ip4.src == 0.0.0.0"
+ " && ip4.dst == 255.255.255.255"
+ " && udp.src == 68 && udp.dst == 67" in {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 90,
+ .__match = dhcp_match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ };
+ var addrs = {
+ var addrs = vec_empty();
+ for (addr in ps.ipv4_addrs) {
+ /* When the netmask is applied, if the host portion is
+ * non-zero, the host can only use the specified
+ * address. If zero, the host is allowed to use any
+ * address in the subnet.
+ */
+ addrs.push(ipv4_netaddr_match_host_or_network(addr))
+ };
+ addrs
+ } in
+ var __match =
+ "inport == ${port.json_name} && eth.src == ${ps.ea} && ip4.src == {" ++
+ addrs.join(", ") ++ "}" in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+ };
+ if (ps.ipv6_addrs.len() > 0) {
+ var dad_match = "inport == ${port.json_name}"
+ " && eth.src == ${ps.ea}"
+ " && ip6.src == ::"
+ " && ip6.dst == ff02::/16"
+ " && icmp6.type == {131, 135, 143}" in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 90,
+ .__match = dad_match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ };
+ var __match = "inport == ${port.json_name} && eth.src == ${ps.ea}" ++
+ build_port_security_ipv6_flow(IN, ps.ea, ps.ipv6_addrs) in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+ };
+ var __match = "inport == ${port.json_name} && eth.src == ${ps.ea} && ip" in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 80,
+ .__match = __match,
+ .actions = "drop;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+}
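The host-or-network rule described in the comment above (non-zero host bits restrict the port to that one address; all-zero host bits admit the whole subnet) can be sketched with Python's standard ipaddress module. This is a hypothetical model of ipv4_netaddr_match_host_or_network(), not the DDlog helper itself.

```python
import ipaddress

def match_host_or_network(addr_with_prefix):
    """Return the match operand for one port-security IPv4 address."""
    iface = ipaddress.ip_interface(addr_with_prefix)
    if iface.ip == iface.network.network_address and iface.network.prefixlen < 32:
        # Host bits are all zero: allow any address in the subnet.
        return str(iface.network)   # e.g. "10.0.0.0/24"
    # Host bits are set: only this exact address is allowed.
    return str(iface.ip)            # e.g. "10.0.0.5"
```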
+
+/**
+ * Build port security constraints on ARP and IPv6 ND fields
+ * and add logical flows to S_SWITCH_IN_PORT_SEC_ND stage.
+ *
+ * For each port security of the logical port, the following
+ * logical flows are added
+ * - If the port security has no IP (both IPv4 and IPv6) or
+ * if it has IPv4 address(es)
+ * - Priority 90 flow to allow ARP packets for known MAC addresses
+ * in the eth.src and arp.spa fields. If the port security
+ * has IPv4 addresses, allow known IPv4 addresses in the arp.tpa field.
+ *
+ * - If the port security has no IP (both IPv4 and IPv6) or
+ * if it has IPv6 address(es)
+ * - Priority 90 flow to allow IPv6 ND packets for known MAC addresses
+ * in the eth.src and nd.sll/nd.tll fields. If the port security
+ * has IPv6 addresses, allow known IPv6 addresses in the nd.target field
+ *      for IPv6 Neighbor Advertisement packets.
+ *
+ * - Priority 80 flow to drop ARP and IPv6 ND packets.
+ */
+for (SwitchPortPSAddresses(.port = &port@SwitchPort{.sw = &sw}, .ps_addrs = ps)
+ if port.is_enabled() and port.lsp.__type != "external")
+{
+ var no_ip = ps.ipv4_addrs.is_empty() and ps.ipv6_addrs.is_empty() in
+ {
+ if (not ps.ipv4_addrs.is_empty() or no_ip) {
+ var __match = {
+ var prefix = "inport == ${port.json_name} && eth.src == ${ps.ea} && arp.sha == ${ps.ea}";
+ if (not ps.ipv4_addrs.is_empty()) {
+ var spas = vec_empty();
+ for (addr in ps.ipv4_addrs) {
+ spas.push(ipv4_netaddr_match_host_or_network(addr))
+ };
+ prefix ++ " && arp.spa == {${spas.join(\", \")}}"
+ } else {
+ prefix
+ }
+ } in {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_ND),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+ };
+ if (not ps.ipv6_addrs.is_empty() or no_ip) {
+ var __match = "inport == ${port.json_name} && eth.src == ${ps.ea}" ++
+ build_port_security_ipv6_nd_flow(ps.ea, ps.ipv6_addrs) in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_ND),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+ };
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_ND),
+ .priority = 80,
+ .__match = "inport == ${port.json_name} && (arp || nd)",
+ .actions = "drop;",
+ .external_ids = stage_hint(port.lsp._uuid))
+ }
+}
+
+/* Ingress table PORT_SEC_ND and PORT_SEC_IP: Port security - IP and ND, by
+ * default goto next. (priority 0)*/
+for (&Switch(.ls = ls)) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_ND),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, PORT_SEC_IP),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* Ingress table ARP_ND_RSP: ARP/ND responder, skip requests coming from
+ * localnet and vtep ports. (priority 100); see ovn-northd.8.xml for the
+ * rationale. */
+for (&SwitchPort(.lsp = lsp, .sw = &sw, .json_name = json_name)
+ if lsp.is_enabled() and
+ (lsp.__type == "localnet" or lsp.__type == "vtep"))
+{
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 100,
+ .__match = "inport == ${json_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+function lsp_is_up(lsp: nb::Logical_Switch_Port): bool = {
+ lsp.up == Some{true}
+}
+
+/* Ingress table ARP_ND_RSP: ARP/ND responder, reply for known IPs.
+ * (priority 50). */
+/* Handle
+ *  - GARPs for a virtual IP that belongs to a logical port
+ *    of type 'virtual', and bind that port.
+ *
+ *  - ARP replies from a virtual IP that belongs to a logical
+ *    port of type 'virtual', and bind that port.
+ */
+ Flow(.logical_datapath = sp.sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 100,
+ .__match = "inport == ${vp.json_name} && "
+ "((arp.op == 1 && arp.spa == ${virtual_ip} && arp.tpa == ${virtual_ip}) || "
+ "(arp.op == 2 && arp.spa == ${virtual_ip}))",
+ .actions = "bind_vport(${sp.json_name}, inport); next;",
+ .external_ids = stage_hint(lsp._uuid)) :-
+ sp in &SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "virtual"}),
+ Some{var virtual_ip} = lsp.options.get("virtual-ip"),
+ Some{var virtual_parents} = lsp.options.get("virtual-parents"),
+ Some{var ip} = ip_parse(virtual_ip),
+ var vparent = FlatMap(string_split(virtual_parents, ",")),
+ vp in &SwitchPort(.lsp = nb::Logical_Switch_Port{.name = vparent}),
+ vp.sw == sp.sw.
+
+/*
+ * Add ARP/ND reply flows if either the
+ * - port is up and it doesn't have 'unknown' address defined or
+ * - port type is router or
+ * - port type is localport
+ */
+for (CheckLspIsUp[check_lsp_is_up]) {
+ for (SwitchPortIPv4Address(.port = &SwitchPort{.lsp = lsp, .sw = &sw, .json_name = json_name},
+ .ea = ea, .addr = addr)
+ if lsp.is_enabled() and
+ ((lsp_is_up(lsp) or not check_lsp_is_up)
+ or lsp.__type == "router" or lsp.__type == "localport") and
+ lsp.__type != "external" and lsp.__type != "virtual" and
+ not lsp.addresses.contains("unknown"))
+ {
+ var __match = "arp.tpa == ${addr.addr} && arp.op == 1" in
+ {
+ var actions = "eth.dst = eth.src; "
+ "eth.src = ${ea}; "
+ "arp.op = 2; /* ARP reply */ "
+ "arp.tha = arp.sha; "
+ "arp.sha = ${ea}; "
+ "arp.tpa = arp.spa; "
+ "arp.spa = ${addr.addr}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output;" in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lsp._uuid));
+
+ /* Do not reply to an ARP request from the port that owns the
+ * address (otherwise a DHCP client that ARPs to check for a
+ * duplicate address will fail). Instead, forward it the usual
+ * way.
+ *
+ * (Another alternative would be to simply drop the packet. If
+ * everything is working as it is configured, then this would
+ * produce equivalent results, since no one should reply to the
+ * request. But ARPing for one's own IP address is intended to
+ * detect situations where the network is not working as
+ * configured, so dropping the request would frustrate that
+ * intent.) */
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 100,
+ .__match = __match ++ " && inport == ${json_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+ }
+ }
+}
+
+/* For ND solicitations, we need to listen for both the
+ * unicast IPv6 address and its all-nodes multicast address,
+ * but always respond with the unicast IPv6 address. */
+for (SwitchPortIPv6Address(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = &sw},
+ .ea = ea, .addr = addr)
+ if lsp.is_enabled() and
+ (lsp_is_up(lsp) or lsp.__type == "router" or lsp.__type == "localport") and
+ lsp.__type != "external" and lsp.__type != "virtual")
+{
+ var __match = "nd_ns && ip6.dst == {${addr.addr}, ${ipv6_netaddr_solicited_node(addr)}} && nd.target == ${addr.addr}" in
+ var actions = "${if (lsp.__type == \"router\") \"nd_na_router\" else \"nd_na\"} { "
+ "eth.src = ${ea}; "
+ "ip6.src = ${addr.addr}; "
+ "nd.target = ${addr.addr}; "
+ "nd.tll = ${ea}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output; "
+ "};" in
+ {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lsp._uuid));
+
+ /* Do not reply to a solicitation from the port that owns the
+ * address (otherwise DAD detection will fail). */
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 100,
+ .__match = __match ++ " && inport == ${json_name}",
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+ }
+}
+
+/* Ingress table ARP_ND_RSP: ARP/ND responder, by default goto next.
+ * (priority 0)*/
+for (ls in nb::Logical_Switch) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* Ingress table ARP_ND_RSP: ARP/ND responder for service monitor source ip.
+ * (priority 110)*/
+Flow(.logical_datapath = sp.sw.ls._uuid,
+ .stage = switch_stage(IN, ARP_ND_RSP),
+ .priority = 110,
+ .__match = "arp.tpa == ${svc_mon_src_ip} && arp.op == 1",
+ .actions = "eth.dst = eth.src; "
+ "eth.src = ${svc_monitor_mac}; "
+ "arp.op = 2; /* ARP reply */ "
+ "arp.tha = arp.sha; "
+ "arp.sha = ${svc_monitor_mac}; "
+ "arp.tpa = arp.spa; "
+ "arp.spa = ${svc_mon_src_ip}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output;",
+ .external_ids = stage_hint(lbvipbackend.lbvip.lb._uuid)) :-
+ LBVIPBackend[lbvipbackend],
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ sp in &SwitchPort(
+ .lsp = nb::Logical_Switch_Port{.name = svc_monitor.port_name}),
+ var svc_mon_src_ip = svc_monitor.src_ip,
+ SvcMonitorMac(svc_monitor_mac).
+
+function build_dhcpv4_action(
+ lsp_json_key: string,
+ dhcpv4_options: nb::DHCP_Options,
+ offer_ip: in_addr) : Option<(string, string, string)> =
+{
+ match (ip_parse_masked(dhcpv4_options.cidr)) {
+ Left{err} -> {
+ /* cidr defined is invalid */
+ None
+ },
+ Right{(var host_ip, var mask)} -> {
+ if (not ip_same_network((offer_ip, host_ip), mask)) {
+ /* the offer ip of the logical port doesn't belong to the cidr
+ * defined in the DHCPv4 options.
+ */
+ None
+ } else {
+ match ((dhcpv4_options.options.get("server_id"),
+ dhcpv4_options.options.get("server_mac"),
+ dhcpv4_options.options.get("lease_time")))
+ {
+ (Some{var server_ip}, Some{var server_mac}, Some{var lease_time}) -> {
+ var options_map = dhcpv4_options.options;
+
+                /* server_mac is not a DHCPv4 option; delete it from the smap. */
+ options_map.remove("server_mac");
+ options_map.insert("netmask", "${mask}");
+
+ /* We're not using SMAP_FOR_EACH because we want a consistent order of the
+ * options on different architectures (big or little endian, SSE4.2) */
+ var options = vec_empty();
+ for (node in options_map) {
+ (var k, var v) = node;
+ options.push("${k} = ${v}")
+ };
+ var options_action = "${rEGBIT_DHCP_OPTS_RESULT()} = put_dhcp_opts(offerip = ${offer_ip}, " ++
+ options.join(", ") ++ "); next;";
+ var response_action = "eth.dst = eth.src; eth.src = ${server_mac}; "
+ "ip4.src = ${server_ip}; udp.src = 67; "
+ "udp.dst = 68; outport = inport; flags.loopback = 1; "
+ "output;";
+
+ var ipv4_addr_match = "ip4.src == ${offer_ip} && ip4.dst == {${server_ip}, 255.255.255.255}";
+ Some{(options_action, response_action, ipv4_addr_match)}
+ },
+ _ -> {
+ /* "server_id", "server_mac" and "lease_time" should be
+ * present in the dhcp_options. */
+ //static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+ warn("Required DHCPv4 options not defined for lport - ${lsp_json_key}");
+ None
+ }
+ }
+ }
+ }
+ }
+}
+
+function build_dhcpv6_action(
+ lsp_json_key: string,
+ dhcpv6_options: nb::DHCP_Options,
+ offer_ip: in6_addr): Option<(string, string)> =
+{
+ match (ipv6_parse_masked(dhcpv6_options.cidr)) {
+ Left{err} -> {
+ /* cidr defined is invalid */
+ //warn("cidr is invalid - ${err}");
+ None
+ },
+ Right{(var host_ip, var mask)} -> {
+ if (not ipv6_same_network((offer_ip, host_ip), mask)) {
+                /* offer_ip doesn't belong to the cidr defined in lport's DHCPv6
+                 * options. */
+ //warn("ip does not belong to cidr");
+ None
+ } else {
+ /* "server_id" should be the MAC address. */
+ match (dhcpv6_options.options.get("server_id")) {
+ None -> {
+ warn("server_id not present in the DHCPv6 options for lport ${lsp_json_key}");
+ None
+ },
+ Some{server_mac} -> {
+ match (eth_addr_from_string(server_mac)) {
+ None -> {
+                            warn("server_id in the DHCPv6 options for lport ${lsp_json_key} is not a valid MAC address");
+ None
+ },
+ Some{ea} -> {
+ /* Get the link local IP of the DHCPv6 server from the server MAC. */
+ var server_ip = ipv6_string_mapped(in6_generate_lla(ea));
+ var ia_addr = ipv6_string_mapped(offer_ip);
+ var options = vec_empty();
+
+ /* Check whether the dhcpv6 options should be configured as stateful.
+ * Only reply with ia_addr option for dhcpv6 stateful address mode. */
+ if (map_get_bool_def(dhcpv6_options.options, "dhcpv6_stateless", false) == false) {
+ options.push("ia_addr = ${ia_addr}")
+ } else ();
+
+ /* We're not using SMAP_FOR_EACH because we want a consistent order of the
+ * options on different architectures (big or little endian, SSE4.2) */
+ // FIXME: enumerate map in ascending order of keys. Is this good enough?
+ for (node in dhcpv6_options.options) {
+ (var k, var v) = node;
+ if (k != "dhcpv6_stateless") {
+ options.push("${k} = ${v}")
+ } else ()
+ };
+
+ var options_action = "${rEGBIT_DHCP_OPTS_RESULT()} = put_dhcpv6_opts(" ++
+ options.join(", ") ++
+ "); next;";
+ var response_action = "eth.dst = eth.src; eth.src = ${server_mac}; "
+ "ip6.dst = ip6.src; ip6.src = ${server_ip}; udp.src = 547; "
+ "udp.dst = 546; outport = inport; flags.loopback = 1; "
+ "output;";
+ Some{(options_action, response_action)}
+ }
+ }
+ }
+ }
+ }
+ }
+ }
+}
+
+/* If 'names' has one element, returns json_string_escape() for it.
+ * Otherwise, returns json_string_escape() of all of its elements inside "{...}".
+ */
+function json_string_escape_vec(names: Vec<string>): string
+{
+ match ((names.len(), names.nth(0))) {
+ (1, Some{name}) -> json_string_escape(name),
+ _ -> {
+ var json_names = vec_with_capacity(names.len());
+ for (name in names) {
+ json_names.push(json_string_escape(name));
+ };
+ "{" ++ json_names.join(", ") ++ "}"
+ }
+ }
+}
+
+/*
+ * Ordinarily, returns a single match against 'lsp'.
+ *
+ * If 'lsp' is an external port, returns a match against the localnet port(s) on
+ * its switch along with a condition that it only operate if 'lsp' is
+ * chassis-resident. This makes sense as a condition for sending DHCP replies
+ * to external ports because only one chassis should send such a reply.
+ *
+ * Returns a prefix and a suffix string. There is no reason for this except
+ * that it makes it possible to exactly mimic the format used by ovn-northd.c
+ * so that text-based comparisons do not show differences. (This fails if
+ * there's more than one localnet port since the C version uses multiple flows
+ * in that case.)
+ */
+function match_dhcp_input(lsp: Ref<SwitchPort>): (string, string) =
+{
+ if (lsp.lsp.__type == "external" and not lsp.sw.localnet_port_names.is_empty()) {
+ ("inport == " ++ json_string_escape_vec(lsp.sw.localnet_port_names) ++ " && ",
+ " && is_chassis_resident(${lsp.json_name})")
+ } else {
+ ("inport == ${lsp.json_name} && ", "")
+ }
+}
+
+/* Logical switch ingress tables DHCP_OPTIONS and DHCP_RESPONSE: DHCP options
+ * and response priority 100 flows. */
+for (lsp in &SwitchPort
+ /* Don't add the DHCP flows if the port is not enabled or if the
+ * port is a router port. */
+ if (lsp.is_enabled() and lsp.lsp.__type != "router")
+ /* If it's an external port and there is no localnet port
+ * and if it doesn't belong to an HA chassis group ignore it. */
+ and (lsp.lsp.__type != "external"
+ or (not lsp.sw.localnet_port_names.is_empty()
+ and is_some(lsp.lsp.ha_chassis_group))))
+{
+ for (lps in LogicalSwitchPort(.lport = lsp.lsp._uuid, .lswitch = lsuuid)) {
+ var json_key = json_string_escape(lsp.lsp.name) in
+ (var pfx, var sfx) = match_dhcp_input(lsp) in
+ {
+ /* DHCPv4 options enabled for this port */
+ Some{var dhcpv4_options_uuid} = lsp.lsp.dhcpv4_options in
+ {
+ for (dhcpv4_options in nb::DHCP_Options(._uuid = dhcpv4_options_uuid)) {
+ for (SwitchPortIPv4Address(.port = &SwitchPort{.lsp = nb::Logical_Switch_Port{._uuid = lsp.lsp._uuid}}, .ea = ea, .addr = addr)) {
+ Some{(var options_action, var response_action, var ipv4_addr_match)} =
+ build_dhcpv4_action(json_key, dhcpv4_options, addr.addr) in
+ {
+ var __match =
+ pfx ++ "eth.src == ${ea} && "
+ "ip4.src == 0.0.0.0 && ip4.dst == 255.255.255.255 && "
+ "udp.src == 68 && udp.dst == 67" ++ sfx
+ in
+ Flow(.logical_datapath = lsuuid,
+ .stage = switch_stage(IN, DHCP_OPTIONS),
+ .priority = 100,
+ .__match = __match,
+ .actions = options_action,
+ .external_ids = stage_hint(lsp.lsp._uuid));
+
+ /* Allow ip4.src = OFFER_IP and
+ * ip4.dst = {SERVER_IP, 255.255.255.255} for the below
+ * cases
+ * - When the client wants to renew the IP by sending
+ * the DHCPREQUEST to the server ip.
+ * - When the client wants to renew the IP by
+ * broadcasting the DHCPREQUEST.
+ */
+ var __match = pfx ++ "eth.src == ${ea} && "
+ "${ipv4_addr_match} && udp.src == 68 && udp.dst == 67" ++ sfx in
+ Flow(.logical_datapath = lsuuid,
+ .stage = switch_stage(IN, DHCP_OPTIONS),
+ .priority = 100,
+ .__match = __match,
+ .actions = options_action,
+ .external_ids = stage_hint(lsp.lsp._uuid));
+
+ /* If REGBIT_DHCP_OPTS_RESULT is set, it means the
+ * put_dhcp_opts action is successful. */
+ var __match = pfx ++ "eth.src == ${ea} && "
+ "ip4 && udp.src == 68 && udp.dst == 67 && " ++
+ rEGBIT_DHCP_OPTS_RESULT() ++ sfx in
+ Flow(.logical_datapath = lsuuid,
+ .stage = switch_stage(IN, DHCP_RESPONSE),
+ .priority = 100,
+ .__match = __match,
+ .actions = response_action,
+ .external_ids = stage_hint(lsp.lsp._uuid))
+ // FIXME: is there a constraint somewhere that guarantees that build_dhcpv4_action
+ // returns Some() for at most 1 address in lsp_addrs? Otherwise, simulate this break
+ // by computing an aggregate that returns the first element of a group.
+ //break;
+ }
+ }
+ }
+ };
+
+ /* DHCPv6 options enabled for this port */
+ Some{var dhcpv6_options_uuid} = lsp.lsp.dhcpv6_options in
+ {
+ for (dhcpv6_options in nb::DHCP_Options(._uuid = dhcpv6_options_uuid)) {
+ for (SwitchPortIPv6Address(.port = &SwitchPort{.lsp = nb::Logical_Switch_Port{._uuid = lsp.lsp._uuid}}, .ea = ea, .addr = addr)) {
+ Some{(var options_action, var response_action)} =
+ build_dhcpv6_action(json_key, dhcpv6_options, addr.addr) in
+ {
+ var __match = pfx ++ "eth.src == ${ea}"
+ " && ip6.dst == ff02::1:2 && udp.src == 546 &&"
+ " udp.dst == 547" ++ sfx in
+ {
+ Flow(.logical_datapath = lsuuid,
+ .stage = switch_stage(IN, DHCP_OPTIONS),
+ .priority = 100,
+ .__match = __match,
+ .actions = options_action,
+ .external_ids = stage_hint(lsp.lsp._uuid));
+
+ /* If REGBIT_DHCP_OPTS_RESULT is set to 1, it means the
+ * put_dhcpv6_opts action is successful */
+ Flow(.logical_datapath = lsuuid,
+ .stage = switch_stage(IN, DHCP_RESPONSE),
+ .priority = 100,
+ .__match = __match ++ " && ${rEGBIT_DHCP_OPTS_RESULT()}",
+ .actions = response_action,
+ .external_ids = stage_hint(lsp.lsp._uuid))
+                        // FIXME: is there a constraint somewhere that guarantees that build_dhcpv6_action
+                        // returns Some() for at most 1 address in lsp_addrs? Otherwise, simulate this break
+ // by computing an aggregate that returns the first element of a group.
+ //break;
+ }
+ }
+ }
+ }
+ }
+ }
+ }
+}
+
+/* Logical switch ingress tables DNS_LOOKUP and DNS_RESPONSE: DNS lookup and
+ * response priority 100 flows.
+ */
+for (LogicalSwitchHasDNSRecords(ls, true))
+{
+ Flow(.logical_datapath = ls,
+ .stage = switch_stage(IN, DNS_LOOKUP),
+ .priority = 100,
+ .__match = "udp.dst == 53",
+ .actions = "${rEGBIT_DNS_LOOKUP_RESULT()} = dns_lookup(); next;",
+ .external_ids = map_empty());
+
+ var action = "eth.dst <-> eth.src; ip4.src <-> ip4.dst; "
+ "udp.dst = udp.src; udp.src = 53; outport = inport; "
+ "flags.loopback = 1; output;" in
+ Flow(.logical_datapath = ls,
+ .stage = switch_stage(IN, DNS_RESPONSE),
+ .priority = 100,
+ .__match = "udp.dst == 53 && ${rEGBIT_DNS_LOOKUP_RESULT()}",
+ .actions = action,
+ .external_ids = map_empty());
+
+ var action = "eth.dst <-> eth.src; ip6.src <-> ip6.dst; "
+ "udp.dst = udp.src; udp.src = 53; outport = inport; "
+ "flags.loopback = 1; output;" in
+ Flow(.logical_datapath = ls,
+ .stage = switch_stage(IN, DNS_RESPONSE),
+ .priority = 100,
+ .__match = "udp.dst == 53 && ${rEGBIT_DNS_LOOKUP_RESULT()}",
+ .actions = action,
+ .external_ids = map_empty())
+}
+
+/* Ingress table DHCP_OPTIONS and DHCP_RESPONSE: DHCP options and response, by
+ * default goto next. (priority 0).
+ *
+ * Ingress table DNS_LOOKUP and DNS_RESPONSE: DNS lookup and response, by
+ * default goto next. (priority 0).
+ *
+ * Ingress table EXTERNAL_PORT - External port handling, by default goto next.
+ * (priority 0). */
+for (ls in nb::Logical_Switch) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, DHCP_OPTIONS),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, DHCP_RESPONSE),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, DNS_LOOKUP),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, DNS_RESPONSE),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, EXTERNAL_PORT),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 110,
+ .__match = "eth.dst == $svc_monitor_mac",
+ .actions = "handle_svc_check(inport);",
+ .external_ids = map_empty()) :-
+ sw in &Switch().
+
+for (sw in &Switch(.ls = ls, .mcast_cfg = &mcast_cfg)
+ if (mcast_cfg.enabled)) {
+ for (SwitchMcastFloodRelayPorts(sw, relay_ports)) {
+ for (SwitchMcastFloodReportPorts(sw, flood_report_ports)) {
+ for (SwitchMcastFloodPorts(sw, flood_ports)) {
+ var flood_relay = not relay_ports.is_empty() in
+ var flood_reports = not flood_report_ports.is_empty() in
+ var flood_static = not flood_ports.is_empty() in
+ var igmp_act = {
+ if (flood_reports) {
+ var mrouter_static = json_string_escape(mC_MROUTER_STATIC().0);
+ "clone { "
+ "outport = ${mrouter_static}; "
+ "output; "
+ "};igmp;"
+ } else {
+ "igmp;"
+ }
+ } in {
+ /* Punt IGMP traffic to controller. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 100,
+ .__match = "ip4 && ip.proto == 2",
+ .actions = "${igmp_act}",
+ .external_ids = map_empty());
+
+ /* Punt MLD traffic to controller. */
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 100,
+ .__match = "mldv1 || mldv2",
+ .actions = "${igmp_act}",
+ .external_ids = map_empty());
+
+ /* Flood all IP multicast traffic destined to 224.0.0.X to
+ * all ports - RFC 4541, section 2.1.2, item 2.
+ */
+ var flood = json_string_escape(mC_FLOOD().0) in
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 85,
+ .__match = "ip4.mcast && ip4.dst == 224.0.0.0/24",
+ .actions = "outport = ${flood}; output;",
+ .external_ids = map_empty());
+
+ /* Flood all IPv6 multicast traffic destined to reserved
+ * multicast IPs (RFC 4291, 2.7.1).
+ */
+ var flood = json_string_escape(mC_FLOOD().0) in
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 85,
+ .__match = "ip6.mcast_flood",
+ .actions = "outport = ${flood}; output;",
+ .external_ids = map_empty());
+
+                    /* Forward unregistered IP multicast to routers with relay
+                     * enabled and to any ports configured to flood IP
+                     * multicast traffic. If configured to flood unregistered
+                     * traffic, this is handled by the L2 multicast flow.
+                     */
+ if (not mcast_cfg.flood_unreg) {
+ var relay_act = {
+ if (flood_relay) {
+ var rtr_flood = json_string_escape(mC_MROUTER_FLOOD().0);
+ "clone { "
+ "outport = ${rtr_flood}; "
+ "output; "
+ "}; "
+ } else {
+ ""
+ }
+ } in
+ var static_act = {
+ if (flood_static) {
+ var mc_static = json_string_escape(mC_STATIC().0);
+ "outport =${mc_static}; output;"
+ } else {
+ ""
+ }
+ } in
+ var drop_act = {
+ if (not flood_relay and not flood_static) {
+ "drop;"
+ } else {
+ ""
+ }
+ } in
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 80,
+ .__match = "ip4.mcast || ip6.mcast",
+ .actions =
+ "${relay_act}${static_act}${drop_act}",
+ .external_ids = map_empty())
+ }
+ }
+ }
+ }
+ }
+}
+
+/* Ingress table L2_LKUP: Add IP multicast flows learnt from IGMP/MLD (priority
+ * 90). */
+for (IgmpSwitchMulticastGroup(.address = address, .switch = &sw)) {
+ /* RFC 4541, section 2.1.2, item 2: Skip groups in the 224.0.0.X
+ * range.
+ *
+ * RFC 4291, section 2.7.1: Skip groups that correspond to all
+ * hosts.
+ */
+ Some{var ip} = ip46_parse(address) in
+ (var skip_address) = match (ip) {
+ IPv4{ipv4} -> ip_is_local_multicast(ipv4),
+ IPv6{ipv6} -> ipv6_is_all_hosts(ipv6)
+ } in
+ var ipX = ip46_ipX(ip) in
+ for (SwitchMcastFloodRelayPorts(&sw, relay_ports) if not skip_address) {
+ for (SwitchMcastFloodPorts(&sw, flood_ports)) {
+ var flood_relay = not relay_ports.is_empty() in
+ var flood_static = not flood_ports.is_empty() in
+ var mc_rtr_flood = json_string_escape(mC_MROUTER_FLOOD().0) in
+ var mc_static = json_string_escape(mC_STATIC().0) in
+ var relay_act = {
+ if (flood_relay) {
+ "clone { "
+ "outport = ${mc_rtr_flood}; output; "
+ "};"
+ } else {
+ ""
+ }
+ } in
+ var static_act = {
+ if (flood_static) {
+ "clone { "
+ "outport =${mc_static}; "
+ "output; "
+ "};"
+ } else {
+ ""
+ }
+ } in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 90,
+ .__match = "eth.mcast && ${ipX} && ${ipX}.dst == ${address}",
+ .actions =
+ "${relay_act} ${static_act} outport = \"${address}\"; "
+ "output;",
+ .external_ids = map_empty())
+ }
+ }
+}
+
+/* Table EXTERNAL_PORT: External port. Drop ARP requests for router IPs from
+ * external ports on chassis not binding those ports. This ensures that the
+ * router pipeline runs only on the chassis binding the external ports.
+ *
+ * For an external port X on logical switch LS, if X is not resident on this
+ * chassis, drop ARP requests arriving on localnet ports from X's Ethernet
+ * address, if the ARP request is asking to translate the IP address of a
+ * router port on LS. */
+Flow(.logical_datapath = sp.sw.ls._uuid,
+ .stage = switch_stage(IN, EXTERNAL_PORT),
+ .priority = 100,
+ .__match = ("inport == ${json_string_escape(localnet_port_name)} && "
+ "eth.src == ${lp_addr.ea} && "
+ "!is_chassis_resident(${sp.json_name}) && "
+ "arp.tpa == ${rp_addr.addr} && arp.op == 1"),
+ .actions = "drop;",
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(),
+ sp.lsp.__type == "external",
+ var localnet_port_name = FlatMap(sp.sw.localnet_port_names),
+ var lp_addr = FlatMap(sp.static_addresses),
+ rp in &SwitchPort(.sw = sp.sw),
+ rp.lsp.__type == "router",
+ SwitchPortIPv4Address(.port = rp, .addr = rp_addr).
+Flow(.logical_datapath = sp.sw.ls._uuid,
+ .stage = switch_stage(IN, EXTERNAL_PORT),
+ .priority = 100,
+ .__match = ("inport == ${json_string_escape(localnet_port_name)} && "
+ "eth.src == ${lp_addr.ea} && "
+ "!is_chassis_resident(${sp.json_name}) && "
+ "nd_ns && ip6.dst == {${rp_addr.addr}, ${ipv6_netaddr_solicited_node(rp_addr)}} && "
+ "nd.target == ${rp_addr.addr}"),
+ .actions = "drop;",
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(),
+ sp.lsp.__type == "external",
+ var localnet_port_name = FlatMap(sp.sw.localnet_port_names),
+ var lp_addr = FlatMap(sp.static_addresses),
+ rp in &SwitchPort(.sw = sp.sw),
+ rp.lsp.__type == "router",
+ SwitchPortIPv6Address(.port = rp, .addr = rp_addr).
+Flow(.logical_datapath = sp.sw.ls._uuid,
+ .stage = switch_stage(IN, EXTERNAL_PORT),
+ .priority = 100,
+ .__match = ("inport == ${json_string_escape(localnet_port_name)} && "
+ "eth.src == ${lp_addr.ea} && "
+ "eth.dst == ${ea} && "
+ "!is_chassis_resident(${sp.json_name})"),
+ .actions = "drop;",
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(),
+ sp.lsp.__type == "external",
+ var localnet_port_name = FlatMap(sp.sw.localnet_port_names),
+ var lp_addr = FlatMap(sp.static_addresses),
+ rp in &SwitchPort(.sw = sp.sw),
+ rp.lsp.__type == "router",
+ SwitchPortAddresses(.port = rp, .addrs = LPortAddress{.ea = ea}).
+
+/* Ingress table L2_LKUP: Destination lookup, broadcast and multicast handling
+ * (priority 100). */
+for (ls in nb::Logical_Switch) {
+ var mc_flood = json_string_escape(mC_FLOOD().0) in
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 70,
+ .__match = "eth.mcast",
+ .actions = "outport = ${mc_flood}; output;",
+ .external_ids = map_empty())
+}
+
+/* Ingress table L2_LKUP: Destination lookup, unicast handling (priority 50).
+*/
+for (SwitchPortStaticAddresses(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = &sw},
+ .addrs = addrs)
+ if lsp.__type != "external") {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 50,
+ .__match = "eth.dst == ${addrs.ea}",
+ .actions = "outport = ${json_name}; output;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+/*
+ * Ingress table L2_LKUP: Flows that flood self originated ARP/ND packets in the
+ * switching domain.
+ */
+/* Self originated ARP requests/ND need to be flooded to the L2 domain
+ * (except on router ports). Determine that packets are self originated
+ * by also matching on source MAC. Matching on ingress port is not
+ * reliable in case this is a VLAN-backed network.
+ * Priority: 75.
+ */
+
+/* Returns 'true' if the IP 'addr' is on the same subnet as one of the
+ * IPs configured on the router port.
+ */
+function lrouter_port_ip_reachable(rp: Ref<RouterPort>, addr: v46_ip): bool {
+ match (addr) {
+ IPv4{ipv4} -> {
+ for (na in rp.networks.ipv4_addrs) {
+ if (ip_same_network((ipv4, na.addr), ipv4_netaddr_mask(na))) {
+ return true
+ }
+ }
+ },
+ IPv6{ipv6} -> {
+ for (na in rp.networks.ipv6_addrs) {
+ if (ipv6_same_network((ipv6, na.addr), ipv6_netaddr_mask(na))) {
+ return true
+ }
+ }
+ }
+ };
+ false
+}
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 75,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
+ rp.is_enabled(),
+ var eth_src_set = {
+ var eth_src_set = set_singleton("${rp.networks.ea}");
+ for (nat in rp.router.nats) {
+ match (nat.nat.external_mac) {
+ Some{mac} ->
+ if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
+ eth_src_set.insert(mac)
+ } else (),
+ _ -> ()
+ }
+ };
+ eth_src_set
+ },
+ var eth_src = "{" ++ eth_src_set.to_vec().join(", ") ++ "}",
+ var __match = "eth.src == ${eth_src} && (arp.op == 1 || nd_ns)",
+ var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0),
+ var actions = "outport = ${mc_flood_l2}; output;".
+
+/* Forward ARP requests for owned IP addresses (L3, VIP, NAT) only to this
+ * router port.
+ * Priority: 80.
+ */
+function get_arp_forward_ips(rp: Ref<RouterPort>): (Set<string>, Set<string>) = {
+ var all_ips_v4 = set_empty();
+ var all_ips_v6 = set_empty();
+
+ (var lb_ips_v4, var lb_ips_v6)
+ = get_router_load_balancer_ips(deref(rp.router));
+ for (a in lb_ips_v4) {
+ /* Check if the ovn port has a network configured on which we could
+ * expect ARP requests for the LB VIP.
+ */
+ match (ip_parse(a)) {
+ Some{ipv4} -> if (lrouter_port_ip_reachable(rp, IPv4{ipv4})) {
+ all_ips_v4.insert(a)
+ },
+ _ -> ()
+ }
+ };
+ for (a in lb_ips_v6) {
+ /* Check if the ovn port has a network configured on which we could
+ * expect NS requests for the LB VIP.
+ */
+ match (ipv6_parse(a)) {
+ Some{ipv6} -> if (lrouter_port_ip_reachable(rp, IPv6{ipv6})) {
+ all_ips_v6.insert(a)
+ },
+ _ -> ()
+ }
+ };
+
+ for (nat in rp.router.nats) {
+ if (nat.nat.__type != "snat") {
+ /* Check if the ovn port has a network configured on which we could
+ * expect ARP requests/NS for the DNAT external_ip.
+ */
+ if (lrouter_port_ip_reachable(rp, nat.external_ip)) {
+ match (nat.external_ip) {
+ IPv4{_} -> all_ips_v4.insert(nat.nat.external_ip),
+ IPv6{_} -> all_ips_v6.insert(nat.nat.external_ip)
+ }
+ }
+ }
+ };
+
+ for (a in rp.networks.ipv4_addrs) {
+ all_ips_v4.insert("${a.addr}")
+ };
+ for (a in rp.networks.ipv6_addrs) {
+ all_ips_v6.insert("${a.addr}")
+ };
+
+ (all_ips_v4, all_ips_v6)
+}
+/* Packets received from VXLAN tunnels have already been through the
+ * router pipeline so we should skip them. Normally this is done by the
+ * multicast_group implementation (VXLAN packets skip table 32 which
+ * delivers to patch ports) but we're bypassing multicast_groups.
+ * (This is why we match against fLAGBIT_NOT_VXLAN() here.)
+ */
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 80,
+ .__match = fLAGBIT_NOT_VXLAN() ++
+ " && arp.op == 1 && arp.tpa == { " ++
+ all_ips_v4.to_vec().join(", ") ++ "}",
+ .actions = if (sw.has_non_router_port) {
+ "clone {outport = ${sp.json_name}; output; }; "
+ "outport = ${mc_flood_l2}; output;"
+ } else {
+ "outport = ${sp.json_name}; output;"
+ },
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
+ rp.is_enabled(),
+ (var all_ips_v4, _) = get_arp_forward_ips(rp),
+ not all_ips_v4.is_empty(),
+ var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 80,
+ .__match = fLAGBIT_NOT_VXLAN() ++
+ " && nd_ns && nd.target == { " ++
+ all_ips_v6.to_vec().join(", ") ++ "}",
+ .actions = if (sw.has_non_router_port) {
+ "clone {outport = ${sp.json_name}; output; }; "
+ "outport = ${mc_flood_l2}; output;"
+ } else {
+ "outport = ${sp.json_name}; output;"
+ },
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(.sw = sw, .peer = Some{rp}),
+ rp.is_enabled(),
+ (_, var all_ips_v6) = get_arp_forward_ips(rp),
+ not all_ips_v6.is_empty(),
+ var mc_flood_l2 = json_string_escape(mC_FLOOD_L2().0).
+
+for (SwitchPortNewDynamicAddress(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = &sw},
+ .address = Some{addrs})
+ if lsp.__type != "external") {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 50,
+ .__match = "eth.dst == ${addrs.ea}",
+ .actions = "outport = ${json_name}; output;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+for (&SwitchPort(.lsp = lsp,
+ .json_name = json_name,
+ .sw = &sw,
+ .peer = Some{&RouterPort{.lrp = lrp,
+ .is_redirect = is_redirect,
+ .router = &Router{.lr = lr,
+ .redirect_port_name = redirect_port_name}}})
+ if (lsp.addresses.contains("router") and lsp.__type != "external"))
+{
+ Some{var mac} = scan_eth_addr(lrp.mac) in {
+ var add_chassis_resident_check =
+ not sw.localnet_port_names.is_empty() and
+ (/* The peer of this port represents a distributed
+ * gateway port. The destination lookup flow for the
+ * router's distributed gateway port MAC address should
+ * only be programmed on the "redirect-chassis". */
+ is_redirect or
+ /* Check if the option 'reside-on-redirect-chassis'
+ * is set to true on the peer port. If set to true
+ * and if the logical switch has a localnet port, it
+ * means the router pipeline for the packets from
+ * this logical switch should be run on the chassis
+ * hosting the gateway port.
+ */
+ map_get_bool_def(lrp.options, "reside-on-redirect-chassis", false)) in
+ var __match = if (add_chassis_resident_check) {
+ /* The destination lookup flow for the router's
+ * distributed gateway port MAC address should only be
+ * programmed on the "redirect-chassis". */
+ "eth.dst == ${mac} && is_chassis_resident(${redirect_port_name})"
+ } else {
+ "eth.dst == ${mac}"
+ } in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 50,
+ .__match = __match,
+ .actions = "outport = ${json_name}; output;",
+ .external_ids = stage_hint(lsp._uuid));
+
+ /* Add ethernet addresses specified in NAT rules on
+ * distributed logical routers. */
+ if (is_redirect) {
+ for (LogicalRouterNAT(.lr = lr._uuid, .nat = nat)) {
+ if (nat.nat.__type == "dnat_and_snat") {
+ Some{var lport} = nat.nat.logical_port in
+ Some{var emac} = nat.nat.external_mac in
+ Some{var nat_mac} = eth_addr_from_string(emac) in
+ var __match = "eth.dst == ${nat_mac} && is_chassis_resident(${json_string_escape(lport)})" in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 50,
+ .__match = __match,
+ .actions = "outport = ${json_name}; output;",
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ }
+ }
+ }
+}
+// FIXME: do we care about this?
+/* } else {
+ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+
+ VLOG_INFO_RL(&rl,
+ "%s: invalid syntax '%s' in addresses column",
+ op->nbsp->name, op->nbsp->addresses[i]);
+ }*/
+
+/* Ingress table L2_LKUP: Destination lookup for unknown MACs (priority 0). */
+for (LogicalSwitchUnknownPorts(.ls = ls_uuid)) {
+ var mc_unknown = json_string_escape(mC_UNKNOWN().0) in
+ Flow(.logical_datapath = ls_uuid,
+ .stage = switch_stage(IN, L2_LKUP),
+ .priority = 0,
+ .__match = "1",
+ .actions = "outport = ${mc_unknown}; output;",
+ .external_ids = map_empty())
+}
+
+/* Egress table PORT_SEC_IP: Egress port security - IP (priority 0)
+ * Egress table PORT_SEC_L2: Egress port security L2 - multicast/broadcast (priority 100). */
+for (&Switch(.ls = ls)) {
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_IP),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_L2),
+ .priority = 100,
+ .__match = "eth.mcast",
+ .actions = "output;",
+ .external_ids = map_empty())
+}
+
+/* Egress table PORT_SEC_IP: Egress port security - IP (priorities 90 and 80)
+ * if port security enabled.
+ *
+ * Egress table PORT_SEC_L2: Egress port security - L2 (priorities 50 and 150).
+ *
+ * Priority 50 rules implement port security for enabled logical ports.
+ *
+ * Priority 150 rules drop packets to disabled logical ports, so that they
+ * don't even receive multicast or broadcast packets. */
+Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_L2),
+ .priority = 50,
+ .__match = __match,
+ .actions = queue_action ++ "output;",
+ .external_ids = stage_hint(lsp._uuid)) :-
+ &SwitchPort(.sw = &sw, .lsp = lsp, .json_name = json_name, .ps_eth_addresses = ps_eth_addresses),
+ lsp.is_enabled(),
+ lsp.__type != "external",
+ var __match = if (ps_eth_addresses.is_empty()) {
+ "outport == ${json_name}"
+ } else {
+ "outport == ${json_name} && eth.dst == {${ps_eth_addresses.join(\" \")}}"
+ },
+ pbinding in sb::Out_Port_Binding(.logical_port = lsp.name),
+ var queue_action = match ((lsp.__type,
+ pbinding.options.get("qdisc_queue_id"))) {
+ ("localnet", Some{queue_id}) -> "set_queue(${queue_id});",
+ _ -> ""
+ }.
+
+for (&SwitchPort(.lsp = lsp, .json_name = json_name, .sw = &sw)
+ if not lsp.is_enabled() and lsp.__type != "external") {
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_L2),
+ .priority = 150,
+         .__match = "outport == ${json_name}",
+ .actions = "drop;",
+ .external_ids = stage_hint(lsp._uuid))
+}
+
+for (SwitchPortPSAddresses(.port = &SwitchPort{.lsp = lsp, .json_name = json_name, .sw = &sw},
+ .ps_addrs = ps)
+ if (ps.ipv4_addrs.len() > 0 or ps.ipv6_addrs.len() > 0)
+ and lsp.__type != "external")
+{
+ if (ps.ipv4_addrs.len() > 0) {
+ var addrs = {
+ var addrs = vec_empty();
+ for (addr in ps.ipv4_addrs) {
+ /* When the netmask is applied, if the host portion is
+ * non-zero, the host can only use the specified
+ * address. If zero, the host is allowed to use any
+ * address in the subnet.
+ */
+ addrs.push(ipv4_netaddr_match_host_or_network(addr));
+ if (addr.plen < 32 and not ip_is_zero(ipv4_netaddr_host(addr))) {
+ addrs.push("${ipv4_netaddr_bcast(addr)}")
+ }
+ };
+ addrs
+ } in
+ var __match =
+ "outport == ${json_name} && eth.dst == ${ps.ea} && ip4.dst == {255.255.255.255, 224.0.0.0/4, " ++
+ addrs.join(", ") ++ "}" in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_IP),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+ };
+ if (ps.ipv6_addrs.len() > 0) {
+ var __match = "outport == ${json_name} && eth.dst == ${ps.ea}" ++
+ build_port_security_ipv6_flow(OUT, ps.ea, ps.ipv6_addrs) in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_IP),
+ .priority = 90,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(lsp._uuid))
+ };
+ var __match = "outport == ${json_name} && eth.dst == ${ps.ea} && ip" in
+ Flow(.logical_datapath = sw.ls._uuid,
+ .stage = switch_stage(OUT, PORT_SEC_IP),
+ .priority = 80,
+ .__match = __match,
+ .actions = "drop;",
+ .external_ids = stage_hint(lsp._uuid))
+}
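+/* Example (hypothetical values): for a port whose port security entry is
+ * "00:00:00:00:00:01 10.0.0.5/24", the priority-90 flow above matches
+ * 'outport == "lsp1" && eth.dst == 00:00:00:00:00:01 &&
+ *  ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.0.0.5, 10.0.0.255}'
+ * (the subnet broadcast is added because the host part is non-zero),
+ * and the priority-80 flow drops all other IP traffic to that port. */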
+
+/* Logical router ingress table ADMISSION: Admission control framework. */
+for (&Router(.lr = lr)) {
+ /* Logical VLANs not supported.
+ * Broadcast/multicast source address is invalid. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ADMISSION),
+ .priority = 100,
+ .__match = "vlan.present || eth.src[40]",
+ .actions = "drop;",
+ .external_ids = map_empty())
+}
+
+/* Logical router ingress table ADMISSION: match (priority 50). */
+for (&RouterPort(.lrp = lrp,
+ .json_name = json_name,
+ .networks = lrp_networks,
+ .router = &router,
+ .is_redirect = is_redirect)
+ /* Drop packets from disabled logical ports (since logical flow
+ * tables are default-drop). */
+ if lrp.is_enabled())
+{
+ //if (op->derived) {
+ // /* No ingress packets should be received on a chassisredirect
+ // * port. */
+ // continue;
+ //}
+
+ /* Store the ethernet address of the port receiving the packet.
+ * This will save us from having to match on inport further down in
+ * the pipeline.
+ */
+ var actions = "${rEG_INPORT_ETH_ADDR()} = ${lrp_networks.ea}; next;" in {
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ADMISSION),
+ .priority = 50,
+ .__match = "eth.mcast && inport == ${json_name}",
+ .actions = actions,
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match =
+ "eth.dst == ${lrp_networks.ea} && inport == ${json_name}" ++
+ if is_redirect {
+ /* Traffic with eth.dst = l3dgw_port->lrp_networks.ea
+ * should only be received on the "redirect-chassis". */
+ " && is_chassis_resident(${json_string_escape(chassis_redirect_name(lrp.name))})"
+ } else { "" } in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ADMISSION),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lrp._uuid))
+ }
+}
+
+
+/* Logical router ingress table LOOKUP_NEIGHBOR and
+ * table LEARN_NEIGHBOR. */
+/* Learn MAC bindings from ARP/IPv6 ND.
+ *
+ * For ARP packets, table LOOKUP_NEIGHBOR does a lookup for the
+ * (arp.spa, arp.sha) in the mac binding table using the 'lookup_arp'
+ * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_RESULT bit.
+ * If "always_learn_from_arp_request" is set to false, it will also
+ * lookup for the (arp.spa) in the mac binding table using the
+ * "lookup_arp_ip" action for ARP request packets, and stores the
+ * result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit; or set that bit
+ * to "1" directly for ARP response packets.
+ *
+ * For IPv6 ND NA packets, table LOOKUP_NEIGHBOR does a lookup
+ * for the (nd.target, nd.tll) in the mac binding table using the
+ * 'lookup_nd' action and stores the result in
+ * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
+ * "always_learn_from_arp_request" is set to false,
+ * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT bit is set.
+ *
+ * For IPv6 ND NS packets, table LOOKUP_NEIGHBOR does a lookup
+ * for the (ip6.src, nd.sll) in the mac binding table using the
+ * 'lookup_nd' action and stores the result in
+ * REGBIT_LOOKUP_NEIGHBOR_RESULT bit. If
+ * "always_learn_from_arp_request" is set to false, it will also lookup
+ * for the (ip6.src) in the mac binding table using the "lookup_nd_ip"
+ * action and stores the result in REGBIT_LOOKUP_NEIGHBOR_IP_RESULT
+ * bit.
+ *
+ * Table LEARN_NEIGHBOR learns the mac-binding using the action
+ * - 'put_arp/put_nd'. Learning mac-binding is skipped if
+ * REGBIT_LOOKUP_NEIGHBOR_RESULT bit is set or
+ * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT is not set.
+ */
+
+/* Flows for LOOKUP_NEIGHBOR. */
+for (&Router(.lr = lr, .learn_from_arp_request = learn_from_arp_request))
+var rLNR = rEGBIT_LOOKUP_NEIGHBOR_RESULT() in
+var rLNIR = rEGBIT_LOOKUP_NEIGHBOR_IP_RESULT() in
+{
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 100,
+ .__match = "arp.op == 2",
+ .actions =
+ "${rLNR} = lookup_arp(inport, arp.spa, arp.sha); " ++
+ { if (learn_from_arp_request) "" else "${rLNIR} = 1; " } ++
+ "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 100,
+ .__match = "nd_na",
+ .actions =
+ "${rLNR} = lookup_nd(inport, nd.target, nd.tll); " ++
+ { if (learn_from_arp_request) "" else "${rLNIR} = 1; " } ++
+ "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 100,
+ .__match = "nd_ns",
+ .actions =
+ "${rLNR} = lookup_nd(inport, ip6.src, nd.sll); " ++
+ { if (learn_from_arp_request) "" else
+ "${rLNIR} = lookup_nd_ip(inport, ip6.src); " } ++
+ "next;",
+ .external_ids = map_empty());
+
+ /* For other packet types, we can skip neighbor learning.
+ * So set REGBIT_LOOKUP_NEIGHBOR_RESULT to 1. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 0,
+ .__match = "1",
+ .actions = "${rLNR} = 1; next;",
+ .external_ids = map_empty());
+
+ /* Flows for LEARN_NEIGHBOR. */
+ /* Skip Neighbor learning if not required. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LEARN_NEIGHBOR),
+ .priority = 100,
+ .__match =
+ "${rLNR} == 1" ++
+ { if (learn_from_arp_request) "" else " || ${rLNIR} == 0" },
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LEARN_NEIGHBOR),
+ .priority = 90,
+ .__match = "arp",
+ .actions = "put_arp(inport, arp.spa, arp.sha); next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LEARN_NEIGHBOR),
+ .priority = 90,
+ .__match = "nd_na",
+ .actions = "put_nd(inport, nd.target, nd.tll); next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LEARN_NEIGHBOR),
+ .priority = 90,
+ .__match = "nd_ns",
+ .actions = "put_nd(inport, ip6.src, nd.sll); next;",
+ .external_ids = map_empty())
+}
+
+/* Check if we need to learn mac-binding from ARP requests. */
+for (RouterPortNetworksIPv4Addr(rp@&RouterPort{.router = router}, addr)) {
+ var is_l3dgw_port = match (router.l3dgw_port) {
+ Some{l3dgw_lrp} -> l3dgw_lrp._uuid == rp.lrp._uuid,
+ None -> false
+ } in
+ var has_redirect_port = router.redirect_port_name != "" in
+ var chassis_residence = match (is_l3dgw_port and has_redirect_port) {
+ true -> " && is_chassis_resident(${router.redirect_port_name})",
+ false -> ""
+ } in
+ var rLNR = rEGBIT_LOOKUP_NEIGHBOR_RESULT() in
+ var rLNIR = rEGBIT_LOOKUP_NEIGHBOR_IP_RESULT() in
+ var match0 = "inport == ${rp.json_name} && "
+ "arp.spa == ${ipv4_netaddr_match_network(addr)}" in
+ var match1 = "arp.op == 1" ++ chassis_residence in
+ var learn_from_arp_request = router.learn_from_arp_request in {
+ if (not learn_from_arp_request) {
+ /* ARP request to this address should always get learned,
+ * so add a priority-110 flow to set
+ * REGBIT_LOOKUP_NEIGHBOR_IP_RESULT to 1. */
+ var __match = [match0, "arp.tpa == ${addr.addr}", match1] in
+ var actions = "${rLNR} = lookup_arp(inport, arp.spa, arp.sha); "
+ "${rLNIR} = 1; "
+ "next;" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 110,
+ .__match = __match.join(" && "),
+ .actions = actions,
+ .external_ids = stage_hint(rp.lrp._uuid))
+ };
+
+ var actions = "${rLNR} = lookup_arp(inport, arp.spa, arp.sha); " ++
+ { if (learn_from_arp_request) "" else
+ "${rLNIR} = lookup_arp_ip(inport, arp.spa); " } ++
+ "next;" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, LOOKUP_NEIGHBOR),
+ .priority = 100,
+ .__match = "${match0} && ${match1}",
+ .actions = actions,
+ .external_ids = stage_hint(rp.lrp._uuid))
+ }
+}
+
+
+/* Logical router ingress table IP_INPUT: IP Input. */
+for (router in &Router(.lr = lr, .mcast_cfg = &mcast_cfg)) {
+ /* L3 admission control: drop multicast and broadcast source, localhost
+ * source or destination, and zero network source or destination
+ * (priority 100). */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 100,
+         .__match = "ip4.src_mcast || "
+ "ip4.src == 255.255.255.255 || "
+ "ip4.src == 127.0.0.0/8 || "
+ "ip4.dst == 127.0.0.0/8 || "
+ "ip4.src == 0.0.0.0/8 || "
+ "ip4.dst == 0.0.0.0/8",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* Drop ARP packets (priority 85). ARP request packets for router's own
+ * IPs are handled with priority-90 flows.
+ * Drop IPv6 ND packets (priority 85). ND NA packets for router's own
+ * IPs are handled with priority-90 flows.
+ */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 85,
+ .__match = "arp || nd",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* Allow IPv6 multicast traffic that's supposed to reach the
+ * router pipeline (e.g., router solicitations).
+ */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 84,
+ .__match = "nd_rs || nd_ra",
+ .actions = "next;",
+ .external_ids = map_empty());
+
+ /* Drop other reserved multicast. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 83,
+ .__match = "ip6.mcast_rsvd",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* Allow other multicast if relay enabled (priority 82). */
+ var mcast_action = { if (mcast_cfg.relay) { "next;" } else { "drop;" } } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 82,
+ .__match = "ip4.mcast || ip6.mcast",
+ .actions = mcast_action,
+ .external_ids = map_empty());
+
+ /* Drop Ethernet local broadcast. By definition this traffic should
+ * not be forwarded.*/
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 50,
+ .__match = "eth.bcast",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* TTL discard */
+ Flow(
+ .logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 30,
+ .__match = "ip4 && ip.ttl == {0, 1}",
+ .actions = "drop;",
+ .external_ids = map_empty());
+
+ /* Pass other traffic not already handled to the next table for
+ * routing. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+function format_v4_networks(networks: lport_addresses, add_bcast: bool): string =
+{
+ var addrs = vec_empty();
+ for (addr in networks.ipv4_addrs) {
+ addrs.push("${addr.addr}");
+ if (add_bcast) {
+ addrs.push("${ipv4_netaddr_bcast(addr)}")
+ } else ()
+ };
+ if (addrs.len() == 1) {
+ addrs.join(", ")
+ } else {
+ "{" ++ addrs.join(", ") ++ "}"
+ }
+}
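+/* Example (hypothetical values): for networks containing the single
+ * address 10.0.0.1/24, format_v4_networks(networks, false) returns
+ * "10.0.0.1" (one address needs no braces), while
+ * format_v4_networks(networks, true) returns "{10.0.0.1, 10.0.0.255}",
+ * adding the subnet broadcast address. */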
+
+function format_v6_networks(networks: lport_addresses): string =
+{
+ var addrs = vec_empty();
+ for (addr in networks.ipv6_addrs) {
+ addrs.push("${addr.addr}")
+ };
+ if (addrs.len() == 1) {
+ addrs.join(", ")
+ } else {
+ "{" ++ addrs.join(", ") ++ "}"
+ }
+}
+
+/* The following relation is used in ARP reply flow generation to determine whether
+ * the is_chassis_resident check must be added to the flow.
+ */
+relation AddChassisResidentCheck_(lrp: uuid, add_check: bool)
+
+AddChassisResidentCheck_(lrp._uuid, res) :-
+ &SwitchPort(.peer = Some{&RouterPort{.lrp = lrp, .router = &router, .is_redirect = is_redirect}},
+ .sw = sw),
+ is_some(router.l3dgw_port),
+ not sw.localnet_port_names.is_empty(),
+ var res = if (is_redirect) {
+ /* Traffic with eth.src = l3dgw_port->lrp_networks.ea
+ * should only be sent from the "redirect-chassis", so that
+ * upstream MAC learning points to the "redirect-chassis".
+ * Also need to avoid generation of multiple ARP responses
+ * from different chassis. */
+ true
+ } else {
+ /* Check if the option 'reside-on-redirect-chassis'
+ * is set to true on the router port. If set to true
+ * and if peer's logical switch has a localnet port, it
+ * means the router pipeline for the packets from
+       * peer's logical switch is run on the chassis
+ * hosting the gateway port and it should reply to the
+ * ARP requests for the router port IPs.
+ */
+ map_get_bool_def(lrp.options, "reside-on-redirect-chassis", false)
+ }.
+
+
+relation AddChassisResidentCheck(lrp: uuid, add_check: bool)
+
+AddChassisResidentCheck(lrp, add_check) :-
+ AddChassisResidentCheck_(lrp, add_check).
+
+AddChassisResidentCheck(lrp, false) :-
+ nb::Logical_Router_Port(._uuid = lrp),
+ not AddChassisResidentCheck_(lrp, _).
+
+
+function get_force_snat_ip(lr: nb::Logical_Router, key_type: string): Set<v46_ip> =
+{
+ var ips = set_empty();
+ match (lr.options.get(key_type ++ "_force_snat_ip")) {
+ None -> (),
+ Some{s} -> {
+ for (token in s.split(" ")) {
+ match (ip46_parse(token)) {
+ Some{ip} -> ips.insert(ip),
+ _ -> () // XXX warn
+ }
+ };
+ }
+ };
+ ips
+}
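+/* Example (hypothetical option value): with
+ * options:lb_force_snat_ip="10.0.0.4 fd00::4" on the logical router,
+ * get_force_snat_ip(lr, "lb") returns the set {10.0.0.4, fd00::4};
+ * tokens that fail to parse are silently skipped (see the XXX above). */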
+
+function has_force_snat_ip(lr: nb::Logical_Router, key_type: string): bool {
+ not get_force_snat_ip(lr, key_type).is_empty()
+}
+
+/* Logical router ingress table IP_INPUT: IP Input for IPv4. */
+for (&RouterPort(.router = &router, .networks = networks, .lrp = lrp)
+ if (not networks.ipv4_addrs.is_empty()))
+{
+ /* L3 admission control: drop packets that originate from an
+ * IPv4 address owned by the router or a broadcast address
+ * known to the router (priority 100). */
+ var __match = "ip4.src == " ++
+ format_v4_networks(networks, true) ++
+ " && ${rEGBIT_EGRESS_LOOPBACK()} == 0" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 100,
+ .__match = __match,
+ .actions = "drop;",
+ .external_ids = stage_hint(lrp._uuid));
+
+ /* ICMP echo reply. These flows reply to ICMP echo requests
+ * received for the router's IP address. Since packets only
+ * get here as part of the logical router datapath, the inport
+ * (i.e. the incoming locally attached net) does not matter.
+ * The ip.ttl also does not matter (RFC1812 section 4.2.2.9) */
+ var __match = "ip4.dst == " ++
+ format_v4_networks(networks, false) ++
+ " && icmp4.type == 8 && icmp4.code == 0" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 90,
+ .__match = __match,
+ .actions = "ip4.dst <-> ip4.src; "
+ "ip.ttl = 255; "
+ "icmp4.type = 0; "
+ "flags.loopback = 1; "
+ "next; ",
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* Priority 90 to 92 flows handle ARP requests and ND packets. Most are
+ * per logical port but DNAT addresses can be handled per datapath
+ * for non-gateway router ports.
+ *
+ * Priority 91 and 92 flows are added for each gateway router
+ * port to handle the special cases. In case we get the packet
+ * on a regular port, just reply with the port's ETH address.
+ */
+LogicalRouterNatArpNdFlow(router, nat) :-
+ router in &Router(.lr = nb::Logical_Router{._uuid = lr}),
+ LogicalRouterNAT(.lr = lr, .nat = nat@NAT{.nat = &nb::NAT{.__type = __type}}),
+ /* Skip SNAT entries for now, we handle unique SNAT IPs separately
+ * below.
+ */
+ __type != "snat".
+/* Now handle SNAT entries too, one per unique SNAT IP. */
+LogicalRouterNatArpNdFlow(router, nat) :-
+ router in &Router(.snat_ips = snat_ips),
+ var snat_ip = FlatMap(snat_ips),
+ (var ip, var nats) = snat_ip,
+ Some{var nat} = nats.nth(0).
+
+relation LogicalRouterNatArpNdFlow(router: Ref<Router>, nat: NAT)
+LogicalRouterArpNdFlow(router, nat, None, rEG_INPORT_ETH_ADDR(), None, false, 90) :-
+ LogicalRouterNatArpNdFlow(router, nat).
+
+/* ARP / ND handling for external IP addresses.
+ *
+ * DNAT and SNAT IP addresses are external IP addresses that need ARP
+ * handling.
+ *
+ * These are already taken care of globally, per router. The only
+ * exception is on the l3dgw_port where we might need to use a
+ * different ETH address.
+ */
+LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :-
+ router in &Router(.lr = lr, .l3dgw_port = Some{l3dgw_port}),
+ LogicalRouterNAT(lr._uuid, nat),
+ /* Skip SNAT entries for now, we handle unique SNAT IPs separately
+ * below.
+ */
+ nat.nat.__type != "snat".
+/* Now handle SNAT entries too, one per unique SNAT IP. */
+LogicalRouterPortNatArpNdFlow(router, nat, l3dgw_port) :-
+ router in &Router(.l3dgw_port = Some{l3dgw_port}, .snat_ips = snat_ips),
+ var snat_ip = FlatMap(snat_ips),
+ (var ip, var nats) = snat_ip,
+ Some{var nat} = nats.nth(0).
+
+/* Respond to ARP/NS requests on the chassis that binds the gw
+ * port. Drop the ARP/NS requests on other chassis.
+ */
+relation LogicalRouterPortNatArpNdFlow(router: Ref<Router>, nat: NAT, lrp: nb::Logical_Router_Port)
+LogicalRouterArpNdFlow(router, nat, Some{lrp}, mac, Some{extra_match}, false, 92),
+LogicalRouterArpNdFlow(router, nat, Some{lrp}, mac, None, true, 91) :-
+ LogicalRouterPortNatArpNdFlow(router, nat, lrp),
+ (var mac, var extra_match) = match ((nat.external_mac, nat.nat.logical_port)) {
+ (Some{external_mac}, Some{logical_port}) -> (
+ /* distributed NAT case, use nat->external_mac */
+ external_mac.to_string(),
+ /* Traffic with eth.src = nat->external_mac should only be
+ * sent from the chassis where nat->logical_port is
+ * resident, so that upstream MAC learning points to the
+ * correct chassis. Also need to avoid generation of
+ * multiple ARP responses from different chassis. */
+ "is_chassis_resident(${json_string_escape(logical_port)})"
+ ),
+ _ -> (
+ rEG_INPORT_ETH_ADDR(),
+ /* Traffic with eth.src = l3dgw_port->lrp_networks.ea_s
+ * should only be sent from the gateway chassis, so that
+ * upstream MAC learning points to the gateway chassis.
+ * Also need to avoid generation of multiple ARP responses
+ * from different chassis. */
+ match (router.redirect_port_name) {
+ "" -> "",
+ s -> "is_chassis_resident(${s})"
+ }
+ )
+ }.
+
+/* Now divide the ARP/ND flows into ARP and ND. */
+relation LogicalRouterArpNdFlow(
+ router: Ref<Router>,
+ nat: NAT,
+ lrp: Option<nb::Logical_Router_Port>,
+ mac: string,
+ extra_match: Option<string>,
+ drop: bool,
+ priority: integer)
+LogicalRouterArpFlow(router, lrp, ipv4, mac, extra_match, drop, priority,
+ stage_hint(nat.nat._uuid)) :-
+ LogicalRouterArpNdFlow(router, nat@NAT{.external_ip = IPv4{ipv4}}, lrp,
+ mac, extra_match, drop, priority).
+LogicalRouterNdFlow(router, lrp, "nd_na", ipv6, true, mac, extra_match, drop, priority,
+ stage_hint(nat.nat._uuid)) :-
+ LogicalRouterArpNdFlow(router, nat@NAT{.external_ip = IPv6{ipv6}}, lrp,
+ mac, extra_match, drop, priority).
+
+relation LogicalRouterArpFlow(
+ lr: Ref<Router>,
+ lrp: Option<nb::Logical_Router_Port>,
+ ip: in_addr,
+ mac: string,
+ extra_match: Option<string>,
+ drop: bool,
+ priority: integer,
+ external_ids: Map<string,string>)
+Flow(.logical_datapath = lr.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = external_ids) :-
+ LogicalRouterArpFlow(.lr = lr, .lrp = lrp, .ip = ip, .mac = mac,
+ .extra_match = extra_match, .drop = drop,
+ .priority = priority, .external_ids = external_ids),
+ var __match = {
+ var clauses = vec_with_capacity(3);
+ match (lrp) {
+ Some{p} -> clauses.push("inport == ${json_string_escape(p.name)}"),
+ None -> ()
+ };
+ clauses.push("arp.op == 1 && arp.tpa == ${ip}");
+ clauses.append(extra_match.to_vec());
+ clauses.join(" && ")
+ },
+ var actions = if (drop) {
+ "drop;"
+ } else {
+ "eth.dst = eth.src; "
+ "eth.src = ${mac}; "
+ "arp.op = 2; /* ARP reply */ "
+ "arp.tha = arp.sha; "
+ "arp.sha = ${mac}; "
+ "arp.tpa = arp.spa; "
+ "arp.spa = ${ip}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output;"
+ }.
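+/* Example (hypothetical values): a LogicalRouterArpFlow with
+ * lrp = Some{p} where p.name is "lrp1", ip = 10.0.0.1, and
+ * extra_match = None produces the match
+ * 'inport == "lrp1" && arp.op == 1 && arp.tpa == 10.0.0.1'
+ * and, when drop is false, an ARP reply action that swaps the
+ * Ethernet and protocol addresses and sends the reply back out
+ * the ingress port. */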
+
+relation LogicalRouterNdFlow(
+ lr: Ref<Router>,
+ lrp: Option<nb::Logical_Router_Port>,
+ action: string,
+ ip: in6_addr,
+ sn_ip: bool,
+ mac: string,
+ extra_match: Option<string>,
+ drop: bool,
+ priority: integer,
+ external_ids: Map<string,string>)
+Flow(.logical_datapath = lr.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = external_ids) :-
+ LogicalRouterNdFlow(.lr = lr, .lrp = lrp, .action = action, .ip = ip,
+ .sn_ip = sn_ip, .mac = mac, .extra_match = extra_match,
+ .drop = drop, .priority = priority,
+ .external_ids = external_ids),
+ var __match = {
+ var clauses = vec_with_capacity(4);
+ match (lrp) {
+ Some{p} -> clauses.push("inport == ${json_string_escape(p.name)}"),
+ None -> ()
+ };
+ if (sn_ip) {
+ clauses.push("ip6.dst == {${ip}, ${in6_addr_solicited_node(ip)}}")
+ };
+ clauses.push("nd_ns && nd.target == ${ip}");
+ clauses.append(extra_match.to_vec());
+ clauses.join(" && ")
+ },
+ var actions = if (drop) {
+ "drop;"
+ } else {
+ "${action} { "
+ "eth.src = ${mac}; "
+ "ip6.src = ${ip}; "
+ "nd.target = ${ip}; "
+ "nd.tll = ${mac}; "
+ "outport = inport; "
+ "flags.loopback = 1; "
+ "output; "
+ "};"
+ }.
+
+/* ICMP time exceeded */
+for (RouterPortNetworksIPv4Addr(.port = &RouterPort{.lrp = lrp,
+ .json_name = json_name,
+ .router = router,
+ .networks = networks,
+ .is_redirect = is_redirect},
+ .addr = addr))
+{
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 40,
+ .__match = "inport == ${json_name} && ip4 && "
+ "ip.ttl == {0, 1} && !ip.later_frag",
+ .actions = "icmp4 {"
+ "eth.dst <-> eth.src; "
+ "icmp4.type = 11; /* Time exceeded */ "
+ "icmp4.code = 0; /* TTL exceeded in transit */ "
+ "ip4.dst = ip4.src; "
+ "ip4.src = ${addr.addr}; "
+ "ip.ttl = 255; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid));
+
+ /* ARP reply. These flows reply to ARP requests for the router's own
+ * IP address. */
+ for (AddChassisResidentCheck(lrp._uuid, add_chassis_resident_check)) {
+ var __match =
+ "arp.spa == ${ipv4_netaddr_match_network(addr)}" ++
+ if (add_chassis_resident_check) {
+ " && is_chassis_resident(${router.redirect_port_name})"
+ } else "" in
+ LogicalRouterArpFlow(.lr = router,
+ .lrp = Some{lrp},
+ .ip = addr.addr,
+ .mac = rEG_INPORT_ETH_ADDR(),
+ .extra_match = Some{__match},
+ .drop = false,
+ .priority = 90,
+ .external_ids = stage_hint(lrp._uuid))
+ }
+}
+
+for (&RouterPort(.lrp = lrp,
+ .router = router@&Router{.lr = lr},
+ .json_name = json_name,
+ .networks = networks,
+ .is_redirect = is_redirect))
+var residence_check = match (is_redirect) {
+ true -> Some{"is_chassis_resident(${router.redirect_port_name})"},
+ false -> None
+} in {
+    for (RouterLBVIP(.router = &Router{.lr = nb::Logical_Router{._uuid = lr._uuid}}, .vip = vip)) {
+ Some{(var ip_address, _)} = ip_address_and_port_from_lb_key(vip) in {
+ IPv4{var ipv4} = ip_address in
+ LogicalRouterArpFlow(.lr = router,
+ .lrp = Some{lrp},
+ .ip = ipv4,
+ .mac = rEG_INPORT_ETH_ADDR(),
+ .extra_match = residence_check,
+ .drop = false,
+ .priority = 90,
+ .external_ids = map_empty());
+
+ IPv6{var ipv6} = ip_address in
+ LogicalRouterNdFlow(.lr = router,
+ .lrp = Some{lrp},
+ .action = "nd_na",
+ .ip = ipv6,
+ .sn_ip = false,
+ .mac = rEG_INPORT_ETH_ADDR(),
+ .extra_match = residence_check,
+ .drop = false,
+ .priority = 90,
+ .external_ids = map_empty())
+ }
+ }
+}
+
+/* Drop IP traffic destined to router owned IPs except if the IP is
+ * also a SNAT IP. Those are dropped later, in stage
+ * "lr_in_arp_resolve", if unSNAT was unsuccessful.
+ *
+ * Priority 60.
+ */
+Flow(.logical_datapath = lr_uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 60,
+ .__match = "ip4.dst == {" ++ match_ips.join(", ") ++ "}",
+ .actions = "drop;",
+ .external_ids = stage_hint(lrp_uuid)) :-
+ &RouterPort(.lrp = nb::Logical_Router_Port{._uuid = lrp_uuid},
+ .router = &Router{.snat_ips = snat_ips,
+ .lr = nb::Logical_Router{._uuid = lr_uuid}},
+ .networks = networks),
+ var addr = FlatMap(networks.ipv4_addrs),
+ not snat_ips.contains_key(IPv4{addr.addr}),
+ var match_ips = "${addr.addr}".group_by((lr_uuid, lrp_uuid)).to_vec().
+Flow(.logical_datapath = lr_uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 60,
+ .__match = "ip6.dst == {" ++ match_ips.join(", ") ++ "}",
+ .actions = "drop;",
+ .external_ids = stage_hint(lrp_uuid)) :-
+ &RouterPort(.lrp = nb::Logical_Router_Port{._uuid = lrp_uuid},
+ .router = &Router{.snat_ips = snat_ips,
+ .lr = nb::Logical_Router{._uuid = lr_uuid}},
+ .networks = networks),
+ var addr = FlatMap(networks.ipv6_addrs),
+ not snat_ips.contains_key(IPv6{addr.addr}),
+ var match_ips = "${addr.addr}".group_by((lr_uuid, lrp_uuid)).to_vec().
+
+for (RouterPortNetworksIPv4Addr(
+ .port = &RouterPort{
+ .router = &Router{.lr = lr,
+ .l3dgw_port = None,
+ .is_gateway = false},
+ .lrp = lrp},
+ .addr = addr))
+{
+ /* UDP/TCP port unreachable. */
+ var __match = "ip4 && ip4.dst == ${addr.addr} && !ip.later_frag && udp" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 80,
+ .__match = __match,
+ .actions = "icmp4 {"
+ "eth.dst <-> eth.src; "
+ "ip4.dst <-> ip4.src; "
+ "ip.ttl = 255; "
+ "icmp4.type = 3; "
+ "icmp4.code = 3; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match = "ip4 && ip4.dst == ${addr.addr} && !ip.later_frag && tcp" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 80,
+ .__match = __match,
+ .actions = "tcp_reset {"
+ "eth.dst <-> eth.src; "
+ "ip4.dst <-> ip4.src; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match = "ip4 && ip4.dst == ${addr.addr} && !ip.later_frag" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 70,
+ .__match = __match,
+ .actions = "icmp4 {"
+ "eth.dst <-> eth.src; "
+ "ip4.dst <-> ip4.src; "
+ "ip.ttl = 255; "
+ "icmp4.type = 3; "
+ "icmp4.code = 2; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* DHCPv6 reply handling */
+Flow(.logical_datapath = rp.router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 100,
+ .__match = "ip6.dst == ${ipv6_addr.addr} "
+ "&& udp.src == 547 && udp.dst == 546",
+ .actions = "reg0 = 0; handle_dhcpv6_reply;",
+ .external_ids = stage_hint(rp.lrp._uuid)) :-
+ rp in &RouterPort(),
+ var ipv6_addr = FlatMap(rp.networks.ipv6_addrs).
+
+/* Logical router ingress table IP_INPUT: IP Input for IPv6. */
+for (&RouterPort(.router = &router, .networks = networks, .lrp = lrp)
+ if (not networks.ipv6_addrs.is_empty()))
+{
+ //if (op->derived) {
+ // /* No ingress packets are accepted on a chassisredirect
+ // * port, so no need to program flows for that port. */
+ // continue;
+ //}
+
+ /* ICMPv6 echo reply. These flows reply to echo requests
+ * received for the router's IP address. */
+ var __match = "ip6.dst == " ++
+ format_v6_networks(networks) ++
+ " && icmp6.type == 128 && icmp6.code == 0" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 90,
+ .__match = __match,
+ .actions = "ip6.dst <-> ip6.src; "
+ "ip.ttl = 255; "
+ "icmp6.type = 129; "
+ "flags.loopback = 1; "
+ "next; ",
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* ND reply. These flows reply to ND solicitations for the
+ * router's own IP address. */
+for (RouterPortNetworksIPv6Addr(.port = &RouterPort{.lrp = lrp,
+ .is_redirect = is_redirect,
+ .router = router,
+ .networks = networks,
+ .json_name = json_name},
+ .addr = addr))
+{
+ var extra_match = if (is_redirect) {
+ /* Traffic with eth.src = l3dgw_port->lrp_networks.ea
+ * should only be sent from the gateway chassis, so that
+ * upstream MAC learning points to the gateway chassis.
+ * Also need to avoid generation of multiple ND replies
+ * from different chassis. */
+ Some{"is_chassis_resident(${json_string_escape(chassis_redirect_name(lrp.name))})"}
+ } else None in
+ LogicalRouterNdFlow(.lr = router,
+ .lrp = Some{lrp},
+ .action = "nd_na_router",
+ .ip = addr.addr,
+ .sn_ip = true,
+ .mac = rEG_INPORT_ETH_ADDR(),
+ .extra_match = extra_match,
+ .drop = false,
+ .priority = 90,
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* UDP/TCP port unreachable */
+for (RouterPortNetworksIPv6Addr(
+ .port = &RouterPort{.router = &Router{.lr = lr,
+ .l3dgw_port = None,
+ .is_gateway = false},
+ .lrp = lrp,
+ .json_name = json_name},
+ .addr = addr))
+{
+ var __match = "ip6 && ip6.dst == ${addr.addr} && !ip.later_frag && tcp" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 80,
+ .__match = __match,
+ .actions = "tcp_reset {"
+ "eth.dst <-> eth.src; "
+ "ip6.dst <-> ip6.src; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match = "ip6 && ip6.dst == ${addr.addr} && !ip.later_frag && udp" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 80,
+ .__match = __match,
+ .actions = "icmp6 {"
+ "eth.dst <-> eth.src; "
+ "ip6.dst <-> ip6.src; "
+ "ip.ttl = 255; "
+ "icmp6.type = 1; "
+ "icmp6.code = 4; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match = "ip6 && ip6.dst == ${addr.addr} && !ip.later_frag" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 70,
+ .__match = __match,
+ .actions = "icmp6 {"
+ "eth.dst <-> eth.src; "
+ "ip6.dst <-> ip6.src; "
+ "ip.ttl = 255; "
+ "icmp6.type = 1; "
+ "icmp6.code = 3; "
+ "next; };",
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* ICMPv6 time exceeded */
+for (RouterPortNetworksIPv6Addr(.port = &RouterPort{.router = &router,
+ .lrp = lrp,
+ .json_name = json_name},
+ .addr = addr)
+ /* skip link-local address */
+ if (not ipv6_netaddr_is_lla(addr)))
+{
+ var __match = "inport == ${json_name} && ip6 && "
+ "ip6.src == ${ipv6_netaddr_match_network(addr)} && "
+ "ip.ttl == {0, 1} && !ip.later_frag" in
+ var actions = "icmp6 {"
+ "eth.dst <-> eth.src; "
+ "ip6.dst = ip6.src; "
+ "ip6.src = ${addr.addr}; "
+ "ip.ttl = 255; "
+ "icmp6.type = 3; /* Time exceeded */ "
+ "icmp6.code = 0; /* TTL exceeded in transit */ "
+ "next; };" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 40,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/* NAT, Defrag and load balancing. */
+
+function default_allow_flow(datapath: uuid, stage: Stage): Flow {
+ Flow{.logical_datapath = datapath,
+ .stage = stage,
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty()}
+}
+for (&Router(.lr = lr)) {
+ /* Packets are allowed by default. */
+ Flow[default_allow_flow(lr._uuid, router_stage(IN, DEFRAG))];
+ Flow[default_allow_flow(lr._uuid, router_stage(IN, UNSNAT))];
+ Flow[default_allow_flow(lr._uuid, router_stage(OUT, SNAT))];
+ Flow[default_allow_flow(lr._uuid, router_stage(IN, DNAT))];
+ Flow[default_allow_flow(lr._uuid, router_stage(OUT, UNDNAT))];
+ Flow[default_allow_flow(lr._uuid, router_stage(OUT, EGR_LOOP))];
+ Flow[default_allow_flow(lr._uuid, router_stage(IN, ECMP_STATEFUL))];
+
+ /* Send the IPv6 NS packets to next table. When ovn-controller
+ * generates IPv6 NS (for the action - nd_ns{}), the injected
+ * packet would go through conntrack - which is not required. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, SNAT),
+ .priority = 120,
+ .__match = "nd_ns",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+function lrouter_nat_is_stateless(nat: NAT): bool = {
+ Some{"true"} == nat.nat.options.get("stateless")
+}
+
+/* Handles the match criteria and actions in logical flow
+ * based on external ip based NAT rule filter.
+ *
+ * For ALLOWED_EXT_IPs, we will add an additional match criteria
+ * of comparing ip*.src/dst with the allowed external ip address set.
+ *
+ * For EXEMPTED_EXT_IPs, we will have an additional logical flow
+ * where we compare ip*.src/dst with the exempted external ip address set
+ * and action says "next" instead of ct*.
+ */
+function lrouter_nat_add_ext_ip_match(
+ router: Ref<Router>,
+ nat: NAT,
+ __match: string,
+ ipX: string,
+ is_src: bool,
+ mask: v46_ip): (string, Option<Flow>)
+{
+ var dir = if (is_src) "src" else "dst";
+ match (nat.exceptional_ext_ips) {
+ None -> ("", None),
+ Some{AllowedExtIps{__as}} -> (" && ${ipX}.${dir} == $${__as.name}", None),
+ Some{ExemptedExtIps{__as}} -> {
+ /* Priority of logical flows corresponding to exempted_ext_ips is
+             * +1 of the corresponding regular NAT rule.
+ * For example, if we have following NAT rule and we associate
+ * exempted external ips to it:
+ * "ovn-nbctl lr-nat-add router dnat_and_snat 10.15.24.139 50.0.0.11"
+ *
+ * And now we associate exempted external ip address set to it.
+ * Now corresponding to above rule we will have following logical
+ * flows:
+ * lr_out_snat...priority=162, match=(..ip4.dst == $exempt_range),
+ * action=(next;)
+ * lr_out_snat...priority=161, match=(..), action=(ct_snat(....);)
+ *
+ */
+ var priority = match (is_src) {
+ true -> {
+ /* S_ROUTER_IN_DNAT uses priority 100 */
+ 100 + 1
+ },
+ false -> {
+ /* S_ROUTER_OUT_SNAT uses priority (mask + 1 + 128 + 1) */
+ var is_gw_router = router.l3dgw_port.is_none();
+ var mask_1bits = ip46_count_cidr_bits(mask).unwrap_or(8'd0) as integer;
+ mask_1bits + 2 + { if (not is_gw_router) 128 else 0 }
+ }
+ };
+
+ ("",
+ Some{Flow{.logical_datapath = router.lr._uuid,
+ .stage = if (is_src) { router_stage(IN, DNAT) } else { router_stage(OUT, SNAT) },
+ .priority = priority,
+ .__match = "${__match} && ${ipX}.${dir} == $${__as.name}",
+ .actions = "next;",
+ .external_ids = stage_hint(nat.nat._uuid)}})
+ }
+ }
+}
+
+relation LogicalRouterForceSnatFlows(
+ logical_router: uuid,
+ ips: Set<v46_ip>,
+ context: string)
+Flow(.logical_datapath = logical_router,
+ .stage = router_stage(IN, UNSNAT),
+ .priority = 110,
+ .__match = "${ipX} && ${ipX}.dst == ${ip}",
+ .actions = "ct_snat;",
+ .external_ids = map_empty()),
+/* Higher priority rules to force SNAT with the IP addresses
+ * configured in the Gateway router. This only takes effect
+ * when the packet has already been DNATed or load balanced once. */
+Flow(.logical_datapath = logical_router,
+ .stage = router_stage(OUT, SNAT),
+ .priority = 100,
+ .__match = "flags.force_snat_for_${context} == 1 && ${ipX}",
+     .actions = "ct_snat(${ip});",
+ .external_ids = map_empty()) :-
+ LogicalRouterForceSnatFlows(.logical_router = logical_router,
+ .ips = ips,
+ .context = context),
+ var ip = FlatMap(ips),
+ var ipX = ip46_ipX(ip).
+
+/* NAT rules are only valid on Gateway routers and routers with
+ * l3dgw_port (router has a port with "redirect-chassis"
+ * specified). */
+for (r in &Router(.lr = lr,
+ .l3dgw_port = l3dgw_port,
+ .redirect_port_name = redirect_port_name,
+ .is_gateway = is_gateway)
+ if is_some(l3dgw_port) or is_gateway)
+{
+ for (LogicalRouterNAT(.lr = lr._uuid, .nat = nat)) {
+ var ipX = ip46_ipX(nat.external_ip) in
+ var xx = ip46_xxreg(nat.external_ip) in
+ /* Check the validity of nat->logical_ip. 'logical_ip' can
+ * be a subnet when the type is "snat". */
+ Some{(_, var mask)} = ip46_parse_masked(nat.nat.logical_ip) in
+ true == match ((ip46_is_all_ones(mask), nat.nat.__type)) {
+ (_, "snat") -> true,
+ (false, _) -> {
+ warn("bad ip ${nat.nat.logical_ip} for dnat in router ${uuid2str(lr._uuid)}");
+ false
+ },
+ _ -> true
+ } in
+ /* For distributed router NAT, determine whether this NAT rule
+ * satisfies the conditions for distributed NAT processing. */
+ var mac = match ((is_some(l3dgw_port) and nat.nat.__type == "dnat_and_snat",
+ nat.nat.logical_port, nat.external_mac)) {
+ (true, Some{_}, Some{mac}) -> Some{mac},
+ _ -> None
+ } in
+ var stateless = (lrouter_nat_is_stateless(nat)
+ and nat.nat.__type == "dnat_and_snat") in
+ {
+ /* Ingress UNSNAT table: It is for already established connections'
+ * reverse traffic. i.e., SNAT has already been done in egress
+ * pipeline and now the packet has entered the ingress pipeline as
+ * part of a reply. We undo the SNAT here.
+ *
+ * Undoing SNAT has to happen before DNAT processing. This is
+ * because when the packet was DNATed in ingress pipeline, it did
+ * not know about the possibility of eventual additional SNAT in
+ * egress pipeline. */
+ if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") {
+ if (l3dgw_port == None) {
+ /* Gateway router. */
+ var actions = if (stateless) {
+ "${ipX}.dst=${nat.nat.logical_ip}; next;"
+ } else {
+ "ct_snat;"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, UNSNAT),
+ .priority = 90,
+ .__match = "ip && ${ipX}.dst == ${nat.nat.external_ip}",
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ };
+ Some{var gwport} = l3dgw_port in {
+ /* Distributed router. */
+
+ /* Traffic received on l3dgw_port is subject to NAT. */
+ var __match =
+ "ip && ${ipX}.dst == ${nat.nat.external_ip}"
+ " && inport == ${json_string_escape(gwport.name)}" ++
+ if (mac == None) {
+ /* Flows for NAT rules that are centralized are only
+ * programmed on the "redirect-chassis". */
+ " && is_chassis_resident(${redirect_port_name})"
+ } else { "" } in
+ var actions = if (stateless) {
+ "${ipX}.dst=${nat.nat.logical_ip}; next;"
+ } else {
+ "ct_snat;"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, UNSNAT),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ };
+
+ /* Ingress DNAT table: Packets enter the pipeline with destination
+         * IP address that needs to be DNATted from an external IP address
+ * to a logical IP address. */
+ var ip_and_ports = "${nat.nat.logical_ip}" ++
+ if (nat.nat.external_port_range != "") {
+ " ${nat.nat.external_port_range}"
+ } else {
+ ""
+ } in
+ if (nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat") {
+ None = l3dgw_port in
+            var __match = "ip && ${ipX}.dst == ${nat.nat.external_ip}" in
+ (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match(
+ r, nat, __match, ipX, true, mask) in
+ {
+ /* Gateway router. */
+ /* Packet when it goes from the initiator to destination.
+ * We need to set flags.loopback because the router can
+ * send the packet back through the same interface. */
+ Some{var f} = ext_flow in Flow[f];
+
+ var flag_action =
+ if (has_force_snat_ip(lr, "dnat")) {
+ /* Indicate to the future tables that a DNAT has taken
+ * place and a force SNAT needs to be done in the
+ * Egress SNAT table. */
+ "flags.force_snat_for_dnat = 1; "
+ } else { "" } in
+ var nat_actions = if (stateless) {
+ "${ipX}.dst=${nat.nat.logical_ip}; next;"
+ } else {
+ "flags.loopback = 1; "
+ "ct_dnat(${ip_and_ports});"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = 100,
+ .__match = __match ++ ext_ip_match,
+ .actions = flag_action ++ nat_actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ };
+
+ Some{var gwport} = l3dgw_port in
+ var __match =
+ "ip && ${ipX}.dst == ${nat.nat.external_ip}"
+ " && inport == ${json_string_escape(gwport.name)}" ++
+ if (mac == None) {
+ /* Flows for NAT rules that are centralized are only
+ * programmed on the "redirect-chassis". */
+ " && is_chassis_resident(${redirect_port_name})"
+ } else { "" } in
+ (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match(
+ r, nat, __match, ipX, true, mask) in
+ {
+ /* Distributed router. */
+ /* Traffic received on l3dgw_port is subject to NAT. */
+ Some{var f} = ext_flow in Flow[f];
+
+ var actions = if (stateless) {
+ "${ipX}.dst=${nat.nat.logical_ip}; next;"
+ } else {
+ "ct_dnat(${ip_and_ports});"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = 100,
+ .__match = __match ++ ext_ip_match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ };
+
+ /* ARP resolve for NAT IPs. */
+ Some{var gwport} = l3dgw_port in {
+ var gwport_name = json_string_escape(gwport.name) in {
+ if (nat.nat.__type == "snat") {
+ var __match = "inport == ${gwport_name} && "
+ "${ipX}.src == ${nat.nat.external_ip}" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, IP_INPUT),
+ .priority = 120,
+ .__match = __match,
+ .actions = "next;",
+ .external_ids = stage_hint(nat.nat._uuid))
+ };
+
+ var nexthop_reg = "${xx}${rEG_NEXT_HOP()}" in
+ var __match = "outport == ${gwport_name} && "
+ "${nexthop_reg} == ${nat.nat.external_ip}" in
+ var dst_mac = match (mac) {
+ Some{value} -> "${value}",
+ None -> gwport.mac
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = __match,
+ .actions = "eth.dst = ${dst_mac}; next;",
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ };
+
+ /* Egress UNDNAT table: It is for already established connections'
+ * reverse traffic. i.e., DNAT has already been done in ingress
+ * pipeline and now the packet has entered the egress pipeline as
+ * part of a reply. We undo the DNAT here.
+ *
+ * Note that this only applies for NAT on a distributed router.
+ * Undo DNAT on a gateway router is done in the ingress DNAT
+ * pipeline stage. */
+ if ((nat.nat.__type == "dnat" or nat.nat.__type == "dnat_and_snat")) {
+ Some{var gwport} = l3dgw_port in
+ var __match =
+ "ip && ${ipX}.src == ${nat.nat.logical_ip}"
+ " && outport == ${json_string_escape(gwport.name)}" ++
+ if (mac == None) {
+ /* Flows for NAT rules that are centralized are only
+ * programmed on the "redirect-chassis". */
+ " && is_chassis_resident(${redirect_port_name})"
+ } else { "" } in
+ var actions =
+ match (mac) {
+ Some{mac_addr} -> "eth.src = ${mac_addr}; ",
+ None -> ""
+ } ++
+ if (stateless) {
+ "${ipX}.src=${nat.nat.external_ip}; next;"
+ } else {
+ "ct_dnat;"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, UNDNAT),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ };
+
+ /* Egress SNAT table: Packets enter the egress pipeline with
+         * source IP address that needs to be SNATted to an external IP
+ * address. */
+ var ip_and_ports = "${nat.nat.external_ip}" ++
+ if (nat.nat.external_port_range != "") {
+ " ${nat.nat.external_port_range}"
+ } else {
+ ""
+ } in
+ if (nat.nat.__type == "snat" or nat.nat.__type == "dnat_and_snat") {
+ None = l3dgw_port in
+ var __match = "ip && ${ipX}.src == ${nat.nat.logical_ip}" in
+ (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match(
+ r, nat, __match, ipX, false, mask) in
+ {
+ /* Gateway router. */
+ Some{var f} = ext_flow in Flow[f];
+
+ /* The priority here is calculated such that the
+ * nat->logical_ip with the longest mask gets a higher
+ * priority. */
+ var actions = if (stateless) {
+ "${ipX}.src=${nat.nat.external_ip}; next;"
+ } else {
+ "ct_snat(${ip_and_ports});"
+ } in
+ Some{var plen} = ip46_count_cidr_bits(mask) in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, SNAT),
+                     .priority = (plen as bit<64>) + 1,
+ .__match = __match ++ ext_ip_match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ };
+
+ Some{var gwport} = l3dgw_port in
+ var __match =
+ "ip && ${ipX}.src == ${nat.nat.logical_ip}"
+ " && outport == ${json_string_escape(gwport.name)}" ++
+ if (mac == None) {
+ /* Flows for NAT rules that are centralized are only
+ * programmed on the "redirect-chassis". */
+ " && is_chassis_resident(${redirect_port_name})"
+ } else { "" } in
+ (var ext_ip_match, var ext_flow) = lrouter_nat_add_ext_ip_match(
+ r, nat, __match, ipX, false, mask) in
+ {
+ /* Distributed router. */
+ Some{var f} = ext_flow in Flow[f];
+
+ var actions =
+ match (mac) {
+ Some{mac_addr} -> "eth.src = ${mac_addr}; ",
+ _ -> ""
+ } ++ if (stateless) {
+ "${ipX}.src=${nat.nat.external_ip}; next;"
+ } else {
+ "ct_snat(${ip_and_ports});"
+ } in
+ /* The priority here is calculated such that the
+ * nat->logical_ip with the longest mask gets a higher
+ * priority. */
+ Some{var plen} = ip46_count_cidr_bits(mask) in
+ var priority = (plen as bit<64>) + 1 in
+ var centralized_boost = if (mac == None) 128 else 0 in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, SNAT),
+ .priority = priority + centralized_boost,
+ .__match = __match ++ ext_ip_match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ };
+
+ /* Logical router ingress table ADMISSION:
+ * For NAT on a distributed router, add rules allowing
+ * ingress traffic with eth.dst matching nat->external_mac
+ * on the l3dgw_port instance where nat->logical_port is
+ * resident. */
+ Some{var mac_addr} = mac in
+ Some{var gwport} = l3dgw_port in
+ Some{var logical_port} = nat.nat.logical_port in
+ var __match =
+ "eth.dst == ${mac_addr} && inport == ${json_string_escape(gwport.name)}"
+ " && is_chassis_resident(${json_string_escape(logical_port)})" in
+ /* Store the ethernet address of the port receiving the packet.
+ * This will save us from having to match on inport further
+ * down in the pipeline.
+ */
+ var actions = "${rEG_INPORT_ETH_ADDR()} = ${gwport.mac}; next;" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ADMISSION),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid));
+
+ /* Ingress Gateway Redirect Table: For NAT on a distributed
+ * router, add flows that are specific to a NAT rule. These
+ * flows indicate the presence of an applicable NAT rule that
+ * can be applied in a distributed manner.
+         * In particular, the IP src register and eth.src are set to the
+         * NAT external IP and NAT external MAC so that the ARP request
+         * generated in the following stage is sent out with the proper
+         * IP/MAC src addresses.
+         */
+ Some{var mac_addr} = mac in
+ Some{var gwport} = l3dgw_port in
+ Some{var logical_port} = nat.nat.logical_port in
+ Some{var external_mac} = nat.nat.external_mac in
+ var __match =
+ "${ipX}.src == ${nat.nat.logical_ip} && "
+ "outport == ${json_string_escape(gwport.name)} && "
+ "is_chassis_resident(${json_string_escape(logical_port)})" in
+ var actions =
+ "eth.src = ${external_mac}; "
+ "${xx}${rEG_SRC()} = ${nat.nat.external_ip}; "
+ "next;" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, GW_REDIRECT),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid));
+
+ /* Egress Loopback table: For NAT on a distributed router.
+ * If packets in the egress pipeline on the distributed
+ * gateway port have ip.dst matching a NAT external IP, then
+ * loop a clone of the packet back to the beginning of the
+ * ingress pipeline with inport = outport. */
+ Some{var gwport} = l3dgw_port in
+ /* Distributed router. */
+ Some{var port} = match (mac) {
+ Some{_} -> match (nat.nat.logical_port) {
+ Some{name} -> Some{json_string_escape(name)},
+ None -> None: Option<string>
+ },
+ None -> Some{redirect_port_name}
+ } in
+ var __match = "${ipX}.dst == ${nat.nat.external_ip} && outport == ${json_string_escape(gwport.name)} && is_chassis_resident(${port})" in
+ var regs = {
+ var regs = vec_empty();
+            for (j in range_vec(0, mFF_N_LOG_REGS(), 1)) {
+ regs.push("reg${j} = 0; ")
+ };
+ regs
+ } in
+ var actions =
+ "clone { ct_clear; "
+ "inport = outport; outport = \"\"; "
+ "flags = 0; flags.loopback = 1; " ++
+ regs.join("") ++
+ "${rEGBIT_EGRESS_LOOPBACK()} = 1; "
+ "next(pipeline=ingress, table=0); };" in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, EGR_LOOP),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(nat.nat._uuid))
+ }
+ };
+
+ /* Handle force SNAT options set in the gateway router. */
+ if (l3dgw_port == None) {
+ var dnat_force_snat_ips = get_force_snat_ip(lr, "dnat") in
+ if (not dnat_force_snat_ips.is_empty())
+ LogicalRouterForceSnatFlows(.logical_router = lr._uuid,
+ .ips = dnat_force_snat_ips,
+ .context = "dnat");
+
+ var lb_force_snat_ips = get_force_snat_ip(lr, "lb") in
+ if (not lb_force_snat_ips.is_empty())
+ LogicalRouterForceSnatFlows(.logical_router = lr._uuid,
+ .ips = lb_force_snat_ips,
+ .context = "lb");
+
+ /* For gateway router, re-circulate every packet through
+ * the DNAT zone. This helps with the following.
+ *
+ * Any packet that needs to be unDNATed in the reverse
+ * direction gets unDNATed. Ideally this could be done in
+ * the egress pipeline. But since the gateway router
+ * does not have any feature that depends on the source
+         * IP address being the external IP address for IP routing,
+ * we can do it here, saving a future re-circulation. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = 50,
+ .__match = "ip",
+ .actions = "flags.loopback = 1; ct_dnat;",
+ .external_ids = map_empty())
+ }
+}
+
+function nats_contain_vip(nats: Vec<NAT>, vip: v46_ip): bool {
+ for (nat in nats) {
+ if (nat.external_ip == vip) {
+ return true
+ }
+ };
+ return false
+}
+
+/* Load balancing and packet defrag are only valid on
+ * Gateway routers or router with gateway port. */
+for (RouterLBVIP(
+ .router = &Router{.lr = lr,
+ .l3dgw_port = l3dgw_port,
+ .redirect_port_name = redirect_port_name,
+ .is_gateway = is_gateway,
+ .nats = nats},
+ .lb = &lb,
+ .vip = vip,
+ .backends = backends)
+ if is_some(l3dgw_port) or is_gateway)
+{
+ if (backends == "") {
+ for (ControllerEventEn(true)) {
+ for (HasEventElbMeter(has_elb_meter)) {
+ Some {(var __match, var __action)} =
+ build_empty_lb_event_flow(vip, lb, has_elb_meter) in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = 130,
+ .__match = __match,
+ .actions = __action,
+ .external_ids = stage_hint(lb._uuid))
+ }
+ }
+ };
+
+ /* A set to hold all ips that need defragmentation and tracking. */
+
+ /* vip contains IP:port or just IP. */
+ Some{(var ip_address, var port)} = ip_address_and_port_from_lb_key(vip) in
+ var ipX = ip46_ipX(ip_address) in
+ var proto = match (lb.protocol) {
+ Some{proto} -> proto,
+ _ -> "tcp"
+ } in {
+ /* If there are any load balancing rules, we should send
+ * the packet to conntrack for defragmentation and
+ * tracking. This helps with two things.
+ *
+ * 1. With tracking, we can send only new connections to
+ * pick a DNAT ip address from a group.
+ * 2. If there are L4 ports in load balancing rules, we
+ * need the defragmentation to match on L4 ports. */
+ var __match = "ip && ${ipX}.dst == ${ip_address}" in
+ /* One of these flows must be created for each unique LB VIP address.
+ * We create one for each VIP:port pair; flows with the same IP and
+ * different port numbers will produce identical flows that will
+ * get merged by DDlog. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DEFRAG),
+ .priority = 100,
+ .__match = __match,
+ .actions = "ct_next;",
+ .external_ids = stage_hint(lb._uuid));
+
+ /* Higher priority rules are added for load-balancing in DNAT
+ * table. For every match (on a VIP[:port]), we add two flows
+ * via add_router_lb_flow(). One flow is for specific matching
+ * on ct.new with an action of "ct_lb($targets);". The other
+ * flow is for ct.est with an action of "ct_dnat;". */
+ var match1 = "ip && ${ipX}.dst == ${ip_address}" in
+ (var prio, var match2) =
+ if (port != 0) {
+ (120, " && ${proto} && ${proto}.dst == ${port}")
+ } else {
+ (110, "")
+ } in
+ var __match = match1 ++ match2 ++
+ match (l3dgw_port) {
+ Some{gwport} -> " && is_chassis_resident(${redirect_port_name})",
+ _ -> ""
+ } in
+ var has_force_snat_ip = has_force_snat_ip(lr, "lb") in
+ {
+ /* A match and actions for established connections. */
+ var est_match = "ct.est && " ++ __match in
+ var actions =
+ match (has_force_snat_ip) {
+ true -> "flags.force_snat_for_lb = 1; ct_dnat;",
+ false -> "ct_dnat;"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = prio,
+ .__match = est_match,
+ .actions = actions,
+ .external_ids = stage_hint(lb._uuid));
+
+ if (nats_contain_vip(nats, ip_address)) {
+ /* The load balancer vip is also present in the NAT entries.
+                 * So add a high priority lflow to advance the packet
+ * destined to the vip (and the vip port if defined)
+ * in the S_ROUTER_IN_UNSNAT stage.
+ * There seems to be an issue with ovs-vswitchd. When the new
+ * connection packet destined for the lb vip is received,
+ * it is dnat'ed in the S_ROUTER_IN_DNAT stage in the dnat
+ * conntrack zone. For the next packet, if it goes through
+ * unsnat stage, the conntrack flags are not set properly, and
+ * it doesn't hit the established state flows in
+ * S_ROUTER_IN_DNAT stage. */
+ var match3 = "${ipX} && ${ipX}.dst == ${ip_address} && ${proto}" ++
+ if (port != 0) { " && ${proto}.dst == ${port}" }
+ else { "" } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, UNSNAT),
+ .priority = 120,
+ .__match = match3,
+ .actions = "next;",
+ .external_ids = stage_hint(lb._uuid))
+ };
+
+ Some{var gwport} = l3dgw_port in
+ /* Add logical flows to UNDNAT the load balanced reverse traffic in
+             * the router egress pipeline stage - S_ROUTER_OUT_UNDNAT if the logical
+ * router has a gateway router port associated.
+ */
+ var conds = {
+ var conds = vec_empty();
+ for (ip_str in string_split(backends, ",")) {
+ match (ip_address_and_port_from_lb_key(ip_str)) {
+ None -> () /* FIXME: put a break here */,
+ Some{(ip_address_, port_)} -> conds.push(
+ "(${ipX}.src == ${ip_address_}" ++
+ if (port_ != 0) {
+ " && ${proto}.src == ${port_})"
+ } else {
+ ")"
+ })
+ }
+ };
+ conds
+ } in
+ not conds.is_empty() in
+ var undnat_match =
+ "${ip46_ipX(ip_address)} && (" ++ conds.join(" || ") ++
+ ") && outport == ${json_string_escape(gwport.name)} && "
+ "is_chassis_resident(${redirect_port_name})" in
+ var action =
+ match (has_force_snat_ip) {
+ true -> "flags.force_snat_for_lb = 1; ct_dnat;",
+ false -> "ct_dnat;"
+ } in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, UNDNAT),
+ .priority = 120,
+ .__match = undnat_match,
+ .actions = action,
+ .external_ids = stage_hint(lb._uuid))
+ }
+ }
+}
+
+/* Higher priority rules are added for load-balancing in DNAT
+ * table. For every match (on a VIP[:port]), we add two flows
+ * via add_router_lb_flow(). One flow is for specific matching
+ * on ct.new with an action of "ct_lb($targets);". The other
+ * flow is for ct.est with an action of "ct_dnat;". */
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lb._uuid)) :-
+ r in &Router(),
+ is_some(r.l3dgw_port) or r.is_gateway,
+ LBVIPBackend[lbvipbackend],
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ var lbvip = lbvipbackend.lbvip,
+ var lb = lbvip.lb,
+ r.lr.load_balancer.contains(lb._uuid),
+ bs in &LBVIPBackendStatus(.port = lbvipbackend.port,
+ .ip = lbvipbackend.ip,
+ .protocol = default_protocol(lb.protocol),
+ .logical_port = svc_monitor.port_name),
+ var bses = bs.group_by((r, lbvip, lb)).to_set(),
+ var __match
+ = "ct.new && " ++
+ get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, true) ++
+ match (r.l3dgw_port) {
+ Some{gwport} -> " && is_chassis_resident(${r.redirect_port_name})",
+ _ -> ""
+ },
+ var priority = if (lbvip.vip_port != 0) 120 else 110,
+ var up_backends = {
+ var up_backends = set_empty();
+ for (bs in bses) {
+ if (bs.up) {
+ up_backends.insert("${bs.ip}:${bs.port}")
+ }
+ };
+ up_backends
+ },
+ var actions = if (up_backends.is_empty()) {
+ "drop;"
+ } else {
+ match (has_force_snat_ip(r.lr, "lb")) {
+ true -> "flags.force_snat_for_lb = 1; ",
+ false -> ""
+ } ++ ct_lb(up_backends.to_vec().join(","), lb.selection_fields,
+ lb.protocol)
+ }.
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, DNAT),
+ .priority = priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lb._uuid)) :-
+ r in &Router(),
+ is_some(r.l3dgw_port) or r.is_gateway,
+ LBVIPBackend[lbvipbackend],
+ None = lbvipbackend.svc_monitor,
+ var lbvip = lbvipbackend.lbvip,
+ var lb = lbvip.lb,
+ r.lr.load_balancer.contains(lb._uuid),
+ var __match
+ = "ct.new && " ++
+ get_match_for_lb_key(lbvip.vip_addr, lbvip.vip_port, lb.protocol, true) ++
+ match (r.l3dgw_port) {
+ Some{gwport} -> " && is_chassis_resident(${r.redirect_port_name})",
+ _ -> ""
+ },
+ var priority = if (lbvip.vip_port != 0) 120 else 110,
+ var actions = ct_lb(lbvip.backend_ips, lb.selection_fields, lb.protocol).
+
+
+/* Defaults based on MaxRtrInterval and MinRtrInterval from RFC 4861 section
+ * 6.2.1
+ */
+function nD_RA_MAX_INTERVAL_DEFAULT(): integer = 600
+
+function nd_ra_min_interval_default(max: integer): integer =
+{
+ if (max >= 9) { max / 3 } else { max * 3 / 4 }
+}
+
+function nD_RA_MAX_INTERVAL_MAX(): integer = 1800
+function nD_RA_MAX_INTERVAL_MIN(): integer = 4
+
+function nD_RA_MIN_INTERVAL_MAX(max: integer): integer = ((max * 3) / 4)
+function nD_RA_MIN_INTERVAL_MIN(): integer = 3
+
+function nD_MTU_DEFAULT(): integer = 0
+
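For reference, the interval defaulting and clamping that `copy_ra_to_sb()` performs with these constants can be sketched in Python (integer division assumed, matching DDlog `integer` arithmetic; the constant and function names here are illustrative):

```python
# Sketch of the RFC 4861 RA interval handling in copy_ra_to_sb().
RA_MAX_MIN, RA_MAX_MAX = 4, 1800   # nD_RA_MAX_INTERVAL_{MIN,MAX}
RA_MIN_MIN = 3                     # nD_RA_MIN_INTERVAL_MIN

def clamp_ra_intervals(max_interval, min_interval=None):
    # Clamp max_interval into [4, 1800].
    max_interval = min(max(max_interval, RA_MAX_MIN), RA_MAX_MAX)
    # Default min per nd_ra_min_interval_default().
    if min_interval is None:
        min_interval = (max_interval // 3 if max_interval >= 9
                        else max_interval * 3 // 4)
    # Clamp min_interval into [3, max * 3/4].
    min_interval = min(min_interval, max_interval * 3 // 4)
    min_interval = max(min_interval, RA_MIN_MIN)
    return max_interval, min_interval
```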
+function copy_ra_to_sb(port: RouterPort, address_mode: string): Map<string, string> =
+{
+ var options = port.sb_options;
+
+ options.insert("ipv6_ra_send_periodic", "true");
+ options.insert("ipv6_ra_address_mode", address_mode);
+
+ var max_interval = map_get_int_def(port.lrp.ipv6_ra_configs, "max_interval",
+ nD_RA_MAX_INTERVAL_DEFAULT());
+
+ if (max_interval > nD_RA_MAX_INTERVAL_MAX()) {
+ max_interval = nD_RA_MAX_INTERVAL_MAX()
+ };
+
+ if (max_interval < nD_RA_MAX_INTERVAL_MIN()) {
+ max_interval = nD_RA_MAX_INTERVAL_MIN()
+ };
+
+ options.insert("ipv6_ra_max_interval", "${max_interval}");
+
+ var min_interval = map_get_int_def(port.lrp.ipv6_ra_configs,
+ "min_interval", nd_ra_min_interval_default(max_interval));
+
+ if (min_interval > nD_RA_MIN_INTERVAL_MAX(max_interval)) {
+ min_interval = nD_RA_MIN_INTERVAL_MAX(max_interval)
+ } else ();
+
+ if (min_interval < nD_RA_MIN_INTERVAL_MIN()) {
+ min_interval = nD_RA_MIN_INTERVAL_MIN()
+ } else ();
+
+ options.insert("ipv6_ra_min_interval", "${min_interval}");
+
+ var mtu = map_get_int_def(port.lrp.ipv6_ra_configs, "mtu", nD_MTU_DEFAULT());
+
+ /* RFC 2460 requires the MTU for IPv6 to be at least 1280 */
+ if (mtu != 0 and mtu >= 1280) {
+ options.insert("ipv6_ra_mtu", "${mtu}")
+ };
+
+ var prefixes = vec_empty();
+ for (addrs in port.networks.ipv6_addrs) {
+ if (ipv6_netaddr_is_lla(addrs)) {
+ options.insert("ipv6_ra_src_addr", "${addrs.addr}")
+ } else {
+ prefixes.push(ipv6_netaddr_match_network(addrs))
+ }
+ };
+ match (port.sb_options.get("ipv6_ra_pd_list")) {
+ Some{value} -> prefixes.push(value),
+ _ -> ()
+ };
+ options.insert("ipv6_ra_prefixes", prefixes.join(" "));
+
+ match (port.lrp.ipv6_ra_configs.get("rdnss")) {
+ Some{value} -> options.insert("ipv6_ra_rdnss", value),
+ _ -> ()
+ };
+
+ match (port.lrp.ipv6_ra_configs.get("dnssl")) {
+ Some{value} -> options.insert("ipv6_ra_dnssl", value),
+ _ -> ()
+ };
+
+ options.insert("ipv6_ra_src_eth", "${port.networks.ea}");
+
+ var prf = match (port.lrp.ipv6_ra_configs.get("router_preference")) {
+ Some{prf} -> if (prf == "HIGH" or prf == "LOW") prf else "MEDIUM",
+ _ -> "MEDIUM"
+ };
+ options.insert("ipv6_ra_prf", prf);
+
+ match (port.lrp.ipv6_ra_configs.get("route_info")) {
+ Some{s} -> options.insert("ipv6_ra_route_info", s),
+ _ -> ()
+ };
+
+ options
+}
+
+/* Logical router ingress table ND_RA_OPTIONS and ND_RA_RESPONSE: IPv6 Router
+ * Adv (RA) options and response. */
+// FIXME: do these rules apply to derived ports?
+for (&RouterPort[port@RouterPort{.lrp = lrp@nb::Logical_Router_Port{.peer = None},
+ .router = &router,
+ .json_name = json_name,
+ .networks = networks,
+ .peer = PeerSwitch{}}]
+ if (not networks.ipv6_addrs.is_empty()))
+{
+ Some{var address_mode} = lrp.ipv6_ra_configs.get("address_mode") in
+ /* FIXME: we need a nicer way to write this */

+ true ==
+ if ((address_mode != "slaac") and
+ (address_mode != "dhcpv6_stateful") and
+ (address_mode != "dhcpv6_stateless")) {
+ warn("Invalid address mode [${address_mode}] defined");
+ false
+ } else { true } in
+ {
+ if (map_get_bool_def(lrp.ipv6_ra_configs, "send_periodic", false)) {
+ RouterPortRAOptions(lrp._uuid, copy_ra_to_sb(port, address_mode))
+ };
+
+ (true, var prefix) =
+ {
+ var add_rs_response_flow = false;
+ var prefix = "";
+ for (addr in networks.ipv6_addrs) {
+ if (not ipv6_netaddr_is_lla(addr)) {
+ prefix = prefix ++ ", prefix = ${ipv6_netaddr_match_network(addr)}";
+ add_rs_response_flow = true
+ } else ()
+ };
+ (add_rs_response_flow, prefix)
+ } in
+ {
+ var __match = "inport == ${json_name} && ip6.dst == ff02::2 && nd_rs" in
+ /* As per RFC 2460, 1280 is minimum IPv6 MTU. */
+ var mtu = match(lrp.ipv6_ra_configs.get("mtu")) {
+ Some{mtu_s} -> {
+ match (parse_dec_u64(mtu_s)) {
+ None -> 0,
+ Some{mtu} -> if (mtu >= 1280) mtu else 0
+ }
+ },
+ None -> 0
+ } in
+ var actions0 =
+ "${rEGBIT_ND_RA_OPTS_RESULT()} = put_nd_ra_opts("
+ "addr_mode = ${json_string_escape(address_mode)}, "
+ "slla = ${networks.ea}" ++
+ if (mtu > 0) { ", mtu = ${mtu}" } else { "" } in
+ var router_preference = match (lrp.ipv6_ra_configs.get("router_preference")) {
+ Some{"MEDIUM"} -> "",
+ None -> "",
+ Some{prf} -> ", router_preference = \"${prf}\""
+ } in
+ var actions = actions0 ++ router_preference ++ prefix ++ "); next;" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ND_RA_OPTIONS),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lrp._uuid));
+
+ var __match = "inport == ${json_name} && ip6.dst == ff02::2 && "
+ "nd_ra && ${rEGBIT_ND_RA_OPTS_RESULT()}" in
+ var ip6_str = ipv6_string_mapped(in6_generate_lla(networks.ea)) in
+ var actions = "eth.dst = eth.src; eth.src = ${networks.ea}; "
+ "ip6.dst = ip6.src; ip6.src = ${ip6_str}; "
+ "outport = inport; flags.loopback = 1; "
+ "output;" in
+ Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ND_RA_RESPONSE),
+ .priority = 50,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(lrp._uuid))
+ }
+ }
+}
+
+
+/* Logical router ingress tables ND_RA_OPTIONS, ND_RA_RESPONSE: RS responder;
+ * by default goto next (priority 0). */
+for (&Router(.lr = lr))
+{
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ND_RA_OPTIONS),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ND_RA_RESPONSE),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* Proxy table that stores per-port routes.
+ * These routes get converted into logical flows by
+ * the following rule.
+ */
+relation Route(key: route_key, // matching criteria
+ port: Ref<RouterPort>, // output port
+ src_ip: v46_ip, // source IP address for output
+ gateway: Option<v46_ip>) // next hop (unless being delivered)
+
+function build_route_match(key: route_key) : (string, bit<32>) =
+{
+ var ipX = ip46_ipX(key.ip_prefix);
+
+ (var dir, var priority) = match (key.policy) {
+ SrcIp -> ("src", key.plen * 2),
+ DstIp -> ("dst", (key.plen * 2) + 1)
+ };
+
+ var network = ip46_get_network(key.ip_prefix, key.plen);
+ var __match = "${ipX}.${dir} == ${network}/${key.plen}";
+
+ (__match, priority)
+}
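The priority computed by `build_route_match()` implements longest-prefix-match: longer prefixes always win, and a dst-ip route beats a src-ip route of the same length. A sketch of the same formula in Python (the policy strings are illustrative stand-ins for the `SrcIp`/`DstIp` constructors):

```python
def route_priority(policy, plen):
    """Priority formula from build_route_match():
    SrcIp -> plen * 2, DstIp -> plen * 2 + 1."""
    return plen * 2 + (1 if policy == "dst-ip" else 0)
```

So any /25 route outranks any /24 route, regardless of policy.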
+for (Route(.port = port,
+ .key = key,
+ .src_ip = src_ip,
+ .gateway = gateway))
+{
+ var ipX = ip46_ipX(key.ip_prefix) in
+ var xx = ip46_xxreg(key.ip_prefix) in
+ /* IPv6 link-local addresses must be scoped to the local router port. */
+ var inport_match = match (key.ip_prefix) {
+ IPv6{prefix} -> if (in6_is_lla(prefix)) {
+ "inport == ${port.json_name} && "
+ } else "",
+ _ -> ""
+ } in
+ (var ip_match, var priority) = build_route_match(key) in
+ var __match = inport_match ++ ip_match in
+ var nexthop = match (gateway) {
+ Some{gw} -> "${gw}",
+ None -> "${ipX}.dst"
+ } in
+ var actions =
+ "ip.ttl--; "
+ "${rEG_ECMP_GROUP_ID()} = 0; "
+ "${xx}${rEG_NEXT_HOP()} = ${nexthop}; "
+ "${xx}${rEG_SRC()} = ${src_ip}; "
+ "eth.src = ${port.networks.ea}; "
+ "outport = ${port.json_name}; "
+ "flags.loopback = 1; "
+ "next;" in
+ /* The priority here is calculated to implement longest-prefix-match
+ * routing. */
+ Flow(.logical_datapath = port.router.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = 32'd0 ++ priority,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = stage_hint(port.lrp._uuid))
+}
+
+/* Logical router ingress table IP_ROUTING & IP_ROUTING_ECMP: IP Routing.
+ *
+ * A packet that arrives at this table is an IP packet that should be
+ * routed to the address in 'ip[46].dst'.
+ *
+ * For regular routes without ECMP, table IP_ROUTING sets outport to the
+ * correct output port, eth.src to the output port's MAC address, and
+ * '[xx]${rEG_NEXT_HOP()}' to the next-hop IP address (leaving 'ip[46].dst', the
+ * packet’s final destination, unchanged), and advances to the next table.
+ *
+ * For ECMP routes, i.e. multiple routes with the same policy and prefix,
+ * table IP_ROUTING remembers the ECMP group id and selects a member id, and
+ * advances to table IP_ROUTING_ECMP, which sets outport, eth.src, and the
+ * appropriate next-hop register for the selected ECMP member. */
+Route(key, port, src_ip, None) :-
+ RouterPortNetworksIPv4Addr(.port = port, .addr = addr),
+ var key = RouteKey{DstIp, IPv4{addr.addr}, addr.plen},
+ var src_ip = IPv4{addr.addr}.
+
+Route(key, port, src_ip, None) :-
+ RouterPortNetworksIPv6Addr(.port = port, .addr = addr),
+ var key = RouteKey{DstIp, IPv6{addr.addr}, addr.plen},
+ var src_ip = IPv6{addr.addr}.
+
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING_ECMP),
+ .priority = 150,
+ .__match = "${rEG_ECMP_GROUP_ID()} == 0",
+ .actions = "next;",
+ .external_ids = map_empty()) :-
+ r in &Router().
+
+/* Convert the static routes to flows. */
+Route(key, dst.port, dst.src_ip, Some{dst.nexthop}) :-
+ RouterStaticRoute(.router = &router, .key = key, .dsts = dsts),
+ dsts.size() == 1,
+ Some{var dst} = dsts.nth(0).
+
+/* Return a vector of pairs (1, set[0]), ... (n, set[n - 1]). */
+function numbered_vec(set: Set<'A>) : Vec<(bit<16>, 'A)> = {
+ var vec = vec_with_capacity(set.size());
+ var i = 1;
+ for (x in set) {
+ vec.push((i, x));
+ i = i + 1
+ };
+ vec
+}
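In Python terms, `numbered_vec()` is just a 1-based enumerate; a sketch (not part of the source):

```python
def numbered_vec(items):
    """Pair each element with a 1-based index, as in the DDlog helper."""
    return [(i, x) for i, x in enumerate(items, start=1)]
```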
+
+relation EcmpGroup(
+ group_id: bit<16>,
+ router: Ref<Router>,
+ key: route_key,
+ dsts: Set<route_dst>,
+ route_match: string, // This is build_route_match(key).0
+ route_priority: integer) // This is build_route_match(key).1
+
+EcmpGroup(group_id, router, key, dsts, route_match, route_priority) :-
+ r in RouterStaticRoute(.router = router, .key = key, .dsts = dsts),
+ dsts.size() > 1,
+ var groups = (router, key, dsts).group_by(()).to_set(),
+ var group_id_and_group = FlatMap(numbered_vec(groups)),
+ (var group_id, (var router, var key, var dsts)) = group_id_and_group,
+ (var route_match, var route_priority0) = build_route_match(key),
+ var route_priority = route_priority0 as integer.
+
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = route_priority,
+ .__match = route_match,
+ .actions = actions,
+ .external_ids = map_empty()) :-
+ EcmpGroup(group_id, router, key, dsts, route_match, route_priority),
+ var all_member_ids = {
+ var member_ids = vec_with_capacity(dsts.size());
+ for (i in range_vec(1, dsts.size()+1, 1)) {
+ member_ids.push("${i}")
+ };
+ member_ids.join(", ")
+ },
+ var actions =
+ "ip.ttl--; "
+ "flags.loopback = 1; "
+ "${rEG_ECMP_GROUP_ID()} = ${group_id}; " /* XXX */
+ "${rEG_ECMP_MEMBER_ID()} = select(${all_member_ids});".
+
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING_ECMP),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = map_empty()) :-
+ EcmpGroup(group_id, router, key, dsts, _, _),
+ var member_id_and_dst = FlatMap(numbered_vec(dsts)),
+ (var member_id, var dst) = member_id_and_dst,
+ var xx = ip46_xxreg(dst.nexthop),
+ var __match = "${rEG_ECMP_GROUP_ID()} == ${group_id} && "
+ "${rEG_ECMP_MEMBER_ID()} == ${member_id}",
+ var actions = "${xx}${rEG_NEXT_HOP()} = ${dst.nexthop}; "
+ "${xx}${rEG_SRC()} = ${dst.src_ip}; "
+ "eth.src = ${dst.port.networks.ea}; "
+ "outport = ${dst.port.json_name}; "
+ "next;".
+
+/* If symmetric ECMP replies are enabled, then packets that arrive over
+ * an ECMP route need to go through conntrack.
+ */
+relation EcmpSymmetricReply(
+ router: Ref<Router>,
+ dst: route_dst,
+ route_match: string,
+ tunkey: integer)
+EcmpSymmetricReply(router, dst, route_match, tunkey) :-
+ EcmpGroup(.router = router, .dsts = dsts, .route_match = route_match),
+ router.is_gateway,
+ var dst = FlatMap(dsts),
+ dst.ecmp_symmetric_reply,
+ PortTunKeyAllocation(.port = dst.port.lrp._uuid, .tunkey = tunkey).
+
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, DEFRAG),
+ .priority = 100,
+ .__match = __match,
+ .actions = "ct_next;",
+ .external_ids = map_empty()) :-
+ EcmpSymmetricReply(router, dst, route_match, _),
+ var __match = "inport == ${dst.port.json_name} && ${route_match}".
+
+/* And packets that go out over an ECMP route need conntrack.
+ XXX this seems to exactly duplicate the above flow? */
+
+/* Save src eth and inport in ct_label for packets that arrive over
+ * an ECMP route.
+ */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ECMP_STATEFUL),
+ .priority = 100,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = map_empty()) :-
+ EcmpSymmetricReply(router, dst, route_match, tunkey),
+ var __match = "inport == ${dst.port.json_name} && ${route_match} && "
+ "(ct.new && !ct.est)",
+ var actions = "ct_commit { ct_label.ecmp_reply_eth = eth.src;"
+ " ct_label.ecmp_reply_port = ${tunkey};}; next;".
+
+/* Bypass ECMP selection if we already have ct_label information
+ * for where to route the packet.
+ */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = 100,
+ .__match = "${ecmp_reply} && ${route_match}",
+ .actions = "ip.ttl--; "
+ "flags.loopback = 1; "
+ "eth.src = ${dst.port.networks.ea}; "
+ "${xx}reg1 = ${dst.src_ip}; "
+ "outport = ${dst.port.json_name}; "
+ "next;",
+ .external_ids = map_empty()),
+/* Egress reply traffic for symmetric ECMP routes skips router policies. */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, POLICY),
+ .priority = 65535,
+ .__match = ecmp_reply,
+ .actions = "next;",
+ .external_ids = map_empty()),
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 200,
+ .__match = ecmp_reply,
+ .actions = "eth.dst = ct_label.ecmp_reply_eth; next;",
+ .external_ids = map_empty()) :-
+ EcmpSymmetricReply(router, dst, route_match, tunkey),
+ var ecmp_reply = "ct.rpl && ct_label.ecmp_reply_port == ${tunkey}",
+ var xx = ip46_xxreg(dst.nexthop).
+
+
+/* IP Multicast lookup. Here we set the output port, adjust TTL and advance
+ * to next table (priority 500).
+ */
+/* Drop IPv6 multicast traffic that shouldn't be forwarded,
+ * i.e., router solicitation and router advertisement.
+ */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = 550,
+ .__match = "nd_rs || nd_ra",
+ .actions = "drop;",
+ .external_ids = map_empty()) :-
+ router in &Router().
+
+for (IgmpRouterMulticastGroup(address, &rtr, ports)) {
+ for (RouterMcastFloodPorts(&rtr, flood_ports) if rtr.mcast_cfg.relay) {
+ var flood_static = not flood_ports.is_empty() in
+ var mc_static = json_string_escape(mC_STATIC().0) in
+ var static_act = {
+ if (flood_static) {
+ "clone { "
+ "outport = ${mc_static}; "
+ "ip.ttl--; "
+ "next; "
+ "};"
+ } else {
+ ""
+ }
+ } in
+ Some{var ip} = ip46_parse(address) in
+ var ipX = ip46_ipX(ip) in
+ Flow(.logical_datapath = rtr.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = 500,
+ .__match = "${ipX} && ${ipX}.dst == ${address}",
+ .actions =
+ "${static_act} outport = ${json_string_escape(address)}; "
+ "ip.ttl--; next;",
+ .external_ids = map_empty())
+ }
+}
+
+/* If needed, flood unregistered multicast on statically configured ports.
+ * Priority 450. Otherwise drop any multicast traffic.
+ */
+for (RouterMcastFloodPorts(&rtr, flood_ports) if rtr.mcast_cfg.relay) {
+ var mc_static = json_string_escape(mC_STATIC().0) in
+ var flood_static = not flood_ports.is_empty() in
+ var actions = if (flood_static) {
+ "clone { "
+ "outport = ${mc_static}; "
+ "ip.ttl--; "
+ "next; "
+ "};"
+ } else {
+ "drop;"
+ } in
+ Flow(.logical_datapath = rtr.lr._uuid,
+ .stage = router_stage(IN, IP_ROUTING),
+ .priority = 450,
+ .__match = "ip4.mcast || ip6.mcast",
+ .actions = actions,
+ .external_ids = map_empty())
+}
+
+/* Logical router ingress table POLICY: Policy.
+ *
+ * A packet that arrives at this table is an IP packet that should be
+ * permitted/denied/rerouted to the address in the rule's nexthop.
+ * This table sets outport to the correct output port,
+ * eth.src to the output port's MAC address,
+ * the appropriate register to the next-hop IP address (leaving
+ * 'ip[46].dst', the packet’s final destination, unchanged), and
+ * advances to the next table for ARP/ND resolution. */
+for (&Router(.lr = lr)) {
+ /* This is a catch-all rule. It has the lowest priority (0),
+ * matches everything ("1"), and passes through ("next"). */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, POLICY),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+function stage_hint(_uuid: uuid): Map<string,string> = {
+ ["stage-hint" -> "${hex(_uuid[127:96])}"]
+}
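`stage_hint()` keeps only the top 32 bits of the UUID (`_uuid[127:96]`) as the `external_ids` hint. A Python sketch of the same truncation (assuming DDlog's `hex()` prints lowercase hex without padding):

```python
import uuid

def stage_hint(u):
    """external_ids value: UUID bits [127:96] rendered as hex."""
    return {"stage-hint": "%x" % (u.int >> 96)}
```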
+
+
+/* Convert routing policies to flows. */
+function pkt_mark_policy(options: Map<string,string>): string {
+ var pkt_mark = options.get("pkt_mark").and_then(parse_dec_u64).unwrap_or(0);
+ if (pkt_mark > 0 and pkt_mark < (1 << 32)) {
+ "pkt.mark = ${pkt_mark}; "
+ } else {
+ ""
+ }
+}
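`pkt_mark_policy()` above only emits an action when the `pkt_mark` option parses as a decimal integer strictly between 0 and 2^32. A Python sketch of that check (function name is illustrative):

```python
def pkt_mark_action(options):
    """Emit "pkt.mark = N; " only for a valid, in-range pkt_mark option;
    unparsable or out-of-range values yield no action, as in
    pkt_mark_policy()."""
    try:
        mark = int(options.get("pkt_mark", "0"), 10)
    except ValueError:
        mark = 0  # parse failure behaves like an absent option
    if 0 < mark < (1 << 32):
        return f"pkt.mark = {mark}; "
    return ""
```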
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, POLICY),
+ .priority = policy.priority,
+ .__match = policy.__match,
+ .actions = actions,
+ .external_ids = stage_hint(policy._uuid)) :-
+ r in &Router(),
+ var policy_uuid = FlatMap(r.lr.policies),
+ policy in nb::Logical_Router_Policy(._uuid = policy_uuid),
+ policy.action == "reroute",
+ out_port in &RouterPort(.router = r),
+ Some{var nexthop_s} = policy.nexthop,
+ Some{var nexthop} = ip46_parse(nexthop_s),
+ Some{var src_ip} = find_lrp_member_ip(out_port.networks, nexthop),
+ /*
+ None:
+ VLOG_WARN_RL(&rl, "lrp_addr not found for routing policy "
+ " priority %"PRId64" nexthop %s",
+ rule->priority, rule->nexthop);
+ */
+ var xx = ip46_xxreg(src_ip),
+ var actions = (pkt_mark_policy(policy.options) ++
+ "${xx}${rEG_NEXT_HOP()} = ${nexthop}; "
+ "${xx}${rEG_SRC()} = ${src_ip}; "
+ "eth.src = ${out_port.networks.ea}; "
+ "outport = ${out_port.json_name}; "
+ "flags.loopback = 1; "
+ "next;").
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, POLICY),
+ .priority = policy.priority,
+ .__match = policy.__match,
+ .actions = "drop;",
+ .external_ids = stage_hint(policy._uuid)) :-
+ r in &Router(),
+ var policy_uuid = FlatMap(r.lr.policies),
+ policy in nb::Logical_Router_Policy(._uuid = policy_uuid),
+ policy.action == "drop".
+Flow(.logical_datapath = r.lr._uuid,
+ .stage = router_stage(IN, POLICY),
+ .priority = policy.priority,
+ .__match = policy.__match,
+ .actions = pkt_mark_policy(policy.options) ++ "next;",
+ .external_ids = stage_hint(policy._uuid)) :-
+ r in &Router(),
+ var policy_uuid = FlatMap(r.lr.policies),
+ policy in nb::Logical_Router_Policy(._uuid = policy_uuid),
+ policy.action == "allow".
+
+/* XXX destination unreachable */
+
+/* Local router ingress table ARP_RESOLVE: ARP Resolution.
+ *
+ * Multicast packets already have the outport set so just advance to next
+ * table (priority 500).
+ */
+for (&Router(.lr = lr)) {
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 500,
+ .__match = "ip4.mcast || ip6.mcast",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* Local router ingress table ARP_RESOLVE: ARP Resolution.
+ *
+ * Any packet that reaches this table is an IP packet whose next-hop
+ * IP address is in the next-hop register. (ip4.dst is the final
+ * destination.) This table resolves the IP address in the next-hop
+ * register into an output port in outport and an Ethernet address in
+ * eth.dst. */
+// FIXME: does this apply to redirect ports?
+for (rp in &RouterPort(.peer = PeerRouter{peer_port, _},
+ .router = &router,
+ .networks = networks))
+{
+ for (&RouterPort(.lrp = nb::Logical_Router_Port{._uuid = peer_port},
+ .json_name = peer_json_name,
+ .router = &peer_router))
+ {
+ /* This is a logical router port. If next-hop IP address in
+ * the next-hop register matches IP address of this router port, then
+ * the packet is intended to eventually be sent to this
+ * logical port. Set the destination mac address using this
+ * port's mac address.
+ *
+ * The packet is still in peer's logical pipeline. So the match
+ * should be on peer's outport. */
+ if (not networks.ipv4_addrs.is_empty()) {
+ var __match = "outport == ${peer_json_name} && "
+ "${rEG_NEXT_HOP()} == " ++
+ format_v4_networks(networks, false) in
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = __match,
+ .actions = "eth.dst = ${networks.ea}; next;",
+ .external_ids = stage_hint(rp.lrp._uuid))
+ };
+
+ if (not networks.ipv6_addrs.is_empty()) {
+ var __match = "outport == ${peer_json_name} && "
+ "xx${rEG_NEXT_HOP()} == " ++
+ format_v6_networks(networks) in
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = __match,
+ .actions = "eth.dst = ${networks.ea}; next;",
+ .external_ids = stage_hint(rp.lrp._uuid))
+ }
+ }
+}
+
+/* The packet is on a non-gateway chassis and has an unresolved ARP
+ * for a network behind a router port attached to a gateway chassis.
+ * Since the redirect type is "bridged", instead of calling "get_arp"
+ * on this node, redirect the packet to the gateway chassis by setting
+ * the destination mac to the router port mac. */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 50,
+ .__match = "outport == ${rp.json_name} && "
+ "!is_chassis_resident(${router.redirect_port_name})",
+ .actions = "eth.dst = ${rp.networks.ea}; next;",
+ .external_ids = stage_hint(lrp._uuid)) :-
+ rp in &RouterPort(.lrp = lrp, .router = router),
+ router.redirect_port_name != "",
+ Some{"bridged"} = lrp.options.get("redirect-type").
+
+
+/* Drop IP traffic destined to router owned IPs. Part of it is dropped
+ * in stage "lr_in_ip_input" but traffic that could have been unSNATed
+ * but didn't match any existing session might still end up here.
+ *
+ * Priority 1.
+ */
+Flow(.logical_datapath = lr_uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 1,
+ .__match = "ip4.dst == {" ++ match_ips.join(", ") ++ "}",
+ .actions = "drop;",
+ .external_ids = stage_hint(lrp_uuid)) :-
+ &RouterPort(.lrp = nb::Logical_Router_Port{._uuid = lrp_uuid},
+ .router = &Router{.snat_ips = snat_ips,
+ .lr = nb::Logical_Router{._uuid = lr_uuid}},
+ .networks = networks),
+ var addr = FlatMap(networks.ipv4_addrs),
+ snat_ips.contains_key(IPv4{addr.addr}),
+ var match_ips = "${addr.addr}".group_by((lr_uuid, lrp_uuid)).to_vec().
+Flow(.logical_datapath = lr_uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 1,
+ .__match = "ip6.dst == {" ++ match_ips.join(", ") ++ "}",
+ .actions = "drop;",
+ .external_ids = stage_hint(lrp_uuid)) :-
+ &RouterPort(.lrp = nb::Logical_Router_Port{._uuid = lrp_uuid},
+ .router = &Router{.snat_ips = snat_ips,
+ .lr = nb::Logical_Router{._uuid = lr_uuid}},
+ .networks = networks),
+ var addr = FlatMap(networks.ipv6_addrs),
+ snat_ips.contains_key(IPv6{addr.addr}),
+ var match_ips = "${addr.addr}".group_by((lr_uuid, lrp_uuid)).to_vec().
+
+/* This is a logical switch port that backs a VM or a container.
+ * Extract its addresses. For each of the address, go through all
+ * the router ports attached to the switch (to which this port
+ * connects) and if the address in question is reachable from the
+ * router port, add an ARP/ND entry in that router's pipeline. */
+for (SwitchPortIPv4Address(
+ .port = &SwitchPort{.lsp = lsp, .sw = &sw},
+ .ea = ea,
+ .addr = addr)
+ if lsp.__type != "router" and lsp.__type != "virtual" and lsp.is_enabled())
+{
+ for (&SwitchPort(.sw = &Switch{.ls = nb::Logical_Switch{._uuid = sw.ls._uuid}},
+ .peer = Some{&peer@RouterPort{.router = &peer_router}}))
+ {
+ Some{_} = find_lrp_member_ip(peer.networks, IPv4{addr.addr}) in
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer.json_name} && "
+ "${rEG_NEXT_HOP()} == ${addr.addr}",
+ .actions = "eth.dst = ${ea}; next;",
+ .external_ids = stage_hint(lsp._uuid))
+ }
+}
+
+for (SwitchPortIPv6Address(
+ .port = &SwitchPort{.lsp = lsp, .sw = &sw},
+ .ea = ea,
+ .addr = addr)
+ if lsp.__type != "router" and lsp.__type != "virtual" and lsp.is_enabled())
+{
+ for (&SwitchPort(.sw = &Switch{.ls = nb::Logical_Switch{._uuid = sw.ls._uuid}},
+ .peer = Some{&peer@RouterPort{.router = &peer_router}}))
+ {
+ Some{_} = find_lrp_member_ip(peer.networks, IPv6{addr.addr}) in
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer.json_name} && "
+ "xx${rEG_NEXT_HOP()} == ${addr.addr}",
+ .actions = "eth.dst = ${ea}; next;",
+ .external_ids = stage_hint(lsp._uuid))
+ }
+}
+
+/* True if 's' is an empty set or a set that contains just an empty string,
+ * false otherwise.
+ *
+ * This is meant for sets of 0 or 1 elements, like the OVSDB integration
+ * with DDlog uses. */
+function is_empty_set_or_string(s: Option<string>): bool = {
+ match (s) {
+ None -> true,
+ Some{""} -> true,
+ _ -> false
+ }
+}
+
+/* This is a virtual port. Add ARP replies for the virtual ip with
+ * the mac of the currently active virtual parent.
+ * If the logical port doesn't have virtual parent set in
+ * Port_Binding table, then add the flow to set eth.dst to
+ * 00:00:00:00:00:00 and advance to next table so that ARP is
+ * resolved by router pipeline using the arp{} action.
+ * The MAC_Binding entry for the virtual ip might be invalid. */
+Flow(.logical_datapath = peer.router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer.json_name} && "
+ "${rEG_NEXT_HOP()} == ${virtual_ip}",
+ .actions = "eth.dst = 00:00:00:00:00:00; next;",
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "virtual"}),
+ Some{var virtual_ip_s} = lsp.options.get("virtual-ip"),
+ Some{var virtual_parents} = lsp.options.get("virtual-parents"),
+ Some{var virtual_ip} = ip_parse(virtual_ip_s),
+ pb in sb::Port_Binding(.logical_port = sp.lsp.name),
+ is_empty_set_or_string(pb.virtual_parent) or is_none(pb.chassis),
+ sp2 in &SwitchPort(.sw = sp.sw, .peer = Some{peer}),
+ Some{_} = find_lrp_member_ip(peer.networks, IPv4{virtual_ip}).
+Flow(.logical_datapath = peer.router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer.json_name} && "
+ "${rEG_NEXT_HOP()} == ${virtual_ip}",
+ .actions = "eth.dst = ${address.ea}; next;",
+ .external_ids = stage_hint(sp.lsp._uuid)) :-
+ sp in &SwitchPort(.lsp = lsp@nb::Logical_Switch_Port{.__type = "virtual"}),
+ Some{var virtual_ip_s} = lsp.options.get("virtual-ip"),
+ Some{var virtual_parents} = lsp.options.get("virtual-parents"),
+ Some{var virtual_ip} = ip_parse(virtual_ip_s),
+ pb in sb::Port_Binding(.logical_port = sp.lsp.name),
+ not (is_empty_set_or_string(pb.virtual_parent) or is_none(pb.chassis)),
+ Some{var virtual_parent} = pb.virtual_parent,
+ vp in &SwitchPort(.lsp = nb::Logical_Switch_Port{.name = virtual_parent}),
+ var address = FlatMap(vp.static_addresses),
+ sp2 in &SwitchPort(.sw = sp.sw, .peer = Some{peer}),
+ Some{_} = find_lrp_member_ip(peer.networks, IPv4{virtual_ip}).
+
+/* This is a logical switch port that connects to a router. */
+
+/* The peer of this switch port is the router port for which
+ * we need to add logical flows such that it can resolve
+ * ARP entries for all the other router ports connected to
+ * the switch in question. */
+for (&SwitchPort(.lsp = lsp1,
+ .peer = Some{&peer1@RouterPort{.router = &peer_router}},
+ .sw = &sw)
+ if lsp1.is_enabled() and
+ not map_get_bool_def(peer_router.lr.options, "dynamic_neigh_routers", false))
+{
+ for (&SwitchPort(.lsp = lsp2, .peer = Some{&peer2},
+ .sw = &Switch{.ls = nb::Logical_Switch{._uuid = sw.ls._uuid}})
+ /* Skip the router port under consideration. */
+ if peer2.lrp._uuid != peer1.lrp._uuid)
+ {
+ if (not peer2.networks.ipv4_addrs.is_empty()) {
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer1.json_name} && "
+ "${rEG_NEXT_HOP()} == ${format_v4_networks(peer2.networks, false)}",
+ .actions = "eth.dst = ${peer2.networks.ea}; next;",
+ .external_ids = stage_hint(lsp1._uuid))
+ };
+
+ if (not peer2.networks.ipv6_addrs.is_empty()) {
+ Flow(.logical_datapath = peer_router.lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 100,
+ .__match = "outport == ${peer1.json_name} && "
+ "xx${rEG_NEXT_HOP()} == ${format_v6_networks(peer2.networks)}",
+ .actions = "eth.dst = ${peer2.networks.ea}; next;",
+ .external_ids = stage_hint(lsp1._uuid))
+ }
+ }
+}
+
+for (&Router(.lr = lr))
+{
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 0,
+ .__match = "ip4",
+ .actions = "get_arp(outport, ${rEG_NEXT_HOP()}); next;",
+ .external_ids = map_empty());
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_RESOLVE),
+ .priority = 0,
+ .__match = "ip6",
+ .actions = "get_nd(outport, xx${rEG_NEXT_HOP()}); next;",
+ .external_ids = map_empty())
+}
+
+/* Local router ingress table CHK_PKT_LEN: Check packet length.
+ *
+ * For any IPv4 packet with outport set to the distributed gateway
+ * router port, check the packet length and store the result in the
+ * 'REGBIT_PKT_LARGER' register bit.
+ *
+ * Local router ingress table LARGER_PKTS: Handle larger packets.
+ *
+ * For any IPv4 packet with outport set to the distributed gateway
+ * router port and the 'REGBIT_PKT_LARGER' register bit set,
+ * generate an ICMPv4 packet with type 3 (Destination Unreachable) and
+ * code 4 (Fragmentation Needed).
+ */
+Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, CHK_PKT_LEN),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty()) :-
+ &Router(.lr = lr).
+Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LARGER_PKTS),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty()) :-
+ &Router(.lr = lr).
+Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, CHK_PKT_LEN),
+ .priority = 50,
+ .__match = "outport == ${l3dgw_port_json_name}",
+ .actions = "${rEGBIT_PKT_LARGER()} = check_pkt_larger(${mtu}); "
+ "next;",
+ .external_ids = stage_hint(l3dgw_port._uuid)) :-
+ r in &Router(.lr = lr),
+ Some{var l3dgw_port} = r.l3dgw_port,
+ var l3dgw_port_json_name = json_string_escape(l3dgw_port.name),
+ r.redirect_port_name != "",
+ var gw_mtu = map_get_int_def(l3dgw_port.options, "gateway_mtu", 0),
+ gw_mtu > 0,
+ var mtu = gw_mtu + vLAN_ETH_HEADER_LEN().
+Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LARGER_PKTS),
+ .priority = 50,
+ .__match = "inport == ${rp.json_name} && outport == ${l3dgw_port_json_name} && "
+ "ip4 && ${rEGBIT_PKT_LARGER()}",
+ .actions = "icmp4_error {"
+ "${rEGBIT_EGRESS_LOOPBACK()} = 1; "
+ "eth.dst = ${rp.networks.ea}; "
+ "ip4.dst = ip4.src; "
+ "ip4.src = ${first_ipv4.addr}; "
+ "ip.ttl = 255; "
+ "icmp4.type = 3; /* Destination Unreachable. */ "
+ "icmp4.code = 4; /* Frag Needed and DF was Set. */ "
+ /* Set icmp4.frag_mtu to gw_mtu */
+ "icmp4.frag_mtu = ${gw_mtu}; "
+ "next(pipeline=ingress, table=0); "
+ "};",
+ .external_ids = stage_hint(rp.lrp._uuid)) :-
+ r in &Router(.lr = lr),
+ Some{var l3dgw_port} = r.l3dgw_port,
+ var l3dgw_port_json_name = json_string_escape(l3dgw_port.name),
+ r.redirect_port_name != "",
+ var gw_mtu = map_get_int_def(l3dgw_port.options, "gateway_mtu", 0),
+ gw_mtu > 0,
+ rp in &RouterPort(.router = r),
+ rp.lrp != l3dgw_port,
+ Some{var first_ipv4} = rp.networks.ipv4_addrs.nth(0).
+Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, LARGER_PKTS),
+ .priority = 50,
+ .__match = "inport == ${rp.json_name} && outport == ${l3dgw_port_json_name} && "
+ "ip6 && ${rEGBIT_PKT_LARGER()}",
+ .actions = "icmp6_error {"
+ "${rEGBIT_EGRESS_LOOPBACK()} = 1; "
+ "eth.dst = ${rp.networks.ea}; "
+ "ip6.dst = ip6.src; "
+ "ip6.src = ${first_ipv6.addr}; "
+ "ip.ttl = 255; "
+ "icmp6.type = 2; /* Packet Too Big. */ "
+ "icmp6.code = 0; "
+ /* Set icmp6.frag_mtu to gw_mtu */
+ "icmp6.frag_mtu = ${gw_mtu}; "
+ "next(pipeline=ingress, table=0); "
+ "};",
+ .external_ids = stage_hint(rp.lrp._uuid)) :-
+ r in &Router(.lr = lr),
+ Some{var l3dgw_port} = r.l3dgw_port,
+ var l3dgw_port_json_name = json_string_escape(l3dgw_port.name),
+ r.redirect_port_name != "",
+ var gw_mtu = map_get_int_def(l3dgw_port.options, "gateway_mtu", 0),
+ gw_mtu > 0,
+ rp in &RouterPort(.router = r),
+ rp.lrp != l3dgw_port,
+ Some{var first_ipv6} = rp.networks.ipv6_addrs.nth(0).
+
+/* Logical router ingress table GW_REDIRECT: Gateway redirect.
+ *
+ * For traffic with outport equal to the l3dgw_port
+ * on a distributed router, this table redirects a subset
+ * of the traffic to the l3redirect_port which represents
+ * the central instance of the l3dgw_port.
+ */
+for (&Router(.lr = lr,
+ .l3dgw_port = l3dgw_port,
+ .redirect_port_name = redirect_port_name))
+{
+ /* For traffic with outport == l3dgw_port, if the
+ * packet did not match any higher priority redirect
+ * rule, then the traffic is redirected to the central
+ * instance of the l3dgw_port. */
+ Some{var gwport} = l3dgw_port in
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, GW_REDIRECT),
+ .priority = 50,
+ .__match = "outport == ${json_string_escape(gwport.name)}",
+ .actions = "outport = ${redirect_port_name}; next;",
+ .external_ids = stage_hint(gwport._uuid));
+
+ /* Packets are allowed by default. */
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, GW_REDIRECT),
+ .priority = 0,
+ .__match = "1",
+ .actions = "next;",
+ .external_ids = map_empty())
+}
+
+/* Local router ingress table ARP_REQUEST: ARP request.
+ *
+ * In the common case where the Ethernet destination has been resolved,
+ * this table outputs the packet (priority 0). Otherwise, it composes
+ * and sends an ARP/IPv6 NA request (priority 100). */
+Flow(.logical_datapath = router.lr._uuid,
+ .stage = router_stage(IN, ARP_REQUEST),
+ .priority = 200,
+ .__match = __match,
+ .actions = actions,
+ .external_ids = map_empty()) :-
+ rsr in RouterStaticRoute(.router = &router),
+ var dst = FlatMap(rsr.dsts),
+ IPv6{var gw_ip6} = dst.nexthop,
+ var __match = "eth.dst == 00:00:00:00:00:00 && "
+ "ip6 && xx${rEG_NEXT_HOP()} == ${dst.nexthop}",
+ var sn_addr = in6_addr_solicited_node(gw_ip6),
+ var eth_dst = ipv6_multicast_to_ethernet(sn_addr),
+ var sn_addr_s = ipv6_string_mapped(sn_addr),
+ var actions = "nd_ns { "
+ "eth.dst = ${eth_dst}; "
+ "ip6.dst = ${sn_addr_s}; "
+ "nd.target = ${dst.nexthop}; "
+ "output; "
+ "};".
+
+for (&Router(.lr = lr))
+{
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_REQUEST),
+ .priority = 100,
+ .__match = "eth.dst == 00:00:00:00:00:00 && ip4",
+ .actions = "arp { "
+ "eth.dst = ff:ff:ff:ff:ff:ff; "
+ "arp.spa = ${rEG_SRC()}; "
+ "arp.tpa = ${rEG_NEXT_HOP()}; "
+ "arp.op = 1; " /* ARP request */
+ "output; "
+ "};",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_REQUEST),
+ .priority = 100,
+ .__match = "eth.dst == 00:00:00:00:00:00 && ip6",
+ .actions = "nd_ns { "
+ "nd.target = xx${rEG_NEXT_HOP()}; "
+ "output; "
+ "};",
+ .external_ids = map_empty());
+
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(IN, ARP_REQUEST),
+ .priority = 0,
+ .__match = "1",
+ .actions = "output;",
+ .external_ids = map_empty())
+}
+
+
+/* Logical router egress table DELIVERY: Delivery (priority 100).
+ *
+ * Priority 100 rules deliver packets to enabled logical ports. */
+for (&RouterPort(.lrp = lrp,
+ .json_name = json_name,
+ .networks = lrp_networks,
+ .router = &Router{.lr = lr, .mcast_cfg = &mcast_cfg})
+ /* Drop packets to disabled logical ports (since logical flow
+ * tables are default-drop). */
+ if lrp.is_enabled())
+{
+ /* If multicast relay is enabled then also adjust source mac for IP
+ * multicast traffic.
+ */
+ if (mcast_cfg.relay) {
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, DELIVERY),
+ .priority = 110,
+ .__match = "(ip4.mcast || ip6.mcast) && "
+ "outport == ${json_name}",
+ .actions = "eth.src = ${lrp_networks.ea}; output;",
+ .external_ids = stage_hint(lrp._uuid))
+ };
+ /* No egress packets should be processed in the context of
+ * a chassisredirect port. The chassisredirect port should
+ * be replaced by the l3dgw port in the local output
+ * pipeline stage before egress processing. */
+
+ Flow(.logical_datapath = lr._uuid,
+ .stage = router_stage(OUT, DELIVERY),
+ .priority = 100,
+ .__match = "outport == ${json_name}",
+ .actions = "output;",
+ .external_ids = stage_hint(lrp._uuid))
+}
+
+/*
+ * Datapath tunnel key allocation:
+ *
+ * Allocates a globally unique tunnel id in the range 1...2**24-1 for
+ * each Logical_Switch and Logical_Router.
+ */
+
+function oVN_MAX_DP_KEY(): integer { (64'd1 << 24) - 1 }
+function oVN_MAX_DP_GLOBAL_NUM(): integer { (64'd1 << 16) - 1 }
+function oVN_MIN_DP_KEY_LOCAL(): integer { 1 }
+function oVN_MAX_DP_KEY_LOCAL(): integer { oVN_MAX_DP_KEY() - oVN_MAX_DP_GLOBAL_NUM() }
+function oVN_MIN_DP_KEY_GLOBAL(): integer { oVN_MAX_DP_KEY_LOCAL() + 1 }
+function oVN_MAX_DP_KEY_GLOBAL(): integer { oVN_MAX_DP_KEY() }
+
+function oVN_MAX_DP_VXLAN_KEY(): integer { (64'd1 << 12) - 1 }
+function oVN_MAX_DP_VXLAN_KEY_LOCAL(): integer { oVN_MAX_DP_KEY() - oVN_MAX_DP_GLOBAL_NUM() }
+
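+/* In C terms, the constants above carve the 24-bit datapath key space into a
+ * local range and a global range (the latter used for interconnection). A
+ * small Python sketch of the same arithmetic, with uppercase names mirroring
+ * the DDlog functions:
+ *
+ * ```python
+ * # Partition of the 24-bit datapath tunnel key space (illustration only;
+ * # names mirror the DDlog functions above).
+ * OVN_MAX_DP_KEY = (1 << 24) - 1            # 16777215
+ * OVN_MAX_DP_GLOBAL_NUM = (1 << 16) - 1     # 65535
+ * OVN_MIN_DP_KEY_LOCAL = 1
+ * OVN_MAX_DP_KEY_LOCAL = OVN_MAX_DP_KEY - OVN_MAX_DP_GLOBAL_NUM
+ * OVN_MIN_DP_KEY_GLOBAL = OVN_MAX_DP_KEY_LOCAL + 1
+ * OVN_MAX_DP_KEY_GLOBAL = OVN_MAX_DP_KEY
+ *
+ * # Local and global ranges are adjacent and non-overlapping, and together
+ * # they tile the whole space 1...2**24-1.
+ * assert OVN_MIN_DP_KEY_GLOBAL == OVN_MAX_DP_KEY_LOCAL + 1
+ * assert ((OVN_MAX_DP_KEY_LOCAL - OVN_MIN_DP_KEY_LOCAL + 1)
+ *         + (OVN_MAX_DP_KEY_GLOBAL - OVN_MIN_DP_KEY_GLOBAL + 1)
+ *         == OVN_MAX_DP_KEY)
+ * ```
+ */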
+/* If any chassis uses VXLAN encapsulation, then the entire deployment is in VXLAN mode. */
+relation IsVxlanMode0()
+IsVxlanMode0() :-
+ sb::Chassis(.encaps = encaps),
+ var encap_uuid = FlatMap(encaps),
+ sb::Encap(._uuid = encap_uuid, .__type = "vxlan").
+
+relation IsVxlanMode[bool]
+IsVxlanMode[true] :-
+ IsVxlanMode0().
+IsVxlanMode[false] :-
+ Unit(),
+ not IsVxlanMode0().
+
+/* The maximum datapath tunnel key that may be used. */
+relation OvnMaxDpKeyLocal[integer]
+/* OVN_MAX_DP_GLOBAL_NUM doesn't apply for vxlan mode. */
+OvnMaxDpKeyLocal[oVN_MAX_DP_VXLAN_KEY()] :- IsVxlanMode[true].
+OvnMaxDpKeyLocal[oVN_MAX_DP_KEY() - oVN_MAX_DP_GLOBAL_NUM()] :- IsVxlanMode[false].
+
+function get_dp_tunkey(map: Map<string,string>, key: string): Option<integer> {
+ map.get(key)
+ .and_then(parse_dec_u64)
+       .and_then(|x| if (x > 0 and x < (1 << 24)) {
+ Some{x}
+ } else {
+ None
+ })
+}
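+/* In imperative terms, get_dp_tunkey() is a range-checked decimal parse; a
+ * rough Python equivalent, mirroring the documented range 1...2**24-1 (a
+ * sketch, not the actual implementation):
+ *
+ * ```python
+ * def get_dp_tunkey(config, key):
+ *     """Look up 'key' in a string->string map, parse it as decimal, and
+ *     accept it only if it falls in the range 1...2**24-1."""
+ *     value = config.get(key)
+ *     if value is None:
+ *         return None
+ *     try:
+ *         x = int(value, 10)
+ *     except ValueError:
+ *         return None
+ *     return x if 0 < x < (1 << 24) else None
+ * ```
+ *
+ * For example, get_dp_tunkey({"requested-tnl-key": "42"},
+ * "requested-tnl-key") yields 42, while an absent, unparseable, or
+ * out-of-range value yields None.
+ */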
+
+// Tunnel keys requested by datapaths.
+relation RequestedTunKey(datapath: uuid, tunkey: integer)
+RequestedTunKey(uuid, tunkey) :-
+ ls in nb::Logical_Switch(._uuid = uuid),
+ Some{var tunkey} = get_dp_tunkey(ls.other_config, "requested-tnl-key").
+RequestedTunKey(uuid, tunkey) :-
+ lr in nb::Logical_Router(._uuid = uuid),
+ Some{var tunkey} = get_dp_tunkey(lr.options, "requested-tnl-key").
+Warning[message] :-
+ RequestedTunKey(datapath, tunkey),
+ var count = datapath.group_by((tunkey)).size(),
+ count > 1,
+ var message = "${count} logical switches or routers request "
+ "datapath tunnel key ${tunkey}".
+
+// Assign tunnel keys:
+// - First priority to requested tunnel keys.
+// - Second priority to already assigned tunnel keys.
+// In either case, make an arbitrary choice in case of conflicts within a
+// priority level.
+relation AssignedTunKey(datapath: uuid, tunkey: integer)
+AssignedTunKey(datapath, tunkey) :-
+ RequestedTunKey(datapath, tunkey),
+ var datapath = datapath.group_by(tunkey).first().
+AssignedTunKey(datapath, tunkey) :-
+ sb::Datapath_Binding(._uuid = datapath, .tunnel_key = tunkey),
+ not RequestedTunKey(_, tunkey),
+ not RequestedTunKey(datapath, _),
+ var datapath = datapath.group_by(tunkey).first().
+
+// all tunnel keys already in use in the Realized table
+relation AllocatedTunKeys(keys: Set<integer>)
+AllocatedTunKeys(keys) :-
+ AssignedTunKey(.tunkey = tunkey),
+ var keys = tunkey.group_by(()).to_set().
+
+// Datapath_Binding's not yet in the Realized table
+relation NotYetAllocatedTunKeys(datapaths: Vec<uuid>)
+
+NotYetAllocatedTunKeys(datapaths) :-
+ OutProxy_Datapath_Binding(._uuid = datapath),
+ not AssignedTunKey(datapath, _),
+ var datapaths = datapath.group_by(()).to_vec().
+
+// Perform the allocation
+relation TunKeyAllocation(datapath: uuid, tunkey: integer)
+
+TunKeyAllocation(datapath, tunkey) :- AssignedTunKey(datapath, tunkey).
+
+// Case 1: AllocatedTunKeys relation is not empty (i.e., contains
+// a single record that stores a set of allocated keys)
+TunKeyAllocation(datapath, tunkey) :-
+ NotYetAllocatedTunKeys(unallocated),
+ AllocatedTunKeys(allocated),
+ OvnMaxDpKeyLocal[max_dp_key_local],
+ var allocation = FlatMap(allocate(allocated, unallocated, 1, max_dp_key_local)),
+ (var datapath, var tunkey) = allocation.
+
+// Case 2: AllocatedTunKeys relation is empty
+TunKeyAllocation(datapath, tunkey) :-
+ NotYetAllocatedTunKeys(unallocated),
+ not AllocatedTunKeys(_),
+ OvnMaxDpKeyLocal[max_dp_key_local],
+ var allocation = FlatMap(allocate(set_empty(), unallocated, 1, max_dp_key_local)),
+ (var datapath, var tunkey) = allocation.
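+// The allocate() helper used in both cases is defined elsewhere in the DDlog
+// sources; its contract can be approximated in Python as follows (a sketch
+// under that assumption, not the actual implementation): given the set of
+// keys already in use, the items still needing keys, and an inclusive range,
+// pair each item with the lowest free key, dropping items once the range is
+// exhausted.
+//
+// ```python
+// def allocate(allocated, unallocated, min_key, max_key):
+//     """Assign each item in 'unallocated' the lowest key in
+//     [min_key, max_key] not already in 'allocated'. Items left over
+//     when the range runs out get nothing."""
+//     in_use = set(allocated)
+//     pairs = []
+//     key = min_key
+//     for item in unallocated:
+//         while key <= max_key and key in in_use:
+//             key += 1
+//         if key > max_key:
+//             break  # Range exhausted; remaining items stay unallocated.
+//         pairs.append((item, key))
+//         in_use.add(key)
+//     return pairs
+// ```
+//
+// For example, allocate({1, 2}, ["dp-a", "dp-b"], 1, 10) yields
+// [("dp-a", 3), ("dp-b", 4)].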
+
+/*
+ * Port id allocation:
+ *
+ * Port IDs in a per-datapath space in the range 1...2**15-1
+ */
+
+function get_port_tunkey(map: Map<string,string>, key: string): Option<integer> {
+ map.get(key)
+ .and_then(parse_dec_u64)
+       .and_then(|x| if (x > 0 and x < (1 << 15)) {
+ Some{x}
+ } else {
+ None
+ })
+}
+
+// Tunnel keys requested by port bindings.
+relation RequestedPortTunKey(datapath: uuid, port: uuid, tunkey: integer)
+RequestedPortTunKey(datapath, port, tunkey) :-
+ sp in &SwitchPort(),
+ var datapath = sp.sw.ls._uuid,
+ var port = sp.lsp._uuid,
+ Some{var tunkey} = get_port_tunkey(sp.lsp.options, "requested-tnl-key").
+RequestedPortTunKey(datapath, port, tunkey) :-
+ rp in &RouterPort(),
+ var datapath = rp.router.lr._uuid,
+ var port = rp.lrp._uuid,
+ Some{var tunkey} = get_port_tunkey(rp.lrp.options, "requested-tnl-key").
+Warning[message] :-
+ RequestedPortTunKey(datapath, port, tunkey),
+ var count = port.group_by((datapath, tunkey)).size(),
+ count > 1,
+ var message = "${count} logical ports in the same datapath "
+ "request port tunnel key ${tunkey}".
+
+// Assign tunnel keys:
+// - First priority to requested tunnel keys.
+// - Second priority to already assigned tunnel keys.
+// In either case, make an arbitrary choice in case of conflicts within a
+// priority level.
+relation AssignedPortTunKey(datapath: uuid, port: uuid, tunkey: integer)
+AssignedPortTunKey(datapath, port, tunkey) :-
+ RequestedPortTunKey(datapath, port, tunkey),
+ var port = port.group_by((datapath, tunkey)).first().
+AssignedPortTunKey(datapath, port, tunkey) :-
+ sb::Port_Binding(._uuid = port_uuid,
+ .datapath = datapath,
+ .tunnel_key = tunkey),
+ not RequestedPortTunKey(datapath, _, tunkey),
+ not RequestedPortTunKey(datapath, port_uuid, _),
+ var port = port_uuid.group_by((datapath, tunkey)).first().
+
+// all tunnel keys already in use in the Realized table
+relation AllocatedPortTunKeys(datapath: uuid, keys: Set<integer>)
+
+AllocatedPortTunKeys(datapath, keys) :-
+ AssignedPortTunKey(datapath, port, tunkey),
+ var keys = tunkey.group_by(datapath).to_set().
+
+// Port_Binding's not yet in the Realized table
+relation NotYetAllocatedPortTunKeys(datapath: uuid, all_logical_ids: Vec<uuid>)
+
+NotYetAllocatedPortTunKeys(datapath, all_names) :-
+ OutProxy_Port_Binding(._uuid = port_uuid, .datapath = datapath),
+ not AssignedPortTunKey(datapath, port_uuid, _),
+ var all_names = port_uuid.group_by(datapath).to_vec().
+
+// Perform the allocation.
+relation PortTunKeyAllocation(port: uuid, tunkey: integer)
+
+// Transfer existing allocations from the realized table.
+PortTunKeyAllocation(port, tunkey) :- AssignedPortTunKey(_, port, tunkey).
+
+// Case 1: AllocatedPortTunKeys(datapath) is not empty (i.e., contains
+// a single record that stores a set of allocated keys).
+PortTunKeyAllocation(port, tunkey) :-
+ AllocatedPortTunKeys(datapath, allocated),
+ NotYetAllocatedPortTunKeys(datapath, unallocated),
+    var allocation = FlatMap(allocate(allocated, unallocated, 1, 64'h7fff)),
+ (var port, var tunkey) = allocation.
+
+// Case 2: AllocatedPortTunKeys(datapath) relation is empty
+PortTunKeyAllocation(port, tunkey) :-
+ NotYetAllocatedPortTunKeys(datapath, unallocated),
+ not AllocatedPortTunKeys(datapath, _),
+    var allocation = FlatMap(allocate(set_empty(), unallocated, 1, 64'h7fff)),
+ (var port, var tunkey) = allocation.
+
+/*
+ * Multicast group tunnel_key allocation:
+ *
+ * Tunnel-keys in a per-datapath space in the range 32770...65535
+ */
+
+// All tunnel keys already in use in the Realized table.
+relation AllocatedMulticastGroupTunKeys(datapath_uuid: uuid, keys: Set<integer>)
+
+AllocatedMulticastGroupTunKeys(datapath_uuid, keys) :-
+ sb::Multicast_Group(.datapath = datapath_uuid, .tunnel_key = tunkey),
+ //sb::UUIDMap_Datapath_Binding(datapath, Left{datapath_uuid}),
+ var keys = tunkey.group_by(datapath_uuid).to_set().
+
+// Multicast_Group's not yet in the Realized table.
+relation NotYetAllocatedMulticastGroupTunKeys(datapath_uuid: uuid,
+ all_logical_ids: Vec<string>)
+
+NotYetAllocatedMulticastGroupTunKeys(datapath_uuid, all_names) :-
+ OutProxy_Multicast_Group(.name = name, .datapath = datapath_uuid),
+ not sb::Multicast_Group(.name = name, .datapath = datapath_uuid),
+ var all_names = name.group_by(datapath_uuid).to_vec().
+
+// Perform the allocation
+relation MulticastGroupTunKeyAllocation(datapath_uuid: uuid, group: string, tunkey: integer)
+
+// transfer existing allocations from the realized table
+MulticastGroupTunKeyAllocation(datapath_uuid, group, tunkey) :-
+ //sb::UUIDMap_Datapath_Binding(_, datapath_uuid),
+ sb::Multicast_Group(.name = group,
+ .datapath = datapath_uuid,
+ .tunnel_key = tunkey).
+
+// Case 1: AllocatedMulticastGroupTunKeys(datapath) is not empty (i.e.,
+// contains a single record that stores a set of allocated keys)
+MulticastGroupTunKeyAllocation(datapath_uuid, group, tunkey) :-
+ AllocatedMulticastGroupTunKeys(datapath_uuid, allocated),
+ NotYetAllocatedMulticastGroupTunKeys(datapath_uuid, unallocated),
+ (_, var min_key) = mC_IP_MCAST_MIN(),
+ (_, var max_key) = mC_IP_MCAST_MAX(),
+ var allocation = FlatMap(allocate(allocated, unallocated,
+ min_key, max_key)),
+ (var group, var tunkey) = allocation.
+
+// Case 2: AllocatedMulticastGroupTunKeys(datapath) relation is empty
+MulticastGroupTunKeyAllocation(datapath_uuid, group, tunkey) :-
+ NotYetAllocatedMulticastGroupTunKeys(datapath_uuid, unallocated),
+ not AllocatedMulticastGroupTunKeys(datapath_uuid, _),
+ (_, var min_key) = mC_IP_MCAST_MIN(),
+ (_, var max_key) = mC_IP_MCAST_MAX(),
+ var allocation = FlatMap(allocate(set_empty(), unallocated,
+ min_key, max_key)),
+ (var group, var tunkey) = allocation.
+
+/*
+ * Queue ID allocation
+ *
+ * Queue IDs on a chassis, for routers that have QoS enabled, in a per-chassis
+ * space in the range 1...0xf000. In practice there should be only a small
+ * number of these per chassis, and probably a small number overall.
+ *
+ * A queue ID may also need to be deallocated if a port loses its QoS
+ * attributes.
+ *
+ * This logic applies mainly to sb::Port_Binding records bound to a chassis
+ * (i.e. with the chassis column nonempty) but "localnet" ports can also
+ * have a queue ID. For those we use the port's own UUID as the chassis UUID.
+ */
+
+function port_has_qos_params(opts: Map<string, string>): bool = {
+ opts.contains_key("qos_max_rate") or opts.contains_key("qos_burst")
+}
+
+
+// ports in Out_Port_Binding that require queue ID on chassis
+relation PortRequiresQID(port: uuid, chassis: uuid)
+
+PortRequiresQID(pb._uuid, chassis) :-
+ pb in OutProxy_Port_Binding(),
+ pb.__type != "localnet",
+ port_has_qos_params(pb.options),
+ sb::Port_Binding(._uuid = pb._uuid, .chassis = chassis_set),
+ Some{var chassis} = chassis_set.
+PortRequiresQID(pb._uuid, pb._uuid) :-
+ pb in OutProxy_Port_Binding(),
+ pb.__type == "localnet",
+ port_has_qos_params(pb.options),
+ sb::Port_Binding(._uuid = pb._uuid).
+
+relation AggPortRequiresQID(chassis: uuid, ports: Vec<uuid>)
+
+AggPortRequiresQID(chassis, ports) :-
+ PortRequiresQID(port, chassis),
+ var ports = port.group_by(chassis).to_vec().
+
+relation AllocatedQIDs(chassis: uuid, allocated_ids: Map<uuid, integer>)
+
+AllocatedQIDs(chassis, allocated_ids) :-
+ pb in sb::Port_Binding(),
+ pb.__type != "localnet",
+ Some{var chassis} = pb.chassis,
+ Some{var qid_str} = pb.options.get("qdisc_queue_id"),
+ Some{var qid} = parse_dec_u64(qid_str),
+ var allocated_ids = (pb._uuid, qid).group_by(chassis).to_map().
+AllocatedQIDs(chassis, allocated_ids) :-
+ pb in sb::Port_Binding(),
+ pb.__type == "localnet",
+ var chassis = pb._uuid,
+ Some{var qid_str} = pb.options.get("qdisc_queue_id"),
+ Some{var qid} = parse_dec_u64(qid_str),
+ var allocated_ids = (pb._uuid, qid).group_by(chassis).to_map().
+
+// allocate queue IDs to ports
+relation QueueIDAllocation(port: uuid, qids: Option<integer>)
+
+// None for ports that do not require a queue
+QueueIDAllocation(port, None) :-
+ OutProxy_Port_Binding(._uuid = port),
+ not PortRequiresQID(port, _).
+
+QueueIDAllocation(port, Some{qid}) :-
+ AggPortRequiresQID(chassis, ports),
+ AllocatedQIDs(chassis, allocated_ids),
+ var allocations = FlatMap(adjust_allocation(allocated_ids, ports, 1, 64'hf000)),
+ (var port, var qid) = allocations.
+
+QueueIDAllocation(port, Some{qid}) :-
+ AggPortRequiresQID(chassis, ports),
+ not AllocatedQIDs(chassis, _),
+ var allocations = FlatMap(adjust_allocation(map_empty(), ports, 1, 64'hf000)),
+ (var port, var qid) = allocations.
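+/* adjust_allocation() differs from plain allocate() in that it is handed a
+ * map of existing port-to-queue-ID assignments and must preserve the valid
+ * ones, allocating fresh IDs only for newcomers. A hedged Python sketch of
+ * that contract (not the real implementation):
+ *
+ * ```python
+ * def adjust_allocation(allocated_ids, ports, min_id, max_id):
+ *     """allocated_ids: existing port -> queue ID map; ports: all ports
+ *     on the chassis that need an ID. Keep usable existing assignments
+ *     and give the remaining ports the lowest free IDs."""
+ *     result = {}
+ *     in_use = set()
+ *     for port in ports:
+ *         qid = allocated_ids.get(port)
+ *         if qid is not None and min_id <= qid <= max_id \
+ *                 and qid not in in_use:
+ *             result[port] = qid
+ *             in_use.add(qid)
+ *     next_id = min_id
+ *     for port in ports:
+ *         if port in result:
+ *             continue
+ *         while next_id <= max_id and next_id in in_use:
+ *             next_id += 1
+ *         if next_id > max_id:
+ *             break  # Queue ID space exhausted.
+ *         result[port] = next_id
+ *         in_use.add(next_id)
+ *     return sorted(result.items())
+ * ```
+ *
+ * For example, adjust_allocation({"p1": 5}, ["p1", "p2"], 1, 0xf000)
+ * keeps "p1" at 5 and gives "p2" the ID 1.
+ */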
+
+/*
+ * This allows ovn-northd to preserve options:ipv6_ra_pd_list, which is set by
+ * ovn-controller.
+ */
+relation PreserveIPv6RAPDList(lrp_uuid: uuid, ipv6_ra_pd_list: Option<string>)
+PreserveIPv6RAPDList(lrp_uuid, ipv6_ra_pd_list) :-
+ sb::Port_Binding(._uuid = lrp_uuid, .options = options),
+ var ipv6_ra_pd_list = options.get("ipv6_ra_pd_list").
+PreserveIPv6RAPDList(lrp_uuid, None) :-
+ nb::Logical_Router_Port(._uuid = lrp_uuid),
+ not sb::Port_Binding(._uuid = lrp_uuid).
+
+/*
+ * Tag allocation for nested containers.
+ */
+
+/* Reserved tags for each parent port, including:
+ * 1. For ports that need a dynamically allocated tag, the existing tag,
+ *    if any.
+ * 2. For ports that have a statically assigned tag (via `tag_request`), the
+ *    `tag_request` value.
+ * 3. For ports that do not have a `tag_request` but do have a tag statically
+ *    assigned by directly setting the `tag` field, that `tag` value.
+ */
+relation SwitchPortReservedTag(parent_name: string, tags: integer)
+
+SwitchPortReservedTag(parent_name, tag) :-
+ &SwitchPort(.lsp = lsp, .needs_dynamic_tag = needs_dynamic_tag, .parent_name = Some{parent_name}),
+ Some{var tag} = if (needs_dynamic_tag) {
+ lsp.tag
+ } else {
+ match (lsp.tag_request) {
+ Some{req} -> Some{req},
+ None -> lsp.tag
+ }
+ }.
+
+relation SwitchPortReservedTags(parent_name: string, tags: Set<integer>)
+
+SwitchPortReservedTags(parent_name, tags) :-
+ SwitchPortReservedTag(parent_name, tag),
+ var tags = tag.group_by(parent_name).to_set().
+
+SwitchPortReservedTags(parent_name, set_empty()) :-
+ nb::Logical_Switch_Port(.name = parent_name),
+ not SwitchPortReservedTag(.parent_name = parent_name).
+
+/* Allocate tags for ports that require dynamically allocated tags and do not
+ * have any yet.
+ */
+relation SwitchPortAllocatedTags(lsp_uuid: uuid, tag: Option<integer>)
+
+SwitchPortAllocatedTags(lsp_uuid, tag) :-
+ &SwitchPort(.lsp = lsp, .needs_dynamic_tag = true, .parent_name = Some{parent_name}),
+ is_none(lsp.tag),
+ var lsps_need_tag = lsp._uuid.group_by(parent_name).to_vec(),
+ SwitchPortReservedTags(parent_name, reserved),
+ var dyn_tags = allocate_opt(reserved,
+ lsps_need_tag,
+ 1, /* Tag 0 is invalid for nested containers. */
+ 4095),
+ var lsp_tag = FlatMap(dyn_tags),
+ (var lsp_uuid, var tag) = lsp_tag.
+
+/* New tag-to-port assignment:
+ * Case 1. Statically reserved tag (via `tag_request`), if any.
+ * Case 2. Existing tag for ports that require a dynamically allocated tag and already have one.
+ * Case 3. Use newly allocated tags (from `SwitchPortAllocatedTags`) for all other ports.
+ */
+relation SwitchPortNewDynamicTag(port: uuid, tag: Option<integer>)
+
+/* Case 1 */
+SwitchPortNewDynamicTag(lsp._uuid, tag) :-
+ &SwitchPort(.lsp = lsp, .needs_dynamic_tag = false),
+ var tag = match (lsp.tag_request) {
+ Some{0} -> None,
+ treq -> treq
+ }.
+
+/* Case 2 */
+SwitchPortNewDynamicTag(lsp._uuid, Some{tag}) :-
+ &SwitchPort(.lsp = lsp, .needs_dynamic_tag = true),
+ Some{var tag} = lsp.tag.
+
+/* Case 3 */
+SwitchPortNewDynamicTag(lsp._uuid, tag) :-
+ &SwitchPort(.lsp = lsp, .needs_dynamic_tag = true),
+ is_none(lsp.tag),
+ SwitchPortAllocatedTags(lsp._uuid, tag).
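+
+/* Collapsed into a single per-port decision, the three cases read roughly as
+ * follows (a Python sketch of the rules only, ignoring DDlog's relational
+ * evaluation model; the parameter names are illustrative):
+ *
+ * ```python
+ * def new_dynamic_tag(needs_dynamic_tag, tag, tag_request, allocated_tag):
+ *     """Mirror SwitchPortNewDynamicTag; 'tag', 'tag_request', and
+ *     'allocated_tag' are ints or None."""
+ *     if not needs_dynamic_tag:
+ *         # Case 1: static configuration; tag_request of 0 means "no tag".
+ *         return None if tag_request == 0 else tag_request
+ *     if tag is not None:
+ *         # Case 2: the port already holds a dynamically allocated tag.
+ *         return tag
+ *     # Case 3: fall back to the freshly allocated tag, if any.
+ *     return allocated_tag
+ * ```
+ */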
+
+/* IP_Multicast table (only applicable for Switches). */
+sb::Out_IP_Multicast(._uuid = cfg.datapath,
+ .datapath = cfg.datapath,
+ .enabled = Some{cfg.enabled},
+ .querier = Some{cfg.querier},
+ .eth_src = cfg.eth_src,
+ .ip4_src = cfg.ip4_src,
+ .ip6_src = cfg.ip6_src,
+ .table_size = Some{cfg.table_size},
+ .idle_timeout = Some{cfg.idle_timeout},
+ .query_interval = Some{cfg.query_interval},
+ .query_max_resp = Some{cfg.query_max_resp}) :-
+ &McastSwitchCfg[cfg].
+
+
+relation PortExists(name: string)
+PortExists(name) :- nb::Logical_Switch_Port(.name = name).
+PortExists(name) :- nb::Logical_Router_Port(.name = name).
+
+sb::Out_Load_Balancer(._uuid = lb._uuid,
+ .name = lb.name,
+ .vips = lb.vips,
+ .protocol = lb.protocol,
+ .datapaths = datapaths,
+ .external_ids = ["lb_id" -> uuid2str(lb_uuid)]) :-
+    nb::Logical_Switch(._uuid = ls_uuid, .load_balancer = lb_uuids),
+ var lb_uuid = FlatMap(lb_uuids),
+ var datapaths = ls_uuid.group_by(lb_uuid).to_set(),
+ lb in nb::Load_Balancer(._uuid = lb_uuid).
+
+
+sb::Out_Service_Monitor(._uuid = hash128((svc_monitor.port_name, lbvipbackend.ip, lbvipbackend.port, protocol)),
+ .ip = "${lbvipbackend.ip}",
+ .protocol = Some{protocol},
+ .port = lbvipbackend.port as integer,
+ .logical_port = svc_monitor.port_name,
+ .src_mac = to_string(svc_monitor_mac),
+ .src_ip = svc_monitor.src_ip,
+ .options = lbhc.options,
+ .external_ids = map_empty()) :-
+ SvcMonitorMac(svc_monitor_mac),
+ LBVIPBackend[lbvipbackend],
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ LoadBalancerHealthCheckRef[lbhc],
+ PortExists(svc_monitor.port_name),
+ lbvipbackend.lbvip.lb.health_check.contains(lbhc._uuid),
+ lbhc.vip == lbvipbackend.lbvip.vip_key,
+ var protocol = default_protocol(lbvipbackend.lbvip.lb.protocol),
+ protocol != "sctp".
+
+Warning["SCTP load balancers do not currently support "
+ "health checks. Not creating health checks for "
+ "load balancer ${uuid2str(lbvipbackend.lbvip.lb._uuid)}"] :-
+ LBVIPBackend[lbvipbackend],
+ default_protocol(lbvipbackend.lbvip.lb.protocol) == "sctp",
+ Some{var svc_monitor} = lbvipbackend.svc_monitor,
+ LoadBalancerHealthCheckRef[lbhc],
+ lbvipbackend.lbvip.lb.health_check.contains(lbhc._uuid),
+ lbhc.vip == lbvipbackend.lbvip.vip_key.
new file mode 100755
@@ -0,0 +1,127 @@
+#!/usr/bin/env python3
+# Copyright (c) 2020 Nicira, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import getopt
+import sys
+
+import ovs.json
+import ovs.db.error
+import ovs.db.schema
+
+argv0 = sys.argv[0]
+
+def usage():
+ print("""\
+%(argv0)s: ovsdb schema compiler for northd
+usage: %(argv0)s [OPTIONS]
+
+The following option must be specified:
+ -p, --prefix=PREFIX Prefix for declarations in output.
+
+The following ovsdb2ddlog options are supported:
+ -f, --schema-file=FILE OVSDB schema file.
+ -o, --output-table=TABLE Mark TABLE as output.
+ --output-only-table=TABLE Mark TABLE as output-only. DDlog will send updates to this table directly to OVSDB without comparing it with current OVSDB state.
+ --ro=TABLE.COLUMN Ignored.
+ --rw=TABLE.COLUMN Ignored.
+ --output-file=FILE.inc Write output to FILE.inc. If this option is not specified, output will be written to stdout.
+
+The following options are also available:
+ -h, --help display this help message
+ -V, --version display version information\
+""" % {'argv0': argv0})
+ sys.exit(0)
+
+if __name__ == "__main__":
+ try:
+ try:
+ options, args = getopt.gnu_getopt(sys.argv[1:], 'p:f:o:hV',
+ ['prefix=',
+ 'schema-file=',
+ 'output-table=',
+ 'output-only-table=',
+ 'ro=',
+ 'rw=',
+ 'output-file='])
+ except getopt.GetoptError as geo:
+ sys.stderr.write("%s: %s\n" % (argv0, geo.msg))
+ sys.exit(1)
+
+ prefix = None
+ schema_file = None
+ output_tables = set()
+ output_only_tables = set()
+ output_file = None
+ for key, value in options:
+ if key in ['-h', '--help']:
+ usage()
+        elif key in ['-V', '--version']:
+            print("ovsdb2ddlog2c (OVN) @VERSION@")
+            sys.exit(0)
+ elif key in ['-p', '--prefix']:
+ prefix = value
+ elif key in ['-f', '--schema-file']:
+ schema_file = value
+ elif key in ['-o', '--output-table']:
+ output_tables.add(value)
+ elif key == '--output-only-table':
+ output_only_tables.add(value)
+ elif key in ['--ro', '--rw']:
+ pass
+ elif key == '--output-file':
+ output_file = value
+ else:
+ sys.exit(0)
+
+ if schema_file is None:
+ sys.stderr.write("%s: missing -f or --schema-file option\n" % argv0)
+ sys.exit(1)
+ if prefix is None:
+ sys.stderr.write("%s: missing -p or --prefix option\n" % argv0)
+ sys.exit(1)
+ if not output_tables.isdisjoint(output_only_tables):
+        example = next(iter(output_tables.intersection(output_only_tables)))
+ sys.stderr.write("%s: %s may not be both an output table and "
+ "an output-only table\n" % (argv0, example))
+ sys.exit(1)
+
+ schema = ovs.db.schema.DbSchema.from_json(ovs.json.from_file(
+ schema_file))
+
+ all_tables = set(schema.tables.keys())
+ missing_tables = (output_tables | output_only_tables) - all_tables
+ if missing_tables:
+ sys.stderr.write("%s: %s is not the name of a table\n"
+ % (argv0, next(iter(missing_tables))))
+ sys.exit(1)
+
+ f = sys.stdout if output_file is None else open(output_file, "w")
+ for name, tables in (
+ ("input_relations", all_tables - output_only_tables),
+ ("output_relations", output_tables),
+ ("output_only_relations", output_only_tables)):
+ f.write("static const char *%s%s[] = {\n" % (prefix, name))
+ for table in sorted(tables):
+ f.write(" \"%s\",\n" % table)
+ f.write(" NULL,\n")
+ f.write("};\n\n")
+    if output_file is not None:
+        f.close()
+ except ovs.db.error.Error as e:
+ sys.stderr.write("%s: %s\n" % (argv0, e))
+ sys.exit(1)
+
+# Local variables:
+# mode: python
+# End:
@@ -210,3 +210,10 @@ export OVS_CTL_TIMEOUT
# matter break everything.
ASAN_OPTIONS=detect_leaks=0:abort_on_error=true:log_path=asan:$ASAN_OPTIONS
export ASAN_OPTIONS
+
+# Check whether we should run ddlog tests.
+if test '@DDLOGLIBDIR@' != no; then
+ TEST_DDLOG="yes"
+else
+ TEST_DDLOG="no"
+fi
@@ -460,4 +460,7 @@ m4_define([OVN_FOR_EACH_NORTHD], [dnl
m4_pushdef([NORTHD_TYPE], [ovn-northd])dnl
$1
m4_popdef([NORTHD_TYPE])dnl
+m4_pushdef([NORTHD_TYPE], [ovn-northd-ddlog])dnl
+$1
+m4_popdef([NORTHD_TYPE])dnl
])
@@ -295,7 +295,6 @@ OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
AT_CLEANUP
])
-
OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- check HA_Chassis_Group propagation from NBDB to SBDB])
ovn_start
@@ -707,6 +706,103 @@ check_row_count Datapath_Binding 1
AT_CLEANUP
])
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- ovn-northd restart])
+ovn_start --no-backup-northd
+
+# Check that ovn-northd is active, by verifying that it creates and
+# destroys southbound datapaths as one would expect.
+check_row_count Datapath_Binding 0
+check ovn-nbctl --wait=sb ls-add sw0
+check_row_count Datapath_Binding 1
+
+# Kill northd.
+as northd
+OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
+
+# With ovn-northd gone, changes to nbdb won't be reflected into sbdb.
+# Make sure.
+check ovn-nbctl ls-add sw1
+sleep 5
+check_row_count Datapath_Binding 1
+
+# Now resume ovn-northd. Changes should catch up.
+ovn_start_northd primary
+wait_row_count Datapath_Binding 2
+
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- northbound database reconnection])
+ovn_start --no-backup-northd
+
+# Check that ovn-northd is active, by verifying that it creates and
+# destroys southbound datapaths as one would expect.
+check_row_count Datapath_Binding 0
+check ovn-nbctl --wait=sb ls-add sw0
+check_row_count Datapath_Binding 1
+lf=$(count_rows Logical_Flow)
+
+# Make nbdb ovsdb-server drop connection from ovn-northd.
+conn=$(as ovn-nb ovs-appctl -t ovsdb-server ovsdb-server/list-remotes)
+check as ovn-nb ovs-appctl -t ovsdb-server ovsdb-server/remove-remote "$conn"
+conn2=punix:`pwd`/special.sock
+check as ovn-nb ovs-appctl -t ovsdb-server ovsdb-server/add-remote "$conn2"
+
+# ovn-northd won't respond to changes (because the nbdb connection dropped).
+check ovn-nbctl --db="${conn2#p}" ls-add sw1
+sleep 5
+check_row_count Datapath_Binding 1
+check_row_count Logical_Flow $lf
+
+# Now re-enable the nbdb connection and observe ovn-northd catch up.
+#
+# It's important to check both Datapath_Binding and Logical_Flow because
+# ovn-northd-ddlog implements them in different ways that might go wrong
+# differently on reconnection.
+check as ovn-nb ovs-appctl -t ovsdb-server ovsdb-server/add-remote "$conn"
+wait_row_count Datapath_Binding 2
+wait_row_count Logical_Flow $(expr 2 \* $lf)
+
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([ovn -- southbound database reconnection])
+ovn_start --no-backup-northd
+
+# Check that ovn-northd is active, by verifying that it creates and
+# destroys southbound datapaths as one would expect.
+check_row_count Datapath_Binding 0
+check ovn-nbctl --wait=sb ls-add sw0
+check_row_count Datapath_Binding 1
+lf=$(count_rows Logical_Flow)
+
+# Make sbdb ovsdb-server drop connection from ovn-northd.
+conn=$(as ovn-sb ovs-appctl -t ovsdb-server ovsdb-server/list-remotes)
+check as ovn-sb ovs-appctl -t ovsdb-server ovsdb-server/remove-remote "$conn"
+conn2=punix:`pwd`/special.sock
+check as ovn-sb ovs-appctl -t ovsdb-server ovsdb-server/add-remote "$conn2"
+
+# ovn-northd can't respond to changes (because the sbdb connection dropped).
+check ovn-nbctl ls-add sw1
+sleep 5
+OVN_SB_DB=${conn2#p} check_row_count Datapath_Binding 1
+OVN_SB_DB=${conn2#p} check_row_count Logical_Flow $lf
+
+# Now re-enable the sbdb connection and observe ovn-northd catch up.
+#
+# It's important to check both Datapath_Binding and Logical_Flow because
+# ovn-northd-ddlog implements them in different ways that might go wrong
+# differently on reconnection.
+check as ovn-sb ovs-appctl -t ovsdb-server ovsdb-server/add-remote "$conn"
+wait_row_count Datapath_Binding 2
+wait_row_count Logical_Flow $(expr 2 \* $lf)
+
+AT_CLEANUP
+])
+
OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- check Redirect Chassis propagation from NB to SB])
ovn_start
@@ -1854,8 +1950,10 @@ check_column meter_me nb:meter name
check_acl_lflow() {
acl_log_name=$1
meter_name=$2
- # echo checking that logical flow for acl log $acl_log_name has $meter_name
- AT_CHECK([ovn-sbctl lflow-list | grep ls_out_acl | \
+ echo "checking that logical flow for acl log $acl_log_name has $meter_name"
+ ovn-sbctl dump-flows > sbflows
+ AT_CAPTURE_FILE([sbflows])
+ AT_CHECK([grep ls_out_acl sbflows | \
grep "\"${acl_log_name}\"" | \
grep -c "meter=\"${meter_name}\""], [0], [1
])
@@ -1869,7 +1967,7 @@ check_meter_by_name() {
done
}
-# Make sure 'fair' value properly affects the Meters in SB
+AS_BOX([Make sure 'fair' value properly affects the Meters in SB])
check_meter_by_name meter_me
check_meter_by_name NOT meter_me__${acl1} meter_me__${acl2}
@@ -1883,40 +1981,42 @@ check_meter_by_name NOT meter_me__${acl1} meter_me__${acl2}
check ovn-nbctl --wait=sb set Meter $nb_meter_uuid fair=true
check_meter_by_name meter_me meter_me__${acl1} meter_me__${acl2}
-# Change template meter and make sure that is reflected on acl meters as well
+AS_BOX([Change template meter and make sure that is reflected on acl meters])
template_band=$(fetch_column nb:meter bands name=meter_me)
check ovn-nbctl --wait=sb set meter_band $template_band rate=123
-# Make sure that every Meter_Band has the right rate. (ovn-northd
-# creates 3 identical Meter_Band rows, all identical; ovn-northd-ddlog
-# creates just 1. It doesn't matter, they work just as well.)
+AS_BOX([Make sure that every Meter_Band has the right rate])
+# ovn-northd creates 3 Meter_Band rows, all identical;
+# ovn-northd-ddlog creates just 1.  It doesn't matter: they work
+# just as well.
n_meter_bands=$(count_rows meter_band)
AT_FAIL_IF([test "$n_meter_bands" != 1 && test "$n_meter_bands" != 3])
check_row_count meter_band $n_meter_bands rate=123
-# Check meter in logical flows for acl logs
+AS_BOX([Check meter in logical flows for acl logs])
check_acl_lflow acl_one meter_me__${acl1}
check_acl_lflow acl_two meter_me__${acl2}
-# Stop using meter for acl1
+AS_BOX([Stop using meter for acl1])
check ovn-nbctl --wait=sb clear acl $acl1 meter
check_meter_by_name meter_me meter_me__${acl2}
check_meter_by_name NOT meter_me__${acl1}
check_acl_lflow acl_two meter_me__${acl2}
-# Remove template Meter should remove all others as well
+AS_BOX([Remove template Meter should remove all others as well])
check ovn-nbctl --wait=sb meter-del meter_me
check_row_count meter 0
-# Check that logical flow remains but uses non-unique meter since fair
-# attribute is lost by the removal of the Meter row.
+AS_BOX([Check that logical flow remains but uses non-unique meter])
+# (The fair attribute is lost by the removal of the Meter row.)
check_acl_lflow acl_two meter_me
-# Re-add template meter and make sure acl2's meter is back in sb
+AS_BOX([Re-add template meter and make sure acl2's meter is back in sb])
check ovn-nbctl --wait=sb --fair meter-add meter_me drop 1 pktps
check_meter_by_name meter_me meter_me__${acl2}
check_meter_by_name NOT meter_me__${acl1}
check_acl_lflow acl_two meter_me__${acl2}
-# Remove acl2
+AS_BOX([Remove acl2])
sw0=$(fetch_column nb:logical_switch _uuid name=sw0)
check ovn-nbctl --wait=sb remove logical_switch $sw0 acls $acl2
check_meter_by_name meter_me
@@ -2030,7 +2130,9 @@ get_tunnel_keys
AT_CHECK([test $lsp02 = 3 && test $ls1 = 123])
AT_CLEANUP
+])
+OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- NB to SB load balancer sync])
ovn_start
@@ -16963,6 +16963,10 @@ AT_CLEANUP
OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- IGMP snoop/querier/relay])
+
+dnl This test has problems with ovn-northd-ddlog.
+AT_SKIP_IF([test NORTHD_TYPE = ovn-northd-ddlog && test "$RUN_ANYWAY" != yes])
+
ovn_start
# Logical network:
@@ -17637,6 +17641,10 @@ AT_CLEANUP
OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- MLD snoop/querier/relay])
+
+dnl This test has problems with ovn-northd-ddlog.
+AT_SKIP_IF([test NORTHD_TYPE = ovn-northd-ddlog && test "$RUN_ANYWAY" != yes])
+
ovn_start
# Logical network:
@@ -20340,6 +20348,10 @@ AT_CLEANUP
OVN_FOR_EACH_NORTHD([
AT_SETUP([ovn -- interconnection])
+
+dnl This test has problems with ovn-northd-ddlog.
+AT_SKIP_IF([test NORTHD_TYPE = ovn-northd-ddlog && test "$RUN_ANYWAY" != yes])
+
ovn_init_ic_db
n_az=5
n_ts=5
@@ -7,11 +7,14 @@ dnl Make AT_SETUP automatically do some things for us:
dnl - Run the ovs_init() shell function as the first step in every test.
dnl - If NORTHD_TYPE is defined, then append it to the test name and
dnl set it as a shell variable as well.
+dnl - Skip the test if it's for ovn-northd-ddlog but it didn't get built.
m4_rename([AT_SETUP], [OVS_AT_SETUP])
m4_define([AT_SETUP],
[OVS_AT_SETUP($@[]m4_ifdef([NORTHD_TYPE], [ -- NORTHD_TYPE]))
m4_ifdef([NORTHD_TYPE], [[NORTHD_TYPE]=NORTHD_TYPE
-AT_SKIP_IF([test $NORTHD_TYPE = ovn-northd-ddlog && test $TEST_DDLOG = no])
+])dnl
+m4_if(NORTHD_TYPE, [ovn-northd-ddlog], [dnl
+AT_SKIP_IF([test $TEST_DDLOG = no])
])dnl
ovs_init
])
@@ -5572,4 +5572,4 @@ as
OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
/.*terminating with signal 15.*/d"])
AT_CLEANUP
-])
\ No newline at end of file
+])
@@ -72,6 +72,7 @@ schema=
installed=false
built=false
ovn=true
+ddlog=false
ovnsb_schema=
ovnnb_schema=
ic_sb_schema=
@@ -143,6 +144,7 @@ General options:
-S, --schema=FILE use FILE as vswitch.ovsschema
OVN options:
+ --ddlog use ovn-northd-ddlog
--no-ovn-rbac disable role-based access control for OVN
--n-northds=NUMBER run NUMBER copies of northd (default: 1)
--n-ics=NUMBER run NUMBER copies of ic (default: 1)
@@ -234,6 +236,9 @@ EOF
--gdb-ovn-controller-vtep)
gdb_ovn_controller_vtep=true
;;
+ --ddlog)
+ ddlog=true
+ ;;
--no-ovn-rbac)
ovn_rbac=false
;;
@@ -609,12 +614,23 @@ for i in $(seq $n_ics); do
--ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" \
--ic-sb-db="$OVN_IC_SB_DB" --ic-nb-db="$OVN_IC_NB_DB"
done
+
+northd_args=
+if $ddlog; then
+ OVN_NORTHD=ovn-northd-ddlog
+else
+ OVN_NORTHD=ovn-northd
+fi
+
for i in $(seq $n_northds); do
if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
- rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
- --no-chdir --pidfile=ovn-northd${inst}.pid -vconsole:off \
- --log-file=ovn-northd${inst}.log -vsyslog:off \
- --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB"
+ if $ddlog; then
+ northd_args=--ddlog-record=replay$inst.txt
+ fi
+ rungdb $gdb_ovn_northd $gdb_ovn_northd_ex $OVN_NORTHD --detach \
+ --no-chdir --pidfile=$OVN_NORTHD$inst.pid -vconsole:off \
+ --log-file=$OVN_NORTHD$inst.log -vsyslog:off \
+ --ovnsb-db="$OVN_SB_DB" --ovnnb-db="$OVN_NB_DB" $northd_args
done
for i in $(seq $n_controllers); do
if [ $i -eq 1 ]; then inst=""; else inst=$i; fi
@@ -184,7 +184,7 @@ skip_signoff_check = False
#
# Python isn't checked as flake8 performs these checks during build.
line_length_blacklist = re.compile(
- r'\.(am|at|etc|in|m4|mk|patch|py)$|debian/rules')
+    r'\.(am|at|etc|in|m4|mk|patch|py|dl)$|debian/rules')
# Don't enforce a requirement that leading whitespace be all spaces on
# files that include these characters in their name, since these kinds
@@ -458,10 +458,10 @@ start_northd () {
ovn_northd_params="`cat $ovn_northd_db_conf_file`"
fi
- if daemon_is_running ovn-northd; then
- log_success_msg "ovn-northd is already running"
+ if daemon_is_running $OVN_NORTHD_BIN; then
+ log_success_msg "$OVN_NORTHD_BIN is already running"
else
- set ovn-northd
+ set $OVN_NORTHD_BIN
if test X"$OVN_NORTHD_LOGFILE" != X; then
set "$@" --log-file=$OVN_NORTHD_LOGFILE
fi
@@ -571,7 +571,7 @@ start_controller_vtep () {
## ---- ##
stop_northd () {
- OVS_RUNDIR=${OVS_RUNDIR} stop_ovn_daemon ovn-northd
+ OVS_RUNDIR=${OVS_RUNDIR} stop_ovn_daemon $OVN_NORTHD_BIN
if [ ! -e $ovn_northd_db_conf_file ]; then
if test X"$OVN_MANAGE_OVSDB" = Xyes; then
@@ -714,6 +714,7 @@ set_defaults () {
OVN_CONTROLLER_WRAPPER=
OVSDB_NB_WRAPPER=
OVSDB_SB_WRAPPER=
+ OVN_NORTHD_DDLOG=no
OVN_USER=
@@ -932,6 +933,8 @@ Options:
--ovs-user="user[:group]" pass the --user flag to ovs daemons
--ovsdb-nb-wrapper=WRAPPER run with a wrapper like valgrind for debugging
--ovsdb-sb-wrapper=WRAPPER run with a wrapper like valgrind for debugging
+ --ovn-northd-ddlog=yes|no whether we should run the DDlog version
+ of ovn-northd. The default is "no".
-h, --help display this help message
File location options:
@@ -1087,6 +1090,13 @@ do
;;
esac
done
+
+if test X"$OVN_NORTHD_DDLOG" = Xyes; then
+ OVN_NORTHD_BIN=ovn-northd-ddlog
+else
+ OVN_NORTHD_BIN=ovn-northd
+fi
+
case $command in
start_northd)
start_northd
@@ -1179,7 +1189,7 @@ case $command in
restart_ic_sb_ovsdb
;;
status_northd)
- daemon_status ovn-northd || exit 1
+ daemon_status $OVN_NORTHD_BIN || exit 1
;;
status_ovsdb)
status_ovsdb