[ovs-dev,v3] Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router. Permit/deny policies are similar to OVN ACLs, but exist on the logical-router. Reroute policies are needed for service-insertion and.
diff mbox series

Message ID 1547665307-45407-2-git-send-email-mary.manohar@nutanix.com
State Not Applicable
Headers show
Series
  • [ovs-dev,v3] Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router. Permit/deny policies are similar to OVN ACLs, but exist on the logical-router. Reroute policies are needed for service-insertion and.
Related show

Commit Message

Mary Manohar Jan. 16, 2019, 7:01 p.m. UTC
Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router.
Permit/deny policies are similar to OVN ACLs, but exist on the logical-router.
Reroute policies are needed for service-insertion and service-chaining.
Currently, policies are stateless.

To achieve this, a new table is introduced in the ingress pipeline of the Logical-router.
The new table is between the ‘IP Routing’ and the ‘ARP/ND resolution’ table.
This way, PBR can override routing decisions and provide a different next-hop.

This Patch:
  a. Changes in OVN NB Schema to introduce a new table in the Logical router.
  b. Add commands to ovn-nbctl to add/delete/list routing policies.
  c. Changes in ovn-northd to process routing-policy configurations.

    A new table 'Logical_Router_Policy' has been added in the northbound schema.
    The table has the following columns:
     * priority: Rules with numerically higher priority take precedence over those with lower.
     * match: Uses the same expression language as the 'match' column of 'Logical_Flow' table in the OVN Southbound database.
     * action: allow/drop/reroute
     * nexthop: Nexthop IP address.

     Each row in this table represents one routing policy for a logical router.
     The 'action' column for the highest priority matching row in this table
     determines a packet's treatment. If no row matches, packets are allowed by
     default.

    The new ovn-nbctl commands are as follows:
    1. Add a new ovn-nbctl command to add a routing policy.
       lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP]

       Nexthop is an optional parameter.
       It needs to be provided only when 'action' is 'reroute'
       A policy is uniquely identified by priority and match.
       Multiple policies can have the same priority.

    2. Add a new ovn-nbctl command to delete a routing policy.
       lr-policy-del ROUTER [PRIORITY [MATCH]]

       Takes priority and match as optional parameters.
       If priority and match are specified, the policy with the given priority and match is deleted.
       If priority is specified and match is not specified, all rules with that priority are deleted.
       If priority is not specified, all the rules would be deleted.

    3. Add a new ovn-nbctl command to list routing-policies in the logical router.
       lr-policy-list ROUTER

    ovn-northd changes are to get routing-policies from northbound database and populate the same as logical flows in the southbound database.
    A new table called 'POLICY' is introduced in the Logical router's ingress pipeline.
    Each routing-policy configured in the northbound database translates into a single logical flow in the new table.

    The columns from the Logical_Router_Policy table are used as follows:
    The priority column is used as priority in the logical-flow.
    The match column is used as the 'match' string in the logical-flow.
    The action column is used to determine the action of the logical-flow.

    When the 'action' is reroute, if the nexthop ip-address is a connected router port or the IP address of a logical port,
    the logical-flow is constructed to route the packet to the nexthop ip-address.

Signed-off-by: Mary Manohar <mary.manohar@nutanix.com>
---
 ovn/northd/ovn-northd.c   | 144 +++++++++++++++++++++++++++++++--
 ovn/ovn-nb.ovsschema      |  19 ++++-
 ovn/ovn-nb.xml            |  63 +++++++++++++++
 ovn/utilities/ovn-nbctl.c | 198 ++++++++++++++++++++++++++++++++++++++++++++++
 tests/ovn-nbctl.at        |  47 +++++++++++
 tests/ovn.at              | 187 +++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 650 insertions(+), 8 deletions(-)

Comments

Ben Pfaff Jan. 16, 2019, 8:32 p.m. UTC | #1
On Wed, Jan 16, 2019 at 07:01:13PM +0000, Mary Manohar wrote:
> Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router.
> Permit/deny policies are similar to OVN ACLs, but exist on the logical-router.
> Reroute policies are needed for service-insertion and service-chaining.
> Currently, policies are stateless.
> 
> To achieve this, a new table is introduced in the ingress pipeline of the Logical-router.
> The new table is between the ‘IP Routing’ and the ‘ARP/ND resolution’ table.
> This way, PBR can override routing decisions and provide a different next-hop.

Thanks a lot for the patch!

I'm having trouble applying it.  When I pass it to "git am -3", I get
the following:

Using index info to reconstruct a base tree...
M	ovn/northd/ovn-northd.c
M	ovn/ovn-nb.xml
M	ovn/utilities/ovn-nbctl.c
M	tests/ovn-nbctl.at
M	tests/ovn.at
error: patch failed: ovn/northd/ovn-northd.c:142
error: ovn/northd/ovn-northd.c: patch does not apply
error: patch failed: ovn/ovn-nb.ovsschema:1
error: ovn/ovn-nb.ovsschema: patch does not apply
error: patch failed: ovn/ovn-nb.xml:1236
error: ovn/ovn-nb.xml: patch does not apply
error: patch failed: ovn/utilities/ovn-nbctl.c:640
error: ovn/utilities/ovn-nbctl.c: patch does not apply
error: patch failed: tests/ovn-nbctl.at:1331
error: tests/ovn-nbctl.at: patch does not apply
error: patch failed: tests/ovn.at:5085
error: tests/ovn.at: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.

Thanks,

Ben.
Ben Pfaff Jan. 16, 2019, 8:54 p.m. UTC | #2
On Wed, Jan 16, 2019 at 12:32:02PM -0800, Ben Pfaff wrote:
> On Wed, Jan 16, 2019 at 07:01:13PM +0000, Mary Manohar wrote:
> > Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router.
> > Permit/deny policies are similar to OVN ACLs, but exist on the logical-router.
> > Reroute policies are needed for service-insertion and service-chaining.
> > Currently, policies are stateless.
> > 
> > To achieve this, a new table is introduced in the ingress pipeline of the Logical-router.
> > The new table is between the ‘IP Routing’ and the ‘ARP/ND resolution’ table.
> > This way, PBR can override routing decisions and provide a different next-hop.
> 
> Thanks a lot for the patch!
> 
> I'm having trouble applying it.  When I pass it to "git am -3", I get
> the following:

You can disregard that, the problem was on my end.

I'll review the patch now.
0-day Robot Jan. 16, 2019, 9:42 p.m. UTC | #3
Bleep bloop.  Greetings Mary Manohar, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Inappropriate bracing around statement
#101 FILE: ovn/northd/ovn-northd.c:4698:
    if (nexthop == NULL)

WARNING: Line is 80 characters long (recommended limit is 79)
#146 FILE: ovn/northd/ovn-northd.c:4743:
        VLOG_WARN_RL(&rl, "No path for routing policy priority %d; next hop %s",

WARNING: Line is 80 characters long (recommended limit is 79)
#278 FILE: ovn/ovn-nb.ovsschema:246:
                                            "refTable": "Logical_Router_Policy",

WARNING: Line is 93 characters long (recommended limit is 79)
#296 FILE: ovn/ovn-nb.ovsschema:318:
                                            "enum": ["set", ["allow", "drop", "reroute"]]}}},

WARNING: Line is 80 characters long (recommended limit is 79)
#342 FILE: ovn/ovn-nb.xml:1869:
        The packets that the routing policy should match, in the same expression

WARNING: Line lacks whitespace around operator
#392 FILE: ovn/utilities/ovn-nbctl.c:644:
  lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP]\n\

WARNING: Line lacks whitespace around operator
#394 FILE: ovn/utilities/ovn-nbctl.c:646:
  lr-policy-del ROUTER [PRIORITY [MATCH]]\n\

WARNING: Line lacks whitespace around operator
#396 FILE: ovn/utilities/ovn-nbctl.c:648:
  lr-policy-list ROUTER   print policies for ROUTER\n\

ERROR: Inappropriate bracing around statement
#467 FILE: ovn/utilities/ovn-nbctl.c:3468:
    if (next_hop != NULL)

ERROR: Inappropriate bracing around statement
#553 FILE: ovn/utilities/ovn-nbctl.c:3554:
    } else

Lines checked: 872, Warnings: 7, Errors: 3


build:
depbase=`echo ovn/controller-vtep/vtep.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.    -I ./include -I ./include -I ./lib -I ./lib    -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror   -g -O2 -MT ovn/controller-vtep/vtep.o -MD -MP -MF $depbase.Tpo -c -o ovn/controller-vtep/vtep.o ovn/controller-vtep/vtep.c &&\
mv -f $depbase.Tpo $depbase.Po
/bin/sh ./libtool  --tag=CC   --mode=link gcc -std=gnu99 -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror   -g -O2     -o ovn/controller-vtep/ovn-controller-vtep ovn/controller-vtep/binding.o ovn/controller-vtep/gateway.o ovn/controller-vtep/ovn-controller-vtep.o ovn/controller-vtep/vtep.o ovn/lib/libovn.la lib/libopenvswitch.la vtep/libvtep.la -lpthread -lrt -lm  -lunbound
libtool: link: gcc -std=gnu99 -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror -g -O2 -o ovn/controller-vtep/ovn-controller-vtep ovn/controller-vtep/binding.o ovn/controller-vtep/gateway.o ovn/controller-vtep/ovn-controller-vtep.o ovn/controller-vtep/vtep.o  ovn/lib/.libs/libovn.a lib/.libs/libopenvswitch.a -lssl -lcrypto -lcap-ng vtep/.libs/libvtep.a -lpthread -lrt -lm -lunbound
depbase=`echo ovn/northd/ovn-northd.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.    -I ./include -I ./include -I ./lib -I ./lib    -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror   -g -O2 -MT ovn/northd/ovn-northd.o -MD -MP -MF $depbase.Tpo -c -o ovn/northd/ovn-northd.o ovn/northd/ovn-northd.c &&\
mv -f $depbase.Tpo $depbase.Po
In file included from ovn/northd/ovn-northd.c:48:0:
ovn/northd/ovn-northd.c: In function ‘build_routing_policy_flow’:
./include/openvswitch/vlog.h:282:14: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘int64_t’ [-Werror=format=]
         enum vlog_level level__ = LEVEL;                                \
              ^
./include/openvswitch/vlog.h:218:31: note: in expansion of macro ‘VLOG_RL’
 #define VLOG_WARN_RL(RL, ...) VLOG_RL(RL, VLL_WARN, __VA_ARGS__)
                               ^
ovn/northd/ovn-northd.c:4769:16: note: in expansion of macro ‘VLOG_WARN_RL’
                VLOG_WARN_RL(&rl, "lrp_addr not found for routing policy "
                ^
cc1: all warnings being treated as errors
make[2]: *** [ovn/northd/ovn-northd.o] Error 1
make[2]: Leaving directory `/var/lib/jenkins/jobs/upstream_build_from_pw/workspace'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/lib/jenkins/jobs/upstream_build_from_pw/workspace'
make: *** [all] Error 2


Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
Ben Pfaff Jan. 17, 2019, 12:24 a.m. UTC | #4
On Wed, Jan 16, 2019 at 07:01:13PM +0000, Mary Manohar wrote:
> Policy-based routing (PBR) provides a mechanism to configure permit/deny and reroute policies on the router.
> Permit/deny policies are similar to OVN ACLs, but exist on the logical-router.
> Reroute policies are needed for service-insertion and service-chaining.
> Currently, policies are stateless.
> 
> To achieve this, a new table is introduced in the ingress pipeline of the Logical-router.
> The new table is between the ‘IP Routing’ and the ‘ARP/ND resolution’ table.
> This way, PBR can override routing decisions and provide a different next-hop.
> 
> This Patch:
>   a. Changes in OVN NB Schema to introduce a new table in the Logical router.
>   b. Add commands to ovn-nbctl to add/delete/list routing policies.
>   c. Changes in ovn-northd to process routing-policy configurations.
> 
>     A new table 'Logical_Router_Policy' has been added in the northbound schema.
>     The table has the following columns:
>      * priority: Rules with numerically higher priority take precedence over those with lower.
>      * match: Uses the same expression language as the 'match' column of 'Logical_Flow' table in the OVN Southbound database.
>      * action: allow/drop/reroute
>      * nexthop: Nexthop IP address.
> 
>      Each row in this table represents one routing policy for a logical router.
>      The 'action' column for the highest priority matching row in this table
>      determines a packet's treatment. If no row matches, packets are allowed by
>      default.
> 
>     The new ovn-nbctl commands are as follows:
>     1. Add a new ovn-nbctl command to add a routing policy.
>        lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP]
> 
>        Nexthop is an optional parameter.
>        It needs to be provided only when 'action' is 'reroute'
>        A policy is uniquely identified by priority and match.
>        Multiple policies can have the same priority.
> 
>     2. Add a new ovn-nbctl command to delete a routing policy.
>        lr-policy-del ROUTER [PRIORITY [MATCH]]
> 
>        Takes priority and match as optional parameters.
>        If priority and match are specified, the policy with the given priority and match is deleted.
>        If priority is specified and match is not specified, all rules with that priority are deleted.
>        If priority is not specified, all the rules would be deleted.
> 
>     3. Add a new ovn-nbctl command to list routing-policies in the logical router.
>        lr-policy-list ROUTER
> 
>     ovn-northd changes are to get routing-policies from northbound database and populate the same as logical flows in the southbound database.
>     A new table called 'POLICY' is introduced in the Logical router's ingress pipeline.
>     Each routing-policy configured in the northbound database translates into a single logical flow in the new table.
> 
>     The columns from the Logical_Router_Policy table are used as follows:
>     The priority column is used as priority in the logical-flow.
>     The match column is used as the 'match' string in the logical-flow.
>     The action column is used to determine the action of the logical-flow.
> 
>     When the 'action' is reroute, if the nexthop ip-address is a connected router port or the IP address of a logical port,
>     the logical-flow is constructed to route the packet to the nexthop ip-address.
> 
> Signed-off-by: Mary Manohar <mary.manohar@nutanix.com>

Thanks for the patch!  It is good work.  I have some feedback for you,
although most of it is fairly superficial.

It looks like this is IPv4-only.  OVN routing supports IPv4 and IPv6,
and I don't know a reason why it would be much extra work to support
IPv6 or why IPv6 should not support policy routing.

The first line of a commit message should be short, so that it fits on a
single line, but this commit message has a whole paragraph in the first
line.

A commit message should be wrapped at 75 columns or so, and generally
not indented, but this commit message has many long lines and the
indentation seems overall kind of random.

checkpatch has some feedback too.  I guess you probably got the same
from the robot as well:

    ERROR: Inappropriate bracing around statement
    #28 FILE: ovn/northd/ovn-northd.c:4698:
        if (nexthop == NULL)

    WARNING: Line is 80 characters long (recommended limit is 79)
    #73 FILE: ovn/northd/ovn-northd.c:4743:
            VLOG_WARN_RL(&rl, "No path for routing policy priority %d; next hop %s",

    WARNING: Line is 80 characters long (recommended limit is 79)
    #205 FILE: ovn/ovn-nb.ovsschema:246:
                                                "refTable": "Logical_Router_Policy",

    WARNING: Line is 93 characters long (recommended limit is 79)
    #223 FILE: ovn/ovn-nb.ovsschema:318:
                                                "enum": ["set", ["allow", "drop", "reroute"]]}}},

    WARNING: Line is 80 characters long (recommended limit is 79)
    #269 FILE: ovn/ovn-nb.xml:1869:
            The packets that the routing policy should match, in the same expression

    WARNING: Line lacks whitespace around operator
    #319 FILE: ovn/utilities/ovn-nbctl.c:644:
      lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP]\n\

    WARNING: Line lacks whitespace around operator
    #321 FILE: ovn/utilities/ovn-nbctl.c:646:
      lr-policy-del ROUTER [PRIORITY [MATCH]]\n\

    WARNING: Line lacks whitespace around operator
    #323 FILE: ovn/utilities/ovn-nbctl.c:648:
      lr-policy-list ROUTER   print policies for ROUTER\n\

    ERROR: Inappropriate bracing around statement
    #394 FILE: ovn/utilities/ovn-nbctl.c:3468:
        if (next_hop != NULL)

    ERROR: Inappropriate bracing around statement
    #480 FILE: ovn/utilities/ovn-nbctl.c:3554:
        } else

In get_outport_for_routing_policy_nexthop(), this is always going to
report error 0:
+    char *error = ip_parse_cidr(nexthop, &nexthop_be32, &plen);
+    if (!error) {
+        if (plen != 32) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+            VLOG_WARN_RL(&rl, "bad next hop ip %s for routing policy "
+                         "with priority %d, error: %s ",
+                         nexthop, priority, error);
+            return NULL;
+        }
but I think it could just use ip_parse() and then it wouldn't have to
worry about a mask at all.

Some of the coding style and indentation was a little unusual, and some
of the code made me think a little too hard, so I'd suggest folding in
the following in the next version because it eases all of these for me:

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 0f99420b11fd..6ffd105641b3 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -4719,32 +4719,18 @@ get_outport_for_routing_policy_nexthop(struct ovn_datapath *od,
         return NULL;
     }
 
-    /* find the router port matching the next hop. */
-    int i;
-    struct ovn_port *out_port = NULL;
-    const char *lrp_addr_s = NULL;
-    for (i = 0; i < od->nbr->n_ports; i++) {
+    /* Find the router port matching the next hop. */
+    for (int i = 0; i < od->nbr->n_ports; i++) {
        struct nbrec_logical_router_port *lrp = od->nbr->ports[i];
-       out_port = ovn_port_find(ports, lrp->name);
-       if (!out_port) {
-          /* This should not happen. */
-           continue;
-       }
-       lrp_addr_s = find_lrp_member_ip(out_port, nexthop);
-       if (lrp_addr_s) {
-            break;
-       } else {
-           out_port = NULL;
+       struct ovn_port *out_port = ovn_port_find(ports, lrp->name);
+       if (out_port && find_lrp_member_ip(out_port, nexthop)) {
+           return out_port;
        }
     }
-    if (!out_port) {
-        /* There is no matched out port. */
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-        VLOG_WARN_RL(&rl, "No path for routing policy priority %d; next hop %s",
-                     priority, nexthop);
-        return NULL;
-    }
-    return out_port;
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+    VLOG_WARN_RL(&rl, "No path for routing policy priority %d; next hop %s",
+                 priority, nexthop);
+    return NULL;
 }
 
 static void
@@ -4756,33 +4742,30 @@ build_routing_policy_flow(struct hmap *lflows, struct ovn_datapath *od,
     struct ds actions = DS_EMPTY_INITIALIZER;
 
     if (!strcmp(rule->action, "reroute")) {
-        struct ovn_port *out_port = NULL;
-        const char *lrp_addr_s = NULL;
-        out_port = get_outport_for_routing_policy_nexthop(
+        struct ovn_port *out_port = get_outport_for_routing_policy_nexthop(
             od, ports, rule->priority, rule->nexthop);
-        if (out_port == NULL) {
-           return;
-        } else {
-           lrp_addr_s = find_lrp_member_ip(out_port, rule->nexthop);
-           if (!lrp_addr_s) {
-               static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
-               VLOG_WARN_RL(&rl, "lrp_addr not found for routing policy "
-                            " priority %d nexthop %s",
-                            rule->priority, rule->nexthop);
-               return;
-           } else {
-               ds_put_format(&actions, "reg0 = %s; "
-                  "reg1 = %s; "
-                  "eth.src = %s; "
-                  "outport = %s; "
-                  "flags.loopback = 1; "
-                  "next;",
-                  rule->nexthop,
-                  lrp_addr_s,
-                  out_port->lrp_networks.ea_s,
-                  out_port->json_key);
-           }
+        if (!out_port) {
+            return;
+        }
+
+        const char *lrp_addr_s = find_lrp_member_ip(out_port, rule->nexthop);
+        if (!lrp_addr_s) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+            VLOG_WARN_RL(&rl, "lrp_addr not found for routing policy "
+                         " priority %"PRId64" nexthop %s",
+                         rule->priority, rule->nexthop);
+            return;
         }
+        ds_put_format(&actions, "reg0 = %s; "
+                      "reg1 = %s; "
+                      "eth.src = %s; "
+                      "outport = %s; "
+                      "flags.loopback = 1; "
+                      "next;",
+                      rule->nexthop,
+                      lrp_addr_s,
+                      out_port->lrp_networks.ea_s,
+                      out_port->json_key);
     } else if (!strcmp(rule->action, "drop")) {
         ds_put_cstr(&actions, "drop;");
     } else if (!strcmp(rule->action, "allow")) {
@@ -6533,8 +6516,8 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
 
         /* Convert routing policies to flows. */
         for (int i = 0; i < od->nbr->n_policies; i++) {
-            const struct nbrec_logical_router_policy *rule;
-             rule = od->nbr->policies[i];
+            const struct nbrec_logical_router_policy *rule
+                = od->nbr->policies[i];
             build_routing_policy_flow(lflows, od, ports, rule);
         }
     }
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 12aef7df49e9..10134c297c81 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1242,7 +1242,7 @@
     </column>
 
     <column name="policies">
-      One or more routing policies for the router.
+      Zero or more routing policies for the router.
     </column>
 
     <column name="enabled">
@@ -1852,7 +1852,7 @@
       column="action"/> column for the highest-<ref column="priority"/>
       matching row in this table determines a packet's treatment.  If no row
       matches, packets are allowed by default. (Default-deny treatment is
-      possible: add a rule with <ref column="priority"/> 0, <code>0</code> as
+      possible: add a rule with <ref column="priority"/> 0, <code>1</code> as
       <ref column="match"/>, and <code>drop</code> as <ref column="action"/>.)
     </p>
 
@@ -1891,14 +1891,14 @@
         </li>
 
         <li>
-          <code>reroute</code>: Reroute packet to nexthop
+          <code>reroute</code>: Reroute packet to <ref column="nexthop"/>.
         </li>
       </ul>
     </column>
 
     <column name="nexthop">
       <p>
-        Nexthop IP address for this route.  Nexthop IP address should be the IP
+        Next-hop IP address for this route, which should be the IP
         address of a connected router port or the IP address of a logical port.
       </p>
     </column>
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index babb907cef27..d833fb17e434 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -3421,6 +3421,7 @@ nbctl_lr_policy_add(struct ctl_context *ctx)
     }
     const char *action = ctx->argv[4];
     char *next_hop = NULL;
+
     /* Validate action. */
     if (strcmp(action, "allow") && strcmp(action, "drop")
         && strcmp(action, "reroute")) {
@@ -3429,17 +3430,18 @@ nbctl_lr_policy_add(struct ctl_context *ctx)
     }
     if (!strcmp(action, "reroute")) {
         if (ctx->argc < 6) {
-            ctl_error(ctx, "Nexthop is not specified when action is reroute.");
+            ctl_error(ctx, "Nexthop is required when action is reroute.");
         }
     }
+
     /* Check if same routing policy already exists.
      * A policy is uniquely identified by priority and match */
     for (int i = 0; i < lr->n_policies; i++) {
         const struct nbrec_logical_router_policy *policy = lr->policies[i];
-        if ((policy->priority == priority) &&
-            (!strcmp(policy->match, ctx->argv[3]))) {
-           ctl_error(ctx, "Same routing policy already existed on the "
-                       "logical router %s.", ctx->argv[1]);
+        if (policy->priority == priority &&
+            !strcmp(policy->match, ctx->argv[3])) {
+            ctl_error(ctx, "Same routing policy already existed on the "
+                      "logical router %s.", ctx->argv[1]);
         }
     }
     if (ctx->argc == 6) {
@@ -3448,6 +3450,7 @@ nbctl_lr_policy_add(struct ctl_context *ctx)
             ctl_error(ctx, "bad next hop argument: %s", ctx->argv[5]);
         }
     }
+
     struct nbrec_logical_router_policy *policy;
     policy = nbrec_logical_router_policy_insert(ctx->txn);
     nbrec_logical_router_policy_set_priority(policy, priority);
@@ -3460,10 +3463,10 @@ nbctl_lr_policy_add(struct ctl_context *ctx)
     struct nbrec_logical_router_policy **new_policies
         = xmalloc(sizeof *new_policies * (lr->n_policies + 1));
     memcpy(new_policies, lr->policies,
-               sizeof *new_policies * lr->n_policies);
+           sizeof *new_policies * lr->n_policies);
     new_policies[lr->n_policies] = policy;
     nbrec_logical_router_set_policies(lr, new_policies,
-                                           lr->n_policies + 1);
+                                      lr->n_policies + 1);
     free(new_policies);
     if (next_hop != NULL)
         free(next_hop);
@@ -3479,23 +3482,26 @@ nbctl_lr_policy_del(struct ctl_context *ctx)
         ctx->error = error;
         return;
     }
-     if (ctx->argc == 2) {
+
+    if (ctx->argc == 2) {
         /* If a priority is not specified, delete all policies. */
         nbrec_logical_router_set_policies(lr, NULL, 0);
         return;
     }
+
     error = parse_priority(ctx->argv[2], &priority);
     if (error) {
         ctx->error = error;
         return;
     }
+
     /* If match is not specified, delete all routing policies with the
      * specified priority. */
     if (ctx->argc == 3) {
         struct nbrec_logical_router_policy **new_policies
-                = xmemdup(lr->policies,
-                          sizeof *new_policies * lr->n_policies);
-    int n_policies = 0;
+            = xmemdup(lr->policies,
+                      sizeof *new_policies * lr->n_policies);
+        int n_policies = 0;
         for (int i = 0; i < lr->n_policies; i++) {
             if (priority != lr->policies[i]->priority) {
                 new_policies[n_policies++] = lr->policies[i];
@@ -3506,18 +3512,19 @@ nbctl_lr_policy_del(struct ctl_context *ctx)
         free(new_policies);
         return;
     }
+
     /* Delete policy that has the same priority and match string */
     for (int i = 0; i < lr->n_policies; i++) {
         struct nbrec_logical_router_policy *routing_policy = lr->policies[i];
-         if (priority == routing_policy->priority &&
+        if (priority == routing_policy->priority &&
             !strcmp(ctx->argv[3], routing_policy->match)) {
             struct nbrec_logical_router_policy **new_policies
                 = xmemdup(lr->policies,
                           sizeof *new_policies * lr->n_policies);
-             new_policies[i] = lr->policies[lr->n_policies - 1];
+            new_policies[i] = lr->policies[lr->n_policies - 1];
             nbrec_logical_router_verify_policies(lr);
             nbrec_logical_router_set_policies(lr, new_policies,
-                                      lr->n_policies - 1);
+                                              lr->n_policies - 1);
             free(new_policies);
             return;
         }

I'll look forward to v4.

Patch
diff mbox series

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 58bef7d..5cc9256 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -142,9 +142,10 @@  enum ovn_stage {
     PIPELINE_STAGE(ROUTER, IN,  ND_RA_OPTIONS,  5, "lr_in_nd_ra_options") \
     PIPELINE_STAGE(ROUTER, IN,  ND_RA_RESPONSE, 6, "lr_in_nd_ra_response") \
     PIPELINE_STAGE(ROUTER, IN,  IP_ROUTING,     7, "lr_in_ip_routing")   \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,    8, "lr_in_arp_resolve")  \
-    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,    9, "lr_in_gw_redirect")  \
-    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,    10, "lr_in_arp_request")  \
+    PIPELINE_STAGE(ROUTER, IN,  POLICY,         8, "lr_in_policy")       \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_RESOLVE,    9, "lr_in_arp_resolve")  \
+    PIPELINE_STAGE(ROUTER, IN,  GW_REDIRECT,    10, "lr_in_gw_redirect")  \
+    PIPELINE_STAGE(ROUTER, IN,  ARP_REQUEST,    11, "lr_in_arp_request")  \
                                                                       \
     /* Logical router egress stages. */                               \
     PIPELINE_STAGE(ROUTER, OUT, UNDNAT,    0, "lr_out_undnat")        \
@@ -4633,6 +4634,111 @@  find_lrp_member_ip(const struct ovn_port *op, const char *ip_s)
     return NULL;
 }
 
+static struct ovn_port*
+get_outport_for_routing_policy_nexthop(struct ovn_datapath *od,
+                                       struct hmap *ports,
+                                       int priority, const char *nexthop)
+{
+    if (nexthop == NULL)
+        return NULL;
+
+    unsigned int plen = 0;
+    ovs_be32 nexthop_be32;
+    /* Verify that the next hop is an IP address with an all-ones mask. */
+    char *error = ip_parse_cidr(nexthop, &nexthop_be32, &plen);
+    if (!error) {
+        if (plen != 32) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+            VLOG_WARN_RL(&rl, "bad next hop ip %s for routing policy "
+                         "with priority %d, error: %s ",
+                         nexthop, priority, error);
+            return NULL;
+        }
+    } else {
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+        VLOG_WARN_RL(&rl, "Failed to parse cidr %s for routing policy "
+                     "with priority %d ",
+                     nexthop, priority);
+        free(error);
+        return NULL;
+    }
+
+    /* find the router port matching the next hop. */
+    int i;
+    struct ovn_port *out_port = NULL;
+    const char *lrp_addr_s = NULL;
+    for (i = 0; i < od->nbr->n_ports; i++) {
+       struct nbrec_logical_router_port *lrp = od->nbr->ports[i];
+       out_port = ovn_port_find(ports, lrp->name);
+       if (!out_port) {
+          /* This should not happen. */
+           continue;
+       }
+       lrp_addr_s = find_lrp_member_ip(out_port, nexthop);
+       if (lrp_addr_s) {
+            break;
+       } else {
+           out_port = NULL;
+       }
+    }
+    if (!out_port) {
+        /* There is no matched out port. */
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+        VLOG_WARN_RL(&rl, "No path for routing policy priority %d; next hop %s",
+                     priority, nexthop);
+        return NULL;
+    }
+    return out_port;
+}
+
+static void
+build_routing_policy_flow(struct hmap *lflows, struct ovn_datapath *od,
+                          struct hmap *ports,
+                          const struct nbrec_logical_router_policy *rule)
+{
+    struct ds match = DS_EMPTY_INITIALIZER;
+    struct ds actions = DS_EMPTY_INITIALIZER;
+
+    if (!strcmp(rule->action, "reroute")) {
+        struct ovn_port *out_port = NULL;
+        const char *lrp_addr_s = NULL;
+        out_port = get_outport_for_routing_policy_nexthop(
+            od, ports, rule->priority, rule->nexthop);
+        if (out_port == NULL) {
+           return;
+        } else {
+           lrp_addr_s = find_lrp_member_ip(out_port, rule->nexthop);
+           if (!lrp_addr_s) {
+               static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+               VLOG_WARN_RL(&rl, "lrp_addr not found for routing policy "
+                            " priority %d nexthop %s",
+                            rule->priority, rule->nexthop);
+               return;
+           } else {
+               ds_put_format(&actions, "reg0 = %s; "
+                  "reg1 = %s; "
+                  "eth.src = %s; "
+                  "outport = %s; "
+                  "flags.loopback = 1; "
+                  "next;",
+                  rule->nexthop,
+                  lrp_addr_s,
+                  out_port->lrp_networks.ea_s,
+                  out_port->json_key);
+           }
+        }
+    } else if (!strcmp(rule->action, "drop")) {
+        ds_put_cstr(&actions, "drop;");
+    } else if (!strcmp(rule->action, "allow")) {
+        ds_put_cstr(&actions, "next;");
+    }
+    ds_put_format(&match, "%s", rule->match);
+    ovn_lflow_add(lflows, od, S_ROUTER_IN_POLICY, rule->priority,
+                  ds_cstr(&match), ds_cstr(&actions));
+    ds_destroy(&match);
+    ds_destroy(&actions);
+}
+
 static void
 add_route(struct hmap *lflows, const struct ovn_port *op,
           const char *lrp_addr_s, const char *network_s, int plen,
@@ -6332,9 +6438,35 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         }
     }
 
+    /* Logical router ingress table 8: Policy.
+     *
+     * A packet that arrives at this table is an IP packet that should be
+     * permitted/denied/rerouted to the address in the rule's nexthop.
+     * This table sets outport to the correct out_port,
+     * eth.src to the output port's MAC address,
+     * and '[xx]reg0' to the next-hop IP address (leaving
+     * 'ip[46].dst', the packet’s final destination, unchanged), and
+     * advances to the next table for ARP/ND resolution. */
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        if (!od->nbr) {
+            continue;
+        }
+        /* This is a catch-all rule. It has the lowest priority (0)
+         * does a match-all("1") and pass-through (next) */
+        ovn_lflow_add(lflows, od, S_ROUTER_IN_POLICY, 0, "1", "next;");
+
+        /* Convert routing policies to flows. */
+        for (int i = 0; i < od->nbr->n_policies; i++) {
+            const struct nbrec_logical_router_policy *rule;
+             rule = od->nbr->policies[i];
+            build_routing_policy_flow(lflows, od, ports, rule);
+        }
+    }
+
+
     /* XXX destination unreachable */
 
-    /* Local router ingress table 8: ARP Resolution.
+    /* Local router ingress table 9: ARP Resolution.
      *
      * Any packet that reaches this table is an IP packet whose next-hop IP
      * address is in reg0. (ip4.dst is the final destination.) This table
@@ -6533,7 +6665,7 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
                       "get_nd(outport, xxreg0); next;");
     }
 
-    /* Logical router ingress table 9: Gateway redirect.
+    /* Logical router ingress table 10: Gateway redirect.
      *
      * For traffic with outport equal to the l3dgw_port
      * on a distributed router, this table redirects a subset
@@ -6573,7 +6705,7 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         ovn_lflow_add(lflows, od, S_ROUTER_IN_GW_REDIRECT, 0, "1", "next;");
     }
 
-    /* Local router ingress table 10: ARP request.
+    /* Local router ingress table 11: ARP request.
      *
      * In the common case where the Ethernet destination has been resolved,
      * this table outputs the packet (priority 0).  Otherwise, it composes
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index f3683df..ff16985 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "5.14.0",
-    "cksum": "3600467067 20513",
+    "version": "5.15.0",
+    "cksum": "3545233945 21390",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -242,6 +242,11 @@ 
                                             "refType": "strong"},
                                    "min": 0,
                                    "max": "unlimited"}},
+                "policies": {"type": {"key": {"type": "uuid",
+                                            "refTable": "Logical_Router_Policy",
+                                            "refType": "strong"},
+                                   "min": 0,
+                                   "max": "unlimited"}},
                 "enabled": {"type": {"key": "boolean", "min": 0, "max": 1}},
                 "nat": {"type": {"key": {"type": "uuid",
                                          "refTable": "NAT",
@@ -303,6 +308,16 @@ 
                     "type": {"key": "string", "value": "string",
                              "min": 0, "max": "unlimited"}}},
             "isRoot": false},
+        "Logical_Router_Policy": {
+            "columns": {
+                "priority": {"type": {"key": {"type": "integer",
+                                              "minInteger": 0,
+                                              "maxInteger": 32767}}},
+                "match": {"type": "string"},
+                "action": {"type": {"key": {"type": "string",
+                                            "enum": ["set", ["allow", "drop", "reroute"]]}}},
+                "nexthop": {"type": {"key": "string", "min": 0, "max": 1}}},
+            "isRoot": false},
         "NAT": {
             "columns": {
                 "external_ip": {"type": "string"},
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 474b4f9..0675d39 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1236,6 +1236,10 @@ 
       One or more static routes for the router.
     </column>
 
+    <column name="policies">
+      One or more routing policies for the router.
+    </column>
+
     <column name="enabled">
       This column is used to administratively set router state.  If this column
       is empty or is set to <code>true</code>, the router is enabled.  If this
@@ -1793,6 +1797,65 @@ 
 
   </table>
 
+  <table name="Logical_Router_Policy" title="Logical router policies">
+    <p>
+      Each row in this table represents one routing policy for a logical router
+      that points to it through its <ref column="policies"/> column.  The <ref
+      column="action"/> column for the highest-<ref column="priority"/>
+      matching row in this table determines a packet's treatment.  If no row
+      matches, packets are allowed by default. (Default-deny treatment is
+      possible: add a rule with <ref column="priority"/> 0, <code>0</code> as
+      <ref column="match"/>, and <code>drop</code> as <ref column="action"/>.)
+    </p>
+
+    <column name="priority">
+      <p>
+        The routing policy's priority.  Rules with numerically higher priority
+        take precedence over those with lower. A rule is uniquely identified
+        by the priority and match string.
+      </p>
+    </column>
+
+    <column name="match">
+      <p>
+        The packets that the routing policy should match, in the same expression
+        language used for the <ref column="match" table="Logical_Flow"
+        db="OVN_Southbound"/> column in the OVN Southbound database's
+        <ref table="Logical_Flow" db="OVN_Southbound"/> table.
+      </p>
+
+      <p>
+        By default all traffic is allowed.  When writing a more
+        restrictive policy, it is important to remember to allow flows
+        such as ARP and IPv6 neighbor discovery packets.
+      </p>
+    </column>
+
+    <column name="action">
+      <p>The action to take when the routing policy matches:</p>
+      <ul>
+        <li>
+          <code>allow</code>: Forward the packet.
+        </li>
+
+        <li>
+          <code>drop</code>: Silently drop the packet.
+        </li>
+
+        <li>
+          <code>reroute</code>: Reroute packet to nexthop
+        </li>
+      </ul>
+    </column>
+
+    <column name="nexthop">
+      <p>
+        Nexthop IP address for this route.  Nexthop IP address should be the IP
+        address of a connected router port or the IP address of a logical port.
+      </p>
+    </column>
+  </table>
+
   <table name="NAT" title="NAT rules">
     <p>
       Each record represents a NAT rule.
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index 9d1b220..5f4cb2e 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -640,6 +640,13 @@  Route commands:\n\
                             remove routes from ROUTER\n\
   lr-route-list ROUTER      print routes for ROUTER\n\
 \n\
+Policy commands:\n\
+  lr-policy-add ROUTER PRIORITY MATCH ACTION [NEXTHOP]\n\
+                            add a policy to router\n\
+  lr-policy-del ROUTER [PRIORITY [MATCH]]\n\
+                            remove policies from ROUTER\n\
+  lr-policy-list ROUTER   print policies for ROUTER\n\
+\n\
 NAT commands:\n\
   lr-nat-add ROUTER TYPE EXTERNAL_IP LOGICAL_IP [LOGICAL_PORT EXTERNAL_MAC]\n\
                             add a NAT to ROUTER\n\
@@ -3393,6 +3400,189 @@  normalize_prefix_str(const char *orig_prefix)
         return normalize_ipv6_prefix(ipv6, plen);
     }
 }
+
+static void
+nbctl_lr_policy_add(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_router *lr;
+    int64_t priority = 0;
+    char *error = lr_by_name_or_uuid(ctx, ctx->argv[1], true, &lr);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+    error = parse_priority(ctx->argv[2], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+    const char *action = ctx->argv[4];
+    char *next_hop = NULL;
+    /* Validate action. */
+    if (strcmp(action, "allow") && strcmp(action, "drop")
+        && strcmp(action, "reroute")) {
+        ctl_error(ctx, "%s: action must be one of \"allow\", \"drop\", "
+                  "and \"reroute\"", action);
+    }
+    if (!strcmp(action, "reroute")) {
+        if (ctx->argc < 6) {
+            ctl_error(ctx, "Nexthop is not specified when action is reroute.");
+        }
+    }
+    /* Check if same routing policy already exists.
+     * A policy is uniquely identified by priority and match */
+    for (int i = 0; i < lr->n_policies; i++) {
+        const struct nbrec_logical_router_policy *policy = lr->policies[i];
+        if ((policy->priority == priority) &&
+            (!strcmp(policy->match, ctx->argv[3]))) {
+           ctl_error(ctx, "Same routing policy already existed on the "
+                       "logical router %s.", ctx->argv[1]);
+        }
+    }
+    if (ctx->argc == 6) {
+        next_hop = normalize_prefix_str(ctx->argv[5]);
+        if (!next_hop) {
+            ctl_error(ctx, "bad next hop argument: %s", ctx->argv[5]);
+        }
+    }
+    struct nbrec_logical_router_policy *policy;
+    policy = nbrec_logical_router_policy_insert(ctx->txn);
+    nbrec_logical_router_policy_set_priority(policy, priority);
+    nbrec_logical_router_policy_set_match(policy, ctx->argv[3]);
+    nbrec_logical_router_policy_set_action(policy, action);
+    if (ctx->argc == 6) {
+        nbrec_logical_router_policy_set_nexthop(policy, next_hop);
+    }
+    nbrec_logical_router_verify_policies(lr);
+    struct nbrec_logical_router_policy **new_policies
+        = xmalloc(sizeof *new_policies * (lr->n_policies + 1));
+    memcpy(new_policies, lr->policies,
+               sizeof *new_policies * lr->n_policies);
+    new_policies[lr->n_policies] = policy;
+    nbrec_logical_router_set_policies(lr, new_policies,
+                                           lr->n_policies + 1);
+    free(new_policies);
+    if (next_hop != NULL)
+        free(next_hop);
+}
+
+static void
+nbctl_lr_policy_del(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_router *lr;
+    int64_t priority = 0;
+    char *error = lr_by_name_or_uuid(ctx, ctx->argv[1], true, &lr);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+     if (ctx->argc == 2) {
+        /* If a priority is not specified, delete all policies. */
+        nbrec_logical_router_set_policies(lr, NULL, 0);
+        return;
+    }
+    error = parse_priority(ctx->argv[2], &priority);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+    /* If match is not specified, delete all routing policies with the
+     * specified priority. */
+    if (ctx->argc == 3) {
+        struct nbrec_logical_router_policy **new_policies
+                = xmemdup(lr->policies,
+                          sizeof *new_policies * lr->n_policies);
+    int n_policies = 0;
+        for (int i = 0; i < lr->n_policies; i++) {
+            if (priority != lr->policies[i]->priority) {
+                new_policies[n_policies++] = lr->policies[i];
+            }
+        }
+        nbrec_logical_router_verify_policies(lr);
+        nbrec_logical_router_set_policies(lr, new_policies, n_policies);
+        free(new_policies);
+        return;
+    }
+    /* Delete policy that has the same priority and match string */
+    for (int i = 0; i < lr->n_policies; i++) {
+        struct nbrec_logical_router_policy *routing_policy = lr->policies[i];
+         if (priority == routing_policy->priority &&
+            !strcmp(ctx->argv[3], routing_policy->match)) {
+            struct nbrec_logical_router_policy **new_policies
+                = xmemdup(lr->policies,
+                          sizeof *new_policies * lr->n_policies);
+             new_policies[i] = lr->policies[lr->n_policies - 1];
+            nbrec_logical_router_verify_policies(lr);
+            nbrec_logical_router_set_policies(lr, new_policies,
+                                      lr->n_policies - 1);
+            free(new_policies);
+            return;
+        }
+    }
+}
+
+ struct routing_policy {
+    int priority;
+    char *match;
+    const struct nbrec_logical_router_policy *policy;
+};
+
+static int
+routing_policy_cmp(const void *policy1_, const void *policy2_)
+{
+    const struct routing_policy *policy1p = policy1_;
+    const struct routing_policy *policy2p = policy2_;
+    if (policy1p->priority != policy2p->priority) {
+        return policy1p->priority > policy2p->priority ? -1 : 1;
+    } else {
+        return strcmp(policy1p->match, policy2p->match);
+    }
+}
+
+static void
+print_routing_policy(const struct nbrec_logical_router_policy *policy,
+                     struct ds *s)
+{
+    if (policy->nexthop != NULL) {
+        char *next_hop = normalize_prefix_str(policy->nexthop);
+        ds_put_format(s, "%10ld %50s %15s %25s", policy->priority,
+                      policy->match, policy->action, next_hop);
+        free(next_hop);
+    } else
+        ds_put_format(s, "%10ld %50s %15s", policy->priority,
+                      policy->match, policy->action);
+     ds_put_char(s, '\n');
+}
+
+static void
+nbctl_lr_policy_list(struct ctl_context *ctx)
+{
+    const struct nbrec_logical_router *lr;
+    struct routing_policy *policies;
+    size_t n_policies = 0;
+    char *error = lr_by_name_or_uuid(ctx, ctx->argv[1], true, &lr);
+    if (error) {
+        ctx->error = error;
+        return;
+    }
+    policies = xmalloc(sizeof *policies * lr->n_policies);
+     for (int i = 0; i < lr->n_policies; i++) {
+        const struct nbrec_logical_router_policy *policy
+            = lr->policies[i];
+        policies[n_policies].priority = policy->priority;
+        policies[n_policies].match = policy->match;
+        policies[n_policies].policy = policy;
+        n_policies++;
+    }
+    qsort(policies, n_policies, sizeof *policies, routing_policy_cmp);
+    if (n_policies) {
+        ds_put_cstr(&ctx->output, "Routing Policies\n");
+    }
+    for (int i = 0; i < n_policies; i++) {
+        print_routing_policy(policies[i].policy, &ctx->output);
+    }
+    free(policies);
+}
 
 static void
 nbctl_lr_route_add(struct ctl_context *ctx)
@@ -5141,6 +5331,14 @@  static const struct ctl_command_syntax nbctl_commands[] = {
     { "lr-route-list", 1, 1, "ROUTER", NULL, nbctl_lr_route_list, NULL,
       "", RO },
 
+    /* Policy commands */
+    { "lr-policy-add", 4, 5, "ROUTER PRIORITY MATCH ACTION [NEXTHOP]", NULL,
+        nbctl_lr_policy_add, NULL, "", RW },
+    { "lr-policy-del", 1, 3, "ROUTER [PRIORITY [MATCH]]", NULL,
+        nbctl_lr_policy_del, NULL, "", RW },
+    { "lr-policy-list", 1, 1, "ROUTER", NULL, nbctl_lr_policy_list, NULL,
+       "", RO },
+
     /* NAT commands. */
     { "lr-nat-add", 4, 6,
       "ROUTER TYPE EXTERNAL_IP LOGICAL_IP [LOGICAL_PORT EXTERNAL_MAC]", NULL,
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index 25414b8..70aaf81 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -1331,6 +1331,53 @@  IPv6 Routes
 
 dnl ---------------------------------------------------------------------
 
+OVN_NBCTL_TEST([ovn_nbctl_policies], [policies], [
+AT_CHECK([ovn-nbctl lr-add lr0])
+
+dnl Add policies with allow and drop actions
+AT_CHECK([ovn-nbctl lr-policy-add lr0 100 "ip4.src == 1.1.1.0/24" drop])
+AT_CHECK([ovn-nbctl lr-policy-add lr0 100 "ip4.src == 1.1.2.0/24" allow])
+AT_CHECK([ovn-nbctl lr-policy-add lr0 101 "ip4.src == 2.1.1.0/24" allow])
+AT_CHECK([ovn-nbctl lr-policy-add lr0 101 "ip4.src == 2.1.2.0/24" drop])
+
+dnl Add duplicated policy
+AT_CHECK([ovn-nbctl lr-policy-add lr0 100 "ip4.src == 1.1.1.0/24" drop], [1], [],
+  [ovn-nbctl: Same routing policy already existed on the logical router lr0.
+])
+
+dnl Add duplicated policy
+AT_CHECK([ovn-nbctl lr-policy-add lr0 103 "ip4.src == 1.1.1.0/24" deny], [1], [],
+  [ovn-nbctl: deny: action must be one of "allow", "drop", and "reroute"
+])
+
+dnl Delete by priority and match string
+AT_CHECK([ovn-nbctl lr-policy-del lr0 100 "ip4.src == 1.1.1.0/24"])
+AT_CHECK([ovn-nbctl lr-policy-list lr0], [0], [dnl
+Routing Policies
+       101                              ip4.src == 2.1.1.0/24           allow
+       101                              ip4.src == 2.1.2.0/24            drop
+       100                              ip4.src == 1.1.2.0/24           allow
+])
+
+dnl Delete all policies for given priority
+AT_CHECK([ovn-nbctl lr-policy-del lr0 101])
+AT_CHECK([ovn-nbctl lr-policy-list lr0], [0], [dnl
+Routing Policies
+       100                              ip4.src == 1.1.2.0/24           allow
+])
+
+dnl Add policy with reroute action
+AT_CHECK([ovn-nbctl lr-policy-add lr0 102 "ip4.src == 3.1.2.0/24" reroute 3.3.3.3])
+
+dnl Add policy with invalid reroute ip
+AT_CHECK([ovn-nbctl lr-policy-add lr0 103 "ip4.src == 3.1.2.0/24" reroute 3.3.3.x], [1], [],
+  [ovn-nbctl: bad next hop argument: 3.3.3.x
+])
+
+])
+
+dnl ---------------------------------------------------------------------
+
 OVN_NBCTL_TEST([ovn_nbctl_lsp_types], [lsp types], [
 AT_CHECK([ovn-nbctl ls-add ls0])
 AT_CHECK([ovn-nbctl lsp-add ls0 lp0])
diff --git a/tests/ovn.at b/tests/ovn.at
index ab32faa..1044732 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5085,6 +5085,7 @@  test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l2_ip 0000 8510 0
 test_ipv4_icmp_request 1 000000010203 0000000102f1 $l1_ip $rtr_l2_ip 0000 8510 02ff 8d10
 test_ipv4_icmp_request 2 000000010204 0000000102f2 $l2_ip $rtr_l1_ip 0000 8510 02ff 8d10
 
+
 echo "---------NB dump-----"
 ovn-nbctl show
 echo "---------------------"
@@ -5108,6 +5109,192 @@  for inport in 1 2; do
 done
 
 OVN_CLEANUP([hv1])
+AT_CLEANUP
+
+AT_SETUP([ovn -- policy-based routing: 1 HVs, 2 LSs, 1 lport/LS, 1 LR])
+AT_KEYWORDS([policy-based-routing])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+# Logical network:
+# One LR - R1 has switch ls1 (191.168.1.0/24) connected to it,
+# and has switch ls2 (172.16.1.0/24) connected to it.
+
+ovn-nbctl lr-add R1
+
+ovn-nbctl ls-add ls1
+ovn-nbctl ls-add ls2
+ovn-nbctl ls-add ls3
+
+# Connect ls1 to R1
+ovn-nbctl lrp-add R1 ls1 00:00:00:01:02:f1 192.168.1.1/24
+ovn-nbctl lsp-add ls1 rp-ls1 -- set Logical_Switch_Port rp-ls1 \
+    type=router options:router-port=ls1 addresses=\"00:00:00:01:02:f1\"
+
+# Connect ls2 to R1
+ovn-nbctl lrp-add R1 ls2 00:00:00:01:02:f2 172.16.1.1/24
+ovn-nbctl lsp-add ls2 rp-ls2 -- set Logical_Switch_Port rp-ls2 \
+    type=router options:router-port=ls2 addresses=\"00:00:00:01:02:f2\"
+
+# Connect ls3 to R1
+ovn-nbctl lrp-add R1 ls3 00:00:00:01:02:f3 20.20.1.1/24
+ovn-nbctl lsp-add ls3 rp-ls3 -- set Logical_Switch_Port rp-ls3 \
+    type=router options:router-port=ls3 addresses=\"00:00:00:01:02:f3\"
+
+# Create logical port ls1-lp1 in ls1
+ovn-nbctl lsp-add ls1 ls1-lp1 \
+-- lsp-set-addresses ls1-lp1 "00:00:00:01:02:03 192.168.1.2"
+
+# Create logical port ls2-lp1 in ls2
+ovn-nbctl lsp-add ls2 ls2-lp1 \
+-- lsp-set-addresses ls2-lp1 "00:00:00:01:02:04 172.16.1.2"
+
+# Create logical port ls3-lp1 in ls3
+ovn-nbctl lsp-add ls3 ls3-lp1 \
+-- lsp-set-addresses ls3-lp1 "00:00:00:01:02:05 20.20.1.2"
+
+# Create one hypervisor and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add pbr-hv
+as pbr-hv
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+
+ovs-vsctl -- add-port br-int vif1 -- \
+    set interface vif1 external-ids:iface-id=ls1-lp1 \
+    options:tx_pcap=pbr-hv/vif1-tx.pcap \
+    options:rxq_pcap=pbr-hv/vif1-rx.pcap \
+    ofport-request=1
+
+ovs-vsctl -- add-port br-int vif2 -- \
+    set interface vif2 external-ids:iface-id=ls2-lp1 \
+    options:tx_pcap=pbr-hv/vif2-tx.pcap \
+    options:rxq_pcap=pbr-hv/vif2-rx.pcap \
+    ofport-request=1
+
+ovs-vsctl -- add-port br-int vif3 -- \
+    set interface vif3 external-ids:iface-id=ls3-lp1 \
+    options:tx_pcap=pbr-hv/vif3-tx.pcap \
+    options:rxq_pcap=pbr-hv/vif3-rx.pcap \
+    ofport-request=1
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 1
+
+ls1_ro_mac=00:00:00:01:02:f1
+ls1_ro_ip=192.168.1.1
+
+ls2_ro_mac=00:00:00:01:02:f2
+ls2_ro_ip=172.16.1.1
+
+ls3_ro_mac=00:00:00:01:02:f3
+
+ls1_p1_mac=00:00:00:01:02:03
+ls1_p1_ip=192.168.1.2
+
+ls2_p1_mac=00:00:00:01:02:04
+ls2_p1_ip=172.16.1.2
+
+ls3_p1_mac=00:00:00:01:02:05
+
+# Create a drop policy
+ovn-nbctl lr-policy-add R1 10 "ip4.src==192.168.1.0/24 && ip4.dst==172.16.1.0/24" drop
+
+# Check logical flow
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "192.168.1.0" | wc -l], [0], [dnl
+1
+])
+
+# Send packet.
+packet="inport==\"ls1-lp1\" && eth.src==$ls1_p1_mac && eth.dst==$ls1_ro_mac &&
+       ip4 && ip.ttl==64 && ip4.src==$ls1_p1_ip && ip4.dst==$ls2_p1_ip &&
+       udp && udp.src==53 && udp.dst==4369"
+
+as pbr-hv ovs-appctl -t ovn-controller inject-pkt "$packet"
+
+# Check if packet hit the drop policy
+AT_CHECK([ovs-ofctl dump-flows br-int | \
+    grep "nw_src=192.168.1.0/24,nw_dst=172.16.1.0/24 actions=drop" | \
+    grep "priority=10" | \
+    grep "n_packets=1" | wc -l], [0], [dnl
+1
+])
+
+# Expected to drop the packet.
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" pbr-hv/vif2-tx.pcap > vif2.packets
+rcvd_packet=`cat vif2.packets`
+AT_FAIL_IF([rcvd_packet = ""])
+
+# Override drop policy with allow
+ovn-nbctl lr-policy-add R1 20 "ip4.src==192.168.1.0/24 && ip4.dst==172.16.1.0/24" allow
+
+# Check logical flow
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | grep "192.168.1.0" | wc -l], [0], [dnl
+2
+])
+
+# Send packet.
+packet="inport==\"ls1-lp1\" && eth.src==$ls1_p1_mac && eth.dst==$ls1_ro_mac &&
+       ip4 && ip.ttl==64 && ip4.src==$ls1_p1_ip && ip4.dst==$ls2_p1_ip &&
+       udp && udp.src==53 && udp.dst==4369"
+as pbr-hv ovs-appctl -t ovn-controller inject-pkt "$packet"
+
+# Check if packet hit the allow policy
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | \
+    grep "192.168.1.0" | \
+    grep "priority=20" | wc -l], [0], [dnl
+1
+])
+
+# Expected packet has TTL decreased by 1
+expected="eth.src==$ls2_ro_mac && eth.dst==$ls2_p1_mac &&
+       ip4 && ip.ttl==63 && ip4.src==$ls1_p1_ip && ip4.dst==$ls2_p1_ip &&
+       udp && udp.src==53 && udp.dst==4369"
+echo $expected | ovstest test-ovn expr-to-packets > expected
+
+OVN_CHECK_PACKETS([pbr-hv/vif2-tx.pcap], [expected])
+
+# Override allow policy with reroute
+ovn-nbctl lr-policy-add R1 30 "ip4.src==192.168.1.0/24 && ip4.dst==172.16.1.0/24" reroute 20.20.1.2
+
+# Check logical flow
+AT_CHECK([ovn-sbctl dump-flows | grep lr_in_policy | \
+    grep "192.168.1.0" | \
+    grep "priority=30" | wc -l], [0], [dnl
+1
+])
+
+# Send packet.
+packet="inport==\"ls1-lp1\" && eth.src==$ls1_p1_mac && eth.dst==$ls1_ro_mac &&
+       ip4 && ip.ttl==64 && ip4.src==$ls1_p1_ip && ip4.dst==$ls2_p1_ip &&
+       udp && udp.src==53 && udp.dst==4369"
+as pbr-hv ovs-appctl -t ovn-controller inject-pkt "$packet"
+
+echo "southbound flows"
+
+ovn-sbctl dump-flows | grep lr_in_policy
+echo "ovs flows"
+ovs-ofctl dump-flows br-int
+# Check if packet hit the allow policy
+AT_CHECK([ovs-ofctl dump-flows br-int | \
+    grep "nw_src=192.168.1.0/24,nw_dst=172.16.1.0/24" | \
+    grep "priority=30" | \
+    grep "n_packets=1" | wc -l], [0], [dnl
+1
+])
+echo "packet hit reroute policy"
+
+# Expected packet has TTL decreased by 1
+expected="eth.src==$ls3_ro_mac && eth.dst==$ls3_p1_mac &&
+       ip4 && ip.ttl==63 && ip4.src==$ls1_p1_ip && ip4.dst==$ls2_p1_ip &&
+       udp && udp.src==53 && udp.dst==4369"
+echo $expected | ovstest test-ovn expr-to-packets > 3.expected
+
+OVN_CHECK_PACKETS([pbr-hv/vif3-tx.pcap], [3.expected])
+
+OVN_CLEANUP([pbr-hv])
 
 AT_CLEANUP