[ovs-dev,v2] ovn: Add a case of policy based routing.

Message ID 1475751192-9918-1-git-send-email-guru@ovn.org
State Accepted

Commit Message

Gurucharan Shetty Oct. 6, 2016, 10:53 a.m. UTC
OVN currently supports multiple gateway routers (residing on
different chassis) connected to the same logical topology.

When external traffic enters the logical topology, it can enter
through any gateway router and reach its eventual destination. This
is achieved with proper static routes configured on the gateway
routers.

But when traffic is initiated in the logical space by a logical
port, we do not have a good way to distribute that traffic across
multiple gateway routers.

This commit introduces one particular way to do it. Based on the
source IP address or source IP network of the packet, we can now
jump to a specific gateway router.
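
As a rough illustration of the idea (a minimal sketch, not the code in this
patch), the only difference between the two kinds of route is which address
field the logical flow matches on. The helper name route_match() and its
signature are hypothetical; the "src-ip" policy string and the
"ip4.src"/"ip4.dst" match fields follow the diff in this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of OVN: writes the logical flow match for a
 * static route into 'buf'.  A "src-ip" policy matches on the source address;
 * anything else, including NULL, falls back to the destination address. */
static void
route_match(char *buf, size_t size, const char *policy,
            const char *network, int plen)
{
    assert(buf && size > 0 && network);

    const char *dir = policy && !strcmp(policy, "src-ip") ? "src" : "dst";
    /* Same heuristic as the patch: a '.' in the prefix means IPv4. */
    const char *ver = strchr(network, '.') ? "4" : "6";

    snprintf(buf, size, "ip%s.%s == %s/%d", ver, dir, network, plen);
}
```

Under these assumptions, calling route_match() with policy "src-ip",
network "192.168.1.0" and prefix length 24 produces a match on ip4.src,
while a NULL policy produces a match on the destination field.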

This is very useful for a specific Kubernetes use case.
When traffic initiated inside a container is headed to the outside world,
we want to send it out through the gateway router residing on the same
host as the container. Since each host gets a specific subnet, we can use
source IP address based policy routing to decide on the gateway router.
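
The per-host subnet idea can be sketched as a small lookup, assuming each
host's container subnet maps to exactly one gateway router address. The
struct, the IP4() macro and pick_gateway() are hypothetical names used only
for illustration; in OVN the selection is done by logical flows, not by a
table walk like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration types, not OVN data structures. */
#define IP4(a, b, c, d) \
    (((uint32_t) (a) << 24) | ((uint32_t) (b) << 16) | \
     ((uint32_t) (c) << 8) | (uint32_t) (d))

struct src_route {
    uint32_t network;       /* Source subnet, host byte order. */
    int plen;               /* Prefix length, 0..32. */
    const char *gateway;    /* Next hop: the gateway router's address. */
};

/* Returns the gateway of the longest source prefix that contains 'src_ip',
 * or NULL if no source-based route applies. */
static const char *
pick_gateway(const struct src_route *routes, size_t n, uint32_t src_ip)
{
    const char *best = NULL;
    int best_plen = -1;

    for (size_t i = 0; i < n; i++) {
        int plen = routes[i].plen;
        assert(plen >= 0 && plen <= 32);

        /* A /0 prefix needs a zero mask; shifting by 32 would be undefined. */
        uint32_t mask = plen ? UINT32_MAX << (32 - plen) : 0;
        if ((src_ip & mask) == (routes[i].network & mask)
            && plen > best_plen) {
            best = routes[i].gateway;
            best_plen = plen;
        }
    }
    return best;
}
```

With one entry per host subnet, a packet sourced from a container on a given
host resolves to that host's gateway router, and a source address outside
every listed subnet yields NULL, leaving the decision to ordinary
destination based routes.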

Rationale for using the same routing table for both source and
destination IP address based routing:

Some hardware network vendors support policy routing in a separate table
that uses an arbitrary "match".  When a packet enters and matches
in the policy based routing table, the default routing table is not
consulted at all.  In the case of OVN, we mainly want policy based routing
for north-south traffic. We want east-west traffic to flow as-is. Creating
a separate table for policy based routing complicates the configuration
quite a bit. For example, if we add a source IP network based rule that
selects a particular gateway router as the next hop, we would also have to
add higher-priority rules for all the connected routes to make sure that
east-west traffic is not affected by the policy based routing table.
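
To see why a single table is enough, it may help to work through the
priorities this patch assigns: 2 * plen for a source based route and
2 * plen + 1 for a destination based route. The helper below is a
hypothetical restatement of that rule, not the function in ovn-northd.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper restating the priority rule from this patch:
 * "src-ip" routes get 2 * plen, everything else (including a NULL policy,
 * which defaults to "dst-ip") gets 2 * plen + 1. */
static int
route_priority(const char *policy, int plen)
{
    assert(plen >= 0 && plen <= 128);
    return policy && !strcmp(policy, "src-ip") ? 2 * plen : 2 * plen + 1;
}
```

A connected /24 network is a destination route with priority 49, so it wins
over a source based /24 route (priority 48) and east-west traffic is left
alone. The same source based /24 route still wins over a destination based
default route (/0, priority 1), which is what steers north-south traffic to
the chosen gateway router. Note that a longer source prefix can outrank a
shorter destination prefix, so prefix lengths need to be chosen with this
ordering in mind.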

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
---
 ovn/northd/ovn-northd.c       |  24 +++--
 ovn/ovn-nb.ovsschema          |   8 +-
 ovn/ovn-nb.xml                |  26 +++++
 ovn/utilities/ovn-nbctl.8.xml |   8 +-
 ovn/utilities/ovn-nbctl.c     |  43 ++++++---
 tests/ovn-nbctl.at            |  42 ++++----
 tests/ovn.at                  | 219 ++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 331 insertions(+), 39 deletions(-)

Comments

Ben Pfaff Nov. 2, 2016, 4:29 p.m. UTC | #1
On Wed, Nov 02, 2016 at 09:28:44AM -0700, Ben Pfaff wrote:
> Thank you!
> 
> Acked-by: Ben Pfaff <blp@ovn.org>

Oh, you might also add a NEWS item given that OVN is no longer
experimental.

Gurucharan Shetty Nov. 3, 2016, 3:06 p.m. UTC | #2
On 2 November 2016 at 09:29, Ben Pfaff <blp@ovn.org> wrote:

> On Wed, Nov 02, 2016 at 09:28:44AM -0700, Ben Pfaff wrote:
> > Thank you!
> >
> > Acked-by: Ben Pfaff <blp@ovn.org>
>
> Oh, you might also add a NEWS item given that OVN is no longer
> experimental.
>
Thank you for the review. I added a NEWS item, folded in your suggested
diff, and applied the patch.

Patch

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 4668d9e..1dfee3b 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -3227,10 +3227,20 @@  find_lrp_member_ip(const struct ovn_port *op, const char *ip_s)
 static void
 add_route(struct hmap *lflows, const struct ovn_port *op,
           const char *lrp_addr_s, const char *network_s, int plen,
-          const char *gateway)
+          const char *gateway, const char *policy)
 {
     bool is_ipv4 = strchr(network_s, '.') ? true : false;
     struct ds match = DS_EMPTY_INITIALIZER;
+    const char *dir;
+    uint16_t priority;
+
+    if (policy && !strcmp(policy, "src-ip")) {
+        dir = "src";
+        priority = plen * 2;
+    } else {
+        dir = "dst";
+        priority = (plen * 2) + 1;
+    }
 
     /* IPv6 link-local addresses must be scoped to the local router port. */
     if (!is_ipv4) {
@@ -3240,7 +3250,7 @@  add_route(struct hmap *lflows, const struct ovn_port *op,
             ds_put_format(&match, "inport == %s && ", op->json_key);
         }
     }
-    ds_put_format(&match, "ip%s.dst == %s/%d", is_ipv4 ? "4" : "6",
+    ds_put_format(&match, "ip%s.%s == %s/%d", is_ipv4 ? "4" : "6", dir,
                   network_s, plen);
 
     struct ds actions = DS_EMPTY_INITIALIZER;
@@ -3264,7 +3274,7 @@  add_route(struct hmap *lflows, const struct ovn_port *op,
 
     /* The priority here is calculated to implement longest-prefix-match
      * routing. */
-    ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, plen,
+    ovn_lflow_add(lflows, op->od, S_ROUTER_IN_IP_ROUTING, priority,
                   ds_cstr(&match), ds_cstr(&actions));
     ds_destroy(&match);
     ds_destroy(&actions);
@@ -3377,7 +3387,9 @@  build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
         goto free_prefix_s;
     }
 
-    add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop);
+    const char *policy = route->policy ? route->policy : "dst-ip";
+    add_route(lflows, out_port, lrp_addr_s, prefix_s, plen, route->nexthop,
+              policy);
 
 free_prefix_s:
     free(prefix_s);
@@ -4011,13 +4023,13 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         for (int i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) {
             add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s,
                       op->lrp_networks.ipv4_addrs[i].network_s,
-                      op->lrp_networks.ipv4_addrs[i].plen, NULL);
+                      op->lrp_networks.ipv4_addrs[i].plen, NULL, NULL);
         }
 
         for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) {
             add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s,
                       op->lrp_networks.ipv6_addrs[i].network_s,
-                      op->lrp_networks.ipv6_addrs[i].plen, NULL);
+                      op->lrp_networks.ipv6_addrs[i].plen, NULL, NULL);
         }
     }
 
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index 865dd34..65f2d7c 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "5.4.0",
-    "cksum": "4176761817 11225",
+    "version": "5.4.1",
+    "cksum": "3773248894 11490",
     "tables": {
         "NB_Global": {
             "columns": {
@@ -196,6 +196,10 @@ 
         "Logical_Router_Static_Route": {
             "columns": {
                 "ip_prefix": {"type": "string"},
+                "policy": {"type": {"key": {"type": "string",
+                                            "enum": ["set", ["src-ip",
+                                                             "dst-ip"]]},
+                                    "min": 0, "max": 1}},
                 "nexthop": {"type": "string"},
                 "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
             "isRoot": false},
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index c2a1ebb..cbdaeee 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -1060,6 +1060,32 @@ 
       </p>
     </column>
 
+    <column name="policy">
+      <p>
+        If it is specified, this setting describes the policy used to make
+        routing decisions.  This setting must be one of the following strings:
+      </p>
+      <ul>
+        <li>
+          <code>src-ip</code>: This policy sends the packet to the
+          <ref column="nexthop"/> when the packet's source IP address matches
+          <ref column="ip_prefix"/>.  If the <ref column="ip_prefix"/> has
+          a mask length of <code>n</code>, then this record gets an implicit
+          priority of <code>2*n</code>.
+        </li>
+        <li>
+          <code>dst-ip</code>: This policy sends the packet to the
+          <ref column="nexthop"/> when the packet's destination IP address
+          matches <ref column="ip_prefix"/>.  If the <ref column="ip_prefix"/>
+          has a mask length of <code>n</code>, then this record gets an
+          implicit priority of <code>2*n + 1</code>.
+        </li>
+      </ul>
+      <p>
+        If not specified, the default is <code>dst-ip</code>.
+      </p>
+    </column>
+
     <column name="nexthop">
       <p>
         Nexthop IP address for this route.  Nexthop IP address should be the IP
diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index 2cbd6e0..5b702fc 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -380,7 +380,7 @@ 
     <h1>Logical Router Static Route Commands</h1>
 
     <dl>
-      <dt>[<code>--may-exist</code>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
+      <dt>[<code>--may-exist</code>] [<code>--policy</code>=<var>POLICY</var>] <code>lr-route-add</code> <var>router</var> <var>prefix</var> <var>nexthop</var> [<var>port</var>]</dt>
       <dd>
         <p>
           Adds the specified route to <var>router</var>.
@@ -396,6 +396,12 @@ 
         </p>
 
         <p>
+          <code>--policy</code> describes the policy used to make routing
+          decisions.  This should be one of "dst-ip" or "src-ip".  If not
+          specified, the default is "dst-ip".
+        </p>
+
+        <p>
           It is an error if a route with <var>prefix</var> already exists,
           unless <code>--may-exist</code> is specified.
         </p>
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index ad2d2f8..85ca13a 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -378,7 +378,7 @@  Logical router port commands:\n\
                             ('enabled' or 'disabled')\n\
 \n\
 Route commands:\n\
-  lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
+  [--policy=POLICY] lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
                             add a route to ROUTER\n\
   lr-route-del ROUTER [PREFIX]\n\
                             remove routes from ROUTER\n\
@@ -2032,6 +2032,11 @@  nbctl_lr_route_add(struct ctl_context *ctx)
     lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
     char *prefix, *next_hop;
 
+    const char *policy = shash_find_data(&ctx->options, "--policy");
+    if (policy && strcmp(policy, "src-ip") && strcmp(policy, "dst-ip")) {
+        ctl_fatal("bad policy: %s", policy);
+    }
+
     prefix = normalize_prefix_str(ctx->argv[2]);
     if (!prefix) {
         ctl_fatal("bad prefix argument: %s", ctx->argv[2]);
@@ -2092,6 +2097,9 @@  nbctl_lr_route_add(struct ctl_context *ctx)
             nbrec_logical_router_static_route_set_output_port(route,
                                                               ctx->argv[4]);
         }
+        if (policy) {
+            nbrec_logical_router_static_route_set_policy(route, policy);
+        }
         free(rt_prefix);
         free(next_hop);
         free(prefix);
@@ -2105,6 +2113,9 @@  nbctl_lr_route_add(struct ctl_context *ctx)
     if (ctx->argc == 5) {
         nbrec_logical_router_static_route_set_output_port(route, ctx->argv[4]);
     }
+    if (policy) {
+        nbrec_logical_router_static_route_set_policy(route, policy);
+    }
 
     nbrec_logical_router_verify_static_routes(lr);
     struct nbrec_logical_router_static_route **new_routes
@@ -2458,7 +2469,7 @@  nbctl_lrp_get_enabled(struct ctl_context *ctx)
 }
 
 struct ipv4_route {
-    int plen;
+    int priority;
     ovs_be32 addr;
     const struct nbrec_logical_router_static_route *route;
 };
@@ -2469,8 +2480,8 @@  ipv4_route_cmp(const void *route1_, const void *route2_)
     const struct ipv4_route *route1p = route1_;
     const struct ipv4_route *route2p = route2_;
 
-    if (route1p->plen != route2p->plen) {
-        return route1p->plen > route2p->plen ? -1 : 1;
+    if (route1p->priority != route2p->priority) {
+        return route1p->priority > route2p->priority ? -1 : 1;
     } else if (route1p->addr != route2p->addr) {
         return ntohl(route1p->addr) < ntohl(route2p->addr) ? -1 : 1;
     } else {
@@ -2479,7 +2490,7 @@  ipv4_route_cmp(const void *route1_, const void *route2_)
 }
 
 struct ipv6_route {
-    int plen;
+    int priority;
     struct in6_addr addr;
     const struct nbrec_logical_router_static_route *route;
 };
@@ -2490,8 +2501,8 @@  ipv6_route_cmp(const void *route1_, const void *route2_)
     const struct ipv6_route *route1p = route1_;
     const struct ipv6_route *route2p = route2_;
 
-    if (route1p->plen != route2p->plen) {
-        return route1p->plen > route2p->plen ? -1 : 1;
+    if (route1p->priority != route2p->priority) {
+        return route1p->priority > route2p->priority ? -1 : 1;
     }
     return memcmp(&route1p->addr, &route2p->addr, sizeof(route1p->addr));
 }
@@ -2506,6 +2517,12 @@  print_route(const struct nbrec_logical_router_static_route *route, struct ds *s)
     free(prefix);
     free(next_hop);
 
+    if (route->policy) {
+        ds_put_format(s, " %s", route->policy);
+    } else {
+        ds_put_format(s, " %s", "dst-ip");
+    }
+
     if (route->output_port) {
         ds_put_format(s, " %s", route->output_port);
     }
@@ -2531,11 +2548,13 @@  nbctl_lr_route_list(struct ctl_context *ctx)
             = lr->static_routes[i];
         unsigned int plen;
         ovs_be32 ipv4;
+        const char *policy = route->policy ? route->policy : "dst-ip";
         char *error;
-
         error = ip_parse_cidr(route->ip_prefix, &ipv4, &plen);
         if (!error) {
-            ipv4_routes[n_ipv4_routes].plen = plen;
+            ipv4_routes[n_ipv4_routes].priority = !strcmp(policy, "dst-ip")
+                                                    ? (2 * plen) + 1
+                                                    : 2 * plen;
             ipv4_routes[n_ipv4_routes].addr = ipv4;
             ipv4_routes[n_ipv4_routes].route = route;
             n_ipv4_routes++;
@@ -2545,7 +2564,9 @@  nbctl_lr_route_list(struct ctl_context *ctx)
             struct in6_addr ipv6;
             error = ipv6_parse_cidr(route->ip_prefix, &ipv6, &plen);
             if (!error) {
-                ipv6_routes[n_ipv6_routes].plen = plen;
+                ipv6_routes[n_ipv6_routes].priority = !strcmp(policy, "dst-ip")
+                                                        ? (2 * plen) + 1
+                                                        : 2 * plen;
                 ipv6_routes[n_ipv6_routes].addr = ipv6;
                 ipv6_routes[n_ipv6_routes].route = route;
                 n_ipv6_routes++;
@@ -2948,7 +2969,7 @@  static const struct ctl_command_syntax nbctl_commands[] = {
 
     /* logical router route commands. */
     { "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
-      nbctl_lr_route_add, NULL, "--may-exist", RW },
+      nbctl_lr_route_add, NULL, "--may-exist,--policy=", RW },
     { "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
       NULL, "--if-exists", RW },
     { "lr-route-list", 1, 1, "ROUTER", NULL, nbctl_lr_route_list, NULL,
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index af00dad..0ea6ab8 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -657,20 +657,23 @@  AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1/64], [
 ])
 
 AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1])
+AT_CHECK([ovn-nbctl --policy=src-ip lr-route-add lr0 9.16.1.0/24 11.0.0.1])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+              9.16.1.0/24                  11.0.0.1 src-ip
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 lp1])
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1 lp1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip lp1
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+              9.16.1.0/24                  11.0.0.1 src-ip
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
 dnl Delete non-existent prefix
@@ -680,11 +683,12 @@  AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.2.1/24], [1], [],
 AT_CHECK([ovn-nbctl --if-exists lr-route-del lr0 10.0.2.1/24])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.1.1/24])
+AT_CHECK([ovn-nbctl lr-route-del lr0 9.16.1.0/24])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1 lp1
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip lp1
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -698,17 +702,17 @@  AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-            2001:db8::/64        2001:db8:0:f102::1 lp0
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+            2001:db8::/64        2001:db8:0:f102::1 dst-ip lp0
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0 2001:0db8:0::/64])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -725,14 +729,14 @@  AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+                0.0.0.0/0               192.168.0.1 dst-ip
 
 IPv6 Routes
-            2001:db8::/64        2001:db8:0:f102::1 lp0
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+            2001:db8::/64        2001:db8:0:f102::1 dst-ip lp0
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 OVN_NBCTL_TEST_STOP
diff --git a/tests/ovn.at b/tests/ovn.at
index 3910958..6a9c4b0 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5446,3 +5446,222 @@  check_tos 0
 
 OVN_CLEANUP([hv])
 AT_CLEANUP
+
+AT_SETUP([ovn -- 3 HVs, 3 LRs connected via LS, source IP based routes])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+# Logical network:
+# Three LRs - R1, R2 and R3 that are connected to each other via LS "join"
+# in 20.0.0.0/24 network. R1 has switches foo (192.168.1.0/24) and bar
+# (192.168.2.0/24) connected to it.
+#
+# R2 and R3 are gateway routers.
+# R2 has alice (172.16.1.0/24) and R3 has bob (172.16.1.0/24)
+# connected to it. Note that alice and bob both have the same subnet behind
+# them. We are simulating an external network via those 2 switches. In the real
+# world, the switch ports of these switches would have addresses set to
+# "unknown" to make them learning switches, or they would be "localnet" ones.
+
+# Create three hypervisors and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=foo1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+
+ovs-vsctl -- add-port br-int hv1-vif2 -- \
+    set interface hv1-vif2 external-ids:iface-id=bar1 \
+    options:tx_pcap=hv1/vif2-tx.pcap \
+    options:rxq_pcap=hv1/vif2-rx.pcap \
+    ofport-request=2
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=alice1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap \
+    ofport-request=1
+
+sim_add hv3
+as hv3
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.3
+ovs-vsctl -- add-port br-int hv3-vif1 -- \
+    set interface hv3-vif1 external-ids:iface-id=bob1 \
+    options:tx_pcap=hv3/vif1-tx.pcap \
+    options:rxq_pcap=hv3/vif1-rx.pcap \
+    ofport-request=1
+
+
+ovn-nbctl create Logical_Router name=R1
+ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"
+ovn-nbctl create Logical_Router name=R3 options:chassis="hv3"
+
+ovn-nbctl ls-add foo
+ovn-nbctl ls-add bar
+ovn-nbctl ls-add alice
+ovn-nbctl ls-add bob
+ovn-nbctl ls-add join
+
+# Connect foo to R1
+ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24
+ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo type=router \
+    options:router-port=foo addresses=\"00:00:01:01:02:03\"
+
+# Connect bar to R1
+ovn-nbctl lrp-add R1 bar 00:00:01:01:02:04 192.168.2.1/24
+ovn-nbctl lsp-add bar rp-bar -- set Logical_Switch_Port rp-bar type=router \
+    options:router-port=bar addresses=\"00:00:01:01:02:04\"
+
+# Connect alice to R2
+ovn-nbctl lrp-add R2 alice 00:00:02:01:02:03 172.16.1.1/24
+ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
+    type=router options:router-port=alice addresses=\"00:00:02:01:02:03\"
+
+# Connect bob to R3
+ovn-nbctl lrp-add R3 bob 00:00:03:01:02:03 172.16.1.2/24
+ovn-nbctl lsp-add bob rp-bob -- set Logical_Switch_Port rp-bob \
+    type=router options:router-port=bob addresses=\"00:00:03:01:02:03\"
+
+# Connect R1 to join
+ovn-nbctl lrp-add R1 R1_join 00:00:04:01:02:03 20.0.0.1/24
+ovn-nbctl lsp-add join r1-join -- set Logical_Switch_Port r1-join \
+    type=router options:router-port=R1_join addresses='"00:00:04:01:02:03"'
+
+# Connect R2 to join
+ovn-nbctl lrp-add R2 R2_join 00:00:04:01:02:04 20.0.0.2/24
+ovn-nbctl lsp-add join r2-join -- set Logical_Switch_Port r2-join \
+    type=router options:router-port=R2_join addresses='"00:00:04:01:02:04"'
+
+# Connect R3 to join
+ovn-nbctl lrp-add R3 R3_join 00:00:04:01:02:05 20.0.0.3/24
+ovn-nbctl lsp-add join r3-join -- set Logical_Switch_Port r3-join \
+    type=router options:router-port=R3_join addresses='"00:00:04:01:02:05"'
+
+# Install static routes with source ip address as the policy for routing.
+# We want traffic from 'foo' to go via R2 and traffic of 'bar' to go via R3.
+ovn-nbctl --policy="src-ip" lr-route-add R1 192.168.1.0/24 20.0.0.2
+ovn-nbctl --policy="src-ip" lr-route-add R1 192.168.2.0/24 20.0.0.3
+
+# Install static routes with destination ip address as the policy for routing.
+ovn-nbctl lr-route-add R2 192.168.0.0/16 20.0.0.1
+
+ovn-nbctl lr-route-add R3 192.168.0.0/16 20.0.0.1
+
+# Create logical port foo1 in foo
+ovn-nbctl lsp-add foo foo1 \
+-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
+
+# Create logical port bar1 in bar
+ovn-nbctl lsp-add bar bar1 \
+-- lsp-set-addresses bar1 "f0:00:00:01:02:04 192.168.2.2"
+
+# Create logical port alice1 in alice
+ovn-nbctl lsp-add alice alice1 \
+-- lsp-set-addresses alice1 "f0:00:00:01:02:05 172.16.1.3"
+
+# Create logical port bob1 in bob
+ovn-nbctl lsp-add bob bob1 \
+-- lsp-set-addresses bob1 "f0:00:00:01:02:06 172.16.1.4"
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+ovn_populate_arp
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 1
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+trim_zeros() {
+    sed 's/\(00\)\{1,\}$//'
+}
+
+# Send ip packets between foo1 and bar1
+# (East-west traffic should flow normally)
+src_mac="f00000010203"
+dst_mac="000001010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 192 168 2 2`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+
+# Send ip packets between foo1 and alice1
+src_mac="f00000010203"
+dst_mac="000001010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 172 16 1 3`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+#as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
+
+# Send ip packets between bar1 and bob1
+src_mac="f00000010204"
+dst_mac="000001010204"
+src_ip=`ip_to_hex 192 168 2 2`
+dst_ip=`ip_to_hex 172 16 1 4`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif2 $packet
+#as hv1 ovs-appctl ofproto/trace br-int in_port=2 $packet
+
+# Packet to expect at bar1
+src_mac="000001010204"
+dst_mac="f00000010204"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 192 168 2 2`
+expected=${dst_mac}${src_mac}08004500001c000000003f110100${src_ip}${dst_ip}0035111100080000
+echo $expected > expected
+OVN_CHECK_PACKETS([hv1/vif2-tx.pcap], [expected])
+
+# Packet to Expect at alice1
+src_mac="000002010203"
+dst_mac="f00000010205"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 172 16 1 3`
+expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
+echo $expected > expected
+OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])
+
+# Packet to Expect at bob1
+src_mac="000003010203"
+dst_mac="f00000010206"
+src_ip=`ip_to_hex 192 168 2 2`
+dst_ip=`ip_to_hex 172 16 1 4`
+expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
+echo $expected > expected
+OVN_CHECK_PACKETS([hv3/vif1-tx.pcap], [expected])
+
+for sim in hv1 hv2 hv3; do
+    as $sim
+    OVS_APP_EXIT_AND_WAIT([ovn-controller])
+    OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+    OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+done
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as main
+OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+AT_CLEANUP