
[ovs-dev,2/2] ovn: Add a case of policy based routing.

Message ID 1467368619-4916-2-git-send-email-guru@ovn.org
State Superseded

Commit Message

Gurucharan Shetty July 1, 2016, 10:23 a.m. UTC
OVN currently supports multiple gateway routers (residing on
different chassis) connected to the same logical topology.

When external traffic enters the logical topology, it can enter
through any gateway router and reach its eventual destination. This
is achieved with proper static routes configured on the gateway
routers.

But when traffic is initiated in the logical space by a logical
port, we do not have a good way to distribute that traffic across
multiple gateway routers.

This commit introduces one particular way to do it. Based on the
source IP address or source IP network of the packet, we can now
jump to a specific gateway router.

This is very useful for a specific use case of Kubernetes.
When traffic initiated inside a container heads to the outside world,
we want to be able to send such traffic out through the gateway router
residing on the same host as the container. Since each
host gets a specific subnet, we can use source IP address based
policy routing to decide on the gateway router.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
---
 ovn/northd/ovn-northd.c   |  20 +++--
 ovn/ovn-nb.ovsschema      |   8 +-
 ovn/ovn-nb.xml            |  22 +++++
 ovn/utilities/ovn-nbctl.c |  20 +++--
 tests/ovn-nbctl.at        |  76 +++++++++-------
 tests/ovn.at              | 224 ++++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 312 insertions(+), 58 deletions(-)

Comments

Ben Pfaff July 1, 2016, 9:36 p.m. UTC | #1
On Fri, Jul 01, 2016 at 03:23:39AM -0700, Gurucharan Shetty wrote:
> OVN currently supports multiple gateway routers (residing on
> different chassis) connected to the same logical topology.
> 
> When external traffic enters the logical topology, they can enter
> from any gateway routers and reach its eventual destination. This
> is achieved with proper static routes configured on the gateway
> routers.
> 
> But when traffic is initiated in the logical space by a logical
> port, we do not have a good way to distribute that traffic across
> multiple gateway routers.
> 
> This commit introduces one particular way to do it. Based on the
> source IP address or source IP network of the packet, we can now
> jump to a specific gateway router.
> 
> This is very useful for a specific use case of Kubernetes.
> When traffic is initiated inside a container heading to outside world,
> we want to be able to send such traffic outside the gateway router
> residing in the same host as that of the container. Since each
> host gets a specific subnet, we can use source IP address based
> policy routing to decide on the gateway router.
> 
> Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Maybe it is my own naivete, because I have not used policy-based routing
before, but it is not obvious to me how dst and src routes should
interact.  Is it normal for a single routing table to contain both dst
and src routes?  It appears from the implementation that if both are
present then they are applied using a longest-prefix-match approach
regardless of the field that matches.  In the OpenFlow implementation, I
see that the meaning of the routing table is ambiguous when there are
src and dst routes with the same prefix length.

The two forks here are so similar:
    char *match;
    if (policy && !strcmp(policy, "src-ip")) {
        match = xasprintf("ip4.src == "IP_FMT"/"IP_FMT,
                          IP_ARGS(network), IP_ARGS(mask));
    } else {
        match = xasprintf("ip4.dst == "IP_FMT"/"IP_FMT,
                          IP_ARGS(network), IP_ARGS(mask));
    }
that I'd be inclined to factor it out, e.g.:
    const char *dir = policy && !strcmp(policy, "src-ip") ? "src" : "dst";
    char *match = xasprintf("ip4.%s == "IP_FMT"/"IP_FMT,
                            dir, IP_ARGS(network), IP_ARGS(mask));

I would have expected the new argument to the ovn-nbctl command to be
optional.

The ovn-nbctl manpage needs an update.

Acked-by: Ben Pfaff <blp@ovn.org>
Gurucharan Shetty July 1, 2016, 10:54 p.m. UTC | #2
On 1 July 2016 at 14:36, Ben Pfaff <blp@ovn.org> wrote:

> On Fri, Jul 01, 2016 at 03:23:39AM -0700, Gurucharan Shetty wrote:
> > OVN currently supports multiple gateway routers (residing on
> > different chassis) connected to the same logical topology.
> >
> > When external traffic enters the logical topology, they can enter
> > from any gateway routers and reach its eventual destination. This
> > is achieved with proper static routes configured on the gateway
> > routers.
> >
> > But when traffic is initiated in the logical space by a logical
> > port, we do not have a good way to distribute that traffic across
> > multiple gateway routers.
> >
> > This commit introduces one particular way to do it. Based on the
> > source IP address or source IP network of the packet, we can now
> > jump to a specific gateway router.
> >
> > This is very useful for a specific use case of Kubernetes.
> > When traffic is initiated inside a container heading to outside world,
> > we want to be able to send such traffic outside the gateway router
> > residing in the same host as that of the container. Since each
> > host gets a specific subnet, we can use source IP address based
> > policy routing to decide on the gateway router.
> >
> > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
>
> Maybe it is my own naivete, because I have not used policy-based routing
> before, but it is not obvious to me how dst and src routes should
> interact.  Is it normal for a single routing table to contain both dst
> and src routes?


I am not sure either. I thought about a couple of other options, but without
an explicit priority of one over the other, I couldn't think of a good
idea. I will let this stay here for a few days to see if anyone else (or
you) has a better idea here.


> It appears from the implementation that if both are
> present then they are applied using a longest-prefix-match approach
> regardless of the field that matches.  In the OpenFlow implementation, I
> see that the meaning of the routing table is ambiguous when there are
> src and dst routes with the same prefix length.
>
> The two forks here are so similar:
>     char *match;
>     if (policy && !strcmp(policy, "src-ip")) {
>         match = xasprintf("ip4.src == "IP_FMT"/"IP_FMT,
>                           IP_ARGS(network), IP_ARGS(mask));
>     } else {
>         match = xasprintf("ip4.dst == "IP_FMT"/"IP_FMT,
>                           IP_ARGS(network), IP_ARGS(mask));
>     }
> that I'd be inclined to factor it out, e.g.:
>     const char *dir = policy && !strcmp(policy, "src-ip") ? "src" : "dst";
>     char *match = xasprintf("ip4.%s == "IP_FMT"/"IP_FMT,
>                             dir, IP_ARGS(network), IP_ARGS(mask));
>
I will do the above.


>
> I would have expected the new argument to the ovn-nbctl command to be
> optional.
>
I wanted to do that. But the lr-route-add command already had one optional
argument. I couldn't quite think of a nice way of adding another optional
argument which is just a string. "src-ip" and "dst-ip" can (for whatever
reason) be names of a router or the names of a router port. So, I was not
sure how to decide which optional argument was given. Do you have an idea?



>
> The ovn-nbctl manpage needs an update.
>
Ugh, did not do a 'git add'


>
> Acked-by: Ben Pfaff <blp@ovn.org>
>
Ben Pfaff July 2, 2016, 12:11 a.m. UTC | #3
On Fri, Jul 01, 2016 at 03:54:36PM -0700, Guru Shetty wrote:
> On 1 July 2016 at 14:36, Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Fri, Jul 01, 2016 at 03:23:39AM -0700, Gurucharan Shetty wrote:
> > > OVN currently supports multiple gateway routers (residing on
> > > different chassis) connected to the same logical topology.
> > >
> > > When external traffic enters the logical topology, they can enter
> > > from any gateway routers and reach its eventual destination. This
> > > is achieved with proper static routes configured on the gateway
> > > routers.
> > >
> > > But when traffic is initiated in the logical space by a logical
> > > port, we do not have a good way to distribute that traffic across
> > > multiple gateway routers.
> > >
> > > This commit introduces one particular way to do it. Based on the
> > > source IP address or source IP network of the packet, we can now
> > > jump to a specific gateway router.
> > >
> > > This is very useful for a specific use case of Kubernetes.
> > > When traffic is initiated inside a container heading to outside world,
> > > we want to be able to send such traffic outside the gateway router
> > > residing in the same host as that of the container. Since each
> > > host gets a specific subnet, we can use source IP address based
> > > policy routing to decide on the gateway router.
> > >
> > > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> >
> > I would have expected the new argument to the ovn-nbctl command to be
> > optional.
> >
> I wanted to do that. But lr-route-add command already had one optional
> argument. I couldn't quite think of a nice way of adding another optional
> argument which is just a string. "src-ip" and "dst-ip" can (for whatever
> reason) be names of a router or the names of a router port. So, I was not
> sure how to decide which optional argument was given. Do you have a idea?

I can think of two ways.

One is to require that the policy argument be present if the output port
argument is present, so that one may specify one of the following, which
is slightly awkward but not really too bad.

        lr-route-add ROUTER PREFIX NEXTHOP
        lr-route-add ROUTER PREFIX NEXTHOP POLICY
        lr-route-add ROUTER PREFIX NEXTHOP POLICY PORT

The other would be to make these real options, so that one ends up with:

        [--policy=POLICY] [--out-port=PORT] lr-route-add ROUTER PREFIX NEXTHOP
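
A minimal sketch of how the first (positional) form could be disambiguated by argument count alone. This is standalone C, not the ovn-nbctl code; struct route_args and parse_route_args() are hypothetical names, and 'args' is assumed to exclude the command name itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed lr-route-add arguments. */
struct route_args {
    const char *router;
    const char *prefix;
    const char *nexthop;
    const char *policy;     /* "src-ip" or "dst-ip"; defaults to "dst-ip". */
    const char *port;       /* NULL when no output port was given. */
};

/* Parses ROUTER PREFIX NEXTHOP [POLICY [PORT]].  Because PORT is accepted
 * only after POLICY, the fourth argument is always a policy and never a
 * port name.  Returns 0 on success, -1 on a bad count or bad policy. */
static int
parse_route_args(int n, const char *args[], struct route_args *out)
{
    assert(args != NULL && out != NULL);

    if (n < 3 || n > 5) {
        return -1;
    }
    out->router = args[0];
    out->prefix = args[1];
    out->nexthop = args[2];
    out->policy = n >= 4 ? args[3] : "dst-ip";
    out->port = n == 5 ? args[4] : NULL;

    if (strcmp(out->policy, "src-ip") && strcmp(out->policy, "dst-ip")) {
        return -1;
    }
    return 0;
}
```

With this shape, a four-argument call whose last argument is a port name such as lp0 is rejected as a bad policy rather than silently misread, which is the awkward part of this option.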
Gurucharan Shetty July 2, 2016, 10:45 p.m. UTC | #4
On 1 July 2016 at 14:36, Ben Pfaff <blp@ovn.org> wrote:

> On Fri, Jul 01, 2016 at 03:23:39AM -0700, Gurucharan Shetty wrote:
> > OVN currently supports multiple gateway routers (residing on
> > different chassis) connected to the same logical topology.
> >
> > When external traffic enters the logical topology, they can enter
> > from any gateway routers and reach its eventual destination. This
> > is achieved with proper static routes configured on the gateway
> > routers.
> >
> > But when traffic is initiated in the logical space by a logical
> > port, we do not have a good way to distribute that traffic across
> > multiple gateway routers.
> >
> > This commit introduces one particular way to do it. Based on the
> > source IP address or source IP network of the packet, we can now
> > jump to a specific gateway router.
> >
> > This is very useful for a specific use case of Kubernetes.
> > When traffic is initiated inside a container heading to outside world,
> > we want to be able to send such traffic outside the gateway router
> > residing in the same host as that of the container. Since each
> > host gets a specific subnet, we can use source IP address based
> > policy routing to decide on the gateway router.
> >
> > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
>
> Maybe it is my own naivete, because I have not used policy-based routing
> before, but it is not obvious to me how dst and src routes should
> interact.  Is it normal for a single routing table to contain both dst
> and src routes?  It appears from the implementation that if both are
> present then they are applied using a longest-prefix-match approach
> regardless of the field that matches.  In the OpenFlow implementation, I
> see that the meaning of the routing table is ambiguous when there are
> src and dst routes with the same prefix length.
>

I agree that the above is a problem. We can now have a router's connected
(built-in) routes having the same priority as policy-based routes, causing
problems for east-west traffic. This was not a general problem for
destination-IP-based routes, as you would add routes that are not directly
reachable. I need to rethink this and I will try to come up with
something better.


>
> The two forks here are so similar:
>     char *match;
>     if (policy && !strcmp(policy, "src-ip")) {
>         match = xasprintf("ip4.src == "IP_FMT"/"IP_FMT,
>                           IP_ARGS(network), IP_ARGS(mask));
>     } else {
>         match = xasprintf("ip4.dst == "IP_FMT"/"IP_FMT,
>                           IP_ARGS(network), IP_ARGS(mask));
>     }
> that I'd be inclined to factor it out, e.g.:
>     const char *dir = policy && !strcmp(policy, "src-ip") ? "src" : "dst";
>     char *match = xasprintf("ip4.%s == "IP_FMT"/"IP_FMT,
>                             dir, IP_ARGS(network), IP_ARGS(mask));
>
> I would have expected the new argument to the ovn-nbctl command to be
> optional.
>
> The ovn-nbctl manpage needs an update.
>
> Acked-by: Ben Pfaff <blp@ovn.org>
>
Ben Pfaff July 3, 2016, 12:09 a.m. UTC | #5
On Sat, Jul 02, 2016 at 03:45:23PM -0700, Guru Shetty wrote:
> On 1 July 2016 at 14:36, Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Fri, Jul 01, 2016 at 03:23:39AM -0700, Gurucharan Shetty wrote:
> > > OVN currently supports multiple gateway routers (residing on
> > > different chassis) connected to the same logical topology.
> > >
> > > When external traffic enters the logical topology, they can enter
> > > from any gateway routers and reach its eventual destination. This
> > > is achieved with proper static routes configured on the gateway
> > > routers.
> > >
> > > But when traffic is initiated in the logical space by a logical
> > > port, we do not have a good way to distribute that traffic across
> > > multiple gateway routers.
> > >
> > > This commit introduces one particular way to do it. Based on the
> > > source IP address or source IP network of the packet, we can now
> > > jump to a specific gateway router.
> > >
> > > This is very useful for a specific use case of Kubernetes.
> > > When traffic is initiated inside a container heading to outside world,
> > > we want to be able to send such traffic outside the gateway router
> > > residing in the same host as that of the container. Since each
> > > host gets a specific subnet, we can use source IP address based
> > > policy routing to decide on the gateway router.
> > >
> > > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> >
> > Maybe it is my own naivete, because I have not used policy-based routing
> > before, but it is not obvious to me how dst and src routes should
> > interact.  Is it normal for a single routing table to contain both dst
> > and src routes?  It appears from the implementation that if both are
> > present then they are applied using a longest-prefix-match approach
> > regardless of the field that matches.  In the OpenFlow implementation, I
> > see that the meaning of the routing table is ambiguous when there are
> > src and dst routes with the same prefix length.
> >
> 
> I agree that the above is a problem. We can now have router's connected
> (in-built) routes having the same priority as policy based routes causing
> problem for east-west traffic. This was not a general problem for
> destination ip based routes as you would add routes that are not directly
> reachable. I need to re-think this and I will try and come up with
> something better.

If there's a general rule that, for example, if a src and a dst route
with the same length both match, then the dst route should be chosen,
then it's possible to interleave the priorities, e.g. to
priority=length*2 for src routes and priority=length*2+1 for dst routes.
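
A minimal sketch of that interleaving, assuming IPv4 prefix lengths of 0 to 32; route_priority() is a hypothetical name, not a function in ovn-northd.

```c
#include <assert.h>
#include <stdbool.h>

/* Maps a prefix length and route kind to a flow priority so that:
 *   - a longer prefix always beats a shorter one, regardless of kind;
 *   - at equal prefix length, the dst route beats the src route. */
static int
route_priority(int plen, bool is_src)
{
    assert(plen >= 0 && plen <= 32);
    return is_src ? plen * 2 : plen * 2 + 1;
}
```

For example, a /24 src route gets priority 48 and a /24 dst route gets 49, while a /25 src route gets 50 and so still wins over either /24 route.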

Patch

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index c2cf15e..3d2ab39 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -1759,10 +1759,17 @@  lrport_is_enabled(const struct nbrec_logical_router_port *lrport)
 
 static void
 add_route(struct hmap *lflows, const struct ovn_port *op,
-          ovs_be32 network, ovs_be32 mask, ovs_be32 gateway)
+          ovs_be32 network, ovs_be32 mask, ovs_be32 gateway,
+          const char *policy)
 {
-    char *match = xasprintf("ip4.dst == "IP_FMT"/"IP_FMT,
-                            IP_ARGS(network), IP_ARGS(mask));
+    char *match;
+    if (policy && !strcmp(policy, "src-ip")) {
+        match = xasprintf("ip4.src == "IP_FMT"/"IP_FMT,
+                          IP_ARGS(network), IP_ARGS(mask));
+    } else {
+        match = xasprintf("ip4.dst == "IP_FMT"/"IP_FMT,
+                          IP_ARGS(network), IP_ARGS(mask));
+    }
 
     struct ds actions = DS_EMPTY_INITIALIZER;
     ds_put_cstr(&actions, "ip.ttl--; reg0 = ");
@@ -1852,7 +1859,8 @@  build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od,
         }
     }
 
-    add_route(lflows, out_port, prefix, mask, next_hop);
+    const char *policy = route->policy ? route->policy : "dst-ip";
+    add_route(lflows, out_port, prefix, mask, next_hop, policy);
 }
 
 static void
@@ -2219,7 +2227,7 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
             continue;
         }
 
-        add_route(lflows, op, op->network, op->mask, 0);
+        add_route(lflows, op, op->network, op->mask, 0, NULL);
     }
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (!od->nbr) {
@@ -2235,7 +2243,7 @@  build_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         }
 
         if (od->gateway && od->gateway_port) {
-            add_route(lflows, od->gateway_port, 0, 0, od->gateway);
+            add_route(lflows, od->gateway_port, 0, 0, od->gateway, NULL);
         }
     }
     /* XXX destination unreachable */
diff --git a/ovn/ovn-nb.ovsschema b/ovn/ovn-nb.ovsschema
index 58f04b2..099c50c 100644
--- a/ovn/ovn-nb.ovsschema
+++ b/ovn/ovn-nb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Northbound",
-    "version": "3.1.0",
-    "cksum": "1426508118 6135",
+    "version": "3.2.0",
+    "cksum": "904639232 6400",
     "tables": {
         "Logical_Switch": {
             "columns": {
@@ -107,6 +107,10 @@ 
         "Logical_Router_Static_Route": {
             "columns": {
                 "ip_prefix": {"type": "string"},
+                "policy": {"type": {"key": {"type": "string",
+                                            "enum": ["set", ["src-ip",
+                                                             "dst-ip"]]},
+                                    "min": 0, "max": 1}},
                 "nexthop": {"type": "string"},
                 "output_port": {"type": {"key": "string", "min": 0, "max": 1}}},
             "isRoot": false},
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 6355c44..1e3a046 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -762,6 +762,28 @@ 
       </p>
     </column>
 
+    <column name="policy">
+      <p>
+        If it is specified, this setting describes the policy used to make
+        routing decisions.  This setting must be one of the following strings:
+      </p>
+      <ul>
+        <li>
+          <code>src-ip</code>: This policy sends the packet to the
+          <ref column="nexthop"/> when the packet's source IP address matches
+          <ref column="ip_prefix"/>.
+        </li>
+        <li>
+          <code>dst-ip</code>: This policy sends the packet to the
+          <ref column="nexthop"/> when the packet's destination IP address
+          matches <ref column="ip_prefix"/>.
+        </li>
+      </ul>
+      <p>
+        If not specified, the default is <code>dst-ip</code>.
+      </p>
+    </column>
+
     <column name="nexthop">
       <p>
         Nexthop IP address for this route.  Nexthop IP address should be the IP
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index abeba0b..6420e40 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -351,7 +351,7 @@  Logical router port commands:\n\
                             ('enabled' or 'disabled')\n\
 \n\
 Route commands:\n\
-  lr-route-add ROUTER PREFIX NEXTHOP [PORT]\n\
+  lr-route-add ROUTER PREFIX NEXTHOP POLICY [PORT]\n\
                             add a route to ROUTER\n\
   lr-route-del ROUTER [PREFIX]\n\
                             remove routes from ROUTER\n\
@@ -1356,6 +1356,10 @@  nbctl_lr_route_add(struct ctl_context *ctx)
         }
     }
 
+    if (strcmp(ctx->argv[4], "src-ip") && strcmp(ctx->argv[4], "dst-ip")) {
+        ctl_fatal("bad policy: %s", ctx->argv[4]);
+    }
+
     bool may_exist = shash_find(&ctx->options, "--may-exist") != NULL;
     for (int i = 0; i < lr->n_static_routes; i++) {
         const struct nbrec_logical_router_static_route *route
@@ -1383,9 +1387,10 @@  nbctl_lr_route_add(struct ctl_context *ctx)
         nbrec_logical_router_static_route_verify_nexthop(route);
         nbrec_logical_router_static_route_set_ip_prefix(route, prefix);
         nbrec_logical_router_static_route_set_nexthop(route, next_hop);
-        if (ctx->argc == 5) {
+        nbrec_logical_router_static_route_set_policy(route, ctx->argv[4]);
+        if (ctx->argc == 6) {
             nbrec_logical_router_static_route_set_output_port(route,
-                                                              ctx->argv[4]);
+                                                              ctx->argv[5]);
         }
         free(rt_prefix);
         free(next_hop);
@@ -1397,8 +1402,9 @@  nbctl_lr_route_add(struct ctl_context *ctx)
     route = nbrec_logical_router_static_route_insert(ctx->txn);
     nbrec_logical_router_static_route_set_ip_prefix(route, prefix);
     nbrec_logical_router_static_route_set_nexthop(route, next_hop);
-    if (ctx->argc == 5) {
-        nbrec_logical_router_static_route_set_output_port(route, ctx->argv[4]);
+    nbrec_logical_router_static_route_set_policy(route, ctx->argv[4]);
+    if (ctx->argc == 6) {
+        nbrec_logical_router_static_route_set_output_port(route, ctx->argv[5]);
     }
 
     nbrec_logical_router_verify_static_routes(lr);
@@ -1758,7 +1764,7 @@  print_route(const struct nbrec_logical_router_static_route *route, struct ds *s)
 
     char *prefix = normalize_prefix_str(route->ip_prefix);
     char *next_hop = normalize_prefix_str(route->nexthop);
-    ds_put_format(s, "%25s %25s", prefix, next_hop);
+    ds_put_format(s, "%25s %25s %s", prefix, next_hop, route->policy);
     free(prefix);
     free(next_hop);
 
@@ -2134,7 +2140,7 @@  static const struct ctl_command_syntax nbctl_commands[] = {
       NULL, "", RO },
 
     /* logical router route commands. */
-    { "lr-route-add", 3, 4, "ROUTER PREFIX NEXTHOP [PORT]", NULL,
+    { "lr-route-add", 4, 5, "ROUTER PREFIX NEXTHOP POLICY [PORT]", NULL,
       nbctl_lr_route_add, NULL, "--may-exist", RW },
     { "lr-route-del", 1, 2, "ROUTER [PREFIX]", NULL, nbctl_lr_route_del,
       NULL, "--if-exists", RW },
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index 0c756ed..d97813a 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -371,29 +371,32 @@  OVN_NBCTL_TEST_START
 AT_CHECK([ovn-nbctl lr-add lr0])
 
 dnl Check IPv4 routes
-AT_CHECK([ovn-nbctl lr-route-add lr0 0.0.0.0/0 192.168.0.1])
-AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.1.0/24 11.0.1.1 lp0])
-AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.1/24 11.0.0.2])
+AT_CHECK([ovn-nbctl lr-route-add lr0 0.0.0.0/0 192.168.0.1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.1.0/24 11.0.1.1 dst-ip lp0])
+AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.1/24 11.0.0.2 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 192.168.1.0/24 11.0.0.2 src-ip])
 
 dnl Add overlapping route with 10.0.0.1/24
-AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.111/24 11.0.0.1], [1], [],
+AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.111/24 11.0.0.1 dst-ip], [1], [],
   [ovn-nbctl: duplicate prefix: 10.0.0.0/24
 ])
-AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1])
+AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 dst-ip])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+           192.168.1.0/24                  11.0.0.2 src-ip
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
-AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 lp1])
+AT_CHECK([ovn-nbctl --may-exist lr-route-add lr0 10.0.0.111/24 11.0.0.1 dst-ip lp1])
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1 lp1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip lp1
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+           192.168.1.0/24                  11.0.0.2 src-ip
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
 dnl Delete non-existent prefix
@@ -406,8 +409,9 @@  AT_CHECK([ovn-nbctl lr-route-del lr0 10.0.1.1/24])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1 lp1
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip lp1
+           192.168.1.0/24                  11.0.0.2 src-ip
+                0.0.0.0/0               192.168.0.1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -415,23 +419,23 @@  AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 ])
 
 dnl Check IPv6 routes
-AT_CHECK([ovn-nbctl lr-route-add lr0 0:0:0:0:0:0:0:0/0 2001:0db8:0:f101::1])
-AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:0::/64 2001:0db8:0:f102::1 lp0])
-AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
+AT_CHECK([ovn-nbctl lr-route-add lr0 0:0:0:0:0:0:0:0/0 2001:0db8:0:f101::1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:0::/64 2001:0db8:0:f102::1 dst-ip lp0])
+AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1 dst-ip])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-            2001:db8::/64        2001:db8:0:f102::1 lp0
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+            2001:db8::/64        2001:db8:0:f102::1 dst-ip lp0
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0 2001:0db8:0::/64])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv6 Routes
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 AT_CHECK([ovn-nbctl lr-route-del lr0])
@@ -439,23 +443,27 @@  AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 ])
 
 dnl Check IPv4 and IPv6 routes
-AT_CHECK([ovn-nbctl lr-route-add lr0 0.0.0.0/0 192.168.0.1])
-AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.1.1/24 11.0.1.1 lp0])
-AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.1/24 11.0.0.1])
-AT_CHECK([ovn-nbctl lr-route-add lr0 0:0:0:0:0:0:0:0/0 2001:0db8:0:f101::1])
-AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:0::/64 2001:0db8:0:f102::1 lp0])
-AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1])
+AT_CHECK([ovn-nbctl lr-route-add lr0 0.0.0.0/0 192.168.0.1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.1.1/24 11.0.1.1 dst-ip lp0])
+AT_CHECK([ovn-nbctl lr-route-add lr0 10.0.0.1/24 11.0.0.1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 192.168.1.0/24 11.0.0.1 src-ip lp1])
+AT_CHECK([ovn-nbctl lr-route-add lr0 0:0:0:0:0:0:0:0/0 2001:0db8:0:f101::1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:0::/64 2001:0db8:0:f102::1 dst-ip lp0])
+AT_CHECK([ovn-nbctl lr-route-add lr0 2001:0db8:1::/64 2001:0db8:0:f103::1 dst-ip])
+AT_CHECK([ovn-nbctl lr-route-add lr0 4001:0db8:1::/64 2001:0db8:0:f103::1 src-ip])
 
 AT_CHECK([ovn-nbctl lr-route-list lr0], [0], [dnl
 IPv4 Routes
-              10.0.0.0/24                  11.0.0.1
-              10.0.1.0/24                  11.0.1.1 lp0
-                0.0.0.0/0               192.168.0.1
+              10.0.0.0/24                  11.0.0.1 dst-ip
+              10.0.1.0/24                  11.0.1.1 dst-ip lp0
+           192.168.1.0/24                  11.0.0.1 src-ip lp1
+                0.0.0.0/0               192.168.0.1 dst-ip
 
 IPv6 Routes
-            2001:db8::/64        2001:db8:0:f102::1 lp0
-          2001:db8:1::/64        2001:db8:0:f103::1
-                     ::/0        2001:db8:0:f101::1
+            2001:db8::/64        2001:db8:0:f102::1 dst-ip lp0
+          2001:db8:1::/64        2001:db8:0:f103::1 dst-ip
+          4001:db8:1::/64        2001:db8:0:f103::1 src-ip
+                     ::/0        2001:db8:0:f101::1 dst-ip
 ])
 
 OVN_NBCTL_TEST_STOP
diff --git a/tests/ovn.at b/tests/ovn.at
index 297070c..91312f3 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -2436,9 +2436,9 @@  ovn-nbctl lrp-add R1 R1_R2 00:00:00:02:03:04 20.0.0.1/24 R2_R1
 ovn-nbctl lrp-add R2 R2_R1 00:00:00:02:03:05 20.0.0.2/24 R1_R2
 
 #install static routes
-ovn-nbctl lr-route-add R1 172.16.1.0/24 20.0.0.2
-ovn-nbctl lr-route-add R2 172.16.2.0/24 20.0.0.2 R1_R2
-ovn-nbctl lr-route-add R2 192.168.1.0/24 20.0.0.1
+ovn-nbctl lr-route-add R1 172.16.1.0/24 20.0.0.2 dst-ip
+ovn-nbctl lr-route-add R2 172.16.2.0/24 20.0.0.2 dst-ip R1_R2
+ovn-nbctl lr-route-add R2 192.168.1.0/24 20.0.0.1 dst-ip
 
 # Create logical port foo1 in foo
 ovn-nbctl lsp-add foo foo1 \
@@ -2692,14 +2692,14 @@  ovn-nbctl lsp-add join r3-join -- set Logical_Switch_Port r3-join \
     type=router options:router-port=R3_join addresses='"00:00:04:01:02:05"'
 
 #install static routes
-ovn-nbctl lr-route-add R1 172.16.1.0/24 20.0.0.2
-ovn-nbctl lr-route-add R1 10.32.1.0/24 20.0.0.3
+ovn-nbctl lr-route-add R1 172.16.1.0/24 20.0.0.2 dst-ip
+ovn-nbctl lr-route-add R1 10.32.1.0/24 20.0.0.3 dst-ip
 
-ovn-nbctl lr-route-add R2 192.168.1.0/24 20.0.0.1
-ovn-nbctl lr-route-add R2 10.32.1.0/24 20.0.0.3
+ovn-nbctl lr-route-add R2 192.168.1.0/24 20.0.0.1 dst-ip
+ovn-nbctl lr-route-add R2 10.32.1.0/24 20.0.0.3 dst-ip
 
-ovn-nbctl lr-route-add R3 192.168.1.0/24 20.0.0.1
-ovn-nbctl lr-route-add R3 172.16.1.0/24 20.0.0.2
+ovn-nbctl lr-route-add R3 192.168.1.0/24 20.0.0.1 dst-ip
+ovn-nbctl lr-route-add R3 172.16.1.0/24 20.0.0.2 dst-ip
 
 # Create logical port foo1 in foo
 ovn-nbctl lsp-add foo foo1 \
@@ -2844,6 +2844,212 @@  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
 
+AT_SETUP([ovn -- 3 HVs, 3 LRs connected via LS, source IP based routes])
+AT_KEYWORDS([ovnstaticroutes])
+AT_SKIP_IF([test $HAVE_PYTHON = no])
+ovn_start
+
+# Logical network:
+# Three LRs - R1, R2 and R3 that are connected to each other via LS "join"
+# in 20.0.0.0/24 network. R1 has switches foo (192.168.1.0/24) and bar
+# (192.168.2.0/24) connected to it.
+#
+# R2 and R3 are gateway routers.
+# R2 has alice (172.16.1.0/24) and R3 has bob (172.16.1.0/24)
+# connected to it. Note how both alice and bob have the same subnet behind them.
+# We are trying to simulate an external network via those 2 switches. In the real
+# world the switch ports of these switches will have addresses set as "unknown"
+# to make them learning switches. Or those switches will be "localnet" ones.
+
+# Create three hypervisors and create OVS ports corresponding to logical ports.
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.1
+ovs-vsctl -- add-port br-int hv1-vif1 -- \
+    set interface hv1-vif1 external-ids:iface-id=foo1 \
+    options:tx_pcap=hv1/vif1-tx.pcap \
+    options:rxq_pcap=hv1/vif1-rx.pcap \
+    ofport-request=1
+
+ovs-vsctl -- add-port br-int hv1-vif2 -- \
+    set interface hv1-vif2 external-ids:iface-id=bar1 \
+    options:tx_pcap=hv1/vif2-tx.pcap \
+    options:rxq_pcap=hv1/vif2-rx.pcap \
+    ofport-request=2
+
+sim_add hv2
+as hv2
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.2
+ovs-vsctl -- add-port br-int hv2-vif1 -- \
+    set interface hv2-vif1 external-ids:iface-id=alice1 \
+    options:tx_pcap=hv2/vif1-tx.pcap \
+    options:rxq_pcap=hv2/vif1-rx.pcap \
+    ofport-request=1
+
+sim_add hv3
+as hv3
+ovs-vsctl add-br br-phys
+ovn_attach n1 br-phys 192.168.0.3
+ovs-vsctl -- add-port br-int hv3-vif1 -- \
+    set interface hv3-vif1 external-ids:iface-id=bob1 \
+    options:tx_pcap=hv3/vif1-tx.pcap \
+    options:rxq_pcap=hv3/vif1-rx.pcap \
+    ofport-request=1
+
+
+ovn-nbctl create Logical_Router name=R1
+ovn-nbctl create Logical_Router name=R2 options:chassis="hv2"
+ovn-nbctl create Logical_Router name=R3 options:chassis="hv3"
+
+ovn-nbctl ls-add foo
+ovn-nbctl ls-add bar
+ovn-nbctl ls-add alice
+ovn-nbctl ls-add bob
+ovn-nbctl ls-add join
+
+# Connect foo to R1
+ovn-nbctl lrp-add R1 foo 00:00:01:01:02:03 192.168.1.1/24
+ovn-nbctl lsp-add foo rp-foo -- set Logical_Switch_Port rp-foo type=router \
+    options:router-port=foo addresses=\"00:00:01:01:02:03\"
+
+# Connect bar to R1
+ovn-nbctl lrp-add R1 bar 00:00:01:01:02:04 192.168.2.1/24
+ovn-nbctl lsp-add bar rp-bar -- set Logical_Switch_Port rp-bar type=router \
+    options:router-port=bar addresses=\"00:00:01:01:02:04\"
+
+# Connect alice to R2
+ovn-nbctl lrp-add R2 alice 00:00:02:01:02:03 172.16.1.1/24
+ovn-nbctl lsp-add alice rp-alice -- set Logical_Switch_Port rp-alice \
+    type=router options:router-port=alice addresses=\"00:00:02:01:02:03\"
+
+# Connect bob to R3
+ovn-nbctl lrp-add R3 bob 00:00:03:01:02:03 172.16.1.2/24
+ovn-nbctl lsp-add bob rp-bob -- set Logical_Switch_Port rp-bob \
+    type=router options:router-port=bob addresses=\"00:00:03:01:02:03\"
+
+# Connect R1 to join
+ovn-nbctl lrp-add R1 R1_join 00:00:04:01:02:03 20.0.0.1/24
+ovn-nbctl lsp-add join r1-join -- set Logical_Switch_Port r1-join \
+    type=router options:router-port=R1_join addresses='"00:00:04:01:02:03"'
+
+# Connect R2 to join
+ovn-nbctl lrp-add R2 R2_join 00:00:04:01:02:04 20.0.0.2/24
+ovn-nbctl lsp-add join r2-join -- set Logical_Switch_Port r2-join \
+    type=router options:router-port=R2_join addresses='"00:00:04:01:02:04"'
+
+# Connect R3 to join
+ovn-nbctl lrp-add R3 R3_join 00:00:04:01:02:05 20.0.0.3/24
+ovn-nbctl lsp-add join r3-join -- set Logical_Switch_Port r3-join \
+    type=router options:router-port=R3_join addresses='"00:00:04:01:02:05"'
+
+# Install static routes with source IP address as the policy for routing.
+# We want traffic from 'foo' to go via R2 and traffic from 'bar' to go via R3.
+ovn-nbctl lr-route-add R1 192.168.1.0/24 20.0.0.2 src-ip
+ovn-nbctl lr-route-add R1 192.168.2.0/24 20.0.0.3 src-ip
+
+# Install static routes with destination IP address as the policy for routing.
+ovn-nbctl lr-route-add R2 192.168.0.0/16 20.0.0.1 dst-ip
+
+ovn-nbctl lr-route-add R3 192.168.0.0/16 20.0.0.1 dst-ip
+
+# Create logical port foo1 in foo
+ovn-nbctl lsp-add foo foo1 \
+-- lsp-set-addresses foo1 "f0:00:00:01:02:03 192.168.1.2"
+
+# Create logical port bar1 in bar
+ovn-nbctl lsp-add bar bar1 \
+-- lsp-set-addresses bar1 "f0:00:00:01:02:04 192.168.2.2"
+
+# Create logical port alice1 in alice
+ovn-nbctl lsp-add alice alice1 \
+-- lsp-set-addresses alice1 "f0:00:00:01:02:05 172.16.1.3"
+
+# Create logical port bob1 in bob
+ovn-nbctl lsp-add bob bob1 \
+-- lsp-set-addresses bob1 "f0:00:00:01:02:06 172.16.1.4"
+
+# Pre-populate the hypervisors' ARP tables so that we don't lose any
+# packets for ARP resolution (native tunneling doesn't queue packets
+# for ARP resolution).
+ovn_populate_arp
+
+# Allow some time for ovn-northd and ovn-controller to catch up.
+# XXX This should be more systematic.
+sleep 1
+
+ip_to_hex() {
+    printf "%02x%02x%02x%02x" "$@"
+}
+trim_zeros() {
+    sed 's/\(00\)\{1,\}$//'
+}
+
+# Send an IP packet from foo1 to alice1
+src_mac="f00000010203"
+dst_mac="000001010203"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 172 16 1 3`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
+#as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
+
+# Send an IP packet from bar1 to bob1
+src_mac="f00000010204"
+dst_mac="000001010204"
+src_ip=`ip_to_hex 192 168 2 2`
+dst_ip=`ip_to_hex 172 16 1 4`
+packet=${dst_mac}${src_mac}08004500001c0000000040110000${src_ip}${dst_ip}0035111100080000
+as hv1 ovs-appctl netdev-dummy/receive hv1-vif2 $packet
+#as hv1 ovs-appctl ofproto/trace br-int in_port=2 $packet
+
+# Packet to expect at alice1
+src_mac="000002010203"
+dst_mac="f00000010205"
+src_ip=`ip_to_hex 192 168 1 2`
+dst_ip=`ip_to_hex 172 16 1 3`
+expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv2/vif1-tx.pcap | trim_zeros > received.packets
+echo $expected | trim_zeros > expout
+AT_CHECK([cat received.packets], [0], [expout])
+
+# Packet to expect at bob1
+src_mac="000003010203"
+dst_mac="f00000010206"
+src_ip=`ip_to_hex 192 168 2 2`
+dst_ip=`ip_to_hex 172 16 1 4`
+expected=${dst_mac}${src_mac}08004500001c000000003e110200${src_ip}${dst_ip}0035111100080000
+
+$PYTHON "$top_srcdir/utilities/ovs-pcap.in" hv3/vif1-tx.pcap | trim_zeros > received.packets
+echo $expected | trim_zeros > expout
+AT_CHECK([cat received.packets], [0], [expout])
+
+for sim in hv1 hv2 hv3; do
+    as $sim
+    OVS_APP_EXIT_AND_WAIT([ovn-controller])
+    OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+    OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+done
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as main
+OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+AT_CLEANUP
+
 AT_SETUP([ovn -- 2 HVs, 2 LRs connected via LS, gateway router])
 AT_KEYWORDS([ovngatewayrouter])
 AT_SKIP_IF([test $HAVE_PYTHON = no])