Message ID | e590661b77d0b57d2d115b0595d8ab7167dbf129.1612995252.git.lorenzo.bianconi@redhat.com |
---|---|
State | Not Applicable |
Headers | show |
Series | [ovs-dev,RFC] northd: set max priority for automatic routes | expand |
On Wed, Feb 10, 2021 at 2:17 PM Lorenzo Bianconi < lorenzo.bianconi@redhat.com> wrote: > > Increase priority for automatic routes (routes created assigning IP > addresses to OVN logical router interfaces) in order to always prefer them > over static routes since the router has a direct link to the destination > address (possible use-case can be found here [0]). > > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 > Hi Lorenzo, Tim, While the patch may solve the problem in the bug report, I think there is something more fundamental to be discussed. The problem is caused by ovn-k8s's use of "src-ip" in static routes which overrides the direct-connected route. I think the implementation of "src-ip" support in the static route is somehow flawed. The priorities of the flows generated by static routes are calculated according to the prefix length, so that longest prefix routes are prioritised when there are multiple route matches, which is correct when comparing matches among "dst-ip" routes or among "src-ip" routes, but is not correct between "dst-ip" and "src-ip" routes. Comparison of prefix length between these two types of static routes doesn't make sense, because they match by different fields (src-ip v.s. dst-ip). For example, when there are static routes: 1. 192.168.0.0/24 via 100.64.0.1 src-ip 2. 10.0.0.0/20 via 100.64.0.2 dst-ip In this example, a packet from 192.168.0.1 to 10.0.0.1 matches both routes, but it is unreasonable to say it should follow the 1st route just because it has longer prefix length. Instead, we should prioritize one type over the other. It seems in physical router implementation policy based routing always has higher priority than destination routing, so we should probably enforce it in a similar way in OVN, i.e. set "src-ip" flows with higher priority than all the "dst-ip" flows. In fact, the policy routing table already supported this behavior because it is examined before the static route table. Since the "src-ip" feature in the static route table is flawed, and can be replaced by the policy routing table, I'd suggest to deprecate it. For correctness, users (like ovn-k8s) should use the policy routing table instead for the src-ip based routing rules. Users have full control of how they want the packets to be routed. For the use case mentioned in the bug report, it should have two rules in the policy routing table: 100 ip.dst == 100.64.0.0/16 accept # for directly connected destination, skip the src-ip rules 90 ip.src == 10.244.0.0/24 nexthop 100.64.0.1 # src-ip rules Would this better satisfy the need of ovn-k8s? If the above is agreed, the priority change of directly connected routes from this patch is irrelevant to the problem reported in the bug, because policy routing rules are examined before the static routing table anyway, so no matter how high the route priority is, it wouldn't matter. In addition, it seems to me no real use cases would benefit from this change, but I could be wrong and please correct me if so. Thanks, Han > Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > --- > northd/ovn-northd.c | 27 ++++++++++++++++----------- > 1 file changed, 16 insertions(+), 11 deletions(-) > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index b2b5f6a1b..dc8706f2f 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -7994,18 +7994,23 @@ build_route_prefix_s(const struct in6_addr *prefix, unsigned int plen) > > static void > build_route_match(const struct ovn_port *op_inport, const char *network_s, > - int plen, bool is_src_route, bool is_ipv4, struct ds *match, > - uint16_t *priority) > + int plen, bool is_src_route, bool is_ipv4, bool automatic, > + struct ds *match, uint16_t *priority) > { > + int prefix_len = plen; > const char *dir; > /* The priority here is calculated to implement longest-prefix-match > - * routing. */ > + * routing. Automatic routes have max priority */ > + if (automatic) { > + prefix_len = is_ipv4 ? 32 : 128; > + prefix_len++; > + } > if (is_src_route) { > dir = "src"; > - *priority = plen * 2; > + *priority = prefix_len * 2; > } else { > dir = "dst"; > - *priority = (plen * 2) + 1; > + *priority = (prefix_len * 2) + 1; > } > > if (op_inport) { > @@ -8172,7 +8177,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, > > char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen); > build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4, > - &route_match, &priority); > + false, &route_match, &priority); > free(prefix_s); > > struct ds actions = DS_EMPTY_INITIALIZER; > @@ -8246,7 +8251,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, > static void > add_route(struct hmap *lflows, const struct ovn_port *op, > const char *lrp_addr_s, const char *network_s, int plen, > - const char *gateway, bool is_src_route, > + const char *gateway, bool is_src_route, bool automatic, > const struct ovsdb_idl_row *stage_hint) > { > bool is_ipv4 = strchr(network_s, '.') ? true : false; > @@ -8263,7 +8268,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op, > } > } > build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4, > - &match, &priority); > + automatic, &match, &priority); > > struct ds common_actions = DS_EMPTY_INITIALIZER; > ds_put_format(&common_actions, REG_ECMP_GROUP_ID" = 0; %s = ", > @@ -8319,7 +8324,7 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od, > > char *prefix_s = build_route_prefix_s(&route_->prefix, route_->plen); > add_route(lflows, out_port, lrp_addr_s, prefix_s, route_->plen, > - route->nexthop, route_->is_src_route, > + route->nexthop, route_->is_src_route, false, > &route->header_); > > free(prefix_s); > @@ -9389,14 +9394,14 @@ build_ip_routing_flows_for_lrouter_port( > add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s, > op->lrp_networks.ipv4_addrs[i].network_s, > op->lrp_networks.ipv4_addrs[i].plen, NULL, false, > - &op->nbrp->header_); > + true, &op->nbrp->header_); > } > > for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { > add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s, > op->lrp_networks.ipv6_addrs[i].network_s, > op->lrp_networks.ipv6_addrs[i].plen, NULL, false, > - &op->nbrp->header_); > + true, &op->nbrp->header_); > } > } > } > -- > 2.29.2 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Tue, Feb 16, 2021 at 4:21 AM Han Zhou <hzhou@ovn.org> wrote: > > > On Wed, Feb 10, 2021 at 2:17 PM Lorenzo Bianconi < > lorenzo.bianconi@redhat.com> wrote: > > > > Increase priority for automatic routes (routes created assigning IP > > addresses to OVN logical router interfaces) in order to always prefer > them > > over static routes since the router has a direct link to the destination > > address (possible use-case can be found here [0]). > > > > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 > > > > Hi Lorenzo, Tim, > > While the patch may solve the problem in the bug report, I think there is > something more fundamental to be discussed. The problem is caused by > ovn-k8s's use of "src-ip" in static routes which overrides the > direct-connected route. I think the implementation of "src-ip" support in > the static route is somehow flawed. The priorities of the flows generated > by static routes are calculated according to the prefix length, so that > longest prefix routes are prioritised when there are multiple route > matches, which is correct when comparing matches among "dst-ip" routes or > among "src-ip" routes, but is not correct between "dst-ip" and "src-ip" > routes. Comparison of prefix length between these two types of static > routes doesn't make sense, because they match by different fields (src-ip > v.s. dst-ip). For example, when there are static routes: > 1. 192.168.0.0/24 via 100.64.0.1 src-ip > 2. 10.0.0.0/20 via 100.64.0.2 dst-ip > > In this example, a packet from 192.168.0.1 to 10.0.0.1 matches both > routes, but it is unreasonable to say it should follow the 1st route just > because it has longer prefix length. Instead, we should prioritize one type > over the other. It seems in physical router implementation policy based > routing always has higher priority than destination routing, so we should > probably enforce it in a similar way in OVN, i.e. set "src-ip" flows with > higher priority than all the "dst-ip" flows. In fact, the policy routing > table already supported this behavior because it is examined before the > static route table. > > Since the "src-ip" feature in the static route table is flawed, and can be > replaced by the policy routing table, I'd suggest to deprecate it. For > correctness, users (like ovn-k8s) should use the policy routing table > instead for the src-ip based routing rules. Users have full control of how > they want the packets to be routed. For the use case mentioned in the bug > report, it should have two rules in the policy routing table: > > 100 ip.dst == 100.64.0.0/16 accept # for directly connected destination, > skip the src-ip rules > 90 ip.src == 10.244.0.0/24 nexthop 100.64.0.1 # src-ip rules > > Would this better satisfy the need of ovn-k8s? > I believe this is correct. src-ip matching should be done in the policy table so traditional dest based routing is handled in default routing table. Need to go double check though. > If the above is agreed, the priority change of directly connected routes > from this patch is irrelevant to the problem reported in the bug, because > policy routing rules are examined before the static routing table anyway, > so no matter how high the route priority is, it wouldn't matter. In > addition, it seems to me no real use cases would benefit from this change, > but I could be wrong and please correct me if so. > > I disagree with this. Trying to override a directly connected route is fundamentally wrong, which is why real routers specifically stop a user from being able to do this. What if a user who had a router attached to 100.64.0.0/16, adds a /32 route for 100.64.0.1 via another interface/subnet? That would take precedence over the directly attached route in OVN iiuc and pretty much guarantee improper networking. Directly connected routes should always take precedence, and therefore the default route lflows that get installed should always have the highest possible priority. > Thanks, > Han > > > Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> > > --- > > northd/ovn-northd.c | 27 ++++++++++++++++----------- > > 1 file changed, 16 insertions(+), 11 deletions(-) > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > index b2b5f6a1b..dc8706f2f 100644 > > --- a/northd/ovn-northd.c > > +++ b/northd/ovn-northd.c > > @@ -7994,18 +7994,23 @@ build_route_prefix_s(const struct in6_addr > *prefix, unsigned int plen) > > > > static void > > build_route_match(const struct ovn_port *op_inport, const char > *network_s, > > - int plen, bool is_src_route, bool is_ipv4, struct ds > *match, > > - uint16_t *priority) > > + int plen, bool is_src_route, bool is_ipv4, bool > automatic, > > + struct ds *match, uint16_t *priority) > > { > > + int prefix_len = plen; > > const char *dir; > > /* The priority here is calculated to implement longest-prefix-match > > - * routing. */ > > + * routing. Automatic routes have max priority */ > > + if (automatic) { > > + prefix_len = is_ipv4 ? 32 : 128; > > + prefix_len++; > > + } > > if (is_src_route) { > > dir = "src"; > > - *priority = plen * 2; > > + *priority = prefix_len * 2; > > } else { > > dir = "dst"; > > - *priority = (plen * 2) + 1; > > + *priority = (prefix_len * 2) + 1; > > } > > > > if (op_inport) { > > @@ -8172,7 +8177,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct > ovn_datapath *od, > > > > char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen); > > build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, > is_ipv4, > > - &route_match, &priority); > > + false, &route_match, &priority); > > free(prefix_s); > > > > struct ds actions = DS_EMPTY_INITIALIZER; > > @@ -8246,7 +8251,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct > ovn_datapath *od, > > static void > > add_route(struct hmap *lflows, const struct ovn_port *op, > > const char *lrp_addr_s, const char *network_s, int plen, > > - const char *gateway, bool is_src_route, > > + const char *gateway, bool is_src_route, bool automatic, > > const struct ovsdb_idl_row *stage_hint) > > { > > bool is_ipv4 = strchr(network_s, '.') ? true : false; > > @@ -8263,7 +8268,7 @@ add_route(struct hmap *lflows, const struct > ovn_port *op, > > } > > } > > build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4, > > - &match, &priority); > > + automatic, &match, &priority); > > > > struct ds common_actions = DS_EMPTY_INITIALIZER; > > ds_put_format(&common_actions, REG_ECMP_GROUP_ID" = 0; %s = ", > > @@ -8319,7 +8324,7 @@ build_static_route_flow(struct hmap *lflows, > struct ovn_datapath *od, > > > > char *prefix_s = build_route_prefix_s(&route_->prefix, > route_->plen); > > add_route(lflows, out_port, lrp_addr_s, prefix_s, route_->plen, > > - route->nexthop, route_->is_src_route, > > + route->nexthop, route_->is_src_route, false, > > &route->header_); > > > > free(prefix_s); > > @@ -9389,14 +9394,14 @@ build_ip_routing_flows_for_lrouter_port( > > add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s, > > op->lrp_networks.ipv4_addrs[i].network_s, > > op->lrp_networks.ipv4_addrs[i].plen, NULL, false, > > - &op->nbrp->header_); > > + true, &op->nbrp->header_); > > } > > > > for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { > > add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s, > > op->lrp_networks.ipv6_addrs[i].network_s, > > op->lrp_networks.ipv6_addrs[i].plen, NULL, false, > > - &op->nbrp->header_); > > + true, &op->nbrp->header_); > > } > > } > > } > > -- > > 2.29.2 > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
> On Tue, Feb 16, 2021 at 4:21 AM Han Zhou <hzhou@ovn.org> wrote: >> >> >> >> On Wed, Feb 10, 2021 at 2:17 PM Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote: >> > >> > Increase priority for automatic routes (routes created assigning IP >> > addresses to OVN logical router interfaces) in order to always prefer them >> > over static routes since the router has a direct link to the destination >> > address (possible use-case can be found here [0]). >> > >> > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 >> > >> >> Hi Lorenzo, Tim, Hi Han, thx for the detailed reply :) >> >> While the patch may solve the problem in the bug report, I think there is something more fundamental to be discussed. The problem is caused by ovn-k8s's use of "src-ip" in static routes which overrides the direct-connected route. I think the implementation of "src-ip" support in the static route is somehow flawed. The priorities of the flows generated by static routes are calculated according to the prefix length, so that longest prefix routes are prioritised when there are multiple route matches, which is correct when comparing matches among "dst-ip" routes or among "src-ip" routes, but is not correct between "dst-ip" and "src-ip" routes. Comparison of prefix length between these two types of static routes doesn't make sense, because they match by different fields (src-ip v.s. dst-ip). For example, when there are static routes: >> 1. 192.168.0.0/24 via 100.64.0.1 src-ip >> 2. 10.0.0.0/20 via 100.64.0.2 dst-ip >> >> In this example, a packet from 192.168.0.1 to 10.0.0.1 matches both routes, but it is unreasonable to say it should follow the 1st route just because it has longer prefix length. Instead, we should prioritize one type over the other. It seems in physical router implementation policy based routing always has higher priority than destination routing, so we should probably enforce it in a similar way in OVN, i.e. set "src-ip" flows with higher priority than all the "dst-ip" flows. In fact, the policy routing table already supported this behavior because it is examined before the static route table. >> >> Since the "src-ip" feature in the static route table is flawed, and can be replaced by the policy routing table, I'd suggest to deprecate it. For correctness, users (like ovn-k8s) should use the policy routing table instead for the src-ip based routing rules. Users have full control of how they want the packets to be routed. For the use case mentioned in the bug report, it should have two rules in the policy routing table: >> >> >> 100 ip.dst == 100.64.0.0/16 accept # for directly connected destination, skip the src-ip rules >> 90 ip.src == 10.244.0.0/24 nexthop 100.64.0.1 # src-ip rules >> >> Would this better satisfy the need of ovn-k8s? > > > I believe this is correct. src-ip matching should be done in the policy table so traditional dest based routing is handled in default routing table. Need to go double check though. I agree on this point. > >> >> If the above is agreed, the priority change of directly connected routes from this patch is irrelevant to the problem reported in the bug, because policy routing rules are examined before the static routing table anyway, so no matter how high the route priority is, it wouldn't matter. In addition, it seems to me no real use cases would benefit from this change, but I could be wrong and please correct me if so. >> > I disagree with this. Trying to override a directly connected route is fundamentally wrong, which is why real routers specifically stop a user from being able to do this. What if a user who had a router attached to 100.64.0.0/16, adds a /32 route for 100.64.0.1 via another interface/subnet? That would take precedence over the directly attached route in OVN iiuc and pretty much guarantee improper networking. Directly connected routes should always take precedence, and therefore the default route lflows that get installed should always have the highest possible priority. I do not have a strong opinion on it since I do not have any use cases for overwriting an automatic route with a static one. @Dumitru Ceara IIRC you mentioned a use case for it, right? Regards, Lorenzo > >> >> Thanks, >> Han >> >> > Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> >> > --- >> > northd/ovn-northd.c | 27 ++++++++++++++++----------- >> > 1 file changed, 16 insertions(+), 11 deletions(-) >> > >> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >> > index b2b5f6a1b..dc8706f2f 100644 >> > --- a/northd/ovn-northd.c >> > +++ b/northd/ovn-northd.c >> > @@ -7994,18 +7994,23 @@ build_route_prefix_s(const struct in6_addr *prefix, unsigned int plen) >> > >> > static void >> > build_route_match(const struct ovn_port *op_inport, const char *network_s, >> > - int plen, bool is_src_route, bool is_ipv4, struct ds *match, >> > - uint16_t *priority) >> > + int plen, bool is_src_route, bool is_ipv4, bool automatic, >> > + struct ds *match, uint16_t *priority) >> > { >> > + int prefix_len = plen; >> > const char *dir; >> > /* The priority here is calculated to implement longest-prefix-match >> > - * routing. */ >> > + * routing. Automatic routes have max priority */ >> > + if (automatic) { >> > + prefix_len = is_ipv4 ? 32 : 128; >> > + prefix_len++; >> > + } >> > if (is_src_route) { >> > dir = "src"; >> > - *priority = plen * 2; >> > + *priority = prefix_len * 2; >> > } else { >> > dir = "dst"; >> > - *priority = (plen * 2) + 1; >> > + *priority = (prefix_len * 2) + 1; >> > } >> > >> > if (op_inport) { >> > @@ -8172,7 +8177,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, >> > >> > char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen); >> > build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4, >> > - &route_match, &priority); >> > + false, &route_match, &priority); >> > free(prefix_s); >> > >> > struct ds actions = DS_EMPTY_INITIALIZER; >> > @@ -8246,7 +8251,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, >> > static void >> > add_route(struct hmap *lflows, const struct ovn_port *op, >> > const char *lrp_addr_s, const char *network_s, int plen, >> > - const char *gateway, bool is_src_route, >> > + const char *gateway, bool is_src_route, bool automatic, >> > const struct ovsdb_idl_row *stage_hint) >> > { >> > bool is_ipv4 = strchr(network_s, '.') ? true : false; >> > @@ -8263,7 +8268,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op, >> > } >> > } >> > build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4, >> > - &match, &priority); >> > + automatic, &match, &priority); >> > >> > struct ds common_actions = DS_EMPTY_INITIALIZER; >> > ds_put_format(&common_actions, REG_ECMP_GROUP_ID" = 0; %s = ", >> > @@ -8319,7 +8324,7 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od, >> > >> > char *prefix_s = build_route_prefix_s(&route_->prefix, route_->plen); >> > add_route(lflows, out_port, lrp_addr_s, prefix_s, route_->plen, >> > - route->nexthop, route_->is_src_route, >> > + route->nexthop, route_->is_src_route, false, >> > &route->header_); >> > >> > free(prefix_s); >> > @@ -9389,14 +9394,14 @@ build_ip_routing_flows_for_lrouter_port( >> > add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s, >> > op->lrp_networks.ipv4_addrs[i].network_s, >> > op->lrp_networks.ipv4_addrs[i].plen, NULL, false, >> > - &op->nbrp->header_); >> > + true, &op->nbrp->header_); >> > } >> > >> > for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { >> > add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s, >> > op->lrp_networks.ipv6_addrs[i].network_s, >> > op->lrp_networks.ipv6_addrs[i].plen, NULL, false, >> > - &op->nbrp->header_); >> > + true, &op->nbrp->header_); >> > } >> > } >> > } >> > -- >> > 2.29.2 >> > >> > _______________________________________________ >> > dev mailing list >> > dev@openvswitch.org >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Wed, Feb 17, 2021 at 2:36 PM Tim Rozet <trozet@redhat.com> wrote: > > > > > On Tue, Feb 16, 2021 at 4:21 AM Han Zhou <hzhou@ovn.org> wrote: >> >> >> >> On Wed, Feb 10, 2021 at 2:17 PM Lorenzo Bianconi < lorenzo.bianconi@redhat.com> wrote: >> > >> > Increase priority for automatic routes (routes created assigning IP >> > addresses to OVN logical router interfaces) in order to always prefer them >> > over static routes since the router has a direct link to the destination >> > address (possible use-case can be found here [0]). >> > >> > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 >> > >> >> Hi Lorenzo, Tim, >> >> While the patch may solve the problem in the bug report, I think there is something more fundamental to be discussed. The problem is caused by ovn-k8s's use of "src-ip" in static routes which overrides the direct-connected route. I think the implementation of "src-ip" support in the static route is somehow flawed. The priorities of the flows generated by static routes are calculated according to the prefix length, so that longest prefix routes are prioritised when there are multiple route matches, which is correct when comparing matches among "dst-ip" routes or among "src-ip" routes, but is not correct between "dst-ip" and "src-ip" routes. Comparison of prefix length between these two types of static routes doesn't make sense, because they match by different fields (src-ip v.s. dst-ip). For example, when there are static routes: >> 1. 192.168.0.0/24 via 100.64.0.1 src-ip >> 2. 10.0.0.0/20 via 100.64.0.2 dst-ip >> >> In this example, a packet from 192.168.0.1 to 10.0.0.1 matches both routes, but it is unreasonable to say it should follow the 1st route just because it has longer prefix length. Instead, we should prioritize one type over the other. It seems in physical router implementation policy based routing always has higher priority than destination routing, so we should probably enforce it in a similar way in OVN, i.e. set "src-ip" flows with higher priority than all the "dst-ip" flows. In fact, the policy routing table already supported this behavior because it is examined before the static route table. >> >> Since the "src-ip" feature in the static route table is flawed, and can be replaced by the policy routing table, I'd suggest to deprecate it. For correctness, users (like ovn-k8s) should use the policy routing table instead for the src-ip based routing rules. Users have full control of how they want the packets to be routed. For the use case mentioned in the bug report, it should have two rules in the policy routing table: >> >> >> 100 ip.dst == 100.64.0.0/16 accept # for directly connected destination, skip the src-ip rules >> 90 ip.src == 10.244.0.0/24 nexthop 100.64.0.1 # src-ip rules >> >> Would this better satisfy the need of ovn-k8s? > > > I believe this is correct. src-ip matching should be done in the policy table so traditional dest based routing is handled in default routing table. Need to go double check though. > >> >> If the above is agreed, the priority change of directly connected routes from this patch is irrelevant to the problem reported in the bug, because policy routing rules are examined before the static routing table anyway, so no matter how high the route priority is, it wouldn't matter. In addition, it seems to me no real use cases would benefit from this change, but I could be wrong and please correct me if so. >> > I disagree with this. Trying to override a directly connected route is fundamentally wrong, which is why real routers specifically stop a user from being able to do this. What if a user who had a router attached to 100.64.0.0/16, adds a /32 route for 100.64.0.1 via another interface/subnet? That would take precedence over the directly attached route in OVN iiuc and pretty much guarantee improper networking. Directly connected routes should always take precedence, and therefore the default route lflows that get installed should always have the highest possible priority. > Hi Tim, Thanks for your inputs. Here are my thoughts: In the scenario we discussed, it is in fact the same output interfaces but just different nexthops in the 10.64.0.0/16 subnet. In this case, adding a more specific route with a specific nexthop on the directly connected subnet doesn't seem to be violating any routing principle, right? Use the topology in the bug report as an example ((In the diagram of the bug report I think there is a typo: 100.64.0.1 should be the DR's output port IP, and 100.64.0.2 & 100.64.0.3 belong to GR1 and GR2). 10.64.0.0/16 is directly connected, but the adjacent L2 network doesn't have to have all the /16 IPs directly connected. Some of the nodes can reside behind a L3 hop. For example, if we know that 10.64.1.0/24 is behind a router with IP 10.64.255.1, we can still add a route: 10.64.1.0/24 via 10.64.255.1, which should work with the current OVN implementation, while this patch would in fact break it. On the other hand, what I wanted to emphasize is not that we want to support the above use case I mentioned - it doesn't look like a good design for me either although it seems valid. My point is that maybe we shouldn't have the special adjustment for the priority just to restrict such use cases, which doesn't bring us any extra benefit, while making the code less generic. OVN providing the possibility of overriding the routes with longer prefix length doesn't mean the user has to use it, and if someone relies on this capability for their special needs then I assume they should have a clear understanding of what they are doing - it is totally up to the user. If we do believe that any configuration that tries to add more specific routes than the directly connected routes is invalid and want to protect against them (I am not convinced yet), then probably we should consider even more scenarios, e.g. what if there are two interfaces configured with overlapping directly connected subnets, e.g. 100.64.0.0/16 on LRP1 and 100.64.123.0/24 on LRP2 (both LRP1 and LRP2 are on the same LR)? Currently if a packet with dest IP 100.64.123.* arrives we would expect it to be routed to LRP2, but with this patch they will have the same priority and the behavior will become unpredictable. So for the above reasons I'm a little concerned about making the specific change for the priorities of directly connected routes because it seems would make things more complex without obvious benefits. Thanks, Han
Hi guys, On 2/18/21 12:16 AM, Han Zhou wrote: > On Wed, Feb 17, 2021 at 2:36 PM Tim Rozet <trozet@redhat.com> wrote: >> >> >> >> >> On Tue, Feb 16, 2021 at 4:21 AM Han Zhou <hzhou@ovn.org> wrote: >>> >>> >>> >>> On Wed, Feb 10, 2021 at 2:17 PM Lorenzo Bianconi < > lorenzo.bianconi@redhat.com> wrote: >>>> >>>> Increase priority for automatic routes (routes created assigning IP >>>> addresses to OVN logical router interfaces) in order to always prefer > them >>>> over static routes since the router has a direct link to the > destination >>>> address (possible use-case can be found here [0]). >>>> >>>> [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 >>>> >>> >>> Hi Lorenzo, Tim, >>> >>> While the patch may solve the problem in the bug report, I think there > is something more fundamental to be discussed. The problem is caused by > ovn-k8s's use of "src-ip" in static routes which overrides the > direct-connected route. I think the implementation of "src-ip" support in > the static route is somehow flawed. The priorities of the flows generated > by static routes are calculated according to the prefix length, so that > longest prefix routes are prioritised when there are multiple route > matches, which is correct when comparing matches among "dst-ip" routes or > among "src-ip" routes, but is not correct between "dst-ip" and "src-ip" > routes. Comparison of prefix length between these two types of static > routes doesn't make sense, because they match by different fields (src-ip > v.s. dst-ip). For example, when there are static routes: >>> 1. 192.168.0.0/24 via 100.64.0.1 src-ip >>> 2. 10.0.0.0/20 via 100.64.0.2 dst-ip >>> >>> In this example, a packet from 192.168.0.1 to 10.0.0.1 matches both > routes, but it is unreasonable to say it should follow the 1st route just > because it has longer prefix length. Instead, we should prioritize one type > over the other. It seems in physical router implementation policy based > routing always has higher priority than destination routing, so we should > probably enforce it in a similar way in OVN, i.e. set "src-ip" flows with > higher priority than all the "dst-ip" flows. In fact, the policy routing > table already supported this behavior because it is examined before the > static route table. >>> >>> Since the "src-ip" feature in the static route table is flawed, and can > be replaced by the policy routing table, I'd suggest to deprecate it. For I agree. > correctness, users (like ovn-k8s) should use the policy routing table > instead for the src-ip based routing rules. Users have full control of how > they want the packets to be routed. For the use case mentioned in the bug > report, it should have two rules in the policy routing table: >>> >>> >>> 100 ip.dst == 100.64.0.0/16 accept # for directly connected destination, > skip the src-ip rules >>> 90 ip.src == 10.244.0.0/24 nexthop 100.64.0.1 # src-ip rules >>> >>> Would this better satisfy the need of ovn-k8s? >> >> >> I believe this is correct. src-ip matching should be done in the policy > table so traditional dest based routing is handled in default routing > table. Need to go double check though. One example, but should be the same for all "traditional" routers, "The Policy-Based Routing feature is a process whereby a device puts packets through a route map before routing the packets.": https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/iproute_pi/configuration/15-mt/iri-15-mt-book/iri-pbr.html >> >>> >>> If the above is agreed, the priority change of directly connected routes > from this patch is irrelevant to the problem reported in the bug, because > policy routing rules are examined before the static routing table anyway, > so no matter how high the route priority is, it wouldn't matter. In > addition, it seems to me no real use cases would benefit from this change, > but I could be wrong and please correct me if so. >>> >> I disagree with this. Trying to override a directly connected route is > fundamentally wrong, which is why real routers specifically stop a user > from being able to do this. What if a user who had a router attached to > 100.64.0.0/16, adds a /32 route for 100.64.0.1 via another > interface/subnet? That would take precedence over the directly attached > route in OVN iiuc and pretty much guarantee improper networking. Directly > connected routes should always take precedence, and therefore the default > route lflows that get installed should always have the highest possible > priority. I think there's some confusion here. In "traditional" routers, longest prefix match has priority in the route selection process. If two routes (from different protocols, including "directly connected") are available for the *same* prefix *and* length then, and only then, the one with the lowest administrative distance is selected. Most of the times "directly connected" routes have a lower admin distance by default compared to static routes so they are preferred *if* such a selection has to happen. The admin distance is also usually configurable (although not usually recommended) and allows users to override the default preference of routes. Back to the example above, in a "traditional" router, when the two routes are configured, with default admin distances: 100.64.0.2/16 directly connected, via dev1 100.64.0.1/32 static, via 42.42.42.42 All traffic with destination IP in 100.64.0.3-100.64.255.254 is forwarded via dev1 (the directly connected route is selected). Traffic with destination IP 100.64.0.2 is processed locally. Traffic with destination IP 100.64.0.1 is forwarded via the static route towards next-hop 42.42.42.42. >> > > Hi Tim, > > Thanks for your inputs. Here are my thoughts: > > In the scenario we discussed, it is in fact the same output interfaces but > just different nexthops in the 10.64.0.0/16 subnet. In this case, adding a > more specific route with a specific nexthop on the directly connected > subnet doesn't seem to be violating any routing principle, right? Use the > topology in the bug report as an example ((In the diagram of the bug report > I think there is a typo: 100.64.0.1 should be the DR's output port IP, and > 100.64.0.2 & 100.64.0.3 belong to GR1 and GR2). 10.64.0.0/16 is directly > connected, but the adjacent L2 network doesn't have to have all the /16 IPs > directly connected. Some of the nodes can reside behind a L3 hop. For > example, if we know that 10.64.1.0/24 is behind a router with IP > 10.64.255.1, we can still add a route: 10.64.1.0/24 via 10.64.255.1, which > should work with the current OVN implementation, while this patch would in > fact break it. > > On the other hand, what I wanted to emphasize is not that we want to > support the above use case I mentioned - it doesn't look like a good design > for me either although it seems valid. My point is that maybe we shouldn't > have the special adjustment for the priority just to restrict such use > cases, which doesn't bring us any extra benefit, while making the code less > generic. OVN providing the possibility of overriding the routes with longer > prefix length doesn't mean the user has to use it, and if someone relies on > this capability for their special needs then I assume they should have a > clear understanding of what they are doing - it is totally up to the user. > > If we do believe that any configuration that tries to add more specific > routes than the directly connected routes is invalid and want to protect > against them (I am not convinced yet), then probably we should consider As mentioned above, I don't think more specific static routes are an invalid config. As a matter of fact, there are examples of using them on "traditional" routers to have part of the traffic go through additional services, e.g.: https://community.cisco.com/t5/other-network-architecture/how-do-you-promote-a-static-route-over-a-directly-connected/m-p/644482/highlight/true#M179363 > even more scenarios, e.g. what if there are two interfaces configured with > overlapping directly connected subnets, e.g. 100.64.0.0/16 on LRP1 and > 100.64.123.0/24 on LRP2 (both LRP1 and LRP2 are on the same LR)? Currently > if a packet with dest IP 100.64.123.* arrives we would expect it to be > routed to LRP2, but with this patch they will have the same priority and > the behavior will become unpredictable. > > So for the above reasons I'm a little concerned about making the specific > change for the priorities of directly connected routes because it seems > would make things more complex without obvious benefits. I agree, I don't think it's OK to bump priority of directly connected routes. > > Thanks, > Han Regards, Dumitru
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index b2b5f6a1b..dc8706f2f 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -7994,18 +7994,23 @@ build_route_prefix_s(const struct in6_addr *prefix, unsigned int plen) static void build_route_match(const struct ovn_port *op_inport, const char *network_s, - int plen, bool is_src_route, bool is_ipv4, struct ds *match, - uint16_t *priority) + int plen, bool is_src_route, bool is_ipv4, bool automatic, + struct ds *match, uint16_t *priority) { + int prefix_len = plen; const char *dir; /* The priority here is calculated to implement longest-prefix-match - * routing. */ + * routing. Automatic routes have max priority */ + if (automatic) { + prefix_len = is_ipv4 ? 32 : 128; + prefix_len++; + } if (is_src_route) { dir = "src"; - *priority = plen * 2; + *priority = prefix_len * 2; } else { dir = "dst"; - *priority = (plen * 2) + 1; + *priority = (prefix_len * 2) + 1; } if (op_inport) { @@ -8172,7 +8177,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen); build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4, - &route_match, &priority); + false, &route_match, &priority); free(prefix_s); struct ds actions = DS_EMPTY_INITIALIZER; @@ -8246,7 +8251,7 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od, static void add_route(struct hmap *lflows, const struct ovn_port *op, const char *lrp_addr_s, const char *network_s, int plen, - const char *gateway, bool is_src_route, + const char *gateway, bool is_src_route, bool automatic, const struct ovsdb_idl_row *stage_hint) { bool is_ipv4 = strchr(network_s, '.') ? true : false; @@ -8263,7 +8268,7 @@ add_route(struct hmap *lflows, const struct ovn_port *op, } } build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4, - &match, &priority); + automatic, &match, &priority); struct ds common_actions = DS_EMPTY_INITIALIZER; ds_put_format(&common_actions, REG_ECMP_GROUP_ID" = 0; %s = ", @@ -8319,7 +8324,7 @@ build_static_route_flow(struct hmap *lflows, struct ovn_datapath *od, char *prefix_s = build_route_prefix_s(&route_->prefix, route_->plen); add_route(lflows, out_port, lrp_addr_s, prefix_s, route_->plen, - route->nexthop, route_->is_src_route, + route->nexthop, route_->is_src_route, false, &route->header_); free(prefix_s); @@ -9389,14 +9394,14 @@ build_ip_routing_flows_for_lrouter_port( add_route(lflows, op, op->lrp_networks.ipv4_addrs[i].addr_s, op->lrp_networks.ipv4_addrs[i].network_s, op->lrp_networks.ipv4_addrs[i].plen, NULL, false, - &op->nbrp->header_); + true, &op->nbrp->header_); } for (int i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { add_route(lflows, op, op->lrp_networks.ipv6_addrs[i].addr_s, op->lrp_networks.ipv6_addrs[i].network_s, op->lrp_networks.ipv6_addrs[i].plen, NULL, false, - &op->nbrp->header_); + true, &op->nbrp->header_); } } }
Increase priority for automatic routes (routes created assigning IP addresses to OVN logical router interfaces) in order to always prefer them over static routes since the router has a direct link to the destination address (possible use-case can be found here [0]). [0] https://bugzilla.redhat.com/show_bug.cgi?id=1891516 Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> --- northd/ovn-northd.c | 27 ++++++++++++++++----------- 1 file changed, 16 insertions(+), 11 deletions(-)