diff mbox series

[ovs-dev,ovn] ovn-northd: Increase ECMP route priority with 400.

Message ID 1589406663-9041-1-git-send-email-hzhou@ovn.org
State Not Applicable
Headers show
Series [ovs-dev,ovn] ovn-northd: Increase ECMP route priority with 400. | expand

Commit Message

Han Zhou May 13, 2020, 9:51 p.m. UTC
In commit c0bf32d the priorities of the regular routes were increased by
400, but ECMP routes were not updated. This patch fixes it. Since
for ECMP routes the outport is unknown (there can be many different
outports) the condition check of whether the outport is distributed
gateway port is skipped.

Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Han Zhou <hzhou@ovn.org>
---
 northd/ovn-northd.8.xml |  3 ++-
 northd/ovn-northd.c     | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 11 deletions(-)

Comments

Dumitru Ceara May 14, 2020, 8:12 a.m. UTC | #1
On 5/13/20 11:51 PM, Han Zhou wrote:
> In commit c0bf32d the priorities of the regular routes were increased by
> 400, but ECMP routes were not updated. This patch fixes it. Since
> for ECMP routes the outport is unknown (there can be many different
> outports) the condition check of whether the outport is distributed
> gateway port is skipped.
> 
> Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> Signed-off-by: Han Zhou <hzhou@ovn.org>

Hi Han,

Should we also add a test for this scenario to avoid further regressions?

Thanks,
Dumitru

> ---
>  northd/ovn-northd.8.xml |  3 ++-
>  northd/ovn-northd.c     | 22 ++++++++++++----------
>  2 files changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 8f224b0..b0475d0 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -2601,7 +2601,8 @@ next;
>            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow with match
>            in CIDR notation <code>ip4.dst == <var>N</var>/<var>M</var></code>,
>            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose priority
> -          is the integer value of <var>M</var>, has the following actions:
> +          is <code>400</code> + the integer value of <var>M</var>, has the
> +          following actions:
>          </p>
>  
>          <pre>
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index b25152d..c02e305 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix, unsigned int plen)
>  }
>  
>  static void
> -build_route_match(const struct ovn_port *op_inport, const char *network_s,
> +build_route_match(const struct ovn_port *op_inport,
> +                  const struct ovn_port *op_outport, const char *network_s,
>                    int plen, bool is_src_route, bool is_ipv4, struct ds *match,
>                    uint16_t *priority)
>  {
> @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port *op_inport, const char *network_s,
>          dir = "dst";
>          *priority = (plen * 2) + 1;
>      }
> +    /* traffic for internal IPs of logical switch ports must be sent to
> +     * the gw controller through the overlay tunnels
> +     */
> +    if (!op_outport ||
> +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
> +        priority += DROUTE_PRIO;
> +    }
>  
>      if (op_inport) {
>          ds_put_format(match, "inport == %s && ", op_inport->json_key);
> @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od,
>      uint16_t priority;
>  
>      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4,
> -                      &match, &priority);
> +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
> +                      is_ipv4, &match, &priority);
>      free(prefix_s);
>  
>      struct ds actions = DS_EMPTY_INITIALIZER;
> @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
>              op_inport = op;
>          }
>      }
> -    build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4,
> +    build_route_match(op_inport, op, network_s, plen, is_src_route, is_ipv4,
>                        &match, &priority);
> -    /* traffic for internal IPs of logical switch ports must be sent to
> -     * the gw controller through the overlay tunnels
> -     */
> -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> -        priority += DROUTE_PRIO;
> -    }
>  
>      struct ds actions = DS_EMPTY_INITIALIZER;
>      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0 = ",
>
Han Zhou May 14, 2020, 4:49 p.m. UTC | #2
On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote:
>
> On 5/13/20 11:51 PM, Han Zhou wrote:
> > In commit c0bf32d the priorities of the regular routes were increased by
> > 400, but ECMP routes were not updated. This patch fixes it. Since
> > for ECMP routes the outport is unknown (there can be many different
> > outports) the condition check of whether the outport is distributed
> > gateway port is skipped.
> >
> > Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> > Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > Signed-off-by: Han Zhou <hzhou@ovn.org>
>
> Hi Han,
>
> Should we also add a test for this scenario to avoid further regressions?
>
> Thanks,
> Dumitru

Thanks Dumitru for the review. I thought it twice and it seems this fix
doesn't solve the whole problem.
In commit c0bf32d ("Manage ARP process locally in a DVR scenario"), it
introduced different sets of route priorities. If the route's output port
is distributed gateway port, it is 400 lower than the routes with same
prefix length but next hop is a regular router port. The longest prefix
match would not work properly for routing. This violates the basic
principle for the default routing behavior. For example:

route1: 10.0.0.0/24 nexthop 192.168.0.2 via lrp1 (distributed gateway port)
route2: 10.0.0.0/16 nexthop 192.168.100.2 via lrp2 (regular router port)

The user would expected route1 to override route2, because it is has longer
prefix match. However, the commit c0bf32d results in 10.0.0.0/16 taking
effect, because the priority now is (400 + 33) v.s. 49.

  table=9 (lr_in_ip_routing   ), priority=49   , match=(ip4.dst ==
10.0.0.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 192.168.0.2; reg1 =
192.168.0.1; eth.src = aa:aa:aa:aa:aa:01; outport = "lrp1"; flags.loopback
= 1; next;)
  table=9 (lr_in_ip_routing   ), priority=433  , match=(ip4.dst ==
10.0.0.0/16), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 192.168.100.2; reg1
= 192.168.100.1; eth.src = aa:aa:aa:aa:aa:02; outport = "lrp2";
flags.loopback = 1; next;)

So I wonder if we should change the solution of the commit c0bf32d, instead
of just increasing the ECMP routes priority only. I don't completely
understand the problem that commit was trying to solve. Is there an example
that describes the scenario in more detail and the reason why the route
priorities have to be different?

>
> > ---
> >  northd/ovn-northd.8.xml |  3 ++-
> >  northd/ovn-northd.c     | 22 ++++++++++++----------
> >  2 files changed, 14 insertions(+), 11 deletions(-)
> >
> > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > index 8f224b0..b0475d0 100644
> > --- a/northd/ovn-northd.8.xml
> > +++ b/northd/ovn-northd.8.xml
> > @@ -2601,7 +2601,8 @@ next;
> >            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow
with match
> >            in CIDR notation <code>ip4.dst ==
<var>N</var>/<var>M</var></code>,
> >            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose
priority
> > -          is the integer value of <var>M</var>, has the following
actions:
> > +          is <code>400</code> + the integer value of <var>M</var>, has
the
> > +          following actions:
> >          </p>
> >
> >          <pre>
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index b25152d..c02e305 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix,
unsigned int plen)
> >  }
> >
> >  static void
> > -build_route_match(const struct ovn_port *op_inport, const char
*network_s,
> > +build_route_match(const struct ovn_port *op_inport,
> > +                  const struct ovn_port *op_outport, const char
*network_s,
> >                    int plen, bool is_src_route, bool is_ipv4, struct ds
*match,
> >                    uint16_t *priority)
> >  {
> > @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port
*op_inport, const char *network_s,
> >          dir = "dst";
> >          *priority = (plen * 2) + 1;
> >      }
> > +    /* traffic for internal IPs of logical switch ports must be sent to
> > +     * the gw controller through the overlay tunnels
> > +     */
> > +    if (!op_outport ||
> > +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
> > +        priority += DROUTE_PRIO;
> > +    }
> >
> >      if (op_inport) {
> >          ds_put_format(match, "inport == %s && ", op_inport->json_key);
> > @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct
ovn_datapath *od,
> >      uint16_t priority;
> >
> >      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> > -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route,
is_ipv4,
> > -                      &match, &priority);
> > +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
> > +                      is_ipv4, &match, &priority);
> >      free(prefix_s);
> >
> >      struct ds actions = DS_EMPTY_INITIALIZER;
> > @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct
ovn_port *op,
> >              op_inport = op;
> >          }
> >      }
> > -    build_route_match(op_inport, network_s, plen, is_src_route,
is_ipv4,
> > +    build_route_match(op_inport, op, network_s, plen, is_src_route,
is_ipv4,
> >                        &match, &priority);
> > -    /* traffic for internal IPs of logical switch ports must be sent to
> > -     * the gw controller through the overlay tunnels
> > -     */
> > -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> > -        priority += DROUTE_PRIO;
> > -    }
> >
> >      struct ds actions = DS_EMPTY_INITIALIZER;
> >      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0
= ",
> >
>
Lorenzo Bianconi May 14, 2020, 5:04 p.m. UTC | #3
>
> On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >
> > On 5/13/20 11:51 PM, Han Zhou wrote:
> > > In commit c0bf32d the priorities of the regular routes were increased by
> > > 400, but ECMP routes were not updated. This patch fixes it. Since
> > > for ECMP routes the outport is unknown (there can be many different
> > > outports) the condition check of whether the outport is distributed
> > > gateway port is skipped.
> > >
> > > Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> > > Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> > Hi Han,
> >
> > Should we also add a test for this scenario to avoid further regressions?
> >
> > Thanks,
> > Dumitru
>
> Thanks Dumitru for the review. I thought it twice and it seems this fix doesn't solve the whole problem.
> In commit c0bf32d ("Manage ARP process locally in a DVR scenario"), it introduced different sets of route priorities. If the route's output port is distributed gateway port, it is 400 lower than the routes with same prefix length but next hop is a regular router port. The longest prefix match would not work properly for routing. This violates the basic principle for the default routing behavior. For example:
>
> route1: 10.0.0.0/24 nexthop 192.168.0.2 via lrp1 (distributed gateway port)
> route2: 10.0.0.0/16 nexthop 192.168.100.2 via lrp2 (regular router port)
>
> The user would expected route1 to override route2, because it is has longer prefix match. However, the commit c0bf32d results in 10.0.0.0/16 taking effect, because the priority now is (400 + 33) v.s. 49.
>
>   table=9 (lr_in_ip_routing   ), priority=49   , match=(ip4.dst == 10.0.0.0/24), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 192.168.0.2; reg1 = 192.168.0.1; eth.src = aa:aa:aa:aa:aa:01; outport = "lrp1"; flags.loopback = 1; next;)
>   table=9 (lr_in_ip_routing   ), priority=433  , match=(ip4.dst == 10.0.0.0/16), action=(ip.ttl--; reg8[0..15] = 0; reg0 = 192.168.100.2; reg1 = 192.168.100.1; eth.src = aa:aa:aa:aa:aa:02; outport = "lrp2"; flags.loopback = 1; next;)
>
> So I wonder if we should change the solution of the commit c0bf32d, instead of just increasing the ECMP routes priority only. I don't completely understand the problem that commit was trying to solve. Is there an example that describes the scenario in more detail and the reason why the route priorities have to be different?

basically the commit c0bf32d ("Manage ARP process locally in a DVR
scenario") wants to manage DVR traffic locally (e.g not send ARP
traffic through the tunnel). In particular this change is used to
distribute non-DVR traffic using geneve tunnels IIRC. I will look into
it.

Regards,
Lorenzo

>
> >
> > > ---
> > >  northd/ovn-northd.8.xml |  3 ++-
> > >  northd/ovn-northd.c     | 22 ++++++++++++----------
> > >  2 files changed, 14 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > index 8f224b0..b0475d0 100644
> > > --- a/northd/ovn-northd.8.xml
> > > +++ b/northd/ovn-northd.8.xml
> > > @@ -2601,7 +2601,8 @@ next;
> > >            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow with match
> > >            in CIDR notation <code>ip4.dst == <var>N</var>/<var>M</var></code>,
> > >            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose priority
> > > -          is the integer value of <var>M</var>, has the following actions:
> > > +          is <code>400</code> + the integer value of <var>M</var>, has the
> > > +          following actions:
> > >          </p>
> > >
> > >          <pre>
> > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > index b25152d..c02e305 100644
> > > --- a/northd/ovn-northd.c
> > > +++ b/northd/ovn-northd.c
> > > @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix, unsigned int plen)
> > >  }
> > >
> > >  static void
> > > -build_route_match(const struct ovn_port *op_inport, const char *network_s,
> > > +build_route_match(const struct ovn_port *op_inport,
> > > +                  const struct ovn_port *op_outport, const char *network_s,
> > >                    int plen, bool is_src_route, bool is_ipv4, struct ds *match,
> > >                    uint16_t *priority)
> > >  {
> > > @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port *op_inport, const char *network_s,
> > >          dir = "dst";
> > >          *priority = (plen * 2) + 1;
> > >      }
> > > +    /* traffic for internal IPs of logical switch ports must be sent to
> > > +     * the gw controller through the overlay tunnels
> > > +     */
> > > +    if (!op_outport ||
> > > +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
> > > +        priority += DROUTE_PRIO;
> > > +    }
> > >
> > >      if (op_inport) {
> > >          ds_put_format(match, "inport == %s && ", op_inport->json_key);
> > > @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od,
> > >      uint16_t priority;
> > >
> > >      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> > > -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4,
> > > -                      &match, &priority);
> > > +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
> > > +                      is_ipv4, &match, &priority);
> > >      free(prefix_s);
> > >
> > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct ovn_port *op,
> > >              op_inport = op;
> > >          }
> > >      }
> > > -    build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4,
> > > +    build_route_match(op_inport, op, network_s, plen, is_src_route, is_ipv4,
> > >                        &match, &priority);
> > > -    /* traffic for internal IPs of logical switch ports must be sent to
> > > -     * the gw controller through the overlay tunnels
> > > -     */
> > > -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> > > -        priority += DROUTE_PRIO;
> > > -    }
> > >
> > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > >      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0 = ",
> > >
> >
Lorenzo Bianconi May 15, 2020, 3:55 p.m. UTC | #4
> On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote:
> >
> > On 5/13/20 11:51 PM, Han Zhou wrote:
> > > In commit c0bf32d the priorities of the regular routes were increased by
> > > 400, but ECMP routes were not updated. This patch fixes it. Since
> > > for ECMP routes the outport is unknown (there can be many different
> > > outports) the condition check of whether the outport is distributed
> > > gateway port is skipped.
> > >
> > > Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> > > Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > > Signed-off-by: Han Zhou <hzhou@ovn.org>

[...]

> So I wonder if we should change the solution of the commit c0bf32d, instead
> of just increasing the ECMP routes priority only. I don't completely
> understand the problem that commit was trying to solve. Is there an example
> that describes the scenario in more detail and the reason why the route
> priorities have to be different?

Hi Han,

this morning Dumitru and me worked trying to find a solution for this issue.
We though to restore the original configuration in table 9
and add a new table used only to take care of FIP/DVR traffic.
Does it work for you?
I sent a RFC with the basic implementation. I tested it and it works fine
regarding to FIP traffic. I will work on unitest waiting for feedbacks.

Regards,
Lorenzo

> 
> >
> > > ---
> > >  northd/ovn-northd.8.xml |  3 ++-
> > >  northd/ovn-northd.c     | 22 ++++++++++++----------
> > >  2 files changed, 14 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > index 8f224b0..b0475d0 100644
> > > --- a/northd/ovn-northd.8.xml
> > > +++ b/northd/ovn-northd.8.xml
> > > @@ -2601,7 +2601,8 @@ next;
> > >            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow
> with match
> > >            in CIDR notation <code>ip4.dst ==
> <var>N</var>/<var>M</var></code>,
> > >            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose
> priority
> > > -          is the integer value of <var>M</var>, has the following
> actions:
> > > +          is <code>400</code> + the integer value of <var>M</var>, has
> the
> > > +          following actions:
> > >          </p>
> > >
> > >          <pre>
> > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > index b25152d..c02e305 100644
> > > --- a/northd/ovn-northd.c
> > > +++ b/northd/ovn-northd.c
> > > @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix,
> unsigned int plen)
> > >  }
> > >
> > >  static void
> > > -build_route_match(const struct ovn_port *op_inport, const char
> *network_s,
> > > +build_route_match(const struct ovn_port *op_inport,
> > > +                  const struct ovn_port *op_outport, const char
> *network_s,
> > >                    int plen, bool is_src_route, bool is_ipv4, struct ds
> *match,
> > >                    uint16_t *priority)
> > >  {
> > > @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port
> *op_inport, const char *network_s,
> > >          dir = "dst";
> > >          *priority = (plen * 2) + 1;
> > >      }
> > > +    /* traffic for internal IPs of logical switch ports must be sent to
> > > +     * the gw controller through the overlay tunnels
> > > +     */
> > > +    if (!op_outport ||
> > > +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
> > > +        priority += DROUTE_PRIO;
> > > +    }
> > >
> > >      if (op_inport) {
> > >          ds_put_format(match, "inport == %s && ", op_inport->json_key);
> > > @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct
> ovn_datapath *od,
> > >      uint16_t priority;
> > >
> > >      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> > > -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route,
> is_ipv4,
> > > -                      &match, &priority);
> > > +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
> > > +                      is_ipv4, &match, &priority);
> > >      free(prefix_s);
> > >
> > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct
> ovn_port *op,
> > >              op_inport = op;
> > >          }
> > >      }
> > > -    build_route_match(op_inport, network_s, plen, is_src_route,
> is_ipv4,
> > > +    build_route_match(op_inport, op, network_s, plen, is_src_route,
> is_ipv4,
> > >                        &match, &priority);
> > > -    /* traffic for internal IPs of logical switch ports must be sent to
> > > -     * the gw controller through the overlay tunnels
> > > -     */
> > > -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> > > -        priority += DROUTE_PRIO;
> > > -    }
> > >
> > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > >      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0
> = ",
> > >
> >
Lorenzo Bianconi May 15, 2020, 4 p.m. UTC | #5
> > On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote:
> > >
> > > On 5/13/20 11:51 PM, Han Zhou wrote:
> > > > In commit c0bf32d the priorities of the regular routes were increased by
> > > > 400, but ECMP routes were not updated. This patch fixes it. Since
> > > > for ECMP routes the outport is unknown (there can be many different
> > > > outports) the condition check of whether the outport is distributed
> > > > gateway port is skipped.
> > > >
> > > > Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> > > > Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> 
> [...]
> 
> > So I wonder if we should change the solution of the commit c0bf32d, instead
> > of just increasing the ECMP routes priority only. I don't completely
> > understand the problem that commit was trying to solve. Is there an example
> > that describes the scenario in more detail and the reason why the route
> > priorities have to be different?
> 
> Hi Han,
> 
> this morning Dumitru and me worked trying to find a solution for this issue.
> We though to restore the original configuration in table 9
> and add a new table used only to take care of FIP/DVR traffic.
> Does it work for you?
> I sent a RFC with the basic implementation. I tested it and it works fine
> regarding to FIP traffic. I will work on unitest waiting for feedbacks.
> 
> Regards,
> Lorenzo

ops I forgot, this is the patch:
https://patchwork.ozlabs.org/project/openvswitch/patch/a99d46b275a10a6d8278434f7cefa0466bb78af5.1589557561.git.lorenzo.bianconi@redhat.com/

Regards,
Lorenzo

> 
> > 
> > >
> > > > ---
> > > >  northd/ovn-northd.8.xml |  3 ++-
> > > >  northd/ovn-northd.c     | 22 ++++++++++++----------
> > > >  2 files changed, 14 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > index 8f224b0..b0475d0 100644
> > > > --- a/northd/ovn-northd.8.xml
> > > > +++ b/northd/ovn-northd.8.xml
> > > > @@ -2601,7 +2601,8 @@ next;
> > > >            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow
> > with match
> > > >            in CIDR notation <code>ip4.dst ==
> > <var>N</var>/<var>M</var></code>,
> > > >            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose
> > priority
> > > > -          is the integer value of <var>M</var>, has the following
> > actions:
> > > > +          is <code>400</code> + the integer value of <var>M</var>, has
> > the
> > > > +          following actions:
> > > >          </p>
> > > >
> > > >          <pre>
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index b25152d..c02e305 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix,
> > unsigned int plen)
> > > >  }
> > > >
> > > >  static void
> > > > -build_route_match(const struct ovn_port *op_inport, const char
> > *network_s,
> > > > +build_route_match(const struct ovn_port *op_inport,
> > > > +                  const struct ovn_port *op_outport, const char
> > *network_s,
> > > >                    int plen, bool is_src_route, bool is_ipv4, struct ds
> > *match,
> > > >                    uint16_t *priority)
> > > >  {
> > > > @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port
> > *op_inport, const char *network_s,
> > > >          dir = "dst";
> > > >          *priority = (plen * 2) + 1;
> > > >      }
> > > > +    /* traffic for internal IPs of logical switch ports must be sent to
> > > > +     * the gw controller through the overlay tunnels
> > > > +     */
> > > > +    if (!op_outport ||
> > > > +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
> > > > +        priority += DROUTE_PRIO;
> > > > +    }
> > > >
> > > >      if (op_inport) {
> > > >          ds_put_format(match, "inport == %s && ", op_inport->json_key);
> > > > @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct
> > ovn_datapath *od,
> > > >      uint16_t priority;
> > > >
> > > >      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> > > > -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route,
> > is_ipv4,
> > > > -                      &match, &priority);
> > > > +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
> > > > +                      is_ipv4, &match, &priority);
> > > >      free(prefix_s);
> > > >
> > > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > > @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct
> > ovn_port *op,
> > > >              op_inport = op;
> > > >          }
> > > >      }
> > > > -    build_route_match(op_inport, network_s, plen, is_src_route,
> > is_ipv4,
> > > > +    build_route_match(op_inport, op, network_s, plen, is_src_route,
> > is_ipv4,
> > > >                        &match, &priority);
> > > > -    /* traffic for internal IPs of logical switch ports must be sent to
> > > > -     * the gw controller through the overlay tunnels
> > > > -     */
> > > > -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> > > > -        priority += DROUTE_PRIO;
> > > > -    }
> > > >
> > > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > >      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0
> > = ",
> > > >
> > >
Han Zhou May 15, 2020, 6:48 p.m. UTC | #6
On Fri, May 15, 2020 at 9:01 AM Lorenzo Bianconi <
lorenzo.bianconi@redhat.com> wrote:
>
> > > On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com>
wrote:
> > > >
> > > > On 5/13/20 11:51 PM, Han Zhou wrote:
> > > > > In commit c0bf32d the priorities of the regular routes were
increased by
> > > > > 400, but ECMP routes were not updated. This patch fixes it. Since
> > > > > for ECMP routes the outport is unknown (there can be many
different
> > > > > outports) the condition check of whether the outport is
distributed
> > > > > gateway port is skipped.
> > > > >
> > > > > Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
> > > > > Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
> > > > > Signed-off-by: Han Zhou <hzhou@ovn.org>
> >
> > [...]
> >
> > > So I wonder if we should change the solution of the commit c0bf32d,
instead
> > > of just increasing the ECMP routes priority only. I don't completely
> > > understand the problem that commit was trying to solve. Is there an
example
> > > that describes the scenario in more detail and the reason why the
route
> > > priorities have to be different?
> >
> > Hi Han,
> >
> > this morning Dumitru and me worked trying to find a solution for this
issue.
> > We though to restore the original configuration in table 9
> > and add a new table used only to take care of FIP/DVR traffic.
> > Does it work for you?
> > I sent a RFC with the basic implementation. I tested it and it works
fine
> > regarding to FIP traffic. I will work on unitest waiting for feedbacks.
> >
> > Regards,
> > Lorenzo
>
> ops I forgot, this is the patch:
>
https://patchwork.ozlabs.org/project/openvswitch/patch/a99d46b275a10a6d8278434f7cefa0466bb78af5.1589557561.git.lorenzo.bianconi@redhat.com/
>
> Regards,
> Lorenzo
>

That's great. Thanks Lorenzo. I will abandon this patch.

> >
> > >
> > > >
> > > > > ---
> > > > >  northd/ovn-northd.8.xml |  3 ++-
> > > > >  northd/ovn-northd.c     | 22 ++++++++++++----------
> > > > >  2 files changed, 14 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> > > > > index 8f224b0..b0475d0 100644
> > > > > --- a/northd/ovn-northd.8.xml
> > > > > +++ b/northd/ovn-northd.8.xml
> > > > > @@ -2601,7 +2601,8 @@ next;
> > > > >            ids <var>MID1</var>, <var>MID2</var>, ..., a logical
flow
> > > with match
> > > > >            in CIDR notation <code>ip4.dst ==
> > > <var>N</var>/<var>M</var></code>,
> > > > >            or <code>ip6.dst == <var>N</var>/<var>M</var></code>,
whose
> > > priority
> > > > > -          is the integer value of <var>M</var>, has the following
> > > actions:
> > > > > +          is <code>400</code> + the integer value of
<var>M</var>, has
> > > the
> > > > > +          following actions:
> > > > >          </p>
> > > > >
> > > > >          <pre>
> > > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > > index b25152d..c02e305 100644
> > > > > --- a/northd/ovn-northd.c
> > > > > +++ b/northd/ovn-northd.c
> > > > > @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip
*prefix,
> > > unsigned int plen)
> > > > >  }
> > > > >
> > > > >  static void
> > > > > -build_route_match(const struct ovn_port *op_inport, const char
> > > *network_s,
> > > > > +build_route_match(const struct ovn_port *op_inport,
> > > > > +                  const struct ovn_port *op_outport, const char
> > > *network_s,
> > > > >                    int plen, bool is_src_route, bool is_ipv4,
struct ds
> > > *match,
> > > > >                    uint16_t *priority)
> > > > >  {
> > > > > @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port
> > > *op_inport, const char *network_s,
> > > > >          dir = "dst";
> > > > >          *priority = (plen * 2) + 1;
> > > > >      }
> > > > > +    /* traffic for internal IPs of logical switch ports must be
sent to
> > > > > +     * the gw controller through the overlay tunnels
> > > > > +     */
> > > > > +    if (!op_outport ||
> > > > > +        (op_outport->nbrp &&
!op_outport->nbrp->n_gateway_chassis)) {
> > > > > +        priority += DROUTE_PRIO;
> > > > > +    }
> > > > >
> > > > >      if (op_inport) {
> > > > >          ds_put_format(match, "inport == %s && ",
op_inport->json_key);
> > > > > @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows,
struct
> > > ovn_datapath *od,
> > > > >      uint16_t priority;
> > > > >
> > > > >      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
> > > > > -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route,
> > > is_ipv4,
> > > > > -                      &match, &priority);
> > > > > +    build_route_match(NULL, NULL, prefix_s, eg->plen,
eg->is_src_route,
> > > > > +                      is_ipv4, &match, &priority);
> > > > >      free(prefix_s);
> > > > >
> > > > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > > > @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct
> > > ovn_port *op,
> > > > >              op_inport = op;
> > > > >          }
> > > > >      }
> > > > > -    build_route_match(op_inport, network_s, plen, is_src_route,
> > > is_ipv4,
> > > > > +    build_route_match(op_inport, op, network_s, plen,
is_src_route,
> > > is_ipv4,
> > > > >                        &match, &priority);
> > > > > -    /* traffic for internal IPs of logical switch ports must be
sent to
> > > > > -     * the gw controller through the overlay tunnels
> > > > > -     */
> > > > > -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
> > > > > -        priority += DROUTE_PRIO;
> > > > > -    }
> > > > >
> > > > >      struct ds actions = DS_EMPTY_INITIALIZER;
> > > > >      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0;
%sreg0
> > > = ",
> > > > >
> > > >
>
>
Jakub Libosvar May 18, 2020, 4:09 p.m. UTC | #7
On 15/05/2020 18:00, Lorenzo Bianconi wrote:
>>> On Thu, May 14, 2020 at 1:12 AM Dumitru Ceara <dceara@redhat.com> wrote:
>>>>
>>>> On 5/13/20 11:51 PM, Han Zhou wrote:
>>>>> In commit c0bf32d the priorities of the regular routes were increased by
>>>>> 400, but ECMP routes were not updated. This patch fixes it. Since
>>>>> for ECMP routes the outport is unknown (there can be many different
>>>>> outports) the condition check of whether the outport is distributed
>>>>> gateway port is skipped.
>>>>>
>>>>> Fixes: c0bf32d ("Manage ARP process locally in a DVR scenario")
>>>>> Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
>>>>> Signed-off-by: Han Zhou <hzhou@ovn.org>
>>
>> [...]
>>
>>> So I wonder if we should change the solution of the commit c0bf32d, instead
>>> of just increasing the ECMP routes priority only. I don't completely
>>> understand the problem that commit was trying to solve. Is there an example
>>> that describes the scenario in more detail and the reason why the route
>>> priorities have to be different?
>>
>> Hi Han,
>>
>> this morning Dumitru and me worked trying to find a solution for this issue.
>> We though to restore the original configuration in table 9
>> and add a new table used only to take care of FIP/DVR traffic.
>> Does it work for you?
>> I sent a RFC with the basic implementation. I tested it and it works fine
>> regarding to FIP traffic. I will work on unitest waiting for feedbacks.
>>
>> Regards,
>> Lorenzo
> 
> ops I forgot, this is the patch:
> https://patchwork.ozlabs.org/project/openvswitch/patch/a99d46b275a10a6d8278434f7cefa0466bb78af5.1589557561.git.lorenzo.bianconi@redhat.com/
> 
> Regards,
> Lorenzo

I just cherry-picked this patch to OVN master on my devstack environment
and tested it fixes the problem with Neutron and OpenStack.

Kuba

> 
>>
>>>
>>>>
>>>>> ---
>>>>>  northd/ovn-northd.8.xml |  3 ++-
>>>>>  northd/ovn-northd.c     | 22 ++++++++++++----------
>>>>>  2 files changed, 14 insertions(+), 11 deletions(-)
>>>>>
>>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
>>>>> index 8f224b0..b0475d0 100644
>>>>> --- a/northd/ovn-northd.8.xml
>>>>> +++ b/northd/ovn-northd.8.xml
>>>>> @@ -2601,7 +2601,8 @@ next;
>>>>>            ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow
>>> with match
>>>>>            in CIDR notation <code>ip4.dst ==
>>> <var>N</var>/<var>M</var></code>,
>>>>>            or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose
>>> priority
>>>>> -          is the integer value of <var>M</var>, has the following
>>> actions:
>>>>> +          is <code>400</code> + the integer value of <var>M</var>, has
>>> the
>>>>> +          following actions:
>>>>>          </p>
>>>>>
>>>>>          <pre>
>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>> index b25152d..c02e305 100644
>>>>> --- a/northd/ovn-northd.c
>>>>> +++ b/northd/ovn-northd.c
>>>>> @@ -7337,7 +7337,8 @@ build_route_prefix_s(const struct v46_ip *prefix,
>>> unsigned int plen)
>>>>>  }
>>>>>
>>>>>  static void
>>>>> -build_route_match(const struct ovn_port *op_inport, const char
>>> *network_s,
>>>>> +build_route_match(const struct ovn_port *op_inport,
>>>>> +                  const struct ovn_port *op_outport, const char
>>> *network_s,
>>>>>                    int plen, bool is_src_route, bool is_ipv4, struct ds
>>> *match,
>>>>>                    uint16_t *priority)
>>>>>  {
>>>>> @@ -7351,6 +7352,13 @@ build_route_match(const struct ovn_port
>>> *op_inport, const char *network_s,
>>>>>          dir = "dst";
>>>>>          *priority = (plen * 2) + 1;
>>>>>      }
>>>>> +    /* traffic for internal IPs of logical switch ports must be sent to
>>>>> +     * the gw controller through the overlay tunnels
>>>>> +     */
>>>>> +    if (!op_outport ||
>>>>> +        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
>>>>> +        priority += DROUTE_PRIO;
>>>>> +    }
>>>>>
>>>>>      if (op_inport) {
>>>>>          ds_put_format(match, "inport == %s && ", op_inport->json_key);
>>>>> @@ -7434,8 +7442,8 @@ build_ecmp_route_flow(struct hmap *lflows, struct
>>> ovn_datapath *od,
>>>>>      uint16_t priority;
>>>>>
>>>>>      char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
>>>>> -    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route,
>>> is_ipv4,
>>>>> -                      &match, &priority);
>>>>> +    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
>>>>> +                      is_ipv4, &match, &priority);
>>>>>      free(prefix_s);
>>>>>
>>>>>      struct ds actions = DS_EMPTY_INITIALIZER;
>>>>> @@ -7548,14 +7556,8 @@ add_route(struct hmap *lflows, const struct
>>> ovn_port *op,
>>>>>              op_inport = op;
>>>>>          }
>>>>>      }
>>>>> -    build_route_match(op_inport, network_s, plen, is_src_route,
>>> is_ipv4,
>>>>> +    build_route_match(op_inport, op, network_s, plen, is_src_route,
>>> is_ipv4,
>>>>>                        &match, &priority);
>>>>> -    /* traffic for internal IPs of logical switch ports must be sent to
>>>>> -     * the gw controller through the overlay tunnels
>>>>> -     */
>>>>> -    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
>>>>> -        priority += DROUTE_PRIO;
>>>>> -    }
>>>>>
>>>>>      struct ds actions = DS_EMPTY_INITIALIZER;
>>>>>      ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0
>>> = ",
>>>>>
>>>>
> 
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
diff mbox series

Patch

diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 8f224b0..b0475d0 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2601,7 +2601,8 @@  next;
           ids <var>MID1</var>, <var>MID2</var>, ..., a logical flow with match
           in CIDR notation <code>ip4.dst == <var>N</var>/<var>M</var></code>,
           or <code>ip6.dst == <var>N</var>/<var>M</var></code>, whose priority
-          is the integer value of <var>M</var>, has the following actions:
+          is <code>400</code> + the integer value of <var>M</var>, has the
+          following actions:
         </p>
 
         <pre>
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index b25152d..c02e305 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -7337,7 +7337,8 @@  build_route_prefix_s(const struct v46_ip *prefix, unsigned int plen)
 }
 
 static void
-build_route_match(const struct ovn_port *op_inport, const char *network_s,
+build_route_match(const struct ovn_port *op_inport,
+                  const struct ovn_port *op_outport, const char *network_s,
                   int plen, bool is_src_route, bool is_ipv4, struct ds *match,
                   uint16_t *priority)
 {
@@ -7351,6 +7352,13 @@  build_route_match(const struct ovn_port *op_inport, const char *network_s,
         dir = "dst";
         *priority = (plen * 2) + 1;
     }
+    /* traffic for internal IPs of logical switch ports must be sent to
+     * the gw controller through the overlay tunnels
+     */
+    if (!op_outport ||
+        (op_outport->nbrp && !op_outport->nbrp->n_gateway_chassis)) {
+        priority += DROUTE_PRIO;
+    }
 
     if (op_inport) {
         ds_put_format(match, "inport == %s && ", op_inport->json_key);
@@ -7434,8 +7442,8 @@  build_ecmp_route_flow(struct hmap *lflows, struct ovn_datapath *od,
     uint16_t priority;
 
     char *prefix_s = build_route_prefix_s(&eg->prefix, eg->plen);
-    build_route_match(NULL, prefix_s, eg->plen, eg->is_src_route, is_ipv4,
-                      &match, &priority);
+    build_route_match(NULL, NULL, prefix_s, eg->plen, eg->is_src_route,
+                      is_ipv4, &match, &priority);
     free(prefix_s);
 
     struct ds actions = DS_EMPTY_INITIALIZER;
@@ -7548,14 +7556,8 @@  add_route(struct hmap *lflows, const struct ovn_port *op,
             op_inport = op;
         }
     }
-    build_route_match(op_inport, network_s, plen, is_src_route, is_ipv4,
+    build_route_match(op_inport, op, network_s, plen, is_src_route, is_ipv4,
                       &match, &priority);
-    /* traffic for internal IPs of logical switch ports must be sent to
-     * the gw controller through the overlay tunnels
-     */
-    if (op->nbrp && !op->nbrp->n_gateway_chassis) {
-        priority += DROUTE_PRIO;
-    }
 
     struct ds actions = DS_EMPTY_INITIALIZER;
     ds_put_format(&actions, "ip.ttl--; "REG_ECMP_GROUP_ID" = 0; %sreg0 = ",