diff mbox

[ovs-dev,1/2] ovn-controller: Datapath based conntrack zone for load-balancing.

Message ID 1471945578-28525-2-git-send-email-guru@ovn.org
State Accepted
Delegated to: Ben Pfaff
Headers show

Commit Message

Gurucharan Shetty Aug. 23, 2016, 9:46 a.m. UTC
Currently ct_lb() logical action is only added for a logical switch and
we use the conntrack zone allocated for the logical port.  A future commit
will use ct_lb() for a logical router too.  In that case, use the allocated
DNAT zone.

Signed-off-by: Gurucharan Shetty <guru@ovn.org>
---
 include/ovn/actions.h  | 3 +++
 ovn/controller/lflow.c | 1 +
 ovn/lib/actions.c      | 7 +++++--
 ovn/ovn-sb.xml         | 6 ++++--
 tests/test-ovn.c       | 1 +
 5 files changed, 14 insertions(+), 4 deletions(-)

Comments

Ben Pfaff Aug. 30, 2016, 8:03 p.m. UTC | #1
On Tue, Aug 23, 2016 at 02:46:17AM -0700, Gurucharan Shetty wrote:
> Currently ct_lb() logical action is only added for a logical switch and
> we use the conntrack zone allocated for the logical port.  A future commit
> will use ct_lb() for a logical router too.  In that case, use the allocated
> DNAT zone.
> 
> Signed-off-by: Gurucharan Shetty <guru@ovn.org>

Can you help me to understand why the desired behavior is different in
each of these cases?
Gurucharan Shetty Aug. 30, 2016, 8:42 p.m. UTC | #2
On 30 August 2016 at 13:03, Ben Pfaff <blp@ovn.org> wrote:

> On Tue, Aug 23, 2016 at 02:46:17AM -0700, Gurucharan Shetty wrote:
> > Currently ct_lb() logical action is only added for a logical switch and
> > we use the conntrack zone allocated for the logical port.  A future
> commit
> > will use ct_lb() for a logical router too.  In that case, use the
> allocated
> > DNAT zone.
> >
> > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
>
> Can you help me to understand why the desired behavior is different in
> each of these cases?
>

Currently we do the following conntrack zone allocations.
1. A conntrack zone for each logical port. This has to be unique only per
hypervisor. We use this zone to do both firewall and east-west
load-balancing.

For firewall, we send the packet to conntrack to defragment it and track it
and figure out whether it is invalid, new, established etc. Invalid packets
are dropped. new connections are committed.

For load-balancing, defragmented packets are DNATed into one of the
possible endpoints. We do the load-balancing at the endpoint (instead of
say in a router) because of the asymmetric  nature of OVN router pipeline
handling for east-west.
So when we see ct_lb() action on a switch, we can just use the conntrack
zone allocated for that logical port.


2. Two conntrack zones per gateway router.
A gateway router only resides in a particular chassis. We have one
conntrack zone for DNAT and another for SNAT.

Now when I want to add ct_lb() action for the gateway router, I want to use
it as part of the gateway router pipeline. Since load-balancing is nothing
but a DNAT using one of the chosen endpoint, the conntrack zone has to be a
DNAT zone allocated to that gateway router.

Did I give the answer to your question? Or was it something else that you
were looking an explanation for?

PS: The second patch of the series did not make it to patchwork. It is here:
http://openvswitch.org/pipermail/dev/2016-August/078478.html
Ben Pfaff Aug. 30, 2016, 9:50 p.m. UTC | #3
On Tue, Aug 30, 2016 at 01:42:06PM -0700, Guru Shetty wrote:
> On 30 August 2016 at 13:03, Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Tue, Aug 23, 2016 at 02:46:17AM -0700, Gurucharan Shetty wrote:
> > > Currently ct_lb() logical action is only added for a logical switch and
> > > we use the conntrack zone allocated for the logical port.  A future
> > commit
> > > will use ct_lb() for a logical router too.  In that case, use the
> > allocated
> > > DNAT zone.
> > >
> > > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> >
> > Can you help me to understand why the desired behavior is different in
> > each of these cases?
> >
> 
> Currently we do the following conntrack zone allocations.
> 1. A conntrack zone for each logical port. This has to be unique only per
> hypervisor. We use this zone to do both firewall and east-west
> load-balancing.
> 
> For firewall, we send the packet to conntrack to defragment it and track it
> and figure out whether it is invalid, new, established etc. Invalid packets
> are dropped. new connections are committed.
> 
> For load-balancing, defragmented packets are DNATed into one of the
> possible endpoints. We do the load-balancing at the endpoint (instead of
> say in a router) because of the asymmetric  nature of OVN router pipeline
> handling for east-west.
> So when we see ct_lb() action on a switch, we can just use the conntrack
> zone allocated for that logical port.
> 
> 
> 2. Two conntrack zones per gateway router.
> A gateway router only resides in a particular chassis. We have one
> conntrack zone for DNAT and another for SNAT.
> 
> Now when I want to add ct_lb() action for the gateway router, I want to use
> it as part of the gateway router pipeline. Since load-balancing is nothing
> but a DNAT using one of the chosen endpoint, the conntrack zone has to be a
> DNAT zone allocated to that gateway router.
> 
> Did I give the answer to your question? Or was it something else that you
> were looking an explanation for?

One underlying question I'm trying to understand is whether this
difference is something inherent in the definition of logical switches
and logical routers, and thus the approach taken here of automatically
choosing a strategy is appropriate, or whether this is just what we
happen to be doing now, in which case it might be better to make how to
select the zone a parameter to ct_lb.
Gurucharan Shetty Aug. 30, 2016, 10:04 p.m. UTC | #4
On 30 August 2016 at 14:50, Ben Pfaff <blp@ovn.org> wrote:

> On Tue, Aug 30, 2016 at 01:42:06PM -0700, Guru Shetty wrote:
> > On 30 August 2016 at 13:03, Ben Pfaff <blp@ovn.org> wrote:
> >
> > > On Tue, Aug 23, 2016 at 02:46:17AM -0700, Gurucharan Shetty wrote:
> > > > Currently ct_lb() logical action is only added for a logical switch
> and
> > > > we use the conntrack zone allocated for the logical port.  A future
> > > commit
> > > > will use ct_lb() for a logical router too.  In that case, use the
> > > allocated
> > > > DNAT zone.
> > > >
> > > > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> > >
> > > Can you help me to understand why the desired behavior is different in
> > > each of these cases?
> > >
> >
> > Currently we do the following conntrack zone allocations.
> > 1. A conntrack zone for each logical port. This has to be unique only per
> > hypervisor. We use this zone to do both firewall and east-west
> > load-balancing.
> >
> > For firewall, we send the packet to conntrack to defragment it and track
> it
> > and figure out whether it is invalid, new, established etc. Invalid
> packets
> > are dropped. new connections are committed.
> >
> > For load-balancing, defragmented packets are DNATed into one of the
> > possible endpoints. We do the load-balancing at the endpoint (instead of
> > say in a router) because of the asymmetric  nature of OVN router pipeline
> > handling for east-west.
> > So when we see ct_lb() action on a switch, we can just use the conntrack
> > zone allocated for that logical port.
> >
> >
> > 2. Two conntrack zones per gateway router.
> > A gateway router only resides in a particular chassis. We have one
> > conntrack zone for DNAT and another for SNAT.
> >
> > Now when I want to add ct_lb() action for the gateway router, I want to
> use
> > it as part of the gateway router pipeline. Since load-balancing is
> nothing
> > but a DNAT using one of the chosen endpoint, the conntrack zone has to
> be a
> > DNAT zone allocated to that gateway router.
> >
> > Did I give the answer to your question? Or was it something else that you
> > were looking an explanation for?
>
> One underlying question I'm trying to understand is whether this
> difference is something inherent in the definition of logical switches
> and logical routers, and thus the approach taken here of automatically
> choosing a strategy is appropriate, or whether this is just what we
> happen to be doing now, in which case it might be better to make how to
> select the zone a parameter to ct_lb.
>

I see the question now. I don't have a good answer. One way to look at it
would be that a "zone" is an internal implementation detail and should not
be seen in a action of logical flow. But we can then say that we could
rename "zone" as "datapath" in the logical action.  But, then we would be
limiting it to 2 anyway (datapath=lswitch or datapath=lrouter), because
that is all that I can think about - in which case we are inferring it with
the current patch anyway.  I did start with passing this as an argument to
ct_lb, but could not think about a good name. I then decided that for a
stateful OVS action, you probably cannot make a very general purpose
logical action.  Do you have an opinion? I will go with that.
Ben Pfaff Sept. 9, 2016, 3:59 p.m. UTC | #5
On Tue, Aug 30, 2016 at 03:04:44PM -0700, Guru Shetty wrote:
> On 30 August 2016 at 14:50, Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Tue, Aug 30, 2016 at 01:42:06PM -0700, Guru Shetty wrote:
> > > On 30 August 2016 at 13:03, Ben Pfaff <blp@ovn.org> wrote:
> > >
> > > > On Tue, Aug 23, 2016 at 02:46:17AM -0700, Gurucharan Shetty wrote:
> > > > > Currently ct_lb() logical action is only added for a logical switch
> > and
> > > > > we use the conntrack zone allocated for the logical port.  A future
> > > > commit
> > > > > will use ct_lb() for a logical router too.  In that case, use the
> > > > allocated
> > > > > DNAT zone.
> > > > >
> > > > > Signed-off-by: Gurucharan Shetty <guru@ovn.org>
> > > >
> > > > Can you help me to understand why the desired behavior is different in
> > > > each of these cases?
> > > >
> > >
> > > Currently we do the following conntrack zone allocations.
> > > 1. A conntrack zone for each logical port. This has to be unique only per
> > > hypervisor. We use this zone to do both firewall and east-west
> > > load-balancing.
> > >
> > > For firewall, we send the packet to conntrack to defragment it and track
> > it
> > > and figure out whether it is invalid, new, established etc. Invalid
> > packets
> > > are dropped. new connections are committed.
> > >
> > > For load-balancing, defragmented packets are DNATed into one of the
> > > possible endpoints. We do the load-balancing at the endpoint (instead of
> > > say in a router) because of the asymmetric  nature of OVN router pipeline
> > > handling for east-west.
> > > So when we see ct_lb() action on a switch, we can just use the conntrack
> > > zone allocated for that logical port.
> > >
> > >
> > > 2. Two conntrack zones per gateway router.
> > > A gateway router only resides in a particular chassis. We have one
> > > conntrack zone for DNAT and another for SNAT.
> > >
> > > Now when I want to add ct_lb() action for the gateway router, I want to
> > use
> > > it as part of the gateway router pipeline. Since load-balancing is
> > nothing
> > > but a DNAT using one of the chosen endpoint, the conntrack zone has to
> > be a
> > > DNAT zone allocated to that gateway router.
> > >
> > > Did I give the answer to your question? Or was it something else that you
> > > were looking an explanation for?
> >
> > One underlying question I'm trying to understand is whether this
> > difference is something inherent in the definition of logical switches
> > and logical routers, and thus the approach taken here of automatically
> > choosing a strategy is appropriate, or whether this is just what we
> > happen to be doing now, in which case it might be better to make how to
> > select the zone a parameter to ct_lb.
> >
> 
> I see the question now. I don't have a good answer. One way to look at it
> would be that a "zone" is an internal implementation detail and should not
> be seen in a action of logical flow. But we can then say that we could
> rename "zone" as "datapath" in the logical action.  But, then we would be
> limiting it to 2 anyway (datapath=lswitch or datapath=lrouter), because
> that is all that I can think about - in which case we are inferring it with
> the current patch anyway.  I did start with passing this as an argument to
> ct_lb, but could not think about a good name. I then decided that for a
> stateful OVS action, you probably cannot make a very general purpose
> logical action.  Do you have an opinion? I will go with that.

I think I understand now.  Let's take what you've done.

Please consider whether any of your explanation here is worth putting
into comments or documentation or the commit log.

Acked-by: Ben Pfaff <blp@ovn.org>
Gurucharan Shetty Sept. 9, 2016, 9:48 p.m. UTC | #6
>
>
>
> I think I understand now.  Let's take what you've done.
>
> Please consider whether any of your explanation here is worth putting
> into comments or documentation or the commit log.
>
Thanks. I added the rationale into the commit message and pushed this.


>
> Acked-by: Ben Pfaff <blp@ovn.org>
>
diff mbox

Patch

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index e2a716a..fd03c00 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -406,6 +406,9 @@  struct ovnact_encode_params {
                         unsigned int *portp);
     const void *aux;
 
+    /* 'true' if the flow is for a switch. */
+    bool is_switch;
+
     /* A map from a port name to its connection tracking zone. */
     const struct simap *ct_zones;
 
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 341ca08..71167ef 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -419,6 +419,7 @@  consider_logical_flow(const struct lport_index *lports,
     struct ovnact_encode_params ep = {
         .lookup_port = lookup_port_cb,
         .aux = &aux,
+        .is_switch = is_switch(ldp),
         .ct_zones = ct_zones,
         .group_table = group_table,
         .lflow_uuid = lflow->header_.uuid,
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index 3908e1d..f2002cf 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -927,7 +927,8 @@  encode_CT_LB(const struct ovnact_ct_lb *cl,
         struct ofpact_conntrack *ct = ofpact_put_CT(ofpacts);
         struct ofpact_nat *nat;
         size_t nat_offset;
-        ct->zone_src.field = mf_from_id(MFF_LOG_CT_ZONE);
+        ct->zone_src.field = ep->is_switch ? mf_from_id(MFF_LOG_CT_ZONE)
+                                : mf_from_id(MFF_LOG_DNAT_ZONE);
         ct->zone_src.ofs = 0;
         ct->zone_src.n_bits = 16;
         ct->flags = 0;
@@ -951,6 +952,8 @@  encode_CT_LB(const struct ovnact_ct_lb *cl,
     uint32_t group_id = 0, hash;
     struct group_info *group_info;
     struct ofpact_group *og;
+    uint32_t zone_reg = ep->is_switch ? MFF_LOG_CT_ZONE - MFF_REG0
+                            : MFF_LOG_DNAT_ZONE - MFF_REG0;
 
     struct ds ds = DS_EMPTY_INITIALIZER;
     ds_put_format(&ds, "type=select");
@@ -965,7 +968,7 @@  encode_CT_LB(const struct ovnact_ct_lb *cl,
             ds_put_format(&ds, ":%"PRIu16, dst->port);
         }
         ds_put_format(&ds, "),commit,table=%d,zone=NXM_NX_REG%d[0..15])",
-                      recirc_table, MFF_LOG_CT_ZONE - MFF_REG0);
+                      recirc_table, zone_reg);
     }
 
     hash = hash_string(ds_cstr(&ds), 0);
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index e119249..235c21c 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -1345,8 +1345,10 @@ 
             DNAT address.  Processing automatically moves on to the next table,
             as if <code>next;</code> were specified, and later tables act on
             the packet as modified by the connection tracker.  Connection
-            tracking state is scoped by the logical port, so overlapping
-            addresses may be used.
+            tracking state is scoped by the logical port when the action is
+            used in a flow for a logical switch, so overlapping
+            addresses may be used.  Connection tracking state is scoped by the
+            logical topology when the action is used in a flow for a router.
           </p>
           <p>
             Without arguments, <code>ct_lb</code> sends the packet to the
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 0ef09ab..837399e 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -1181,6 +1181,7 @@  test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
             const struct ovnact_encode_params ep = {
                 .lookup_port = lookup_port_cb,
                 .aux = &ports,
+                .is_switch = true,
                 .ct_zones = &ct_zones,
                 .group_table = &group_table,