diff mbox

[ovs-dev,RFC] ovn: Add stateful ACL support.

Message ID 1441415060-93194-1-git-send-email-jpettit@nicira.com
State RFC
Headers show

Commit Message

Justin Pettit Sept. 5, 2015, 1:04 a.m. UTC
Add support for the "allow-related" ACL action.  This is dependent on
the OVS conntrack functionality, which is not available on all platforms
or kernel versions.

Here is a sample policy that will allow all tenants in logical switch
"ls0" to SSH to each other.  Anyone can make an HTTP request to "lp0".
All other IP traffic is dropped:

  ovn-nbctl acl-add ls0 from-lport 100 ip allow-related
  ovn-nbctl acl-add ls0 to-lport 100 tcp.dst==22 allow-related
  ovn-nbctl acl-add ls0 to-lport 100 "outport == \"lp0\" \
            && tcp.dst==80" allow-related
  ovn-nbctl acl-add ls0 to-lport 1 ip drop

-=-=-=-=-=-=-=-=-=-

NOTE: This is an RFC.  I would like some feedback on the overall design
and whether it works as expected.  It has a number of dependencies on
features not yet available in the master of OVS.  As such, it is
probably easiest to try this patch from the following repo:

    https://github.com/justinpettit/ovs/tree/ovn-acl

Once the prerequisites make it to the main OVS repo, I'll send out a
non-RFC version.
---
 ovn/TODO                            |    8 ++
 ovn/controller/binding.c            |   43 ++++++++++
 ovn/controller/lflow.c              |   13 +++-
 ovn/controller/lflow.h              |    4 +-
 ovn/controller/ovn-controller.8.xml |   19 +++++
 ovn/controller/ovn-controller.c     |   50 +++++++++---
 ovn/controller/ovn-controller.h     |    7 ++
 ovn/controller/physical.c           |   16 ++++-
 ovn/lib/actions.c                   |   46 +++++++++--
 ovn/lib/actions.h                   |   13 ++--
 ovn/northd/ovn-northd.c             |  152 +++++++++++++++++++++++++++++------
 ovn/ovn-architecture.7.xml          |    8 ++
 ovn/ovn-sb.xml                      |   39 +++++++--
 tests/test-ovn.c                    |    8 +-
 14 files changed, 363 insertions(+), 63 deletions(-)

Comments

Ben Pfaff Sept. 6, 2015, 1:33 a.m. UTC | #1
On Fri, Sep 04, 2015 at 06:04:20PM -0700, Justin Pettit wrote:
> Add support for the "allow-related" ACL action.  This is dependent on
> the OVS conntrack functionality, which is not available on all platforms
> or kernel versions.
> 
> Here is a sample policy that will allow all tenants in logical switch
> "ls0" to SSH to each other.  Anyone can make an HTTP request to "lp0".
> All other IP traffic is dropped:
> 
>   ovn-nbctl acl-add ls0 from-lport 100 ip allow-related
>   ovn-nbctl acl-add ls0 to-lport 100 tcp.dst==22 allow-related
>   ovn-nbctl acl-add ls0 to-lport 100 "outport == \"lp0\" \
>             && tcp.dst==80" allow-related
>   ovn-nbctl acl-add ls0 to-lport 1 ip drop
> 
> -=-=-=-=-=-=-=-=-=-
> 
> NOTE: This is an RFC.  I would like some feedback on the overall design
> and whether it works as expected.  It has a number of dependencies on
> features not yet available in the master of OVS.  As such, it is
> probably easiest to try this patch from the following repo:
> 
>     https://github.com/justinpettit/ovs/tree/ovn-acl
> 
> Once the prerequisites make it to the main OVS repo, I'll send out a
> non-RFC version.

Thanks!  This is pretty clean, and not very much code.

Is adding support for FTP a big project or a small one?  (What about
other ALGs?)

In update_ct_zones() in binding.c, the bitmap_scan() call starts over
from 0 every time.  It would not be too much work to start from the
previously allocated zone, and so that is probably worthwhile.

If ct_next is not the last action in a logical flow, will the actions
that follow it get executed?  If not, then we should document that and
we should probably reject a set of actions where ct_next is followed by
other actions.

It would be nice to test the parsing of the new actions, in the "ovn --
action parsing" test in tests/ovn.at.

I have some other suggestions too that seemed to be best expressed as
patches that can be squashed in.  I pushed them to:
        git@github.com:blp/ovs-reviews.git acl-suggestions

The commits there are:

commit 68601484d3a00161a2c28c73254679fcd030cce6
Author: Ben Pfaff <blp@nicira.com>
Date:   Sat Sep 5 18:27:30 2015 -0700

    ovn-sb: Improve documentation formatting and phrasing.
    
    Signed-off-by: Ben Pfaff <blp@nicira.com>

commit 71644ce7656775ff39037d02526f1f04118dd2ba
Author: Ben Pfaff <blp@nicira.com>
Date:   Sat Sep 5 18:19:20 2015 -0700

    ovn-northd: Add parentheses around ACL match.
    
    Otherwise an ACL that contains an && would yield a syntax error, and in
    general it's best not to tempt precedence rules.
    
    Signed-off-by: Ben Pfaff <blp@nicira.com>

commit d618afd0356870c20df323371fea6f2a9647445c
Author: Ben Pfaff <blp@nicira.com>
Date:   Sat Sep 5 18:09:00 2015 -0700

    physical: Revert stray change.
    
    The indentation was changed in "ovn: Add stateful ACL support." but
    I think it was a mistake.
    
    Signed-off-by: Ben Pfaff <blp@nicira.com>

commit 8316edccf932d6560274f4cc9dc74139cb0936aa
Author: Ben Pfaff <blp@nicira.com>
Date:   Sat Sep 5 18:08:20 2015 -0700

    ovn-controller: Pass conntrack zone info as explicit parameters.
    
    It's a lot easier to see what code actually depends on data when the data
    is passed as a parameter instead of as part of a large struct.  This
    commit makes that change for conntrack zones.
    
    Signed-off-by: Ben Pfaff <blp@nicira.com>
Justin Pettit Oct. 13, 2015, 10:35 p.m. UTC | #2
> On Sep 5, 2015, at 6:33 PM, Ben Pfaff <blp@nicira.com> wrote:
> 
> Thanks!  This is pretty clean, and not very much code.
> 
> Is adding support for FTP a big project or a small one?  (What about
> other ALGs?)

It's a little messy.  We need to make a conntrack call with the ALG type for that particular flow, which means a different OpenFlow flow for each supported ALG.  It's not clear to me whether we should expose that through the ACL definitions in the Northbound DB or just do it ourselves under the covers.  As such, I'd like to just get this in without ALG support for the time-being, and then come back after we've discussed it with some people familiar with CMS integration.

Currently, the OVS conntrack code only supports FTP.  I've asked Joe to also look into TFTP, since people will need it to PXE boot.  We'll almost certainly need to support more, but Linux supports a fairly small number of them.

> In update_ct_zones() in binding.c, the bitmap_scan() call starts over
> from 0 every time.  It would not be too much work to start from the
> previously allocated zone, and so that is probably worthwhile.

Good point.  I just did it for the lifetime of that functional call and didn't make it static to live across calls.  Let me know if you think that's worth doing, though.

> If ct_next is not the last action in a logical flow, will the actions
> that follow it get executed?  If not, then we should document that and
> we should probably reject a set of actions where ct_next is followed by
> other actions.

Yes, follow on actions will be executed, but it's not generally useful, since the side-effects won't be available.  I've added some documentation about that.

> It would be nice to test the parsing of the new actions, in the "ovn --
> action parsing" test in tests/ovn.at.

Done.

> I have some other suggestions too that seemed to be best expressed as
> patches that can be squashed in.  I pushed them to:
>        git@github.com:blp/ovs-reviews.git acl-suggestions

Thanks!  I squashed them into my patches.

I'll send out a non-RFC version once the conntrack userspace changes gets committed upstream to OVS.

--Justin
diff mbox

Patch

diff --git a/ovn/TODO b/ovn/TODO
index 356b3ba..260ddaa 100644
--- a/ovn/TODO
+++ b/ovn/TODO
@@ -80,3 +80,11 @@ 
    So far, both ovn-controller and ovn-controller-vtep only allow
    chassis to have one tunnel encapsulation entry.  We should extend
    the implementation to support multiple tunnel encapsulations.
+
+* ACL
+
+** Support FTP ALGs.
+
+** Support reject action.
+
+** Support log option.
diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index fca2430..492dd39 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -16,6 +16,7 @@ 
 #include <config.h>
 #include "binding.h"
 
+#include "lib/bitmap.h"
 #include "lib/sset.h"
 #include "lib/util.h"
 #include "lib/vswitch-idl.h"
@@ -71,6 +72,46 @@  get_local_iface_ids(const struct ovsrec_bridge *br_int, struct sset *lports)
     }
 }
 
+static void
+update_ct_zones(struct controller_ctx *ctx, struct sset *lports)
+{
+    struct simap_node *ct_zone, *ct_zone_next;
+    const char *iface_id;
+
+    /* xxx This is wasteful to assign a zone to each port--even if no
+     * xxx security policy is applied. */
+
+    /* Delete any zones that are associated with removed ports. */
+    SIMAP_FOR_EACH_SAFE(ct_zone, ct_zone_next, &ctx->ct_zones) {
+        if (!sset_contains(lports, ct_zone->name)) {
+            bitmap_set0(ctx->ct_zone_bitmap, ct_zone->data);
+            simap_delete(&ctx->ct_zones, ct_zone);
+        }
+    }
+
+    /* Assign a unique zone id for each logical port. */
+    SSET_FOR_EACH(iface_id, lports) {
+        size_t zone;
+
+        if (simap_contains(&ctx->ct_zones, iface_id)) {
+            continue;
+        }
+
+        /* We assume that there are 64K zones and that we own them all. */
+        zone = bitmap_scan(ctx->ct_zone_bitmap, 0, 1, MAX_CT_ZONES + 1);
+        if (zone == MAX_CT_ZONES + 1) {
+            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+            VLOG_WARN_RL(&rl, "exhausted all ct zones");
+            return;
+        }
+
+        bitmap_set1(ctx->ct_zone_bitmap, zone);
+        simap_put(&ctx->ct_zones, iface_id, zone);
+
+        /* xxx This should make call to erase any old entries for this zone. */
+    }
+}
+
 void
 binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
             const char *chassis_id)
@@ -97,6 +138,7 @@  binding_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
         /* We have no integration bridge, therefore no local logical ports.
          * We'll remove our chassis from all port binding records below. */
     }
+    update_ct_zones(ctx, &lports);
     sset_clone(&all_lports, &lports);
 
     ovsdb_idl_txn_add_comment(
@@ -141,6 +183,7 @@  binding_cleanup(struct controller_ctx *ctx, const char *chassis_id)
     if (!chassis_id) {
         return true;
     }
+
     const struct sbrec_chassis *chassis_rec
         = get_chassis_by_name(ctx->ovnsb_idl, chassis_id);
     if (!chassis_rec) {
diff --git a/ovn/controller/lflow.c b/ovn/controller/lflow.c
index 9246e61..fa61247 100644
--- a/ovn/controller/lflow.c
+++ b/ovn/controller/lflow.c
@@ -58,6 +58,15 @@  symtab_init(void)
     MFF_LOG_REGS;
 #undef MFF_LOG_REG
 
+    /* Connection tracking state. */
+    expr_symtab_add_field(&symtab, "ct_state", MFF_CT_STATE, NULL, false);
+    expr_symtab_add_predicate(&symtab, "ct.trk", "ct_state[7]");
+    expr_symtab_add_subfield(&symtab, "ct.new", "ct.trk", "ct_state[0]");
+    expr_symtab_add_subfield(&symtab, "ct.est", "ct.trk", "ct_state[1]");
+    expr_symtab_add_subfield(&symtab, "ct.rel", "ct.trk", "ct_state[2]");
+    expr_symtab_add_subfield(&symtab, "ct.inv", "ct.trk", "ct_state[5]");
+    expr_symtab_add_subfield(&symtab, "ct.rpl", "ct.trk", "ct_state[6]");
+
     /* Data fields. */
     expr_symtab_add_field(&symtab, "eth.src", MFF_ETH_SRC, NULL, false);
     expr_symtab_add_field(&symtab, "eth.dst", MFF_ETH_DST, NULL, false);
@@ -284,8 +293,8 @@  lflow_run(struct controller_ctx *ctx, struct hmap *flow_table)
 
         ofpbuf_use_stub(&ofpacts, ofpacts_stub, sizeof ofpacts_stub);
         error = actions_parse_string(lflow->actions, &symtab, &ldp->ports,
-                                     next_phys_table, output_phys_table,
-                                     &ofpacts, &prereqs);
+                                     &ctx->ct_zones, next_phys_table,
+                                     output_phys_table, &ofpacts, &prereqs);
         if (error) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
             VLOG_WARN_RL(&rl, "error parsing actions \"%s\": %s",
diff --git a/ovn/controller/lflow.h b/ovn/controller/lflow.h
index 59fe559..0e3e51e 100644
--- a/ovn/controller/lflow.h
+++ b/ovn/controller/lflow.h
@@ -55,6 +55,7 @@  struct uuid;
 
 /* Logical fields. */
 #define MFF_LOG_DATAPATH MFF_METADATA /* Logical datapath (64 bits). */
+#define MFF_LOG_CT_ZONE  MFF_REG5     /* Logical conntrack zone (32 bits). */
 #define MFF_LOG_INPORT   MFF_REG6     /* Logical input port (32 bits). */
 #define MFF_LOG_OUTPORT  MFF_REG7     /* Logical output port (32 bits). */
 
@@ -66,8 +67,7 @@  struct uuid;
     MFF_LOG_REG(MFF_REG1) \
     MFF_LOG_REG(MFF_REG2) \
     MFF_LOG_REG(MFF_REG3) \
-    MFF_LOG_REG(MFF_REG4) \
-    MFF_LOG_REG(MFF_REG5)
+    MFF_LOG_REG(MFF_REG4)
 
 void lflow_init(void);
 void lflow_run(struct controller_ctx *, struct hmap *flow_table);
diff --git a/ovn/controller/ovn-controller.8.xml b/ovn/controller/ovn-controller.8.xml
index e1cb6a2..acca510 100644
--- a/ovn/controller/ovn-controller.8.xml
+++ b/ovn/controller/ovn-controller.8.xml
@@ -119,4 +119,23 @@ 
         This behavior should be changed.
       </p>
     </p>
+
+    <h1>RUNTIME MANAGEMENT COMMANDS</h1>
+    <p>
+      <code>ovs-appctl</code> can send commands to a running
+      <code>ovn-controller</code> process.  The currently supported
+      commands are described below.
+      <dl>
+      <dt><code>exit</code></dt>
+      <dd>
+        Causes <code>ovn-controller</code> to gracefully terminate.
+      </dd>
+
+      <dt><code>ct-zone-list</code></dt>
+      <dd>
+        Lists each local logical port and its connection tracking zone.
+      </dd>
+      </dl>
+    </p>
+
 </manpage>
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index 8e93a0f..8314e04 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -27,6 +27,7 @@ 
 #include "compiler.h"
 #include "daemon.h"
 #include "dirs.h"
+#include "dynamic-string.h"
 #include "openvswitch/vconn.h"
 #include "openvswitch/vlog.h"
 #include "ovn/lib/ovn-sb-idl.h"
@@ -49,6 +50,7 @@ 
 VLOG_DEFINE_THIS_MODULE(main);
 
 static unixctl_cb_func ovn_controller_exit;
+static unixctl_cb_func ct_zone_list;
 
 #define DEFAULT_BRIDGE_NAME "br-int"
 
@@ -119,6 +121,7 @@  int
 main(int argc, char *argv[])
 {
     struct unixctl_server *unixctl;
+    struct controller_ctx ctx;
     bool exiting;
     int retval;
 
@@ -134,6 +137,7 @@  main(int argc, char *argv[])
         exit(EXIT_FAILURE);
     }
     unixctl_command_register("exit", "", 0, 0, ovn_controller_exit, &exiting);
+    unixctl_command_register("ct-zone-list", "", 0, 0, ct_zone_list, &ctx);
 
     daemonize_complete();
 
@@ -162,15 +166,20 @@  main(int argc, char *argv[])
         ovsdb_idl_create(ovnsb_remote, &sbrec_idl_class, true, true));
     ovsdb_idl_get_initial_snapshot(ovnsb_idl_loop.idl);
 
+    /* Initialize connection tracking zones. */
+    simap_init(&ctx.ct_zones);
+    ctx.ct_zone_bitmap = bitmap_allocate(MAX_CT_ZONES);
+
+    /* We never use zone 0. */
+    bitmap_set1(ctx.ct_zone_bitmap, 0);
+
     /* Main loop. */
     exiting = false;
     while (!exiting) {
-        struct controller_ctx ctx = {
-            .ovs_idl = ovs_idl_loop.idl,
-            .ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop),
-            .ovnsb_idl = ovnsb_idl_loop.idl,
-            .ovnsb_idl_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop),
-        };
+        ctx.ovs_idl = ovs_idl_loop.idl;
+        ctx.ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop);
+        ctx.ovnsb_idl = ovnsb_idl_loop.idl;
+        ctx.ovnsb_idl_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop);
 
         const struct ovsrec_bridge *br_int = get_br_int(ctx.ovs_idl);
         const char *chassis_id = get_chassis_id(ctx.ovs_idl);
@@ -213,12 +222,10 @@  main(int argc, char *argv[])
     /* It's time to exit.  Clean up the databases. */
     bool done = false;
     while (!done) {
-        struct controller_ctx ctx = {
-            .ovs_idl = ovs_idl_loop.idl,
-            .ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop),
-            .ovnsb_idl = ovnsb_idl_loop.idl,
-            .ovnsb_idl_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop),
-        };
+        ctx.ovs_idl = ovs_idl_loop.idl;
+        ctx.ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop);
+        ctx.ovnsb_idl = ovnsb_idl_loop.idl;
+        ctx.ovnsb_idl_txn = ovsdb_idl_loop_run(&ovnsb_idl_loop);
 
         const struct ovsrec_bridge *br_int = get_br_int(ctx.ovs_idl);
         const char *chassis_id = get_chassis_id(ctx.ovs_idl);
@@ -241,6 +248,9 @@  main(int argc, char *argv[])
     lflow_destroy();
     ofctrl_destroy();
 
+    simap_destroy(&ctx.ct_zones);
+    bitmap_free(ctx.ct_zone_bitmap);
+
     ovsdb_idl_loop_destroy(&ovs_idl_loop);
     ovsdb_idl_loop_destroy(&ovnsb_idl_loop);
 
@@ -341,3 +351,19 @@  ovn_controller_exit(struct unixctl_conn *conn, int argc OVS_UNUSED,
 
     unixctl_command_reply(conn, NULL);
 }
+
+static void
+ct_zone_list(struct unixctl_conn *conn, int argc OVS_UNUSED,
+             const char *argv[] OVS_UNUSED, void *aux)
+{
+    const struct controller_ctx *ctx = aux;
+    struct ds ds = DS_EMPTY_INITIALIZER;
+    struct simap_node *zone;
+
+    SIMAP_FOR_EACH(zone, &ctx->ct_zones) {
+        ds_put_format(&ds, "%s %d\n", zone->name, zone->data);
+    }
+
+    unixctl_command_reply(conn, ds_cstr(&ds));
+    ds_destroy(&ds);
+}
diff --git a/ovn/controller/ovn-controller.h b/ovn/controller/ovn-controller.h
index be89b5f..932decd 100644
--- a/ovn/controller/ovn-controller.h
+++ b/ovn/controller/ovn-controller.h
@@ -17,14 +17,21 @@ 
 #ifndef OVN_CONTROLLER_H
 #define OVN_CONTROLLER_H 1
 
+#include "simap.h"
 #include "ovn/lib/ovn-sb-idl.h"
 
+/* Linux supports a maximum of 64K zones, which seems like a fine default. */
+#define MAX_CT_ZONES 65535
+
 struct controller_ctx {
     struct ovsdb_idl *ovnsb_idl;
     struct ovsdb_idl_txn *ovnsb_idl_txn;
 
     struct ovsdb_idl *ovs_idl;
     struct ovsdb_idl_txn *ovs_idl_txn;
+
+    struct simap ct_zones;          /* Port to conntrack zone mappings. */
+    unsigned long *ct_zone_bitmap;  /* Bitmap of assigned zones. */
 };
 
 static inline const struct sbrec_chassis *
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 2ec0ba9..d12c5dc 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -235,6 +235,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
 
         struct match match;
         if (!tun) {
+            int zone_id = simap_get(&ctx->ct_zones, binding->logical_port);
             /* Packets that arrive from a vif can belong to a VM or
              * to a container located inside that VM. Packets that
              * arrive from containers have a tag (vlan) associated with them.
@@ -251,6 +252,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
              * input port, MFF_LOG_DATAPATH to the logical datapath, and
              * resubmit into the logical ingress pipeline starting at table
              * 16. */
+
             match_init_catchall(&match);
             ofpbuf_clear(&ofpacts);
             match_set_in_port(&match, ofport);
@@ -258,6 +260,10 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 match_set_dl_vlan(&match, htons(tag));
             }
 
+            if (zone_id) {
+                put_load(zone_id, MFF_LOG_CT_ZONE, 0, 32, &ofpacts);
+            }
+
             /* Set MFF_LOG_DATAPATH and MFF_LOG_INPORT. */
             put_load(binding->datapath->tunnel_key, MFF_LOG_DATAPATH, 0, 64,
                      &ofpacts);
@@ -289,6 +295,10 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
             match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0,
                           binding->tunnel_key);
 
+            if (zone_id) {
+                put_load(zone_id, MFF_LOG_CT_ZONE, 0, 32, &ofpacts);
+            }
+
             /* Resubmit to table 34. */
             put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
             ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 100, &match,
@@ -396,6 +406,10 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
                 continue;
             }
 
+            int zone_id = simap_get(&ctx->ct_zones, port->logical_port);
+            if (zone_id) {
+                put_load(zone_id, MFF_LOG_CT_ZONE, 0, 32, &ofpacts);
+            }
             if (simap_contains(&lport_to_ofport, port->logical_port)) {
                 put_load(port->tunnel_key, MFF_LOG_OUTPORT, 0, 32, &ofpacts);
                 put_resubmit(OFTABLE_DROP_LOOPBACK, &ofpacts);
@@ -504,7 +518,7 @@  physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve,
     ofpbuf_clear(&ofpacts);
 #define MFF_LOG_REG(ID) put_load(0, ID, 0, 32, &ofpacts);
     MFF_LOG_REGS;
-#undef MFF_LOG_REGS
+ #undef MFF_LOG_REGS
     put_resubmit(OFTABLE_LOG_EGRESS_PIPELINE, &ofpacts);
     ofctrl_add_flow(flow_table, OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts);
 
diff --git a/ovn/lib/actions.c b/ovn/lib/actions.c
index 0a0158a..0f9675f 100644
--- a/ovn/lib/actions.c
+++ b/ovn/lib/actions.c
@@ -24,6 +24,7 @@ 
 #include "lex.h"
 #include "ofp-actions.h"
 #include "ofpbuf.h"
+#include "simap.h"
 
 /* Context maintained during actions_parse(). */
 struct action_context {
@@ -33,6 +34,7 @@  struct action_context {
     uint8_t next_table_id;      /* OpenFlow table for 'next' to resubmit. */
     uint8_t output_table_id;    /* OpenFlow table for 'output' to resubmit. */
     const struct simap *ports;  /* Map from port name to number. */
+    const struct simap *ct_zones; /* Map from port name to conntrack zone. */
 
     /* State. */
     char *error;                /* Error, if any, otherwise NULL. */
@@ -131,6 +133,30 @@  emit_resubmit(struct action_context *ctx, uint8_t table_id)
 }
 
 static void
+emit_ct(struct action_context *ctx, bool recirc, bool commit)
+{
+    struct ofpact_conntrack *ct = ofpact_put_CT(ctx->ofpacts);
+    ct->flags |= recirc ? NX_CT_F_RECIRC : 0;
+    ct->flags |= commit ? NX_CT_F_COMMIT : 0;
+    if (recirc) {
+        /* If "recirc" is set, we automatically go to the next table. */
+        ct->next_table = ctx->next_table_id;
+    }
+    /* xxx Should remove hard-coding reg5 if we refactor library. */
+    ct->src.field = mf_from_id(MFF_REG5);
+    ct->src.ofs = 0;
+    ct->src.n_bits = 32;
+
+    /* CT only works with IP, so set up a prerequisite. */
+    struct expr *expr;
+    char *error;
+
+    expr = expr_parse_string("ip", ctx->symtab, &error);
+    ovs_assert(!error);
+    ctx->prereqs = expr_combine(EXPR_T_AND, ctx->prereqs, expr);
+}
+
+static void
 parse_actions(struct action_context *ctx)
 {
     /* "drop;" by itself is a valid (empty) set of actions, but it can't be
@@ -163,6 +189,10 @@  parse_actions(struct action_context *ctx)
             }
         } else if (lexer_match_id(ctx->lexer, "output")) {
             emit_resubmit(ctx, ctx->output_table_id);
+        } else if (lexer_match_id(ctx->lexer, "ct_next")) {
+            emit_ct(ctx, true, false);
+        } else if (lexer_match_id(ctx->lexer, "ct_commit")) {
+            emit_ct(ctx, false, true);
         } else {
             action_syntax_error(ctx, "expecting action");
         }
@@ -186,6 +216,8 @@  parse_actions(struct action_context *ctx)
  * (as one would provide to expr_to_matches()).  Strings used in the actions
  * that are not in 'ports' are translated to zero.
  *
+ * 'ct_zones' provides a map from a port name to its connection tracking zone.
+ *
  * 'next_table_id' should be the OpenFlow table to which the "next" action will
  * resubmit, or 0 to disable "next".
  *
@@ -204,8 +236,9 @@  parse_actions(struct action_context *ctx)
   */
 char * OVS_WARN_UNUSED_RESULT
 actions_parse(struct lexer *lexer, const struct shash *symtab,
-              const struct simap *ports, uint8_t next_table_id,
-              uint8_t output_table_id, struct ofpbuf *ofpacts,
+              const struct simap *ports, const struct simap *ct_zones,
+              uint8_t next_table_id, uint8_t output_table_id,
+              struct ofpbuf *ofpacts,
               struct expr **prereqsp)
 {
     size_t ofpacts_start = ofpacts->size;
@@ -214,6 +247,7 @@  actions_parse(struct lexer *lexer, const struct shash *symtab,
     ctx.lexer = lexer;
     ctx.symtab = symtab;
     ctx.ports = ports;
+    ctx.ct_zones = ct_zones;
     ctx.next_table_id = next_table_id;
     ctx.output_table_id = output_table_id;
     ctx.error = NULL;
@@ -236,16 +270,16 @@  actions_parse(struct lexer *lexer, const struct shash *symtab,
 /* Like actions_parse(), but the actions are taken from 's'. */
 char * OVS_WARN_UNUSED_RESULT
 actions_parse_string(const char *s, const struct shash *symtab,
-                     const struct simap *ports, uint8_t next_table_id,
-                     uint8_t output_table_id, struct ofpbuf *ofpacts,
-                     struct expr **prereqsp)
+                     const struct simap *ports, const struct simap *ct_zones,
+                     uint8_t next_table_id, uint8_t output_table_id,
+                     struct ofpbuf *ofpacts, struct expr **prereqsp)
 {
     struct lexer lexer;
     char *error;
 
     lexer_init(&lexer, s);
     lexer_get(&lexer);
-    error = actions_parse(&lexer, symtab, ports, next_table_id,
+    error = actions_parse(&lexer, symtab, ports, ct_zones, next_table_id,
                           output_table_id, ofpacts, prereqsp);
     lexer_destroy(&lexer);
 
diff --git a/ovn/lib/actions.h b/ovn/lib/actions.h
index 74cd185..377b273 100644
--- a/ovn/lib/actions.h
+++ b/ovn/lib/actions.h
@@ -27,14 +27,15 @@  struct shash;
 struct simap;
 
 char *actions_parse(struct lexer *, const struct shash *symtab,
-                    const struct simap *ports, uint8_t next_table_id,
-                    uint8_t output_table_id, struct ofpbuf *ofpacts,
-                    struct expr **prereqsp)
+                    const struct simap *ports, const struct simap *ct_zones,
+                    uint8_t next_table_id, uint8_t output_table_id,
+                    struct ofpbuf *ofpacts, struct expr **prereqsp)
     OVS_WARN_UNUSED_RESULT;
 char *actions_parse_string(const char *s, const struct shash *symtab,
-                           const struct simap *ports, uint8_t next_table_id,
-                           uint8_t output_table_id, struct ofpbuf *ofpacts,
-                           struct expr **prereqsp)
+                           const struct simap *ports,
+                           const struct simap *ct_zones,
+                           uint8_t next_table_id, uint8_t output_table_id,
+                           struct ofpbuf *ofpacts, struct expr **prereqsp)
     OVS_WARN_UNUSED_RESULT;
 
 #endif /* ovn/actions.h */
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index d70ba39..f1dfa10 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -60,6 +60,8 @@  static const char *default_db(void);
  * These must be listed in the order that the stages will be executed. */
 #define INGRESS_STAGES                         \
     INGRESS_STAGE(PORT_SEC, port_sec)          \
+    INGRESS_STAGE(PRE_ACL, pre_acl)            \
+    INGRESS_STAGE(ACL, acl)                    \
     INGRESS_STAGE(L2_LKUP, l2_lkup)
 
 enum ingress_stage {
@@ -72,8 +74,9 @@  enum ingress_stage {
 /* Egress pipeline stages.
  *
  * These must be listed in the order that the stages will be executed. */
-#define EGRESS_STAGES                         \
-    EGRESS_STAGE(ACL, acl)                    \
+#define EGRESS_STAGES                          \
+    EGRESS_STAGE(PRE_ACL, pre_acl)             \
+    EGRESS_STAGE(ACL, acl)                     \
     EGRESS_STAGE(PORT_SEC, port_sec)
 
 enum egress_stage {
@@ -723,6 +726,118 @@  lport_is_enabled(const struct nbrec_logical_port *lport)
     return !lport->enabled || *lport->enabled;
 }
 
+static bool
+has_stateful_acl(struct ovn_datapath *od)
+{
+    for (size_t i = 0; i < od->nb->n_acls; i++) {
+        struct nbrec_acl *acl = od->nb->acls[i];
+        if (!strcmp(acl->action, "allow-related")) {
+            return true;
+        }
+    }
+
+    return false;
+}
+
+static void
+build_acls(struct ovn_datapath *od, struct hmap *lflows)
+{
+    /* Ingress and Egress Pre-ACL Table (Priority 0): Packets are
+     * allowed by default. */
+    ovn_lflow_add(lflows, od, P_IN, S_IN_PRE_ACL, 0, "1", "next;");
+    ovn_lflow_add(lflows, od, P_OUT, S_OUT_PRE_ACL, 0, "1", "next;");
+
+    /* Ingress and Egress ACL Table (Priority 0): Packets are allowed by
+     * default. */
+    ovn_lflow_add(lflows, od, P_IN, S_IN_ACL, 0, "1", "next;");
+    ovn_lflow_add(lflows, od, P_OUT, S_OUT_ACL, 0, "1", "next;");
+
+    /* If there are any stateful ACL rules, we must send all IP packets
+     * through the conntrack action, which handles defragmentation, in
+     * order to match L4 headers. */
+    if (has_stateful_acl(od)) {
+        /* Ingress and Egress Pre-ACL Table (Priority 100).
+         *
+         * Regardless of whether the ACL is "from-lport" or "to-lport",
+         * we need rules in both the ingress and egress table, because
+         * the return traffic needs to be followed. */
+        ovn_lflow_add(lflows, od, P_IN, S_IN_PRE_ACL, 100,
+                      "ip", "ct_next;");
+        ovn_lflow_add(lflows, od, P_OUT, S_OUT_PRE_ACL, 100,
+                      "ip", "ct_next;");
+
+        /* Ingress and Egress ACL Table (Priority 65535).
+         *
+         * Always drop traffic that's in an invalid state.  This is
+         * enforced at a higher priority than ACLs can be defined. */
+        ovn_lflow_add(lflows, od, P_IN, S_IN_ACL, UINT16_MAX,
+                      "ct.inv", "drop;");
+        ovn_lflow_add(lflows, od, P_OUT, S_OUT_ACL, UINT16_MAX,
+                      "ct.inv", "drop;");
+
+        /* Ingress and Egress ACL Table (Priority 65535).
+         *
+         * Always allow traffic that is established to a committed
+         * conntrack entry.  This is enforced at a higher priority than
+         * ACLs can be defined. */
+        ovn_lflow_add(lflows, od, P_IN, S_IN_ACL, UINT16_MAX,
+                      "ct.est && !ct.rel && !ct.new && !ct.inv",
+                      "next;");
+        ovn_lflow_add(lflows, od, P_OUT, S_OUT_ACL, UINT16_MAX,
+                      "ct.est && !ct.rel && !ct.new && !ct.inv",
+                      "next;");
+
+        /* Ingress and Egress ACL Table (Priority 65535).
+         *
+         * Always allow traffic that is related to an existing conntrack
+         * entry.  This is enforced at a higher priority than ACLs can
+         * be defined.
+         *
+         * NOTE: This does not support related data sessions (eg,
+         * a dynamically negotiated FTP data channel), but will allow
+         * related traffic such as an ICMP Port Unreachable through
+         * that's generated from a non-listening UDP port.  */
+        ovn_lflow_add(lflows, od, P_IN, S_IN_ACL, UINT16_MAX,
+                      "!ct.est && ct.rel && !ct.new && !ct.inv",
+                      "next;");
+        ovn_lflow_add(lflows, od, P_OUT, S_OUT_ACL, UINT16_MAX,
+                      "!ct.est && ct.rel && !ct.new && !ct.inv",
+                      "next;");
+    }
+
+    /* Ingress or Egress ACL Table (Various priorities). */
+    for (size_t i = 0; i < od->nb->n_acls; i++) {
+        struct nbrec_acl *acl = od->nb->acls[i];
+        bool ingress = !strcmp(acl->direction, "from-lport") ? true :false;
+        enum ovn_pipeline pipeline = ingress ? P_IN : P_OUT;
+        uint8_t stage = ingress ? S_IN_ACL : S_OUT_ACL;
+
+        if (!strcmp(acl->action, "allow")) {
+            ovn_lflow_add(lflows, od, pipeline, stage, acl->priority,
+                          acl->match, "next;");
+        } else if (!strcmp(acl->action, "allow-related")) {
+            struct ds match = DS_EMPTY_INITIALIZER;
+
+            /* Commit the connection tracking entry, which allow all
+             * other traffic related to this entry to flow due to the
+             * 65535 priority flow defined earlier. */
+            ds_put_format(&match, "ct.new && %s", acl->match);
+            ovn_lflow_add(lflows, od, pipeline, stage, acl->priority,
+                          ds_cstr(&match), "ct_commit; next;");
+
+            ds_destroy(&match);
+        } else if (!strcmp(acl->action, "drop")) {
+            ovn_lflow_add(lflows, od, pipeline, stage, acl->priority,
+                          acl->match, "drop;");
+        } else if (!strcmp(acl->action, "reject")) {
+            /* xxx Need to support "reject". */
+            VLOG_INFO("reject is not a supported action");
+            ovn_lflow_add(lflows, od, pipeline, stage, acl->priority,
+                          acl->match, "drop;");
+        }
+    }
+}
+
 /* Updates the Logical_Flow and Multicast_Group tables in the OVN_SB database,
  * constructing their contents based on the OVN_NB database. */
 static void
@@ -764,7 +879,7 @@  build_lflows(struct northd_context *ctx, struct hmap *datapaths,
         ds_destroy(&match);
     }
 
-    /* Ingress table 1: Destination lookup, broadcast and multicast handling
+    /* Ingress table 3: Destination lookup, broadcast and multicast handling
      * (priority 100). */
     HMAP_FOR_EACH (op, key_node, ports) {
         if (lport_is_enabled(op->nb)) {
@@ -776,7 +891,7 @@  build_lflows(struct northd_context *ctx, struct hmap *datapaths,
                       "outport = \""MC_FLOOD"\"; output;");
     }
 
-    /* Ingress table 1: Destination lookup, unicast handling (priority 50), */
+    /* Ingress table 3: Destination lookup, unicast handling (priority 50), */
     HMAP_FOR_EACH (op, key_node, ports) {
         for (size_t i = 0; i < op->nb->n_macs; i++) {
             uint8_t mac[ETH_ADDR_LEN];
@@ -807,7 +922,7 @@  build_lflows(struct northd_context *ctx, struct hmap *datapaths,
         }
     }
 
-    /* Ingress table 1: Destination lookup for unknown MACs (priority 0). */
+    /* Ingress table 3: Destination lookup for unknown MACs (priority 0). */
     HMAP_FOR_EACH (od, key_node, datapaths) {
         if (od->has_unknown) {
             ovn_lflow_add(&lflows, od, P_IN, S_IN_L2_LKUP, 0, "1",
@@ -815,31 +930,14 @@  build_lflows(struct northd_context *ctx, struct hmap *datapaths,
         }
     }
 
-    /* Egress table 0: ACLs (any priority). */
-    HMAP_FOR_EACH (od, key_node, datapaths) {
-        for (size_t i = 0; i < od->nb->n_acls; i++) {
-            const struct nbrec_acl *acl = od->nb->acls[i];
-            const char *action;
-
-            action = (!strcmp(acl->action, "allow") ||
-                      !strcmp(acl->action, "allow-related"))
-                ? "next;" : "drop;";
-            ovn_lflow_add(&lflows, od, P_OUT, S_OUT_ACL, acl->priority,
-                          acl->match, action);
-        }
-    }
-    HMAP_FOR_EACH (od, key_node, datapaths) {
-        ovn_lflow_add(&lflows, od, P_OUT, S_OUT_ACL, 0, "1", "next;");
-    }
-
-    /* Egress table 1: Egress port security multicast/broadcast (priority
+    /* Egress table 2: Egress port security multicast/broadcast (priority
      * 100). */
     HMAP_FOR_EACH (od, key_node, datapaths) {
         ovn_lflow_add(&lflows, od, P_OUT, S_OUT_PORT_SEC, 100, "eth.dst[40]",
                       "output;");
     }
 
-    /* Egress table 1: Egress port security (priority 50). */
+    /* Egress table 2: Egress port security (priority 50). */
     HMAP_FOR_EACH (op, key_node, ports) {
         struct ds match;
 
@@ -857,6 +955,12 @@  build_lflows(struct northd_context *ctx, struct hmap *datapaths,
         ds_destroy(&match);
     }
 
+    /* Build pre-ACL and ACL tables for both ingress and egress.
+     * Ingress tables 1 and 2.  Egress tables 0 and 1. */
+    HMAP_FOR_EACH (od, key_node, datapaths) {
+        build_acls(od, &lflows);
+    }
+
     /* Push changes to the Logical_Flow table to database. */
     const struct sbrec_logical_flow *sbflow, *next_sbflow;
     SBREC_LOGICAL_FLOW_FOR_EACH_SAFE (sbflow, next_sbflow, ctx->ovnsb_idl) {
diff --git a/ovn/ovn-architecture.7.xml b/ovn/ovn-architecture.7.xml
index 2424836..7252f5d 100644
--- a/ovn/ovn-architecture.7.xml
+++ b/ovn/ovn-architecture.7.xml
@@ -642,6 +642,14 @@ 
       tunnels as part of the tunnel key.)
     </dd>
 
+    <dt>conntrack zone field</dt>
+    <dd>
+      A field that denotes the connection tracking zone.  The value only
+      has local significance and is not meaningful between chassis.
+      This is initialized to 0 at the beginning of the logical ingress
+      pipeline.  OVN stores this in Nicira extension register number 5.
+    </dd>
+
     <dt>VLAN ID</dt>
     <dd>
       The VLAN ID is used as an interface between OVN and containers nested
diff --git a/ovn/ovn-sb.xml b/ovn/ovn-sb.xml
index 8102eb3..94ef532 100644
--- a/ovn/ovn-sb.xml
+++ b/ovn/ovn-sb.xml
@@ -287,10 +287,11 @@ 
       <code>inport</code> to <code>outport</code>; if they are equal, it treats
       the <code>output</code> as a no-op.  In the common case, where they are
       different, the packet enters the egress pipeline.  This transition to the
-      egress pipeline discards register data, e.g. <code>reg0</code>
-      ... <code>reg5</code>, to achieve uniform behavior regardless of whether
-      the egress pipeline is on a different hypervisor (because registers
-      aren't preserve across tunnel encapsulation).
+      egress pipeline discards register data, e.g. <code>reg0</code> ...
+      <code>reg4</code> and connection tracking state, to achieve
+      uniform behavior regardless of whether the egress pipeline is on a
+      different hypervisor (because registers aren't preserve across
+      tunnel encapsulation).
     </p>
 
     <p>
@@ -674,7 +675,7 @@ 
       </p>
 
       <ul>
-        <li><code>reg0</code>...<code>reg5</code></li>
+        <li><code>reg0</code>...<code>reg4</code></li>
         <li><code>inport</code> <code>outport</code></li>
         <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
         <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
@@ -688,6 +689,12 @@ 
         <li><code>icmp4.type</code> <code>icmp4.code</code></li>
         <li><code>icmp6.type</code> <code>icmp6.code</code></li>
         <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
+        <li><code>ct_state</code> can take the following shortcuts:</li>
+        <li><code>   ct.new</code> New flow</li>
+        <li><code>   ct.est</code> Established flow</li>
+        <li><code>   ct.rel</code> Related flow</li>
+        <li><code>   ct.rpl</code> Reply flow</li>
+        <li><code>   ct.inv</code> Connection entry in a bad state</li>
       </ul>
 
     </column>
@@ -769,6 +776,26 @@ 
             pipeline.
           </p>
 	</dd>
+
+        <dt><code>ct_next;</code></dt>
+        <dd>
+          Apply connection tracking to the flow.  After a call to
+          <code>ct_next</code>, the <code>ct_state</code> field is
+          available to match.  As a side effect, IP fragments will be
+          reassembled for matching.  If a fragmented packet is output,
+          then it will be sent with any overlapping fragments squashed.
+          The connection tracking state is scoped by the logical port,
+          so overlapping addresses may be used.  To allow traffic
+          related to the matched flow, <code>ct_commit</code> must be
+          called.  After this action is used, the next logical datapath
+          table will be executed.
+        </dd>
+
+        <dt><code>ct_commit;</code></dt>
+        <dd>
+          Commit the flow to the connection tracking entry associated
+          with it by a previous call to <code>commit</code>.
+        </dd>
       </dl>
 
       <p>
@@ -784,8 +811,6 @@ 
 
         <dt><code>learn</code></dt>
 
-        <dt><code>conntrack</code></dt>
-
         <dt><code>dec_ttl { <var>action</var>, </code>...<code> } { <var>action</var>; </code>...<code>};</code></dt>
         <dd>
           decrement TTL; execute first set of actions if
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 60b87de..4f8b80d 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -1121,7 +1121,7 @@  static void
 test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
 {
     struct shash symtab;
-    struct simap ports;
+    struct simap ports, ct_zones;
     struct ds input;
 
     create_symtab(&symtab);
@@ -1130,6 +1130,7 @@  test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
     simap_put(&ports, "eth0", 5);
     simap_put(&ports, "eth1", 6);
     simap_put(&ports, "LOCAL", ofp_to_u16(OFPP_LOCAL));
+    simap_init(&ct_zones);
 
     ds_init(&input);
     while (!ds_get_test_line(&input, stdin)) {
@@ -1138,8 +1139,8 @@  test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
         char *error;
 
         ofpbuf_init(&ofpacts, 0);
-        error = actions_parse_string(ds_cstr(&input), &symtab, &ports, 11, 64,
-                                     &ofpacts, &prereqs);
+        error = actions_parse_string(ds_cstr(&input), &symtab, &ports,
+                                     &ct_zones, 11, 64, &ofpacts, &prereqs);
         if (!error) {
             struct ds output;
 
@@ -1165,6 +1166,7 @@  test_parse_actions(struct ovs_cmdl_context *ctx OVS_UNUSED)
     ds_destroy(&input);
 
     simap_destroy(&ports);
+    simap_destroy(&ct_zones);
     expr_symtab_destroy(&symtab);
     shash_destroy(&symtab);
 }