[ovs-dev,2/2] dpctl: Support flush conntrack by conntrack 5-tuple

Message ID 1510603980-62407-2-git-send-email-yihung.wei@gmail.com
State New
Headers show
Series
  • [ovs-dev,1/2] ct-dpif, dpif-netlink: Support conntrack flush by ct 5-tuple
Related show

Commit Message

Yi-Hung Wei Nov. 13, 2017, 8:13 p.m.
With this patch, ovs-dpctl flush-conntrack accepts a conntrack 5-tuple
parameter to delete the conntrack entry specified by the 5-tuple.
For example, user can use the following command to flush a conntrack entry
in zone 5.

$ ovs-dpctl flush-conntrack zone=5 \
    'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'
$ ovs-appctl dpctl/flush-conntrack zone=5 \
    'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'

VMWare-BZ: #1983178
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
---
 NEWS                             |  2 +
 lib/ct-dpif.c                    | 81 ++++++++++++++++++++++++++++++++++++++++
 lib/ct-dpif.h                    |  1 +
 lib/dpctl.c                      | 23 +++++++++++-
 lib/dpctl.man                    | 14 ++++++-
 tests/system-kmod-macros.at      |  8 ++++
 tests/system-traffic.at          | 46 +++++++++++++++++++++++
 tests/system-userspace-macros.at | 10 +++++
 utilities/ovs-dpctl.c            |  4 +-
 9 files changed, 183 insertions(+), 6 deletions(-)

Comments

Justin Pettit Nov. 16, 2017, 1:28 a.m. | #1
> On Nov 13, 2017, at 12:13 PM, Yi-Hung Wei <yihung.wei@gmail.com> wrote:
> 
> With this patch, ovs-dpctl flush-conntrack accepts a conntrack 5-tuple

I'd also mention ovs-appctl, since ovs-dpctl doesn't work on all platforms.

> diff --git a/NEWS b/NEWS
> index 047f34b9f402..800eb84cd24f 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -11,6 +11,8 @@ Post-v2.8.0
>          IPv6 packets.
>    - Linux kernel 4.13
>      * Add support for compiling OVS with the latest Linux 4.13 kernel
> +   - "ovs-dpctl flush-conntrack" now accepts conntrack 5-tuple to delete a
> +     connection tracking entry.

I would also mention the ovs-appctl, too.  Maybe:

	"flush-conntrack" in ovs-dpctl and ovs-appctl now accept a 5-tuple to delete a specific connection tracking entry.

> diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
> index a9f58f4eb449..bdd9ef5d2551 100644
> --- a/lib/ct-dpif.c
> +++ b/lib/ct-dpif.c
> @@ -20,6 +20,7 @@
> #include <errno.h>
> 
> #include "ct-dpif.h"
> +#include "openvswitch/ofp-parse.h"
> #include "openvswitch/vlog.h"
> 
> VLOG_DEFINE_THIS_MODULE(ct_dpif);
> @@ -434,3 +435,83 @@ ct_dpif_format_tcp_stat(struct ds * ds, int tcp_state, int conn_per_state)
>     ds_put_cstr(ds, "]");
>     ds_put_format(ds, "=%u", conn_per_state);
> }
> +
> +/* Parses a specification of a conntrack 5-tuple from 's' into 'tuple'.
> + * Returns true on success. Otherwise, returns false and puts the error msg
> + * in 'ds'.
> + */

We would normally close the function comment on the previous line.  Also, I wouldn't abbreviate "message" in a comment.

> +bool
> +ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char *s, struct ds *ds)
> +{
> ...
> +    if (ipv6_is_zero(&tuple->src.in6) || ipv6_is_zero(&tuple->dst.in6) ||
> +        !tuple->ip_proto || !tuple->src_port || !tuple->dst_port) {
> +        ds_put_cstr(ds, "Miss at least one of the conntrack 5-tuple field");

I think this error message might be a little clearer: "At least one of the conntrack 5-tuple fields is missing."

> +        goto out2;
> +    }
> +
> +    free(copy);
> +    return true;
> +
> +out:
> +    ds_put_format(ds, "Failed to parse field %s", key);
> +out2:
> +    free(copy);
> +    return false;

Minor, but I think these labels could be more descriptive.  What about "error_with_msg" and "error" or something?

> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index 7569189d8f22..682b9bf35a75 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -1331,14 +1331,32 @@ dpctl_flush_conntrack(int argc, const char *argv[],
>                       struct dpctl_params *dpctl_p)
> {
>     struct dpif *dpif;
> +    struct ct_dpif_tuple tuple, *ptuple = NULL;
> +    struct ds ds = DS_EMPTY_INITIALIZER;
>     uint16_t zone, *pzone = NULL;
>     char *name;
>     int error;
> 
> +    /* Parse ct tuple */
> +    if (argc > 1 && ct_dpif_parse_tuple(&tuple, argv[argc-1], &ds)) {
> +        ptuple = &tuple;
> +        argc--;
> +    } else if (argc == 4) {
> +        dpctl_error(dpctl_p, EINVAL, "%s", ds_cstr(&ds));
> +        ds_destroy(&ds);
> +        return EINVAL;
> +    }
> +    ds_destroy(&ds);

I don't think this handles the case of an error parsing 'tuple' when a datapath name and zone isn't provided.  It also reads a little funny, since 'ds' can only have a value if there was an error, but it's called even on the non-error condition.  It's obviously correct, since it's a safe operation, but it's unnecessary.

> +
> +    /* Parse zone */
>     if (argc > 1 && ovs_scan(argv[argc - 1], "zone=%"SCNu16, &zone)) {
>         pzone = &zone;
>         argc--;
> +    } else if (argc == 3) {
> +        dpctl_error(dpctl_p, EINVAL, "Invalid zone or conntrack tuple");
> +        return EINVAL;

Same thing here about this only triggering if there wasn't a datapath name.

> diff --git a/lib/dpctl.man b/lib/dpctl.man
> index 675fe5af4914..f429f1653fa7 100644
> --- a/lib/dpctl.man
> +++ b/lib/dpctl.man
> @@ -217,10 +217,20 @@ are included. With \fB\-\-statistics\fR timeouts and timestamps are
> added to the output.
> .
> .TP
> -\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR]
> -Flushes all the connection entries in the tracker used by \fIdp\fR.
> +\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR] [\fBct-tuple\fR]

Shouldn't "ct-tuple" (and all the references below) be using "\fI", since it should be an argument?

> +Flushes the connection entries in the tracker used by \fIdp\fR based on
> +\fIzone\fR and connection tracking tuple (\fBct-tuple\fR).
> +If \fBct-tuple\fR is not provided, flushes all the connection entries.
> If \fBzone=\fIzone\fR is specified, only flushes the connections in
> \fBzone\fR.
> +.IP
> +If \fBct-tuple\fR is provided, flushes the connection entry specified by
> +\fBct-tuple\fR in \fIzone\fR. \fBZone\fR defaults to \fBzone 0\fR if it is

I don't think you need to bold anything on this line.  What about changing that sentence to "The zone defaults to 0 if it is..."?

> +not provided.
> +Example of a IPv4 \fBct-tuple\fR:

I would say "An example of an IPv4 \fIct-tuple\fR:"

> +"ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=6,ct_tp_src=1,ct_tp_dst=2".

I think it might be easier to read if you added a ".IP" here.

Thanks,

--Justin

Patch

diff --git a/NEWS b/NEWS
index 047f34b9f402..800eb84cd24f 100644
--- a/NEWS
+++ b/NEWS
@@ -11,6 +11,8 @@  Post-v2.8.0
          IPv6 packets.
    - Linux kernel 4.13
      * Add support for compiling OVS with the latest Linux 4.13 kernel
+   - "ovs-dpctl flush-conntrack" now accepts conntrack 5-tuple to delete a
+     connection tracking entry.
 
 v2.8.0 - 31 Aug 2017
 --------------------
diff --git a/lib/ct-dpif.c b/lib/ct-dpif.c
index a9f58f4eb449..bdd9ef5d2551 100644
--- a/lib/ct-dpif.c
+++ b/lib/ct-dpif.c
@@ -20,6 +20,7 @@ 
 #include <errno.h>
 
 #include "ct-dpif.h"
+#include "openvswitch/ofp-parse.h"
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(ct_dpif);
@@ -434,3 +435,83 @@  ct_dpif_format_tcp_stat(struct ds * ds, int tcp_state, int conn_per_state)
     ds_put_cstr(ds, "]");
     ds_put_format(ds, "=%u", conn_per_state);
 }
+
+/* Parses a specification of a conntrack 5-tuple from 's' into 'tuple'.
+ * Returns true on success. Otherwise, returns false and puts the error msg
+ * in 'ds'.
+ */
+bool
+ct_dpif_parse_tuple(struct ct_dpif_tuple *tuple, const char *s, struct ds *ds)
+{
+    char *pos, *key, *value, *copy;
+    memset(tuple, 0, sizeof *tuple);
+
+    pos = copy = xstrdup(s);
+    while (ofputil_parse_key_value(&pos, &key, &value)) {
+        if (!*value) {
+            ds_put_format(ds, "field %s missing value", key);
+            goto out2;
+        }
+
+        if (!strcmp(key, "ct_nw_src") || !strcmp(key, "ct_nw_dst")) {
+            if (tuple->l3_type && tuple->l3_type != AF_INET) {
+                ds_put_cstr(ds, "L3 type set multiple times");
+                goto out2;
+            } else {
+                tuple->l3_type = AF_INET;
+            }
+            if (!ip_parse(value, key[6] == 's' ? &tuple->src.ip :
+                                                 &tuple->dst.ip)) {
+                goto out;
+            }
+        } else if (!strcmp(key, "ct_ipv6_src") ||
+                   !strcmp(key, "ct_ipv6_dst")) {
+            if (tuple->l3_type && tuple->l3_type != AF_INET6) {
+                ds_put_cstr(ds, "L3 type set multiple times");
+                goto out2;
+            } else {
+                tuple->l3_type = AF_INET6;
+            }
+            if (!ipv6_parse(value, key[8] == 's' ? &tuple->src.in6 :
+                                                   &tuple->dst.in6)) {
+                goto out;
+            }
+        } else if (!strcmp(key, "ct_nw_proto")) {
+            char *err = str_to_u8(value, key, &tuple->ip_proto);
+            if (err) {
+                free(err);
+                goto out;
+            }
+        } else if (!strcmp(key, "ct_tp_src") || !strcmp(key,"ct_tp_dst")) {
+            uint16_t port;
+            char *err = str_to_u16(value, key, &port);
+            if (err) {
+                free(err);
+                goto out;
+            }
+            if (key[6] == 's') {
+                tuple->src_port = htons(port);
+            } else {
+                tuple->dst_port = htons(port);
+            }
+        } else {
+            ds_put_format(ds, "Invalid conntrack tuple field: %s", key);
+            goto out2;
+        }
+    }
+
+    if (ipv6_is_zero(&tuple->src.in6) || ipv6_is_zero(&tuple->dst.in6) ||
+        !tuple->ip_proto || !tuple->src_port || !tuple->dst_port) {
+        ds_put_cstr(ds, "Miss at least one of the conntrack 5-tuple field");
+        goto out2;
+    }
+
+    free(copy);
+    return true;
+
+out:
+    ds_put_format(ds, "Failed to parse field %s", key);
+out2:
+    free(copy);
+    return false;
+}
diff --git a/lib/ct-dpif.h b/lib/ct-dpif.h
index ef019050c78e..5e2de53834e8 100644
--- a/lib/ct-dpif.h
+++ b/lib/ct-dpif.h
@@ -203,5 +203,6 @@  void ct_dpif_format_entry(const struct ct_dpif_entry *, struct ds *,
 void ct_dpif_format_tuple(struct ds *, const struct ct_dpif_tuple *);
 uint8_t ct_dpif_coalesce_tcp_state(uint8_t state);
 void ct_dpif_format_tcp_stat(struct ds *, int, int);
+bool ct_dpif_parse_tuple(struct ct_dpif_tuple *, const char *s, struct ds *);
 
 #endif /* CT_DPIF_H */
diff --git a/lib/dpctl.c b/lib/dpctl.c
index 7569189d8f22..682b9bf35a75 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1331,14 +1331,32 @@  dpctl_flush_conntrack(int argc, const char *argv[],
                       struct dpctl_params *dpctl_p)
 {
     struct dpif *dpif;
+    struct ct_dpif_tuple tuple, *ptuple = NULL;
+    struct ds ds = DS_EMPTY_INITIALIZER;
     uint16_t zone, *pzone = NULL;
     char *name;
     int error;
 
+    /* Parse ct tuple */
+    if (argc > 1 && ct_dpif_parse_tuple(&tuple, argv[argc-1], &ds)) {
+        ptuple = &tuple;
+        argc--;
+    } else if (argc == 4) {
+        dpctl_error(dpctl_p, EINVAL, "%s", ds_cstr(&ds));
+        ds_destroy(&ds);
+        return EINVAL;
+    }
+    ds_destroy(&ds);
+
+    /* Parse zone */
     if (argc > 1 && ovs_scan(argv[argc - 1], "zone=%"SCNu16, &zone)) {
         pzone = &zone;
         argc--;
+    } else if (argc == 3) {
+        dpctl_error(dpctl_p, EINVAL, "Invalid zone or conntrack tuple");
+        return EINVAL;
     }
+
     /* The datapath name is not a mandatory parameter for this command.
      * If it is not specified - so argc < 2 - we retrieve it from the
      * current setup, assuming only one exists. */
@@ -1353,7 +1371,7 @@  dpctl_flush_conntrack(int argc, const char *argv[],
         return error;
     }
 
-    error = ct_dpif_flush(dpif, pzone, NULL);
+    error = ct_dpif_flush(dpif, pzone, ptuple);
 
     dpif_close(dpif);
     return error;
@@ -1902,7 +1920,8 @@  static const struct dpctl_command all_commands[] = {
     { "del-flow", "[dp] flow", 1, 2, dpctl_del_flow, DP_RW },
     { "del-flows", "[dp]", 0, 1, dpctl_del_flows, DP_RW },
     { "dump-conntrack", "[dp] [zone=N]", 0, 2, dpctl_dump_conntrack, DP_RO },
-    { "flush-conntrack", "[dp] [zone=N]", 0, 2, dpctl_flush_conntrack, DP_RW },
+    { "flush-conntrack", "[dp] [zone=N] [ct-tuple]", 0, 3,
+      dpctl_flush_conntrack, DP_RW },
     { "ct-stats-show", "[dp] [zone=N] [verbose]",
       0, 3, dpctl_ct_stats_show, DP_RO },
     { "ct-bkts", "[dp] [gt=N]", 0, 2, dpctl_ct_bkts, DP_RO },
diff --git a/lib/dpctl.man b/lib/dpctl.man
index 675fe5af4914..f429f1653fa7 100644
--- a/lib/dpctl.man
+++ b/lib/dpctl.man
@@ -217,10 +217,20 @@  are included. With \fB\-\-statistics\fR timeouts and timestamps are
 added to the output.
 .
 .TP
-\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR]
-Flushes all the connection entries in the tracker used by \fIdp\fR.
+\*(DX\fBflush\-conntrack\fR [\fIdp\fR] [\fBzone=\fIzone\fR] [\fBct-tuple\fR]
+Flushes the connection entries in the tracker used by \fIdp\fR based on
+\fIzone\fR and connection tracking tuple (\fBct-tuple\fR).
+If \fBct-tuple\fR is not provided, flushes all the connection entries.
 If \fBzone=\fIzone\fR is specified, only flushes the connections in
 \fBzone\fR.
+.IP
+If \fBct-tuple\fR is provided, flushes the connection entry specified by
+\fBct-tuple\fR in \fIzone\fR. \fBZone\fR defaults to \fBzone 0\fR if it is
+not provided.
+Example of a IPv4 \fBct-tuple\fR:
+"ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=6,ct_tp_src=1,ct_tp_dst=2".
+Refer to \fBovs-fields(7)\fR for a detailed introduction to conntrack tuple
+fields.
 .
 .TP
 \*(DX\fBct\-stats\-show\fR [\fIdp\fR] [\fBzone=\fIzone\fR] [\fBverbose\fR]
diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index 27498341a9f8..cfe5a73cac70 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -96,3 +96,11 @@  m4_define([CHECK_CONNTRACK_LOCAL_STACK])
 # always supports NAT, so no check is needed.
 #
 m4_define([CHECK_CONNTRACK_NAT])
+
+# CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE()
+#
+# Perform requirements checks for running ovs-dpctl flush-conntrack by
+# conntrack 5-tuple test. The kernel datapath does support this
+# feature. Will remove this check after both kernel and userspace
+# support it.
+m4_define([CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE])
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index fd7b6121b04f..846ed233babd 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -832,6 +832,52 @@  udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),reply=(src=10.1.1.2,dst=10.
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - ct flush by 5-tuple])
+CHECK_CONNTRACK()
+CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+AT_DATA([flows.txt], [dnl
+priority=1,action=drop
+priority=10,arp,action=normal
+priority=100,in_port=1,udp,action=ct(commit),2
+priority=100,in_port=2,udp,action=ct(zone=5,commit),1
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl Start a new connection from port 1.
+AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101010a0101020001000200080000 actions=resubmit(,0)"])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.1,"], [], [dnl
+udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),reply=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1)
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack 'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.1,"], [1], [dnl
+])
+
+dnl Start a new connection from port 2.
+AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=2 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000 actions=resubmit(,0)"])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [], [dnl
+udp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),zone=5
+])
+
+AT_CHECK([ovs-appctl dpctl/flush-conntrack zone=5 'ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2'])
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [1], [dnl
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([conntrack - IPv4 ping])
 CHECK_CONNTRACK()
 OVS_TRAFFIC_VSWITCHD_START()
diff --git a/tests/system-userspace-macros.at b/tests/system-userspace-macros.at
index f3337f04499a..efed70f010d0 100644
--- a/tests/system-userspace-macros.at
+++ b/tests/system-userspace-macros.at
@@ -99,3 +99,13 @@  m4_define([CHECK_CONNTRACK_LOCAL_STACK],
 # datapath supports NAT.
 #
 m4_define([CHECK_CONNTRACK_NAT])
+
+# CHECK_CT_DPIF_FLUSH_BY_TUPLE()
+#
+# Perform requirements checks for running ovs-dpctl flush-conntrack by
+# conntrack 5-tuple test. The userspace datapath does not support
+# this feature yet.
+m4_define([CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE],
+[
+    AT_SKIP_IF([:])
+])
diff --git a/utilities/ovs-dpctl.c b/utilities/ovs-dpctl.c
index 7b005ace3f4e..ef2daf6fb192 100644
--- a/utilities/ovs-dpctl.c
+++ b/utilities/ovs-dpctl.c
@@ -198,8 +198,8 @@  usage(void *userdata OVS_UNUSED)
            "  del-flows [DP]             delete all flows from DP\n"
            "  dump-conntrack [DP] [zone=ZONE]  " \
                "display conntrack entries for ZONE\n"
-           "  flush-conntrack [DP] [zone=ZONE] " \
-               "delete all conntrack entries in ZONE\n"
+           "  flush-conntrack [DP] [zone=ZONE] [ct-tuple]" \
+               "delete matched conntrack entries in ZONE\n"
            "  ct-stats-show [DP] [zone=ZONE] [verbose] " \
                "CT connections grouped by protocol\n"
            "  ct-bkts [DP] [gt=N] display connections per CT bucket\n"