[ovs-dev,v4,2/4] revalidator: Gather packets-per-second rate of flows

Message ID 20180712072934.13001-3-sriharsha.basavapatna@broadcom.com
State Changes Requested
Headers show
Series
  • Support dynamic rebalancing of offloaded flows
Related show

Commit Message

Sriharsha Basavapatna via dev July 12, 2018, 7:29 a.m.
This is the second patch in the patch-set to support dynamic rebalancing
of offloaded flows.

The packets-per-second (pps) rate for each flow is computed in the context
of revalidator threads when the flow stats are retrieved. The pps-rate is
computed only after a flow is revalidated and is not scheduled for
deletion. The parameters used to compute pps and the pps itself are saved
in udpif_key since they need to be persisted across iterations of
rebalancing.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
---
 lib/dpif-provider.h           |   1 +
 ofproto/ofproto-dpif-upcall.c | 152 ++++++++++++++++++++++++++++++++++
 2 files changed, 153 insertions(+)

Comments

Ben Pfaff Aug. 31, 2018, 5:52 p.m. | #1
On Thu, Jul 12, 2018 at 12:59:32PM +0530, Sriharsha Basavapatna via dev wrote:
> This is the second patch in the patch-set to support dynamic rebalancing
> of offloaded flows.
>
> The packets-per-second (pps) rate for each flow is computed in the context
> of revalidator threads when the flow stats are retrieved. The pps-rate is
> computed only after a flow is revalidated and is not scheduled for
> deletion. The parameters used to compute pps and the pps itself are saved
> in udpif_key since they need to be persisted across iterations of
> rebalancing.
> 
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
> Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>

Thanks for the patch.  I have some comments.

This code takes references to netdevs and stores them in udpif_key, but
it never releases them.

This code uses "float" but usually that's slower than "double" on modern
hardware.  If floating point is needed, I recommend double.  What is
your reasoning for using floating point, by the way?  Especially given
that the flow time delta is being rounded down to an integer number of
seconds, I don't yet see the need.

dpif_flow_to_netdevs() seems to want to treat flows with multiple
outputs differently, but I don't think it does because it breaks out of
the loop as soon as it finds the first output.

The variable 'forward' in dpif_flow_to_netdevs() doesn't seem needed
because forward == true is the same as out_port != ODPP_NONE.

In dpif_flow_to_netdevs(), the assignments in this 'if' statement seem
redundant with those at the top of the function:

            if (forward) {
                ukey->out_netdev = NULL;
                ukey->in_netdev = NULL;
                return;
            }

In dpif_flow_to_netdevs(), it seems that one could use
nl_attr_get_odp_port() here:
            in_port = *(odp_port_t *)nl_attr_get(k);

The 'type' var in the second NL_ATTR_FOR_EACH in dpif_flow_to_netdevs()
could be declared as the more specific type enum ovs_key_attr.

In udpif_update_flow_pps(), I would omit the inner parentheses in both
of these places:

    if ((udpif->dpif->current_ms - ukey->flow_time) <= OFFL_REBAL_INTVL_MSEC) {
        return;
    }
    if ((f->stats.n_packets + ukey->flow_backlog_packets) <
        ukey->flow_packets) {
        return;
    }

It's not clear to me that udpif_update_flow_pps() should do nothing in
the cases where it bails out.  Is that carefully thought out?

The cast in udpif_flow_packet_delta() is not useful.

The temporary dpif_flow used in udpif_update_flow_pps() ->
udpif_key_to_flow_netdevs() -> dpif_flow_to_netdevs() is curious.  I am
not sure why the ukey isn't used directly.  There are no other callers.

Thanks,

Ben.
Sriharsha Basavapatna via dev Sept. 12, 2018, 2:05 p.m. | #2
On Fri, Aug 31, 2018 at 11:22 PM, Ben Pfaff <blp@ovn.org> wrote:
> On Thu, Jul 12, 2018 at 12:59:32PM +0530, Sriharsha Basavapatna via dev wrote:
>> This is the second patch in the patch-set to support dynamic rebalancing
>> of offloaded flows.
>>
>> The packets-per-second (pps) rate for each flow is computed in the context
>> of revalidator threads when the flow stats are retrieved. The pps-rate is
>> computed only after a flow is revalidated and is not scheduled for
>> deletion. The parameters used to compute pps and the pps itself are saved
>> in udpif_key since they need to be persisted across iterations of
>> rebalancing.
>>
>> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
>> Co-authored-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
>> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
>> Reviewed-by: Sathya Perla <sathya.perla@broadcom.com>
>
> Thanks for the patch.  I have some comments.

Thanks for your review comments. Please see my response inline.
>
> This code takes references to netdevs and stores them in udpif_key, but
> it never releases them.

Thanks for catching this; fixed it.
>
> This code uses "float" but usually that's slower than "double" on modern
> hardware.  If floating point is needed, I recommend double.  What is
> your reasoning for using floating point, by the way?  Especially given
> that the flow time delta is being rounded down to an integer number of
> seconds, I don't yet see the need.

The rate computation originally was using per millisec rate (ppms) and
the float type was needed to distinguish flows with low ppms rate (< 1000
per sec or < 1 per ms). Like you pointed out we don't need it now that
we are using per-second rate. I've changed it to an uint64.
>
> dpif_flow_to_netdevs() seems to want to treat flows with multiple
> outputs differently, but I don't think it does because it breaks out of
> the loop as soon as it finds the first output.

We just want to check that the flow as an output action (and port).
That's the reason why it breaks out as soon as it finds the first output
port. I've updated comments explaining it.
>
> The variable 'forward' in dpif_flow_to_netdevs() doesn't seem needed
> because forward == true is the same as out_port != ODPP_NONE.

removed this code.
>
> In dpif_flow_to_netdevs(), the assignments in this 'if' statement seem
> redundant with those at the top of the function:
>
>             if (forward) {
>                 ukey->out_netdev = NULL;
>                 ukey->in_netdev = NULL;
>                 return;
>             }

removed.
>
> In dpif_flow_to_netdevs(), it seems that one could use
> nl_attr_get_odp_port() here:
>             in_port = *(odp_port_t *)nl_attr_get(k);

done
>
> The 'type' var in the second NL_ATTR_FOR_EACH in dpif_flow_to_netdevs()
> could be declared as the more specific type enum ovs_key_attr.

done
>
> In udpif_update_flow_pps(), I would omit the inner parentheses in both
> of these places:
>
>     if ((udpif->dpif->current_ms - ukey->flow_time) <= OFFL_REBAL_INTVL_MSEC) {
>         return;
>     }
>     if ((f->stats.n_packets + ukey->flow_backlog_packets) <
>         ukey->flow_packets) {
>         return;
>     }

done
>
> It's not clear to me that udpif_update_flow_pps() should do nothing in
> the cases where it bails out.  Is that carefully thought out?

Yes, the first one is needed; updated comments. The second one should have
been an assert(), changed it.
>
> The cast in udpif_flow_packet_delta() is not useful.

done
>
> The temporary dpif_flow used in udpif_update_flow_pps() ->
> udpif_key_to_flow_netdevs() -> dpif_flow_to_netdevs() is curious.  I am
> not sure why the ukey isn't used directly.  There are no other callers.

Thanks for suggesting this. Removed tmp dpif_flow and started using ukey
directly; simplified it into a single routine - ukey_to_flow_netdevs().
>
> Thanks,
>
> Ben.

Thanks,
-Harsha

Patch

diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
index 62b3598ac..cc6571b28 100644
--- a/lib/dpif-provider.h
+++ b/lib/dpif-provider.h
@@ -39,6 +39,7 @@  struct dpif {
     char *full_name;
     uint8_t netflow_engine_type;
     uint8_t netflow_engine_id;
+    long long int current_ms;
 };
 
 void dpif_init(struct dpif *, const struct dpif_class *, const char *name,
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 85f579251..355be9f39 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -42,6 +42,8 @@ 
 #include "tunnel.h"
 #include "unixctl.h"
 #include "openvswitch/vlog.h"
+#include "lib/dpif-provider.h"
+#include "lib/netdev-provider.h"
 
 #define MAX_QUEUE_LENGTH 512
 #define UPCALL_MAX_BATCH 64
@@ -304,6 +306,16 @@  struct udpif_key {
 
     uint32_t key_recirc_id;   /* Non-zero if reference is held by the ukey. */
     struct recirc_refs recircs;  /* Action recirc IDs with references held. */
+
+#define OFFL_REBAL_INTVL_MSEC  3000	/* dynamic offload rebalance freq */
+    struct netdev *in_netdev;		/* in_odp_port's netdev */
+    struct netdev *out_netdev;		/* out_odp_port's netdev */
+    struct netdev *tunnel_netdev;	/* tunnel netdev */
+    bool offloaded;			/* True if flow is offloaded */
+    float flow_pps_rate;		/* Packets-Per-Second rate */
+    long long int flow_time;		/* last pps update time */
+    uint64_t flow_packets;		/* #pkts seen in interval */
+    uint64_t flow_backlog_packets;	/* prev-mode #pkts (offl or kernel) */
 };
 
 /* Datapath operation with optional ukey attached. */
@@ -1667,6 +1679,11 @@  ukey_create__(const struct nlattr *key, size_t key_len,
     ukey->stats.used = used;
     ukey->xcache = NULL;
 
+    ukey->offloaded = false;
+    ukey->flow_time = 0;
+    ukey->in_netdev = ukey->out_netdev = ukey->tunnel_netdev = NULL;
+    ukey->flow_packets = ukey->flow_backlog_packets = 0;
+
     ukey->key_recirc_id = key_recirc_id;
     recirc_refs_init(&ukey->recircs);
     if (xout) {
@@ -2442,6 +2459,137 @@  reval_op_init(struct ukey_op *op, enum reval_result result,
     }
 }
 
+/*
+ * Given a dpif_flow, get its input and output ports (netdevs) by parsing
+ * the flow keys and actions. Save them in udpif_key.
+ */
+static void
+dpif_flow_to_netdevs(struct dpif *dpif, struct udpif_key *ukey,
+                     struct dpif_flow *f)
+{
+    const struct dpif_class *dpif_class = dpif->dpif_class;
+    odp_port_t out_port = ODPP_NONE;
+    odp_port_t in_port = ODPP_NONE;
+    const struct nlattr *a;
+    const struct nlattr *k;
+    bool forward = false;
+    unsigned int left;
+
+    ukey->in_netdev = NULL;
+    ukey->out_netdev = NULL;
+
+    /* Capture the output port */
+    NL_ATTR_FOR_EACH (a, left, f->actions, f->actions_len) {
+        enum ovs_action_attr type = nl_attr_type(a);
+        if (type == OVS_ACTION_ATTR_OUTPUT) {
+            /* Only unicast is supported */
+            if (forward) {
+                ukey->out_netdev = NULL;
+                ukey->in_netdev = NULL;
+                return;
+            }
+
+            forward = true;
+            out_port = nl_attr_get_odp_port(a);
+            ukey->out_netdev = netdev_ports_get(out_port, dpif_class);
+            break;
+        }
+    }
+
+    /* Now find the input port */
+    NL_ATTR_FOR_EACH (k, left, f->key, f->key_len) {
+        unsigned int type;
+        struct flow_tnl tnl;
+        enum odp_key_fitness res;
+
+        type = nl_attr_type(k);
+
+        switch (type) {
+        case OVS_KEY_ATTR_IN_PORT:
+            in_port = *(odp_port_t *)nl_attr_get(k);
+            ukey->in_netdev = netdev_ports_get(in_port, dpif_class);
+            break;
+        case OVS_KEY_ATTR_TUNNEL:
+            res = odp_tun_key_from_attr(k, &tnl);
+            if (res == ODP_FIT_ERROR) {
+                ukey->tunnel_netdev = NULL;
+                break;
+            }
+            ukey->tunnel_netdev = flow_get_tunnel_netdev(&tnl);
+            break;
+        default:
+            break;
+        }
+    }
+}
+
+/*
+ * Given a udpif_key and a corresponding dpif_flow, get its input and output
+ * ports (netdevs). The flow may not contain flow attributes if is a terse
+ * dump; read its attributes from the ukey and then parse the flow to get
+ * the port info.
+ */
+static void
+udpif_key_to_flow_netdevs(struct udpif *udpif, struct udpif_key *ukey,
+                          struct dpif_flow *f)
+{
+    const struct nlattr *actions;
+    size_t actions_len;
+
+    /* Read flow keys and masks from ukey */
+    f->key = ukey->key;
+    f->key_len = ukey->key_len;
+    f->mask = ukey->mask;
+    f->mask_len = ukey->mask_len;
+
+    /* Read actions from ukey */
+    ukey_get_actions(ukey, &actions, &actions_len);
+    f->actions = actions;
+    f->actions_len = actions_len;
+
+    /* Get netdev info for the flow now */
+    dpif_flow_to_netdevs(udpif->dpif, ukey, f);
+}
+
+static float
+udpif_flow_packet_delta(struct udpif_key *ukey, const struct dpif_flow *f)
+{
+    return ((float)((f->stats.n_packets + ukey->flow_backlog_packets) -
+                    ukey->flow_packets));
+}
+
+static long long int
+udpif_flow_time_delta(struct udpif *udpif, struct udpif_key *ukey)
+{
+    return (udpif->dpif->current_ms - ukey->flow_time) / 1000;
+}
+
+/* Gather pps-rate for the given dpif_flow and save it in its ukey */
+static void
+udpif_update_flow_pps(struct udpif *udpif, struct udpif_key *ukey,
+                      const struct dpif_flow *f)
+{
+    struct dpif_flow tmp_flow;
+    float pps;
+
+    ukey->offloaded = f->attrs.offloaded;
+    udpif_key_to_flow_netdevs(udpif, ukey, &tmp_flow);
+
+    if ((udpif->dpif->current_ms - ukey->flow_time) <= OFFL_REBAL_INTVL_MSEC) {
+        return;
+    }
+    if ((f->stats.n_packets + ukey->flow_backlog_packets) <
+        ukey->flow_packets) {
+        return;
+    }
+
+    pps = udpif_flow_packet_delta(ukey, f) /
+                    udpif_flow_time_delta(udpif, ukey);
+    ukey->flow_pps_rate = pps;
+    ukey->flow_packets = ukey->flow_backlog_packets + f->stats.n_packets;
+    ukey->flow_time = udpif->dpif->current_ms;
+}
+
 static void
 revalidate(struct revalidator *revalidator)
 {
@@ -2550,6 +2698,10 @@  revalidate(struct revalidator *revalidator)
             }
             ukey->dump_seq = dump_seq;
 
+            if (result != UKEY_DELETE) {
+                udpif_update_flow_pps(udpif, ukey, f);
+            }
+
             if (result != UKEY_KEEP) {
                 /* Takes ownership of 'recircs'. */
                 reval_op_init(&ops[n_ops++], result, udpif, ukey, &recircs,