diff mbox series

[ovs-dev,RFC] revalidator: add a USDT probe after evaluation when flows are deleted.

Message ID 20220624191826.80109-1-ksprague@redhat.com
State Superseded
Headers show
Series [ovs-dev,RFC] revalidator: add a USDT probe after evaluation when flows are deleted. | expand

Checks

Context Check Description
ovsrobot/apply-robot fail apply and check: fail
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Kevin Sprague June 24, 2022, 7:18 p.m. UTC
During normal operations, it is useful to understand when a particular flow
gets removed from the system. This can be useful when debugging performance
issues tied to ofproto flow changes, trying to determine deployed traffic
patterns, or while debugging dynamic systems where ports come and go.

Prior to this change, there was a lack of visibility around flow expiration.
The existing debugging infrastructure could tell us when a flow was added to
the datapath, but not when it was removed or why.

This change introduces a USDT probe at the point where the revalidator
determines that the flow should be removed.  Additionally, we track the
reason for the flow eviction and provide that information as well.  With
this change, we can track the complete flow lifecycle for the netlink datapath
by hooking the upcall tracepoint in kernel, the flow put USDT, and the
revaldiator USDT, letting us watch as flows are added and removed from the
kernel datapath.

This change only enables this information via USDT probe, so it won't be
possible to access this information any other way (see:
Documentation/topics/usdt-probes.rst).

Also included are two scripts (utilities/usdt-scripts/filter_probe.py and
utilities/usdt-scripts/watch_flows.bt) that serve as demonstrations of how
the new USDT probes might be used going forward.

Signed-off-by: Kevin Sprague <ksprague@redhat.com>
---
We are planning to add filter_probe.py to the flake8 check in autotools.
In addition, we are planning to investigate different ways of filtering
flows through eBPF in order to gauge their performance impacts.
 ofproto/ofproto-dpif-upcall.c          |  46 ++++-
 utilities/usdt-scripts/filter_probe.py | 263 +++++++++++++++++++++++++
 utilities/usdt-scripts/watch_flows.bt  | 125 ++++++++++++
 3 files changed, 427 insertions(+), 7 deletions(-)
 create mode 100755 utilities/usdt-scripts/filter_probe.py
 create mode 100755 utilities/usdt-scripts/watch_flows.bt

Comments

0-day Robot June 24, 2022, 7:38 p.m. UTC | #1
Bleep bloop.  Greetings Kevin Sprague, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


build:
mv tests/ovsdb-cluster-testsuite.tmp tests/ovsdb-cluster-testsuite
\
{ sed -n -e '/%AUTHORS%/q' -e p < ./debian/copyright.in;   \
  sed '34,/^$/d' ./AUTHORS.rst |			   \
	sed -n -e '/^$/q' -e 's/^/  /p';			   \
  sed -e '34,/%AUTHORS%/d' ./debian/copyright.in;	   \
} > debian/copyright
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') < ./rhel/openvswitch-dkms.spec.in > openvswitch-dkms.spec.tmp || exit 1; if cmp -s openvswitch-dkms.spec.tmp rhel/openvswitch-dkms.spec; then touch rhel/openvswitch-dkms.spec; rm openvswitch-dkms.spec.tmp; else mv openvswitch-dkms.spec.tmp rhel/openvswitch-dkms.spec; fi
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') < ./rhel/kmod-openvswitch-rhel6.spec.in > kmod-openvswitch-rhel6.spec.tmp || exit 1; if cmp -s kmod-openvswitch-rhel6.spec.tmp rhel/kmod-openvswitch-rhel6.spec; then touch rhel/kmod-openvswitch-rhel6.spec; rm kmod-openvswitch-rhel6.spec.tmp; else mv kmod-openvswitch-rhel6.spec.tmp rhel/kmod-openvswitch-rhel6.spec; fi
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') < ./rhel/openvswitch-kmod-fedora.spec.in > openvswitch-kmod-fedora.spec.tmp || exit 1; if cmp -s openvswitch-kmod-fedora.spec.tmp rhel/openvswitch-kmod-fedora.spec; then touch rhel/openvswitch-kmod-fedora.spec; rm openvswitch-kmod-fedora.spec.tmp; else mv openvswitch-kmod-fedora.spec.tmp rhel/openvswitch-kmod-fedora.spec; fi
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') < ./rhel/openvswitch.spec.in > openvswitch.spec.tmp || exit 1; if cmp -s openvswitch.spec.tmp rhel/openvswitch.spec; then touch rhel/openvswitch.spec; rm openvswitch.spec.tmp; else mv openvswitch.spec.tmp rhel/openvswitch.spec; fi
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') < ./rhel/openvswitch-fedora.spec.in > openvswitch-fedora.spec.tmp || exit 1; if cmp -s openvswitch-fedora.spec.tmp rhel/openvswitch-fedora.spec; then touch rhel/openvswitch-fedora.spec; rm openvswitch-fedora.spec.tmp; else mv openvswitch-fedora.spec.tmp rhel/openvswitch-fedora.spec; fi
(printf '\043 Generated automatically -- do not modify!    -*- buffer-read-only: t -*-\n' && sed -e 's,[@]VERSION[@],2.17.90,g') \
	< ./xenserver/openvswitch-xen.spec.in > openvswitch-xen.spec.tmp || exit 1; \
if cmp -s openvswitch-xen.spec.tmp xenserver/openvswitch-xen.spec; then touch xenserver/openvswitch-xen.spec; rm openvswitch-xen.spec.tmp; else mv openvswitch-xen.spec.tmp xenserver/openvswitch-xen.spec; fi
make[3]: Entering directory `/var/lib/jenkins/jobs/0day_robot_upstream_build_from_pw/workspace/datapath'
make[3]: Leaving directory `/var/lib/jenkins/jobs/0day_robot_upstream_build_from_pw/workspace/datapath'
The following files are in git but not the distribution:
utilities/usdt-scripts/filter_probe.py
utilities/usdt-scripts/watch_flows.bt
make[2]: *** [dist-hook-git] Error 1
make[2]: Leaving directory `/var/lib/jenkins/jobs/0day_robot_upstream_build_from_pw/workspace'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/lib/jenkins/jobs/0day_robot_upstream_build_from_pw/workspace'
make: *** [all] Error 2


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Eelco Chaudron June 27, 2022, 8:29 a.m. UTC | #2
On 24 Jun 2022, at 21:18, Kevin Sprague wrote:

> During normal operations, it is useful to understand when a particular flow
> gets removed from the system. This can be useful when debugging performance
> issues tied to ofproto flow changes, trying to determine deployed traffic
> patterns, or while debugging dynamic systems where ports come and go.
>
> Prior to this change, there was a lack of visibility around flow expiration.
> The existing debugging infrastructure could tell us when a flow was added to
> the datapath, but not when it was removed or why.
>
> This change introduces a USDT probe at the point where the revalidator
> determines that the flow should be removed.  Additionally, we track the
> reason for the flow eviction and provide that information as well.  With
> this change, we can track the complete flow lifecycle for the netlink datapath
> by hooking the upcall tracepoint in kernel, the flow put USDT, and the
> revaldiator USDT, letting us watch as flows are added and removed from the
> kernel datapath.
>
> This change only enables this information via USDT probe, so it won't be
> possible to access this information any other way (see:
> Documentation/topics/usdt-probes.rst).
>
> Also included are two scripts (utilities/usdt-scripts/filter_probe.py and
> utilities/usdt-scripts/watch_flows.bt) that serve as demonstrations of how
> the new USDT probes might be used going forward.

Hi Kevin,

I’ll add this to my (long) review list, and will get you some real feedback.
But below some quick remarks just glancing at PROBE only...

//Eelco


> Signed-off-by: Kevin Sprague <ksprague@redhat.com>
> ---
> We are planning to add filter_probe.py to the flake8 check in autotools.
> In addition, we are planning to investigate different ways of filtering
> flows through eBPF in order to gauge their performance impacts.
>  ofproto/ofproto-dpif-upcall.c          |  46 ++++-
>  utilities/usdt-scripts/filter_probe.py | 263 +++++++++++++++++++++++++
>  utilities/usdt-scripts/watch_flows.bt  | 125 ++++++++++++
>  3 files changed, 427 insertions(+), 7 deletions(-)
>  create mode 100755 utilities/usdt-scripts/filter_probe.py
>  create mode 100755 utilities/usdt-scripts/watch_flows.bt
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 57f94df54..8971edb2d 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -31,6 +31,7 @@
>  #include "openvswitch/list.h"
>  #include "netlink.h"
>  #include "openvswitch/ofpbuf.h"
> +#include "openvswitch/usdt-probes.h"
>  #include "ofproto-dpif-ipfix.h"
>  #include "ofproto-dpif-sflow.h"
>  #include "ofproto-dpif-xlate.h"
> @@ -260,6 +261,17 @@ enum ukey_state {
>  };
>  #define N_UKEY_STATES (UKEY_DELETED + 1)
>
> +enum flow_del_reason {
> +    FLOW_LIVE = 0,
> +    FLOW_TIME_OUT,      /* the flow went unused and was deleted. */
> +    TOO_EXPENSIVE,
> +    FLOW_WILDCARDED,
> +    BAD_ODP_FIT,
> +    ASSOCIATED_OFPROTO,
> +    XLATION_ERROR,
> +    AVOID_CACHING,
> +};
> +
>  /* 'udpif_key's are responsible for tracking the little bit of state udpif
>   * needs to do flow expiration which can't be pulled directly from the
>   * datapath.  They may be created by any handler or revalidator thread at any
> @@ -2202,7 +2214,8 @@ populate_xcache(struct udpif *udpif, struct udpif_key *ukey,
>  static enum reval_result
>  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>                    uint16_t tcp_flags, struct ofpbuf *odp_actions,
> -                  struct recirc_refs *recircs, struct xlate_cache *xcache)
> +                  struct recirc_refs *recircs, struct xlate_cache *xcache,
> +                  enum flow_del_reason *reason)
>  {
>      struct xlate_out *xoutp;
>      struct netflow *netflow;
> @@ -2215,16 +2228,20 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>          .wc = &wc,
>      };
>
> +    int error;
>      result = UKEY_DELETE;
>      xoutp = NULL;
>      netflow = NULL;
>
> -    if (xlate_ukey(udpif, ukey, tcp_flags, &ctx)) {
> +    error = xlate_ukey(udpif, ukey, tcp_flags, &ctx);
> +    if (error) {
> +        *reason = XLATION_ERROR;
>          goto exit;
>      }
>      xoutp = &ctx.xout;
>
>      if (xoutp->avoid_caching) {
> +        *reason = AVOID_CACHING;
>          goto exit;
>      }
>
> @@ -2238,6 +2255,7 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>          ofpbuf_clear(odp_actions);
>
>          if (!ofproto) {
> +            *reason = ASSOCIATED_OFPROTO;
>              goto exit;
>          }
>
> @@ -2249,6 +2267,7 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>      if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, &dp_mask, &ctx.flow,
>                               NULL)
>          == ODP_FIT_ERROR) {
> +        *reason = BAD_ODP_FIT;
>          goto exit;
>      }
>
> @@ -2258,6 +2277,7 @@ revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
>       * down.  Note that we do not know if the datapath has ignored any of the
>       * wildcarded bits, so we may be overly conservative here. */
>      if (flow_wildcards_has_extra(&dp_mask, ctx.wc)) {
> +        *reason = FLOW_WILDCARDED;
>          goto exit;
>      }
>
> @@ -2303,7 +2323,8 @@ static enum reval_result
>  revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
>                  const struct dpif_flow_stats *stats,
>                  struct ofpbuf *odp_actions, uint64_t reval_seq,
> -                struct recirc_refs *recircs, bool offloaded)
> +                struct recirc_refs *recircs, bool offloaded,
> +                enum flow_del_reason *reason)
>      OVS_REQUIRES(ukey->mutex)
>  {
>      bool need_revalidate = ukey->reval_seq != reval_seq;
> @@ -2329,8 +2350,11 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
>                  xlate_cache_clear(ukey->xcache);
>              }
>              result = revalidate_ukey__(udpif, ukey, push.tcp_flags,
> -                                       odp_actions, recircs, ukey->xcache);
> -        } /* else delete; too expensive to revalidate */
> +                                       odp_actions, recircs, ukey->xcache,
> +                                       reason);
> +        } /* else delete; too expensive to revalidate */ else {
> +           *reason = TOO_EXPENSIVE;
> +        }
>      } else if (!push.n_packets || ukey->xcache
>                 || !populate_xcache(udpif, ukey, push.tcp_flags)) {
>          result = UKEY_KEEP;
> @@ -2720,6 +2744,7 @@ revalidate(struct revalidator *revalidator)
>              struct recirc_refs recircs = RECIRC_REFS_EMPTY_INITIALIZER;
>              struct dpif_flow_stats stats = f->stats;
>              enum reval_result result;
> +            enum flow_del_reason reason = FLOW_LIVE;
>              struct udpif_key *ukey;
>              bool already_dumped;
>              int error;
> @@ -2767,10 +2792,11 @@ revalidate(struct revalidator *revalidator)
>              }
>              if (kill_them_all || (used && used < now - max_idle)) {
>                  result = UKEY_DELETE;
> +                reason = FLOW_TIME_OUT;
>              } else {
>                  result = revalidate_ukey(udpif, ukey, &stats, &odp_actions,
>                                           reval_seq, &recircs,
> -                                         f->attrs.offloaded);
> +                                         f->attrs.offloaded, &reason);
>              }
>              ukey->dump_seq = dump_seq;
>
> @@ -2781,6 +2807,10 @@ revalidate(struct revalidator *revalidator)
>
>              if (result != UKEY_KEEP) {
>                  /* Takes ownership of 'recircs'. */
> +                if (result == UKEY_DELETE && reason) {

I do not think the probes should be conditional, these checks you can add to the probe.
And probably move the probe above the ‘if (result != UKEY_KEEP)’ check.
This will make the probe more general, guess the name should then not be flow_delete but something like flow_result.

> +                    OVS_USDT_PROBE(ofproto_dpif_upcall_revalidate,

missing is the documentation of the new probe in Documentation/topics/usdt-probes.rst.
Secondly, take a look at the probe naming convention in the same document.
Guess it should be OVS_USDT_PROBE(revalidate, flow_delete, ...)

> +                        flow_delete, reason, &ukey->ufid);

Here I would not pass in ukey->ufid, but ukey, so if in the future you need more details you can easily get them.
I would also pass in udpif, you never know why you could use it in the future.
So something like:

  OVS_USDT_PROBE(revalidate, flow_results, udpif, ukey);


> +                }
>                  reval_op_init(&ops[n_ops++], result, udpif, ukey, &recircs,
>                                &odp_actions);
>              }
> @@ -2829,6 +2859,7 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge)
>          struct udpif_key *ukey;
>          struct umap *umap = &udpif->ukeys[i];
>          size_t n_ops = 0;
> +        enum flow_del_reason reason = FLOW_LIVE;
>
>          CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
>              enum ukey_state ukey_state;
> @@ -2855,7 +2886,8 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge)
>                      COVERAGE_INC(revalidate_missed_dp_flow);
>                      memset(&stats, 0, sizeof stats);
>                      result = revalidate_ukey(udpif, ukey, &stats, &odp_actions,
> -                                             reval_seq, &recircs, false);
> +                                             reval_seq, &recircs, false,
> +                                             &reason);
>                  }
>                  if (result != UKEY_KEEP) {
>                      /* Clears 'recircs' if filled by revalidate_ukey(). */
> diff --git a/utilities/usdt-scripts/filter_probe.py b/utilities/usdt-scripts/filter_probe.py
> new file mode 100755
> index 000000000..6eeb82ad5
> --- /dev/null
> +++ b/utilities/usdt-scripts/filter_probe.py
> @@ -0,0 +1,263 @@
> +#!/usr/bin/env python3
> +from bcc import BPF, USDT, USDTException
> +
> +import argparse
> +import psutil
> +import sys
> +import time
> +from scapy.all import hexdump, wrpcap
> +from scapy.layers.l2 import Ether
> +import struct
> +
> +
> +bpf_src = """
> +#include <linux/sched.h>
> +#include <linux/types.h>
> +#include <uapi/linux/ptrace.h>
> +
> +#define MAX_KEY     2048
> +
> +struct flow_put {
> +    u32 flags;
> +    u64 key_ptr;
> +    u64 key_len;
> +    u64 mask_ptr;
> +    u64 mask_len;
> +    u64 action_ptr;
> +    u64 action_len;
> +    u64 ufid_loc;
> +};
> +struct event_t {
> +    u64 ts;
> +    u32 reason;
> +    u32 ufid[4];
> +    u64 key_size;
> +    unsigned char key[MAX_KEY];
> +};
> +
> +BPF_RINGBUF_OUTPUT(events, <BUFFER_PAGE_COUNT>);
> +
> +int watch_reval(struct pt_regs *ctx) {
> +    uint64_t addr;
> +    struct event_t *data = events.ringbuf_reserve(sizeof(struct event_t));
> +    if(!data)
> +        return 1;
> +    data->ts = bpf_ktime_get_ns();
> +    bpf_usdt_readarg(1, ctx, &data->reason);
> +    bpf_usdt_readarg(2, ctx, &addr);
> +    bpf_probe_read(&data->ufid,sizeof(data->ufid),(void *)addr);
> +    events.ringbuf_submit(data, 0);
> +    return 0;
> +};
> +
> +
> +int watch_put(struct pt_regs *ctx) {
> +    uint64_t addr;
> +    struct event_t *data = events.ringbuf_reserve(sizeof(struct event_t));
> +    struct flow_put f;
> +    if(!data)
> +        return 1;
> +    data->ts = bpf_ktime_get_ns();
> +    bpf_usdt_readarg(2, ctx, &addr);
> +    bpf_probe_read(&f, sizeof(struct flow_put), (void *) addr);
> +    bpf_probe_read(&data->ufid, sizeof(data->ufid),(void *) f.ufid_loc);
> +    if (f.key_len > MAX_KEY) // verifier fails without this check.
> +        f.key_len = MAX_KEY;
> +    data->key_size = f.key_len;
> +    bpf_probe_read(&data->key, f.key_len,(void*)f.key_ptr);
> +    data->reason = 0;
> +    events.ringbuf_submit(data, 0);
> +    return 0;
> +};
> +"""
> +
> +def format_ufid(ufid):
> +    result = "ufid:%08x-%04x-%04x-%04x-%04x%08x" \
> +    % (ufid[0],
> +    ufid[1] >> 16,
> +    ufid[1] & 0xffff,
> +    ufid[2] >> 16,
> +    ufid[2] & 0,
> +    ufid[3])
> +    return result
> +
> +def print_flow_put(event):
> +    ufid_str = format_ufid(event.ufid)
> +    print("At time: {:<18.9f} a flow with ufid: {} was upcalled".
> +        format(event.ts / 1000000000,ufid_str))
> +    key = decode_key(bytes(event.key)[:event.key_size])
> +    if args.filter_flows:
> +        if "OVS_KEY_ATTR_IPV4" in key:
> +            print("Found an ipv4 flow. Adding its ufid to the watchlist")
> +            ufids.append(ufid_str)
> +
> +def print_expiration(event):
> +    ufid_str = format_ufid(event.ufid)
> +    reason_code = ""
> +    if args.filter_flows:
> +        if ufid_str not in ufids:
> +            return
> +        else:
> +            print("A tracked flow is expiring")
> +    if event.reason == 1:
> +        reason_code = "flow timed out"
> +    elif event.reason == 2:
> +        reason_code = "flow was too expensive to revalidate"
> +    elif event.reason == 3:
> +        reason_code = "flow was wildcarded"
> +    elif event.reason == 4:
> +        reason_code = "bad odp fit"
> +    elif event.reason == 5:
> +        reason_code = "associated ofproto"
> +    elif event.reason == 6:
> +        reason_code = "translation error"
> +    elif event.reason == 7:
> +        reason_code = "avoid_caching"
> +    print("At time: {:<18.9f} a flow with ufid: {} was deleted for reason: {}".
> +        format(event.ts / 1000000000, ufid_str, reason_code))
> +
> +def decode_key(msg,dump=True):
> +    dump=args.print_flow_keys
> +    bytes_left = len(msg)
> +    result = {}
> +    while bytes_left:
> +        if bytes_left < 4:
> +            if dump:
> +                print("{}WARN: decode truncated; cannot read header".format(
> +                    ' ' * 4))
> +            break
> +        nla_len, nla_type = struct.unpack("=HH", msg[:4])
> +        if nla_len < 4:
> +            if dump:
> +                print("{}WARN: decode truncated; nla_len < 4".format(' ' * 4))
> +            break
> +        nla_data = msg[4:nla_len]
> +        trunc = ""
> +        if nla_len > bytes_left:
> +            trunc = "..."
> +            nla_data = nla_data[:(bytes_left - 4)]
> +        else:
> +            result[get_ovs_key_attr_str(nla_type)] = nla_data
> +        if dump:
> +            print("{}nla_len {}, nla_type {}[{}], data: {}{}".format(
> +                ' ' * 4, nla_len, get_ovs_key_attr_str(nla_type),
> +                nla_type,
> +                "".join("{:02x} ".format(b) for b in nla_data), trunc))
> +        if trunc != "":
> +            if dump:
> +                print("{}WARN: decode truncated; nla_len > msg_len[{}] ".
> +                      format(" " * 4, bytes_left))
> +            break
> +        next_offset = (nla_len + 3) & (~3)
> +        msg = msg[next_offset:]
> +        bytes_left -= next_offset
> +    return result
> +
> +def get_ovs_key_attr_str(attr):
> +    ovs_key_attr = ["OVS_KEY_ATTR_UNSPEC",
> +                    "OVS_KEY_ATTR_ENCAP",
> +                    "OVS_KEY_ATTR_PRIORITY",
> +                    "OVS_KEY_ATTR_IN_PORT",
> +                    "OVS_KEY_ATTR_ETHERNET",
> +                    "OVS_KEY_ATTR_VLAN",
> +                    "OVS_KEY_ATTR_ETHERTYPE",
> +                    "OVS_KEY_ATTR_IPV4",
> +                    "OVS_KEY_ATTR_IPV6",
> +                    "OVS_KEY_ATTR_TCP",
> +                    "OVS_KEY_ATTR_UDP",
> +                    "OVS_KEY_ATTR_ICMP",
> +                    "OVS_KEY_ATTR_ICMPV6",
> +                    "OVS_KEY_ATTR_ARP",
> +                    "OVS_KEY_ATTR_ND",
> +                    "OVS_KEY_ATTR_SKB_MARK",
> +                    "OVS_KEY_ATTR_TUNNEL",
> +                    "OVS_KEY_ATTR_SCTP",
> +                    "OVS_KEY_ATTR_TCP_FLAGS",
> +                    "OVS_KEY_ATTR_DP_HASH",
> +                    "OVS_KEY_ATTR_RECIRC_ID",
> +                    "OVS_KEY_ATTR_MPLS",
> +                    "OVS_KEY_ATTR_CT_STATE",
> +                    "OVS_KEY_ATTR_CT_ZONE",
> +                    "OVS_KEY_ATTR_CT_MARK",
> +                    "OVS_KEY_ATTR_CT_LABELS",
> +                    "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4",
> +                    "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6",
> +                    "OVS_KEY_ATTR_NSH"]
> +    if attr < 0 or attr > len(ovs_key_attr):
> +        return "<UNKNOWN>"
> +    return ovs_key_attr[attr]
> +
> +def handle_event(ctx,data,size):
> +    event = b["events"].event(data)
> +    if event.reason == 0:
> +        print_flow_put(event)
> +    else:
> +        print_expiration(event)
> +
> +def main():
> +    global b
> +    global ufids
> +    global args
> +    ufids = []
> +    parser = argparse.ArgumentParser()
> +    parser.add_argument("--buffer-page-count",
> +                        help="Number of BPF ring buffer pages, default 1024",
> +                        type=int, default=1024, metavar="NUMBER")
> +    parser.add_argument("-k", "--print-flow-keys",
> +                        help="Print flow keys captured?",
> +                        type=bool, const=True,default=False,nargs="?")
> +    parser.add_argument("--pid","-p",metavar="VSWITCHD_PID",
> +                        help="ovs-vswitchd's PID", type=int, default=None)
> +    parser.add_argument("--mask", "-m", metavar="FLOW_MASK",
> +                        help="flow mask to match",nargs=1,type=int,default=None)
> +    parser.add_argument("-D", "--debug", help="debug eBPF",
> +                        type=int, const=0x3f, default=0, nargs="?")
> +    parser.add_argument("-F", "--filter-flows",
> +                        help="Filter flows based on conditions (to implement)",
> +                        type=bool, const=True,default=False, nargs="?")
> +    args = parser.parse_args()
> +    vswitch_pid = args.pid
> +    if vswitch_pid is None:
> +        for proc in psutil.process_iter():
> +            if "ovs-vswitchd" in proc.name():
> +                if vswitch_pid is not None:
> +                    print("Error: Multiple ovs-vswitchd daemons running. "
> +                          "Use the -p option to specify one to track.")
> +                    sys.exit(-1)
> +                vswitch_pid = proc.pid
> +    if vswitch_pid is None:
> +        print("Error: is ovs-vswitchd running?")
> +        sys.exit(-1)
> +    if args.mask is not None:
> +        print("mask is: ")
> +    u = USDT(pid=int(vswitch_pid))
> +    try:
> +        u.enable_probe(probe="op_flow_put", fn_name="watch_put")
> +    except USDTException as e:
> +        print("Error attaching flow_put probe.")
> +        print(str(e))
> +        sys.exit(-1)
> +    try:
> +        u.enable_probe(probe="flow_delete", fn_name="watch_reval")
> +    except USDTException as e:
> +        print("Error attaching revalidator_deletion probe.")
> +        print(str(e))
> +        sys.exit(-1)
> +
> +    source = bpf_src.replace("<BUFFER_PAGE_COUNT>",
> +                            str(args.buffer_page_count))
> +    b = BPF(text=source, usdt_contexts=[u],debug=args.debug)
> +    b["events"].open_ring_buffer(handle_event)
> +    print("Watching for events")
> +    while 1:
> +        try:
> +            b.ring_buffer_poll()
> +            time.sleep(0.5)
> +        except KeyboardInterrupt:
> +            break
> +
> +
> +
> +
> +if __name__ == "__main__":
> +    main()
> diff --git a/utilities/usdt-scripts/watch_flows.bt b/utilities/usdt-scripts/watch_flows.bt
> new file mode 100755
> index 000000000..8aea69f79
> --- /dev/null
> +++ b/utilities/usdt-scripts/watch_flows.bt
> @@ -0,0 +1,125 @@
> +#!/usr/bin/env bpftrace
> +/*
> +* usage: sudo bpftrace -p $(pidof $(which ovs-vswitchd)) ./test.bt
> +*
> +* "OVS_KEY_ATTR_UNSPEC",                     0
> +* "OVS_KEY_ATTR_ENCAP",                      1
> +* "OVS_KEY_ATTR_PRIORITY",                   2
> +* "OVS_KEY_ATTR_IN_PORT",                    3
> +* "OVS_KEY_ATTR_ETHERNET",                   4
> +* "OVS_KEY_ATTR_VLAN",                       5
> +* "OVS_KEY_ATTR_ETHERTYPE",                  6
> +* "OVS_KEY_ATTR_IPV4",                       7
> +* "OVS_KEY_ATTR_IPV6",                       8
> +* "OVS_KEY_ATTR_TCP",                        9
> +* "OVS_KEY_ATTR_UDP",                       10
> +* "OVS_KEY_ATTR_ICMP",                      11
> +* "OVS_KEY_ATTR_ICMPV6",                    12
> +* "OVS_KEY_ATTR_ARP",                       13
> +* "OVS_KEY_ATTR_ND",                        14
> +* "OVS_KEY_ATTR_SKB_MARK",                  15
> +* "OVS_KEY_ATTR_TUNNEL",                    16
> +* "OVS_KEY_ATTR_SCTP",                      17
> +* "OVS_KEY_ATTR_TCP_FLAGS",                 18
> +* "OVS_KEY_ATTR_DP_HASH",                   19
> +* "OVS_KEY_ATTR_RECIRC_ID",                 20
> +* "OVS_KEY_ATTR_MPLS",                      21
> +* "OVS_KEY_ATTR_CT_STATE",                  22
> +* "OVS_KEY_ATTR_CT_ZONE",                   23
> +* "OVS_KEY_ATTR_CT_MARK",                   24
> +* "OVS_KEY_ATTR_CT_LABELS",                 25
> +* "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4",        26
> +* "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6",        27
> +* "OVS_KEY_ATTR_NSH"                        28
> +*/
> +
> +#include <linux/sched.h>
> +
> +#define MAX_ATTRS       32
> +#define MAX_KEY         2048
> +// currently set to watch for ARP. Could take it as command line arg
> +#define TRAIT_TO_WATCH  13
> +struct flowput {
> +    u32 flags;
> +    u64 key_ptr;
> +    u64 key_len;
> +    u64 mask_ptr;
> +    u64 mask_len;
> +    u64 action_ptr;
> +    u64 action_len;
> +    u64 ufid_loc;
> +}
> +union ufid {
> +    u32 ufid32[4];
> +    u64 ufid64[2];
> +}
> +struct attr {
> +    u16 len;
> +    u16 type;
> +}
> +
> +BEGIN
> +{
> +    printf("--------------------------------------------------------------\n");
> +    printf("|                   Tracking flow lifecycles                 |\n");
> +    printf("--------------------------------------------------------------\n");
> +}
> +
> +usdt::dpif_netlink_operate__:op_flow_put
> +{
> +    $ptr = (struct flowput *) arg1;
> +    $st = *$ptr;
> +    $pArr = (union ufid *) $ptr->ufid_loc;
> +    $num_attrs = (uint64) 0;
> +    $key_posn = (uint64) 0;
> +    while($num_attrs < MAX_ATTRS && $key_posn < MAX_KEY) {
> +        if((uint64) $key_posn >=  $ptr->key_len) {
> +            break;
> +        }
> +        $pAttr = (struct attr *) ((uint8*) $ptr->key_ptr + $key_posn);
> +        if($pAttr->type == TRAIT_TO_WATCH) {
> +            printf("Target traffic was spotted.\n");
> +            printf("ufid:%08x-%04x-%04x-%04x-%04x%08x\n",$pArr->ufid32[0],
> +                   $pArr->ufid32[1] >> 16, $pArr->ufid32[1] & 0xffff,
> +                   $pArr->ufid32[2] >> 16, $pArr->ufid32[2] & 0xffff,
> +                   $pArr->ufid32[3]);
> +                @watchlist[$pArr->ufid64[0],$pArr->ufid64[1]]++;
> +        }
> +        $num_attrs++;
> +        $key_posn = ($key_posn + $pAttr->len + 3) & (0xffffff ^ 3);
> +    }
> +}
> +
> +usdt::ofproto_dpif_upcall_revalidate:flow_delete
> +{
> +    $ptr = (union ufid *) arg1;
> +    if(!@watchlist[$ptr->ufid64[0],$ptr->ufid64[1]]) {
> +        return;
> +    }
> +    printf("ufid:%08x-%04x-%04x-%04x-%04x%08x was invalidated because ",
> +           $ptr->ufid32[0], $ptr->ufid32[1] >> 16, $ptr->ufid32[1] & 0xffff,
> +           $ptr->ufid32[2] >> 16, $ptr->ufid32[2] & 0xffff, $ptr->ufid32[3]);
> +    if (arg0 == 1)  {
> +        printf("it timed out.\n");
> +    } else if (arg0 == 2) {
> +        printf("it was too expensive to revalidate.\n");
> +    } else if (arg0 == 3) {
> +        printf("there was a change in the openflow wildcards.\n");
> +    } else if (arg0 == 4) {
> +        printf("the odp fit was bad.\n");
> +    } else if (arg0 == 5) {
> +        printf("associated ofproto.\n");
> +    } else if (arg0 == 6) {
> +        printf("there was an error translating the openflow rule to ovs.\n");
> +    } else if (arg0 == 7) {
> +        printf("avoiding caching.\n");
> +    } else {
> +        printf("The value of arg0 is %d\n",arg0);
> +    }
> +}
> +
> +END
> +{
> +    clear(@watchlist);
> +    printf("\n");
> +}
> -- 
> 2.36.1
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 57f94df54..8971edb2d 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -31,6 +31,7 @@ 
 #include "openvswitch/list.h"
 #include "netlink.h"
 #include "openvswitch/ofpbuf.h"
+#include "openvswitch/usdt-probes.h"
 #include "ofproto-dpif-ipfix.h"
 #include "ofproto-dpif-sflow.h"
 #include "ofproto-dpif-xlate.h"
@@ -260,6 +261,17 @@  enum ukey_state {
 };
 #define N_UKEY_STATES (UKEY_DELETED + 1)
 
+enum flow_del_reason {
+    FLOW_LIVE = 0,
+    FLOW_TIME_OUT,      /* the flow went unused and was deleted. */
+    TOO_EXPENSIVE,
+    FLOW_WILDCARDED,
+    BAD_ODP_FIT,
+    ASSOCIATED_OFPROTO,
+    XLATION_ERROR,
+    AVOID_CACHING,
+};
+
 /* 'udpif_key's are responsible for tracking the little bit of state udpif
  * needs to do flow expiration which can't be pulled directly from the
  * datapath.  They may be created by any handler or revalidator thread at any
@@ -2202,7 +2214,8 @@  populate_xcache(struct udpif *udpif, struct udpif_key *ukey,
 static enum reval_result
 revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
                   uint16_t tcp_flags, struct ofpbuf *odp_actions,
-                  struct recirc_refs *recircs, struct xlate_cache *xcache)
+                  struct recirc_refs *recircs, struct xlate_cache *xcache,
+                  enum flow_del_reason *reason)
 {
     struct xlate_out *xoutp;
     struct netflow *netflow;
@@ -2215,16 +2228,20 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
         .wc = &wc,
     };
 
+    int error;
     result = UKEY_DELETE;
     xoutp = NULL;
     netflow = NULL;
 
-    if (xlate_ukey(udpif, ukey, tcp_flags, &ctx)) {
+    error = xlate_ukey(udpif, ukey, tcp_flags, &ctx);
+    if (error) {
+        *reason = XLATION_ERROR;
         goto exit;
     }
     xoutp = &ctx.xout;
 
     if (xoutp->avoid_caching) {
+        *reason = AVOID_CACHING;
         goto exit;
     }
 
@@ -2238,6 +2255,7 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
         ofpbuf_clear(odp_actions);
 
         if (!ofproto) {
+            *reason = ASSOCIATED_OFPROTO;
             goto exit;
         }
 
@@ -2249,6 +2267,7 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
     if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, &dp_mask, &ctx.flow,
                              NULL)
         == ODP_FIT_ERROR) {
+        *reason = BAD_ODP_FIT;
         goto exit;
     }
 
@@ -2258,6 +2277,7 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
      * down.  Note that we do not know if the datapath has ignored any of the
      * wildcarded bits, so we may be overly conservative here. */
     if (flow_wildcards_has_extra(&dp_mask, ctx.wc)) {
+        *reason = FLOW_WILDCARDED;
         goto exit;
     }
 
@@ -2303,7 +2323,8 @@  static enum reval_result
 revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
                 const struct dpif_flow_stats *stats,
                 struct ofpbuf *odp_actions, uint64_t reval_seq,
-                struct recirc_refs *recircs, bool offloaded)
+                struct recirc_refs *recircs, bool offloaded,
+                enum flow_del_reason *reason)
     OVS_REQUIRES(ukey->mutex)
 {
     bool need_revalidate = ukey->reval_seq != reval_seq;
@@ -2329,8 +2350,11 @@  revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
                 xlate_cache_clear(ukey->xcache);
             }
             result = revalidate_ukey__(udpif, ukey, push.tcp_flags,
-                                       odp_actions, recircs, ukey->xcache);
-        } /* else delete; too expensive to revalidate */
+                                       odp_actions, recircs, ukey->xcache,
+                                       reason);
+        } /* else delete; too expensive to revalidate */ else {
+           *reason = TOO_EXPENSIVE;
+        }
     } else if (!push.n_packets || ukey->xcache
                || !populate_xcache(udpif, ukey, push.tcp_flags)) {
         result = UKEY_KEEP;
@@ -2720,6 +2744,7 @@  revalidate(struct revalidator *revalidator)
             struct recirc_refs recircs = RECIRC_REFS_EMPTY_INITIALIZER;
             struct dpif_flow_stats stats = f->stats;
             enum reval_result result;
+            enum flow_del_reason reason = FLOW_LIVE;
             struct udpif_key *ukey;
             bool already_dumped;
             int error;
@@ -2767,10 +2792,11 @@  revalidate(struct revalidator *revalidator)
             }
             if (kill_them_all || (used && used < now - max_idle)) {
                 result = UKEY_DELETE;
+                reason = FLOW_TIME_OUT;
             } else {
                 result = revalidate_ukey(udpif, ukey, &stats, &odp_actions,
                                          reval_seq, &recircs,
-                                         f->attrs.offloaded);
+                                         f->attrs.offloaded, &reason);
             }
             ukey->dump_seq = dump_seq;
 
@@ -2781,6 +2807,10 @@  revalidate(struct revalidator *revalidator)
 
             if (result != UKEY_KEEP) {
                 /* Takes ownership of 'recircs'. */
+                if (result == UKEY_DELETE && reason) {
+                    OVS_USDT_PROBE(ofproto_dpif_upcall_revalidate,
+                        flow_delete, reason, &ukey->ufid);
+                }
                 reval_op_init(&ops[n_ops++], result, udpif, ukey, &recircs,
                               &odp_actions);
             }
@@ -2829,6 +2859,7 @@  revalidator_sweep__(struct revalidator *revalidator, bool purge)
         struct udpif_key *ukey;
         struct umap *umap = &udpif->ukeys[i];
         size_t n_ops = 0;
+        enum flow_del_reason reason = FLOW_LIVE;
 
         CMAP_FOR_EACH(ukey, cmap_node, &umap->cmap) {
             enum ukey_state ukey_state;
@@ -2855,7 +2886,8 @@  revalidator_sweep__(struct revalidator *revalidator, bool purge)
                     COVERAGE_INC(revalidate_missed_dp_flow);
                     memset(&stats, 0, sizeof stats);
                     result = revalidate_ukey(udpif, ukey, &stats, &odp_actions,
-                                             reval_seq, &recircs, false);
+                                             reval_seq, &recircs, false,
+                                             &reason);
                 }
                 if (result != UKEY_KEEP) {
                     /* Clears 'recircs' if filled by revalidate_ukey(). */
diff --git a/utilities/usdt-scripts/filter_probe.py b/utilities/usdt-scripts/filter_probe.py
new file mode 100755
index 000000000..6eeb82ad5
--- /dev/null
+++ b/utilities/usdt-scripts/filter_probe.py
@@ -0,0 +1,263 @@ 
+#!/usr/bin/env python3
+from bcc import BPF, USDT, USDTException
+
+import argparse
+import psutil
+import sys
+import time
+from scapy.all import hexdump, wrpcap
+from scapy.layers.l2 import Ether
+import struct
+
+
+bpf_src = """
+#include <linux/sched.h>
+#include <linux/types.h>
+#include <uapi/linux/ptrace.h>
+
+#define MAX_KEY     2048
+
+struct flow_put {
+    u32 flags;
+    u64 key_ptr;
+    u64 key_len;
+    u64 mask_ptr;
+    u64 mask_len;
+    u64 action_ptr;
+    u64 action_len;
+    u64 ufid_loc;
+};
+struct event_t {
+    u64 ts;
+    u32 reason;
+    u32 ufid[4];
+    u64 key_size;
+    unsigned char key[MAX_KEY];
+};
+
+BPF_RINGBUF_OUTPUT(events, <BUFFER_PAGE_COUNT>);
+
+int watch_reval(struct pt_regs *ctx) {
+    uint64_t addr;
+    struct event_t *data = events.ringbuf_reserve(sizeof(struct event_t));
+    if(!data)
+        return 1;
+    data->ts = bpf_ktime_get_ns();
+    bpf_usdt_readarg(1, ctx, &data->reason);
+    bpf_usdt_readarg(2, ctx, &addr);
+    bpf_probe_read(&data->ufid,sizeof(data->ufid),(void *)addr);
+    events.ringbuf_submit(data, 0);
+    return 0;
+};
+
+
+int watch_put(struct pt_regs *ctx) {
+    uint64_t addr;
+    struct event_t *data = events.ringbuf_reserve(sizeof(struct event_t));
+    struct flow_put f;
+    if(!data)
+        return 1;
+    data->ts = bpf_ktime_get_ns();
+    bpf_usdt_readarg(2, ctx, &addr);
+    bpf_probe_read(&f, sizeof(struct flow_put), (void *) addr);
+    bpf_probe_read(&data->ufid, sizeof(data->ufid),(void *) f.ufid_loc);
+    if (f.key_len > MAX_KEY) // verifier fails without this check.
+        f.key_len = MAX_KEY;
+    data->key_size = f.key_len;
+    bpf_probe_read(&data->key, f.key_len,(void*)f.key_ptr);
+    data->reason = 0;
+    events.ringbuf_submit(data, 0);
+    return 0;
+};
+"""
+
+def format_ufid(ufid):
+    result = "ufid:%08x-%04x-%04x-%04x-%04x%08x" \
+    % (ufid[0],
+    ufid[1] >> 16,
+    ufid[1] & 0xffff,
+    ufid[2] >> 16,
+    ufid[2] & 0,
+    ufid[3])
+    return result
+
+def print_flow_put(event):
+    ufid_str = format_ufid(event.ufid)
+    print("At time: {:<18.9f} a flow with ufid: {} was upcalled".
+        format(event.ts / 1000000000,ufid_str))
+    key = decode_key(bytes(event.key)[:event.key_size])
+    if args.filter_flows:
+        if "OVS_KEY_ATTR_IPV4" in key:
+            print("Found an ipv4 flow. Adding its ufid to the watchlist")
+            ufids.append(ufid_str)
+
+def print_expiration(event):
+    ufid_str = format_ufid(event.ufid)
+    reason_code = ""
+    if args.filter_flows:
+        if ufid_str not in ufids:
+            return
+        else:
+            print("A tracked flow is expiring")
+    if event.reason == 1:
+        reason_code = "flow timed out"
+    elif event.reason == 2:
+        reason_code = "flow was too expensive to revalidate"
+    elif event.reason == 3:
+        reason_code = "flow was wildcarded"
+    elif event.reason == 4:
+        reason_code = "bad odp fit"
+    elif event.reason == 5:
+        reason_code = "associated ofproto"
+    elif event.reason == 6:
+        reason_code = "translation error"
+    elif event.reason == 7:
+        reason_code = "avoid_caching"
+    print("At time: {:<18.9f} a flow with ufid: {} was deleted for reason: {}".
+        format(event.ts / 1000000000, ufid_str, reason_code))
+
+def decode_key(msg,dump=True):
+    dump=args.print_flow_keys
+    bytes_left = len(msg)
+    result = {}
+    while bytes_left:
+        if bytes_left < 4:
+            if dump:
+                print("{}WARN: decode truncated; cannot read header".format(
+                    ' ' * 4))
+            break
+        nla_len, nla_type = struct.unpack("=HH", msg[:4])
+        if nla_len < 4:
+            if dump:
+                print("{}WARN: decode truncated; nla_len < 4".format(' ' * 4))
+            break
+        nla_data = msg[4:nla_len]
+        trunc = ""
+        if nla_len > bytes_left:
+            trunc = "..."
+            nla_data = nla_data[:(bytes_left - 4)]
+        else:
+            result[get_ovs_key_attr_str(nla_type)] = nla_data
+        if dump:
+            print("{}nla_len {}, nla_type {}[{}], data: {}{}".format(
+                ' ' * 4, nla_len, get_ovs_key_attr_str(nla_type),
+                nla_type,
+                "".join("{:02x} ".format(b) for b in nla_data), trunc))
+        if trunc != "":
+            if dump:
+                print("{}WARN: decode truncated; nla_len > msg_len[{}] ".
+                      format(" " * 4, bytes_left))
+            break
+        next_offset = (nla_len + 3) & (~3)
+        msg = msg[next_offset:]
+        bytes_left -= next_offset
+    return result
+
+def get_ovs_key_attr_str(attr):
+    ovs_key_attr = ["OVS_KEY_ATTR_UNSPEC",
+                    "OVS_KEY_ATTR_ENCAP",
+                    "OVS_KEY_ATTR_PRIORITY",
+                    "OVS_KEY_ATTR_IN_PORT",
+                    "OVS_KEY_ATTR_ETHERNET",
+                    "OVS_KEY_ATTR_VLAN",
+                    "OVS_KEY_ATTR_ETHERTYPE",
+                    "OVS_KEY_ATTR_IPV4",
+                    "OVS_KEY_ATTR_IPV6",
+                    "OVS_KEY_ATTR_TCP",
+                    "OVS_KEY_ATTR_UDP",
+                    "OVS_KEY_ATTR_ICMP",
+                    "OVS_KEY_ATTR_ICMPV6",
+                    "OVS_KEY_ATTR_ARP",
+                    "OVS_KEY_ATTR_ND",
+                    "OVS_KEY_ATTR_SKB_MARK",
+                    "OVS_KEY_ATTR_TUNNEL",
+                    "OVS_KEY_ATTR_SCTP",
+                    "OVS_KEY_ATTR_TCP_FLAGS",
+                    "OVS_KEY_ATTR_DP_HASH",
+                    "OVS_KEY_ATTR_RECIRC_ID",
+                    "OVS_KEY_ATTR_MPLS",
+                    "OVS_KEY_ATTR_CT_STATE",
+                    "OVS_KEY_ATTR_CT_ZONE",
+                    "OVS_KEY_ATTR_CT_MARK",
+                    "OVS_KEY_ATTR_CT_LABELS",
+                    "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4",
+                    "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6",
+                    "OVS_KEY_ATTR_NSH"]
+    if attr < 0 or attr > len(ovs_key_attr):
+        return "<UNKNOWN>"
+    return ovs_key_attr[attr]
+
+def handle_event(ctx,data,size):
+    event = b["events"].event(data)
+    if event.reason == 0:
+        print_flow_put(event)
+    else:
+        print_expiration(event)
+
+def main():
+    global b
+    global ufids
+    global args
+    ufids = []
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--buffer-page-count",
+                        help="Number of BPF ring buffer pages, default 1024",
+                        type=int, default=1024, metavar="NUMBER")
+    parser.add_argument("-k", "--print-flow-keys",
+                        help="Print flow keys captured?",
+                        type=bool, const=True,default=False,nargs="?")
+    parser.add_argument("--pid","-p",metavar="VSWITCHD_PID",
+                        help="ovs-vswitchd's PID", type=int, default=None)
+    parser.add_argument("--mask", "-m", metavar="FLOW_MASK",
+                        help="flow mask to match",nargs=1,type=int,default=None)
+    parser.add_argument("-D", "--debug", help="debug eBPF",
+                        type=int, const=0x3f, default=0, nargs="?")
+    parser.add_argument("-F", "--filter-flows",
+                        help="Filter flows based on conditions (to implement)",
+                        type=bool, const=True,default=False, nargs="?")
+    args = parser.parse_args()
+    vswitch_pid = args.pid
+    if vswitch_pid is None:
+        for proc in psutil.process_iter():
+            if "ovs-vswitchd" in proc.name():
+                if vswitch_pid is not None:
+                    print("Error: Multiple ovs-vswitchd daemons running. "
+                          "Use the -p option to specify one to track.")
+                    sys.exit(-1)
+                vswitch_pid = proc.pid
+    if vswitch_pid is None:
+        print("Error: is ovs-vswitchd running?")
+        sys.exit(-1)
+    if args.mask is not None:
+        print("mask is: ")
+    u = USDT(pid=int(vswitch_pid))
+    try:
+        u.enable_probe(probe="op_flow_put", fn_name="watch_put")
+    except USDTException as e:
+        print("Error attaching flow_put probe.")
+        print(str(e))
+        sys.exit(-1)
+    try:
+        u.enable_probe(probe="flow_delete", fn_name="watch_reval")
+    except USDTException as e:
+        print("Error attaching revalidator_deletion probe.")
+        print(str(e))
+        sys.exit(-1)
+
+    source = bpf_src.replace("<BUFFER_PAGE_COUNT>",
+                            str(args.buffer_page_count))
+    b = BPF(text=source, usdt_contexts=[u],debug=args.debug)
+    b["events"].open_ring_buffer(handle_event)
+    print("Watching for events")
+    while 1:
+        try:
+            b.ring_buffer_poll()
+            time.sleep(0.5)
+        except KeyboardInterrupt:
+            break
+
+
+
+
+if __name__ == "__main__":
+    main()
diff --git a/utilities/usdt-scripts/watch_flows.bt b/utilities/usdt-scripts/watch_flows.bt
new file mode 100755
index 000000000..8aea69f79
--- /dev/null
+++ b/utilities/usdt-scripts/watch_flows.bt
@@ -0,0 +1,125 @@ 
+#!/usr/bin/env bpftrace
+/*
+* usage: sudo bpftrace -p $(pidof $(which ovs-vswitchd)) ./test.bt
+*
+* "OVS_KEY_ATTR_UNSPEC",                     0
+* "OVS_KEY_ATTR_ENCAP",                      1
+* "OVS_KEY_ATTR_PRIORITY",                   2
+* "OVS_KEY_ATTR_IN_PORT",                    3
+* "OVS_KEY_ATTR_ETHERNET",                   4
+* "OVS_KEY_ATTR_VLAN",                       5
+* "OVS_KEY_ATTR_ETHERTYPE",                  6
+* "OVS_KEY_ATTR_IPV4",                       7
+* "OVS_KEY_ATTR_IPV6",                       8
+* "OVS_KEY_ATTR_TCP",                        9
+* "OVS_KEY_ATTR_UDP",                       10
+* "OVS_KEY_ATTR_ICMP",                      11
+* "OVS_KEY_ATTR_ICMPV6",                    12
+* "OVS_KEY_ATTR_ARP",                       13
+* "OVS_KEY_ATTR_ND",                        14
+* "OVS_KEY_ATTR_SKB_MARK",                  15
+* "OVS_KEY_ATTR_TUNNEL",                    16
+* "OVS_KEY_ATTR_SCTP",                      17
+* "OVS_KEY_ATTR_TCP_FLAGS",                 18
+* "OVS_KEY_ATTR_DP_HASH",                   19
+* "OVS_KEY_ATTR_RECIRC_ID",                 20
+* "OVS_KEY_ATTR_MPLS",                      21
+* "OVS_KEY_ATTR_CT_STATE",                  22
+* "OVS_KEY_ATTR_CT_ZONE",                   23
+* "OVS_KEY_ATTR_CT_MARK",                   24
+* "OVS_KEY_ATTR_CT_LABELS",                 25
+* "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4",        26
+* "OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6",        27
+* "OVS_KEY_ATTR_NSH"                        28
+*/
+
+#include <linux/sched.h>
+
+#define MAX_ATTRS       32
+#define MAX_KEY         2048
+// currently set to watch for ARP. Could take it as command line arg
+#define TRAIT_TO_WATCH  13
+struct flowput {
+    u32 flags;
+    u64 key_ptr;
+    u64 key_len;
+    u64 mask_ptr;
+    u64 mask_len;
+    u64 action_ptr;
+    u64 action_len;
+    u64 ufid_loc;
+}
+union ufid {
+    u32 ufid32[4];
+    u64 ufid64[2];
+}
+struct attr {
+    u16 len;
+    u16 type;
+}
+
+BEGIN
+{
+    printf("--------------------------------------------------------------\n");
+    printf("|                   Tracking flow lifecycles                 |\n");
+    printf("--------------------------------------------------------------\n");
+}
+
+usdt::dpif_netlink_operate__:op_flow_put
+{
+    $ptr = (struct flowput *) arg1;
+    $st = *$ptr;
+    $pArr = (union ufid *) $ptr->ufid_loc;
+    $num_attrs = (uint64) 0;
+    $key_posn = (uint64) 0;
+    while($num_attrs < MAX_ATTRS && $key_posn < MAX_KEY) {
+        if((uint64) $key_posn >=  $ptr->key_len) {
+            break;
+        }
+        $pAttr = (struct attr *) ((uint8*) $ptr->key_ptr + $key_posn);
+        if($pAttr->type == TRAIT_TO_WATCH) {
+            printf("Target traffic was spotted.\n");
+            printf("ufid:%08x-%04x-%04x-%04x-%04x%08x\n",$pArr->ufid32[0],
+                   $pArr->ufid32[1] >> 16, $pArr->ufid32[1] & 0xffff,
+                   $pArr->ufid32[2] >> 16, $pArr->ufid32[2] & 0xffff,
+                   $pArr->ufid32[3]);
+                @watchlist[$pArr->ufid64[0],$pArr->ufid64[1]]++;
+        }
+        $num_attrs++;
+        $key_posn = ($key_posn + $pAttr->len + 3) & (0xffffff ^ 3);
+    }
+}
+
+usdt::ofproto_dpif_upcall_revalidate:flow_delete
+{
+    $ptr = (union ufid *) arg1;
+    if(!@watchlist[$ptr->ufid64[0],$ptr->ufid64[1]]) {
+        return;
+    }
+    printf("ufid:%08x-%04x-%04x-%04x-%04x%08x was invalidated because ",
+           $ptr->ufid32[0], $ptr->ufid32[1] >> 16, $ptr->ufid32[1] & 0xffff,
+           $ptr->ufid32[2] >> 16, $ptr->ufid32[2] & 0xffff, $ptr->ufid32[3]);
+    if (arg0 == 1)  {
+        printf("it timed out.\n");
+    } else if (arg0 == 2) {
+        printf("it was too expensive to revalidate.\n");
+    } else if (arg0 == 3) {
+        printf("there was a change in the openflow wildcards.\n");
+    } else if (arg0 == 4) {
+        printf("the odp fit was bad.\n");
+    } else if (arg0 == 5) {
+        printf("associated ofproto.\n");
+    } else if (arg0 == 6) {
+        printf("there was an error translating the openflow rule to ovs.\n");
+    } else if (arg0 == 7) {
+        printf("avoiding caching.\n");
+    } else {
+        printf("The value of arg0 is %d\n",arg0);
+    }
+}
+
+END
+{
+    clear(@watchlist);
+    printf("\n");
+}