diff mbox series

[ovs-dev,v4] dpcls: Change info-get function to fetch dpcls usage stats.

Message ID 20211202100556.3745682-1-kumar.amber@intel.com
State Superseded
Headers show
Series [ovs-dev,v4] dpcls: Change info-get function to fetch dpcls usage stats. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed

Commit Message

Kumar Amber Dec. 2, 2021, 10:05 a.m. UTC
Modified the dplcs info-get command output to include
the count for different dpcls implementations.

$ovs-appctl dpif-netdev/subtable-lookup-prio-get

Available dpcls implementations:
  autovalidator (Use count: 1, Priority: 5)
  generic (Use count: 0, Priority: 1)
  avx512_gather (Use count: 0, Priority: 3)

Test case to verify changes:
	1021: PMD - dpcls configuration     ok

Signed-off-by: Kumar Amber <kumar.amber@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>

---
v4:
- Fix comments on the patch.
- Change API from an overloaded method of counting, to returning the
  old and new subtable structs. This allows the caller to identify the
  modified subtable implementations, and update the statistics accordingly.
v3:
- Fix comments on the patch.
- Function API remains same, see discussion on OVS ML here:
  "https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388737.html"
v2:
- Dependency merged rebased to master.

---
---
 Documentation/topics/dpdk/bridge.rst | 16 +++----
 lib/dpif-netdev-lookup.c             | 70 +++++++++++++++++++++++-----
 lib/dpif-netdev-lookup.h             | 20 +++++++-
 lib/dpif-netdev.c                    | 61 +++++++++++++++++-------
 tests/pmd.at                         | 16 +++----
 5 files changed, 137 insertions(+), 46 deletions(-)

Comments

Eelco Chaudron Dec. 10, 2021, 1:19 p.m. UTC | #1
On 2 Dec 2021, at 11:05, Kumar Amber wrote:

> Modified the dplcs info-get command output to include
> the count for different dpcls implementations.
>
> $ovs-appctl dpif-netdev/subtable-lookup-prio-get
>
> Available dpcls implementations:
>   autovalidator (Use count: 1, Priority: 5)
>   generic (Use count: 0, Priority: 1)
>   avx512_gather (Use count: 0, Priority: 3)
>
> Test case to verify changes:
> 	1021: PMD - dpcls configuration     ok
>
> Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> ---
> v4:
> - Fix comments on the patch.
> - Change API from an overloaded method of counting, to returning the
>   old and new subtable structs. This allows the caller to identify the
>   modified subtable implementations, and update the statistics accordingly.
> v3:
> - Fix comments on the patch.
> - Function API remains same, see discussion on OVS ML here:
>   "https://mail.openvswitch.org/pipermail/ovs-dev/2021-October/388737.html"
> v2:
> - Dependency merged rebased to master.
>
> ---
> ---
>  Documentation/topics/dpdk/bridge.rst | 16 +++----
>  lib/dpif-netdev-lookup.c             | 70 +++++++++++++++++++++++-----
>  lib/dpif-netdev-lookup.h             | 20 +++++++-
>  lib/dpif-netdev.c                    | 61 +++++++++++++++++-------
>  tests/pmd.at                         | 16 +++----
>  5 files changed, 137 insertions(+), 46 deletions(-)
>
> diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> index f645b9ade..63a54da1c 100644
> --- a/Documentation/topics/dpdk/bridge.rst
> +++ b/Documentation/topics/dpdk/bridge.rst
> @@ -156,10 +156,10 @@ OVS provides multiple implementations of dpcls. The following command enables
>  the user to check what implementations are available in a running instance ::
>
>      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> -    Available lookup functions (priority : name)
> -            0 : autovalidator
> -            1 : generic
> -            0 : avx512_gather
> +    Available dpcls implementations:
> +            autovalidator (Use count: 1, Priority: 5)
> +            generic (Use count: 0, Priority: 1)
> +            avx512_gather (Use count: 0, Priority: 3)
>
>  To set the priority of a lookup function, run the ``prio-set`` command ::
>
> @@ -172,10 +172,10 @@ function due to the command being run. To verify the prioritization, re-run the
>  get command, note the updated priority of the ``avx512_gather`` function ::
>
>      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> -    Available lookup functions (priority : name)
> -            0 : autovalidator
> -            1 : generic
> -            5 : avx512_gather
> +    Available dpcls implementations:
> +            autovalidator (Use count: 0, Priority: 0)
> +            generic (Use count: 0, Priority: 0)
> +            avx512_gather (Use count: 1, Priority: 5)
>
>  If two lookup functions have the same priority, the first one in the list is
>  chosen, and the 2nd occurance of that priority is not used. Put in logical
> diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
> index bd0a99abe..0aa79e27c 100644
> --- a/lib/dpif-netdev-lookup.c
> +++ b/lib/dpif-netdev-lookup.c
> @@ -36,18 +36,21 @@ static struct dpcls_subtable_lookup_info_t subtable_lookups[] = {
>      { .prio = 0,
>  #endif
>        .probe = dpcls_subtable_autovalidator_probe,
> -      .name = "autovalidator", },
> +      .name = "autovalidator",
> +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
>
>      /* The default scalar C code implementation. */
>      { .prio = 1,
>        .probe = dpcls_subtable_generic_probe,
> -      .name = "generic", },
> +      .name = "generic",
> +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
>
>  #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__)
>      /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
>      { .prio = 0,
>        .probe = dpcls_subtable_avx512_gather_probe,
> -      .name = "avx512_gather", },
> +      .name = "avx512_gather",
> +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
>  #else
>      /* Disabling AVX512 at compile time, as compile time requirements not met.
>       * This could be due to a number of reasons:
> @@ -93,27 +96,46 @@ dpcls_subtable_set_prio(const char *name, uint8_t priority)
>  }
>
>  dpcls_subtable_lookup_func
> -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
> +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> +                             dpcls_subtable_lookup_func old_func,
> +                             struct dpcls_subtable_lookup_info_t **old_info,
> +                             struct dpcls_subtable_lookup_info_t **new_info)

I still feel this API is really ugly. See my proposed patch at the end.
>
>  {
>      /* Iter over each subtable impl, and get highest priority one. */
>      int32_t prio = -1;

This was not changed in this patch, but I guess this can be changed to int.

>      const char *name = NULL;
> +    int best_idx = 0;
>      dpcls_subtable_lookup_func best_func = NULL;
>
>      for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
>          int32_t probed_prio = subtable_lookups[i].prio;

Same as above should be int.

> +        dpcls_subtable_lookup_func probed_func;
> +
> +        probed_func = subtable_lookups[i].probe(u0_bit_count,
> +                                                u1_bit_count);
> +        if (!probed_func) {
> +            continue;
> +        }
> +
> +        /* Better candidate - track this to return it later. */
>          if (probed_prio > prio) {
> -            dpcls_subtable_lookup_func probed_func;
> -            probed_func = subtable_lookups[i].probe(u0_bit_count,
> -                                    u1_bit_count);
> -            if (probed_func) {
> -                best_func = probed_func;
> -                prio = probed_prio;
> -                name = subtable_lookups[i].name;
> -            }
> +            best_func = probed_func;
> +            best_idx = i;
> +            prio = probed_prio;
> +            name = subtable_lookups[i].name;
> +        }
> +
> +        /* Return the replaced info struct to the user for usage statistics. */
> +        if (probed_func == old_func && old_info) {
> +            *old_info = &subtable_lookups[i];
>          }
>      }
>
> +    /* Return the newly used info struct to the user for usage statistics. */
> +    if (new_info) {
> +        *new_info = &subtable_lookups[best_idx];
> +    }
> +
>      VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
>               name, u0_bit_count, u1_bit_count, prio);
>
> @@ -122,3 +144,27 @@ dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
>
>      return best_func;
>  }
> +
> +void
> +dp_dpcls_impl_print_stats(struct ds *reply)

All external functions have dpcls_xxxx, and reason why this one has dp_dpcls_xxxx? I would change this one also to dpcls_impl_print_stats().


> +{
> +    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> +    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);

Guess this should have been int, but the dpcls_subtable_lookup_info_get() and dpcls_subtable_set_prio() should have returned int. Maybe, this can be fixed in a separate patch.

> +
> +    /* Add all DPCLS functions to reply string. */
> +    ds_put_cstr(reply, "Available dpcls implementations:\n");
> +
> +    for (int i = 0; i < count; i++) {
> +        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
> +                      lookup_funcs[i].name,
> +                      atomic_count_get(&lookup_funcs[i].usage_cnt),
> +                      lookup_funcs[i].prio);
> +
> +        if (ds_last(reply) == ' ') {
> +            ds_put_cstr(reply, "none");
> +        }
> +
> +        ds_put_cstr(reply, ")\n");
> +    }
> +
> +}

<SNIP>

Proposal of changes to this patch API (only compile tested/selftest tested on non AVX512):



diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
index 0aa79e27c..e641e4028 100644
--- a/lib/dpif-netdev-lookup.c
+++ b/lib/dpif-netdev-lookup.c
@@ -67,7 +67,7 @@ static struct dpcls_subtable_lookup_info_t subtable_lookups[] = {
 #endif
 };

-int32_t
+int
 dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr)
 {
     if (out_ptr == NULL) {
@@ -79,7 +79,7 @@ dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr)
 }

 /* sets the priority of the lookup function with "name". */
-int32_t
+int
 dpcls_subtable_set_prio(const char *name, uint8_t priority)
 {
     for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
@@ -97,59 +97,65 @@ dpcls_subtable_set_prio(const char *name, uint8_t priority)

 dpcls_subtable_lookup_func
 dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
-                             dpcls_subtable_lookup_func old_func,
-                             struct dpcls_subtable_lookup_info_t **old_info,
-                             struct dpcls_subtable_lookup_info_t **new_info)
+                             struct dpcls_subtable_lookup_info_t **info)
 {
-    /* Iter over each subtable impl, and get highest priority one. */
-    int32_t prio = -1;
-    const char *name = NULL;
-    int best_idx = 0;
+    struct dpcls_subtable_lookup_info_t *best_info = NULL;
     dpcls_subtable_lookup_func best_func = NULL;
+    int prio = -1;

+    /* Iter over each subtable impl, and get highest priority one. */
     for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
-        int32_t probed_prio = subtable_lookups[i].prio;
+        struct dpcls_subtable_lookup_info_t *impl_info = &subtable_lookups[i];
         dpcls_subtable_lookup_func probed_func;

+        if (impl_info->prio <= prio) {
+            continue;
+        }
+
         probed_func = subtable_lookups[i].probe(u0_bit_count,
                                                 u1_bit_count);
         if (!probed_func) {
             continue;
         }

-        /* Better candidate - track this to return it later. */
-        if (probed_prio > prio) {
-            best_func = probed_func;
-            best_idx = i;
-            prio = probed_prio;
-            name = subtable_lookups[i].name;
-        }
-
-        /* Return the replaced info struct to the user for usage statistics. */
-        if (probed_func == old_func && old_info) {
-            *old_info = &subtable_lookups[i];
-        }
+        best_func = probed_func;
+        best_info = impl_info;
+        prio = impl_info->prio;
     }

-    /* Return the newly used info struct to the user for usage statistics. */
-    if (new_info) {
-        *new_info = &subtable_lookups[best_idx];
-    }
+    /* Programming error - we must always return a valid func ptr. */
+    ovs_assert(best_func != NULL && best_info != NULL);

     VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
-             name, u0_bit_count, u1_bit_count, prio);
-
-    /* Programming error - we must always return a valid func ptr. */
-    ovs_assert(best_func != NULL);
+             best_info->name, u0_bit_count, u1_bit_count, prio);

+    if (info) {
+        *info = best_info;
+    }
     return best_func;
 }

 void
-dp_dpcls_impl_print_stats(struct ds *reply)
+dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info)
+{
+    if (info) {
+        atomic_count_inc(&info->usage_cnt);
+    }
+}
+
+void
+dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info)
+{
+    if (info) {
+        atomic_count_dec(&info->usage_cnt);
+    }
+}
+
+void
+dpcls_impl_print_stats(struct ds *reply)
 {
     struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
-    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
+    int count = dpcls_subtable_lookup_info_get(&lookup_funcs);

     /* Add all DPCLS functions to reply string. */
     ds_put_cstr(reply, "Available dpcls implementations:\n");
diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h
index 762147b4e..7f124a46e 100644
--- a/lib/dpif-netdev-lookup.h
+++ b/lib/dpif-netdev-lookup.h
@@ -68,30 +68,24 @@ struct dpcls_subtable_lookup_info_t {
     atomic_count usage_cnt;
 };

-int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
+int dpcls_subtable_set_prio(const char *name, uint8_t priority);
+void dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info);
+void dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info);

-/* Lookup the best subtable lookup implementation for the given u0,u1 count.
- * When replacing an existing lookup func, the old_func pointer, old_info
- * and new_func are used for tracking the current and old implementations
- * which are further used for incrementing or decrementing implementation
- * statistics.
- */
+/* Lookup the best subtable lookup implementation for the given u0,u1 count. */
 dpcls_subtable_lookup_func
 dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
-                             dpcls_subtable_lookup_func old_func,
-                             struct dpcls_subtable_lookup_info_t **old_info,
-                             struct dpcls_subtable_lookup_info_t **new_func);
-
+                             struct dpcls_subtable_lookup_info_t **info);

 /* Retrieve the array of lookup implementations for iteration.
  * On error, returns a negative number.
  * On success, returns the size of the arrays pointed to by the out parameter.
  */
-int32_t
+int
 dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr);

 /* Prints dpcls subtables in use for different implementations. */
 void
-dp_dpcls_impl_print_stats(struct ds *reply);
+dpcls_impl_print_stats(struct ds *reply);

 #endif /* dpif-netdev-lookup.h */
diff --git a/lib/dpif-netdev-private-dpcls.h b/lib/dpif-netdev-private-dpcls.h
index 7c4a840cb..06e896bef 100644
--- a/lib/dpif-netdev-private-dpcls.h
+++ b/lib/dpif-netdev-private-dpcls.h
@@ -85,6 +85,7 @@ struct dpcls_subtable {
      * used for the lookup) then this can point at an optimized version of
      * the lookup function for this particular subtable. */
     dpcls_subtable_lookup_func lookup_func;
+    struct dpcls_subtable_lookup_info_t *lookup_func_info;

     /* Caches the masks to match a packet to, reducing runtime calculations. */
     uint64_t *mf_masks;
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index fba02f141..4c36753ca 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -913,7 +913,7 @@ dpif_netdev_subtable_lookup_get(struct unixctl_conn *conn, int argc OVS_UNUSED,
 {
     struct ds reply = DS_EMPTY_INITIALIZER;

-    dp_dpcls_impl_print_stats(&reply);
+    dpcls_impl_print_stats(&reply);
     unixctl_command_reply(conn, ds_cstr(&reply));
     ds_destroy(&reply);
 }
@@ -8928,21 +8928,7 @@ dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
     pvector_remove(&cls->subtables, subtable);
     cmap_remove(&cls->subtables_map, &subtable->cmap_node,
                 subtable->mask.hash);
-
-    struct dpcls_subtable_lookup_info_t *old_info = NULL;
-    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
-                                 subtable->mf_bits_set_unit1,
-                                 subtable->lookup_func,
-                                 &old_info,
-                                 NULL);
-
-    /* Reduce the usage count as this subtable is being destroyed. */
-    if (OVS_UNLIKELY(!old_info)) {
-        VLOG_ERR("Subtable destory: No subtable to destroy\n");
-    } else  {
-        atomic_count_dec(&old_info->usage_cnt);
-    }
-
+    dpcls_info_dec_usage(subtable->lookup_func_info);
     ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);
 }

@@ -8968,7 +8954,6 @@ static struct dpcls_subtable *
 dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
 {
     struct dpcls_subtable *subtable;
-    struct dpcls_subtable_lookup_info_t *new_info = NULL;

     /* Need to add one. */
     subtable = xmalloc(sizeof *subtable
@@ -8991,11 +8976,9 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
      * The function is guaranteed to always return a valid implementation, and
      * possibly an ISA optimized, and/or specialized implementation.
      */
-    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
-                                                         NULL, &new_info);
-    if (new_info) {
-        atomic_count_inc(&new_info->usage_cnt);
-    }
+    subtable->lookup_func = dpcls_subtable_get_best_impl(
+        unit0, unit1, &subtable->lookup_func_info);
+    dpcls_info_inc_usage(subtable->lookup_func_info);

     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
     /* Add the new subtable at the end of the pvector (with no hits yet) */
@@ -9036,26 +9019,21 @@ dpcls_subtable_lookup_reprobe(struct dpcls *cls)
         uint32_t u0_bits = subtable->mf_bits_set_unit0;
         uint32_t u1_bits = subtable->mf_bits_set_unit1;
         void *old_func = subtable->lookup_func;
-        struct dpcls_subtable_lookup_info_t *old_info = NULL;
-        struct dpcls_subtable_lookup_info_t *new_info = NULL;
+        struct dpcls_subtable_lookup_info_t *old_info;
+
+        old_info = subtable->lookup_func_info;
+        subtable->lookup_func = dpcls_subtable_get_best_impl(
+            u0_bits, u1_bits, &subtable->lookup_func_info);

-        /* get best implementaiton, and pointers to old/new info structs to
-         * keep usage statistics up to date.
-         */
-        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits,
-                                                             old_func,
-                                                             &old_info,
-                                                             &new_info);
         if (old_func != subtable->lookup_func) {
             subtables_changed += 1;
         }

-        if (old_info) {
-            atomic_count_dec(&old_info->usage_cnt);
-        }
-
-        if (new_info) {
-            atomic_count_inc(&new_info->usage_cnt);
+        if (old_info != subtable->lookup_func_info) {
+            /* In theory, functions can be shared between implementations, so
+             * do an explicit check on the function info structures. */
+            dpcls_info_dec_usage(old_info);
+            dpcls_info_inc_usage(subtable->lookup_func_info);
         }
     }
     pvector_publish(pvec);
Kumar Amber Dec. 14, 2021, 10:01 a.m. UTC | #2
Hi Eelco,

Thanks for the reviews .

We created 2 patches one which I sent here, and one is almost identical to the patch suggested by you,
We thought the double pointer was more complicated and hence didn’t send that one first.

I will send the updated patch with your Input in sometime 😊.

Regards
Amber

> -----Original Message-----
> From: Eelco Chaudron <echaudro@redhat.com>
> Sent: Friday, December 10, 2021 6:49 PM
> To: Amber, Kumar <kumar.amber@intel.com>
> Cc: ovs-dev@openvswitch.org; fbl@sysclose.org; Stokes, Ian
> <ian.stokes@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>
> Subject: Re: [PATCH v4] dpcls: Change info-get function to fetch dpcls usage
> stats.
> 
> 
> 
> On 2 Dec 2021, at 11:05, Kumar Amber wrote:
> 
> > Modified the dplcs info-get command output to include the count for
> > different dpcls implementations.
> >
> > $ovs-appctl dpif-netdev/subtable-lookup-prio-get
> >
> > Available dpcls implementations:
> >   autovalidator (Use count: 1, Priority: 5)
> >   generic (Use count: 0, Priority: 1)
> >   avx512_gather (Use count: 0, Priority: 3)
> >
> > Test case to verify changes:
> > 	1021: PMD - dpcls configuration     ok
> >
> > Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> > Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
> >
> > ---
> > v4:
> > - Fix comments on the patch.
> > - Change API from an overloaded method of counting, to returning the
> >   old and new subtable structs. This allows the caller to identify the
> >   modified subtable implementations, and update the statistics accordingly.
> > v3:
> > - Fix comments on the patch.
> > - Function API remains same, see discussion on OVS ML here:
> >   "https://mail.openvswitch.org/pipermail/ovs-dev/2021-
> October/388737.html"
> > v2:
> > - Dependency merged rebased to master.
> >
> > ---
> > ---
> >  Documentation/topics/dpdk/bridge.rst | 16 +++----
> >  lib/dpif-netdev-lookup.c             | 70 +++++++++++++++++++++++-----
> >  lib/dpif-netdev-lookup.h             | 20 +++++++-
> >  lib/dpif-netdev.c                    | 61 +++++++++++++++++-------
> >  tests/pmd.at                         | 16 +++----
> >  5 files changed, 137 insertions(+), 46 deletions(-)
> >
> > diff --git a/Documentation/topics/dpdk/bridge.rst
> > b/Documentation/topics/dpdk/bridge.rst
> > index f645b9ade..63a54da1c 100644
> > --- a/Documentation/topics/dpdk/bridge.rst
> > +++ b/Documentation/topics/dpdk/bridge.rst
> > @@ -156,10 +156,10 @@ OVS provides multiple implementations of dpcls.
> > The following command enables  the user to check what implementations are
> available in a running instance ::
> >
> >      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> > -    Available lookup functions (priority : name)
> > -            0 : autovalidator
> > -            1 : generic
> > -            0 : avx512_gather
> > +    Available dpcls implementations:
> > +            autovalidator (Use count: 1, Priority: 5)
> > +            generic (Use count: 0, Priority: 1)
> > +            avx512_gather (Use count: 0, Priority: 3)
> >
> >  To set the priority of a lookup function, run the ``prio-set`` command ::
> >
> > @@ -172,10 +172,10 @@ function due to the command being run. To verify
> > the prioritization, re-run the  get command, note the updated priority of the
> ``avx512_gather`` function ::
> >
> >      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> > -    Available lookup functions (priority : name)
> > -            0 : autovalidator
> > -            1 : generic
> > -            5 : avx512_gather
> > +    Available dpcls implementations:
> > +            autovalidator (Use count: 0, Priority: 0)
> > +            generic (Use count: 0, Priority: 0)
> > +            avx512_gather (Use count: 1, Priority: 5)
> >
> >  If two lookup functions have the same priority, the first one in the
> > list is  chosen, and the 2nd occurance of that priority is not used.
> > Put in logical diff --git a/lib/dpif-netdev-lookup.c
> > b/lib/dpif-netdev-lookup.c index bd0a99abe..0aa79e27c 100644
> > --- a/lib/dpif-netdev-lookup.c
> > +++ b/lib/dpif-netdev-lookup.c
> > @@ -36,18 +36,21 @@ static struct dpcls_subtable_lookup_info_t
> subtable_lookups[] = {
> >      { .prio = 0,
> >  #endif
> >        .probe = dpcls_subtable_autovalidator_probe,
> > -      .name = "autovalidator", },
> > +      .name = "autovalidator",
> > +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
> >
> >      /* The default scalar C code implementation. */
> >      { .prio = 1,
> >        .probe = dpcls_subtable_generic_probe,
> > -      .name = "generic", },
> > +      .name = "generic",
> > +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
> >
> >  #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD &&
> __SSE4_2__)
> >      /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
> >      { .prio = 0,
> >        .probe = dpcls_subtable_avx512_gather_probe,
> > -      .name = "avx512_gather", },
> > +      .name = "avx512_gather",
> > +      .usage_cnt = ATOMIC_COUNT_INIT(0), },
> >  #else
> >      /* Disabling AVX512 at compile time, as compile time requirements not
> met.
> >       * This could be due to a number of reasons:
> > @@ -93,27 +96,46 @@ dpcls_subtable_set_prio(const char *name, uint8_t
> > priority)  }
> >
> >  dpcls_subtable_lookup_func
> > -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t
> > u1_bit_count)
> > +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> > +                             dpcls_subtable_lookup_func old_func,
> > +                             struct dpcls_subtable_lookup_info_t **old_info,
> > +                             struct dpcls_subtable_lookup_info_t
> > +**new_info)
> 
> I still feel this API is really ugly. See my proposed patch at the end.
> >
> >  {
> >      /* Iter over each subtable impl, and get highest priority one. */
> >      int32_t prio = -1;
> 
> This was not changed in this patch, but I guess this can be changed to int.
> 
> >      const char *name = NULL;
> > +    int best_idx = 0;
> >      dpcls_subtable_lookup_func best_func = NULL;
> >
> >      for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
> >          int32_t probed_prio = subtable_lookups[i].prio;
> 
> Same as above should be int.
> 
> > +        dpcls_subtable_lookup_func probed_func;
> > +
> > +        probed_func = subtable_lookups[i].probe(u0_bit_count,
> > +                                                u1_bit_count);
> > +        if (!probed_func) {
> > +            continue;
> > +        }
> > +
> > +        /* Better candidate - track this to return it later. */
> >          if (probed_prio > prio) {
> > -            dpcls_subtable_lookup_func probed_func;
> > -            probed_func = subtable_lookups[i].probe(u0_bit_count,
> > -                                    u1_bit_count);
> > -            if (probed_func) {
> > -                best_func = probed_func;
> > -                prio = probed_prio;
> > -                name = subtable_lookups[i].name;
> > -            }
> > +            best_func = probed_func;
> > +            best_idx = i;
> > +            prio = probed_prio;
> > +            name = subtable_lookups[i].name;
> > +        }
> > +
> > +        /* Return the replaced info struct to the user for usage statistics. */
> > +        if (probed_func == old_func && old_info) {
> > +            *old_info = &subtable_lookups[i];
> >          }
> >      }
> >
> > +    /* Return the newly used info struct to the user for usage statistics. */
> > +    if (new_info) {
> > +        *new_info = &subtable_lookups[best_idx];
> > +    }
> > +
> >      VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority
> %d\n",
> >               name, u0_bit_count, u1_bit_count, prio);
> >
> > @@ -122,3 +144,27 @@ dpcls_subtable_get_best_impl(uint32_t
> > u0_bit_count, uint32_t u1_bit_count)
> >
> >      return best_func;
> >  }
> > +
> > +void
> > +dp_dpcls_impl_print_stats(struct ds *reply)
> 
> All external functions have dpcls_xxxx, and reason why this one has
> dp_dpcls_xxxx? I would change this one also to dpcls_impl_print_stats().
> 
> 
> > +{
> > +    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> > +    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> 
> Guess this should have been int, but the dpcls_subtable_lookup_info_get() and
> dpcls_subtable_set_prio() should have returned int. Maybe, this can be fixed in a
> separate patch.
> 
> > +
> > +    /* Add all DPCLS functions to reply string. */
> > +    ds_put_cstr(reply, "Available dpcls implementations:\n");
> > +
> > +    for (int i = 0; i < count; i++) {
> > +        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
> > +                      lookup_funcs[i].name,
> > +                      atomic_count_get(&lookup_funcs[i].usage_cnt),
> > +                      lookup_funcs[i].prio);
> > +
> > +        if (ds_last(reply) == ' ') {
> > +            ds_put_cstr(reply, "none");
> > +        }
> > +
> > +        ds_put_cstr(reply, ")\n");
> > +    }
> > +
> > +}
> 
> <SNIP>
> 
> Proposal of changes to this patch API (only compile tested/selftest tested on
> non AVX512):
> 
> 
> 
> diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c index
> 0aa79e27c..e641e4028 100644
> --- a/lib/dpif-netdev-lookup.c
> +++ b/lib/dpif-netdev-lookup.c
> @@ -67,7 +67,7 @@ static struct dpcls_subtable_lookup_info_t
> subtable_lookups[] = {  #endif  };
> 
> -int32_t
> +int
>  dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t
> **out_ptr)  {
>      if (out_ptr == NULL) {
> @@ -79,7 +79,7 @@ dpcls_subtable_lookup_info_get(struct
> dpcls_subtable_lookup_info_t **out_ptr)  }
> 
>  /* sets the priority of the lookup function with "name". */ -int32_t
> +int
>  dpcls_subtable_set_prio(const char *name, uint8_t priority)  {
>      for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) { @@ -97,59 +97,65 @@
> dpcls_subtable_set_prio(const char *name, uint8_t priority)
> 
>  dpcls_subtable_lookup_func
>  dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> -                             dpcls_subtable_lookup_func old_func,
> -                             struct dpcls_subtable_lookup_info_t **old_info,
> -                             struct dpcls_subtable_lookup_info_t **new_info)
> +                             struct dpcls_subtable_lookup_info_t
> + **info)
>  {
> -    /* Iter over each subtable impl, and get highest priority one. */
> -    int32_t prio = -1;
> -    const char *name = NULL;
> -    int best_idx = 0;
> +    struct dpcls_subtable_lookup_info_t *best_info = NULL;
>      dpcls_subtable_lookup_func best_func = NULL;
> +    int prio = -1;
> 
> +    /* Iter over each subtable impl, and get highest priority one. */
>      for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
> -        int32_t probed_prio = subtable_lookups[i].prio;
> +        struct dpcls_subtable_lookup_info_t *impl_info =
> + &subtable_lookups[i];
>          dpcls_subtable_lookup_func probed_func;
> 
> +        if (impl_info->prio <= prio) {
> +            continue;
> +        }
> +
>          probed_func = subtable_lookups[i].probe(u0_bit_count,
>                                                  u1_bit_count);
>          if (!probed_func) {
>              continue;
>          }
> 
> -        /* Better candidate - track this to return it later. */
> -        if (probed_prio > prio) {
> -            best_func = probed_func;
> -            best_idx = i;
> -            prio = probed_prio;
> -            name = subtable_lookups[i].name;
> -        }
> -
> -        /* Return the replaced info struct to the user for usage statistics. */
> -        if (probed_func == old_func && old_info) {
> -            *old_info = &subtable_lookups[i];
> -        }
> +        best_func = probed_func;
> +        best_info = impl_info;
> +        prio = impl_info->prio;
>      }
> 
> -    /* Return the newly used info struct to the user for usage statistics. */
> -    if (new_info) {
> -        *new_info = &subtable_lookups[best_idx];
> -    }
> +    /* Programming error - we must always return a valid func ptr. */
> +    ovs_assert(best_func != NULL && best_info != NULL);
> 
>      VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
> -             name, u0_bit_count, u1_bit_count, prio);
> -
> -    /* Programming error - we must always return a valid func ptr. */
> -    ovs_assert(best_func != NULL);
> +             best_info->name, u0_bit_count, u1_bit_count, prio);
> 
> +    if (info) {
> +        *info = best_info;
> +    }
>      return best_func;
>  }
> 
>  void
> -dp_dpcls_impl_print_stats(struct ds *reply)
> +dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info) {
> +    if (info) {
> +        atomic_count_inc(&info->usage_cnt);
> +    }
> +}
> +
> +void
> +dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info) {
> +    if (info) {
> +        atomic_count_dec(&info->usage_cnt);
> +    }
> +}
> +
> +void
> +dpcls_impl_print_stats(struct ds *reply)
>  {
>      struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> -    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> +    int count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> 
>      /* Add all DPCLS functions to reply string. */
>      ds_put_cstr(reply, "Available dpcls implementations:\n"); diff --git a/lib/dpif-
> netdev-lookup.h b/lib/dpif-netdev-lookup.h index 762147b4e..7f124a46e
> 100644
> --- a/lib/dpif-netdev-lookup.h
> +++ b/lib/dpif-netdev-lookup.h
> @@ -68,30 +68,24 @@ struct dpcls_subtable_lookup_info_t {
>      atomic_count usage_cnt;
>  };
> 
> -int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
> +int dpcls_subtable_set_prio(const char *name, uint8_t priority); void
> +dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info); void
> +dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info);
> 
> -/* Lookup the best subtable lookup implementation for the given u0,u1 count.
> - * When replacing an existing lookup func, the old_func pointer, old_info
> - * and new_func are used for tracking the current and old implementations
> - * which are further used for incrementing or decrementing implementation
> - * statistics.
> - */
> +/* Lookup the best subtable lookup implementation for the given u0,u1
> +count. */
>  dpcls_subtable_lookup_func
>  dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> -                             dpcls_subtable_lookup_func old_func,
> -                             struct dpcls_subtable_lookup_info_t **old_info,
> -                             struct dpcls_subtable_lookup_info_t **new_func);
> -
> +                             struct dpcls_subtable_lookup_info_t
> + **info);
> 
>  /* Retrieve the array of lookup implementations for iteration.
>   * On error, returns a negative number.
>   * On success, returns the size of the arrays pointed to by the out parameter.
>   */
> -int32_t
> +int
>  dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t
> **out_ptr);
> 
>  /* Prints dpcls subtables in use for different implementations. */  void -
> dp_dpcls_impl_print_stats(struct ds *reply);
> +dpcls_impl_print_stats(struct ds *reply);
> 
>  #endif /* dpif-netdev-lookup.h */
> diff --git a/lib/dpif-netdev-private-dpcls.h b/lib/dpif-netdev-private-dpcls.h index
> 7c4a840cb..06e896bef 100644
> --- a/lib/dpif-netdev-private-dpcls.h
> +++ b/lib/dpif-netdev-private-dpcls.h
> @@ -85,6 +85,7 @@ struct dpcls_subtable {
>       * used for the lookup) then this can point at an optimized version of
>       * the lookup function for this particular subtable. */
>      dpcls_subtable_lookup_func lookup_func;
> +    struct dpcls_subtable_lookup_info_t *lookup_func_info;
> 
>      /* Caches the masks to match a packet to, reducing runtime calculations. */
>      uint64_t *mf_masks;
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index fba02f141..4c36753ca
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -913,7 +913,7 @@ dpif_netdev_subtable_lookup_get(struct unixctl_conn
> *conn, int argc OVS_UNUSED,  {
>      struct ds reply = DS_EMPTY_INITIALIZER;
> 
> -    dp_dpcls_impl_print_stats(&reply);
> +    dpcls_impl_print_stats(&reply);
>      unixctl_command_reply(conn, ds_cstr(&reply));
>      ds_destroy(&reply);
>  }
> @@ -8928,21 +8928,7 @@ dpcls_destroy_subtable(struct dpcls *cls, struct
> dpcls_subtable *subtable)
>      pvector_remove(&cls->subtables, subtable);
>      cmap_remove(&cls->subtables_map, &subtable->cmap_node,
>                  subtable->mask.hash);
> -
> -    struct dpcls_subtable_lookup_info_t *old_info = NULL;
> -    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
> -                                 subtable->mf_bits_set_unit1,
> -                                 subtable->lookup_func,
> -                                 &old_info,
> -                                 NULL);
> -
> -    /* Reduce the usage count as this subtable is being destroyed. */
> -    if (OVS_UNLIKELY(!old_info)) {
> -        VLOG_ERR("Subtable destory: No subtable to destroy\n");
> -    } else  {
> -        atomic_count_dec(&old_info->usage_cnt);
> -    }
> -
> +    dpcls_info_dec_usage(subtable->lookup_func_info);
>      ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);  }
> 
> @@ -8968,7 +8954,6 @@ static struct dpcls_subtable *
> dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)  {
>      struct dpcls_subtable *subtable;
> -    struct dpcls_subtable_lookup_info_t *new_info = NULL;
> 
>      /* Need to add one. */
>      subtable = xmalloc(sizeof *subtable @@ -8991,11 +8976,9 @@
> dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
>       * The function is guaranteed to always return a valid implementation, and
>       * possibly an ISA optimized, and/or specialized implementation.
>       */
> -    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
> -                                                         NULL, &new_info);
> -    if (new_info) {
> -        atomic_count_inc(&new_info->usage_cnt);
> -    }
> +    subtable->lookup_func = dpcls_subtable_get_best_impl(
> +        unit0, unit1, &subtable->lookup_func_info);
> +    dpcls_info_inc_usage(subtable->lookup_func_info);
> 
>      cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
>      /* Add the new subtable at the end of the pvector (with no hits yet) */ @@ -
> 9036,26 +9019,21 @@ dpcls_subtable_lookup_reprobe(struct dpcls *cls)
>          uint32_t u0_bits = subtable->mf_bits_set_unit0;
>          uint32_t u1_bits = subtable->mf_bits_set_unit1;
>          void *old_func = subtable->lookup_func;
> -        struct dpcls_subtable_lookup_info_t *old_info = NULL;
> -        struct dpcls_subtable_lookup_info_t *new_info = NULL;
> +        struct dpcls_subtable_lookup_info_t *old_info;
> +
> +        old_info = subtable->lookup_func_info;
> +        subtable->lookup_func = dpcls_subtable_get_best_impl(
> +            u0_bits, u1_bits, &subtable->lookup_func_info);
> 
> -        /* get best implementaiton, and pointers to old/new info structs to
> -         * keep usage statistics up to date.
> -         */
> -        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits,
> -                                                             old_func,
> -                                                             &old_info,
> -                                                             &new_info);
>          if (old_func != subtable->lookup_func) {
>              subtables_changed += 1;
>          }
> 
> -        if (old_info) {
> -            atomic_count_dec(&old_info->usage_cnt);
> -        }
> -
> -        if (new_info) {
> -            atomic_count_inc(&new_info->usage_cnt);
> +        if (old_info != subtable->lookup_func_info) {
> +            /* In theory, functions can be shared between implementations, so
> +             * do an explicit check on the function info structures. */
> +            dpcls_info_dec_usage(old_info);
> +            dpcls_info_inc_usage(subtable->lookup_func_info);
>          }
>      }
>      pvector_publish(pvec);
diff mbox series

Patch

diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
index f645b9ade..63a54da1c 100644
--- a/Documentation/topics/dpdk/bridge.rst
+++ b/Documentation/topics/dpdk/bridge.rst
@@ -156,10 +156,10 @@  OVS provides multiple implementations of dpcls. The following command enables
 the user to check what implementations are available in a running instance ::
 
     $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-    Available lookup functions (priority : name)
-            0 : autovalidator
-            1 : generic
-            0 : avx512_gather
+    Available dpcls implementations:
+            autovalidator (Use count: 1, Priority: 5)
+            generic (Use count: 0, Priority: 1)
+            avx512_gather (Use count: 0, Priority: 3)
 
 To set the priority of a lookup function, run the ``prio-set`` command ::
 
@@ -172,10 +172,10 @@  function due to the command being run. To verify the prioritization, re-run the
 get command, note the updated priority of the ``avx512_gather`` function ::
 
     $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-    Available lookup functions (priority : name)
-            0 : autovalidator
-            1 : generic
-            5 : avx512_gather
+    Available dpcls implementations:
+            autovalidator (Use count: 0, Priority: 0)
+            generic (Use count: 0, Priority: 0)
+            avx512_gather (Use count: 1, Priority: 5)
 
 If two lookup functions have the same priority, the first one in the list is
 chosen, and the 2nd occurance of that priority is not used. Put in logical
diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
index bd0a99abe..0aa79e27c 100644
--- a/lib/dpif-netdev-lookup.c
+++ b/lib/dpif-netdev-lookup.c
@@ -36,18 +36,21 @@  static struct dpcls_subtable_lookup_info_t subtable_lookups[] = {
     { .prio = 0,
 #endif
       .probe = dpcls_subtable_autovalidator_probe,
-      .name = "autovalidator", },
+      .name = "autovalidator",
+      .usage_cnt = ATOMIC_COUNT_INIT(0), },
 
     /* The default scalar C code implementation. */
     { .prio = 1,
       .probe = dpcls_subtable_generic_probe,
-      .name = "generic", },
+      .name = "generic",
+      .usage_cnt = ATOMIC_COUNT_INIT(0), },
 
 #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__)
     /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
     { .prio = 0,
       .probe = dpcls_subtable_avx512_gather_probe,
-      .name = "avx512_gather", },
+      .name = "avx512_gather",
+      .usage_cnt = ATOMIC_COUNT_INIT(0), },
 #else
     /* Disabling AVX512 at compile time, as compile time requirements not met.
      * This could be due to a number of reasons:
@@ -93,27 +96,46 @@  dpcls_subtable_set_prio(const char *name, uint8_t priority)
 }
 
 dpcls_subtable_lookup_func
-dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
+dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
+                             dpcls_subtable_lookup_func old_func,
+                             struct dpcls_subtable_lookup_info_t **old_info,
+                             struct dpcls_subtable_lookup_info_t **new_info)
 {
     /* Iter over each subtable impl, and get highest priority one. */
     int32_t prio = -1;
     const char *name = NULL;
+    int best_idx = 0;
     dpcls_subtable_lookup_func best_func = NULL;
 
     for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
         int32_t probed_prio = subtable_lookups[i].prio;
+        dpcls_subtable_lookup_func probed_func;
+
+        probed_func = subtable_lookups[i].probe(u0_bit_count,
+                                                u1_bit_count);
+        if (!probed_func) {
+            continue;
+        }
+
+        /* Better candidate - track this to return it later. */
         if (probed_prio > prio) {
-            dpcls_subtable_lookup_func probed_func;
-            probed_func = subtable_lookups[i].probe(u0_bit_count,
-                                    u1_bit_count);
-            if (probed_func) {
-                best_func = probed_func;
-                prio = probed_prio;
-                name = subtable_lookups[i].name;
-            }
+            best_func = probed_func;
+            best_idx = i;
+            prio = probed_prio;
+            name = subtable_lookups[i].name;
+        }
+
+        /* Return the replaced info struct to the user for usage statistics. */
+        if (probed_func == old_func && old_info) {
+            *old_info = &subtable_lookups[i];
         }
     }
 
+    /* Return the newly used info struct to the user for usage statistics. */
+    if (new_info) {
+        *new_info = &subtable_lookups[best_idx];
+    }
+
     VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
              name, u0_bit_count, u1_bit_count, prio);
 
@@ -122,3 +144,27 @@  dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
 
     return best_func;
 }
+
+void
+dp_dpcls_impl_print_stats(struct ds *reply)
+{
+    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
+    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
+
+    /* Add all DPCLS functions to reply string. */
+    ds_put_cstr(reply, "Available dpcls implementations:\n");
+
+    for (int i = 0; i < count; i++) {
+        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
+                      lookup_funcs[i].name,
+                      atomic_count_get(&lookup_funcs[i].usage_cnt),
+                      lookup_funcs[i].prio);
+
+        if (ds_last(reply) == ' ') {
+            ds_put_cstr(reply, "none");
+        }
+
+        ds_put_cstr(reply, ")\n");
+    }
+
+}
diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h
index 59f51faa0..762147b4e 100644
--- a/lib/dpif-netdev-lookup.h
+++ b/lib/dpif-netdev-lookup.h
@@ -20,6 +20,7 @@ 
 #include <config.h>
 #include "dpif-netdev.h"
 #include "dpif-netdev-private-dpcls.h"
+#include "dpif-netdev-private-thread.h"
 
 /* Function to perform a probe for the subtable bit fingerprint.
  * Returns NULL if not valid, or a valid function pointer to call for this
@@ -62,12 +63,25 @@  struct dpcls_subtable_lookup_info_t {
 
     /* Human readable name, used in setting subtable priority commands */
     const char *name;
+
+    /* Counter which holds the usage count of each implementations. */
+    atomic_count usage_cnt;
 };
 
 int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
 
+/* Lookup the best subtable lookup implementation for the given u0,u1 count.
+ * When replacing an existing lookup func, the old_func pointer, old_info
+ * and new_func are used for tracking the current and old implementations
+ * which are further used for incrementing or decrementing implementation
+ * statistics.
+ */
 dpcls_subtable_lookup_func
-dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
+dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
+                             dpcls_subtable_lookup_func old_func,
+                             struct dpcls_subtable_lookup_info_t **old_info,
+                             struct dpcls_subtable_lookup_info_t **new_func);
+
 
 /* Retrieve the array of lookup implementations for iteration.
  * On error, returns a negative number.
@@ -76,4 +90,8 @@  dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
 int32_t
 dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr);
 
+/* Prints dpcls subtables in use for different implementations. */
+void
+dp_dpcls_impl_print_stats(struct ds *reply);
+
 #endif /* dpif-netdev-lookup.h */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 69d7ec26e..8f1419119 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -912,21 +912,9 @@  dpif_netdev_subtable_lookup_get(struct unixctl_conn *conn, int argc OVS_UNUSED,
                                 const char *argv[] OVS_UNUSED,
                                 void *aux OVS_UNUSED)
 {
-    /* Get a list of all lookup functions. */
-    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
-    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
-    if (count < 0) {
-        unixctl_command_reply_error(conn, "error getting lookup names");
-        return;
-    }
-
-    /* Add all lookup functions to reply string. */
     struct ds reply = DS_EMPTY_INITIALIZER;
-    ds_put_cstr(&reply, "Available lookup functions (priority : name)\n");
-    for (int i = 0; i < count; i++) {
-        ds_put_format(&reply, "  %d : %s\n", lookup_funcs[i].prio,
-                      lookup_funcs[i].name);
-    }
+
+    dp_dpcls_impl_print_stats(&reply);
     unixctl_command_reply(conn, ds_cstr(&reply));
     ds_destroy(&reply);
 }
@@ -8978,6 +8966,21 @@  dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
     pvector_remove(&cls->subtables, subtable);
     cmap_remove(&cls->subtables_map, &subtable->cmap_node,
                 subtable->mask.hash);
+
+    struct dpcls_subtable_lookup_info_t *old_info = NULL;
+    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
+                                 subtable->mf_bits_set_unit1,
+                                 subtable->lookup_func,
+                                 &old_info,
+                                 NULL);
+
+    /* Reduce the usage count as this subtable is being destroyed. */
+    if (OVS_UNLIKELY(!old_info)) {
+        VLOG_ERR("Subtable destory: No subtable to destroy\n");
+    } else  {
+        atomic_count_dec(&old_info->usage_cnt);
+    }
+
     ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);
 }
 
@@ -9003,6 +9006,7 @@  static struct dpcls_subtable *
 dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
 {
     struct dpcls_subtable *subtable;
+    struct dpcls_subtable_lookup_info_t *new_info = NULL;
 
     /* Need to add one. */
     subtable = xmalloc(sizeof *subtable
@@ -9025,7 +9029,11 @@  dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
      * The function is guaranteed to always return a valid implementation, and
      * possibly an ISA optimized, and/or specialized implementation.
      */
-    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1);
+    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
+                                                         NULL, &new_info);
+    if (new_info) {
+        atomic_count_inc(&new_info->usage_cnt);
+    }
 
     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
     /* Add the new subtable at the end of the pvector (with no hits yet) */
@@ -9066,8 +9074,27 @@  dpcls_subtable_lookup_reprobe(struct dpcls *cls)
         uint32_t u0_bits = subtable->mf_bits_set_unit0;
         uint32_t u1_bits = subtable->mf_bits_set_unit1;
         void *old_func = subtable->lookup_func;
-        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits);
-        subtables_changed += (old_func != subtable->lookup_func);
+        struct dpcls_subtable_lookup_info_t *old_info = NULL;
+        struct dpcls_subtable_lookup_info_t *new_info = NULL;
+
+        /* get best implementaiton, and pointers to old/new info structs to
+         * keep usage statistics up to date.
+         */
+        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits,
+                                                             old_func,
+                                                             &old_info,
+                                                             &new_info);
+        if (old_func != subtable->lookup_func) {
+            subtables_changed += 1;
+        }
+
+        if (old_info) {
+            atomic_count_dec(&old_info->usage_cnt);
+        }
+
+        if (new_info) {
+            atomic_count_inc(&new_info->usage_cnt);
+        }
     }
     pvector_publish(pvec);
 
diff --git a/tests/pmd.at b/tests/pmd.at
index c875a744f..e61bb27d3 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -1091,11 +1091,11 @@  AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-pmd])
 
 AT_CHECK([ovs-vsctl show], [], [stdout])
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  1 : generic
+  generic (Use count: 0, Priority: 1)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  0 : autovalidator
+  autovalidator (Use count: 0, Priority: 0)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 3], [0], [dnl
@@ -1103,7 +1103,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  3 : autovalidator
+  autovalidator (Use count: 0, Priority: 3)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 4], [0], [dnl
@@ -1111,7 +1111,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  4 : generic
+  generic (Use count: 0, Priority: 4)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 8], [0], [dnl
@@ -1119,7 +1119,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  8 : generic
+  generic (Use count: 0, Priority: 8)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 8], [0], [dnl
@@ -1127,7 +1127,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  8 : autovalidator
+  autovalidator (Use count: 0, Priority: 8)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 0], [0], [dnl
@@ -1135,7 +1135,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  0 : generic
+  generic (Use count: 0, Priority: 0)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 255], [0], [dnl
@@ -1143,7 +1143,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  255 : generic
+  generic (Use count: 0, Priority: 255)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic -1], [2],