diff mbox series

[ovs-dev,v2] dpcls: Change info-get function to fetch dpcls usage stats.

Message ID 20211020045246.3408624-1-kumar.amber@intel.com
State Superseded
Headers show
Series [ovs-dev,v2] dpcls: Change info-get function to fetch dpcls usage stats. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Kumar Amber Oct. 20, 2021, 4:52 a.m. UTC
Modified the dplcs info-get command output to include
the count for different dpcls implementations.

$ovs-appctl dpif-netdev/subtable-lookup-prio-get

Available dpcls implementations:
  autovalidator (Use count: 1, Priority: 5)
  generic (Use count: 0, Priority: 1)
  avx512_gather (Use count: 0, Priority: 3)

Test case to verify changes:
	1021: PMD - dpcls configuration     ok

Signed-off-by: Kumar Amber <kumar.amber@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>

---
v2: Dependency merged rebased to master

---
---
 Documentation/topics/dpdk/bridge.rst | 16 +++---
 lib/dpif-netdev-lookup.c             | 80 ++++++++++++++++++++++------
 lib/dpif-netdev-lookup.h             | 19 ++++++-
 lib/dpif-netdev.c                    | 30 +++++------
 tests/pmd.at                         | 16 +++---
 5 files changed, 111 insertions(+), 50 deletions(-)

Comments

Eelco Chaudron Oct. 21, 2021, 12:03 p.m. UTC | #1
Hi Kumar,

See inline comments below…

//Eelco


On 20 Oct 2021, at 6:52, Kumar Amber wrote:

> Modified the dplcs info-get command output to include
> the count for different dpcls implementations.
>
> $ovs-appctl dpif-netdev/subtable-lookup-prio-get
>
> Available dpcls implementations:
>   autovalidator (Use count: 1, Priority: 5)
>   generic (Use count: 0, Priority: 1)
>   avx512_gather (Use count: 0, Priority: 3)
>
> Test case to verify changes:
> 	1021: PMD - dpcls configuration     ok
>
> Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> ---
> v2: Dependency merged rebased to master
>
> ---
> ---
>  Documentation/topics/dpdk/bridge.rst | 16 +++---
>  lib/dpif-netdev-lookup.c             | 80 ++++++++++++++++++++++------
>  lib/dpif-netdev-lookup.h             | 19 ++++++-
>  lib/dpif-netdev.c                    | 30 +++++------
>  tests/pmd.at                         | 16 +++---
>  5 files changed, 111 insertions(+), 50 deletions(-)
>
> diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> index f645b9ade..63a54da1c 100644
> --- a/Documentation/topics/dpdk/bridge.rst
> +++ b/Documentation/topics/dpdk/bridge.rst
> @@ -156,10 +156,10 @@ OVS provides multiple implementations of dpcls. The following command enables
>  the user to check what implementations are available in a running instance ::
>
>      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> -    Available lookup functions (priority : name)
> -            0 : autovalidator
> -            1 : generic
> -            0 : avx512_gather
> +    Available dpcls implementations:
> +            autovalidator (Use count: 1, Priority: 5)
> +            generic (Use count: 0, Priority: 1)
> +            avx512_gather (Use count: 0, Priority: 3)
>
>  To set the priority of a lookup function, run the ``prio-set`` command ::
>
> @@ -172,10 +172,10 @@ function due to the command being run. To verify the prioritization, re-run the
>  get command, note the updated priority of the ``avx512_gather`` function ::
>
>      $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> -    Available lookup functions (priority : name)
> -            0 : autovalidator
> -            1 : generic
> -            5 : avx512_gather
> +    Available dpcls implementations:
> +            autovalidator (Use count: 0, Priority: 0)
> +            generic (Use count: 0, Priority: 0)
> +            avx512_gather (Use count: 1, Priority: 5)
>
>  If two lookup functions have the same priority, the first one in the list is
>  chosen, and the 2nd occurance of that priority is not used. Put in logical
> diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
> index bd0a99abe..c0b137299 100644
> --- a/lib/dpif-netdev-lookup.c
> +++ b/lib/dpif-netdev-lookup.c
> @@ -36,18 +36,21 @@ static struct dpcls_subtable_lookup_info_t subtable_lookups[] = {
>      { .prio = 0,
>  #endif
>        .probe = dpcls_subtable_autovalidator_probe,
> -      .name = "autovalidator", },
> +      .name = "autovalidator",
> +      .usage_cnt = 0,},
>
>      /* The default scalar C code implementation. */
>      { .prio = 1,
>        .probe = dpcls_subtable_generic_probe,
> -      .name = "generic", },
> +      .name = "generic",
> +      .usage_cnt = 0, },
>
>  #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__)
>      /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
>      { .prio = 0,
>        .probe = dpcls_subtable_avx512_gather_probe,
> -      .name = "avx512_gather", },
> +      .name = "avx512_gather",
> +      .usage_cnt = 0,},
>  #else
>      /* Disabling AVX512 at compile time, as compile time requirements not met.
>       * This could be due to a number of reasons:
> @@ -93,32 +96,79 @@ dpcls_subtable_set_prio(const char *name, uint8_t priority)
>  }
>
>  dpcls_subtable_lookup_func
> -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
> +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> +                             dpcls_subtable_lookup_func old_func,
> +                             uint32_t will_use_result)

Why uint32_t and not a boolean?

>  {
>      /* Iter over each subtable impl, and get highest priority one. */
>      int32_t prio = -1;
> +    uint32_t i;
>      const char *name = NULL;
> +    uint32_t best_idx = 0;
> +    uint32_t usage_cnt = 0;
>      dpcls_subtable_lookup_func best_func = NULL;
>
> -    for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
> +    for (i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
>          int32_t probed_prio = subtable_lookups[i].prio;
> +        dpcls_subtable_lookup_func probed_func;

Add extra blanc line here

> +        probed_func = subtable_lookups[i].probe(u0_bit_count,
> +                                u1_bit_count);

Indentation is off

> +        if (!probed_func) {
> +            continue;
> +        }
> +
> +        /* Better candidate - track this to return it later. */
>          if (probed_prio > prio) {
> -            dpcls_subtable_lookup_func probed_func;
> -            probed_func = subtable_lookups[i].probe(u0_bit_count,
> -                                    u1_bit_count);
> -            if (probed_func) {
> -                best_func = probed_func;
> -                prio = probed_prio;
> -                name = subtable_lookups[i].name;
> -            }
> +            best_func = probed_func;
> +            best_idx = i;
> +            prio = probed_prio;
> +            name = subtable_lookups[i].name;
> +        }
> +
> +        /* Statistics keeping, reduce old func usage count. */
> +        if (probed_func == old_func) {

Should this check also not include “&& will_use_result“?
Guess you overload this function to do a counter update, which I do not like (see comments below).

> +            atomic_read_relaxed(&subtable_lookups[i].usage_cnt, &usage_cnt);
> +            usage_cnt--;
> +            atomic_store_relaxed(&subtable_lookups[i].usage_cnt, usage_cnt);

Not sure what you are trying here, but looks like you make an atomic operation, not atomic ;)

We have the following APIs which I think you should use; atomic_count_init()/atomic_count_inc()/atomic_count_dec().

>          }
>      }
>
> -    VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
> -             name, u0_bit_count, u1_bit_count, prio);
> +    /* Update stats for usage. */
> +    if (will_use_result) {
> +        atomic_read_relaxed(&subtable_lookups[best_idx].usage_cnt, &usage_cnt);
> +        usage_cnt++;
> +        atomic_store_relaxed(&subtable_lookups[best_idx].usage_cnt, usage_cnt);
> +    } else {
> +       VLOG_DBG(
> +           "Subtable lookup function '%s' with units (%d,%d), priority %d\n",
> +           name, u0_bit_count, u1_bit_count, prio);
> +    }

Why not always show the log? Guess now you only show it when not using the return value?

>
>      /* Programming error - we must always return a valid func ptr. */
>      ovs_assert(best_func != NULL);
>
>      return best_func;
>  }
> +
> +void
> +dp_dpcls_impl_print_stats(struct ds *reply)
> +{
> +    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> +    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> +
> +    /* Add all DPCLS functions to reply string. */
> +    ds_put_cstr(reply, "Available dpcls implementations:\n");

Would be nice to do some qsort() to show the entries based on priority/name?
> +
> +    for (int i = 0; i < count; i++) {
> +        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
> +                      lookup_funcs[i].name, lookup_funcs[i].usage_cnt,
> +                      lookup_funcs[i].prio);
> +
> +        if (ds_last(reply) == ' ') {
> +            ds_put_cstr(reply, "none");
> +        }
> +
> +        ds_put_cstr(reply, ")\n");
> +    }
> +
> +}
> diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h
> index 59f51faa0..13431a3fb 100644
> --- a/lib/dpif-netdev-lookup.h
> +++ b/lib/dpif-netdev-lookup.h
> @@ -20,6 +20,7 @@
>  #include <config.h>
>  #include "dpif-netdev.h"
>  #include "dpif-netdev-private-dpcls.h"
> +#include "dpif-netdev-private-thread.h"
>
>  /* Function to perform a probe for the subtable bit fingerprint.
>   * Returns NULL if not valid, or a valid function pointer to call for this
> @@ -62,12 +63,24 @@ struct dpcls_subtable_lookup_info_t {
>
>      /* Human readable name, used in setting subtable priority commands */
>      const char *name;
> +
> +    /* Counter which holds the usage count of each implementations. */
> +    atomic_uint32_t usage_cnt;

Guess this needs to be atomic_count
>  };
>
>  int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
>
> +/* Lookup the best subtable lookup implementation for the given u0,u1 count.
> + * When replacing an existing lookup func, the old_func pointer is provied
> + * and statistics will be tracked accordingly and will_use parameter would
> + * be set to 1 if the result of the function is going to be used for a
> + * subtable lookup else when set to 0, the statistics will refect that the
> + * result is not being used.
> + */

See below, as I do not like the API.

>  dpcls_subtable_lookup_func
> -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
> +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> +                             dpcls_subtable_lookup_func old_func,
> +                             uint32_t will_use);
>
>  /* Retrieve the array of lookup implementations for iteration.
>   * On error, returns a negative number.
> @@ -76,4 +89,8 @@ dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
>  int32_t
>  dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr);
>
> +/* Prints dpcls subtables in use for different implementations. */
> +void
> +dp_dpcls_impl_print_stats(struct ds *reply);
> +
>  #endif /* dpif-netdev-lookup.h */
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index b078c2da5..c4ca3ce51 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -911,21 +911,8 @@ dpif_netdev_subtable_lookup_get(struct unixctl_conn *conn, int argc OVS_UNUSED,
>                                  const char *argv[] OVS_UNUSED,
>                                  void *aux OVS_UNUSED)
>  {
> -    /* Get a list of all lookup functions. */
> -    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> -    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> -    if (count < 0) {
> -        unixctl_command_reply_error(conn, "error getting lookup names");
> -        return;
> -    }
> -
> -    /* Add all lookup functions to reply string. */
>      struct ds reply = DS_EMPTY_INITIALIZER;
> -    ds_put_cstr(&reply, "Available lookup functions (priority : name)\n");
> -    for (int i = 0; i < count; i++) {
> -        ds_put_format(&reply, "  %d : %s\n", lookup_funcs[i].prio,
> -                      lookup_funcs[i].name);
> -    }

Add extra enter between variable declaration and code.

> +    dp_dpcls_impl_print_stats(&reply);
>      unixctl_command_reply(conn, ds_cstr(&reply));
>      ds_destroy(&reply);
>  }
> @@ -8940,6 +8927,10 @@ dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
>      pvector_remove(&cls->subtables, subtable);
>      cmap_remove(&cls->subtables_map, &subtable->cmap_node,
>                  subtable->mask.hash);
> +
> +    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
> +                                 subtable->mf_bits_set_unit1,
> +                                 subtable->lookup_func, 0);

First, the 0, should probably be a false bool. But in general, I do not like this API overload. Why would you call a dpcls_subtable_get_best_impl() to reset a counter? Can we not add a new API in form or
dpcls_update_impl_use_count(struct dpcls_subtable *ubtable, bool increase), or something else thats is more clear?

>      ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);
>  }
>
> @@ -8987,8 +8978,8 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
>       * The function is guaranteed to always return a valid implementation, and
>       * possibly an ISA optimized, and/or specialized implementation.
>       */
> -    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1);
> -
> +    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
> +                                                         1);

If dpcls_subtable_get_best_impl() would always increase the use count, no changes should be needed here.

>      cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
>      /* Add the new subtable at the end of the pvector (with no hits yet) */
>      pvector_insert(&cls->subtables, subtable, 0);
> @@ -9028,8 +9019,11 @@ dpcls_subtable_lookup_reprobe(struct dpcls *cls)
>          uint32_t u0_bits = subtable->mf_bits_set_unit0;
>          uint32_t u1_bits = subtable->mf_bits_set_unit1;
>          void *old_func = subtable->lookup_func;

Guess here we could do dpcls_update_impl_use_count(subtable->lookup_func, false)
And then call dpcls_subtable_get_best_impl() which will increase the use count.
> -        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits);
> -        subtables_changed += (old_func != subtable->lookup_func);
> +        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits,
> +                                                             old_func, 1);
> +        if (old_func != subtable->lookup_func) {
> +            subtables_changed += 1;
> +        }
>      }
>      pvector_publish(pvec);
>
> diff --git a/tests/pmd.at b/tests/pmd.at
> index c875a744f..e61bb27d3 100644
> --- a/tests/pmd.at
> +++ b/tests/pmd.at
> @@ -1091,11 +1091,11 @@ AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-pmd])
>
>  AT_CHECK([ovs-vsctl show], [], [stdout])
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  1 : generic
> +  generic (Use count: 0, Priority: 1)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
> -  0 : autovalidator
> +  autovalidator (Use count: 0, Priority: 0)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 3], [0], [dnl
> @@ -1103,7 +1103,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
> -  3 : autovalidator
> +  autovalidator (Use count: 0, Priority: 3)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 4], [0], [dnl
> @@ -1111,7 +1111,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  4 : generic
> +  generic (Use count: 0, Priority: 4)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 8], [0], [dnl
> @@ -1119,7 +1119,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  8 : generic
> +  generic (Use count: 0, Priority: 8)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 8], [0], [dnl
> @@ -1127,7 +1127,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
> -  8 : autovalidator
> +  autovalidator (Use count: 0, Priority: 8)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 0], [0], [dnl
> @@ -1135,7 +1135,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  0 : generic
> +  generic (Use count: 0, Priority: 0)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 255], [0], [dnl
> @@ -1143,7 +1143,7 @@ Lookup priority change affected 0 dpcls ports and 0 subtables.
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  255 : generic
> +  generic (Use count: 0, Priority: 255)
>  ])
>
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic -1], [2],
> -- 
> 2.25.1
Van Haaren, Harry Oct. 21, 2021, 12:40 p.m. UTC | #2
> -----Original Message-----
> From: Eelco Chaudron <echaudro@redhat.com>
> Sent: Thursday, October 21, 2021 1:04 PM
> To: Amber, Kumar <kumar.amber@intel.com>
> Cc: ovs-dev@openvswitch.org; ktraynor@redhat.com; i.maximets@ovn.org;
> Stokes, Ian <ian.stokes@intel.com>; fbl@sysclose.org; Van Haaren, Harry
> <harry.van.haaren@intel.com>
> Subject: Re: [PATCH v2] dpcls: Change info-get function to fetch dpcls usage
> stats.
> 
> Hi Kumar,
> 
> See inline comments below…
> 
> //Eelco

Hey Eelco, 

Thanks for reviewing; comments/discussion inline.

> On 20 Oct 2021, at 6:52, Kumar Amber wrote:
> 
> > Modified the dplcs info-get command output to include
> > the count for different dpcls implementations.
> >
> > $ovs-appctl dpif-netdev/subtable-lookup-prio-get
> >
> > Available dpcls implementations:
> >   autovalidator (Use count: 1, Priority: 5)
> >   generic (Use count: 0, Priority: 1)
> >   avx512_gather (Use count: 0, Priority: 3)
> >
> > Test case to verify changes:
> > 	1021: PMD - dpcls configuration     ok
> >
> > Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> > Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>

<snip>

> >  dpcls_subtable_lookup_func
> > -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
> > +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> > +                             dpcls_subtable_lookup_func old_func,
> > +                             uint32_t will_use_result)
> 
> Why uint32_t and not a boolean?

bool is fine too, can change.


> >  {
> >      /* Iter over each subtable impl, and get highest priority one. */
> >      int32_t prio = -1;
> > +    uint32_t i;
> >      const char *name = NULL;
> > +    uint32_t best_idx = 0;
> > +    uint32_t usage_cnt = 0;
> >      dpcls_subtable_lookup_func best_func = NULL;
> >
> > -    for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
> > +    for (i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
> >          int32_t probed_prio = subtable_lookups[i].prio;
> > +        dpcls_subtable_lookup_func probed_func;
> 
> Add extra blanc line here

Can do.

> > +        probed_func = subtable_lookups[i].probe(u0_bit_count,
> > +                                u1_bit_count);
> 
> Indentation is off

Will fix.

> > +        if (!probed_func) {
> > +            continue;
> > +        }
> > +
> > +        /* Better candidate - track this to return it later. */
> >          if (probed_prio > prio) {
> > -            dpcls_subtable_lookup_func probed_func;
> > -            probed_func = subtable_lookups[i].probe(u0_bit_count,
> > -                                    u1_bit_count);
> > -            if (probed_func) {
> > -                best_func = probed_func;
> > -                prio = probed_prio;
> > -                name = subtable_lookups[i].name;
> > -            }
> > +            best_func = probed_func;
> > +            best_idx = i;
> > +            prio = probed_prio;
> > +            name = subtable_lookups[i].name;
> > +        }
> > +
> > +        /* Statistics keeping, reduce old func usage count. */
> > +        if (probed_func == old_func) {
> 
> Should this check also not include “&& will_use_result“?

Nope. When destroying a subtable, we want to decrement the counter, even when the returned func ptr will not be used.

> Guess you overload this function to do a counter update, which I do not like (see comments below).

Yes - correct. I'd prefer keep separate APIs too, but in reality the implementation to increment a counter
would require a search through all subtables (which we've already just done). So iterating all dpcls implementations
a second time, doing a brute-force search in order to increment a variable seemed too much effort, as well as
maintaining 2x implementations.

The goal of adding the "will use result" parameter (overloading API) was to reduce code maintenance effort...

 
> > +            atomic_read_relaxed(&subtable_lookups[i].usage_cnt, &usage_cnt);
> > +            usage_cnt--;
> > +            atomic_store_relaxed(&subtable_lookups[i].usage_cnt, usage_cnt);
> 
> Not sure what you are trying here, but looks like you make an atomic operation,
> not atomic ;)
> 
> We have the following APIs which I think you should use;
> atomic_count_init()/atomic_count_inc()/atomic_count_dec().

@Amber; would you investigate here and update to use atomic_count_inc/dec()?


> > -    VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority
> %d\n",
> > -             name, u0_bit_count, u1_bit_count, prio);
> > +    /* Update stats for usage. */
> > +    if (will_use_result) {
> > +        atomic_read_relaxed(&subtable_lookups[best_idx].usage_cnt,
> &usage_cnt);
> > +        usage_cnt++;
> > +        atomic_store_relaxed(&subtable_lookups[best_idx].usage_cnt,
> usage_cnt);
> > +    } else {
> > +       VLOG_DBG(
> > +           "Subtable lookup function '%s' with units (%d,%d), priority %d\n",
> > +           name, u0_bit_count, u1_bit_count, prio);
> > +    }
> 
> Why not always show the log? Guess now you only show it when not using the
> return value?

Will investigate always showing log.

> >      /* Programming error - we must always return a valid func ptr. */
> >      ovs_assert(best_func != NULL);
> >
> >      return best_func;
> >  }
> > +
> > +void
> > +dp_dpcls_impl_print_stats(struct ds *reply)
> > +{
> > +    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> > +    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> > +
> > +    /* Add all DPCLS functions to reply string. */
> > +    ds_put_cstr(reply, "Available dpcls implementations:\n");
> 
> Would be nice to do some qsort() to show the entries based on priority/name?

I don’t think its worth the complexity. The order is always the order as they are listed in the code.
What value does sorting have? It makes complexity of Unit tests higher, as we'd have to re-order the implementations there too.

> > +    for (int i = 0; i < count; i++) {
> > +        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
> > +                      lookup_funcs[i].name, lookup_funcs[i].usage_cnt,
> > +                      lookup_funcs[i].prio);
> > +
> > +        if (ds_last(reply) == ' ') {
> > +            ds_put_cstr(reply, "none");
> > +        }
> > +
> > +        ds_put_cstr(reply, ")\n");
> > +    }
> > +
> > +}
> > diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h
> > index 59f51faa0..13431a3fb 100644
> > --- a/lib/dpif-netdev-lookup.h
> > +++ b/lib/dpif-netdev-lookup.h
> > @@ -20,6 +20,7 @@
> >  #include <config.h>
> >  #include "dpif-netdev.h"
> >  #include "dpif-netdev-private-dpcls.h"
> > +#include "dpif-netdev-private-thread.h"
> >
> >  /* Function to perform a probe for the subtable bit fingerprint.
> >   * Returns NULL if not valid, or a valid function pointer to call for this
> > @@ -62,12 +63,24 @@ struct dpcls_subtable_lookup_info_t {
> >
> >      /* Human readable name, used in setting subtable priority commands */
> >      const char *name;
> > +
> > +    /* Counter which holds the usage count of each implementations. */
> > +    atomic_uint32_t usage_cnt;
> 
> Guess this needs to be atomic_count

Sure will rename.


> >  int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
> >
> > +/* Lookup the best subtable lookup implementation for the given u0,u1
> count.
> > + * When replacing an existing lookup func, the old_func pointer is provied
> > + * and statistics will be tracked accordingly and will_use parameter would
> > + * be set to 1 if the result of the function is going to be used for a
> > + * subtable lookup else when set to 0, the statistics will refect that the
> > + * result is not being used.
> > + */
> 
> See below, as I do not like the API.

See discussion/reply on API/design and implementation above.


> >  dpcls_subtable_lookup_func
> > -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t
> u1_bit_count);
> > +dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
> > +                             dpcls_subtable_lookup_func old_func,
> > +                             uint32_t will_use);
> >
> >  /* Retrieve the array of lookup implementations for iteration.
> >   * On error, returns a negative number.
> > @@ -76,4 +89,8 @@ dpcls_subtable_get_best_impl(uint32_t u0_bit_count,
> uint32_t u1_bit_count);
> >  int32_t
> >  dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t
> **out_ptr);
> >
> > +/* Prints dpcls subtables in use for different implementations. */
> > +void
> > +dp_dpcls_impl_print_stats(struct ds *reply);
> > +
> >  #endif /* dpif-netdev-lookup.h */
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index b078c2da5..c4ca3ce51 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -911,21 +911,8 @@ dpif_netdev_subtable_lookup_get(struct unixctl_conn
> *conn, int argc OVS_UNUSED,
> >                                  const char *argv[] OVS_UNUSED,
> >                                  void *aux OVS_UNUSED)
> >  {
> > -    /* Get a list of all lookup functions. */
> > -    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
> > -    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
> > -    if (count < 0) {
> > -        unixctl_command_reply_error(conn, "error getting lookup names");
> > -        return;
> > -    }
> > -
> > -    /* Add all lookup functions to reply string. */
> >      struct ds reply = DS_EMPTY_INITIALIZER;
> > -    ds_put_cstr(&reply, "Available lookup functions (priority : name)\n");
> > -    for (int i = 0; i < count; i++) {
> > -        ds_put_format(&reply, "  %d : %s\n", lookup_funcs[i].prio,
> > -                      lookup_funcs[i].name);
> > -    }
> 
> Add extra enter between variable declaration and code.

Will fix.


> > +    dp_dpcls_impl_print_stats(&reply);
> >      unixctl_command_reply(conn, ds_cstr(&reply));
> >      ds_destroy(&reply);
> >  }
> > @@ -8940,6 +8927,10 @@ dpcls_destroy_subtable(struct dpcls *cls, struct
> dpcls_subtable *subtable)
> >      pvector_remove(&cls->subtables, subtable);
> >      cmap_remove(&cls->subtables_map, &subtable->cmap_node,
> >                  subtable->mask.hash);
> > +
> > +    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
> > +                                 subtable->mf_bits_set_unit1,
> > +                                 subtable->lookup_func, 0);
> 
> First, the 0, should probably be a false bool. But in general, I do not like this API
> overload. Why would you call a dpcls_subtable_get_best_impl() to reset a
> counter? Can we not add a new API in form or
> dpcls_update_impl_use_count(struct dpcls_subtable *ubtable, bool increase), or
> something else thats is more clear?

As per above, the "dpcls_update_impl_use_count()" would require a brute-force search through the tables again.
By incorporating the "increment" in the API, we can just iterate the implementations once.
I think it is worth the complexity to avoid re-iterating implementations a 2nd time.


> >      ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);
> >  }
> >
> > @@ -8987,8 +8978,8 @@ dpcls_create_subtable(struct dpcls *cls, const struct
> netdev_flow_key *mask)
> >       * The function is guaranteed to always return a valid implementation, and
> >       * possibly an ISA optimized, and/or specialized implementation.
> >       */
> > -    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1);
> > -
> > +    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
> > +                                                         1);
> 
> If dpcls_subtable_get_best_impl() would always increase the use count, no
> changes should be needed here.

I see where you're going with this approach - see above comments.

<snip remaining patch & suggestions of the 2 apis approach>

Would documenting the reasoning clearly as to why the API is overloaded, and keeping
implementation (1 function, overloeaded with will_use_result) the same as is today be acceptable?

Regards, -Harry
diff mbox series

Patch

diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
index f645b9ade..63a54da1c 100644
--- a/Documentation/topics/dpdk/bridge.rst
+++ b/Documentation/topics/dpdk/bridge.rst
@@ -156,10 +156,10 @@  OVS provides multiple implementations of dpcls. The following command enables
 the user to check what implementations are available in a running instance ::
 
     $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-    Available lookup functions (priority : name)
-            0 : autovalidator
-            1 : generic
-            0 : avx512_gather
+    Available dpcls implementations:
+            autovalidator (Use count: 1, Priority: 5)
+            generic (Use count: 0, Priority: 1)
+            avx512_gather (Use count: 0, Priority: 3)
 
 To set the priority of a lookup function, run the ``prio-set`` command ::
 
@@ -172,10 +172,10 @@  function due to the command being run. To verify the prioritization, re-run the
 get command, note the updated priority of the ``avx512_gather`` function ::
 
     $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
-    Available lookup functions (priority : name)
-            0 : autovalidator
-            1 : generic
-            5 : avx512_gather
+    Available dpcls implementations:
+            autovalidator (Use count: 0, Priority: 0)
+            generic (Use count: 0, Priority: 0)
+            avx512_gather (Use count: 1, Priority: 5)
 
 If two lookup functions have the same priority, the first one in the list is
 chosen, and the 2nd occurance of that priority is not used. Put in logical
diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c
index bd0a99abe..c0b137299 100644
--- a/lib/dpif-netdev-lookup.c
+++ b/lib/dpif-netdev-lookup.c
@@ -36,18 +36,21 @@  static struct dpcls_subtable_lookup_info_t subtable_lookups[] = {
     { .prio = 0,
 #endif
       .probe = dpcls_subtable_autovalidator_probe,
-      .name = "autovalidator", },
+      .name = "autovalidator",
+      .usage_cnt = 0,},
 
     /* The default scalar C code implementation. */
     { .prio = 1,
       .probe = dpcls_subtable_generic_probe,
-      .name = "generic", },
+      .name = "generic",
+      .usage_cnt = 0, },
 
 #if (__x86_64__ && HAVE_AVX512F && HAVE_LD_AVX512_GOOD && __SSE4_2__)
     /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */
     { .prio = 0,
       .probe = dpcls_subtable_avx512_gather_probe,
-      .name = "avx512_gather", },
+      .name = "avx512_gather",
+      .usage_cnt = 0,},
 #else
     /* Disabling AVX512 at compile time, as compile time requirements not met.
      * This could be due to a number of reasons:
@@ -93,32 +96,79 @@  dpcls_subtable_set_prio(const char *name, uint8_t priority)
 }
 
 dpcls_subtable_lookup_func
-dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count)
+dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
+                             dpcls_subtable_lookup_func old_func,
+                             uint32_t will_use_result)
 {
     /* Iter over each subtable impl, and get highest priority one. */
     int32_t prio = -1;
+    uint32_t i;
     const char *name = NULL;
+    uint32_t best_idx = 0;
+    uint32_t usage_cnt = 0;
     dpcls_subtable_lookup_func best_func = NULL;
 
-    for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
+    for (i = 0; i < ARRAY_SIZE(subtable_lookups); i++) {
         int32_t probed_prio = subtable_lookups[i].prio;
+        dpcls_subtable_lookup_func probed_func;
+        probed_func = subtable_lookups[i].probe(u0_bit_count,
+                                u1_bit_count);
+        if (!probed_func) {
+            continue;
+        }
+
+        /* Better candidate - track this to return it later. */
         if (probed_prio > prio) {
-            dpcls_subtable_lookup_func probed_func;
-            probed_func = subtable_lookups[i].probe(u0_bit_count,
-                                    u1_bit_count);
-            if (probed_func) {
-                best_func = probed_func;
-                prio = probed_prio;
-                name = subtable_lookups[i].name;
-            }
+            best_func = probed_func;
+            best_idx = i;
+            prio = probed_prio;
+            name = subtable_lookups[i].name;
+        }
+
+        /* Statistics keeping, reduce old func usage count. */
+        if (probed_func == old_func) {
+            atomic_read_relaxed(&subtable_lookups[i].usage_cnt, &usage_cnt);
+            usage_cnt--;
+            atomic_store_relaxed(&subtable_lookups[i].usage_cnt, usage_cnt);
         }
     }
 
-    VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n",
-             name, u0_bit_count, u1_bit_count, prio);
+    /* Update stats for usage. */
+    if (will_use_result) {
+        atomic_read_relaxed(&subtable_lookups[best_idx].usage_cnt, &usage_cnt);
+        usage_cnt++;
+        atomic_store_relaxed(&subtable_lookups[best_idx].usage_cnt, usage_cnt);
+    } else {
+       VLOG_DBG(
+           "Subtable lookup function '%s' with units (%d,%d), priority %d\n",
+           name, u0_bit_count, u1_bit_count, prio);
+    }
 
     /* Programming error - we must always return a valid func ptr. */
     ovs_assert(best_func != NULL);
 
     return best_func;
 }
+
+void
+dp_dpcls_impl_print_stats(struct ds *reply)
+{
+    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
+    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
+
+    /* Add all DPCLS functions to reply string. */
+    ds_put_cstr(reply, "Available dpcls implementations:\n");
+
+    for (int i = 0; i < count; i++) {
+        ds_put_format(reply, "  %s (Use count: %d, Priority: %d",
+                      lookup_funcs[i].name, lookup_funcs[i].usage_cnt,
+                      lookup_funcs[i].prio);
+
+        if (ds_last(reply) == ' ') {
+            ds_put_cstr(reply, "none");
+        }
+
+        ds_put_cstr(reply, ")\n");
+    }
+
+}
diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h
index 59f51faa0..13431a3fb 100644
--- a/lib/dpif-netdev-lookup.h
+++ b/lib/dpif-netdev-lookup.h
@@ -20,6 +20,7 @@ 
 #include <config.h>
 #include "dpif-netdev.h"
 #include "dpif-netdev-private-dpcls.h"
+#include "dpif-netdev-private-thread.h"
 
 /* Function to perform a probe for the subtable bit fingerprint.
  * Returns NULL if not valid, or a valid function pointer to call for this
@@ -62,12 +63,24 @@  struct dpcls_subtable_lookup_info_t {
 
     /* Human readable name, used in setting subtable priority commands */
     const char *name;
+
+    /* Counter which holds the usage count of each implementations. */
+    atomic_uint32_t usage_cnt;
 };
 
 int32_t dpcls_subtable_set_prio(const char *name, uint8_t priority);
 
+/* Lookup the best subtable lookup implementation for the given u0,u1 count.
+ * When replacing an existing lookup func, the old_func pointer is provied
+ * and statistics will be tracked accordingly and will_use parameter would
+ * be set to 1 if the result of the function is going to be used for a
+ * subtable lookup else when set to 0, the statistics will refect that the
+ * result is not being used.
+ */
 dpcls_subtable_lookup_func
-dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
+dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count,
+                             dpcls_subtable_lookup_func old_func,
+                             uint32_t will_use);
 
 /* Retrieve the array of lookup implementations for iteration.
  * On error, returns a negative number.
@@ -76,4 +89,8 @@  dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count);
 int32_t
 dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr);
 
+/* Prints dpcls subtables in use for different implementations. */
+void
+dp_dpcls_impl_print_stats(struct ds *reply);
+
 #endif /* dpif-netdev-lookup.h */
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b078c2da5..c4ca3ce51 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -911,21 +911,8 @@  dpif_netdev_subtable_lookup_get(struct unixctl_conn *conn, int argc OVS_UNUSED,
                                 const char *argv[] OVS_UNUSED,
                                 void *aux OVS_UNUSED)
 {
-    /* Get a list of all lookup functions. */
-    struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL;
-    int32_t count = dpcls_subtable_lookup_info_get(&lookup_funcs);
-    if (count < 0) {
-        unixctl_command_reply_error(conn, "error getting lookup names");
-        return;
-    }
-
-    /* Add all lookup functions to reply string. */
     struct ds reply = DS_EMPTY_INITIALIZER;
-    ds_put_cstr(&reply, "Available lookup functions (priority : name)\n");
-    for (int i = 0; i < count; i++) {
-        ds_put_format(&reply, "  %d : %s\n", lookup_funcs[i].prio,
-                      lookup_funcs[i].name);
-    }
+    dp_dpcls_impl_print_stats(&reply);
     unixctl_command_reply(conn, ds_cstr(&reply));
     ds_destroy(&reply);
 }
@@ -8940,6 +8927,10 @@  dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable)
     pvector_remove(&cls->subtables, subtable);
     cmap_remove(&cls->subtables_map, &subtable->cmap_node,
                 subtable->mask.hash);
+
+    dpcls_subtable_get_best_impl(subtable->mf_bits_set_unit0,
+                                 subtable->mf_bits_set_unit1,
+                                 subtable->lookup_func, 0);
     ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable);
 }
 
@@ -8987,8 +8978,8 @@  dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask)
      * The function is guaranteed to always return a valid implementation, and
      * possibly an ISA optimized, and/or specialized implementation.
      */
-    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1);
-
+    subtable->lookup_func = dpcls_subtable_get_best_impl(unit0, unit1, NULL,
+                                                         1);
     cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash);
     /* Add the new subtable at the end of the pvector (with no hits yet) */
     pvector_insert(&cls->subtables, subtable, 0);
@@ -9028,8 +9019,11 @@  dpcls_subtable_lookup_reprobe(struct dpcls *cls)
         uint32_t u0_bits = subtable->mf_bits_set_unit0;
         uint32_t u1_bits = subtable->mf_bits_set_unit1;
         void *old_func = subtable->lookup_func;
-        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits);
-        subtables_changed += (old_func != subtable->lookup_func);
+        subtable->lookup_func = dpcls_subtable_get_best_impl(u0_bits, u1_bits,
+                                                             old_func, 1);
+        if (old_func != subtable->lookup_func) {
+            subtables_changed += 1;
+        }
     }
     pvector_publish(pvec);
 
diff --git a/tests/pmd.at b/tests/pmd.at
index c875a744f..e61bb27d3 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -1091,11 +1091,11 @@  AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-pmd])
 
 AT_CHECK([ovs-vsctl show], [], [stdout])
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  1 : generic
+  generic (Use count: 0, Priority: 1)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  0 : autovalidator
+  autovalidator (Use count: 0, Priority: 0)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 3], [0], [dnl
@@ -1103,7 +1103,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  3 : autovalidator
+  autovalidator (Use count: 0, Priority: 3)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 4], [0], [dnl
@@ -1111,7 +1111,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  4 : generic
+  generic (Use count: 0, Priority: 4)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 8], [0], [dnl
@@ -1119,7 +1119,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  8 : generic
+  generic (Use count: 0, Priority: 8)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 8], [0], [dnl
@@ -1127,7 +1127,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep autovalidator], [], [dnl
-  8 : autovalidator
+  autovalidator (Use count: 0, Priority: 8)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 0], [0], [dnl
@@ -1135,7 +1135,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  0 : generic
+  generic (Use count: 0, Priority: 0)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 255], [0], [dnl
@@ -1143,7 +1143,7 @@  Lookup priority change affected 0 dpcls ports and 0 subtables.
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep generic], [], [dnl
-  255 : generic
+  generic (Use count: 0, Priority: 255)
 ])
 
 AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic -1], [2],