[ovs-dev,v6,6/6] docs/dpdk/bridge: add datapath performance section.

Message ID 20200702174300.48470-7-harry.van.haaren@intel.com
State Superseded
Series DPCLS Subtable ISA Optimization

Commit Message

Van Haaren, Harry July 2, 2020, 5:43 p.m. UTC
This commit adds a section to the dpdk/bridge.rst netdev documentation,
detailing the added DPCLS functionality. The newly added commands are
documented, and sample output is provided.

Running the DPCLS autovalidator with unit tests by default is possible
through re-compiling the autovalidator to have the highest priority at
startup time. This avoids making changes to all tests, and enables
debug and CI builds to validate every lookup implementation with all
unit tests.

Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>

---

v5:
- Include NEWS item updates.

v4:
- Fix typos (William Tu)
- Update get commands to include "prio", as updated in v4
- Add section on enabling autovalidator by default for unit tests
---
 Documentation/topics/dpdk/bridge.rst | 77 ++++++++++++++++++++++++++++
 NEWS                                 |  3 ++
 2 files changed, 80 insertions(+)

Comments

Stokes, Ian July 10, 2020, 4 p.m. UTC | #1
On 7/2/2020 6:43 PM, Harry van Haaren wrote:
> This commit adds a section to the dpdk/bridge.rst netdev documentation,
> detailing the added DPCLS functionality. The newly added commands are
> documented, and sample output is provided.
> 
> Running the DPCLS autovalidator with unit tests by default is possible
> through re-compiling the autovalidator to have the highest priority at
> startup time. This avoids making changes to all tests, and enables
> debug and CI builds to validate every lookup implementation with all
> unit tests.
> 
> Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.
> 
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> 

Hi Harry,
What you have below looks good to me.

The only additional ideas that might be worth adding would be the
validated compilers, as mentioned in patch 1 of the series (maybe this is
not needed, but the existing Compilation section for OVS
already states a GCC version that was tested with OVS DPDK, so at least 1
known GCC version is provided).

Noting the configure, make CFLAGS dependency might be of use too 
although again, depends on how people configure and compile OVS to date.

Lastly possibly adding a section on what to check if AVX512 lookup is 
not appearing might be useful also.

BR
Ian

> ---
> 
> v5:
> - Include NEWS item updates.
> 
> v4:
> - Fix typos (William Tu)
> - Update get commands to include "prio", as updated in v4
> - Add section on enabling autovalidator by default for unit tests
> ---
>   Documentation/topics/dpdk/bridge.rst | 77 ++++++++++++++++++++++++++++
>   NEWS                                 |  3 ++
>   2 files changed, 80 insertions(+)
> 
> diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> index f0ef42ecc..526d5c959 100644
> --- a/Documentation/topics/dpdk/bridge.rst
> +++ b/Documentation/topics/dpdk/bridge.rst
> @@ -137,3 +137,80 @@ currently turned off by default.
>   To turn on SMC::
>   
>       $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true
> +
> +Datapath Classifier Performance
> +-------------------------------
> +
> +The datapath classifier (dpcls) performs wildcard rule matching, a compute
> +intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The
> +code that does this compute work impacts datapath performance, and optimizing
> +it can provide higher switching performance.
> +
> +Modern CPUs provide extensive SIMD instructions which can be used to get higher
> +performance. The CPU OVS is being deployed on must be capable of running these
> +SIMD instructions in order to take advantage of the performance benefits.
> +OVS v2.14 introduced runtime CPU detection to identify whether these CPU ISA
> +additions are available, and to allow the user to enable them.
> +
> +OVS provides multiple implementations of dpcls. The following command enables
> +the user to check what implementations are available in a running instance ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> +    Available lookup functions (priority : name)
> +            0 : autovalidator
> +            1 : generic
> +            0 : avx512_gather
> +
> +To set the priority of a lookup function, run the ``prio-set`` command ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5
> +    Lookup priority change affected 1 dpcls ports and 1 subtables.
> +
> +The highest priority lookup function is used for classification, and the output
> +above indicates that one subtable of one DPCLS port has changed its lookup
> +function due to the command being run. To verify the prioritization, re-run the
> +get command and note the updated priority of the ``avx512_gather`` function ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> +    Available lookup functions (priority : name)
> +            0 : autovalidator
> +            1 : generic
> +            5 : avx512_gather
> +
> +If two lookup functions have the same priority, the first one in the list is
> +chosen, and the 2nd occurrence of that priority is not used. Put in logical
> +terms, a lookup function is chosen if its priority is greater than the previous
> +best candidate.
> +
> +CPU ISA Testing and Validation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +As multiple versions of DPCLS can co-exist, each with different CPU ISA
> +optimizations, it is important to validate that they all give the exact same
> +results. To easily test all DPCLS implementations, an ``autovalidator``
> +implementation of the DPCLS exists. This implementation runs all other
> +available DPCLS implementations, and verifies that the results are identical.
> +
> +Running the OVS unit tests with the autovalidator enabled ensures all
> +implementations provide the same results. Note that the performance of the
> +autovalidator is lower than all other implementations, as it tests the scalar
> +implementation against itself, and against all other enabled DPCLS
> +implementations.
> +
> +To adjust the DPCLS autovalidator priority, use this command ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7
> +
> +Running Unit Tests with Autovalidator
> ++++++++++++++++++++++++++++++++++++++
> +
> +To run the OVS unit test suite with the DPCLS autovalidator as the default
> +implementation, OVS must be recompiled. During the recompilation,
> +the default priority of the ``autovalidator`` implementation is set to the
> +maximum priority, ensuring every test will be run with every lookup
> +implementation ::
> +
> +    $ ./configure --enable-autovalidator
> +
> +Compile OVS in debug mode to have ``ovs_assert`` statements error out if
> +there is a mismatch between the DPCLS lookup implementations.
> diff --git a/NEWS b/NEWS
> index 0116b3ea0..da8725b59 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -20,6 +20,9 @@ Post-v2.13.0
>        * New configuration knob 'other_config:lb-output-action' for bond ports
>          that enables new datapath action 'lb_output' to avoid recirculation
>          in balance-tcp mode.  Disabled by default.
> +     * Add runtime CPU ISA detection to allow optimized ISA functions
> +     * Add support for dynamically changing DPCLS subtable lookup functions
> +     * Add ISA optimized DPCLS lookup function using AVX512
>      - Tunnels: TC Flower offload
>        * Tunnel Local endpoint address masked match are supported.
>        * Tunnel Romte endpoint address masked match are supported.
>
Van Haaren, Harry July 10, 2020, 5:11 p.m. UTC | #2
> -----Original Message-----
> From: Stokes, Ian <ian.stokes@intel.com>
> Sent: Friday, July 10, 2020 5:00 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>; ovs-dev@openvswitch.org
> Cc: i.maximets@ovn.org; u9012063@gmail.com; fiezzi@redhat.com
> Subject: Re: [PATCH v6 6/6] docs/dpdk/bridge: add datapath performance
> section.
> 
> 
> 
> On 7/2/2020 6:43 PM, Harry van Haaren wrote:
> > This commit adds a section to the dpdk/bridge.rst netdev documentation,
> > detailing the added DPCLS functionality. The newly added commands are
> > documented, and sample output is provided.
> >
> > Running the DPCLS autovalidator with unit tests by default is possible
> > through re-compiling the autovalidator to have the highest priority at
> > startup time. This avoids making changes to all tests, and enables
> > debug and CI builds to validate every lookup implementation with all
> > unit tests.
> >
> > Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.
> >
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> >
> 
> Hi Harry,
> What you have below looks good to me.
> 
> The only additional ideas that might be worth adding would be the
> validated compilers, as mentioned in patch 1 of the series (maybe this is
> not needed, but the existing Compilation section for OVS
> already states a GCC version that was tested with OVS DPDK, so at least 1
> known GCC version is provided).

As mentioned in reply to first patch, I don't see value in stating what compilers
work or don’t - we must just rely on compilers working. OVS can recommend
or state that it is tested with specific compilers - but that is an independent 
issue to this patchset.

> Noting the configure, make CFLAGS dependency might be of use too
> although again, depends on how people configure and compile OVS to date.

Example commands and documentation added to remedy this.

> Lastly possibly adding a section on what to check if AVX512 lookup is
> not appearing might be useful also.

Added section on potential issues regarding binutils bug, and how to
remedy.

> BR
> Ian

Thanks for review, -Harry

Patch

diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
index f0ef42ecc..526d5c959 100644
--- a/Documentation/topics/dpdk/bridge.rst
+++ b/Documentation/topics/dpdk/bridge.rst
@@ -137,3 +137,80 @@  currently turned off by default.
 To turn on SMC::
 
     $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true
+
+Datapath Classifier Performance
+-------------------------------
+
+The datapath classifier (dpcls) performs wildcard rule matching, a compute
+intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The
+code that does this compute work impacts datapath performance, and optimizing
+it can provide higher switching performance.
+
+Modern CPUs provide extensive SIMD instructions which can be used to get higher
+performance. The CPU OVS is being deployed on must be capable of running these
+SIMD instructions in order to take advantage of the performance benefits.
+OVS v2.14 introduced runtime CPU detection to identify whether these CPU ISA
+additions are available, and to allow the user to enable them.
+
+OVS provides multiple implementations of dpcls. The following command enables
+the user to check what implementations are available in a running instance ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
+    Available lookup functions (priority : name)
+            0 : autovalidator
+            1 : generic
+            0 : avx512_gather
+
+To set the priority of a lookup function, run the ``prio-set`` command ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5
+    Lookup priority change affected 1 dpcls ports and 1 subtables.
+
+The highest priority lookup function is used for classification, and the output
+above indicates that one subtable of one DPCLS port has changed its lookup
+function due to the command being run. To verify the prioritization, re-run the
+get command and note the updated priority of the ``avx512_gather`` function ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
+    Available lookup functions (priority : name)
+            0 : autovalidator
+            1 : generic
+            5 : avx512_gather
+
+If two lookup functions have the same priority, the first one in the list is
+chosen, and the 2nd occurrence of that priority is not used. Put in logical
+terms, a lookup function is chosen if its priority is greater than the previous
+best candidate.
+
+CPU ISA Testing and Validation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As multiple versions of DPCLS can co-exist, each with different CPU ISA
+optimizations, it is important to validate that they all give the exact same
+results. To easily test all DPCLS implementations, an ``autovalidator``
+implementation of the DPCLS exists. This implementation runs all other
+available DPCLS implementations, and verifies that the results are identical.
+
+Running the OVS unit tests with the autovalidator enabled ensures all
+implementations provide the same results. Note that the performance of the
+autovalidator is lower than all other implementations, as it tests the scalar
+implementation against itself, and against all other enabled DPCLS
+implementations.
+
+To adjust the DPCLS autovalidator priority, use this command ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7
+
+Running Unit Tests with Autovalidator
++++++++++++++++++++++++++++++++++++++
+
+To run the OVS unit test suite with the DPCLS autovalidator as the default
+implementation, OVS must be recompiled. During the recompilation,
+the default priority of the ``autovalidator`` implementation is set to the
+maximum priority, ensuring every test will be run with every lookup
+implementation ::
+
+    $ ./configure --enable-autovalidator
+
+Compile OVS in debug mode to have ``ovs_assert`` statements error out if
+there is a mismatch between the DPCLS lookup implementations.
diff --git a/NEWS b/NEWS
index 0116b3ea0..da8725b59 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,9 @@  Post-v2.13.0
      * New configuration knob 'other_config:lb-output-action' for bond ports
        that enables new datapath action 'lb_output' to avoid recirculation
        in balance-tcp mode.  Disabled by default.
+     * Add runtime CPU ISA detection to allow optimized ISA functions
+     * Add support for dynamically changing DPCLS subtable lookup functions
+     * Add ISA optimized DPCLS lookup function using AVX512
    - Tunnels: TC Flower offload
      * Tunnel Local endpoint address masked match are supported.
      * Tunnel Romte endpoint address masked match are supported.
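
Editorial note: the tie-breaking rule described in the "Datapath Classifier
Performance" section (a candidate wins only if its priority is strictly greater
than the previous best) can be modelled in a few lines. The sketch below is an
illustration in Python, not the OVS C implementation; the function name
``pick_lookup`` and the list-of-tuples representation are assumptions made for
this example. The priorities and names are taken from the sample
``subtable-lookup-prio-get`` output in the patch.

```python
from typing import List, Optional, Tuple

# Illustrative model only: this is NOT OVS source code.
# Each entry is (priority, name), in the order the get command lists them.
LookupList = List[Tuple[int, str]]


def pick_lookup(funcs: LookupList) -> Optional[str]:
    """Return the name of the lookup function that would be selected.

    A candidate replaces the current best only when its priority is
    strictly greater, so on a tie the earlier entry in the list wins.
    """
    best_prio = -1
    best_name = None
    for prio, name in funcs:
        if prio > best_prio:
            best_prio, best_name = prio, name
    return best_name


# Priorities before and after the prio-set command shown in the patch.
defaults = [(0, "autovalidator"), (1, "generic"), (0, "avx512_gather")]
after_set = [(0, "autovalidator"), (1, "generic"), (5, "avx512_gather")]
# Hypothetical tie: generic and avx512_gather share priority 5.
tie = [(0, "autovalidator"), (5, "generic"), (5, "avx512_gather")]

print(pick_lookup(defaults))   # generic
print(pick_lookup(after_set))  # avx512_gather
print(pick_lookup(tie))        # generic (first entry with that priority)
```

The strict comparison is what makes the second occurrence of a priority
unused: to prefer a later entry, it must be given a higher priority rather
than an equal one.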