Message ID | 20200702174300.48470-7-harry.van.haaren@intel.com |
---|---|
State | Superseded |
Series | DPCLS Subtable ISA Optimization |
On 7/2/2020 6:43 PM, Harry van Haaren wrote:
> This commit adds a section to the dpdk/bridge.rst netdev documentation,
> detailing the added DPCLS functionality. The newly added commands are
> documented, and sample output is provided.
>
> Running the DPCLS autovalidator with unit tests by default is possible
> through re-compiling the autovalidator to have the highest priority at
> startup time. This avoids making changes to all tests, and enables
> debug and CI builds to validate every lookup implementation with all
> unit tests.
>
> Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>

Hi Harry,

What you have below looks good to me.

The only additional ideas that might be worth adding would be either
validated compilers as mentioned in patch 1 of the series (maybe this is
not needed, but reviewing the existing Compilation section for OVS
already states a GCC version that was tested with OVS DPDK so at least 1
known GCC version is provided).

Noting the configure, make CFLAGS dependency might be of use too
although again, depends on how people configure and compile OVS to date.

Lastly possibly adding a section on what to check if AVX512 lookup is
not appearing might be useful also.

BR
Ian

> ---
>
> v5:
> - Include NEWS item updates.
>
> v4:
> - Fix typos (William Tu)
> - Update get commands to use include "prio" as updated in v4
> - Add section on enabling autovalidator by default for unit tests
> ---
>  Documentation/topics/dpdk/bridge.rst | 77 ++++++++++++++++++++++++++++
>  NEWS                                 |  3 ++
>  2 files changed, 80 insertions(+)
>
> diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> index f0ef42ecc..526d5c959 100644
> --- a/Documentation/topics/dpdk/bridge.rst
> +++ b/Documentation/topics/dpdk/bridge.rst
> @@ -137,3 +137,80 @@ currently turned off by default.
>  To turn on SMC::
>
>      $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true
> +
> +Datapath Classifier Performance
> +-------------------------------
> +
> +The datapath classifier (dpcls) performs wildcard rule matching, a compute
> +intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The
> +code that does this compute work impacts datapath performance, and optimizing
> +it can provide higher switching performance.
> +
> +Modern CPUs provide extensive SIMD instructions which can be used to get higher
> +performance. The CPU OVS is being deployed on must be capable of running these
> +SIMD instructions in order to take advantage of the performance benefits.
> +In OVS v2.14 runtime CPU detection was introduced to enable identifying if
> +these CPU ISA additions are available, and to allow the user to enable them.
> +
> +OVS provides multiple implementations of dpcls. The following command enables
> +the user to check what implementations are available in a running instance ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> +    Available lookup functions (priority : name)
> +      0 : autovalidator
> +      1 : generic
> +      0 : avx512_gather
> +
> +To set the priority of a lookup function, run the ``prio-set`` command ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5
> +    Lookup priority change affected 1 dpcls ports and 1 subtables.
> +
> +The highest priority lookup function is used for classification, and the output
> +above indicates that one subtable of one DPCLS port is has changed its lookup
> +function due to the command being run. To verify the prioritization, re-run the
> +get command, note the updated priority of the ``avx512_gather`` function ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> +    Available lookup functions (priority : name)
> +      0 : autovalidator
> +      1 : generic
> +      5 : avx512_gather
> +
> +If two lookup functions have the same priority, the first one in the list is
> +chosen, and the 2nd occurance of that priority is not used. Put in logical
> +terms, a subtable is chosen if its priority is greater than the previous
> +best candidate.
> +
> +CPU ISA Testing and Validation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +As multiple versions of DPCLS can co-exist, each with different CPU ISA
> +optimizations, it is important to validate that they all give the exact same
> +results. To easily test all DPCLS implementations, an ``autovalidator``
> +implementation of the DPCLS exists. This implementation runs all other
> +available DPCLS implementations, and verifies that the results are identical.
> +
> +Running the OVS unit tests with the autovalidator enabled ensures all
> +implementations provide the same results. Note that the performance of the
> +autovalidator is lower than all other implementations, as it tests the scalar
> +implementation against itself, and against all other enabled DPCLS
> +implementations.
> +
> +To adjust the DPCLS autovalidator priority, use this command ::
> +
> +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7
> +
> +Running Unit Tests with Autovalidator
> ++++++++++++++++++++++++++++++++++++++
> +
> +To run the OVS unit test suite with the DPCLS autovalidator as the default
> +implementation, it is required to recompile OVS. During the recompilation,
> +the default priority of the `autovalidator` implementation is set to the
> +maximum priority, ensuring every test will be run with every lookup
> +implementation ::
> +
> +    $ ./configure --enable-autovalidator
> +
> +Compile OVS in debug mode to have `ovs_assert` statements error out if
> +there is a mis-match in the DPCLS lookup implementation.
> diff --git a/NEWS b/NEWS
> index 0116b3ea0..da8725b59 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -20,6 +20,9 @@ Post-v2.13.0
>       * New configuration knob 'other_config:lb-output-action' for bond ports
>         that enables new datapath action 'lb_output' to avoid recirculation
>         in balance-tcp mode. Disabled by default.
> +     * Add runtime CPU ISA detection to allow optimized ISA functions
> +     * Add support for dynamically changing DPCLS subtable lookup functions
> +     * Add ISA optimized DPCLS lookup function using AVX512
>     - Tunnels: TC Flower offload
>       * Tunnel Local endpoint address masked match are supported.
>       * Tunnel Romte endpoint address masked match are supported.
>
> -----Original Message-----
> From: Stokes, Ian <ian.stokes@intel.com>
> Sent: Friday, July 10, 2020 5:00 PM
> To: Van Haaren, Harry <harry.van.haaren@intel.com>; ovs-dev@openvswitch.org
> Cc: i.maximets@ovn.org; u9012063@gmail.com; fiezzi@redhat.com
> Subject: Re: [PATCH v6 6/6] docs/dpdk/bridge: add datapath performance section.
>
>
>
> On 7/2/2020 6:43 PM, Harry van Haaren wrote:
> > This commit adds a section to the dpdk/bridge.rst netdev documentation,
> > detailing the added DPCLS functionality. The newly added commands are
> > documented, and sample output is provided.
> >
> > Running the DPCLS autovalidator with unit tests by default is possible
> > through re-compiling the autovalidator to have the highest priority at
> > startup time. This avoids making changes to all tests, and enables
> > debug and CI builds to validate every lookup implementation with all
> > unit tests.
> >
> > Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.
> >
> > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> >
>
> Hi Harry,
> What you have below looks good to me.
>
> The only additional ideas that might be worth adding would be either
> validated compilers as mentioned in patch 1 of the series (maybe this is
> not needed, but reviewing the existing Compilation section for OVS
> already states a GCC version that was tested with OVS DPDK so at least 1
> known GCC version is provided).

As mentioned in reply to first patch, I don't see value in stating what
compilers work or don’t - we must just rely on compilers working. OVS can
recommend or state that it is tested with specific compilers - but that is
an independent issue to this patchset.

> Noting the configure, make CFLAGS dependency might be of use too
> although again, depends on how people configure and compile OVS to date.

Example commands and documentation added to remedy this.

> Lastly possibly adding a section on what to check if AVX512 lookup is
> not appearing might be useful also.

Added section on potential issues regarding binutils bug, and how to remedy.

> BR
> Ian

Thanks for review, -Harry

> >
> > v5:
> > - Include NEWS item updates.
> >
> > v4:
> > - Fix typos (William Tu)
> > - Update get commands to use include "prio" as updated in v4
> > - Add section on enabling autovalidator by default for unit tests
> > ---
> >  Documentation/topics/dpdk/bridge.rst | 77 ++++++++++++++++++++++++++++
> >  NEWS                                 |  3 ++
> >  2 files changed, 80 insertions(+)
> >
> > diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> > index f0ef42ecc..526d5c959 100644
> > --- a/Documentation/topics/dpdk/bridge.rst
> > +++ b/Documentation/topics/dpdk/bridge.rst
> > @@ -137,3 +137,80 @@ currently turned off by default.
> >  To turn on SMC::
> >
> >      $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true
> > +
> > +Datapath Classifier Performance
> > +-------------------------------
> > +
> > +The datapath classifier (dpcls) performs wildcard rule matching, a compute
> > +intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The
> > +code that does this compute work impacts datapath performance, and optimizing
> > +it can provide higher switching performance.
> > +
> > +Modern CPUs provide extensive SIMD instructions which can be used to get higher
> > +performance. The CPU OVS is being deployed on must be capable of running these
> > +SIMD instructions in order to take advantage of the performance benefits.
> > +In OVS v2.14 runtime CPU detection was introduced to enable identifying if
> > +these CPU ISA additions are available, and to allow the user to enable them.
> > +
> > +OVS provides multiple implementations of dpcls. The following command enables
> > +the user to check what implementations are available in a running instance ::
> > +
> > +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> > +    Available lookup functions (priority : name)
> > +      0 : autovalidator
> > +      1 : generic
> > +      0 : avx512_gather
> > +
> > +To set the priority of a lookup function, run the ``prio-set`` command ::
> > +
> > +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5
> > +    Lookup priority change affected 1 dpcls ports and 1 subtables.
> > +
> > +The highest priority lookup function is used for classification, and the output
> > +above indicates that one subtable of one DPCLS port is has changed its lookup
> > +function due to the command being run. To verify the prioritization, re-run the
> > +get command, note the updated priority of the ``avx512_gather`` function ::
> > +
> > +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
> > +    Available lookup functions (priority : name)
> > +      0 : autovalidator
> > +      1 : generic
> > +      5 : avx512_gather
> > +
> > +If two lookup functions have the same priority, the first one in the list is
> > +chosen, and the 2nd occurance of that priority is not used. Put in logical
> > +terms, a subtable is chosen if its priority is greater than the previous
> > +best candidate.
> > +
> > +CPU ISA Testing and Validation
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +As multiple versions of DPCLS can co-exist, each with different CPU ISA
> > +optimizations, it is important to validate that they all give the exact same
> > +results. To easily test all DPCLS implementations, an ``autovalidator``
> > +implementation of the DPCLS exists. This implementation runs all other
> > +available DPCLS implementations, and verifies that the results are identical.
> > +
> > +Running the OVS unit tests with the autovalidator enabled ensures all
> > +implementations provide the same results. Note that the performance of the
> > +autovalidator is lower than all other implementations, as it tests the scalar
> > +implementation against itself, and against all other enabled DPCLS
> > +implementations.
> > +
> > +To adjust the DPCLS autovalidator priority, use this command ::
> > +
> > +    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7
> > +
> > +Running Unit Tests with Autovalidator
> > ++++++++++++++++++++++++++++++++++++++
> > +
> > +To run the OVS unit test suite with the DPCLS autovalidator as the default
> > +implementation, it is required to recompile OVS. During the recompilation,
> > +the default priority of the `autovalidator` implementation is set to the
> > +maximum priority, ensuring every test will be run with every lookup
> > +implementation ::
> > +
> > +    $ ./configure --enable-autovalidator
> > +
> > +Compile OVS in debug mode to have `ovs_assert` statements error out if
> > +there is a mis-match in the DPCLS lookup implementation.
> > diff --git a/NEWS b/NEWS
> > index 0116b3ea0..da8725b59 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -20,6 +20,9 @@ Post-v2.13.0
> >       * New configuration knob 'other_config:lb-output-action' for bond ports
> >         that enables new datapath action 'lb_output' to avoid recirculation
> >         in balance-tcp mode. Disabled by default.
> > +     * Add runtime CPU ISA detection to allow optimized ISA functions
> > +     * Add support for dynamically changing DPCLS subtable lookup functions
> > +     * Add ISA optimized DPCLS lookup function using AVX512
> >     - Tunnels: TC Flower offload
> >       * Tunnel Local endpoint address masked match are supported.
> >       * Tunnel Romte endpoint address masked match are supported.
> >
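As a possible starting point for Ian's suggestion above about what to check when the AVX512 lookup is not listed, the sketch below tests whether the CPU advertises a given ISA flag. It is not taken from the patch: the helper name `cpu_has_flag` is hypothetical, it assumes a Linux-style `/proc/cpuinfo` with a `flags` line, and treating `avx512f` as the relevant flag is an assumption, since the thread only says AVX512. A build-time cause, such as the binutils issue Harry mentions, would not be detected by this check.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): exit status 0 if the cpuinfo-style
# file lists the given flag on a "flags" line, non-zero otherwise.
# Usage: cpu_has_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
cpu_has_flag() {
    awk -v flag="$1" '
        # Compare whole fields so "avx512f" does not match "avx512bw".
        /^flags/ { for (i = 1; i <= NF; i++) if ($i == flag) found = 1 }
        END { exit !found }
    ' "${2:-/proc/cpuinfo}"
}
```

On the host running ovs-vswitchd, `cpu_has_flag avx512f && echo "avx512f advertised"` would then report whether the flag is present.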
diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
index f0ef42ecc..526d5c959 100644
--- a/Documentation/topics/dpdk/bridge.rst
+++ b/Documentation/topics/dpdk/bridge.rst
@@ -137,3 +137,80 @@ currently turned off by default.
 To turn on SMC::
 
     $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true
+
+Datapath Classifier Performance
+-------------------------------
+
+The datapath classifier (dpcls) performs wildcard rule matching, a compute
+intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The
+code that does this compute work impacts datapath performance, and optimizing
+it can provide higher switching performance.
+
+Modern CPUs provide extensive SIMD instructions which can be used to get higher
+performance. The CPU OVS is being deployed on must be capable of running these
+SIMD instructions in order to take advantage of the performance benefits.
+In OVS v2.14 runtime CPU detection was introduced to enable identifying if
+these CPU ISA additions are available, and to allow the user to enable them.
+
+OVS provides multiple implementations of dpcls. The following command enables
+the user to check what implementations are available in a running instance ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
+    Available lookup functions (priority : name)
+      0 : autovalidator
+      1 : generic
+      0 : avx512_gather
+
+To set the priority of a lookup function, run the ``prio-set`` command ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5
+    Lookup priority change affected 1 dpcls ports and 1 subtables.
+
+The highest priority lookup function is used for classification, and the output
+above indicates that one subtable of one DPCLS port is has changed its lookup
+function due to the command being run. To verify the prioritization, re-run the
+get command, note the updated priority of the ``avx512_gather`` function ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-get
+    Available lookup functions (priority : name)
+      0 : autovalidator
+      1 : generic
+      5 : avx512_gather
+
+If two lookup functions have the same priority, the first one in the list is
+chosen, and the 2nd occurance of that priority is not used. Put in logical
+terms, a subtable is chosen if its priority is greater than the previous
+best candidate.
+
+CPU ISA Testing and Validation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As multiple versions of DPCLS can co-exist, each with different CPU ISA
+optimizations, it is important to validate that they all give the exact same
+results. To easily test all DPCLS implementations, an ``autovalidator``
+implementation of the DPCLS exists. This implementation runs all other
+available DPCLS implementations, and verifies that the results are identical.
+
+Running the OVS unit tests with the autovalidator enabled ensures all
+implementations provide the same results. Note that the performance of the
+autovalidator is lower than all other implementations, as it tests the scalar
+implementation against itself, and against all other enabled DPCLS
+implementations.
+
+To adjust the DPCLS autovalidator priority, use this command ::
+
+    $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7
+
+Running Unit Tests with Autovalidator
++++++++++++++++++++++++++++++++++++++
+
+To run the OVS unit test suite with the DPCLS autovalidator as the default
+implementation, it is required to recompile OVS. During the recompilation,
+the default priority of the `autovalidator` implementation is set to the
+maximum priority, ensuring every test will be run with every lookup
+implementation ::
+
+    $ ./configure --enable-autovalidator
+
+Compile OVS in debug mode to have `ovs_assert` statements error out if
+there is a mis-match in the DPCLS lookup implementation.
diff --git a/NEWS b/NEWS
index 0116b3ea0..da8725b59 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,9 @@ Post-v2.13.0
      * New configuration knob 'other_config:lb-output-action' for bond ports
        that enables new datapath action 'lb_output' to avoid recirculation
        in balance-tcp mode. Disabled by default.
+     * Add runtime CPU ISA detection to allow optimized ISA functions
+     * Add support for dynamically changing DPCLS subtable lookup functions
+     * Add ISA optimized DPCLS lookup function using AVX512
    - Tunnels: TC Flower offload
      * Tunnel Local endpoint address masked match are supported.
      * Tunnel Romte endpoint address masked match are supported.
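The selection rule in the documentation above (a lookup function is chosen only if its priority is greater than the previous best candidate, so the first of two equal priorities wins) can be sketched as a small filter over the `prio-get` listing. This is an illustration, not OVS code: the function name `select_lookup` is hypothetical, and it assumes that the order of the listing matches the order in which OVS considers the implementations.

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): read "priority : name" rows on stdin
# and print the name that the "strictly greater than the previous best" rule
# would pick. Rows without a numeric priority (such as the header) are skipped.
select_lookup() {
    awk -F' : ' -v best=-1 '
        $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
            prio = $1 + 0
            name = $2
            sub(/[ \t\r]+$/, "", name)      # drop trailing whitespace
            # Strict comparison: an equal priority later in the list loses.
            if (prio > best) { best = prio; chosen = name }
        }
        END { if (chosen != "") print chosen }
    '
}
```

On a running instance this could be used as `ovs-appctl dpif-netdev/subtable-lookup-prio-get | select_lookup`. Fed the first sample listing from the patch it prints `generic`; after `avx512_gather` is raised to priority 5 it prints `avx512_gather`.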
This commit adds a section to the dpdk/bridge.rst netdev documentation,
detailing the added DPCLS functionality. The newly added commands are
documented, and sample output is provided.

Running the DPCLS autovalidator with unit tests by default is possible
through re-compiling the autovalidator to have the highest priority at
startup time. This avoids making changes to all tests, and enables
debug and CI builds to validate every lookup implementation with all
unit tests.

Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
---

v5:
- Include NEWS item updates.

v4:
- Fix typos (William Tu)
- Update get commands to use include "prio" as updated in v4
- Add section on enabling autovalidator by default for unit tests
---
 Documentation/topics/dpdk/bridge.rst | 77 ++++++++++++++++++++++++++++
 NEWS                                 |  3 ++
 2 files changed, 80 insertions(+)