[ovs-dev,v4,00/11] DPIF & MFEX Refactor and SIMD optimization

Message ID 20201125184342.2715681-1-harry.van.haaren@intel.com

Van Haaren, Harry Nov. 25, 2020, 6:43 p.m. UTC
v4 summary:
- Updated and improved DPIF component
--- SMC now implemented
--- EMC handling improved
--- Novel batching method using AVX512 implemented
--- see commits for details
- Updated Miniflow Extract component
--- Improved AVX512 code path performance
--- Implemented multiple TODO items from v3
--- Add "disable" implementation to return to scalar miniflow only
--- More fixes planned for v5/future revisions:
---- Rename command to better reflect usage
---- Make patterns more dynamic
---- Add more demo protocols to show usage
- Future work
--- Documentation/NEWS items
--- Statistics for optimized MFEX
- Note that this patchset will be discussed/presented at OvsConf soon :)


v3 update summary:
(Cian Ferriter helping with rebases, review and code cleanups)
- Split out partially related changes (these will be sent separately)
--- netdev output action optimization
--- avx512 dpcls 16-block support optimization
- Squash commit which moves netdev struct flow into the refactor commit:
--- Squash dpif-netdev: move netdev flow struct to header
--- Into dpif-netdev: Refactor to multiple header files
- Implement Miniflow extract for AVX-512 DPIF
--- A generic method of matching patterns and packets is implemented,
    providing traffic-pattern specific miniflow-extract acceleration.
--- The patterns today are hard-coded, however in a future patchset it
    is intended to make these runtime configurable, allowing users to
    optimize the SIMD miniflow extract for active traffic types.
- Notes:
--- 32 bit builds will be fixed in next release by adding flexible
    miniflow extract optimization selection.
--- AVX-512 VBMI ISA is not yet supported in OVS due to requiring the
    DPDK 20.11 update for RTE_CPUFLAG_*. Once on a newer DPDK this will
    be added.
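
To illustrate the pattern-matching idea described in the v3 notes, here is
a minimal scalar sketch. It is not the code from this series: the struct
layout, the names (mfex_pattern, pattern_matches, pattern_find), the
16-byte pattern length and the two hard-coded patterns are all assumptions
made for illustration. The AVX-512 path would presumably do the same
mask-and-compare across many bytes in a single vector operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PATTERN_LEN 16   /* Hypothetical: bytes of header inspected. */

/* Hypothetical pattern: a packet matches when (pkt & mask) == value. */
struct mfex_pattern {
    const char *name;
    uint8_t mask[PATTERN_LEN];    /* 0xFF where the byte must match. */
    uint8_t value[PATTERN_LEN];   /* Expected bytes, already masked. */
};

/* Hard-coded patterns, as in the v3 description. Bytes 12-13 are the
 * Ethertype; byte 14 is the IPv4 version/IHL byte. */
static const struct mfex_pattern patterns[] = {
    { "eth_ipv4",
      .mask  = { [12] = 0xFF, [13] = 0xFF, [14] = 0xFF },
      .value = { [12] = 0x08, [13] = 0x00, [14] = 0x45 } },
    { "eth_vlan",
      .mask  = { [12] = 0xFF, [13] = 0xFF },
      .value = { [12] = 0x81, [13] = 0x00 } },
};

/* Returns 1 if the first PATTERN_LEN bytes of the packet match. */
static int
pattern_matches(const struct mfex_pattern *p, const uint8_t *pkt, size_t len)
{
    assert(p != NULL && pkt != NULL);
    if (len < PATTERN_LEN) {
        return 0;
    }
    uint8_t diff = 0;
    for (size_t i = 0; i < PATTERN_LEN; i++) {
        /* Accumulate any mismatching bits; avoids a branch per byte. */
        diff |= (uint8_t) ((pkt[i] & p->mask[i]) ^ p->value[i]);
    }
    return diff == 0;
}

/* Returns the index of the first matching pattern, or -1 if none match,
 * in which case a caller would fall back to scalar miniflow extract. */
static int
pattern_find(const uint8_t *pkt, size_t len)
{
    for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        if (pattern_matches(&patterns[i], pkt, len)) {
            return (int) i;
        }
    }
    return -1;
}
```

A matched pattern index would then select a traffic-specific extract
routine; making the table runtime configurable is the future work
mentioned above.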

v2 updates:
- Includes DPIF command switching at runtime
- Includes AVX512 DPIF implementation
- Includes some partially related changes (can be split out of set?)
--- netdev output action optimization
--- avx512 dpcls 16-block support optimization


This patchset is v4 of the work to make the DPIF and miniflow extract
(MFEX) components of the userspace datapath more flexible. It takes the
same approach previously used for DPCLS: a function pointer allows
selection of an implementation at runtime.
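
As a rough sketch of the function-pointer approach (not the code in this
series), the example below registers several implementations behind one
signature and switches the active one by name. All names here (struct pkt,
input_func, input_impl_set, the "unrolled" variant and the probe
functions) are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packet type; the real datapath works on packet batches. */
struct pkt {
    unsigned len;
};

/* Every implementation shares one signature, so callers need not know
 * which one is active. */
typedef unsigned (*input_func)(const struct pkt *pkts, size_t n);

/* Reference implementation: always available. */
static unsigned
input_scalar(const struct pkt *pkts, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        total += pkts[i].len;
    }
    return total;
}

/* Stand-in for an ISA-optimized variant: unrolled by two, same result. */
static unsigned
input_unrolled(const struct pkt *pkts, size_t n)
{
    unsigned a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += pkts[i].len;
        b += pkts[i + 1].len;
    }
    if (i < n) {
        a += pkts[i].len;
    }
    return a + b;
}

/* Probes report whether an implementation may run on this CPU. A real
 * probe would check ISA flags; these are fixed for the example. */
static int probe_always(void) { return 1; }
static int probe_never(void) { return 0; }

struct input_impl {
    const char *name;
    int (*probe)(void);
    input_func func;
};

static const struct input_impl impls[] = {
    { "scalar", probe_always, input_scalar },
    { "unrolled", probe_always, input_unrolled },
    { "unsupported", probe_never, input_unrolled },
};

/* The pointer the hot path calls through; defaults to scalar. */
static input_func active_input = input_scalar;

/* Selects an implementation by name. Returns 0 on success, or -1 if the
 * name is unknown or the probe fails; the active pointer is then left
 * unchanged. */
static int
input_impl_set(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof impls / sizeof impls[0]; i++) {
        if (!strcmp(impls[i].name, name)) {
            if (!impls[i].probe()) {
                return -1;
            }
            active_input = impls[i].func;
            return 0;
        }
    }
    return -1;
}
```

A runtime command would call something like input_impl_set("unrolled"),
after which every call through active_input uses the new implementation.
Keeping the scalar version as the default means a failed probe never
leaves the datapath without a working function.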

The flexibility from the above changes enables ISA optimized
implementations of the DPIF and MFEX of the datapath. As these
ISA optimized implementations also require access to EMC/SMC/HWOL
features, these have been split out to separate header files.

The file splitting also improves maintainability, as dpif-netdev.c
has ~9000 LOC and is very hard to modify because many structs are
defined locally in the .c file, ruling out reuse in other .c files.

Questions welcomed! Regards, -Harry


Cian Ferriter (1):
  dpif-avx512: Add SMC support to AVX512 DPIF

Harry van Haaren (10):
  dpdk: Cache result of CPU ISA checks
  dpif-netdev: Move pmd_try_optimize function in file
  dpif-netdev: Refactor to multiple header files
  dpif-netdev: Split hwol out to own header file
  dpif-netdev: Add function pointer for netdev input
  dpif-avx512: Add ISA implementation of dpif
  dpif-netdev: Add command to switch dpif implementation
  dpif-netdev/dpcls: Refactor function names to dpcls
  dpif-netdev: enable ISA optimized miniflow extract
  dpif-netdev: enable scalar datapath with optimized miniflow extract

 acinclude.m4                           |   15 +
 configure.ac                           |    1 +
 lib/automake.mk                        |   16 +-
 lib/dpdk.c                             |   26 +-
 lib/dpif-netdev-avx512-extract.c       |  528 ++++++++++++
 lib/dpif-netdev-avx512-extract.h       |   40 +
 lib/dpif-netdev-avx512.c               |  274 +++++++
 lib/dpif-netdev-lookup-autovalidator.c |    1 -
 lib/dpif-netdev-lookup-avx512-gather.c |    1 -
 lib/dpif-netdev-lookup-generic.c       |    1 -
 lib/dpif-netdev-lookup.h               |    2 +-
 lib/dpif-netdev-private-dfc.h          |  252 ++++++
 lib/dpif-netdev-private-dpcls.h        |  127 +++
 lib/dpif-netdev-private-dpif.c         |  104 +++
 lib/dpif-netdev-private-dpif.h         |   61 ++
 lib/dpif-netdev-private-extract.c      |   72 ++
 lib/dpif-netdev-private-extract.h      |   60 ++
 lib/dpif-netdev-private-flow.h         |  155 ++++
 lib/dpif-netdev-private-hwol.h         |   63 ++
 lib/dpif-netdev-private-thread.h       |  222 +++++
 lib/dpif-netdev-private.h              |  123 +--
 lib/dpif-netdev.c                      | 1027 +++++++++---------------
 22 files changed, 2417 insertions(+), 754 deletions(-)
 create mode 100644 lib/dpif-netdev-avx512-extract.c
 create mode 100644 lib/dpif-netdev-avx512-extract.h
 create mode 100644 lib/dpif-netdev-avx512.c
 create mode 100644 lib/dpif-netdev-private-dfc.h
 create mode 100644 lib/dpif-netdev-private-dpcls.h
 create mode 100644 lib/dpif-netdev-private-dpif.c
 create mode 100644 lib/dpif-netdev-private-dpif.h
 create mode 100644 lib/dpif-netdev-private-extract.c
 create mode 100644 lib/dpif-netdev-private-extract.h
 create mode 100644 lib/dpif-netdev-private-flow.h
 create mode 100644 lib/dpif-netdev-private-hwol.h
 create mode 100644 lib/dpif-netdev-private-thread.h