mbox series

[ovs-dev,v11,0/5] dpcls func ptrs & optimizations

Message ID 20190715163636.51572-1-harry.van.haaren@intel.com
Headers show
Series dpcls func ptrs & optimizations | expand

Message

Van Haaren, Harry July 15, 2019, 4:36 p.m. UTC
Hey Folks,

Here a v11 of the DPCLS Function Pointer patchset, as has been
presented at OVS Conf in Nov '18, and discussed on the ML since then.
I'm aware of the soft-freeze for 2.12, I feel this patchset has had
enough reviews/versions/testing to be merged in 2.12.

Thanks Ilya and Ian for review comments on v10, they should all be addressed
in this v11. Patchset details below, summary of v11 changes as follows:
- Reworked to use hash_add_words64(), unfortunatly hash_words64_inline() did
  not provide the same has due to it calling hash_finish() with a different
  "final" value. Refactored to re-use as much as we can of the existing code.
- Reworked specialized functions to use MACROs as suggested in review. This
  makes the specialization of functions much more tidy - good suggestion!
- Reworked lots of little fixes, Captials and stops.
- Added NEWS entry in "Userspace Datapath" section on DPCLS function pointers
  and how specialization of subtables minfilows gains search performance.
- See per patch --- v11 notes for details :)

Regards, -Harry


Patchset details;
1) Refactor dpcls_lookup and the subtable for flexibility.
In particular, add a function pointer to the subtable
structure, which enables "plugging-in" a lookup function
at runtime. This enables a number of optimizations in future.

2) and 3)
With the function pointer in place, we refactor the existing
dpcls_lookup matching code into its own function, and later its
own file. To split it to its own file requires making various
dpcls data-structures available in dpif-netdev-private.h.

4)
Refactor the existing code, to favour compute of flat arrays of
miniflows, instead of the MACRO based iteration. This simplifies
the code itself, and makes future optimizations possible due to
simplified loop structures, and loop trip counts pass in via
function arguments. See commit message for more details.

5)
This patch implements a select few specialized functions, for handling
miniflows with 5-1, 4-1, and 4-0 miniflow unit bit patterns. More of
these types of functions can (and should) be added to accelerate other
patterns of subtable lookups! See commit message for more details.


Harry van Haaren (5):
  dpif-netdev: Implement function pointers/subtable
  dpif-netdev: Move dpcls lookup structures to .h
  dpif-netdev: Split out generic lookup function
  dpif-netdev: Refactor generic implementation
  dpif-netdev: Add specialized generic scalar functions

 NEWS                             |   4 +
 lib/automake.mk                  |   2 +
 lib/dpif-netdev-lookup-generic.c | 290 +++++++++++++++++++++++++++++++
 lib/dpif-netdev-private.h        | 129 ++++++++++++++
 lib/dpif-netdev.c                | 197 ++++++++++-----------
 5 files changed, 525 insertions(+), 97 deletions(-)
 create mode 100644 lib/dpif-netdev-lookup-generic.c
 create mode 100644 lib/dpif-netdev-private.h

Comments

Stokes, Ian July 16, 2019, 9:06 p.m. UTC | #1
On 7/15/2019 5:36 PM, Harry van Haaren wrote:
> Hey Folks,
> 
> Here a v11 of the DPCLS Function Pointer patchset, as has been
> presented at OVS Conf in Nov '18, and discussed on the ML since then.
> I'm aware of the soft-freeze for 2.12, I feel this patchset has had
> enough reviews/versions/testing to be merged in 2.12.
> 
> Thanks Ilya and Ian for review comments on v10, they should all be addressed
> in this v11. Patchset details below, summary of v11 changes as follows:
> - Reworked to use hash_add_words64(), unfortunatly hash_words64_inline() did
>    not provide the same has due to it calling hash_finish() with a different
>    "final" value. Refactored to re-use as much as we can of the existing code.
> - Reworked specialized functions to use MACROs as suggested in review. This
>    makes the specialization of functions much more tidy - good suggestion!
> - Reworked lots of little fixes, Captials and stops.
> - Added NEWS entry in "Userspace Datapath" section on DPCLS function pointers
>    and how specialization of subtables minfilows gains search performance.
> - See per patch --- v11 notes for details :)
> 
> Regards, -Harry

Thanks for working on the v11 Harry.

Patches 1-4 in the series looks ok to me. I have some minor comments on 
patch 5.

I've ran it through the usual validation (travis, appveyor, read the 
docs etc). All passing without issue.

I'm just running some performance tests now again with vsperf.

For rfc2544 0% traffic loss for ovs flow scalability 
(phy2phy_scalability) I'm seeing a 10% increase in performance for 64 
byte packets and in the case of scalability where loss is allowed 
(phy2phy_scalability_cont) then I see an 8% increase in performance on 
these baselines.

@Ilya, do you have input on these patches or has Harry addressed your 
concerns in the latest revision?

Regards
Ian
Ilya Maximets July 17, 2019, 10:18 a.m. UTC | #2
On 17.07.2019 0:06, Ian Stokes wrote:
> On 7/15/2019 5:36 PM, Harry van Haaren wrote:
>> Hey Folks,
>>
>> Here a v11 of the DPCLS Function Pointer patchset, as has been
>> presented at OVS Conf in Nov '18, and discussed on the ML since then.
>> I'm aware of the soft-freeze for 2.12, I feel this patchset has had
>> enough reviews/versions/testing to be merged in 2.12.
>>
>> Thanks Ilya and Ian for review comments on v10, they should all be addressed
>> in this v11. Patchset details below, summary of v11 changes as follows:
>> - Reworked to use hash_add_words64(), unfortunatly hash_words64_inline() did
>>    not provide the same has due to it calling hash_finish() with a different
>>    "final" value. Refactored to re-use as much as we can of the existing code.
>> - Reworked specialized functions to use MACROs as suggested in review. This
>>    makes the specialization of functions much more tidy - good suggestion!
>> - Reworked lots of little fixes, Captials and stops.
>> - Added NEWS entry in "Userspace Datapath" section on DPCLS function pointers
>>    and how specialization of subtables minfilows gains search performance.
>> - See per patch --- v11 notes for details :)
>>
>> Regards, -Harry
> 
> Thanks for working on the v11 Harry.
> 
> Patches 1-4 in the series looks ok to me. I have some minor comments on patch 5.
> 
> I've ran it through the usual validation (travis, appveyor, read the docs etc). All passing without issue.
> 
> I'm just running some performance tests now again with vsperf.
> 
> For rfc2544 0% traffic loss for ovs flow scalability (phy2phy_scalability) I'm seeing a 10% increase in performance for 64 byte packets and in the case of scalability where loss is allowed (phy2phy_scalability_cont) then I see an 8% increase in performance on these baselines.
> 
> @Ilya, do you have input on these patches or has Harry addressed your concerns in the latest revision?

I've sent some comments for the patches.

BTW, I'm planning to test the performance of generic (non-optimized)
implementation today to compare with the previous implementation.

Best regards, Ilya Maximets.