mbox series

[ovs-dev,v4,0/7] Output packet batching.

Message ID 1507215962-17692-1-git-send-email-i.maximets@samsung.com
Headers show
Series Output packet batching. | expand

Message

Ilya Maximets Oct. 5, 2017, 3:05 p.m. UTC
This patch-set inspired by [1] from Bhanuprakash Bodireddy.
Implementation of [1] looks very complex and introduces many pitfalls [2]
for later code modifications like possible packet stucks.

This version targeted to make simple and flexible output packet batching on
higher level without introducing and even simplifying netdev layer.

Basic testing of 'PVP with OVS bonding on phy ports' scenario shows
significant performance improvement.

Test results for time-based batching for v3:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338247.html

[1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet transmission.
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html

[2] For example:
    https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html

Version 4:
	* Rebased on current master.
	* Rebased on top of "Keep latest measured time for PMD thread."
	  (Jan Scheurich)
	* Microsecond resolution related patches integrated.
	* Time-based batching without RFC tag.
	* 'output_time' renamed to 'flush_time'. (Jan Scheurich)
	* 'flush_time' update moved to 'dp_netdev_pmd_flush_output_on_port'.
	  (Jan Scheurich)
	* 'output-max-latency' renamed to 'tx-flush-interval'.
	* Added patch for output batching statistics.

Version 3:

	* Rebased on current master.
	* Time based RFC: fixed assert on n_output_batches <= 0.

Version 2:

	* Rebased on current master.
	* Added time based batching RFC patch.
	* Fixed mixing packets with different sources in same batch.


Ilya Maximets (7):
  dpif-netdev: Keep latest measured time for PMD thread.
  dpif-netdev: Output packet batching.
  netdev: Remove unused may_steal.
  netdev: Remove useless cutlen.
  timeval: Introduce time_usec().
  dpif-netdev: Time based output batching.
  dpif-netdev: Count sent packets and batches.

 lib/dpif-netdev.c     | 334 +++++++++++++++++++++++++++++++++++++-------------
 lib/netdev-bsd.c      |   6 +-
 lib/netdev-dpdk.c     |  70 ++++-------
 lib/netdev-dummy.c    |   6 +-
 lib/netdev-linux.c    |   8 +-
 lib/netdev-provider.h |   7 +-
 lib/netdev.c          |  12 +-
 lib/netdev.h          |   2 +-
 lib/timeval.c         |  22 ++++
 lib/timeval.h         |   2 +
 vswitchd/vswitch.xml  |  16 +++
 11 files changed, 336 insertions(+), 149 deletions(-)

Comments

Jan Scheurich Oct. 5, 2017, 9:50 p.m. UTC | #1
Thanks, Ilya, for the new version!
We will give it a try and come back with new test results and reviews.
Regards, Jan

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets@samsung.com]
> Sent: Thursday, 05 October, 2017 17:06
> To: ovs-dev@openvswitch.org; Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
> Cc: Heetae Ahn <heetae82.ahn@samsung.com>; Antonio Fischetti <antonio.fischetti@intel.com>; Eelco Chaudron
> <echaudro@redhat.com>; Ciara Loftus <ciara.loftus@intel.com>; Kevin Traynor <ktraynor@redhat.com>; Darrell Ball
> <dball@vmware.com>; Jan Scheurich <jan.scheurich@ericsson.com>; Ilya Maximets <i.maximets@samsung.com>
> Subject: [PATCH v4 0/7] Output packet batching.
> 
> This patch-set inspired by [1] from Bhanuprakash Bodireddy.
> Implementation of [1] looks very complex and introduces many pitfalls [2]
> for later code modifications like possible packet stucks.
> 
> This version targeted to make simple and flexible output packet batching on
> higher level without introducing and even simplifying netdev layer.
> 
> Basic testing of 'PVP with OVS bonding on phy ports' scenario shows
> significant performance improvement.
> 
> Test results for time-based batching for v3:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338247.html
> 
> [1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet transmission.
>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html
> 
> [2] For example:
>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html
> 
> Version 4:
> 	* Rebased on current master.
> 	* Rebased on top of "Keep latest measured time for PMD thread."
> 	  (Jan Scheurich)
> 	* Microsecond resolution related patches integrated.
> 	* Time-based batching without RFC tag.
> 	* 'output_time' renamed to 'flush_time'. (Jan Scheurich)
> 	* 'flush_time' update moved to 'dp_netdev_pmd_flush_output_on_port'.
> 	  (Jan Scheurich)
> 	* 'output-max-latency' renamed to 'tx-flush-interval'.
> 	* Added patch for output batching statistics.
> 
> Version 3:
> 
> 	* Rebased on current master.
> 	* Time based RFC: fixed assert on n_output_batches <= 0.
> 
> Version 2:
> 
> 	* Rebased on current master.
> 	* Added time based batching RFC patch.
> 	* Fixed mixing packets with different sources in same batch.
> 
> 
> Ilya Maximets (7):
>   dpif-netdev: Keep latest measured time for PMD thread.
>   dpif-netdev: Output packet batching.
>   netdev: Remove unused may_steal.
>   netdev: Remove useless cutlen.
>   timeval: Introduce time_usec().
>   dpif-netdev: Time based output batching.
>   dpif-netdev: Count sent packets and batches.
> 
>  lib/dpif-netdev.c     | 334 +++++++++++++++++++++++++++++++++++++-------------
>  lib/netdev-bsd.c      |   6 +-
>  lib/netdev-dpdk.c     |  70 ++++-------
>  lib/netdev-dummy.c    |   6 +-
>  lib/netdev-linux.c    |   8 +-
>  lib/netdev-provider.h |   7 +-
>  lib/netdev.c          |  12 +-
>  lib/netdev.h          |   2 +-
>  lib/timeval.c         |  22 ++++
>  lib/timeval.h         |   2 +
>  vswitchd/vswitch.xml  |  16 +++
>  11 files changed, 336 insertions(+), 149 deletions(-)
> 
> --
> 2.7.4
Eelco Chaudron Oct. 11, 2017, 8:47 a.m. UTC | #2
Hi Ilya,

Sorry for the late response, as I was rather busy and did not find time 
to look
at your revisions 1 till 3. Hopefully, I can make it up looking at v4...

I did some tests in-line with the earlier tests I did with Bhanu's patch 
series.
Here is a comparison for a simple PVP test using a single physical port 
with 64
bytes packets (wire speed 10G), single PMD thread:

#Flows  master     patched
======  =========  =========
     10  3,123,350  4,174,807  pps
     32  2,090,440  3,625,314  pps
     50  1,954,184  3,499,402  pps
    100  1,705,794  3,264,955  pps
    500  1,601,252  2,956,190  pps
   1000  1,568,175  2,712,385  pps

In addition, I did some latency statistics based on a PVP setup with two
physical ports, and one virtual port, and two OpenFlow rules:

ovs-ofctl add-flow br0 "in_port=dpdk0,action=vhost0"
ovs-ofctl add-flow br0 "in_port=vhost0,action=dpdk1"

Also, note that there is some deviation on latency numbers, so I did 4 
runs and
reported the min-max values.

First the master results:

Summary (flows = 30, 10G line rate = 95%, runtime = 60 seconds):

   Pkt size     min(ns)     avg(ns)          max(ns)
   --------  -------------  ---------------  -----------------
        512  7,437 - 7,469  11,416 - 13,770   99,395 - 112,296
       1024  7,197 - 7,221  11,277 - 12,379   42,876 -  47,230
       1280  7,373 - 7,549  10,647 - 12,528   37,240 -  42,235
       1518  8,046 - 8,135  11,808 - 12,931   36,534 -  46,388

And the patched results:

   Pkt size     min(ns)     avg(ns)          max(ns)
   --------  -------------  ---------------  -----------------
        512  7,605 - 7,662  11,711 - 14,053   56,603 - 121,059
       1024  7,285 - 7,317  11,291 - 12,695   44,753 -  69,624
       1280  7,605 - 7,702  10,842 - 12,685   37,047 -  45,747
       1518  8,111 - 8,159  11,434 - 13,045   38,587 -  41,754

As you can see above for the default configuration there is a minimal 
latency
increase. I assume you did some latency tests yourself, and I hope these
numbers match your findings...

I do have some small comments on your patchsets but will address them 
replying
to the individual emails.

Cheers,

Eelco

On 05/10/17 17:05, Ilya Maximets wrote:
> This patch-set inspired by [1] from Bhanuprakash Bodireddy.
> Implementation of [1] looks very complex and introduces many pitfalls [2]
> for later code modifications like possible packet stucks.
>
> This version targeted to make simple and flexible output packet batching on
> higher level without introducing and even simplifying netdev layer.
>
> Basic testing of 'PVP with OVS bonding on phy ports' scenario shows
> significant performance improvement.
>
> Test results for time-based batching for v3:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338247.html
>
> [1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet transmission.
>      https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html
>
> [2] For example:
>      https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html
>
> Version 4:
> 	* Rebased on current master.
> 	* Rebased on top of "Keep latest measured time for PMD thread."
> 	  (Jan Scheurich)
> 	* Microsecond resolution related patches integrated.
> 	* Time-based batching without RFC tag.
> 	* 'output_time' renamed to 'flush_time'. (Jan Scheurich)
> 	* 'flush_time' update moved to 'dp_netdev_pmd_flush_output_on_port'.
> 	  (Jan Scheurich)
> 	* 'output-max-latency' renamed to 'tx-flush-interval'.
> 	* Added patch for output batching statistics.
>
> Version 3:
>
> 	* Rebased on current master.
> 	* Time based RFC: fixed assert on n_output_batches <= 0.
>
> Version 2:
>
> 	* Rebased on current master.
> 	* Added time based batching RFC patch.
> 	* Fixed mixing packets with different sources in same batch.
>
>
> Ilya Maximets (7):
>    dpif-netdev: Keep latest measured time for PMD thread.
>    dpif-netdev: Output packet batching.
>    netdev: Remove unused may_steal.
>    netdev: Remove useless cutlen.
>    timeval: Introduce time_usec().
>    dpif-netdev: Time based output batching.
>    dpif-netdev: Count sent packets and batches.
>
>   lib/dpif-netdev.c     | 334 +++++++++++++++++++++++++++++++++++++-------------
>   lib/netdev-bsd.c      |   6 +-
>   lib/netdev-dpdk.c     |  70 ++++-------
>   lib/netdev-dummy.c    |   6 +-
>   lib/netdev-linux.c    |   8 +-
>   lib/netdev-provider.h |   7 +-
>   lib/netdev.c          |  12 +-
>   lib/netdev.h          |   2 +-
>   lib/timeval.c         |  22 ++++
>   lib/timeval.h         |   2 +
>   vswitchd/vswitch.xml  |  16 +++
>   11 files changed, 336 insertions(+), 149 deletions(-)
>
Bodireddy, Bhanuprakash Oct. 13, 2017, 5:20 p.m. UTC | #3
>Hi Ilya,

>

>Sorry for the late response, as I was rather busy and did not find time to look

>at your revisions 1 till 3. Hopefully, I can make it up looking at v4...

>

>I did some tests in-line with the earlier tests I did with Bhanu's patch series.

>Here is a comparison for a simple PVP test using a single physical port with 64

>bytes packets (wire speed 10G), single PMD thread:

>

>#Flows  master     patched

>======  =========  =========

>     10  3,123,350  4,174,807  pps

>     32  2,090,440  3,625,314  pps

>     50  1,954,184  3,499,402  pps

>    100  1,705,794  3,264,955  pps

>    500  1,601,252  2,956,190  pps

>   1000  1,568,175  2,712,385  pps

>

>In addition, I did some latency statistics based on a PVP setup with two

>physical ports, and one virtual port, and two OpenFlow rules:

>

>ovs-ofctl add-flow br0 "in_port=dpdk0,action=vhost0"

>ovs-ofctl add-flow br0 "in_port=vhost0,action=dpdk1"

>

>Also, note that there is some deviation on latency numbers, so I did 4 runs and

>reported the min-max values.

>

>First the master results:

>

>Summary (flows = 30, 10G line rate = 95%, runtime = 60 seconds):

>

>   Pkt size     min(ns)     avg(ns)          max(ns)

>   --------  -------------  ---------------  -----------------

>        512  7,437 - 7,469  11,416 - 13,770   99,395 - 112,296

>       1024  7,197 - 7,221  11,277 - 12,379   42,876 -  47,230

>       1280  7,373 - 7,549  10,647 - 12,528   37,240 -  42,235

>       1518  8,046 - 8,135  11,808 - 12,931   36,534 -  46,388

>

>And the patched results:

>

>   Pkt size     min(ns)     avg(ns)          max(ns)

>   --------  -------------  ---------------  -----------------

>        512  7,605 - 7,662  11,711 - 14,053   56,603 - 121,059

>       1024  7,285 - 7,317  11,291 - 12,695   44,753 -  69,624

>       1280  7,605 - 7,702  10,842 - 12,685   37,047 -  45,747

>       1518  8,111 - 8,159  11,434 - 13,045   38,587 -  41,754

>

>As you can see above for the default configuration there is a minimal latency

>increase. I assume you did some latency tests yourself, and I hope these

>numbers match your findings...


Thanks for sharing the numbers Eelco. This should be useful for future reference.

Also please note that with batching patches applied there is a small performance drop
In P2P test case(non batching scenario). This shouldn't be a concern at this point but 
VSPERF  may raise a red flag with some of its test cases when this series is applied.

Other than that the series looks good to me. I have asked Ian to check QoS functionality with
this series + incremental patch(that fixes the known issue with policer) to check for any other corner cases.

- Bhanuprakash.

>

>I do have some small comments on your patchsets but will address them

>replying to the individual emails.

>

>Cheers,

>

>Eelco

>

>On 05/10/17 17:05, Ilya Maximets wrote:

>> This patch-set inspired by [1] from Bhanuprakash Bodireddy.

>> Implementation of [1] looks very complex and introduces many pitfalls

>> [2] for later code modifications like possible packet stucks.

>>

>> This version targeted to make simple and flexible output packet

>> batching on higher level without introducing and even simplifying netdev

>layer.

>>

>> Basic testing of 'PVP with OVS bonding on phy ports' scenario shows

>> significant performance improvement.

>>

>> Test results for time-based batching for v3:

>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338247.h

>> tml

>>

>> [1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet

>transmission.

>>

>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html

>>

>> [2] For example:

>>

>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html

>>

>> Version 4:

>> 	* Rebased on current master.

>> 	* Rebased on top of "Keep latest measured time for PMD thread."

>> 	  (Jan Scheurich)

>> 	* Microsecond resolution related patches integrated.

>> 	* Time-based batching without RFC tag.

>> 	* 'output_time' renamed to 'flush_time'. (Jan Scheurich)

>> 	* 'flush_time' update moved to

>'dp_netdev_pmd_flush_output_on_port'.

>> 	  (Jan Scheurich)

>> 	* 'output-max-latency' renamed to 'tx-flush-interval'.

>> 	* Added patch for output batching statistics.

>>

>> Version 3:

>>

>> 	* Rebased on current master.

>> 	* Time based RFC: fixed assert on n_output_batches <= 0.

>>

>> Version 2:

>>

>> 	* Rebased on current master.

>> 	* Added time based batching RFC patch.

>> 	* Fixed mixing packets with different sources in same batch.

>>

>>

>> Ilya Maximets (7):

>>    dpif-netdev: Keep latest measured time for PMD thread.

>>    dpif-netdev: Output packet batching.

>>    netdev: Remove unused may_steal.

>>    netdev: Remove useless cutlen.

>>    timeval: Introduce time_usec().

>>    dpif-netdev: Time based output batching.

>>    dpif-netdev: Count sent packets and batches.

>>

>>   lib/dpif-netdev.c     | 334 +++++++++++++++++++++++++++++++++++++--

>-----------

>>   lib/netdev-bsd.c      |   6 +-

>>   lib/netdev-dpdk.c     |  70 ++++-------

>>   lib/netdev-dummy.c    |   6 +-

>>   lib/netdev-linux.c    |   8 +-

>>   lib/netdev-provider.h |   7 +-

>>   lib/netdev.c          |  12 +-

>>   lib/netdev.h          |   2 +-

>>   lib/timeval.c         |  22 ++++

>>   lib/timeval.h         |   2 +

>>   vswitchd/vswitch.xml  |  16 +++

>>   11 files changed, 336 insertions(+), 149 deletions(-)

>>