[ovs-dev,v10,4/5] docs: Describe output packet batching in DPDK guide.

Message ID 1515755828-1848-5-git-send-email-i.maximets@samsung.com
State Superseded
Series
  • Output packet batching (Time-based).

Commit Message

Ilya Maximets Jan. 12, 2018, 11:17 a.m.
Added information about output packet batching and a way to
configure 'tx-flush-interval'.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Co-authored-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
---
 Documentation/intro/install/dpdk.rst | 58 ++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

Comments

Jan Scheurich Jan. 12, 2018, 4:53 p.m. | #1
Hi,

I still find the way DPDK-related topics are distributed over the various documentation files rather confusing:
./Documentation/intro/install/dpdk.rst
./Documentation/howto/dpdk.rst
./Documentation/topics/dpdk/index.rst
./Documentation/topics/dpdk/vhost-user.rst
./Documentation/topics/dpdk/ring.rst

Why does information like this go into intro/install/dpdk.rst rather than howto/dpdk.rst?
But cleaning this up is probably another exercise....

Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>

BR, Jan

Ilya Maximets Jan. 15, 2018, 7:34 a.m. | #2
On 12.01.2018 19:53, Jan Scheurich wrote:
> Hi,
> 
> I still find the way DPDK-related topics are distributed over the various documentation files rather confusing:
> ./Documentation/intro/install/dpdk.rst
> ./Documentation/howto/dpdk.rst
> ./Documentation/topics/dpdk/index.rst
> ./Documentation/topics/dpdk/vhost-user.rst
> ./Documentation/topics/dpdk/ring.rst
> 
> Why does information like this go into intro/install/dpdk.rst rather than howto/dpdk.rst?

Oh.. It's hard to say. Initially, I wanted to place this description near the
"Exact Match Cache" description, as it's a feature of the same level (somehow).

I don't like the current documentation either. It definitely should be merged or re-split
somehow. Sometimes it's really hard to find what you're looking for in all these docs.
And it's even harder when you're trying to use compiled documentation.

> But cleaning this up is probably another exercise....

Yes. That should be a separate change.

> 
> Acked-by: Jan Scheurich <jan.scheurich@ericsson.com>
> 
> BR, Jan
> 
Jan Scheurich Jan. 15, 2018, 7:53 a.m. | #3
> > I still find the way DPDK-related topics are distributed over the various documentation files rather confusing:
> > ./Documentation/intro/install/dpdk.rst
> > ./Documentation/howto/dpdk.rst
> > ./Documentation/topics/dpdk/index.rst
> > ./Documentation/topics/dpdk/vhost-user.rst
> > ./Documentation/topics/dpdk/ring.rst
> >
> > Why does information like this go into intro/install/dpdk.rst rather than howto/dpdk.rst?
> 
> Oh.. It's hard to say. Initially, I wanted to place this description near the
> "Exact Match Cache" description, as it's a feature of the same level (somehow).
> 
> I don't like the current documentation either. It definitely should be merged or re-split
> somehow. Sometimes it's really hard to find what you're looking for in all these docs.
> And it's even harder when you're trying to use compiled documentation.
> 
> > But cleaning this up is probably another exercise....
> 
> Yes. That should be a separate change.

So let's put that restructuring up as a work item for 2.10 and discuss in the OVS-DPDK community meeting what to aim for and how to do it.

/Jan

Patch

diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index 3fecb5c..040e62e 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -568,6 +568,64 @@  not needed i.e. jumbo frames are not needed, it can be forced off by adding
 chains of descriptors it will make more individual virtio descriptors available
 for rx to the guest using dpdkvhost ports and this can improve performance.
 
+Output Packet Batching
+~~~~~~~~~~~~~~~~~~~~~~
+
+To take advantage of batched transmit functions, OVS collects packets in
+intermediate queues before sending when processing a batch of received packets.
+Even if packets are matched by different flows, OVS uses a single send
+operation for all packets destined to the same output port.
+
+Furthermore, OVS is able to buffer packets in these intermediate queues for a
+configurable amount of time to reduce the frequency of send bursts at medium
+load levels when the packet receive rate is high but the receive batch size is
+still very small. This is particularly beneficial for packets transmitted to
+VMs using an interrupt-driven virtio driver, where the interrupt overhead is
+significant for the OVS PMD, the host operating system and the guest driver.
+
+The ``tx-flush-interval`` parameter can be used to specify the time in
+microseconds OVS should wait between two send bursts to a given port (default
+is ``0``). When the intermediate queue fills up before that time is over, the
+buffered packet batch is sent immediately::
+
+    $ ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50
+
+This parameter influences both throughput and latency, depending on the traffic
+load on the port. In general, lower values decrease latency while higher values
+may be useful to achieve higher throughput.
+
+Low traffic (``packet rate < 1 / tx-flush-interval``) should not experience
+any significant latency or throughput increase as packets are forwarded
+immediately.
+
+At intermediate load levels
+(``1 / tx-flush-interval < packet rate < 32 / tx-flush-interval``) traffic
+should experience an average latency increase of up to
+``tx-flush-interval / 2`` and a possible throughput improvement.
+
+Very high traffic (``packet rate >> 32 / tx-flush-interval``) should experience
+an average latency increase equal to ``32 / (2 * packet rate)``. Most send
+batches in this case will contain the maximum number of packets (``32``).
+
+A ``tx-flush-interval`` value of ``50`` microseconds has been shown to provide
+a good performance increase in a ``PHY-VM-PHY`` scenario on an ``x86`` system
+for interrupt-driven guests while keeping the latency increase at a reasonable
+level:
+
+  https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341628.html
+
+.. note::
+  The throughput impact of this option depends significantly on the scenario
+  and the traffic patterns. For example, a ``tx-flush-interval`` value of
+  ``50`` microseconds shows performance degradation in a ``PHY-VM-PHY``
+  scenario with bonded PHY while testing with ``256 - 1024`` packet flows:
+
+    https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341700.html
+
+The average number of packets per output batch can be checked in PMD stats::
+
+    $ ovs-appctl dpif-netdev/pmd-stats-show
+
 Limitations
 ------------
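
Worked example (not part of the patch)

The three load regimes described in the patch can be restated as a small Python sketch. This is only an illustration of the formulas quoted in the documentation text, not OVS code: the function name and the `"disabled"` label are hypothetical, and treating every rate at or above ``32 / tx-flush-interval`` as "high" is a simplifying assumption (the patch text says "much greater than"). In the intermediate regime the returned value is the upper bound stated in the patch, not a measured average.

```python
# Illustrative sketch only: the helper name and the simplified regime
# boundaries are assumptions, not part of OVS.

MAX_BATCH = 32  # maximum packets per output batch, as stated in the patch


def estimated_latency_increase_us(packet_rate_pps, tx_flush_interval_us):
    """Return (regime, estimated average latency increase in microseconds)."""
    if tx_flush_interval_us <= 0:
        # Default value 0: no time-based output batching.
        return ("disabled", 0.0)

    # Thresholds in packets per second: 1 / interval and 32 / interval.
    low_threshold = 1e6 / tx_flush_interval_us
    high_threshold = MAX_BATCH * 1e6 / tx_flush_interval_us

    if packet_rate_pps < low_threshold:
        # Packets are forwarded immediately.
        return ("low", 0.0)
    if packet_rate_pps < high_threshold:
        # Upper bound from the patch text: tx-flush-interval / 2.
        return ("intermediate", tx_flush_interval_us / 2)
    # The queue fills before the interval expires: 32 / (2 * packet rate).
    return ("high", MAX_BATCH * 1e6 / (2 * packet_rate_pps))


if __name__ == "__main__":
    for rate in (10_000, 100_000, 1_000_000):
        print(rate, estimated_latency_increase_us(rate, 50))
```

With ``tx-flush-interval=50``, the two thresholds work out to 20,000 and 640,000 packets per second, so the sample rates above fall into the low, intermediate, and high regimes respectively.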