diff mbox series

[ovs-dev] dp-packet: Avoid checks while preparing non-offloading packets.

Message ID 20240118164020.3382756-1-i.maximets@ovn.org
State Accepted
Commit bacd2c304a1f9cb2b31543ab1a07aa084eaf7db7
Delegated to: Ilya Maximets
Headers show
Series [ovs-dev] dp-packet: Avoid checks while preparing non-offloading packets. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed

Commit Message

Ilya Maximets Jan. 18, 2024, 4:38 p.m. UTC
Currently, dp_packet_ol_send_prepare() performs multiple checks for
each offloading flag separately.  That takes a noticeable amount of
extra cycles for packets that do not have any offloading flags set.

Skip most of the work if no checksumming flags are set.

The change improves performance of direct forwarding between two
virtio-user ports (V2V) by ~2.5 % and offsets all the negative
effects of TSO support introduced recently.

It adds an extra check to the offloading path, but it is not a
default configuration and also should take much smaller hit due
to lower number of larger packets.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Would be great to have performance numbers confirmed independently
on this one.  Better if with some physical ports as well.

I didn't list 'Fixes' tags since it's a little bit of a boiled frog
situation.  But I think this should be backported to branch-3.3 at
least.

 lib/dp-packet.c |  5 +++++
 lib/dp-packet.h | 11 +++++++++++
 2 files changed, 16 insertions(+)

Comments

Mike Pattrick Jan. 18, 2024, 6:34 p.m. UTC | #1
On Thu, Jan 18, 2024 at 11:40 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>
> Currently, dp_packet_ol_send_prepare() performs multiple checks for
> each offloading flag separately.  That takes a noticeable amount of
> extra cycles for packets that do not have any offloading flags set.
>
> Skip most of the work if no checksumming flags are set.
>
> The change improves performance of direct forwarding between two
> virtio-user ports (V2V) by ~2.5 % and offsets all the negative
> effects of TSO support introduced recently.
>
> It adds an extra check to the offloading path, but it is not a
> default configuration and also should take much smaller hit due
> to lower number of larger packets.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

I tested this in a V2V configuration as well and saw a small
performance improvement across 6 runs. I also tested making the check
an inline function and calling into the full send prepare if it
passes, but that change didn't yield any noticeable performance
improvement over the current solution.

Acked-by: Mike Pattrick <mkp@redhat.com>
Simon Horman Jan. 19, 2024, 12:27 p.m. UTC | #2
On Thu, Jan 18, 2024 at 05:38:19PM +0100, Ilya Maximets wrote:
> Currently, dp_packet_ol_send_prepare() performs multiple checks for
> each offloading flag separately.  That takes a noticeable amount of
> extra cycles for packets that do not have any offloading flags set.
> 
> Skip most of the work if no checksumming flags are set.
> 
> The change improves performance of direct forwarding between two
> virtio-user ports (V2V) by ~2.5 % and offsets all the negative
> effects of TSO support introduced recently.
> 
> It adds an extra check to the offloading path, but it is not a
> default configuration and also should take much smaller hit due
> to lower number of larger packets.
> 
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

Acked-by: Simon Horman <horms@ovn.org>

We seem to have a situation with unreliable test results
not related to this patch.

To that end the automated workflow run failed [1].
However, I did get a successful run in my own repository [2].

[1] https://github.com/ovsrobot/ovs/actions/runs/7573402009
[2] https://github.com/horms/ovs/actions/runs/7581285912

...
Ilya Maximets Jan. 19, 2024, 12:59 p.m. UTC | #3
On 1/18/24 19:34, Mike Pattrick wrote:
> On Thu, Jan 18, 2024 at 11:40 AM Ilya Maximets <i.maximets@ovn.org> wrote:
>>
>> Currently, dp_packet_ol_send_prepare() performs multiple checks for
>> each offloading flag separately.  That takes a noticeable amount of
>> extra cycles for packets that do not have any offloading flags set.
>>
>> Skip most of the work if no checksumming flags are set.
>>
>> The change improves performance of direct forwarding between two
>> virtio-user ports (V2V) by ~2.5 % and offsets all the negative
>> effects of TSO support introduced recently.
>>
>> It adds an extra check to the offloading path, but it is not a
>> default configuration and also should take much smaller hit due
>> to lower number of larger packets.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> 
> I tested this in a V2V configuration as well and saw a small
> performance improvement across 6 runs. I also tested making the check
> an inline function and calling into the full send prepare if it
> passes, but that change didn't yield any noticeable performance
> improvement over the current solution.
> 
> Acked-by: Mike Pattrick <mkp@redhat.com>
> 

Thanks, Mike and Simon!

Applied and backported to branch-3.3.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 9635cac8b..0e23c766e 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -576,6 +576,11 @@  dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t flags)
 {
     bool tnl_inner = false;
 
+    if (!dp_packet_hwol_tx_is_any_csum(p)) {
+        /* Only checksumming needs actions. */
+        return;
+    }
+
     if (dp_packet_hwol_is_tunnel_geneve(p) ||
         dp_packet_hwol_is_tunnel_vxlan(p)) {
         tnl_inner = true;
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 52e52b914..939bec5c8 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -131,6 +131,10 @@  enum dp_packet_offload_mask {
 #define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \
                                  DP_PACKET_OL_TX_UDP_CKSUM | \
                                  DP_PACKET_OL_TX_SCTP_CKSUM)
+#define DP_PACKET_OL_TX_ANY_CKSUM (DP_PACKET_OL_TX_L4_MASK | \
+                                   DP_PACKET_OL_TX_IP_CKSUM | \
+                                   DP_PACKET_OL_TX_OUTER_IP_CKSUM | \
+                                   DP_PACKET_OL_TX_OUTER_UDP_CKSUM)
 #define DP_PACKET_OL_RX_IP_CKSUM_MASK (DP_PACKET_OL_RX_IP_CKSUM_GOOD | \
                                        DP_PACKET_OL_RX_IP_CKSUM_BAD)
 #define DP_PACKET_OL_RX_L4_CKSUM_MASK (DP_PACKET_OL_RX_L4_CKSUM_GOOD | \
@@ -1189,6 +1193,13 @@  dp_packet_hwol_is_outer_udp_cksum(struct dp_packet *b)
     return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_OUTER_UDP_CKSUM);
 }
 
+/* Returns 'true' if packet 'b' is marked for any checksum offload. */
+static inline bool
+dp_packet_hwol_tx_is_any_csum(struct dp_packet *b)
+{
+    return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_ANY_CKSUM);
+}
+
 static inline void
 dp_packet_hwol_reset_tx_l4_csum(struct dp_packet *p)
 {