From patchwork Fri Dec 15 19:30:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mike Pattrick X-Patchwork-Id: 1876751 X-Patchwork-Delegate: horms@verge.net.au Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=TILkoNbt; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::137; helo=smtp4.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp4.osuosl.org (smtp4.osuosl.org [IPv6:2605:bc80:3010::137]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SsK7r2BrYz23nF for ; Sat, 16 Dec 2023 06:30:48 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 93A814254F; Fri, 15 Dec 2023 19:30:45 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 93A814254F Authentication-Results: smtp4.osuosl.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=TILkoNbt X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bVpmf6fiFCsC; Fri, 15 Dec 2023 19:30:40 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 4CDAA423FB; Fri, 15 
Dec 2023 19:30:38 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 4CDAA423FB Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 3722CC0DD2; Fri, 15 Dec 2023 19:30:36 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 4BCEEC0037 for ; Fri, 15 Dec 2023 19:30:34 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 74F71403C6 for ; Fri, 15 Dec 2023 19:30:33 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 74F71403C6 Authentication-Results: smtp2.osuosl.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=TILkoNbt X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YW1NLzH91WJK for ; Fri, 15 Dec 2023 19:30:31 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by smtp2.osuosl.org (Postfix) with ESMTPS id 2E70340201 for ; Fri, 15 Dec 2023 19:30:30 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 2E70340201 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1702668629; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=2beKRYhRQP4UF59vyIdMLxCMToQVMG7K8i4hiWH+C4A=; b=TILkoNbtsAbDeU9l6B8VEtbGJCenetl5YeR9yN/qad0pUYbMRZBOwjNvCawUMHPJf/+5US yEwCTH/OCFLI25UsuLw9zoFEtzigfEQBV+U4J+88dzinezGr9QBpJLs9mtsIKOt0k5CMwb 8CjixSeErtGlwkdQLy2ilVaWEh64eoo= Received: from 
mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-526-k-iqvyGGNMyqhnX1CePeNQ-1; Fri, 15 Dec 2023 14:30:28 -0500 X-MC-Unique: k-iqvyGGNMyqhnX1CePeNQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E99BF185A784; Fri, 15 Dec 2023 19:30:27 +0000 (UTC) Received: from mpattric.remote.csb (unknown [10.22.8.90]) by smtp.corp.redhat.com (Postfix) with ESMTP id 76042492BF0; Fri, 15 Dec 2023 19:30:26 +0000 (UTC) From: Mike Pattrick To: dev@openvswitch.org Date: Fri, 15 Dec 2023 14:30:21 -0500 Message-Id: <20231215193022.1354589-1-mkp@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.10 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Subject: [ovs-dev] [PATCH v10 1/2] userspace: Support vxlan and geneve tso. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev"

From: Dexia Li

For the userspace datapath, this patch provides VXLAN and Geneve tunnel TSO. Only userspace VXLAN and Geneve tunnels are supported; checksum offload is supported for both the tunnel outer and inner headers. If the netdev does not support the checksum offload features, there is a software fallback. If the netdev does not support VXLAN or Geneve TSO, packets are dropped. Front-end devices can also disable offload features with ethtool.
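
The software fallback described above can be pictured with a small sketch. This is not the patch's code: the `sketch_sw_fallback` function and the `SK_*` flags are hypothetical stand-ins for the packet offload flags and the netdev capability flags, and the function only reports which checksums would have to be completed in software for a given device.

```c
#include <stdint.h>

/* Hypothetical packet-side offload requests (not the OVS flag names). */
enum sketch_pkt_ol {
    SK_PKT_TUNNEL         = 1 << 0, /* VXLAN or Geneve encapsulated. */
    SK_PKT_OUTER_IP_CSUM  = 1 << 1, /* Outer IPv4 header checksum. */
    SK_PKT_OUTER_UDP_CSUM = 1 << 2, /* Outer UDP checksum. */
    SK_PKT_IP_CSUM        = 1 << 3, /* IPv4 checksum (inner if tunneled). */
    SK_PKT_L4_CSUM        = 1 << 4, /* L4 checksum (inner if tunneled). */
};

/* Hypothetical device-side capabilities (not the OVS flag names). */
enum sketch_dev_ol {
    SK_DEV_IP_CSUM        = 1 << 0,
    SK_DEV_L4_CSUM        = 1 << 1,
    SK_DEV_OUTER_IP_CSUM  = 1 << 2,
    SK_DEV_OUTER_UDP_CSUM = 1 << 3,
};

/* Returns the subset of checksum requests in 'pkt' that the device 'dev'
 * cannot offload and that therefore need the software fallback. */
static uint32_t
sketch_sw_fallback(uint32_t pkt, uint32_t dev)
{
    uint32_t sw = 0;

    /* Outer header requests only make sense for tunneled packets. */
    if (pkt & SK_PKT_TUNNEL) {
        if ((pkt & SK_PKT_OUTER_IP_CSUM) && !(dev & SK_DEV_OUTER_IP_CSUM)) {
            sw |= SK_PKT_OUTER_IP_CSUM;
        }
        if ((pkt & SK_PKT_OUTER_UDP_CSUM) && !(dev & SK_DEV_OUTER_UDP_CSUM)) {
            sw |= SK_PKT_OUTER_UDP_CSUM;
        }
    }

    if ((pkt & SK_PKT_IP_CSUM) && !(dev & SK_DEV_IP_CSUM)) {
        sw |= SK_PKT_IP_CSUM;
    }
    if ((pkt & SK_PKT_L4_CSUM) && !(dev & SK_DEV_L4_CSUM)) {
        sw |= SK_PKT_L4_CSUM;
    }

    return sw;
}
```

For example, a tunneled packet that requests outer IPv4 and inner L4 checksums on a device that only offers L4 checksum offload would, in this sketch, need only the outer IPv4 checksum computed in software.
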
Signed-off-by: Dexia Li Co-authored-by: Mike Pattrick Signed-off-by: Mike Pattrick --- v9: Rebased patch --- lib/dp-packet.c | 41 +++++++- lib/dp-packet.h | 216 ++++++++++++++++++++++++++++++++++++---- lib/dpif-netdev.c | 4 +- lib/flow.c | 2 +- lib/netdev-dpdk.c | 87 ++++++++++++++-- lib/netdev-dummy.c | 2 +- lib/netdev-native-tnl.c | 106 ++++++++++++++++++-- lib/netdev-provider.h | 4 + lib/netdev.c | 33 ++++-- lib/packets.c | 12 +-- lib/packets.h | 6 +- tests/dpif-netdev.at | 4 +- 12 files changed, 458 insertions(+), 59 deletions(-) diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 920402369..cb20608d7 100644 --- a/lib/dp-packet.c +++ b/lib/dp-packet.c @@ -546,16 +546,47 @@ dp_packet_compare_offsets(struct dp_packet *b1, struct dp_packet *b2, return true; } +void +dp_packet_tnl_outer_ol_send_prepare(struct dp_packet *p, + uint64_t flags) +{ + if (dp_packet_hwol_is_outer_ipv4_cksum(p)) { + if (!(flags & NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM)) { + dp_packet_ip_set_header_csum(p, false); + dp_packet_ol_set_ip_csum_good(p); + dp_packet_hwol_reset_outer_ipv4_csum(p); + } + } + + if (!dp_packet_hwol_is_outer_UDP_cksum(p)) { + return; + } + + if (!(flags & NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM)) { + packet_udp_complete_csum(p, false); + dp_packet_ol_set_l4_csum_good(p); + dp_packet_hwol_reset_outer_udp_csum(p); + } +} + /* Checks if the packet 'p' is compatible with netdev_ol_flags 'flags' * and if not, updates the packet with the software fall back. 
*/ void dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t flags) { + bool tnl_inner = false; + + if (dp_packet_hwol_is_tunnel_geneve(p) || + dp_packet_hwol_is_tunnel_vxlan(p)) { + dp_packet_tnl_outer_ol_send_prepare(p, flags); + tnl_inner = true; + } + if (dp_packet_hwol_tx_ip_csum(p)) { if (dp_packet_ip_checksum_good(p)) { dp_packet_hwol_reset_tx_ip_csum(p); } else if (!(flags & NETDEV_TX_OFFLOAD_IPV4_CKSUM)) { - dp_packet_ip_set_header_csum(p); + dp_packet_ip_set_header_csum(p, tnl_inner); dp_packet_ol_set_ip_csum_good(p); dp_packet_hwol_reset_tx_ip_csum(p); } @@ -565,24 +596,24 @@ dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t flags) return; } - if (dp_packet_l4_checksum_good(p)) { + if (dp_packet_l4_checksum_good(p) && (!tnl_inner)) { dp_packet_hwol_reset_tx_l4_csum(p); return; } if (dp_packet_hwol_l4_is_tcp(p) && !(flags & NETDEV_TX_OFFLOAD_TCP_CKSUM)) { - packet_tcp_complete_csum(p); + packet_tcp_complete_csum(p, tnl_inner); dp_packet_ol_set_l4_csum_good(p); dp_packet_hwol_reset_tx_l4_csum(p); } else if (dp_packet_hwol_l4_is_udp(p) && !(flags & NETDEV_TX_OFFLOAD_UDP_CKSUM)) { - packet_udp_complete_csum(p); + packet_udp_complete_csum(p, tnl_inner); dp_packet_ol_set_l4_csum_good(p); dp_packet_hwol_reset_tx_l4_csum(p); } else if (!(flags & NETDEV_TX_OFFLOAD_SCTP_CKSUM) && dp_packet_hwol_l4_is_sctp(p)) { - packet_sctp_complete_csum(p); + packet_sctp_complete_csum(p, tnl_inner); dp_packet_ol_set_l4_csum_good(p); dp_packet_hwol_reset_tx_l4_csum(p); } diff --git a/lib/dp-packet.h b/lib/dp-packet.h index 11aa00723..3b16b2a54 100644 --- a/lib/dp-packet.h +++ b/lib/dp-packet.h @@ -86,22 +86,47 @@ enum dp_packet_offload_mask { DEF_OL_FLAG(DP_PACKET_OL_TX_SCTP_CKSUM, RTE_MBUF_F_TX_SCTP_CKSUM, 0x800), /* Offload IP checksum. */ DEF_OL_FLAG(DP_PACKET_OL_TX_IP_CKSUM, RTE_MBUF_F_TX_IP_CKSUM, 0x1000), + /* Offload packet is tunnel GENEVE. */ + DEF_OL_FLAG(DP_PACKET_OL_TX_TUNNEL_GENEVE, + RTE_MBUF_F_TX_TUNNEL_GENEVE, 0x2000), + /* Offload packet is tunnel VXLAN. 
*/ + DEF_OL_FLAG(DP_PACKET_OL_TX_TUNNEL_VXLAN, + RTE_MBUF_F_TX_TUNNEL_VXLAN, 0x4000), + /* Offload tunnel packet, out is ipv4 */ + DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_IPV4, + RTE_MBUF_F_TX_OUTER_IPV4, 0x8000), + /* Offload TUNNEL out ipv4 checksum */ + DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_IP_CKSUM, + RTE_MBUF_F_TX_OUTER_IP_CKSUM, 0x10000), + /* Offload TUNNEL out udp checksum */ + DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_UDP_CKSUM, + RTE_MBUF_F_TX_OUTER_UDP_CKSUM, 0x20000), + /* Offload tunnel packet, out is ipv6 */ + DEF_OL_FLAG(DP_PACKET_OL_TX_OUTER_IPV6, + RTE_MBUF_F_TX_OUTER_IPV6, 0x40000), + /* Adding new field requires adding to DP_PACKET_OL_SUPPORTED_MASK. */ }; -#define DP_PACKET_OL_SUPPORTED_MASK (DP_PACKET_OL_RSS_HASH | \ - DP_PACKET_OL_FLOW_MARK | \ - DP_PACKET_OL_RX_L4_CKSUM_BAD | \ - DP_PACKET_OL_RX_IP_CKSUM_BAD | \ - DP_PACKET_OL_RX_L4_CKSUM_GOOD | \ - DP_PACKET_OL_RX_IP_CKSUM_GOOD | \ - DP_PACKET_OL_TX_TCP_SEG | \ - DP_PACKET_OL_TX_IPV4 | \ - DP_PACKET_OL_TX_IPV6 | \ - DP_PACKET_OL_TX_TCP_CKSUM | \ - DP_PACKET_OL_TX_UDP_CKSUM | \ - DP_PACKET_OL_TX_SCTP_CKSUM | \ - DP_PACKET_OL_TX_IP_CKSUM) +#define DP_PACKET_OL_SUPPORTED_MASK (DP_PACKET_OL_RSS_HASH | \ + DP_PACKET_OL_FLOW_MARK | \ + DP_PACKET_OL_RX_L4_CKSUM_BAD | \ + DP_PACKET_OL_RX_IP_CKSUM_BAD | \ + DP_PACKET_OL_RX_L4_CKSUM_GOOD | \ + DP_PACKET_OL_RX_IP_CKSUM_GOOD | \ + DP_PACKET_OL_TX_TCP_SEG | \ + DP_PACKET_OL_TX_IPV4 | \ + DP_PACKET_OL_TX_IPV6 | \ + DP_PACKET_OL_TX_TCP_CKSUM | \ + DP_PACKET_OL_TX_UDP_CKSUM | \ + DP_PACKET_OL_TX_SCTP_CKSUM | \ + DP_PACKET_OL_TX_IP_CKSUM | \ + DP_PACKET_OL_TX_TUNNEL_GENEVE | \ + DP_PACKET_OL_TX_TUNNEL_VXLAN | \ + DP_PACKET_OL_TX_OUTER_IPV4 | \ + DP_PACKET_OL_TX_OUTER_IP_CKSUM | \ + DP_PACKET_OL_TX_OUTER_UDP_CKSUM | \ + DP_PACKET_OL_TX_OUTER_IPV6) #define DP_PACKET_OL_TX_L4_MASK (DP_PACKET_OL_TX_TCP_CKSUM | \ DP_PACKET_OL_TX_UDP_CKSUM | \ @@ -139,6 +164,10 @@ struct dp_packet { * or UINT16_MAX. */ uint16_t l4_ofs; /* Transport-level header offset, or UINT16_MAX. 
*/ + uint16_t inner_l3_ofs; /* inner Network-level header offset, + * or UINT16_MAX. */ + uint16_t inner_l4_ofs; /* inner Transport-level header offset, + or UINT16_MAX. */ uint32_t cutlen; /* length in bytes to cut from the end. */ ovs_be32 packet_type; /* Packet type as defined in OpenFlow */ uint16_t csum_start; /* Position to start checksumming from. */ @@ -250,6 +279,9 @@ bool dp_packet_compare_offsets(struct dp_packet *good, struct dp_packet *test, struct ds *err_str); void dp_packet_ol_send_prepare(struct dp_packet *, uint64_t); +void dp_packet_tnl_outer_ol_send_prepare(struct dp_packet *p, + uint64_t flags); + /* Frees memory that 'b' points to, as well as 'b' itself. */ @@ -482,6 +514,22 @@ dp_packet_l4_size(const struct dp_packet *b) : 0; } +static inline void * +dp_packet_inner_l3(const struct dp_packet *b) +{ + return b->inner_l3_ofs != UINT16_MAX + ? (char *) dp_packet_data(b) + b->inner_l3_ofs + : NULL; +} + +static inline void * +dp_packet_inner_l4(const struct dp_packet *b) +{ + return b->inner_l4_ofs != UINT16_MAX + ? 
(char *) dp_packet_data(b) + b->inner_l4_ofs + : NULL; +} + static inline const void * dp_packet_get_tcp_payload(const struct dp_packet *b) { @@ -539,6 +587,25 @@ dp_packet_get_nd_payload(const struct dp_packet *b) } #ifdef DPDK_NETDEV +static inline void +dp_packet_set_l2_len(struct dp_packet *b, size_t l2_len) +{ + b->mbuf.l2_len = l2_len; +} + +static inline void +dp_packet_set_l3_len(struct dp_packet *b, size_t l3_len) +{ + b->mbuf.l3_len = l3_len; +} + +static inline void +dp_packet_set_l4_len(struct dp_packet *b, size_t l4_len) +{ + b->mbuf.l4_len = l4_len; +} + + static inline uint64_t * dp_packet_ol_flags_ptr(const struct dp_packet *b) { @@ -558,6 +625,24 @@ dp_packet_flow_mark_ptr(const struct dp_packet *b) } #else +static inline void +dp_packet_set_l2_len(struct dp_packet *b OVS_UNUSED, size_t l2_len OVS_UNUSED) +{ + /* There are no implementation */ +} + +static inline void +dp_packet_set_l3_len(struct dp_packet *b OVS_UNUSED, size_t l3_len OVS_UNUSED) +{ + /* There are no implementation */ +} + +static inline void +dp_packet_set_l4_len(struct dp_packet *b OVS_UNUSED, size_t l4_len OVS_UNUSED) +{ + /* There are no implementation */ +} + static inline uint32_t * dp_packet_ol_flags_ptr(const struct dp_packet *b) { @@ -619,9 +704,10 @@ dp_packet_set_size(struct dp_packet *b, uint32_t v) * (and thus 'v') will always be <= UINT16_MAX; this means that there is no * loss of accuracy in assigning 'v' to 'data_len'. */ - b->mbuf.data_len = (uint16_t)v; /* Current seg length. */ - b->mbuf.pkt_len = v; /* Total length of all segments linked to - * this segment. */ + /* Current seg length. */ + b->mbuf.data_len += (uint16_t)(v - b->mbuf.pkt_len); + /* Total length of all segments linked to this segment. */ + b->mbuf.pkt_len = v; } static inline uint16_t @@ -1056,6 +1142,43 @@ dp_packet_hwol_l4_is_sctp(struct dp_packet *b) DP_PACKET_OL_TX_SCTP_CKSUM; } +/* Returns 'true' if packet 'b' is marked for tunnel GENEVE + * checksum offloading. 
*/ +static inline bool +dp_packet_hwol_is_tunnel_geneve(struct dp_packet *b) +{ + return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_GENEVE); +} + +/* Returns 'true' if packet 'b' is marked for tunnel VXLAN + * checksum offloading. */ +static inline bool +dp_packet_hwol_is_tunnel_vxlan(struct dp_packet *b) +{ + return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_TUNNEL_VXLAN); +} + +/* Returns 'true' if packet 'b' is marked for out ipv4. */ +static inline bool +dp_packet_hwol_is_outer_ipv4(struct dp_packet *b) +{ + return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_OUTER_IPV4); +} + +/* Returns 'true' if packet 'b' is marked for out ipv4 csum offload. */ +static inline bool +dp_packet_hwol_is_outer_ipv4_cksum(struct dp_packet *b) +{ + return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_OUTER_IP_CKSUM); +} + +/* Returns 'true' if packet 'b' is marked for out udp csum offload. */ +static inline bool +dp_packet_hwol_is_outer_UDP_cksum(struct dp_packet *b) +{ + return !!(*dp_packet_ol_flags_ptr(b) & DP_PACKET_OL_TX_OUTER_UDP_CKSUM); +} + static inline void dp_packet_hwol_reset_tx_l4_csum(struct dp_packet *p) { @@ -1078,6 +1201,14 @@ dp_packet_hwol_set_tx_ipv6(struct dp_packet *a) *dp_packet_ol_flags_ptr(a) |= DP_PACKET_OL_TX_IPV6; } +/* Mark packet 'a' as IPv6. */ +static inline void +dp_packet_hwol_set_tx_outer_ipv6(struct dp_packet *a) +{ + *dp_packet_ol_flags_ptr(a) &= ~DP_PACKET_OL_TX_OUTER_IPV4; + *dp_packet_ol_flags_ptr(a) |= DP_PACKET_OL_TX_OUTER_IPV6; +} + /* Returns 'true' if packet 'p' is marked for IPv4 checksum offloading. */ static inline bool dp_packet_hwol_tx_ip_csum(const struct dp_packet *p) @@ -1131,6 +1262,55 @@ dp_packet_hwol_set_tcp_seg(struct dp_packet *b) *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TCP_SEG; } +/* Mark packet 'b' for tunnel geneve offloading. It implies that + * the packet 'b' is marked for tunnel geneve offloading. 
*/ +static inline void +dp_packet_hwol_set_tunnel_geneve(struct dp_packet *b) +{ + *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_GENEVE; +} + +/* Mark packet 'b' for tunnel vxlan offloading. It implies that + * the packet 'b' is marked for tunnel vxlan offloading. */ +static inline void +dp_packet_hwol_set_tunnel_vxlan(struct dp_packet *b) +{ + *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_TUNNEL_VXLAN; +} + +/* Mark packet 'b' for out ipv4 packet. */ +static inline void +dp_packet_hwol_set_tx_outer_ipv4(struct dp_packet *b) +{ + *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_OUTER_IPV4; +} + +/* Mark packet 'b' for out ipv4 csum offloading. */ +static inline void +dp_packet_hwol_set_tx_outer_ipv4_csum(struct dp_packet *b) +{ + *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_OUTER_IP_CKSUM; +} + +static inline void +dp_packet_hwol_reset_outer_ipv4_csum(struct dp_packet *p) +{ + *dp_packet_ol_flags_ptr(p) &= ~DP_PACKET_OL_TX_OUTER_IP_CKSUM; +} + +static inline void +dp_packet_hwol_reset_outer_udp_csum(struct dp_packet *p) +{ + *dp_packet_ol_flags_ptr(p) &= ~DP_PACKET_OL_TX_OUTER_UDP_CKSUM; +} + +/* Mark packet 'b' for out udp csum offloading. */ +static inline void +dp_packet_hwol_set_outer_udp_csum(struct dp_packet *b) +{ + *dp_packet_ol_flags_ptr(b) |= DP_PACKET_OL_TX_OUTER_UDP_CKSUM; +} + /* Resets TCP Segmentation flag in packet 'p'. */ static inline void dp_packet_hwol_reset_tcp_seg(struct dp_packet *p) @@ -1172,9 +1352,9 @@ dp_packet_ip_checksum_bad(const struct dp_packet *p) /* Calculate and set the IPv4 header checksum in packet 'p'. */ static inline void -dp_packet_ip_set_header_csum(struct dp_packet *p) +dp_packet_ip_set_header_csum(struct dp_packet *p, bool inner) { - struct ip_header *ip = dp_packet_l3(p); + struct ip_header *ip = (inner) ? 
dp_packet_inner_l3(p) : dp_packet_l3(p); ovs_assert(ip); ip->ip_csum = 0; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 9a59a1b03..303d4c2e1 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -7980,7 +7980,9 @@ dp_netdev_upcall(struct dp_netdev_pmd_thread *pmd, struct dp_packet *packet_, ds_destroy(&ds); } - dp_packet_ol_send_prepare(packet_, 0); + if (type != DPIF_UC_MISS) { + dp_packet_ol_send_prepare(packet_, 0); + } return dp->upcall_cb(packet_, flow, ufid, pmd->core_id, type, userdata, actions, wc, put_actions, dp->upcall_aux); diff --git a/lib/flow.c b/lib/flow.c index b8f99f66b..82d93570a 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -3278,7 +3278,7 @@ packet_expand(struct dp_packet *p, const struct flow *flow, size_t size) if (dp_packet_hwol_tx_ip_csum(p)) { dp_packet_ol_reset_ip_csum_good(p); } else { - dp_packet_ip_set_header_csum(p); + dp_packet_ip_set_header_csum(p, false); dp_packet_ol_set_ip_csum_good(p); } pseudo_hdr_csum = packet_csum_pseudoheader(ip); diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 1ff25c246..e4d95d0a3 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -416,6 +416,10 @@ enum dpdk_hw_ol_features { NETDEV_TX_UDP_CKSUM_OFFLOAD = 1 << 5, NETDEV_TX_SCTP_CKSUM_OFFLOAD = 1 << 6, NETDEV_TX_TSO_OFFLOAD = 1 << 7, + NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD = 1 << 8, + NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD = 1 << 9, + NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD = 1 << 10, + NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD = 1 << 11, }; enum dpdk_rx_steer_flags { @@ -1075,6 +1079,14 @@ netdev_dpdk_update_netdev_flags(struct netdev_dpdk *dev) NETDEV_TX_OFFLOAD_SCTP_CKSUM); netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_TSO_OFFLOAD, NETDEV_TX_OFFLOAD_TCP_TSO); + netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD, + NETDEV_TX_VXLAN_TNL_TSO); + netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD, + NETDEV_TX_GENEVE_TNL_TSO); + netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD, + 
NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM); + netdev_dpdk_update_netdev_flag(dev, NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD, + NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM); } static int @@ -1129,6 +1141,23 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq) conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_TCP_TSO; } + if (dev->hw_ol_features & NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD) { + conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO; + } + + if (dev->hw_ol_features & NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD) { + conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO; + } + + if (dev->hw_ol_features & NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD) { + conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_OUTER_IPV4_CKSUM; + } + + if (dev->hw_ol_features & NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD) { + conf.txmode.offloads |= RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM; + } + + /* Limit configured rss hash functions to only those supported * by the eth device. */ conf.rx_adv_conf.rss_conf.rss_hf &= info.flow_type_rss_offloads; @@ -1346,6 +1375,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) dev->hw_ol_features &= ~NETDEV_TX_SCTP_CKSUM_OFFLOAD; } + if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_OUTER_IPV4_CKSUM) { + dev->hw_ol_features |= NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD; + } else { + dev->hw_ol_features &= ~NETDEV_TX_OUTER_IP_CKSUM_OFFLOAD; + } + + if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM) { + dev->hw_ol_features |= NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD; + } else { + dev->hw_ol_features &= ~NETDEV_TX_OUTER_UDP_CKSUM_OFFLOAD; + } + dev->hw_ol_features &= ~NETDEV_TX_TSO_OFFLOAD; if (userspace_tso_enabled()) { if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_TCP_TSO) { @@ -1354,6 +1395,20 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev) VLOG_WARN("%s: Tx TSO offload is not supported.", netdev_get_name(&dev->up)); } + + if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO) { + dev->hw_ol_features |= NETDEV_TX_VXLAN_TNL_TSO_OFFLOAD; + } else { + VLOG_WARN("%s: Tx Vxlan tunnel TSO offload is not 
supported.", + netdev_get_name(&dev->up)); + } + + if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO) { + dev->hw_ol_features |= NETDEV_TX_GENEVE_TNL_TSO_OFFLOAD; + } else { + VLOG_WARN("%s: Tx Geneve tunnel TSO offload is not supported.", + netdev_get_name(&dev->up)); + } } n_rxq = MIN(info.max_rx_queues, dev->up.n_rxq); @@ -2479,11 +2534,23 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf) return true; } - mbuf->l2_len = (char *) dp_packet_l3(pkt) - (char *) dp_packet_eth(pkt); - mbuf->l3_len = (char *) dp_packet_l4(pkt) - (char *) dp_packet_l3(pkt); - mbuf->l4_len = 0; - mbuf->outer_l2_len = 0; - mbuf->outer_l3_len = 0; + /* If packet is vxlan or geneve tunnel packet, calculate outer + * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated + * before. */ + if (mbuf->ol_flags & + (RTE_MBUF_F_TX_TUNNEL_GENEVE | RTE_MBUF_F_TX_TUNNEL_VXLAN)) { + mbuf->outer_l2_len = (char *) dp_packet_l3(pkt) - + (char *) dp_packet_eth(pkt); + mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) - + (char *) dp_packet_l3(pkt); + } else { + mbuf->l2_len = (char *) dp_packet_l3(pkt) - + (char *) dp_packet_eth(pkt); + mbuf->l3_len = (char *) dp_packet_l4(pkt) - + (char *) dp_packet_l3(pkt); + mbuf->outer_l2_len = 0; + mbuf->outer_l3_len = 0; + } th = dp_packet_l4(pkt); if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) { @@ -2501,8 +2568,14 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf) return false; } - mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4; - mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len; + if (mbuf->ol_flags & (RTE_MBUF_F_TX_TUNNEL_GENEVE | + RTE_MBUF_F_TX_TUNNEL_VXLAN)) { + mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len - + mbuf->l4_len - mbuf->outer_l3_len; + } else { + mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4; + mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len; + } if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) { int hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len; diff --git 
a/lib/netdev-dummy.c b/lib/netdev-dummy.c index 8c6e6d448..21db9edb5 100644 --- a/lib/netdev-dummy.c +++ b/lib/netdev-dummy.c @@ -1202,7 +1202,7 @@ netdev_dummy_send(struct netdev *netdev, int qid, if (dp_packet_hwol_tx_ip_csum(packet) && !dp_packet_ip_checksum_good(packet)) { - dp_packet_ip_set_header_csum(packet); + dp_packet_ip_set_header_csum(packet, false); dp_packet_ol_set_ip_csum_good(packet); } diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index a0682c70f..78198c10d 100644 --- a/lib/netdev-native-tnl.c +++ b/lib/netdev-native-tnl.c @@ -173,15 +173,29 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, const void *header, ip6->ip6_plen = htons(*ip_tot_size); packet_set_ipv6_flow_label(&ip6->ip6_flow, ipv6_label); packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size; - dp_packet_hwol_set_tx_ipv6(packet); + + if (dp_packet_hwol_is_tunnel_geneve(packet) || + dp_packet_hwol_is_tunnel_vxlan(packet)) { + dp_packet_hwol_set_tx_outer_ipv6(packet); + } else { + dp_packet_hwol_set_tx_ipv6(packet); + } + dp_packet_ol_reset_ip_csum_good(packet); return ip6 + 1; } else { ip = netdev_tnl_ip_hdr(eth); ip->ip_tot_len = htons(*ip_tot_size); /* Postpone checksum to when the packet is pushed to the port. */ - dp_packet_hwol_set_tx_ipv4(packet); - dp_packet_hwol_set_tx_ip_csum(packet); + if (dp_packet_hwol_is_tunnel_geneve(packet) || + dp_packet_hwol_is_tunnel_vxlan(packet)) { + dp_packet_hwol_set_tx_outer_ipv4(packet); + dp_packet_hwol_set_tx_outer_ipv4_csum(packet); + } else { + dp_packet_hwol_set_tx_ipv4(packet); + dp_packet_hwol_set_tx_ip_csum(packet); + } + dp_packet_ol_reset_ip_csum_good(packet); *ip_tot_size -= IP_HEADER_LEN; packet->l4_ofs = dp_packet_size(packet) - *ip_tot_size; @@ -226,14 +240,84 @@ udp_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl, return udp + 1; } +/* Calculate inner l2 l3 l4 len as tunnel outer header is not + * encapsulated now. 
*/ +static void +dp_packet_tnl_ol_process(const struct netdev *netdev, + struct dp_packet *packet, + const struct ovs_action_push_tnl *data) +{ + struct udp_header *udp = NULL; + uint8_t opt_len = 0; + struct eth_header *eth = NULL; + struct ip_header *ip = NULL; + struct genevehdr *gnh = NULL; + + /* l2 l3 l4 len refer to inner len, tunnel outer + * header is not encapsulated here. */ + if (dp_packet_hwol_l4_mask(packet)) { + ip = dp_packet_l3(packet); + + if (ip->ip_proto == IPPROTO_TCP) { + struct tcp_header *th = dp_packet_l4(packet); + dp_packet_set_l4_len(packet, TCP_OFFSET(th->tcp_ctl) * 4); + } else if (ip->ip_proto == IPPROTO_UDP) { + dp_packet_set_l4_len(packet, UDP_HEADER_LEN); + } else if (ip->ip_proto == IPPROTO_SCTP) { + dp_packet_set_l4_len(packet, SCTP_HEADER_LEN); + } + + dp_packet_set_l3_len(packet, (char *) dp_packet_l4(packet) - + (char *) dp_packet_l3(packet)); + + if (!strcmp(netdev_get_type(netdev), "geneve") || + !strcmp(netdev_get_type(netdev), "vxlan")) { + + if (IP_VER(ip->ip_ihl_ver) == 4) { + dp_packet_hwol_set_tx_ipv4(packet); + dp_packet_hwol_tx_ip_csum(packet); + } else if (IP_VER(ip->ip_ihl_ver) == 6) { + dp_packet_hwol_set_tx_ipv6(packet); + } + } + + /* Attention please, tunnel inner l2 len is consist of udp header + * len and tunnel header len and inner l2 len. 
*/ + if (!strcmp(netdev_get_type(netdev), "geneve")) { + eth = (struct eth_header *)(data->header); + ip = (struct ip_header *)(eth + 1); + udp = (struct udp_header *)(ip + 1); + gnh = (struct genevehdr *)(udp + 1); + opt_len = gnh->opt_len * 4; + dp_packet_hwol_set_tunnel_geneve(packet); + dp_packet_set_l2_len(packet, (char *) dp_packet_l3(packet) - + (char *) dp_packet_eth(packet) + + GENEVE_BASE_HLEN + opt_len); + + packet->inner_l3_ofs = packet->l3_ofs + GENEVE_BASE_HLEN + opt_len; + packet->inner_l4_ofs = packet->l4_ofs + GENEVE_BASE_HLEN + opt_len; + + } else if (!strcmp(netdev_get_type(netdev), "vxlan")) { + dp_packet_hwol_set_tunnel_vxlan(packet); + dp_packet_set_l2_len(packet, (char *) dp_packet_l3(packet) - + (char *) dp_packet_eth(packet) + + VXLAN_HLEN); + + packet->inner_l3_ofs = packet->l3_ofs + VXLAN_HLEN; + packet->inner_l4_ofs = packet->l4_ofs + VXLAN_HLEN; + } + } +} + void -netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED, +netdev_tnl_push_udp_header(const struct netdev *netdev, struct dp_packet *packet, const struct ovs_action_push_tnl *data) { struct udp_header *udp; int ip_tot_size; + dp_packet_tnl_ol_process(netdev, packet, data); udp = netdev_tnl_push_ip_header(packet, data->header, data->header_len, &ip_tot_size, 0); @@ -241,13 +325,21 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED, udp->udp_src = netdev_tnl_get_src_port(packet); udp->udp_len = htons(ip_tot_size); - /* Postpone checksum to the egress netdev. 
*/ - dp_packet_hwol_set_csum_udp(packet); if (udp->udp_csum) { dp_packet_ol_reset_l4_csum_good(packet); + if (dp_packet_hwol_is_tunnel_geneve(packet) || + dp_packet_hwol_is_tunnel_vxlan(packet)) { + dp_packet_hwol_set_outer_udp_csum(packet); + } else { + dp_packet_hwol_set_csum_udp(packet); + } } else { - dp_packet_ol_set_l4_csum_good(packet); + dp_packet_ol_set_l4_csum_good(packet); } + + packet->inner_l3_ofs += packet->l4_ofs; + packet->inner_l4_ofs += packet->l4_ofs; + } static void * diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h index a7393c7ce..22840a058 100644 --- a/lib/netdev-provider.h +++ b/lib/netdev-provider.h @@ -43,6 +43,10 @@ enum netdev_ol_flags { NETDEV_TX_OFFLOAD_UDP_CKSUM = 1 << 2, NETDEV_TX_OFFLOAD_SCTP_CKSUM = 1 << 3, NETDEV_TX_OFFLOAD_TCP_TSO = 1 << 4, + NETDEV_TX_VXLAN_TNL_TSO = 1 << 5, + NETDEV_TX_GENEVE_TNL_TSO = 1 << 6, + NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM = 1 << 7, + NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM = 1 << 8, }; /* A network device (e.g. an Ethernet device). 
diff --git a/lib/netdev.c b/lib/netdev.c index 3ed8049f7..db0610304 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -912,6 +912,17 @@ netdev_send(struct netdev *netdev, int qid, struct dp_packet_batch *batch, !(netdev_flags & NETDEV_TX_OFFLOAD_TCP_TSO)) { DP_PACKET_BATCH_FOR_EACH (i, packet, batch) { if (dp_packet_hwol_is_tso(packet)) { + if (dp_packet_hwol_is_tunnel_vxlan(packet) + && !(netdev_flags & NETDEV_TX_VXLAN_TNL_TSO)) { + VLOG_ERR("No VXLAN TSO support"); + return false; + } + + if (dp_packet_hwol_is_tunnel_geneve(packet) + && !(netdev_flags & NETDEV_TX_GENEVE_TNL_TSO)) { + VLOG_ERR("No GENEVE TSO support"); + return false; + } return netdev_send_tso(netdev, qid, batch, concurrent_txq); } } @@ -990,17 +1001,19 @@ netdev_push_header(const struct netdev *netdev, size_t i, size = dp_packet_batch_size(batch); DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, batch) { - if (OVS_UNLIKELY(dp_packet_hwol_is_tso(packet))) { + if (OVS_UNLIKELY(strcmp(netdev_get_type(netdev), "vxlan") && + strcmp(netdev_get_type(netdev), "geneve") && + dp_packet_hwol_is_tso(packet))) { COVERAGE_INC(netdev_push_header_drops); dp_packet_delete(packet); - VLOG_WARN_RL(&rl, "%s: Tunneling packets with TSO is " - "not supported: packet dropped", - netdev_get_name(netdev)); + VLOG_WARN_RL(&rl, "%s: Tunneling packets with tso HW offload " + "flags is not supported: packet dropped", + netdev_get_name(netdev)); } else { - /* The packet is going to be encapsulated and there is - * no support yet for inner network header csum offloading. 
*/ - dp_packet_ol_send_prepare(packet, 0); - + if (strcmp(netdev_get_type(netdev), "vxlan") && + strcmp(netdev_get_type(netdev), "geneve")) { + dp_packet_ol_send_prepare(packet, 0); + } netdev->netdev_class->push_header(netdev, packet, data); pkt_metadata_init(&packet->md, data->out_port); @@ -1446,6 +1459,10 @@ netdev_get_status(const struct netdev *netdev, struct smap *smap) OL_ADD_STAT("udp_csum", NETDEV_TX_OFFLOAD_UDP_CKSUM); OL_ADD_STAT("sctp_csum", NETDEV_TX_OFFLOAD_SCTP_CKSUM); OL_ADD_STAT("tcp_seg", NETDEV_TX_OFFLOAD_TCP_TSO); + OL_ADD_STAT("vxlan_tso", NETDEV_TX_VXLAN_TNL_TSO); + OL_ADD_STAT("geneve_tso", NETDEV_TX_GENEVE_TNL_TSO); + OL_ADD_STAT("out_ip_csum", NETDEV_TX_OFFLOAD_OUTER_IP_CKSUM); + OL_ADD_STAT("out_udp_csum", NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM); #undef OL_ADD_STAT err = 0; diff --git a/lib/packets.c b/lib/packets.c index dab823ba2..d9e41346e 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -1997,9 +1997,9 @@ IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6) /* Set TCP checksum field in packet 'p' with complete checksum. * The packet must have the L3 and L4 offsets. */ void -packet_tcp_complete_csum(struct dp_packet *p) +packet_tcp_complete_csum(struct dp_packet *p, bool inner) { - struct tcp_header *tcp = dp_packet_l4(p); + struct tcp_header *tcp = (inner) ? dp_packet_inner_l4(p) : dp_packet_l4(p); tcp->tcp_csum = 0; if (dp_packet_hwol_is_ipv4(p)) { @@ -2020,9 +2020,9 @@ packet_tcp_complete_csum(struct dp_packet *p) /* Set UDP checksum field in packet 'p' with complete checksum. * The packet must have the L3 and L4 offsets. */ void -packet_udp_complete_csum(struct dp_packet *p) +packet_udp_complete_csum(struct dp_packet *p, bool inner) { - struct udp_header *udp = dp_packet_l4(p); + struct udp_header *udp = (inner) ? dp_packet_inner_l4(p) : dp_packet_l4(p); /* Skip csum calculation if the udp_csum is zero. 
*/ if (!udp->udp_csum) { @@ -2052,9 +2052,9 @@ packet_udp_complete_csum(struct dp_packet *p) /* Set SCTP checksum field in packet 'p' with complete checksum. * The packet must have the L3 and L4 offsets. */ void -packet_sctp_complete_csum(struct dp_packet *p) +packet_sctp_complete_csum(struct dp_packet *p, bool inner) { - struct sctp_header *sh = dp_packet_l4(p); + struct sctp_header *sh = (inner) ? dp_packet_inner_l4(p) : dp_packet_l4(p); uint16_t tp_len = dp_packet_l4_size(p); ovs_be32 csum; diff --git a/lib/packets.h b/lib/packets.h index 12245b764..8b6994809 100644 --- a/lib/packets.h +++ b/lib/packets.h @@ -1682,9 +1682,9 @@ uint32_t packet_csum_pseudoheader(const struct ip_header *); bool packet_rh_present(struct dp_packet *packet, uint8_t *nexthdr, bool *first_frag); void IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6); -void packet_tcp_complete_csum(struct dp_packet *); -void packet_udp_complete_csum(struct dp_packet *); -void packet_sctp_complete_csum(struct dp_packet *); +void packet_tcp_complete_csum(struct dp_packet *, bool is_inner); +void packet_udp_complete_csum(struct dp_packet *, bool is_inner); +void packet_sctp_complete_csum(struct dp_packet *, bool is_inner); #define DNS_HEADER_LEN 12 struct dns_header { diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at index d0359b5ea..59bb2de95 100644 --- a/tests/dpif-netdev.at +++ b/tests/dpif-netdev.at @@ -658,11 +658,11 @@ OVS_VSWITCHD_START( other-config:datapath-id=1234 fail-mode=secure]) AT_CHECK([ovs-vsctl get interface p1 status | sed -n 's/^{\(.*\).*}$/\1/p'], [0], [dnl -tx_ip_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="false", tx_tcp_seg_offload="false", tx_udp_csum_offload="false" +tx_geneve_tso_offload="false", tx_ip_csum_offload="false", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="false", tx_tcp_seg_offload="false", tx_udp_csum_offload="false", tx_vxlan_tso_offload="false" ], []) 
AT_CHECK([ovs-vsctl get interface br0 status | sed -n 's/^{\(.*\).*}$/\1/p'], [0], [dnl -tx_ip_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="false", tx_tcp_seg_offload="false", tx_udp_csum_offload="false" +tx_geneve_tso_offload="false", tx_ip_csum_offload="false", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="false", tx_tcp_seg_offload="false", tx_udp_csum_offload="false", tx_vxlan_tso_offload="false" ], []) OVS_VSWITCHD_STOP

From patchwork Fri Dec 15 19:30:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Pattrick
X-Patchwork-Id: 1876750
X-Patchwork-Delegate: horms@verge.net.au
From: Mike Pattrick
To: dev@openvswitch.org
Date: Fri, 15 Dec 2023 14:30:22 -0500
Message-Id: <20231215193022.1354589-2-mkp@redhat.com>
In-Reply-To: <20231215193022.1354589-1-mkp@redhat.com>
References: <20231215193022.1354589-1-mkp@redhat.com>
Subject: [ovs-dev] [PATCH v10 2/2] userspace: Enable tunnel tests with tso.
X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" This patch enables most of the tunnel tests in the testsuite, and adds a large TCP transfer to a vxlan and geneve test to verify TSO functionality. Some additional changes were required to accommodate these changes with netdev-linux interfaces. The test for vlan over vxlan is purposely not enabled as the traffic produced by this test gives incorrect values in the vnet header. Signed-off-by: Mike Pattrick --- v10: - Software TCP checksums now support encapsulated TSO case - Redundant inner offset code was removed --- lib/dp-packet.h | 49 +++++++++++++++++++--------- lib/dpif-netdev-extract-avx512.c | 8 ++--- lib/flow.c | 12 ++----- lib/netdev-linux.c | 45 +++++++++++++++++++------ lib/netdev-native-tnl.c | 27 ++++++++------- lib/packets.c | 56 +++++++++++++++++++++++++------- tests/system-traffic.at | 39 ++++++++++------------ 7 files changed, 151 insertions(+), 85 deletions(-) diff --git a/lib/dp-packet.h b/lib/dp-packet.h index 3b16b2a54..cf341f09c 100644 --- a/lib/dp-packet.h +++ b/lib/dp-packet.h @@ -433,6 +433,8 @@ dp_packet_reset_offsets(struct dp_packet *b) b->l2_5_ofs = UINT16_MAX; b->l3_ofs = UINT16_MAX; b->l4_ofs = UINT16_MAX; + b->inner_l3_ofs = UINT16_MAX; + b->inner_l4_ofs = UINT16_MAX; } static inline uint16_t @@ -530,6 +532,16 @@ dp_packet_inner_l4(const struct dp_packet *b) : NULL; } +static inline size_t +dp_packet_inner_l4_size(const struct dp_packet *b) +{ + return OVS_LIKELY(b->l4_ofs != UINT16_MAX) + ? 
(const char *) dp_packet_tail(b) + - (const char *) dp_packet_inner_l4(b) + - dp_packet_l2_pad_size(b) + : 0; +} + static inline const void * dp_packet_get_tcp_payload(const struct dp_packet *b) { @@ -865,14 +877,6 @@ dp_packet_set_data(struct dp_packet *b, void *data) } } -static inline void -dp_packet_reset_packet(struct dp_packet *b, int off) -{ - dp_packet_set_size(b, dp_packet_size(b) - off); - dp_packet_set_data(b, ((unsigned char *) dp_packet_data(b) + off)); - dp_packet_reset_offsets(b); -} - enum { NETDEV_MAX_BURST = 32 }; /* Maximum number packets in a batch. */ struct dp_packet_batch { @@ -1411,21 +1415,36 @@ dp_packet_ol_reset_l4_csum_good(struct dp_packet *p) } } -/* Marks packet 'p' with good integrity if the 'start' and 'offset' - * matches with the 'csum_start' and 'csum_offset' in packet 'p'. - * The 'start' is the offset from the begin of the packet headers. - * The 'offset' is the offset from start to place the checksum. +/* Marks packet 'p' with good integrity if checksum offload locations + * were provided. In the case of encapsulated packets, these values may + * be deeper into the packet than OVS might expect. But the packet + * should still be considered to have good integrity. + * The 'csum_start' is the offset from the begin of the packet headers. + * The 'csum_offset' is the offset from start to place the checksum. * The csum_start and csum_offset fields are set from the virtio_net_hdr * struct that may be provided by a netdev on packet ingress. 
*/ static inline void -dp_packet_ol_l4_csum_check_partial(struct dp_packet *p, uint16_t start, - uint16_t offset) +dp_packet_ol_l4_csum_check_partial(struct dp_packet *p) { - if (p->csum_start == start && p->csum_offset == offset) { + if (p->csum_start && p->csum_offset) { dp_packet_ol_set_l4_csum_partial(p); } } +static inline void +dp_packet_reset_packet(struct dp_packet *b, int off) +{ + dp_packet_set_size(b, dp_packet_size(b) - off); + dp_packet_set_data(b, ((unsigned char *) dp_packet_data(b) + off)); + dp_packet_reset_offsets(b); + + if (b->csum_start >= off && b->csum_offset) { + /* Adjust values for decapsulation. */ + b->csum_start -= off; + dp_packet_ol_set_l4_csum_partial(b); + } +} + static inline uint32_t ALWAYS_INLINE dp_packet_calc_hash_ipv4(const uint8_t *pkt, const uint16_t l3_ofs, uint32_t hash) diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c index 1bc7e8d0e..57ca4c71b 100644 --- a/lib/dpif-netdev-extract-avx512.c +++ b/lib/dpif-netdev-extract-avx512.c @@ -776,9 +776,7 @@ mfex_ipv6_set_hwol(struct dp_packet *pkt) static void mfex_tcp_set_hwol(struct dp_packet *pkt) { - dp_packet_ol_l4_csum_check_partial(pkt, pkt->l4_ofs, - offsetof(struct tcp_header, - tcp_csum)); + dp_packet_ol_l4_csum_check_partial(pkt); if (dp_packet_l4_checksum_good(pkt) || dp_packet_ol_l4_csum_partial(pkt)) { dp_packet_hwol_set_csum_tcp(pkt); @@ -788,9 +786,7 @@ mfex_tcp_set_hwol(struct dp_packet *pkt) static void mfex_udp_set_hwol(struct dp_packet *pkt) { - dp_packet_ol_l4_csum_check_partial(pkt, pkt->l4_ofs, - offsetof(struct udp_header, - udp_csum)); + dp_packet_ol_l4_csum_check_partial(pkt); if (dp_packet_l4_checksum_good(pkt) || dp_packet_ol_l4_csum_partial(pkt)) { dp_packet_hwol_set_csum_udp(pkt); diff --git a/lib/flow.c b/lib/flow.c index 82d93570a..8e3402388 100644 --- a/lib/flow.c +++ b/lib/flow.c @@ -1054,9 +1054,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) } else if (dl_type == htons(ETH_TYPE_IPV6)) { 
dp_packet_update_rss_hash_ipv6_tcp_udp(packet); } - dp_packet_ol_l4_csum_check_partial(packet, packet->l4_ofs, - offsetof(struct tcp_header, - tcp_csum)); + dp_packet_ol_l4_csum_check_partial(packet); if (dp_packet_l4_checksum_good(packet) || dp_packet_ol_l4_csum_partial(packet)) { dp_packet_hwol_set_csum_tcp(packet); @@ -1076,9 +1074,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) } else if (dl_type == htons(ETH_TYPE_IPV6)) { dp_packet_update_rss_hash_ipv6_tcp_udp(packet); } - dp_packet_ol_l4_csum_check_partial(packet, packet->l4_ofs, - offsetof(struct udp_header, - udp_csum)); + dp_packet_ol_l4_csum_check_partial(packet); if (dp_packet_l4_checksum_good(packet) || dp_packet_ol_l4_csum_partial(packet)) { dp_packet_hwol_set_csum_udp(packet); @@ -1092,9 +1088,7 @@ miniflow_extract(struct dp_packet *packet, struct miniflow *dst) miniflow_push_be16(mf, tp_dst, sctp->sctp_dst); miniflow_push_be16(mf, ct_tp_src, ct_tp_src); miniflow_push_be16(mf, ct_tp_dst, ct_tp_dst); - dp_packet_ol_l4_csum_check_partial(packet, packet->l4_ofs, - offsetof(struct sctp_header, - sctp_csum)); + dp_packet_ol_l4_csum_check_partial(packet); if (dp_packet_l4_checksum_good(packet) || dp_packet_ol_l4_csum_partial(packet)) { dp_packet_hwol_set_csum_sctp(packet); diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index e79a43260..9f519cd59 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -7145,8 +7145,12 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu) if (dp_packet_hwol_is_tso(b)) { uint16_t tso_segsz = dp_packet_get_tso_segsz(b); struct tcp_header *tcp = dp_packet_l4(b); + struct tcp_header *inner_tcp = dp_packet_inner_l4(b); + if (inner_tcp) { + tcp = inner_tcp; + } int tcp_hdr_len = TCP_OFFSET(tcp->tcp_ctl) * 4; - int hdr_len = ((char *) dp_packet_l4(b) - (char *) dp_packet_eth(b)) + int hdr_len = ((char *) tcp - (char *) dp_packet_eth(b)) + tcp_hdr_len; int max_packet_len = mtu + ETH_HEADER_LEN + VLAN_HEADER_LEN; @@ -7164,7 +7168,6 @@ 
netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu) } else if (dp_packet_hwol_tx_ipv6(b)) { vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV6; } - } else { vnet->hdr_len = 0; vnet->gso_size = 0; @@ -7175,6 +7178,11 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu) /* The packet has good L4 checksum. No need to validate again. */ vnet->csum_start = vnet->csum_offset = (OVS_FORCE __virtio16) 0; vnet->flags = VIRTIO_NET_HDR_F_DATA_VALID; + if (!dp_packet_ip_checksum_good(b)) { + /* It is possible that L4 is good but the IP checksum isn't + * complete. */ + dp_packet_ip_set_header_csum(b, false); + } } else if (dp_packet_hwol_tx_l4_checksum(b)) { /* The csum calculation is offloaded. */ if (dp_packet_hwol_l4_is_tcp(b)) { @@ -7192,37 +7200,54 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu) * the TCP pseudo header, so that replacing it by the ones * complement checksum of the TCP header and body will give * the correct result. */ + void * l3_off = dp_packet_inner_l3(b); + void * l4_off = dp_packet_inner_l4(b); + + if (!l3_off && !l4_off) { + l3_off = dp_packet_l3(b); + l4_off = dp_packet_l4(b); + } - struct tcp_header *tcp_hdr = dp_packet_l4(b); + struct tcp_header *tcp_hdr = l4_off; ovs_be16 csum = 0; if (dp_packet_hwol_is_ipv4(b)) { - const struct ip_header *ip_hdr = dp_packet_l3(b); + const struct ip_header *ip_hdr = l3_off; csum = ~csum_finish(packet_csum_pseudoheader(ip_hdr)); } else if (dp_packet_hwol_tx_ipv6(b)) { - const struct ovs_16aligned_ip6_hdr *ip6_hdr = dp_packet_l3(b); + const struct ovs_16aligned_ip6_hdr *ip6_hdr = l3_off; csum = ~csum_finish(packet_csum_pseudoheader6(ip6_hdr)); } tcp_hdr->tcp_csum = csum; vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; - vnet->csum_start = (OVS_FORCE __virtio16) b->l4_ofs; + vnet->csum_start = (OVS_FORCE __virtio16) ((char *) l4_off - + (char *) dp_packet_data(b)); vnet->csum_offset = (OVS_FORCE __virtio16) __builtin_offsetof( struct tcp_header, tcp_csum); } else if (dp_packet_hwol_l4_is_udp(b)) { 
- struct udp_header *udp_hdr = dp_packet_l4(b); + void * l3_off = dp_packet_inner_l3(b); + void * l4_off = dp_packet_inner_l4(b); + + if (!l3_off && !l4_off) { + l3_off = dp_packet_l3(b); + l4_off = dp_packet_l4(b); + } + + struct udp_header *udp_hdr = l4_off; ovs_be16 csum = 0; if (dp_packet_hwol_is_ipv4(b)) { - const struct ip_header *ip_hdr = dp_packet_l3(b); + const struct ip_header *ip_hdr = l3_off; csum = ~csum_finish(packet_csum_pseudoheader(ip_hdr)); } else if (dp_packet_hwol_tx_ipv6(b)) { - const struct ovs_16aligned_ip6_hdr *ip6_hdr = dp_packet_l3(b); + const struct ovs_16aligned_ip6_hdr *ip6_hdr = l3_off; csum = ~csum_finish(packet_csum_pseudoheader6(ip6_hdr)); } udp_hdr->udp_csum = csum; vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; - vnet->csum_start = (OVS_FORCE __virtio16) b->l4_ofs; + vnet->csum_start = (OVS_FORCE __virtio16) ((char *) l4_off - + (char *) dp_packet_data(b)); vnet->csum_offset = (OVS_FORCE __virtio16) __builtin_offsetof( struct udp_header, udp_csum); } else if (dp_packet_hwol_l4_is_sctp(b)) { diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index 78198c10d..35767800c 100644 --- a/lib/netdev-native-tnl.c +++ b/lib/netdev-native-tnl.c @@ -215,7 +215,8 @@ udp_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl, } if (udp->udp_csum) { - if (OVS_UNLIKELY(!dp_packet_l4_checksum_good(packet))) { + if (OVS_LIKELY(!dp_packet_ol_l4_csum_partial(packet)) && + OVS_UNLIKELY(!dp_packet_l4_checksum_good(packet))) { uint32_t csum; if (netdev_tnl_is_header_ipv6(dp_packet_data(packet))) { csum = packet_csum_pseudoheader6(dp_packet_l3(packet)); @@ -293,18 +294,11 @@ dp_packet_tnl_ol_process(const struct netdev *netdev, dp_packet_set_l2_len(packet, (char *) dp_packet_l3(packet) - (char *) dp_packet_eth(packet) + GENEVE_BASE_HLEN + opt_len); - - packet->inner_l3_ofs = packet->l3_ofs + GENEVE_BASE_HLEN + opt_len; - packet->inner_l4_ofs = packet->l4_ofs + GENEVE_BASE_HLEN + opt_len; - } else if (!strcmp(netdev_get_type(netdev), 
"vxlan")) { dp_packet_hwol_set_tunnel_vxlan(packet); dp_packet_set_l2_len(packet, (char *) dp_packet_l3(packet) - (char *) dp_packet_eth(packet) + VXLAN_HLEN); - - packet->inner_l3_ofs = packet->l3_ofs + VXLAN_HLEN; - packet->inner_l4_ofs = packet->l4_ofs + VXLAN_HLEN; } } } @@ -316,6 +310,8 @@ netdev_tnl_push_udp_header(const struct netdev *netdev, { struct udp_header *udp; int ip_tot_size; + uint16_t l3_ofs = packet->l3_ofs; + uint16_t l4_ofs = packet->l4_ofs; dp_packet_tnl_ol_process(netdev, packet, data); udp = netdev_tnl_push_ip_header(packet, data->header, data->header_len, @@ -333,13 +329,20 @@ netdev_tnl_push_udp_header(const struct netdev *netdev, } else { dp_packet_hwol_set_csum_udp(packet); } - } else { - dp_packet_ol_set_l4_csum_good(packet); } - packet->inner_l3_ofs += packet->l4_ofs; - packet->inner_l4_ofs += packet->l4_ofs; + if (packet->csum_start && packet->csum_offset) { + dp_packet_ol_set_l4_csum_partial(packet); + } else if (!udp->udp_csum) { + dp_packet_ol_set_l4_csum_good(packet); + } + if (l3_ofs != UINT16_MAX) { + packet->inner_l3_ofs = l3_ofs + data->header_len; + } + if (l4_ofs != UINT16_MAX) { + packet->inner_l4_ofs = l4_ofs + data->header_len; + } } static void * diff --git a/lib/packets.c b/lib/packets.c index d9e41346e..8c727397e 100644 --- a/lib/packets.c +++ b/lib/packets.c @@ -1999,19 +1999,31 @@ IP_ECN_set_ce(struct dp_packet *pkt, bool is_ipv6) void packet_tcp_complete_csum(struct dp_packet *p, bool inner) { - struct tcp_header *tcp = (inner) ? 
dp_packet_inner_l4(p) : dp_packet_l4(p); + struct tcp_header *tcp; + size_t tcp_sz; + void *ip_hdr; + + if (inner) { + tcp = dp_packet_inner_l4(p); + ip_hdr = dp_packet_inner_l3(p); + tcp_sz = dp_packet_inner_l4_size(p); + } else { + tcp = dp_packet_l4(p); + ip_hdr = dp_packet_l3(p); + tcp_sz = dp_packet_l4_size(p); + } tcp->tcp_csum = 0; if (dp_packet_hwol_is_ipv4(p)) { - struct ip_header *ip = dp_packet_l3(p); + struct ip_header *ip = ip_hdr; tcp->tcp_csum = csum_finish(csum_continue(packet_csum_pseudoheader(ip), - tcp, dp_packet_l4_size(p))); + tcp, tcp_sz)); } else if (dp_packet_hwol_tx_ipv6(p)) { - struct ovs_16aligned_ip6_hdr *ip6 = dp_packet_l3(p); + struct ovs_16aligned_ip6_hdr *ip6 = ip_hdr; tcp->tcp_csum = packet_csum_upperlayer6(ip6, tcp, ip6->ip6_nxt, - dp_packet_l4_size(p)); + tcp_sz); } else { OVS_NOT_REACHED(); } @@ -2022,7 +2034,19 @@ packet_tcp_complete_csum(struct dp_packet *p, bool inner) void packet_udp_complete_csum(struct dp_packet *p, bool inner) { - struct udp_header *udp = (inner) ? dp_packet_inner_l4(p) : dp_packet_l4(p); + struct udp_header *udp; + size_t udp_sz; + void *ip_hdr; + + if (inner) { + udp = dp_packet_inner_l4(p); + ip_hdr = dp_packet_inner_l3(p); + udp_sz = dp_packet_inner_l4_size(p); + } else { + udp = dp_packet_l4(p); + ip_hdr = dp_packet_l3(p); + udp_sz = dp_packet_l4_size(p); + } /* Skip csum calculation if the udp_csum is zero. 
*/ if (!udp->udp_csum) { @@ -2031,15 +2055,15 @@ packet_udp_complete_csum(struct dp_packet *p, bool inner) udp->udp_csum = 0; if (dp_packet_hwol_is_ipv4(p)) { - struct ip_header *ip = dp_packet_l3(p); + struct ip_header *ip = ip_hdr; udp->udp_csum = csum_finish(csum_continue(packet_csum_pseudoheader(ip), - udp, dp_packet_l4_size(p))); + udp, udp_sz)); } else if (dp_packet_hwol_tx_ipv6(p)) { - struct ovs_16aligned_ip6_hdr *ip6 = dp_packet_l3(p); + struct ovs_16aligned_ip6_hdr *ip6 = ip_hdr; udp->udp_csum = packet_csum_upperlayer6(ip6, udp, ip6->ip6_nxt, - dp_packet_l4_size(p)); + udp_sz); } else { OVS_NOT_REACHED(); } @@ -2054,10 +2078,18 @@ packet_udp_complete_csum(struct dp_packet *p, bool inner) void packet_sctp_complete_csum(struct dp_packet *p, bool inner) { - struct sctp_header *sh = (inner) ? dp_packet_inner_l4(p) : dp_packet_l4(p); - uint16_t tp_len = dp_packet_l4_size(p); + struct sctp_header *sh; + uint16_t tp_len; ovs_be32 csum; + if (inner) { + sh = dp_packet_inner_l4(p); + tp_len = dp_packet_inner_l4_size(p); + } else { + sh = dp_packet_l4(p); + tp_len = dp_packet_l4_size(p); + } + put_16aligned_be32(&sh->sctp_csum, 0); csum = crc32c((void *) sh, tp_len); put_16aligned_be32(&sh->sctp_csum, csum); diff --git a/tests/system-traffic.at b/tests/system-traffic.at index 69ba6a18a..3bc1341b8 100644 --- a/tests/system-traffic.at +++ b/tests/system-traffic.at @@ -292,7 +292,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over vxlan tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_VXLAN() OVS_TRAFFIC_VSWITCHD_START() @@ -330,6 +329,15 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -W 2 10.1.1.100 | FORMAT_PI 3 packets transmitted, 3 received, 0% packet loss, time 0ms ]) +dnl Check large bidirectional TCP. 
+AT_CHECK([dd if=/dev/urandom of=payload.bin bs=60000 count=1 2> /dev/null]) +OVS_DAEMONIZE([nc -l 10.1.1.100 1234 > data], [nc.pid]) +NS_CHECK_EXEC([at_ns0], [nc $NC_EOF_OPT 10.1.1.100 1234 < payload.bin]) + +dnl Wait until transfer completes before checking. +OVS_WAIT_WHILE([kill -0 $(cat nc.pid)]) +AT_CHECK([diff -q payload.bin data], [0]) + OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP @@ -381,7 +389,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over vxlan6 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_VXLAN_UDP6ZEROCSUM() OVS_TRAFFIC_VSWITCHD_START() @@ -425,7 +432,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over gre tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() @@ -467,7 +473,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over ip6gre L2 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() OVS_CHECK_ERSPAN() @@ -508,7 +513,6 @@ AT_CLEANUP AT_SETUP([datapath - ping over erspan v1 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() OVS_CHECK_ERSPAN() @@ -545,7 +549,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over erspan v2 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() OVS_CHECK_ERSPAN() @@ -582,7 +585,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over ip6erspan v1 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() OVS_CHECK_ERSPAN() @@ -622,7 +624,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over ip6erspan v2 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_KERNEL_EXCL(3, 10, 4, 15) OVS_CHECK_GRE() OVS_CHECK_ERSPAN() @@ -663,7 +664,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over geneve tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE() OVS_TRAFFIC_VSWITCHD_START() @@ -701,11 +701,19 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -W 2 
10.1.1.100 | FORMAT_PI 3 packets transmitted, 3 received, 0% packet loss, time 0ms ]) +dnl Check large bidirectional TCP. +AT_CHECK([dd if=/dev/urandom of=payload.bin bs=60000 count=1 2> /dev/null]) +OVS_DAEMONIZE([nc -l 10.1.1.100 1234 > data], [nc.pid]) +NS_CHECK_EXEC([at_ns0], [nc $NC_EOF_OPT 10.1.1.100 1234 < payload.bin]) + +dnl Wait until transfer completes before checking. +OVS_WAIT_WHILE([kill -0 $(cat nc.pid)]) +AT_CHECK([diff -q payload.bin data], [0]) + OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over geneve tunnel, delete flow regression]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE() OVS_TRAFFIC_VSWITCHD_START() @@ -760,7 +768,6 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/|ERR|/d AT_CLEANUP AT_SETUP([datapath - flow resume with geneve tun_metadata]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE() OVS_TRAFFIC_VSWITCHD_START() @@ -812,7 +819,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over geneve6 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE_UDP6ZEROCSUM() OVS_TRAFFIC_VSWITCHD_START() @@ -857,7 +863,6 @@ AT_CLEANUP AT_SETUP([datapath - slow_action on geneve6 tunnel]) AT_SKIP_IF([test $HAVE_TCPDUMP = no]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE_UDP6ZEROCSUM() OVS_TRAFFIC_VSWITCHD_START() @@ -981,7 +986,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over gre tunnel by simulated packets]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_MIN_KERNEL(3, 10) OVS_TRAFFIC_VSWITCHD_START() @@ -1028,7 +1032,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over erspan v1 tunnel by simulated packets]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_MIN_KERNEL(3, 10) OVS_TRAFFIC_VSWITCHD_START() @@ -1077,7 +1080,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over erspan v2 tunnel by simulated packets]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_MIN_KERNEL(3, 10) OVS_TRAFFIC_VSWITCHD_START() @@ -1131,7 +1133,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over ip6erspan v1 tunnel by simulated 
packets]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_MIN_KERNEL(3, 10) OVS_TRAFFIC_VSWITCHD_START() @@ -1187,7 +1188,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over ip6erspan v2 tunnel by simulated packets]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_MIN_KERNEL(3, 10) OVS_TRAFFIC_VSWITCHD_START() @@ -1242,7 +1242,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping over srv6 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_SRV6() OVS_TRAFFIC_VSWITCHD_START() @@ -1304,7 +1303,6 @@ OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP AT_SETUP([datapath - ping6 over srv6 tunnel]) -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_SRV6() OVS_TRAFFIC_VSWITCHD_START() @@ -7831,7 +7829,6 @@ AT_CLEANUP AT_SETUP([conntrack - can match and clear ct_state from outside OVS]) CHECK_CONNTRACK_LOCAL_STACK() -OVS_CHECK_TUNNEL_TSO() OVS_CHECK_GENEVE() OVS_TRAFFIC_VSWITCHD_START()