From patchwork Wed Feb 21 04:08:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Pattrick
X-Patchwork-Id: 1901838
From: Mike Pattrick
To: dev@openvswitch.org
Date: Tue, 20 Feb 2024 23:08:55 -0500
Message-Id: <20240221040855.271921-1-mkp@redhat.com>
Subject: [ovs-dev] [PATCH] Userspace: Software fallback for UDP encapsulated TCP segmentation.

When sending packets that are flagged as requiring segmentation to an
interface that doesn't support this feature, send the packet to the TSO
software fallback instead of dropping it.

Signed-off-by: Mike Pattrick
---
 lib/dp-packet-gso.c     | 73 +++++++++++++++++++++++++++++++++--------
 lib/dp-packet.h         | 26 +++++++++++++++
 lib/netdev-native-tnl.c |  8 +++++
 lib/netdev.c            | 37 +++++++++------------
 tests/system-traffic.at | 58 ++++++++++++++++++++++++++++++++
 5 files changed, 167 insertions(+), 35 deletions(-)

diff --git a/lib/dp-packet-gso.c b/lib/dp-packet-gso.c
index 847685ad9..f25abf436 100644
--- a/lib/dp-packet-gso.c
+++ b/lib/dp-packet-gso.c
@@ -47,6 +47,8 @@ dp_packet_gso_seg_new(const struct dp_packet *p, size_t hdr_len,
     seg->l2_5_ofs = p->l2_5_ofs;
     seg->l3_ofs = p->l3_ofs;
     seg->l4_ofs = p->l4_ofs;
+    seg->inner_l3_ofs = p->inner_l3_ofs;
+    seg->inner_l4_ofs = p->inner_l4_ofs;
 
     /* The protocol headers remain the same, so preserve hash and mark. */
     *dp_packet_rss_ptr(seg) = *dp_packet_rss_ptr(p);
@@ -71,7 +73,12 @@ dp_packet_gso_nr_segs(struct dp_packet *p)
     const char *data_tail;
     const char *data_pos;
 
-    data_pos = dp_packet_get_tcp_payload(p);
+    if (dp_packet_hwol_is_tunnel_vxlan(p) ||
+        dp_packet_hwol_is_tunnel_geneve(p)) {
+        data_pos = dp_packet_get_inner_tcp_payload(p);
+    } else {
+        data_pos = dp_packet_get_tcp_payload(p);
+    }
     data_tail = (char *) dp_packet_tail(p) - dp_packet_l2_pad_size(p);
 
     return DIV_ROUND_UP(data_tail - data_pos, segsz);
@@ -91,12 +98,15 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch **batches)
     struct tcp_header *tcp_hdr;
     struct ip_header *ip_hdr;
     struct dp_packet *seg;
+    const char *data_pos;
     uint16_t tcp_offset;
     uint16_t tso_segsz;
+    uint16_t ip_id = 0;
     uint32_t tcp_seq;
-    uint16_t ip_id;
+    bool outer_ipv4;
     int hdr_len;
     int seg_len;
+    bool tnl;
 
     tso_segsz = dp_packet_get_tso_segsz(p);
     if (!tso_segsz) {
@@ -105,20 +115,35 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch **batches)
         return false;
     }
 
-    tcp_hdr = dp_packet_l4(p);
-    tcp_offset = TCP_OFFSET(tcp_hdr->tcp_ctl);
-    tcp_seq = ntohl(get_16aligned_be32(&tcp_hdr->tcp_seq));
-    hdr_len = ((char *) dp_packet_l4(p) - (char *) dp_packet_eth(p))
-              + tcp_offset * 4;
-    ip_id = 0;
-    if (dp_packet_hwol_is_ipv4(p)) {
+    if (dp_packet_hwol_is_tunnel_vxlan(p) ||
+        dp_packet_hwol_is_tunnel_geneve(p)) {
+        data_pos = dp_packet_get_inner_tcp_payload(p);
+        outer_ipv4 = dp_packet_hwol_is_outer_ipv4(p);
+        tcp_hdr = dp_packet_inner_l4(p);
+        ip_hdr = dp_packet_inner_l3(p);
+        tnl = true;
+        if (outer_ipv4) {
+            ip_id = ntohs(((struct ip_header *) dp_packet_l3(p))->ip_id);
+        } else if (dp_packet_hwol_is_ipv4(p)) {
+            ip_id = ntohs(ip_hdr->ip_id);
+        }
+    } else {
+        data_pos = dp_packet_get_tcp_payload(p);
+        outer_ipv4 = dp_packet_hwol_is_ipv4(p);
+        tcp_hdr = dp_packet_l4(p);
         ip_hdr = dp_packet_l3(p);
-        ip_id = ntohs(ip_hdr->ip_id);
+        tnl = false;
+        if (outer_ipv4) {
+            ip_id = ntohs(ip_hdr->ip_id);
+        }
     }
 
+    tcp_offset = TCP_OFFSET(tcp_hdr->tcp_ctl);
+    tcp_seq = ntohl(get_16aligned_be32(&tcp_hdr->tcp_seq));
+    hdr_len = ((char *) tcp_hdr - (char *) dp_packet_eth(p))
+              + tcp_offset * 4;
     const char *data_tail = (char *) dp_packet_tail(p)
                             - dp_packet_l2_pad_size(p);
-    const char *data_pos = dp_packet_get_tcp_payload(p);
     int n_segs = dp_packet_gso_nr_segs(p);
 
     for (int i = 0; i < n_segs; i++) {
@@ -130,8 +155,26 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch **batches)
         seg = dp_packet_gso_seg_new(p, hdr_len, data_pos, seg_len);
         data_pos += seg_len;
 
+        if (tnl) {
+            /* Update tunnel L3 header. */
+            if (dp_packet_hwol_is_ipv4(seg)) {
+                ip_hdr = dp_packet_inner_l3(seg);
+                ip_hdr->ip_tot_len = htons(sizeof *ip_hdr +
+                                           dp_packet_inner_l4_size(seg));
+                ip_hdr->ip_id = htons(ip_id);
+                ip_hdr->ip_csum = 0;
+                ip_id++;
+            } else {
+                struct ovs_16aligned_ip6_hdr *ip6_hdr;
+
+                ip6_hdr = dp_packet_inner_l3(seg);
+                ip6_hdr->ip6_ctlun.ip6_un1.ip6_un1_plen
+                    = htons(dp_packet_inner_l3_size(seg) - sizeof *ip6_hdr);
+            }
+        }
+
         /* Update L3 header. */
-        if (dp_packet_hwol_is_ipv4(seg)) {
+        if (outer_ipv4) {
             ip_hdr = dp_packet_l3(seg);
             ip_hdr->ip_tot_len = htons(sizeof *ip_hdr +
                                        dp_packet_l4_size(seg));
@@ -146,7 +189,11 @@ dp_packet_gso(struct dp_packet *p, struct dp_packet_batch **batches)
         }
 
         /* Update L4 header. */
-        tcp_hdr = dp_packet_l4(seg);
+        if (tnl) {
+            tcp_hdr = dp_packet_inner_l4(seg);
+        } else {
+            tcp_hdr = dp_packet_l4(seg);
+        }
         put_16aligned_be32(&tcp_hdr->tcp_seq, htonl(tcp_seq));
         tcp_seq += seg_len;
         if (OVS_LIKELY(i < (n_segs - 1))) {
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 2fa17d814..c3dabc5b0 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -529,6 +529,16 @@ dp_packet_inner_l3(const struct dp_packet *b)
            : NULL;
 }
 
+static inline size_t
+dp_packet_inner_l3_size(const struct dp_packet *b)
+{
+    return OVS_LIKELY(b->inner_l3_ofs != UINT16_MAX)
+           ? (const char *) dp_packet_tail(b)
+           - (const char *) dp_packet_inner_l3(b)
+           - dp_packet_l2_pad_size(b)
+           : 0;
+}
+
 static inline void *
 dp_packet_inner_l4(const struct dp_packet *b)
 {
@@ -563,6 +573,22 @@ dp_packet_get_tcp_payload(const struct dp_packet *b)
     return NULL;
 }
 
+static inline const void *
+dp_packet_get_inner_tcp_payload(const struct dp_packet *b)
+{
+    size_t l4_size = dp_packet_inner_l4_size(b);
+
+    if (OVS_LIKELY(l4_size >= TCP_HEADER_LEN)) {
+        struct tcp_header *tcp = dp_packet_inner_l4(b);
+        int tcp_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
+
+        if (OVS_LIKELY(tcp_len >= TCP_HEADER_LEN && tcp_len <= l4_size)) {
+            return (const char *) tcp + tcp_len;
+        }
+    }
+    return NULL;
+}
+
 static inline uint32_t
 dp_packet_get_tcp_payload_length(const struct dp_packet *pkt)
 {
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index dee9ab344..369b92acc 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -348,6 +348,14 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
     if (l4_ofs != UINT16_MAX) {
         packet->inner_l4_ofs = l4_ofs + data->header_len;
     }
+
+    if (dp_packet_hwol_is_tso(packet)) {
+        uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
+        if (tso_segsz > data->header_len) {
+            tso_segsz -= data->header_len;
+            dp_packet_set_tso_segsz(packet, tso_segsz);
+        }
+    }
 }
 
 static void *
diff --git a/lib/netdev.c b/lib/netdev.c
index f2d921ed6..1d59bbe5d 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -69,8 +69,6 @@ COVERAGE_DEFINE(netdev_received);
 COVERAGE_DEFINE(netdev_sent);
 COVERAGE_DEFINE(netdev_add_router);
 COVERAGE_DEFINE(netdev_get_stats);
-COVERAGE_DEFINE(netdev_vxlan_tso_drops);
-COVERAGE_DEFINE(netdev_geneve_tso_drops);
 COVERAGE_DEFINE(netdev_push_header_drops);
 COVERAGE_DEFINE(netdev_soft_seg_good);
 COVERAGE_DEFINE(netdev_soft_seg_drops);
@@ -910,28 +908,23 @@ netdev_send(struct netdev *netdev, int qid, struct dp_packet_batch *batch,
     struct dp_packet *packet;
     int error;
 
-    if (userspace_tso_enabled() &&
-        !(netdev_flags & NETDEV_TX_OFFLOAD_TCP_TSO)) {
-        DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-            if (dp_packet_hwol_is_tso(packet)) {
-                if (dp_packet_hwol_is_tunnel_vxlan(packet)
-                    && !(netdev_flags & NETDEV_TX_VXLAN_TNL_TSO)) {
-                        VLOG_WARN_RL(&rl, "%s: No VXLAN TSO support",
-                                     netdev_get_name(netdev));
-                        COVERAGE_INC(netdev_vxlan_tso_drops);
-                        dp_packet_delete_batch(batch, true);
-                        return false;
+    if (userspace_tso_enabled()) {
+        if (!(netdev_flags & NETDEV_TX_OFFLOAD_TCP_TSO)) {
+            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
+                if (dp_packet_hwol_is_tso(packet)) {
+                    return netdev_send_tso(netdev, qid, batch, concurrent_txq);
                 }
-
-                if (dp_packet_hwol_is_tunnel_geneve(packet)
-                    && !(netdev_flags & NETDEV_TX_GENEVE_TNL_TSO)) {
-                        VLOG_WARN_RL(&rl, "%s: No GENEVE TSO support",
-                                     netdev_get_name(netdev));
-                        COVERAGE_INC(netdev_geneve_tso_drops);
-                        dp_packet_delete_batch(batch, true);
-                        return false;
+            }
+        } else if (!(netdev_flags & (NETDEV_TX_VXLAN_TNL_TSO |
+                                     NETDEV_TX_GENEVE_TNL_TSO))) {
+            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
+                if (!dp_packet_hwol_is_tso(packet)) {
+                    continue;
+                }
+                if (dp_packet_hwol_is_tunnel_vxlan(packet) ||
+                    dp_packet_hwol_is_tunnel_geneve(packet)) {
+                    return netdev_send_tso(netdev, qid, batch, concurrent_txq);
                 }
-                return netdev_send_tso(netdev, qid, batch, concurrent_txq);
             }
         }
     }
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 98e494abf..c1681506e 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -351,6 +351,64 @@ OVS_WAIT_UNTIL([diff -q payload.bin udp_data])
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([datapath - tcp over vxlan tunnel with software fallback])
+AT_SKIP_IF([test $HAVE_NC = no])
+OVS_CHECK_VXLAN()
+
+OVS_TRAFFIC_VSWITCHD_START()
+ADD_BR([br-underlay])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
+
+ADD_NAMESPACES(at_ns0)
+
+dnl Set up underlay link from host into the namespace using veth pair.
+ADD_VETH(p0, at_ns0, br-underlay, "172.31.1.1/24")
+AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"])
+AT_CHECK([ip link set dev br-underlay up])
+
+dnl Test the case where only one side has all checksum and tso offload disabled.
+AT_CHECK([ethtool -K ovs-p0 tso off], [0], [ignore], [ignore])
+AT_CHECK([ethtool -K ovs-p0 sg off], [0], [ignore], [ignore])
+
+dnl Reinitialize.
+AT_CHECK([ovs-vsctl del-port ovs-p0])
+AT_CHECK([ovs-vsctl add-port br-underlay ovs-p0])
+
+dnl Set up tunnel endpoints on OVS outside the namespace and with a native
+dnl linux device inside the namespace.
+ADD_OVS_TUNNEL([vxlan], [br0], [at_vxlan0], [172.31.1.1], [10.1.1.100/24])
+ADD_NATIVE_TUNNEL([vxlan], [at_vxlan1], [at_ns0], [172.31.1.100], [10.1.1.1/24],
+                  [id 0 dstport 4789])
+
+dnl First, check the underlay.
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -W 2 172.31.1.100 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+dnl Check that the tunnel is up.
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -W 2 10.1.1.100 | FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+dnl Initialize the listener before it is needed.
+NETNS_DAEMONIZE([at_ns0], [nc -l 10.1.1.1 1234 > data2], [nc.pid])
+
+dnl Verify that ncat is ready.
+OVS_WAIT_UNTIL([NS_EXEC([at_ns0], [netstat -ln | grep :1234])])
+
+dnl Large TCP transfer aimed towards ovs-p0, which has TSO disabled.
+AT_CHECK([dd if=/dev/urandom of=payload.bin bs=60000 count=1 2> /dev/null])
+AT_CHECK([nc $NC_EOF_OPT 10.1.1.1 1234 < payload.bin])
+
+dnl Wait until transfer completes before checking.
+OVS_WAIT_WHILE([kill -0 $(cat nc.pid)])
+AT_CHECK([diff -q payload.bin data2], [0])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([datapath - ping vlan over vxlan tunnel])
 OVS_CHECK_TUNNEL_TSO()
 OVS_CHECK_VXLAN()
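
Note on the segmentation arithmetic: the fallback path relies on two small calculations, the segment-size reduction done when the UDP tunnel header is pushed and the ceiling division that yields the segment count. The standalone sketch below illustrates both, plus the per-segment TCP sequence advance. It is not OVS code: struct seg_plan, adjust_segsz(), count_segments() and plan_segments() are hypothetical names introduced only for this illustration, and it works on plain integers rather than struct dp_packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-segment values the fallback rewrites. */
struct seg_plan {
    uint32_t tcp_seq;    /* Inner TCP sequence number of this segment. */
    size_t payload_len;  /* Inner TCP payload bytes carried by it. */
};

/* Same shape as the adjustment added to netdev_tnl_push_udp_header(): the
 * tunnel header uses up room, so the segment size shrinks, but only when
 * the result stays positive.  Otherwise the size is left unchanged. */
static uint16_t
adjust_segsz(uint16_t segsz, uint16_t tnl_header_len)
{
    return segsz > tnl_header_len
           ? (uint16_t) (segsz - tnl_header_len)
           : segsz;
}

/* Ceiling division, the same idea as DIV_ROUND_UP() in
 * dp_packet_gso_nr_segs(). */
static size_t
count_segments(size_t payload_len, uint16_t segsz)
{
    assert(segsz > 0);
    return (payload_len + segsz - 1) / segsz;
}

/* Fills up to 'max' entries of 'plan' and returns how many were written.
 * Each segment carries 'segsz' bytes except possibly the last, and the
 * sequence number advances by the bytes already emitted. */
static size_t
plan_segments(size_t payload_len, uint16_t segsz, uint32_t first_seq,
              struct seg_plan *plan, size_t max)
{
    size_t n = count_segments(payload_len, segsz);
    size_t i;

    for (i = 0; i < n && i < max; i++) {
        size_t sent = i * segsz;
        size_t left = payload_len - sent;

        plan[i].payload_len = left < segsz ? left : segsz;
        plan[i].tcp_seq = first_seq + (uint32_t) sent;
    }
    return i;
}
```

As a worked case under assumed numbers: with a segment size of 1500 and a 50-byte tunnel header, adjust_segsz() gives 1450, and the 60000-byte payload used by the new system test would then split into 42 segments, the last one carrying 550 bytes. The real segment size depends on the MTU and headers in use, so these figures are illustrative only.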