From patchwork Tue Dec 7 16:51:54 2021
X-Patchwork-Submitter: Flavio Leitner
X-Patchwork-Id: 1564790
From: Flavio Leitner
To: dev@openvswitch.org
Cc: Flavio Leitner
Date: Tue, 7 Dec 2021 13:51:54 -0300
Message-Id: <20211207165156.705727-16-fbl@sysclose.org>
In-Reply-To: <20211207165156.705727-1-fbl@sysclose.org>
References: <20211207165156.705727-1-fbl@sysclose.org>
Subject: [ovs-dev] [[PATCH RFC] 15/17] Respect tso/gso segment size.

Currently OVS will calculate the segment size based on the
MTU of the egress port. That usually happens to be correct
when the ports share the same MTU, but that is not always true.

Therefore, if the segment size is provided, then use that and
make sure the over sized packets are dropped.

Signed-off-by: Flavio Leitner
---
 lib/dp-packet.c    |  1 +
 lib/dp-packet.h    | 27 ++++++++++++++++
 lib/netdev-dpdk.c  | 13 ++++++--
 lib/netdev-linux.c | 78 +++++++++++++++++++++++++++++++++++-----------
 4 files changed, 98 insertions(+), 21 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 8a1bf221a..0cfc295b1 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -34,6 +34,7 @@ dp_packet_init__(struct dp_packet *p, size_t allocated, enum dp_packet_source so
     pkt_metadata_init(&p->md, 0);
     dp_packet_reset_cutlen(p);
     dp_packet_ol_reset(p);
+    dp_packet_set_tso_segsz(p, 0);
     /* Initialize implementation-specific fields of dp_packet. */
     dp_packet_init_specific(p);
     /* By default assume the packet type to be Ethernet. */
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 51f98ab9a..27529ca87 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -124,6 +124,7 @@ struct dp_packet {
     uint32_t ol_flags;          /* Offloading flags. */
     uint32_t rss_hash;          /* Packet hash. */
     uint32_t flow_mark;         /* Packet flow mark. */
+    uint16_t tso_segsz;         /* TCP TSO segment size. */
 #endif
     enum dp_packet_source source;  /* Source of memory allocated as 'base'. */

@@ -164,6 +165,9 @@ static inline void dp_packet_set_size(struct dp_packet *, uint32_t);
 static inline uint16_t dp_packet_get_allocated(const struct dp_packet *);
 static inline void dp_packet_set_allocated(struct dp_packet *, uint16_t);

+static inline uint16_t dp_packet_get_tso_segsz(const struct dp_packet *);
+static inline void dp_packet_set_tso_segsz(struct dp_packet *, uint16_t);
+
 void *dp_packet_resize_l2(struct dp_packet *, int increment);
 void *dp_packet_resize_l2_5(struct dp_packet *, int increment);
 static inline void *dp_packet_eth(const struct dp_packet *);
@@ -635,6 +639,18 @@ dp_packet_set_allocated(struct dp_packet *p, uint16_t s)
     p->mbuf.buf_len = s;
 }

+static inline uint16_t
+dp_packet_get_tso_segsz(const struct dp_packet *p)
+{
+    return p->mbuf.tso_segsz;
+}
+
+static inline void
+dp_packet_set_tso_segsz(struct dp_packet *p, uint16_t s)
+{
+    p->mbuf.tso_segsz = s;
+}
+
 #else /* DPDK_NETDEV */

 static inline void
@@ -691,6 +707,17 @@ dp_packet_set_allocated(struct dp_packet *p, uint16_t s)
     p->allocated_ = s;
 }

+static inline uint16_t
+dp_packet_get_tso_segsz(const struct dp_packet *p)
+{
+    return p->tso_segsz;
+}
+
+static inline void
+dp_packet_set_tso_segsz(struct dp_packet *p, uint16_t s)
+{
+    p->tso_segsz = s;
+}
 #endif /* DPDK_NETDEV */

 static inline void
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c7e09b973..0d370bda3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2222,6 +2222,7 @@ netdev_dpdk_prep_ol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)

     if (mbuf->ol_flags & PKT_TX_TCP_SEG) {
         struct tcp_header *th = dp_packet_l4(pkt);
+        int hdr_len;

         if (!th) {
             VLOG_WARN_RL(&rl, "%s: TCP Segmentation without L4 header"
@@ -2231,7 +2232,14 @@ netdev_dpdk_prep_ol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)

         mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
         mbuf->ol_flags |= PKT_TX_TCP_CKSUM;
-        mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+        hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
+        if (OVS_UNLIKELY((hdr_len + mbuf->tso_segsz) > dev->max_packet_len)) {
+            VLOG_WARN_RL(&rl, "%s: Oversized TSO packet. "
+                         "hdr: %"PRIu32", gso: %"PRIu32", max len: %"PRIu32"",
+                         dev->up.name, hdr_len, mbuf->tso_segsz,
+                         dev->max_packet_len);
+            return false;
+        }

         if (mbuf->ol_flags & PKT_TX_IPV4) {
             mbuf->ol_flags |= PKT_TX_IP_CKSUM;
@@ -2597,7 +2605,8 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
     int cnt = 0;
     struct rte_mbuf *pkt;

-    /* Filter oversized packets, unless are marked for TSO. */
+    /* Filter oversized packets. The TSO packets are filtered out
+     * during the offloading preparation for performance reasons. */
     for (i = 0; i < pkt_cnt; i++) {
         pkt = pkts[i];
         if (OVS_UNLIKELY((pkt->pkt_len > dev->max_packet_len)
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 48a3cf7d7..8a6f4592b 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -523,7 +523,7 @@ static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
 static atomic_count miimon_cnt = ATOMIC_COUNT_INIT(0);

 static int netdev_linux_parse_vnet_hdr(struct dp_packet *b);
-static void netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu);
+static int netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu);
 static int netdev_linux_do_ethtool(const char *name, struct ethtool_cmd *,
                                    int cmd, const char *cmd_name);
 static int get_flags(const struct netdev *, unsigned int *flags);
@@ -1549,9 +1549,10 @@ netdev_linux_rxq_drain(struct netdev_rxq *rxq_)
 }

 static int
-netdev_linux_sock_batch_send(int sock, int ifindex, bool tso, int mtu,
-                             struct dp_packet_batch *batch)
+netdev_linux_sock_batch_send(struct netdev *netdev_, int sock, int ifindex,
+                             bool tso, int mtu, struct dp_packet_batch *batch)
 {
+    struct netdev_linux *netdev = netdev_linux_cast(netdev_);
     const size_t size = dp_packet_batch_size(batch);
     /* We don't bother setting most fields in sockaddr_ll because the
      * kernel ignores them for SOCK_RAW. */
@@ -1560,26 +1561,35 @@ netdev_linux_sock_batch_send(int sock, int ifindex, bool tso, int mtu,

     struct mmsghdr *mmsg = xmalloc(sizeof(*mmsg) * size);
     struct iovec *iov = xmalloc(sizeof(*iov) * size);
-
     struct dp_packet *packet;
+    int cnt = 0;
+
     DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
         if (tso) {
-            netdev_linux_prepend_vnet_hdr(packet, mtu);
+            int ret = netdev_linux_prepend_vnet_hdr(packet, mtu);
+
+            if (OVS_UNLIKELY(ret)) {
+                netdev->tx_dropped += 1;
+                VLOG_WARN_RL(&rl, "%s: Packet dropped. %s",
+                             netdev_get_name(netdev_), ovs_strerror(ret));
+                continue;
+            }
         }

-        iov[i].iov_base = dp_packet_data(packet);
-        iov[i].iov_len = dp_packet_size(packet);
-        mmsg[i].msg_hdr = (struct msghdr) { .msg_name = &sll,
+        iov[cnt].iov_base = dp_packet_data(packet);
+        iov[cnt].iov_len = dp_packet_size(packet);
+        mmsg[cnt].msg_hdr = (struct msghdr) { .msg_name = &sll,
                                             .msg_namelen = sizeof sll,
-                                            .msg_iov = &iov[i],
+                                            .msg_iov = &iov[cnt],
                                             .msg_iovlen = 1 };
+        cnt++;
     }

     int error = 0;
-    for (uint32_t ofs = 0; ofs < size; ) {
+    for (uint32_t ofs = 0; ofs < cnt; ) {
         ssize_t retval;
         do {
-            retval = sendmmsg(sock, mmsg + ofs, size - ofs, 0);
+            retval = sendmmsg(sock, mmsg + ofs, cnt - ofs, 0);
             error = retval < 0 ? errno : 0;
         } while (error == EINTR);
         if (error) {
@@ -1620,7 +1630,14 @@ netdev_linux_tap_batch_send(struct netdev *netdev_, int mtu,
         ssize_t retval;
         int error;

-        netdev_linux_prepend_vnet_hdr(packet, mtu);
+        error = netdev_linux_prepend_vnet_hdr(packet, mtu);
+        if (OVS_UNLIKELY(error)) {
+            netdev->tx_dropped++;
+            VLOG_WARN_RL(&rl, "%s: Packet dropped. %s",
+                         netdev_get_name(netdev_), ovs_strerror(error));
+            continue;
+        }
+
         size = dp_packet_size(packet);
         do {
             retval = write(netdev->tap_fd, dp_packet_data(packet), size);
@@ -1748,7 +1765,8 @@ netdev_linux_send(struct netdev *netdev_, int qid OVS_UNUSED,
             goto free_batch;
         }

-        error = netdev_linux_sock_batch_send(sock, ifindex, tso, mtu, batch);
+        error = netdev_linux_sock_batch_send(netdev_, sock, ifindex, tso, mtu,
+                                             batch);
     } else {
         error = netdev_linux_tap_batch_send(netdev_, mtu, batch);
     }
@@ -6597,8 +6615,15 @@ netdev_linux_parse_vnet_hdr(struct dp_packet *b)
     switch (vnet->gso_type) {
     case VIRTIO_NET_HDR_GSO_TCPV4:
     case VIRTIO_NET_HDR_GSO_TCPV6:
-        /* FIXME: The packet has offloaded TCP segmentation. The gso_size
-         * is given and needs to be respected. */
+        if (OVS_UNLIKELY(!userspace_tso_enabled())) {
+            VLOG_WARN_RL(&rl, "Received an unsupported packet with TSO "
+                         "enabled.");
+            ret = ENOTSUP;
+            break;
+        }
+
+        /* The packet has offloaded TCP segmentation. */
+        dp_packet_set_tso_segsz(b, vnet->gso_size);
         dp_packet_ol_set_tcp_seg(b);
         break;
     case VIRTIO_NET_HDR_GSO_UDP:
@@ -6617,18 +6642,32 @@ netdev_linux_parse_vnet_hdr(struct dp_packet *b)
     return ret;
 }

-static void
+/* Prepends struct virtio_net_hdr to packet 'b'.
+ * Returns 0 if successful, otherwise a positive errno value.
+ * Returns EMSGSIZE if the packet 'b' cannot be sent over MTU 'mtu'. */
+static int
 netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu)
 {
     struct virtio_net_hdr v;
     struct virtio_net_hdr *vnet = &v;

     if (dp_packet_ol_tcp_seg(b)) {
-        uint16_t hdr_len = ((char *)dp_packet_l4(b) - (char *)dp_packet_eth(b))
-                           + TCP_HEADER_LEN;
+        uint16_t tso_segsz = dp_packet_get_tso_segsz(b);
+        struct tcp_header *tcp = dp_packet_l4(b);
+        int tcp_hdr_len = TCP_OFFSET(tcp->tcp_ctl) * 4;
+        int hdr_len = ((char *)dp_packet_l4(b) - (char *)dp_packet_eth(b))
+                      + tcp_hdr_len;
+        int max_packet_len = mtu + ETH_HEADER_LEN + VLAN_HEADER_LEN;
+
+        if (OVS_UNLIKELY((hdr_len + tso_segsz) > max_packet_len)) {
+            VLOG_WARN_RL(&rl, "Oversized TSO packet. hdr_len: %"PRIu32", "
+                         "gso: %"PRIu16", max length: %"PRIu32".", hdr_len,
+                         tso_segsz, max_packet_len);
+            return EMSGSIZE;
+        }

         vnet->hdr_len = (OVS_FORCE __virtio16)hdr_len;
-        vnet->gso_size = (OVS_FORCE __virtio16)(mtu - hdr_len);
+        vnet->gso_size = (OVS_FORCE __virtio16)(tso_segsz);
         if (dp_packet_ol_tx_ipv4(b)) {
             vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
         } else if (dp_packet_ol_tx_ipv6(b)) {
@@ -6718,4 +6757,5 @@ netdev_linux_prepend_vnet_hdr(struct dp_packet *b, int mtu)
     }

     dp_packet_push(b, vnet, sizeof *vnet);
+    return 0;
 }
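
For readers following the size check, the arithmetic in
netdev_linux_prepend_vnet_hdr() can be sketched in isolation roughly as
below. This is an illustration only and not part of the patch: the
sketch_* names are hypothetical, the constants 14 and 4 stand in for
ETH_HEADER_LEN and VLAN_HEADER_LEN, and the header lengths are assumed
to be already known to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for OVS's ETH_HEADER_LEN and VLAN_HEADER_LEN. */
enum { SKETCH_ETH_HEADER_LEN = 14, SKETCH_VLAN_HEADER_LEN = 4 };

/* Hypothetical helper, not in the patch.  Returns true if a single TSO
 * segment carrying 'tso_segsz' payload bytes, plus its L2/L3/TCP headers,
 * fits in the largest frame the egress 'mtu' allows. */
bool
sketch_tso_segment_fits(int l2_l3_len, int tcp_hdr_len, uint16_t tso_segsz,
                        int mtu)
{
    int hdr_len = l2_l3_len + tcp_hdr_len;
    int max_packet_len = mtu + SKETCH_ETH_HEADER_LEN + SKETCH_VLAN_HEADER_LEN;

    return hdr_len + tso_segsz <= max_packet_len;
}

/* Hypothetical helper, not in the patch.  Segment size as the code derived
 * it before this change: from the egress MTU alone, ignoring whatever
 * size the sender provided. */
int
sketch_old_gso_size(int mtu, int hdr_len)
{
    return mtu - hdr_len;
}
```

With a 1500-byte egress MTU and 54 bytes of headers (14 Ethernet, 20
IPv4, 20 TCP), a provided segment size of 1448 passes the check, while
a segment size of 8948 coming from a 9000-byte-MTU port does not. The
second case corresponds to the EMSGSIZE drop path in the patch.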