From patchwork Mon Feb 3 21:45:50 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Flavio Leitner
X-Patchwork-Id: 1233032
From: Flavio Leitner
To: dev@openvswitch.org
Date: Mon, 3 Feb 2020 18:45:50 -0300
Message-Id: <20200203214550.19320-1-fbl@sysclose.org>
X-Mailer: git-send-email 2.24.1
Cc: Ilya Maximets, Flavio Leitner, txfh2007
Subject: [ovs-dev] [PATCH v2] netdev-linux: Prepend the std packet in the TSO packet

TSO packets are usually close to 50k or 60k bytes long, so change the
approach in order to copy fewer bytes when receiving a packet from the
kernel.

Instead of extending the MTU-sized packet that was received and appending
the remaining TSO data from the TSO buffer, allocate a TSO packet with
enough headroom to prepend the std packet data.

Fixes: 29cf9c1b3b9c ("userspace: Add TCP Segmentation Offload support")
Suggested-by: Ben Pfaff
Signed-off-by: Flavio Leitner
---
 lib/dp-packet.c            |   8 +--
 lib/dp-packet.h            |   2 +
 lib/netdev-linux-private.h |   3 +-
 lib/netdev-linux.c         | 117 ++++++++++++++++++++++---------------
 4 files changed, 78 insertions(+), 52 deletions(-)

V2:
  - The TSO packet's tailroom depends on the headroom set up in
    netdev_linux_rxq_recv().
  - iov_len uses the packet's tailroom.

This patch depends on a previously posted patch to work:
Subject: netdev-linux-private: fix max length to be 16 bits
https://mail.openvswitch.org/pipermail/ovs-dev/2020-February/367469.html

With both patches applied, I can run iperf3 and scp in both directions
with good performance and no issues.

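
For readers who want to see the prepend idea in isolation, the following is
a minimal standalone sketch and not OVS code. toy_pkt, toy_push,
recv_prepend, STD_LEN and AUX_CAP are hypothetical names and sizes invented
for the illustration; only readv() and struct iovec are real POSIX
interfaces. It assumes the caller has set up the auxiliary buffer with
STD_LEN bytes of headroom, in the same spirit as the aux_bufs allocation in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical sizes for the demo: STD_LEN plays the role of the MTU-sized
 * buffer, AUX_CAP the room reserved for the rest of a large packet. */
enum { STD_LEN = 8, AUX_CAP = 64 };

/* Toy stand-in for a packet buffer (not struct dp_packet).  'base' must
 * point to at least STD_LEN + AUX_CAP bytes and 'data' to base + STD_LEN,
 * so that STD_LEN bytes of headroom sit in front of the data. */
struct toy_pkt {
    char *base;
    char *data;
    size_t size;
};

/* Prepends 'len' bytes by moving 'data' back into the headroom. */
static void
toy_push(struct toy_pkt *p, const void *src, size_t len)
{
    assert(len <= (size_t) (p->data - p->base));
    p->data -= len;
    p->size += len;
    memcpy(p->data, src, len);
}

/* Reads one message from 'fd' with a two-element iovec.  The first STD_LEN
 * bytes land in 'std_buf'; anything beyond that lands directly in 'aux'.
 * If the message overflowed 'std_buf', only those STD_LEN bytes are copied
 * in front of the tail, and 'aux' then holds the whole message.  Otherwise
 * 'aux' is left untouched and the message is in 'std_buf'.
 * Returns the readv() result. */
static ssize_t
recv_prepend(int fd, char *std_buf, struct toy_pkt *aux)
{
    struct iovec iov[2] = {
        { .iov_base = std_buf, .iov_len = STD_LEN },
        { .iov_base = aux->data, .iov_len = AUX_CAP },
    };
    ssize_t n = readv(fd, iov, 2);

    if (n > STD_LEN) {
        aux->size = (size_t) n - STD_LEN;
        toy_push(aux, std_buf, STD_LEN);
    }
    return n;
}
```

In this sketch the large tail is never copied: the kernel writes it straight
into its final location, and the only memcpy() moves the STD_LEN-sized head.
That is the trade-off the commit message describes, shown here with toy
sizes instead of an MTU-sized head and a 50k to 60k byte tail.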
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 8dfedcb7c..cd2623500 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -243,8 +243,8 @@ dp_packet_copy__(struct dp_packet *b, uint8_t *new_base,
 
 /* Reallocates 'b' so that it has exactly 'new_headroom' and 'new_tailroom'
  * bytes of headroom and tailroom, respectively. */
-static void
-dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
+void
+dp_packet_resize(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
 {
     void *new_base, *new_data;
     size_t new_allocated;
@@ -297,7 +297,7 @@ void
 dp_packet_prealloc_tailroom(struct dp_packet *b, size_t size)
 {
     if (size > dp_packet_tailroom(b)) {
-        dp_packet_resize__(b, dp_packet_headroom(b), MAX(size, 64));
+        dp_packet_resize(b, dp_packet_headroom(b), MAX(size, 64));
     }
 }
 
@@ -308,7 +308,7 @@ void
 dp_packet_prealloc_headroom(struct dp_packet *b, size_t size)
 {
     if (size > dp_packet_headroom(b)) {
-        dp_packet_resize__(b, MAX(size, 64), dp_packet_tailroom(b));
+        dp_packet_resize(b, MAX(size, 64), dp_packet_tailroom(b));
     }
 }
 
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 69ae5dfac..9a9d35183 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -152,6 +152,8 @@ struct dp_packet *dp_packet_clone_with_headroom(const struct dp_packet *,
 struct dp_packet *dp_packet_clone_data(const void *, size_t);
 struct dp_packet *dp_packet_clone_data_with_headroom(const void *, size_t,
                                                      size_t headroom);
+void dp_packet_resize(struct dp_packet *b, size_t new_headroom,
+                      size_t new_tailroom);
 static inline void dp_packet_delete(struct dp_packet *);
 
 static inline void *dp_packet_at(const struct dp_packet *, size_t offset,
diff --git a/lib/netdev-linux-private.h b/lib/netdev-linux-private.h
index be2d7b10b..c7c515f70 100644
--- a/lib/netdev-linux-private.h
+++ b/lib/netdev-linux-private.h
@@ -45,7 +45,8 @@ struct netdev_rxq_linux {
     struct netdev_rxq up;
     bool is_tap;
     int fd;
-    char *aux_bufs[NETDEV_MAX_BURST]; /* Batch of preallocated TSO buffers. */
+    struct dp_packet *aux_bufs[NETDEV_MAX_BURST]; /* Preallocated TSO
+                                                     packets. */
 };
 
 int netdev_linux_construct(struct netdev *);
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 6add3e2fc..c6f3d2740 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1052,15 +1052,6 @@ static struct netdev_rxq *
 netdev_linux_rxq_alloc(void)
 {
     struct netdev_rxq_linux *rx = xzalloc(sizeof *rx);
-    if (userspace_tso_enabled()) {
-        int i;
-
-        /* Allocate auxiliay buffers to receive TSO packets. */
-        for (i = 0; i < NETDEV_MAX_BURST; i++) {
-            rx->aux_bufs[i] = xmalloc(LINUX_RXQ_TSO_MAX_LEN);
-        }
-    }
-
     return &rx->up;
 }
 
@@ -1172,7 +1163,7 @@ netdev_linux_rxq_destruct(struct netdev_rxq *rxq_)
     }
 
     for (i = 0; i < NETDEV_MAX_BURST; i++) {
-        free(rx->aux_bufs[i]);
+        dp_packet_delete(rx->aux_bufs[i]);
     }
 }
 
@@ -1238,13 +1229,18 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
         virtio_net_hdr_size = 0;
     }
 
-    std_len = VLAN_ETH_HEADER_LEN + mtu + virtio_net_hdr_size;
+    /* The length here needs to be accounted in the same way when the
+     * aux_buf is allocated so that it can be prepended to TSO buffer. */
+    std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
     for (i = 0; i < NETDEV_MAX_BURST; i++) {
         buffers[i] = dp_packet_new_with_headroom(std_len, DP_NETDEV_HEADROOM);
         iovs[i][IOV_PACKET].iov_base = dp_packet_data(buffers[i]);
         iovs[i][IOV_PACKET].iov_len = std_len;
-        iovs[i][IOV_AUXBUF].iov_base = rx->aux_bufs[i];
-        iovs[i][IOV_AUXBUF].iov_len = LINUX_RXQ_TSO_MAX_LEN;
+        if (iovlen == IOV_TSO_SIZE) {
+            iovs[i][IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
+            iovs[i][IOV_AUXBUF].iov_len = dp_packet_tailroom(rx->aux_bufs[i]);
+        }
+
         mmsgs[i].msg_hdr.msg_name = NULL;
         mmsgs[i].msg_hdr.msg_namelen = 0;
         mmsgs[i].msg_hdr.msg_iov = iovs[i];
@@ -1268,6 +1264,8 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
     }
 
     for (i = 0; i < retval; i++) {
+        struct dp_packet *pkt;
+
         if (mmsgs[i].msg_len < ETH_HEADER_LEN) {
             struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
             struct netdev_linux *netdev = netdev_linux_cast(netdev_);
@@ -1280,29 +1278,29 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
         }
 
         if (mmsgs[i].msg_len > std_len) {
-            /* Build a single linear TSO packet by expanding the current packet
-             * to append the data received in the aux_buf. */
-            size_t extra_len = mmsgs[i].msg_len - std_len;
-
-            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
-                               + std_len);
-            dp_packet_prealloc_tailroom(buffers[i], extra_len);
-            memcpy(dp_packet_tail(buffers[i]), rx->aux_bufs[i], extra_len);
-            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
-                               + extra_len);
-        } else {
-            dp_packet_set_size(buffers[i], dp_packet_size(buffers[i])
-                               + mmsgs[i].msg_len);
-        }
+            /* Build a single linear TSO packet by prepending the data from
+             * std_len buffer to the aux_buf. */
+            pkt = rx->aux_bufs[i];
+            dp_packet_set_size(pkt, mmsgs[i].msg_len - std_len);
+            dp_packet_push(pkt, dp_packet_data(buffers[i]), std_len);
+            /* The headroom should be the same in buffers[i], pkt and
+             * DP_NETDEV_HEADROOM. */
+            dp_packet_resize(pkt, DP_NETDEV_HEADROOM, 0);
+            dp_packet_delete(buffers[i]);
+            rx->aux_bufs[i] = NULL;
+        } else {
+            dp_packet_set_size(buffers[i], mmsgs[i].msg_len);
+            pkt = buffers[i];
+        }
 
-        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(buffers[i])) {
+        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(pkt)) {
             struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
             struct netdev_linux *netdev = netdev_linux_cast(netdev_);
 
             /* Unexpected error situation: the virtio header is not present
              * or corrupted. Drop the packet but continue in case next ones
              * are correct. */
-            dp_packet_delete(buffers[i]);
+            dp_packet_delete(pkt);
             netdev->rx_dropped += 1;
             VLOG_WARN_RL(&rl, "%s: Dropped packet: Invalid virtio net header",
                          netdev_get_name(netdev_));
@@ -1325,16 +1323,16 @@ netdev_linux_batch_rxq_recv_sock(struct netdev_rxq_linux *rx, int mtu,
                 struct eth_header *eth;
                 bool double_tagged;
 
-                eth = dp_packet_data(buffers[i]);
+                eth = dp_packet_data(pkt);
                 double_tagged = eth->eth_type == htons(ETH_TYPE_VLAN_8021Q);
 
-                eth_push_vlan(buffers[i],
+                eth_push_vlan(pkt,
                               auxdata_to_vlan_tpid(aux, double_tagged),
                               htons(aux->tp_vlan_tci));
                 break;
             }
         }
-        dp_packet_batch_add(batch, buffers[i]);
+        dp_packet_batch_add(batch, pkt);
     }
 
     /* Delete unused buffers. */
@@ -1354,7 +1352,6 @@ static int
 netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
                                 struct dp_packet_batch *batch)
 {
-    struct dp_packet *buffer;
     int virtio_net_hdr_size;
     ssize_t retval;
     size_t std_len;
@@ -1372,16 +1369,22 @@ netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
         virtio_net_hdr_size = 0;
     }
 
-    std_len = VLAN_ETH_HEADER_LEN + mtu + virtio_net_hdr_size;
+    /* The length here needs to be accounted in the same way when the
+     * aux_buf is allocated so that it can be prepended to TSO buffer. */
+    std_len = virtio_net_hdr_size + VLAN_ETH_HEADER_LEN + mtu;
     for (i = 0; i < NETDEV_MAX_BURST; i++) {
+        struct dp_packet *buffer;
+        struct dp_packet *pkt;
         struct iovec iov[IOV_TSO_SIZE];
 
         /* Assume Ethernet port. No need to set packet_type. */
         buffer = dp_packet_new_with_headroom(std_len, DP_NETDEV_HEADROOM);
         iov[IOV_PACKET].iov_base = dp_packet_data(buffer);
         iov[IOV_PACKET].iov_len = std_len;
-        iov[IOV_AUXBUF].iov_base = rx->aux_bufs[i];
-        iov[IOV_AUXBUF].iov_len = LINUX_RXQ_TSO_MAX_LEN;
+        if (iovlen == IOV_TSO_SIZE) {
+            iov[IOV_AUXBUF].iov_base = dp_packet_data(rx->aux_bufs[i]);
+            iov[IOV_AUXBUF].iov_len = dp_packet_tailroom(rx->aux_bufs[i]);
+        }
 
         do {
             retval = readv(rx->fd, iov, iovlen);
@@ -1393,33 +1396,36 @@ netdev_linux_batch_rxq_recv_tap(struct netdev_rxq_linux *rx, int mtu,
         }
 
         if (retval > std_len) {
-            /* Build a single linear TSO packet by expanding the current packet
-             * to append the data received in the aux_buf. */
-            size_t extra_len = retval - std_len;
-
-            dp_packet_set_size(buffer, dp_packet_size(buffer) + std_len);
-            dp_packet_prealloc_tailroom(buffer, extra_len);
-            memcpy(dp_packet_tail(buffer), rx->aux_bufs[i], extra_len);
-            dp_packet_set_size(buffer, dp_packet_size(buffer) + extra_len);
+            /* Build a single linear TSO packet by prepending the data from
+             * std_len buffer to the aux_buf. */
+            pkt = rx->aux_bufs[i];
+            dp_packet_set_size(pkt, retval - std_len);
+            dp_packet_push(pkt, dp_packet_data(buffer), std_len);
+            /* The headroom should be the same in buffers[i], pkt and
+             * DP_NETDEV_HEADROOM. */
+            dp_packet_resize(pkt, DP_NETDEV_HEADROOM, 0);
+            dp_packet_delete(buffer);
+            rx->aux_bufs[i] = NULL;
         } else {
             dp_packet_set_size(buffer, dp_packet_size(buffer) + retval);
+            pkt = buffer;
         }
 
-        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(buffer)) {
+        if (virtio_net_hdr_size && netdev_linux_parse_vnet_hdr(pkt)) {
             struct netdev *netdev_ = netdev_rxq_get_netdev(&rx->up);
             struct netdev_linux *netdev = netdev_linux_cast(netdev_);
 
             /* Unexpected error situation: the virtio header is not present
              * or corrupted. Drop the packet but continue in case next ones
              * are correct. */
-            dp_packet_delete(buffer);
+            dp_packet_delete(pkt);
             netdev->rx_dropped += 1;
             VLOG_WARN_RL(&rl, "%s: Dropped packet: Invalid virtio net header",
                          netdev_get_name(netdev_));
             continue;
         }
 
-        dp_packet_batch_add(batch, buffer);
+        dp_packet_batch_add(batch, pkt);
     }
 
     if ((i == 0) && (retval < 0)) {
@@ -1442,6 +1448,23 @@ netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
         mtu = ETH_PAYLOAD_MAX;
     }
 
+    if (userspace_tso_enabled()) {
+        /* Allocate TSO packets. The packet has enough headroom to store
+         * a full non-TSO packet. When a TSO packet is received, the data
+         * from non-TSO buffer (std_len) is prepended to the TSO packet
+         * (aux_buf). */
+        size_t std_len = sizeof(struct virtio_net_hdr) + VLAN_ETH_HEADER_LEN
+                         + DP_NETDEV_HEADROOM + mtu;
+        size_t data_len = LINUX_RXQ_TSO_MAX_LEN - std_len;
+        for (int i = 0; i < NETDEV_MAX_BURST; i++) {
+            if (rx->aux_bufs[i]) {
+                continue;
+            }
+
+            rx->aux_bufs[i] = dp_packet_new_with_headroom(data_len, std_len);
+        }
+    }
+
     dp_packet_batch_init(batch);
     retval = (rx->is_tap
               ? netdev_linux_batch_rxq_recv_tap(rx, mtu, batch)