From patchwork Thu Sep 22 12:52:08 2022
X-Patchwork-Submitter: dheeraj
X-Patchwork-Id: 1681097
From: dheeraj
Reply-To: Dheeraj Kumar
To: ovs-dev@openvswitch.org
Cc: Dheeraj Kumar
Date: Thu, 22 Sep 2022 18:22:08 +0530
Message-Id: <20220922125208.4769-1-dheeraj.k@acldigital.com>
X-Mailer: git-send-email 2.17.1
Subject: [ovs-dev] [PATCH] dpif-netdev: Optimize flushing of output packet buffers

Problem Statement:
Before OVS 2.12 the OVS-DPDK datapath transmitted processed rx packet
batches directly to the wanted tx queues. In OVS 2.12 each PMD stores
the processed packets in an intermediate buffer per output port and
flushes these output buffers in a separate step. This buffering was
introduced to allow better batching of packets for transmit.

The current implementation of the function that flushes the output
buffers performs a full scan over all output ports, even if only a
single packet was buffered. In systems with hundreds of ports this can
take a long time and degrades OVS-DPDK performance significantly.

Solution:
Maintain a list of output ports with buffered packets for each PMD
thread and only iterate over that list when flushing output buffers.
Furthermore, when output packet batching has been enabled with a
tx-flush-interval and increased jitter (in the microsecond range) is
not a concern, defer the flushing of packet buffers to a single
invocation at the end of each PMD loop iteration.
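For illustration, here is a minimal standalone sketch of the
intrusive-list pattern the patch relies on. It is not OVS code:
'list_node', 'list_push_front', etc. are simplified stand-ins for
'struct ovs_list' and its ovs_list_* helpers, and 'struct port' stands
in for 'struct tx_port'. It shows why enqueueing a port when its first
packet is buffered, and unlinking it after a flush, are both O(1), and
why a flush visits only ports that actually have pending packets:

    /* Sketch only: simplified stand-ins for OVS's intrusive list. */
    #include <stddef.h>
    #include <stdio.h>

    struct list_node {
        struct list_node *prev, *next;
    };

    struct port {
        int port_no;
        int buffered_pkts;
        struct list_node pending;   /* Embedded node, like 'pending_tx'. */
    };

    static void list_init(struct list_node *head)
    {
        head->prev = head->next = head;
    }

    static void list_push_front(struct list_node *head, struct list_node *n)
    {
        n->next = head->next;
        n->prev = head;
        head->next->prev = n;
        head->next = n;
    }

    static void list_remove(struct list_node *n)
    {
        n->prev->next = n->next;
        n->next->prev = n->prev;
    }

    #define CONTAINER_OF(ptr, type, member) \
        ((type *) ((char *) (ptr) - offsetof(type, member)))

    int main(void)
    {
        struct list_node pending_tx_ports;
        struct port ports[3] = {
            { .port_no = 1, .buffered_pkts = 4 },
            { .port_no = 2, .buffered_pkts = 0 },
            { .port_no = 3, .buffered_pkts = 9 },
        };

        list_init(&pending_tx_ports);

        /* Only ports that buffered packets get enqueued, mirroring
         * dp_execute_output_action(). */
        for (int i = 0; i < 3; i++) {
            if (ports[i].buffered_pkts) {
                list_push_front(&pending_tx_ports, &ports[i].pending);
            }
        }

        /* The flush walks only the pending list, not all ports; saving
         * the successor first allows safe removal while iterating, as
         * LIST_FOR_EACH_SAFE does. */
        struct list_node *n = pending_tx_ports.next;
        while (n != &pending_tx_ports) {
            struct list_node *next = n->next;
            struct port *p = CONTAINER_OF(n, struct port, pending);
            printf("flushing %d pkts on port %d\n",
                   p->buffered_pkts, p->port_no);
            p->buffered_pkts = 0;
            list_remove(&p->pending);
            n = next;
        }
        return 0;
    }

With this pattern the flush cost scales with the number of ports that
buffered traffic in the current iteration rather than with the total
port count.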
Signed-off-by: Dheeraj Kumar
---
 lib/dpif-netdev-private-thread.h |  7 +++---
 lib/dpif-netdev.c                | 39 ++++++++++++++++++++------------
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h
index 4472b199d..2775e1a2b 100644
--- a/lib/dpif-netdev-private-thread.h
+++ b/lib/dpif-netdev-private-thread.h
@@ -185,9 +185,6 @@ struct dp_netdev_pmd_thread {
      * than 'cmap_count(dp->poll_threads)'. */
     uint32_t static_tx_qid;
 
-    /* Number of filled output batches. */
-    int n_output_batches;
-
     struct ovs_mutex port_mutex;    /* Mutex for 'poll_list' and 'tx_ports'. */
     /* List of rx queues to poll. */
     struct hmap poll_list OVS_GUARDED;
@@ -213,6 +210,10 @@ struct dp_netdev_pmd_thread {
     struct hmap tnl_port_cache;
     struct hmap send_port_cache;
 
+    /* Keep track of the ports with buffered output packets in
+     * send_port_cache. */
+    struct ovs_list pending_tx_ports;
+
     /* Keep track of detailed PMD performance statistics. */
     struct pmd_perf_stats perf_stats;
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a45b46014..3c2cd6cbc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -500,6 +500,7 @@ struct tx_port {
     int qid;
     long long last_used;
     struct hmap_node node;
+    struct ovs_list pending_tx;       /* Only used in send_port_cache. */
     long long flush_time;
     struct dp_packet_batch output_pkts;
     struct dp_packet_batch *txq_pkts; /* Only for hash mode. */
@@ -5241,8 +5242,9 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
     atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval);
     p->flush_time = pmd->ctx.now + tx_flush_interval;
 
-    ovs_assert(pmd->n_output_batches > 0);
-    pmd->n_output_batches--;
+    /* Remove send port from pending port list. */
+    ovs_assert(!ovs_list_is_empty(&p->pending_tx));
+    ovs_list_remove(&p->pending_tx);
 
     pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_PKTS, output_cnt);
     pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_BATCHES, 1);
@@ -5264,16 +5266,11 @@ static int
 dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
                                    bool force)
 {
-    struct tx_port *p;
+    struct tx_port *p, *next;
     int output_cnt = 0;
 
-    if (!pmd->n_output_batches) {
-        return 0;
-    }
-
-    HMAP_FOR_EACH (p, node, &pmd->send_port_cache) {
-        if (!dp_packet_batch_is_empty(&p->output_pkts)
-            && (force || pmd->ctx.now >= p->flush_time)) {
+    LIST_FOR_EACH_SAFE (p, next, pending_tx, &pmd->pending_tx_ports) {
+        if (force || pmd->ctx.now >= p->flush_time) {
             output_cnt += dp_netdev_pmd_flush_output_on_port(pmd, p);
         }
     }
@@ -5292,6 +5289,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
     int batch_cnt = 0;
     int rem_qlen = 0, *qlen_p = NULL;
     uint64_t cycles;
+    uint64_t tx_flush_interval;
 
     /* Measure duration for polling and processing rx burst. */
     cycle_timer_start(&pmd->perf_stats, &timer);
@@ -5333,7 +5331,12 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
         cycles = cycle_timer_stop(&pmd->perf_stats, &timer);
         dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles);
 
-        dp_netdev_pmd_flush_output_packets(pmd, false);
+        /* Defer flushing of tx buffers if tx packet batching is enabled. */
+        atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval);
+        if (tx_flush_interval == 0) {
+            dp_netdev_pmd_flush_output_packets(pmd, false);
+        }
+
     } else {
         /* Discard cycles. */
         cycle_timer_stop(&pmd->perf_stats, &timer);
@@ -6803,6 +6806,7 @@ pmd_load_cached_ports(struct dp_netdev_pmd_thread *pmd)
                    n_txq * sizeof *tx_port->txq_pkts);
             tx_port_cached->txq_pkts = txq_pkts_cached;
         }
+        ovs_list_init(&tx_port_cached->pending_tx);
         hmap_insert(&pmd->send_port_cache, &tx_port_cached->node,
                     hash_port_no(tx_port_cached->port->port_no));
     }
@@ -6877,6 +6881,7 @@ pmd_thread_main(void *f_)
     int poll_cnt;
     int i;
     int process_packets = 0;
+    uint32_t tx_flush_interval;
 
     poll_list = NULL;
 
@@ -6890,6 +6895,7 @@ pmd_thread_main(void *f_)
 
 reload:
     atomic_count_init(&pmd->pmd_overloaded, 0);
+    atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval);
 
     if (!dpdk_attached) {
         dpdk_attached = dpdk_attach_thread(pmd->core_id);
@@ -6962,7 +6968,6 @@ reload:
 
         if (!rx_packets) {
             /* We didn't receive anything in the process loop.
-             * Check if we need to send something.
              * There was no time updates on current iteration. */
             pmd_thread_ctx_time_update(pmd);
             tx_packets = dp_netdev_pmd_flush_output_packets(pmd, false);
@@ -6978,6 +6983,11 @@ reload:
             }
         }
 
+        if (tx_flush_interval > 0) {
+            /* Try to flush port output buffers. */
+            tx_packets = dp_netdev_pmd_flush_output_packets(pmd, false);
+        }
+
         if (lc++ > 1024) {
             lc = 0;
 
@@ -7447,7 +7457,6 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     pmd->core_id = core_id;
     pmd->numa_id = numa_id;
     pmd->need_reload = false;
-    pmd->n_output_batches = 0;
 
     ovs_refcount_init(&pmd->ref_cnt);
     atomic_init(&pmd->exit, false);
@@ -7474,6 +7483,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     hmap_init(&pmd->tnl_port_cache);
     hmap_init(&pmd->send_port_cache);
     cmap_init(&pmd->tx_bonds);
+    ovs_list_init(&pmd->pending_tx_ports);
 
     /* Initialize DPIF function pointer to the default configured version. */
     atomic_init(&pmd->netdev_input_func, dp_netdev_impl_get_default());
@@ -7498,6 +7508,7 @@ dp_netdev_destroy_pmd(struct dp_netdev_pmd_thread *pmd)
     struct dpcls *cls;
 
     dp_netdev_pmd_flow_flush(pmd);
+    ovs_list_poison(&pmd->pending_tx_ports);
     hmap_destroy(&pmd->send_port_cache);
     hmap_destroy(&pmd->tnl_port_cache);
     hmap_destroy(&pmd->tx_ports);
@@ -8714,7 +8725,7 @@ dp_execute_output_action(struct dp_netdev_pmd_thread *pmd,
         dp_netdev_pmd_flush_output_on_port(pmd, p);
     }
     if (dp_packet_batch_is_empty(&p->output_pkts)) {
-        pmd->n_output_batches++;
+        ovs_list_push_front(&pmd->pending_tx_ports, &p->pending_tx);
     }
 
     struct dp_packet *packet;
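
A usage note on the deferred-flush path: it only engages when tx packet
batching is enabled. Assuming the existing OVS knob documented for
dpif-netdev output batching, that is done per datapath with, e.g.:

    ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50

where the value is the flush interval in microseconds. With the default
of 0, the patch keeps flushing output buffers immediately after each rx
batch, as before.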