From patchwork Fri Jan 12 11:17:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 859768
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Cc: Heetae Ahn, Ilya Maximets
Date: Fri, 12 Jan 2018 14:17:06 +0300
Message-id: <1515755828-1848-4-git-send-email-i.maximets@samsung.com>
In-reply-to: <1515755828-1848-1-git-send-email-i.maximets@samsung.com>
References: <1515755828-1848-1-git-send-email-i.maximets@samsung.com>
X-Mailer: git-send-email 2.7.4
Subject: [ovs-dev] [PATCH v10 3/5] dpif-netdev: Time based output batching.

This allows collecting packets from more than one RX burst
and sending them together at a configurable interval.

'other_config:tx-flush-interval' can be used to configure
the time that a packet can wait in an output batch before
being sent.

'tx-flush-interval' has microsecond resolution.

Signed-off-by: Ilya Maximets
Tested-by: Jan Scheurich
Acked-by: Jan Scheurich
---
 lib/dpif-netdev.c    | 108 ++++++++++++++++++++++++++++++++++++++++-----------
 vswitchd/vswitch.xml |  16 ++++++++
 2 files changed, 102 insertions(+), 22 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 6909a03..f0ba2b2 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -87,6 +87,9 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 #define MAX_RECIRC_DEPTH 6
 DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0)
 
+/* Use instant packet send by default. */
+#define DEFAULT_TX_FLUSH_INTERVAL 0
+
 /* Configuration parameters. */
 enum { MAX_FLOWS = 65536 };     /* Maximum number of flows in flow table. */
 enum { MAX_METERS = 65536 };    /* Maximum number of meters. */
@@ -273,6 +276,9 @@ struct dp_netdev {
     struct hmap ports;
     struct seq *port_seq;       /* Incremented whenever a port changes. */
 
+    /* The time that a packet can wait in output batch for sending. */
+    atomic_uint32_t tx_flush_interval;
+
     /* Meters. */
     struct ovs_mutex meter_locks[N_METER_LOCKS];
     struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
@@ -500,6 +506,7 @@ struct tx_port {
     int qid;
     long long last_used;
     struct hmap_node node;
+    long long flush_time;
     struct dp_packet_batch output_pkts;
     struct dp_netdev_rxq *output_pkts_rxqs[NETDEV_MAX_BURST];
 };
@@ -581,6 +588,9 @@ struct dp_netdev_pmd_thread {
      * than 'cmap_count(dp->poll_threads)'. */
     uint32_t static_tx_qid;
 
+    /* Number of filled output batches. */
+    int n_output_batches;
+
     struct ovs_mutex port_mutex;    /* Mutex for 'poll_list' and 'tx_ports'. */
     /* List of rx queues to poll. */
     struct hmap poll_list OVS_GUARDED;
@@ -670,8 +680,9 @@ static void dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd,
 static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
                                        struct rxq_poll *poll)
     OVS_REQUIRES(pmd->port_mutex);
-static void
-dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd);
+static int
+dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
+                                   bool force);
 
 static void reconfigure_datapath(struct dp_netdev *dp)
     OVS_REQUIRES(dp->port_mutex);
@@ -1245,6 +1256,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class,
     conntrack_init(&dp->conntrack);
 
     atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN);
+    atomic_init(&dp->tx_flush_interval, DEFAULT_TX_FLUSH_INTERVAL);
 
     cmap_init(&dp->poll_threads);
 
@@ -2911,7 +2923,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
     dp_packet_batch_init_packet(&pp, execute->packet);
     dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
                               execute->actions, execute->actions_len);
-    dp_netdev_pmd_flush_output_packets(pmd);
+    dp_netdev_pmd_flush_output_packets(pmd, true);
 
     if (pmd->core_id == NON_PMD_CORE_ID) {
         ovs_mutex_unlock(&dp->non_pmd_mutex);
@@ -2960,6 +2972,16 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
         smap_get_ullong(other_config, "emc-insert-inv-prob",
                         DEFAULT_EM_FLOW_INSERT_INV_PROB);
     uint32_t insert_min, cur_min;
+    uint32_t tx_flush_interval, cur_tx_flush_interval;
+
+    tx_flush_interval = smap_get_int(other_config, "tx-flush-interval",
+                                     DEFAULT_TX_FLUSH_INTERVAL);
+    atomic_read_relaxed(&dp->tx_flush_interval, &cur_tx_flush_interval);
+    if (tx_flush_interval != cur_tx_flush_interval) {
+        atomic_store_relaxed(&dp->tx_flush_interval, tx_flush_interval);
+        VLOG_INFO("Flushing interval for tx queues set to %"PRIu32" us",
+                  tx_flush_interval);
+    }
 
     if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
         free(dp->pmd_cmask);
@@ -3154,7 +3176,7 @@ dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx)
     return processing_cycles;
 }
 
-static void
+static int
 dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
                                    struct tx_port *p)
 {
@@ -3164,6 +3186,7 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
     bool dynamic_txqs;
     struct cycle_timer timer;
     uint64_t cycles;
+    uint32_t tx_flush_interval;
 
     cycle_timer_start(&pmd->perf_stats, &timer);
 
@@ -3180,6 +3203,13 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
     netdev_send(p->port->netdev, tx_qid, &p->output_pkts, dynamic_txqs);
     dp_packet_batch_init(&p->output_pkts);
 
+    /* Update time of the next flush. */
+    atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval);
+    p->flush_time = pmd->ctx.now + tx_flush_interval;
+
+    ovs_assert(pmd->n_output_batches > 0);
+    pmd->n_output_batches--;
+
     pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_PKTS, output_cnt);
     pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_BATCHES, 1);
 
@@ -3191,18 +3221,28 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
                                      RXQ_CYCLES_PROC_CURR, cycles);
         }
     }
+
+    return output_cnt;
 }
 
-static void
-dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd)
+static int
+dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
+                                   bool force)
 {
     struct tx_port *p;
+    int output_cnt = 0;
+
+    if (!pmd->n_output_batches) {
+        return 0;
+    }
 
     HMAP_FOR_EACH (p, node, &pmd->send_port_cache) {
-        if (!dp_packet_batch_is_empty(&p->output_pkts)) {
-            dp_netdev_pmd_flush_output_on_port(pmd, p);
+        if (!dp_packet_batch_is_empty(&p->output_pkts)
+            && (force || pmd->ctx.now >= p->flush_time)) {
+            output_cnt += dp_netdev_pmd_flush_output_on_port(pmd, p);
         }
     }
+    return output_cnt;
 }
 
 static int
@@ -3213,7 +3253,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
     struct dp_packet_batch batch;
     struct cycle_timer timer;
     int error;
-    int batch_cnt = 0;
+    int batch_cnt = 0, output_cnt = 0;
     uint64_t cycles;
 
     /* Measure duration for polling and processing rx burst. */
@@ -3235,7 +3275,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
         cycles = cycle_timer_stop(&pmd->perf_stats, &timer);
         dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles);
 
-        dp_netdev_pmd_flush_output_packets(pmd);
+        output_cnt = dp_netdev_pmd_flush_output_packets(pmd, false);
     } else {
         /* Discard cycles. */
         cycle_timer_stop(&pmd->perf_stats, &timer);
@@ -3249,7 +3289,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
 
     pmd->ctx.last_rxq = NULL;
 
-    return batch_cnt;
+    return batch_cnt + output_cnt;
 }
 
 static struct tx_port *
@@ -3872,6 +3912,7 @@ dpif_netdev_run(struct dpif *dpif)
     struct dp_netdev *dp = get_dp_netdev(dpif);
     struct dp_netdev_pmd_thread *non_pmd;
     uint64_t new_tnl_seq;
+    bool need_to_flush = true;
 
     ovs_mutex_lock(&dp->port_mutex);
     non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
@@ -3882,13 +3923,22 @@ dpif_netdev_run(struct dpif *dpif)
                 int i;
 
                 for (i = 0; i < port->n_rxq; i++) {
-                    dp_netdev_process_rxq_port(non_pmd,
-                                               &port->rxqs[i],
-                                               port->port_no);
+                    if (dp_netdev_process_rxq_port(non_pmd,
+                                                   &port->rxqs[i],
+                                                   port->port_no)) {
+                        need_to_flush = false;
+                    }
                 }
             }
         }
-        pmd_thread_ctx_time_update(non_pmd);
+        if (need_to_flush) {
+            /* We didn't receive anything in the process loop.
+             * Check if we need to send something.
+             * There was no time updates on current iteration. */
+            pmd_thread_ctx_time_update(non_pmd);
+            dp_netdev_pmd_flush_output_packets(non_pmd, false);
+        }
+
         dpif_netdev_xps_revalidate_pmd(non_pmd, false);
         ovs_mutex_unlock(&dp->non_pmd_mutex);
 
@@ -3939,6 +3989,8 @@ pmd_free_cached_ports(struct dp_netdev_pmd_thread *pmd)
 {
     struct tx_port *tx_port_cached;
 
+    /* Flush all the queued packets. */
+    dp_netdev_pmd_flush_output_packets(pmd, true);
     /* Free all used tx queue ids. */
     dpif_netdev_xps_revalidate_pmd(pmd, true);
 
@@ -4069,6 +4121,7 @@ reload:
     cycles_counter_update(s);
     for (;;) {
         uint64_t iter_packets = 0;
+
         pmd_perf_start_iteration(s);
         for (i = 0; i < poll_cnt; i++) {
             process_packets =
@@ -4077,15 +4130,20 @@ reload:
             iter_packets += process_packets;
         }
 
+        if (!iter_packets) {
+            /* We didn't receive anything in the process loop.
+             * Check if we need to send something.
+             * There was no time updates on current iteration. */
+            pmd_thread_ctx_time_update(pmd);
+            iter_packets += dp_netdev_pmd_flush_output_packets(pmd, false);
+        }
+
         if (lc++ > 1024) {
             bool reload;
 
             lc = 0;
 
             coverage_try_clear();
-            /* It's possible that the time was not updated on current
-             * iteration, if there were no received packets. */
-            pmd_thread_ctx_time_update(pmd);
             dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt);
             if (!ovsrcu_try_quiesce()) {
                 emc_cache_slow_sweep(&pmd->flow_cache);
@@ -4524,6 +4582,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     pmd->core_id = core_id;
     pmd->numa_id = numa_id;
     pmd->need_reload = false;
+    pmd->n_output_batches = 0;
 
     ovs_refcount_init(&pmd->ref_cnt);
     latch_init(&pmd->exit_latch);
@@ -4713,6 +4772,7 @@ dp_netdev_add_port_tx_to_pmd(struct dp_netdev_pmd_thread *pmd,
 
     tx->port = port;
     tx->qid = -1;
+    tx->flush_time = 0LL;
     dp_packet_batch_init(&tx->output_pkts);
 
     hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no));
@@ -5407,12 +5467,16 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
                 dp_netdev_pmd_flush_output_on_port(pmd, p);
             }
 #endif
-            if (OVS_UNLIKELY(dp_packet_batch_size(&p->output_pkts)
-                       + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST)) {
-                /* Some packets was generated while input batch processing.
-                 * Flush here to avoid overflow. */
+            if (dp_packet_batch_size(&p->output_pkts)
+                + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST) {
+                /* Flush here to avoid overflow. */
                 dp_netdev_pmd_flush_output_on_port(pmd, p);
             }
+
+            if (dp_packet_batch_is_empty(&p->output_pkts)) {
+                pmd->n_output_batches++;
+            }
+
             DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
                 p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
                                                              pmd->ctx.last_rxq;
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 58c0ebd..61fb7b1 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -359,6 +359,22 @@

+ +

+          Specifies the time in microseconds that a packet can wait in an
+          output batch before being sent, i.e. the amount of time that a
+          packet can spend in an intermediate output queue before it is sent
+          to the netdev.
+          This option can be used to configure the balance between throughput
+          and latency. Lower values decrease latency while higher values
+          may be useful to achieve higher performance.

+

+ Defaults to 0 i.e. instant packet sending (latency optimized). +

+
+
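
Editorial note: the flush rule added by this patch can be read as a small standalone model. The sketch below is illustrative only and is not code from the patch. The names `port_sketch`, `should_flush` and `flush_port` are hypothetical, and the real implementation keeps this state in `struct tx_port` and `struct dp_netdev_pmd_thread`. It assumes times are expressed in microseconds, as the commit message states for 'tx-flush-interval'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port state touched by the patch. */
struct port_sketch {
    int n_pkts;            /* Packets waiting in the output batch. */
    long long flush_time;  /* Earliest time (us) of the next timed flush. */
};

/* Same condition as in dp_netdev_pmd_flush_output_packets(): a non-empty
 * batch is sent if the caller forces it or its flush time has arrived. */
static bool
should_flush(const struct port_sketch *p, long long now, bool force)
{
    return p->n_pkts > 0 && (force || now >= p->flush_time);
}

/* Models the bookkeeping in dp_netdev_pmd_flush_output_on_port(): empty
 * the batch and schedule the next timed flush.  Returns packets sent. */
static int
flush_port(struct port_sketch *p, long long now, uint32_t interval_us)
{
    int sent = p->n_pkts;

    assert(sent > 0);                    /* Only non-empty batches flush. */
    p->n_pkts = 0;
    p->flush_time = now + interval_us;   /* interval 0: always due. */
    return sent;
}
```

In this model, with an interval of 50 us, a port flushed at time 100 is not flushed again by a non-forced call until time 150. A forced call, as used in dpif_netdev_execute() and pmd_free_cached_ports(), sends whatever is queued immediately. With the default interval of 0 the flush time never lies in the future, so a non-forced call also sends on the first check.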