From patchwork Fri Dec 1 15:44:32 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 843557
X-Patchwork-Delegate: ian.stokes@intel.com
From: Ilya Maximets
To: ovs-dev@openvswitch.org, Bhanuprakash Bodireddy
Cc: Heetae Ahn, Ilya Maximets
Date: Fri, 01 Dec 2017 18:44:32 +0300
Message-id: <1512143073-22347-7-git-send-email-i.maximets@samsung.com>
In-reply-to: <1512143073-22347-1-git-send-email-i.maximets@samsung.com>
References: <1512143073-22347-1-git-send-email-i.maximets@samsung.com>
X-Mailer: git-send-email 2.7.4
Subject: [ovs-dev] [PATCH v6 6/7] dpif-netdev: Time based output batching.

This allows collecting packets from more than one RX burst and sending
them together at a configurable interval.

'other_config:tx-flush-interval' can be used to configure the time that
a packet can wait in an output batch before being sent.

dpif-netdev is switched to microsecond resolution for time measurement
to ensure the desired resolution of 'tx-flush-interval'.

Signed-off-by: Ilya Maximets
---
 lib/dpif-netdev.c    | 151 +++++++++++++++++++++++++++++++++++++++------------
 vswitchd/vswitch.xml |  16 ++++++
 2 files changed, 132 insertions(+), 35 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 68c35fa..da6b7b8 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -85,6 +85,9 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 #define MAX_RECIRC_DEPTH 6
 DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0)
 
+/* Use instant packet send by default. */
+#define DEFAULT_TX_FLUSH_INTERVAL 0
+
 /* Configuration parameters. */
 enum { MAX_FLOWS = 65536 };     /* Maximum number of flows in flow table. */
 enum { MAX_METERS = 65536 };    /* Maximum number of meters. */
@@ -178,12 +181,13 @@ struct emc_cache {
 
 /* Simple non-wildcarding single-priority classifier. */
 
-/* Time in ms between successive optimizations of the dpcls subtable vector */
-#define DPCLS_OPTIMIZATION_INTERVAL 1000
+/* Time in microseconds between successive optimizations of the dpcls
+ * subtable vector */
+#define DPCLS_OPTIMIZATION_INTERVAL 1000000LL
 
-/* Time in ms of the interval in which rxq processing cycles used in
- * rxq to pmd assignments is measured and stored. */
-#define PMD_RXQ_INTERVAL_LEN 10000
+/* Time in microseconds of the interval in which rxq processing cycles used
+ * in rxq to pmd assignments is measured and stored. */
+#define PMD_RXQ_INTERVAL_LEN 10000000LL
 
 /* Number of intervals for which cycles are stored
  * and used during rxq to pmd assignment. */
@@ -270,6 +274,9 @@ struct dp_netdev {
     struct hmap ports;
     struct seq *port_seq;       /* Incremented whenever a port changes. */
 
+    /* The time that a packet can wait in output batch for sending. */
+    atomic_uint32_t tx_flush_interval;
+
     /* Meters. */
     struct ovs_mutex meter_locks[N_METER_LOCKS];
     struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
@@ -356,7 +363,7 @@ enum rxq_cycles_counter_type {
     RXQ_N_CYCLES
 };
 
-#define XPS_TIMEOUT_MS 500LL
+#define XPS_TIMEOUT 500000LL    /* In microseconds. */
 
 /* Contained by struct dp_netdev_port's 'rxqs' member. */
 struct dp_netdev_rxq {
@@ -526,6 +533,7 @@ struct tx_port {
     int qid;
     long long last_used;
     struct hmap_node node;
+    long long flush_time;
     struct dp_packet_batch output_pkts;
 };
 
@@ -627,10 +635,13 @@ struct dp_netdev_pmd_thread {
          * less than 'cmap_count(dp->poll_threads)'. */
         uint32_t static_tx_qid;
 
+        /* Number of filled output batches. */
+        int n_output_batches;
+
         unsigned core_id;           /* CPU core id of this pmd thread. */
         int numa_id;                /* numa node id of this pmd thread. */
 
-        /* 20 pad bytes. */
+        /* 16 pad bytes. */
     );
 
     PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -740,8 +751,9 @@ static void dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd,
 static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
                                        struct rxq_poll *poll)
     OVS_REQUIRES(pmd->port_mutex);
-static void
-dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd);
+static int
+dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
+                                   bool force);
 static void reconfigure_datapath(struct dp_netdev *dp)
     OVS_REQUIRES(dp->port_mutex);
 
@@ -832,7 +844,7 @@ emc_cache_slow_sweep(struct emc_cache *flow_cache)
 static inline void
 pmd_thread_ctx_time_update(struct dp_netdev_pmd_thread *pmd)
 {
-    pmd->ctx.now = time_msec();
+    pmd->ctx.now = time_usec();
 }
 
 /* Returns true if 'dpif' is a netdev or dummy dpif, false otherwise. */
@@ -1332,6 +1344,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class,
     conntrack_init(&dp->conntrack);
 
     atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN);
+    atomic_init(&dp->tx_flush_interval, DEFAULT_TX_FLUSH_INTERVAL);
 
     cmap_init(&dp->poll_threads);
 
@@ -2999,7 +3012,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
     dp_packet_batch_init_packet(&pp, execute->packet);
     dp_netdev_execute_actions(pmd, &pp, false, execute->flow,
                               execute->actions, execute->actions_len);
-    dp_netdev_pmd_flush_output_packets(pmd);
+    dp_netdev_pmd_flush_output_packets(pmd, true);
 
     if (pmd->core_id == NON_PMD_CORE_ID) {
         ovs_mutex_unlock(&dp->non_pmd_mutex);
@@ -3048,6 +3061,16 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
         smap_get_ullong(other_config, "emc-insert-inv-prob",
                         DEFAULT_EM_FLOW_INSERT_INV_PROB);
     uint32_t insert_min, cur_min;
+    uint32_t tx_flush_interval, cur_tx_flush_interval;
+
+    tx_flush_interval = smap_get_int(other_config, "tx-flush-interval",
+                                     DEFAULT_TX_FLUSH_INTERVAL);
+    atomic_read_relaxed(&dp->tx_flush_interval, &cur_tx_flush_interval);
+    if (tx_flush_interval != cur_tx_flush_interval) {
+        atomic_store_relaxed(&dp->tx_flush_interval, tx_flush_interval);
+        VLOG_INFO("Flushing interval for tx queues set to %"PRIu32" us",
+                  tx_flush_interval);
+    }
 
     if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
         free(dp->pmd_cmask);
@@ -3286,12 +3309,14 @@ dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx)
     return processing_cycles;
 }
 
-static void
+static int
 dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
                                    struct tx_port *p)
 {
     int tx_qid;
+    int output_cnt;
     bool dynamic_txqs;
+    uint32_t tx_flush_interval;
 
     dynamic_txqs = p->port->dynamic_txqs;
     if (dynamic_txqs) {
@@ -3300,20 +3325,40 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
         tx_qid = pmd->static_tx_qid;
     }
 
+    output_cnt = dp_packet_batch_size(&p->output_pkts);
+    ovs_assert(output_cnt > 0);
+
     netdev_send(p->port->netdev, tx_qid, &p->output_pkts, dynamic_txqs);
     dp_packet_batch_init(&p->output_pkts);
+
+    /* Update time of the next flush. */
+    atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval);
+    p->flush_time = pmd->ctx.now + tx_flush_interval;
+
+    ovs_assert(pmd->n_output_batches > 0);
+    pmd->n_output_batches--;
+
+    return output_cnt;
 }
 
-static void
-dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd)
+static int
+dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd,
+                                   bool force)
 {
     struct tx_port *p;
+    int output_cnt = 0;
+
+    if (!pmd->n_output_batches) {
+        return 0;
+    }
 
     HMAP_FOR_EACH (p, node, &pmd->send_port_cache) {
-        if (!dp_packet_batch_is_empty(&p->output_pkts)) {
-            dp_netdev_pmd_flush_output_on_port(pmd, p);
+        if (!dp_packet_batch_is_empty(&p->output_pkts)
+            && (force || pmd->ctx.now >= p->flush_time)) {
+            output_cnt += dp_netdev_pmd_flush_output_on_port(pmd, p);
         }
     }
+    return output_cnt;
 }
 
 static int
@@ -3323,7 +3368,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
 {
     struct dp_packet_batch batch;
     int error;
-    int batch_cnt = 0;
+    int batch_cnt = 0, output_cnt = 0;
 
     dp_packet_batch_init(&batch);
     error = netdev_rxq_recv(rx, &batch);
@@ -3333,7 +3378,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
 
         batch_cnt = batch.count;
         dp_netdev_input(pmd, &batch, port_no);
-        dp_netdev_pmd_flush_output_packets(pmd);
+        output_cnt = dp_netdev_pmd_flush_output_packets(pmd, false);
     } else if (error != EAGAIN && error != EOPNOTSUPP) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 
@@ -3341,7 +3386,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
                     netdev_rxq_get_name(rx), ovs_strerror(error));
     }
 
-    return batch_cnt;
+    return batch_cnt + output_cnt;
 }
 
 static struct tx_port *
@@ -3951,7 +3996,8 @@ dpif_netdev_run(struct dpif *dpif)
     struct dp_netdev *dp = get_dp_netdev(dpif);
     struct dp_netdev_pmd_thread *non_pmd;
     uint64_t new_tnl_seq;
-    int process_packets = 0;
+    int process_packets;
+    bool need_to_flush = true;
 
     ovs_mutex_lock(&dp->port_mutex);
     non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID);
@@ -3971,11 +4017,25 @@ dpif_netdev_run(struct dpif *dpif)
                                               process_packets
                                               ? PMD_CYCLES_PROCESSING
                                               : PMD_CYCLES_IDLE);
+                    if (process_packets) {
+                        need_to_flush = false;
+                    }
                 }
             }
         }
+        if (need_to_flush) {
+            /* We didn't receive anything in the process loop.
+             * Check if we need to send something.
+             * There was no time updates on current iteration. */
+            pmd_thread_ctx_time_update(non_pmd);
+            process_packets = dp_netdev_pmd_flush_output_packets(non_pmd,
+                                                                 false);
+            cycles_count_intermediate(non_pmd, NULL, process_packets
+                                                     ? PMD_CYCLES_PROCESSING
+                                                     : PMD_CYCLES_IDLE);
+        }
+
         cycles_count_end(non_pmd, PMD_CYCLES_IDLE);
-        pmd_thread_ctx_time_update(non_pmd);
         dpif_netdev_xps_revalidate_pmd(non_pmd, false);
         ovs_mutex_unlock(&dp->non_pmd_mutex);
 
@@ -4026,6 +4086,8 @@ pmd_free_cached_ports(struct dp_netdev_pmd_thread *pmd)
 {
     struct tx_port *tx_port_cached;
 
+    /* Flush all the queued packets. */
+    dp_netdev_pmd_flush_output_packets(pmd, true);
     /* Free all used tx queue ids. */
     dpif_netdev_xps_revalidate_pmd(pmd, true);
 
@@ -4124,7 +4186,6 @@ pmd_thread_main(void *f_)
     bool exiting;
     int poll_cnt;
     int i;
-    int process_packets = 0;
 
     poll_list = NULL;
 
@@ -4154,6 +4215,9 @@ reload:
 
     cycles_count_start(pmd);
     for (;;) {
+        int process_packets;
+        bool need_to_flush = true;
+
         for (i = 0; i < poll_cnt; i++) {
             process_packets =
                 dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx,
@@ -4161,6 +4225,20 @@ reload:
             cycles_count_intermediate(pmd, poll_list[i].rxq,
                                       process_packets ? PMD_CYCLES_PROCESSING
                                                       : PMD_CYCLES_IDLE);
+            if (process_packets) {
+                need_to_flush = false;
+            }
+        }
+
+        if (need_to_flush) {
+            /* We didn't receive anything in the process loop.
+             * Check if we need to send something.
+             * There was no time updates on current iteration. */
+            pmd_thread_ctx_time_update(pmd);
+            process_packets = dp_netdev_pmd_flush_output_packets(pmd, false);
+            cycles_count_intermediate(pmd, NULL,
+                                      process_packets ? PMD_CYCLES_PROCESSING
+                                                      : PMD_CYCLES_IDLE);
         }
 
         if (lc++ > 1024) {
@@ -4169,9 +4247,6 @@ reload:
             lc = 0;
 
             coverage_try_clear();
-            /* It's possible that the time was not updated on current
-             * iteration, if there were no received packets. */
-            pmd_thread_ctx_time_update(pmd);
             dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt);
             if (!ovsrcu_try_quiesce()) {
                 emc_cache_slow_sweep(&pmd->flow_cache);
@@ -4257,7 +4332,7 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_,
     memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate);
 
     /* All packets will hit the meter at the same time. */
-    long_delta_t = (now - meter->used); /* msec */
+    long_delta_t = (now - meter->used) / 1000; /* msec */
 
     /* Make sure delta_t will not be too large, so that bucket will not
      * wrap around below. */
@@ -4413,7 +4488,7 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id *meter_id,
         meter->flags = config->flags;
         meter->n_bands = config->n_bands;
         meter->max_delta_t = 0;
-        meter->used = time_msec();
+        meter->used = time_usec();
 
         /* set up bands */
         for (i = 0; i < config->n_bands; ++i) {
@@ -4611,6 +4686,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp,
     pmd->core_id = core_id;
     pmd->numa_id = numa_id;
     pmd->need_reload = false;
+    pmd->n_output_batches = 0;
 
     ovs_refcount_init(&pmd->ref_cnt);
     latch_init(&pmd->exit_latch);
@@ -4798,6 +4874,7 @@ dp_netdev_add_port_tx_to_pmd(struct dp_netdev_pmd_thread *pmd,
 
     tx->port = port;
     tx->qid = -1;
+    tx->flush_time = 0LL;
     dp_packet_batch_init(&tx->output_pkts);
 
     hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no));
@@ -4961,7 +5038,7 @@ packet_batch_per_flow_execute(struct packet_batch_per_flow *batch,
     struct dp_netdev_flow *flow = batch->flow;
 
     dp_netdev_flow_used(flow, batch->array.count, batch->byte_count,
-                        batch->tcp_flags, pmd->ctx.now);
+                        batch->tcp_flags, pmd->ctx.now / 1000);
 
     actions = dp_netdev_flow_get_actions(flow);
 
@@ -5336,7 +5413,7 @@ dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd,
             continue;
         }
         interval = pmd->ctx.now - tx->last_used;
-        if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT_MS)) {
+        if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT)) {
             port = tx->port;
             ovs_mutex_lock(&port->txq_used_mutex);
             port->txq_used[tx->qid]--;
@@ -5357,7 +5434,7 @@ dpif_netdev_xps_get_tx_qid(const struct dp_netdev_pmd_thread *pmd,
     interval = pmd->ctx.now - tx->last_used;
     tx->last_used = pmd->ctx.now;
 
-    if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT_MS)) {
+    if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT)) {
         return tx->qid;
     }
 
@@ -5489,12 +5566,16 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
                 dp_netdev_pmd_flush_output_on_port(pmd, p);
             }
 #endif
-            if (OVS_UNLIKELY(dp_packet_batch_size(&p->output_pkts)
-                       + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST)) {
-                /* Some packets was generated while input batch processing.
-                 * Flush here to avoid overflow. */
+            if (dp_packet_batch_size(&p->output_pkts)
+                + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST) {
+                /* Flush here to avoid overflow. */
                 dp_netdev_pmd_flush_output_on_port(pmd, p);
             }
+
+            if (dp_packet_batch_is_empty(&p->output_pkts)) {
+                pmd->n_output_batches++;
+            }
+
             DP_PACKET_BATCH_FOR_EACH (packet, packets_) {
                 dp_packet_batch_add(&p->output_pkts, packet);
             }
@@ -5735,7 +5816,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
 
         conntrack_execute(&dp->conntrack, packets_, aux->flow->dl_type, force,
                           commit, zone, setmark, setlabel, helper,
-                          nat_action_info_ref, pmd->ctx.now);
+                          nat_action_info_ref, pmd->ctx.now / 1000);
         break;
     }
 
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index c145e1a..ef34fe6 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -344,6 +344,22 @@

+ +

+          Specifies the time in microseconds that a packet can wait in output
+          batch for sending, i.e. the amount of time a packet can spend in an
+          intermediate output queue before sending to netdev.
+          This option can be used to configure balance between throughput
+          and latency. Lower values decrease latency while higher values
+          may be useful to achieve higher performance.
+

+

+          Defaults to 0 i.e. instant packet sending (latency optimized).
+

+
+
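
Note on the flush logic: the interaction between 'flush_time', 'force' and
'n_output_batches' in the dpif-netdev.c hunks is easier to follow in
isolation. What follows is only a minimal standalone sketch of that decision
logic, not code from the patch. The 'sketch_*' names and both structs are
hypothetical stand-ins for 'struct tx_port' and
'struct dp_netdev_pmd_thread', packets are reduced to a counter, and the
NETDEV_MAX_BURST overflow flush is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for 'struct tx_port'. */
struct sketch_port {
    int n_queued;           /* Packets waiting in the output batch. */
    long long flush_time;   /* Earliest time (us) of a non-forced flush. */
};

/* Hypothetical, simplified stand-in for 'struct dp_netdev_pmd_thread'. */
struct sketch_pmd {
    long long now;               /* Cached current time, in microseconds. */
    uint32_t tx_flush_interval;  /* Configured interval, in microseconds. */
    int n_output_batches;        /* Number of non-empty output batches. */
};

/* Queues 'n' packets on 'p'.  The batch is counted once, at the moment it
 * goes from empty to non-empty. */
static void
sketch_enqueue(struct sketch_pmd *pmd, struct sketch_port *p, int n)
{
    if (n <= 0) {
        return;
    }
    if (p->n_queued == 0) {
        pmd->n_output_batches++;
    }
    p->n_queued += n;
}

/* Sends everything queued on 'p' and schedules the next flush relative to
 * the cached time.  Returns the number of packets sent. */
static int
sketch_flush_port(struct sketch_pmd *pmd, struct sketch_port *p)
{
    int sent = p->n_queued;

    assert(sent > 0);
    assert(pmd->n_output_batches > 0);

    p->n_queued = 0;
    p->flush_time = pmd->now + pmd->tx_flush_interval;
    pmd->n_output_batches--;
    return sent;
}

/* Flushes every non-empty port whose deadline has passed, or every
 * non-empty port if 'force' is true.  Returns the number of packets sent. */
static int
sketch_flush(struct sketch_pmd *pmd, struct sketch_port *ports, int n_ports,
             bool force)
{
    int sent = 0;

    if (!pmd->n_output_batches) {
        return 0;   /* Fast path: nothing is queued anywhere. */
    }
    for (int i = 0; i < n_ports; i++) {
        struct sketch_port *p = &ports[i];

        if (p->n_queued > 0 && (force || pmd->now >= p->flush_time)) {
            sent += sketch_flush_port(pmd, p);
        }
    }
    return sent;
}
```

In this sketch, as in the patch, the deadline is set when a port is flushed
rather than when its first packet is queued, so a port that has been idle
for longer than the interval sends its next batch on the first non-forced
flush. With the interval left at 0 the deadline check always passes, which
matches the documented default of instant sending.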