From patchwork Thu Jan 4 20:02:39 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855875
X-Patchwork-Delegate: ian.stokes@intel.com
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:39 +0100
Message-Id: <1515096166-16257-2-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
References: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Cc: Ilya Maximets
Subject: [ovs-dev] [RFC PATCH 1/8] dpif-netdev: Use microsecond granularity.
From: Ilya Maximets

Upcoming time-based output batching will require microsecond granularity
for its flexible configuration.
Signed-off-by: Ilya Maximets --- lib/dpif-netdev.c | 27 ++++++++++++++------------- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 96cc4d5..279ae6b 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -181,12 +181,13 @@ struct emc_cache { /* Simple non-wildcarding single-priority classifier. */ -/* Time in ms between successive optimizations of the dpcls subtable vector */ -#define DPCLS_OPTIMIZATION_INTERVAL 1000 +/* Time in microseconds between successive optimizations of the dpcls + * subtable vector */ +#define DPCLS_OPTIMIZATION_INTERVAL 1000000LL -/* Time in ms of the interval in which rxq processing cycles used in - * rxq to pmd assignments is measured and stored. */ -#define PMD_RXQ_INTERVAL_LEN 10000 +/* Time in microseconds of the interval in which rxq processing cycles used + * in rxq to pmd assignments is measured and stored. */ +#define PMD_RXQ_INTERVAL_LEN 10000000LL /* Number of intervals for which cycles are stored * and used during rxq to pmd assignment. */ @@ -344,7 +345,7 @@ enum rxq_cycles_counter_type { RXQ_N_CYCLES }; -#define XPS_TIMEOUT_MS 500LL +#define XPS_TIMEOUT 500000LL /* In microseconds. */ /* Contained by struct dp_netdev_port's 'rxqs' member. */ struct dp_netdev_rxq { @@ -764,7 +765,7 @@ emc_cache_slow_sweep(struct emc_cache *flow_cache) static inline void pmd_thread_ctx_time_update(struct dp_netdev_pmd_thread *pmd) { - pmd->ctx.now = time_msec(); + pmd->ctx.now = time_usec(); } /* Returns true if 'dpif' is a netdev or dummy dpif, false otherwise. */ @@ -4332,7 +4333,7 @@ dp_netdev_run_meter(struct dp_netdev *dp, struct dp_packet_batch *packets_, memset(exceeded_rate, 0, cnt * sizeof *exceeded_rate); /* All packets will hit the meter at the same time. */ - long_delta_t = (now - meter->used); /* msec */ + long_delta_t = (now - meter->used) / 1000; /* msec */ /* Make sure delta_t will not be too large, so that bucket will not * wrap around below. 
*/ @@ -4488,7 +4489,7 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id *meter_id, meter->flags = config->flags; meter->n_bands = config->n_bands; meter->max_delta_t = 0; - meter->used = time_msec(); + meter->used = time_usec(); /* set up bands */ for (i = 0; i < config->n_bands; ++i) { @@ -5030,7 +5031,7 @@ packet_batch_per_flow_execute(struct packet_batch_per_flow *batch, struct dp_netdev_flow *flow = batch->flow; dp_netdev_flow_used(flow, batch->array.count, batch->byte_count, - batch->tcp_flags, pmd->ctx.now); + batch->tcp_flags, pmd->ctx.now / 1000); actions = dp_netdev_flow_get_actions(flow); @@ -5424,7 +5425,7 @@ dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd, continue; } interval = pmd->ctx.now - tx->last_used; - if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT_MS)) { + if (tx->qid >= 0 && (purge || interval >= XPS_TIMEOUT)) { port = tx->port; ovs_mutex_lock(&port->txq_used_mutex); port->txq_used[tx->qid]--; @@ -5445,7 +5446,7 @@ dpif_netdev_xps_get_tx_qid(const struct dp_netdev_pmd_thread *pmd, interval = pmd->ctx.now - tx->last_used; tx->last_used = pmd->ctx.now; - if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT_MS)) { + if (OVS_LIKELY(tx->qid >= 0 && interval < XPS_TIMEOUT)) { return tx->qid; } @@ -5824,7 +5825,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, conntrack_execute(&dp->conntrack, packets_, aux->flow->dl_type, force, commit, zone, setmark, setlabel, aux->flow->tp_src, aux->flow->tp_dst, helper, nat_action_info_ref, - pmd->ctx.now); + pmd->ctx.now / 1000); break; } From patchwork Thu Jan 4 20:02:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Scheurich X-Patchwork-Id: 855877 X-Patchwork-Delegate: ian.stokes@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) 
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:40 +0100
Message-Id: <1515096166-16257-3-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
References: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Cc: Ilya Maximets
Subject: [ovs-dev] [RFC PATCH 2/8] dpif-netdev: Count cycles on per-rxq basis.
From: Ilya Maximets

Upcoming time-based output batching will allow a single output batch to
collect packets from different RX queues.  Keep the list of RX queues for
each output packet and collect cycles for them on send.
Signed-off-by: Ilya Maximets --- lib/dpif-netdev.c | 100 ++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 67 insertions(+), 33 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 279ae6b..a9f509a 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -504,6 +504,7 @@ struct tx_port { long long last_used; struct hmap_node node; struct dp_packet_batch output_pkts; + struct dp_netdev_rxq *output_pkts_rxqs[NETDEV_MAX_BURST]; }; /* A set of properties for the current processing loop that is not directly @@ -515,6 +516,10 @@ struct dp_netdev_pmd_thread_ctx { long long now; /* Used to count cycles. See 'cycles_count_end()' */ unsigned long long last_cycles; + /* RX queue from which last packet was received. */ + struct dp_netdev_rxq *last_rxq; + /* Indicates how should be treated last counted cycles. */ + enum pmd_cycles_counter_type current_pmd_cycles_type; }; /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate @@ -3232,42 +3237,53 @@ cycles_counter(void) /* Fake mutex to make sure that the calls to cycles_count_* are balanced */ extern struct ovs_mutex cycles_counter_fake_mutex; -/* Start counting cycles. Must be followed by 'cycles_count_end()' */ +/* Start counting cycles. Must be followed by 'cycles_count_end()'. + * Counting starts from the idle type state. */ static inline void cycles_count_start(struct dp_netdev_pmd_thread *pmd) OVS_ACQUIRES(&cycles_counter_fake_mutex) OVS_NO_THREAD_SAFETY_ANALYSIS { + pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE; pmd->ctx.last_cycles = cycles_counter(); } -/* Stop counting cycles and add them to the counter 'type' */ +/* Stop counting cycles and add them to the counter of the current type. 
*/ static inline void -cycles_count_end(struct dp_netdev_pmd_thread *pmd, - enum pmd_stat_type type) +cycles_count_end(struct dp_netdev_pmd_thread *pmd) OVS_RELEASES(&cycles_counter_fake_mutex) OVS_NO_THREAD_SAFETY_ANALYSIS { unsigned long long interval = cycles_counter() - pmd->ctx.last_cycles; + enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type; pmd_perf_update_counter(&pmd->perf_stats, type, interval); } -/* Calculate the intermediate cycle result and add to the counter 'type' */ +/* Calculate the intermediate cycle result and add to the counter of + * the current type */ static inline void cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, - struct dp_netdev_rxq *rxq, - enum pmd_stat_type type) + struct dp_netdev_rxq **rxqs, int n_rxqs) OVS_NO_THREAD_SAFETY_ANALYSIS { unsigned long long new_cycles = cycles_counter(); unsigned long long interval = new_cycles - pmd->ctx.last_cycles; + enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type; + int i; + pmd->ctx.last_cycles = new_cycles; pmd_perf_update_counter(&pmd->perf_stats, type, interval); - if (rxq && (type == PMD_CYCLES_POLL_BUSY)) { + if (n_rxqs && (type == PMD_CYCLES_POLL_BUSY)) { /* Add to the amount of current processing cycles. */ - non_atomic_ullong_add(&rxq->cycles[RXQ_CYCLES_PROC_CURR], interval); + interval /= n_rxqs; + for (i = 0; i < n_rxqs; i++) { + if (rxqs[i]) { + non_atomic_ullong_add(&rxqs[i]->cycles[RXQ_CYCLES_PROC_CURR], + interval); + } + } } } @@ -3319,6 +3335,16 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, int tx_qid; int output_cnt; bool dynamic_txqs; + enum pmd_cycles_counter_type save_pmd_cycles_type; + + /* In case we're in PMD_CYCLES_PROCESSING state we need to count + * cycles for rxq we're processing now. */ + cycles_count_intermediate(pmd, &pmd->ctx.last_rxq, 1); + + /* Save current cycles counting state to restore after accounting + * send cycles. 
*/ + save_pmd_cycles_type = pmd->ctx.current_pmd_cycles_type; + pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_BUSY; dynamic_txqs = p->port->dynamic_txqs; if (dynamic_txqs) { @@ -3336,6 +3362,10 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, PMD_STAT_SENT_PKTS, output_cnt); pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_BATCHES, 1); + + /* Update send cycles for all the rx queues and restore previous state. */ + cycles_count_intermediate(pmd, p->output_pkts_rxqs, output_cnt); + pmd->ctx.current_pmd_cycles_type = save_pmd_cycles_type; } static void @@ -3352,7 +3382,7 @@ dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd) static int dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, - struct netdev_rxq *rx, + struct dp_netdev_rxq *rxq, odp_port_t port_no) { struct pmd_perf_stats *s = &pmd->perf_stats; @@ -3361,7 +3391,12 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, int batch_cnt = 0; dp_packet_batch_init(&batch); - error = netdev_rxq_recv(rx, &batch); + + cycles_count_intermediate(pmd, NULL, 0); + pmd->ctx.last_rxq = rxq; + pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_BUSY; + error = netdev_rxq_recv(rxq->rx, &batch); + if (!error) { *recirc_depth_get() = 0; pmd_thread_ctx_time_update(pmd); @@ -3385,14 +3420,20 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, } /* Process packet batch. 
*/ dp_netdev_input(pmd, &batch, port_no); + cycles_count_intermediate(pmd, &rxq, 1); + dp_netdev_pmd_flush_output_packets(pmd); + } else if (error != EAGAIN && error != EOPNOTSUPP) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); VLOG_ERR_RL(&rl, "error receiving data from %s: %s", - netdev_rxq_get_name(rx), ovs_strerror(error)); + netdev_rxq_get_name(rxq->rx), ovs_strerror(error)); } + pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_IDLE; + pmd->ctx.last_rxq = NULL; + return batch_cnt; } @@ -4016,7 +4057,6 @@ dpif_netdev_run(struct dpif *dpif) struct dp_netdev *dp = get_dp_netdev(dpif); struct dp_netdev_pmd_thread *non_pmd; uint64_t new_tnl_seq; - int process_packets = 0; ovs_mutex_lock(&dp->port_mutex); non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID); @@ -4028,18 +4068,13 @@ dpif_netdev_run(struct dpif *dpif) int i; for (i = 0; i < port->n_rxq; i++) { - process_packets = - dp_netdev_process_rxq_port(non_pmd, - port->rxqs[i].rx, - port->port_no); - cycles_count_intermediate(non_pmd, NULL, - process_packets - ? PMD_CYCLES_POLL_BUSY - : PMD_CYCLES_POLL_IDLE); + dp_netdev_process_rxq_port(non_pmd, + &port->rxqs[i], + port->port_no); } } } - cycles_count_end(non_pmd, PMD_CYCLES_POLL_IDLE); + cycles_count_end(non_pmd); pmd_thread_ctx_time_update(non_pmd); dpif_netdev_xps_revalidate_pmd(non_pmd, false); ovs_mutex_unlock(&dp->non_pmd_mutex); @@ -4190,7 +4225,6 @@ pmd_thread_main(void *f_) bool exiting; int poll_cnt; int i; - int process_packets = 0; poll_list = NULL; @@ -4225,14 +4259,10 @@ reload: pmd_perf_start_iteration(s, pmd->ctx.last_cycles); for (i = 0; i < poll_cnt; i++) { - process_packets = - dp_netdev_process_rxq_port(pmd, poll_list[i].rxq->rx, - poll_list[i].port_no); - cycles_count_intermediate(pmd, poll_list[i].rxq, - process_packets - ? 
PMD_CYCLES_POLL_BUSY - : PMD_CYCLES_POLL_IDLE); - iter_packets += process_packets; + int rxq_packets = + dp_netdev_process_rxq_port(pmd, poll_list[i].rxq, + poll_list[i].port_no); + iter_packets += rxq_packets; } if (lc++ > 1024) { @@ -4253,14 +4283,14 @@ reload: if (reload) { break; } - cycles_count_intermediate(pmd, NULL, PMD_CYCLES_OVERHEAD); + cycles_count_intermediate(pmd, NULL, 0); } pmd_perf_end_iteration(s, pmd->ctx.last_cycles, iter_packets, pmd_perf_metrics_enabled(pmd)); } - cycles_count_end(pmd, PMD_CYCLES_OVERHEAD); + cycles_count_end(pmd); poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list); exiting = latch_is_set(&pmd->exit_latch); @@ -4699,6 +4729,8 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, ovs_mutex_init(&pmd->port_mutex); cmap_init(&pmd->flow_table); cmap_init(&pmd->classifiers); + pmd->ctx.last_rxq = NULL; + pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE; pmd_thread_ctx_time_update(pmd); pmd->next_optimization = pmd->ctx.now + DPCLS_OPTIMIZATION_INTERVAL; pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN; @@ -5585,6 +5617,8 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, dp_netdev_pmd_flush_output_on_port(pmd, p); } DP_PACKET_BATCH_FOR_EACH (packet, packets_) { + p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] = + pmd->ctx.last_rxq; dp_packet_batch_add(&p->output_pkts, packet); } return; From patchwork Thu Jan 4 20:02:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Scheurich X-Patchwork-Id: 855878 X-Patchwork-Delegate: ian.stokes@intel.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=openvswitch.org (client-ip=140.211.169.12; helo=mail.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Received: from mail.linuxfoundation.org 
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:41 +0100
Message-Id: <1515096166-16257-4-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
References: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Cc: Ilya Maximets
Subject: [ovs-dev] [RFC PATCH 3/8] dpif-netdev: Time based output batching.
From: Ilya Maximets

This allows packets from more than one RX burst to be collected and sent
together at a configurable interval.  'other_config:tx-flush-interval' can
be used to configure the time that a packet may wait in the output batch
before sending.  'tx-flush-interval' has microsecond resolution.
Signed-off-by: Ilya Maximets --- lib/dpif-netdev.c | 108 +++++++++++++++++++++++++++++++++++++++++---------- vswitchd/vswitch.xml | 16 ++++++++ 2 files changed, 103 insertions(+), 21 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index a9f509a..d16ba93 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -88,6 +88,9 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev); #define MAX_RECIRC_DEPTH 6 DEFINE_STATIC_PER_THREAD_DATA(uint32_t, recirc_depth, 0) +/* Use instant packet send by default. */ +#define DEFAULT_TX_FLUSH_INTERVAL 0 + /* Configuration parameters. */ enum { MAX_FLOWS = 65536 }; /* Maximum number of flows in flow table. */ enum { MAX_METERS = 65536 }; /* Maximum number of meters. */ @@ -274,6 +277,9 @@ struct dp_netdev { struct hmap ports; struct seq *port_seq; /* Incremented whenever a port changes. */ + /* The time that a packet can wait in output batch for sending. */ + atomic_uint32_t tx_flush_interval; + /* Meters. */ struct ovs_mutex meter_locks[N_METER_LOCKS]; struct dp_meter *meters[MAX_METERS]; /* Meter bands. */ @@ -503,6 +509,7 @@ struct tx_port { int qid; long long last_used; struct hmap_node node; + long long flush_time; struct dp_packet_batch output_pkts; struct dp_netdev_rxq *output_pkts_rxqs[NETDEV_MAX_BURST]; }; @@ -588,6 +595,9 @@ struct dp_netdev_pmd_thread { * than 'cmap_count(dp->poll_threads)'. */ uint32_t static_tx_qid; + /* Number of filled output batches. */ + int n_output_batches; + struct ovs_mutex port_mutex; /* Mutex for 'poll_list' and 'tx_ports'. */ /* List of rx queues to poll. 
*/ struct hmap poll_list OVS_GUARDED; @@ -677,8 +687,9 @@ static void dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd, static void dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd, struct rxq_poll *poll) OVS_REQUIRES(pmd->port_mutex); -static void -dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd); +static int +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd, + bool force); static void reconfigure_datapath(struct dp_netdev *dp) OVS_REQUIRES(dp->port_mutex); @@ -1344,6 +1355,7 @@ create_dp_netdev(const char *name, const struct dpif_class *class, conntrack_init(&dp->conntrack); atomic_init(&dp->emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN); + atomic_init(&dp->tx_flush_interval, DEFAULT_TX_FLUSH_INTERVAL); cmap_init(&dp->poll_threads); @@ -3010,7 +3022,7 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute) dp_packet_batch_init_packet(&pp, execute->packet); dp_netdev_execute_actions(pmd, &pp, false, execute->flow, execute->actions, execute->actions_len); - dp_netdev_pmd_flush_output_packets(pmd); + dp_netdev_pmd_flush_output_packets(pmd, true); if (pmd->core_id == NON_PMD_CORE_ID) { ovs_mutex_unlock(&dp->non_pmd_mutex); @@ -3059,6 +3071,16 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) smap_get_ullong(other_config, "emc-insert-inv-prob", DEFAULT_EM_FLOW_INSERT_INV_PROB); uint32_t insert_min, cur_min; + uint32_t tx_flush_interval, cur_tx_flush_interval; + + tx_flush_interval = smap_get_int(other_config, "tx-flush-interval", + DEFAULT_TX_FLUSH_INTERVAL); + atomic_read_relaxed(&dp->tx_flush_interval, &cur_tx_flush_interval); + if (tx_flush_interval != cur_tx_flush_interval) { + atomic_store_relaxed(&dp->tx_flush_interval, tx_flush_interval); + VLOG_INFO("Flushing interval for tx queues set to %"PRIu32" us", + tx_flush_interval); + } if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) { free(dp->pmd_cmask); @@ -3328,13 +3350,14 @@ 
dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx) return processing_cycles; } -static void +static int dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, struct tx_port *p) { int tx_qid; int output_cnt; bool dynamic_txqs; + uint32_t tx_flush_interval; enum pmd_cycles_counter_type save_pmd_cycles_type; /* In case we're in PMD_CYCLES_PROCESSING state we need to count @@ -3354,10 +3377,18 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, } output_cnt = dp_packet_batch_size(&p->output_pkts); + ovs_assert(output_cnt > 0); netdev_send(p->port->netdev, tx_qid, &p->output_pkts, dynamic_txqs); dp_packet_batch_init(&p->output_pkts); + /* Update time of the next flush. */ + atomic_read_relaxed(&pmd->dp->tx_flush_interval, &tx_flush_interval); + p->flush_time = pmd->ctx.now + tx_flush_interval; + + ovs_assert(pmd->n_output_batches > 0); + pmd->n_output_batches--; + pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_PKTS, output_cnt); pmd_perf_update_counter(&pmd->perf_stats, @@ -3366,29 +3397,39 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, /* Update send cycles for all the rx queues and restore previous state. 
*/ cycles_count_intermediate(pmd, p->output_pkts_rxqs, output_cnt); pmd->ctx.current_pmd_cycles_type = save_pmd_cycles_type; + return output_cnt; } -static void -dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd) +static int +dp_netdev_pmd_flush_output_packets(struct dp_netdev_pmd_thread *pmd, + bool force) { struct tx_port *p; + int output_cnt = 0; + + if (!pmd->n_output_batches) { + return 0; + } HMAP_FOR_EACH (p, node, &pmd->send_port_cache) { - if (!dp_packet_batch_is_empty(&p->output_pkts)) { - dp_netdev_pmd_flush_output_on_port(pmd, p); + if (!dp_packet_batch_is_empty(&p->output_pkts) + && (force || pmd->ctx.now >= p->flush_time)) { + output_cnt += dp_netdev_pmd_flush_output_on_port(pmd, p); } } + return output_cnt; } static int dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, struct dp_netdev_rxq *rxq, - odp_port_t port_no) + odp_port_t port_no, + bool *flushed) { struct pmd_perf_stats *s = &pmd->perf_stats; struct dp_packet_batch batch; int error; - int batch_cnt = 0; + int batch_cnt = 0, output_cnt = 0; dp_packet_batch_init(&batch); @@ -3422,7 +3463,8 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, dp_netdev_input(pmd, &batch, port_no); cycles_count_intermediate(pmd, &rxq, 1); - dp_netdev_pmd_flush_output_packets(pmd); + output_cnt = dp_netdev_pmd_flush_output_packets(pmd, false); + *flushed = true; } else if (error != EAGAIN && error != EOPNOTSUPP) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); @@ -4057,6 +4099,7 @@ dpif_netdev_run(struct dpif *dpif) struct dp_netdev *dp = get_dp_netdev(dpif); struct dp_netdev_pmd_thread *non_pmd; uint64_t new_tnl_seq; + bool flushed = false; ovs_mutex_lock(&dp->port_mutex); non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID); @@ -4070,12 +4113,20 @@ dpif_netdev_run(struct dpif *dpif) for (i = 0; i < port->n_rxq; i++) { dp_netdev_process_rxq_port(non_pmd, &port->rxqs[i], - port->port_no); + port->port_no, + &flushed); } } } + if (!flushed) { + /* We didn't 
receive anything in the process loop. + * Check if we need to send something. + * There was no time updates on current iteration. */ + pmd_thread_ctx_time_update(non_pmd); + dp_netdev_pmd_flush_output_packets(non_pmd, false); + } + cycles_count_end(non_pmd); - pmd_thread_ctx_time_update(non_pmd); dpif_netdev_xps_revalidate_pmd(non_pmd, false); ovs_mutex_unlock(&dp->non_pmd_mutex); @@ -4126,6 +4177,8 @@ pmd_free_cached_ports(struct dp_netdev_pmd_thread *pmd) { struct tx_port *tx_port_cached; + /* Flush all the queued packets. */ + dp_netdev_pmd_flush_output_packets(pmd, true); /* Free all used tx queue ids. */ dpif_netdev_xps_revalidate_pmd(pmd, true); @@ -4255,25 +4308,32 @@ reload: cycles_count_start(pmd); for (;;) { uint64_t iter_packets = 0; + bool flushed = false; pmd_perf_start_iteration(s, pmd->ctx.last_cycles); for (i = 0; i < poll_cnt; i++) { int rxq_packets = dp_netdev_process_rxq_port(pmd, poll_list[i].rxq, - poll_list[i].port_no); + poll_list[i].port_no, + &flushed); iter_packets += rxq_packets; } + if (!flushed) { + /* We didn't receive anything in the process loop. + * Check if we need to send something. + * There was no time updates on current iteration. */ + pmd_thread_ctx_time_update(pmd); + dp_netdev_pmd_flush_output_packets(pmd, false); + } + if (lc++ > 1024) { bool reload; lc = 0; coverage_try_clear(); - /* It's possible that the time was not updated on current - * iteration, if there were no received packets. 
*/ - pmd_thread_ctx_time_update(pmd); dp_netdev_pmd_try_optimize(pmd, poll_list, poll_cnt); if (!ovsrcu_try_quiesce()) { emc_cache_slow_sweep(&pmd->flow_cache); @@ -4717,6 +4777,7 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, pmd->core_id = core_id; pmd->numa_id = numa_id; pmd->need_reload = false; + pmd->n_output_batches = 0; ovs_refcount_init(&pmd->ref_cnt); latch_init(&pmd->exit_latch); @@ -4907,6 +4968,7 @@ dp_netdev_add_port_tx_to_pmd(struct dp_netdev_pmd_thread *pmd, tx->port = port; tx->qid = -1; + tx->flush_time = 0LL; dp_packet_batch_init(&tx->output_pkts); hmap_insert(&pmd->tx_ports, &tx->node, hash_port_no(tx->port->port_no)); @@ -5610,12 +5672,16 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, dp_netdev_pmd_flush_output_on_port(pmd, p); } #endif - if (OVS_UNLIKELY(dp_packet_batch_size(&p->output_pkts) - + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST)) { - /* Some packets was generated while input batch processing. - * Flush here to avoid overflow. */ + if (dp_packet_batch_size(&p->output_pkts) + + dp_packet_batch_size(packets_) > NETDEV_MAX_BURST) { + /* Flush here to avoid overflow. */ dp_netdev_pmd_flush_output_on_port(pmd, p); } + + if (dp_packet_batch_is_empty(&p->output_pkts)) { + pmd->n_output_batches++; + } + DP_PACKET_BATCH_FOR_EACH (packet, packets_) { p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] = pmd->ctx.last_rxq; diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 37d04b7..ce9f11b 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -359,6 +359,22 @@

+ +

+ Specifies the time in microseconds that a packet can wait in an output + batch for sending, i.e. the amount of time that a packet can spend in an + intermediate output queue before being sent to the netdev. + This option can be used to configure the balance between throughput + and latency. Lower values decrease latency while higher values + may be useful to achieve higher performance. +

+

+ Defaults to 0 i.e. instant packet sending (latency optimized). +

+
+
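The buffering behaviour configured by ``tx-flush-interval`` can be illustrated with a small standalone model. This is not OVS code: the names ``tx_port_model``, ``queue_packet`` and ``must_flush`` are hypothetical and only mirror the patch's ``flush_time`` deadline and the ``NETDEV_MAX_BURST`` batch limit, assuming a batch is sent when it is full or its deadline has passed:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of a per-port output queue used by
 * time-based output batching. Not the actual OVS implementation. */
#define NETDEV_MAX_BURST 32

struct tx_port_model {
    int queued;             /* Packets buffered in the output batch. */
    long long flush_time;   /* Send deadline in microseconds. */
};

/* Queue one packet for output. The first packet of a fresh batch arms
 * the flush deadline, mirroring how 'tx->flush_time' is set. */
static void
queue_packet(struct tx_port_model *p, long long now_us,
             long long tx_flush_interval)
{
    if (p->queued == 0) {
        p->flush_time = now_us + tx_flush_interval;
    }
    p->queued++;
}

/* A buffered batch is sent when it is full or its deadline has passed.
 * With tx-flush-interval = 0 the deadline is 'now', so the batch is
 * flushed within the same iteration (latency-optimized default). */
static bool
must_flush(const struct tx_port_model *p, long long now_us)
{
    return p->queued >= NETDEV_MAX_BURST
           || (p->queued > 0 && now_us >= p->flush_time);
}
```

With a 50 us interval, a lone packet queued at t=100 us is not flushed at t=120 us but is at t=150 us, while a full batch of 32 packets is flushed regardless of the deadline.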

From patchwork Thu Jan 4 20:02:42 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855879
X-Patchwork-Delegate: ian.stokes@intel.com
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:42 +0100
Message-Id: <1515096166-16257-5-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Cc: Ilya Maximets
Subject: [ovs-dev] [RFC PATCH 4/8] docs: Describe output packet batching in DPDK guide.
From: Ilya Maximets

Added information about output packet batching and a way to configure 'tx-flush-interval'.
Signed-off-by: Ilya Maximets Co-authored-by: Jan Scheurich Signed-off-by: Jan Scheurich --- Documentation/intro/install/dpdk.rst | 58 ++++++++++++++++++++++++++++++++++++ 1 file changed, 58 insertions(+) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 3fecb5c..040e62e 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -568,6 +568,64 @@ not needed i.e. jumbo frames are not needed, it can be forced off by adding chains of descriptors it will make more individual virtio descriptors available for rx to the guest using dpdkvhost ports and this can improve performance. +Output Packet Batching +~~~~~~~~~~~~~~~~~~~~~~ + +To take advantage of batched transmit functions, OVS collects packets in +intermediate queues before sending when processing a batch of received packets. +Even if packets are matched by different flows, OVS uses a single send +operation for all packets destined to the same output port. + +Furthermore, OVS is able to buffer packets in these intermediate queues for a +configurable amount of time to reduce the frequency of send bursts at medium +load levels when the packet receive rate is high, but the receive batch size is +still very small. This is particularly beneficial for packets transmitted to +VMs using an interrupt-driven virtio driver, where the interrupt overhead is +significant for the OVS PMD, the host operating system and the guest driver. + +The ``tx-flush-interval`` parameter can be used to specify the time in +microseconds OVS should wait between two send bursts to a given port (default +is ``0``). When the intermediate queue fills up before that time is over, the +buffered packet batch is sent immediately:: + + $ ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50 + +This parameter influences both throughput and latency, depending on the traffic +load on the port.
In general, lower values decrease latency while higher values +may be useful to achieve higher throughput. + +Low traffic (``packet rate < 1 / tx-flush-interval``) should not experience +any significant latency or throughput increase as packets are forwarded +immediately. + +At intermediate load levels +(``1 / tx-flush-interval < packet rate < 32 / tx-flush-interval``) traffic +should experience an average latency increase of up to +``1 / 2 * tx-flush-interval`` and a possible throughput improvement. + +Very high traffic (``packet rate >> 32 / tx-flush-interval``) should experience +an average latency increase equal to ``32 / (2 * packet rate)``. Most send +batches in this case will contain the maximum number of packets (``32``). + +A ``tx-flush-interval`` value of ``50`` microseconds has been shown to provide a +good performance increase in a ``PHY-VM-PHY`` scenario on an ``x86`` system for +interrupt-driven guests while keeping the latency increase at a reasonable +level: + + https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341628.html + +.. note:: + The throughput impact of this option depends significantly on the scenario +  and the traffic patterns.
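The three load regimes above can be checked numerically with a small, hypothetical helper. This is not OVS code; ``avg_latency_increase_us`` and ``MAX_BURST`` are illustrative names, with ``MAX_BURST`` standing in for the 32-packet batch limit:

```c
#include <assert.h>

/* Illustrative calculation (not OVS code) of the expected average latency
 * increase for a given packet rate (packets per microsecond) and
 * tx-flush-interval (microseconds), following the three regimes above. */
#define MAX_BURST 32

static double
avg_latency_increase_us(double pkt_rate_per_us, double tx_flush_interval_us)
{
    if (pkt_rate_per_us <= 1.0 / tx_flush_interval_us) {
        /* Low traffic: packets are forwarded immediately. */
        return 0.0;
    } else if (pkt_rate_per_us <= MAX_BURST / tx_flush_interval_us) {
        /* Intermediate load: up to half the flush interval on average. */
        return tx_flush_interval_us / 2;
    } else {
        /* Very high traffic: batches fill up before the deadline. */
        return MAX_BURST / (2 * pkt_rate_per_us);
    }
}
```

For ``tx-flush-interval=50``, a rate of 0.1 packets/us (100 kpps) falls in the intermediate regime and yields an average increase of 25 us, while 10 packets/us (10 Mpps) yields only 1.6 us because batches fill long before the deadline.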
For example, a ``tx-flush-interval`` value of ``50`` + microseconds shows performance degradation in a ``PHY-VM-PHY`` with bonded PHY + scenario while testing with ``256 - 1024`` packet flows: + + https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341700.html + +The average number of packets per output batch can be checked in PMD stats:: + + $ ovs-appctl dpif-netdev/pmd-stats-show

Limitations
------------

From patchwork Thu Jan 4 20:02:43 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855876
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:43 +0100
Message-Id: <1515096166-16257-6-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Cc: Ilya Maximets
Subject: [ovs-dev] [RFC PATCH 5/8] NEWS: Mark output packet batching support.
From: Ilya Maximets

The new feature should be mentioned in NEWS, especially because it has user-visible configuration options.

Signed-off-by: Ilya Maximets --- NEWS | 2 ++ 1 file changed, 2 insertions(+) diff --git a/NEWS b/NEWS index a7f2def..d9c6641 100644 --- a/NEWS +++ b/NEWS @@ -29,6 +29,8 @@ Post-v2.8.0 * Add support for vHost IOMMU * New debug appctl command 'netdev-dpdk/get-mempool-info'. * All the netdev-dpdk appctl commands described in ovs-vswitchd man page. + - Userspace datapath: + * Output packet batching support. - vswitchd: * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.

From patchwork Thu Jan 4 20:02:44 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855873
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:44 +0100
Message-Id: <1515096166-16257-7-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Subject: [ovs-dev] [RFC PATCH 6/8] dpif-netdev: Refactor cycle counting

Simplify the historically grown TSC cycle counting in PMD threads. Cycles are currently counted for the following purposes:

1. Measure PMD utilization

PMD utilization is defined as the ratio of cycles spent in busy iterations (at least one packet received or sent) over the total number of cycles. This is already done in pmd_perf_start_iteration() and pmd_perf_end_iteration() based on a TSC timestamp saved in the current iteration at start_iteration() and the actual TSC at end_iteration(). No dependency on intermediate cycle accounting.

2. Measure the processing load per RX queue

This comprises the cycles spent on polling and processing packets received from the rx queue and the cycles spent on delayed sending of these packets to tx queues (with time-based batching).

3. Measure the cycles spent on processing upcalls

These are part of the processing cycles of the PMD and rxq but are also measured separately for the purpose of supervising upcall performance and load.

The previous scheme using cycles_count_start(), cycles_count_intermediate() and cycles_count_end(), originally introduced to simplify cycle counting and save calls to rte_get_tsc_cycles(), was rather obscuring things.
Replaced this with dedicated pairs of cycles_count() around each task to be measured and accounting the difference in cycles as appropriate for the task. Each call to cycles_count(pmd) will now store the read TSC counter in pmd->ctx.last_cycles, so that users with lower accuracy requirements can read that value instead of calling cycles_count(). Signed-off-by: Jan Scheurich --- lib/dpif-netdev.c | 132 ++++++++++++++++-------------------------------------- 1 file changed, 39 insertions(+), 93 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index d16ba93..5d23128 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -522,11 +522,9 @@ struct dp_netdev_pmd_thread_ctx { /* Latest measured time. See 'pmd_thread_ctx_time_update()'. */ long long now; /* Used to count cycles. See 'cycles_count_end()' */ - unsigned long long last_cycles; + uint64_t last_cycles; /* RX queue from which last packet was received. */ struct dp_netdev_rxq *last_rxq; - /* Indicates how should be treated last counted cycles. */ - enum pmd_cycles_counter_type current_pmd_cycles_type; }; /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate @@ -3246,69 +3244,16 @@ dp_netdev_actions_free(struct dp_netdev_actions *actions) free(actions); } -static inline unsigned long long -cycles_counter(void) +static inline uint64_t +cycles_counter(struct dp_netdev_pmd_thread *pmd) { #ifdef DPDK_NETDEV - return rte_get_tsc_cycles(); + return pmd->ctx.last_cycles = rte_get_tsc_cycles(); #else - return 0; + return pmd->ctx.last_cycles = 0; #endif } -/* Fake mutex to make sure that the calls to cycles_count_* are balanced */ -extern struct ovs_mutex cycles_counter_fake_mutex; - -/* Start counting cycles. Must be followed by 'cycles_count_end()'. - * Counting starts from the idle type state. 
*/ -static inline void -cycles_count_start(struct dp_netdev_pmd_thread *pmd) - OVS_ACQUIRES(&cycles_counter_fake_mutex) - OVS_NO_THREAD_SAFETY_ANALYSIS -{ - pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE; - pmd->ctx.last_cycles = cycles_counter(); -} - -/* Stop counting cycles and add them to the counter of the current type. */ -static inline void -cycles_count_end(struct dp_netdev_pmd_thread *pmd) - OVS_RELEASES(&cycles_counter_fake_mutex) - OVS_NO_THREAD_SAFETY_ANALYSIS -{ - unsigned long long interval = cycles_counter() - pmd->ctx.last_cycles; - enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type; - - pmd_perf_update_counter(&pmd->perf_stats, type, interval); -} - -/* Calculate the intermediate cycle result and add to the counter of - * the current type */ -static inline void -cycles_count_intermediate(struct dp_netdev_pmd_thread *pmd, - struct dp_netdev_rxq **rxqs, int n_rxqs) - OVS_NO_THREAD_SAFETY_ANALYSIS -{ - unsigned long long new_cycles = cycles_counter(); - unsigned long long interval = new_cycles - pmd->ctx.last_cycles; - enum pmd_cycles_counter_type type = pmd->ctx.current_pmd_cycles_type; - int i; - - pmd->ctx.last_cycles = new_cycles; - - pmd_perf_update_counter(&pmd->perf_stats, type, interval); - if (n_rxqs && (type == PMD_CYCLES_POLL_BUSY)) { - /* Add to the amount of current processing cycles. 
*/ - interval /= n_rxqs; - for (i = 0; i < n_rxqs; i++) { - if (rxqs[i]) { - non_atomic_ullong_add(&rxqs[i]->cycles[RXQ_CYCLES_PROC_CURR], - interval); - } - } - } -} - static inline bool pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd) { @@ -3325,6 +3270,14 @@ dp_netdev_rxq_set_cycles(struct dp_netdev_rxq *rx, atomic_store_relaxed(&rx->cycles[type], cycles); } +static void +dp_netdev_rxq_add_cycles(struct dp_netdev_rxq *rx, + enum rxq_cycles_counter_type type, + unsigned long long cycles) +{ + non_atomic_ullong_add(&rx->cycles[type], cycles); +} + static uint64_t dp_netdev_rxq_get_cycles(struct dp_netdev_rxq *rx, enum rxq_cycles_counter_type type) @@ -3358,16 +3311,9 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, int output_cnt; bool dynamic_txqs; uint32_t tx_flush_interval; - enum pmd_cycles_counter_type save_pmd_cycles_type; - /* In case we're in PMD_CYCLES_PROCESSING state we need to count - * cycles for rxq we're processing now. */ - cycles_count_intermediate(pmd, &pmd->ctx.last_rxq, 1); - - /* Save current cycles counting state to restore after accounting - * send cycles. */ - save_pmd_cycles_type = pmd->ctx.current_pmd_cycles_type; - pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_BUSY; + /* Measure duration of batch transmission. */ + uint64_t cycles = cycles_counter(pmd); dynamic_txqs = p->port->dynamic_txqs; if (dynamic_txqs) { @@ -3394,9 +3340,15 @@ dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd, pmd_perf_update_counter(&pmd->perf_stats, PMD_STAT_SENT_BATCHES, 1); - /* Update send cycles for all the rx queues and restore previous state. */ - cycles_count_intermediate(pmd, p->output_pkts_rxqs, output_cnt); - pmd->ctx.current_pmd_cycles_type = save_pmd_cycles_type; + /* Divide the batch tx cost evenly over the packets' rxqs. 
*/ + cycles = (cycles_counter(pmd) - cycles) / output_cnt; + struct dp_netdev_rxq **rxqs = p->output_pkts_rxqs; + for (int i = 0; i < output_cnt; i++) { + if (rxqs[i]) { + dp_netdev_rxq_add_cycles(rxqs[i], RXQ_CYCLES_PROC_CURR, + cycles); + } + } return output_cnt; } @@ -3429,16 +3381,17 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, struct pmd_perf_stats *s = &pmd->perf_stats; struct dp_packet_batch batch; int error; - int batch_cnt = 0, output_cnt = 0; + int batch_cnt = 0; dp_packet_batch_init(&batch); - - cycles_count_intermediate(pmd, NULL, 0); pmd->ctx.last_rxq = rxq; - pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_BUSY; + + /* Measure duration for polling and processing rx burst. */ + uint64_t cycles = cycles_counter(pmd); error = netdev_rxq_recv(rxq->rx, &batch); if (!error) { + /* At least one packet received. */ *recirc_depth_get() = 0; pmd_thread_ctx_time_update(pmd); batch_cnt = batch.count; @@ -3448,7 +3401,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, histogram_add_sample(&s->pkts_per_batch, batch_cnt); /* Update the maximum Rx queue fill level. */ uint32_t qfill = batch.qfill; - switch (netdev_dpdk_get_type(netdev_rxq_get_netdev(rx))) { + switch (netdev_dpdk_get_type(netdev_rxq_get_netdev(rxq->rx))) { case DPDK_DEV_VHOST: if (qfill > s->current.max_vhost_qfill) { s->current.max_vhost_qfill = qfill; @@ -3461,11 +3414,13 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, } /* Process packet batch. */ dp_netdev_input(pmd, &batch, port_no); - cycles_count_intermediate(pmd, &rxq, 1); - - output_cnt = dp_netdev_pmd_flush_output_packets(pmd, false); + /* Add processing cycles to rxq stats. */ + cycles = cycles_counter(pmd) - cycles; + dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles); + /* Flush the send queues. 
*/ + dp_netdev_pmd_flush_output_packets(pmd, false); *flushed = true; - + } else if (error != EAGAIN && error != EOPNOTSUPP) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); @@ -3473,9 +3428,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, netdev_rxq_get_name(rxq->rx), ovs_strerror(error)); } - pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_POLL_IDLE; pmd->ctx.last_rxq = NULL; - return batch_cnt; } @@ -4105,7 +4058,6 @@ dpif_netdev_run(struct dpif *dpif) non_pmd = dp_netdev_get_pmd(dp, NON_PMD_CORE_ID); if (non_pmd) { ovs_mutex_lock(&dp->non_pmd_mutex); - cycles_count_start(non_pmd); HMAP_FOR_EACH (port, node, &dp->ports) { if (!netdev_is_pmd(port->netdev)) { int i; @@ -4126,7 +4078,6 @@ dpif_netdev_run(struct dpif *dpif) dp_netdev_pmd_flush_output_packets(non_pmd, false); } - cycles_count_end(non_pmd); dpif_netdev_xps_revalidate_pmd(non_pmd, false); ovs_mutex_unlock(&dp->non_pmd_mutex); @@ -4305,7 +4256,6 @@ reload: lc = UINT_MAX; } - cycles_count_start(pmd); for (;;) { uint64_t iter_packets = 0; bool flushed = false; @@ -4343,15 +4293,12 @@ reload: if (reload) { break; } - cycles_count_intermediate(pmd, NULL, 0); } - pmd_perf_end_iteration(s, pmd->ctx.last_cycles, iter_packets, + pmd_perf_end_iteration(s, cycles_counter(pmd), iter_packets, pmd_perf_metrics_enabled(pmd)); } - cycles_count_end(pmd); - poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list); exiting = latch_is_set(&pmd->exit_latch); /* Signal here to make sure the pmd finishes @@ -4791,7 +4738,6 @@ dp_netdev_configure_pmd(struct dp_netdev_pmd_thread *pmd, struct dp_netdev *dp, cmap_init(&pmd->flow_table); cmap_init(&pmd->classifiers); pmd->ctx.last_rxq = NULL; - pmd->ctx.current_pmd_cycles_type = PMD_CYCLES_IDLE; pmd_thread_ctx_time_update(pmd); pmd->next_optimization = pmd->ctx.now + DPCLS_OPTIMIZATION_INTERVAL; pmd->rxq_next_cycle_store = pmd->ctx.now + PMD_RXQ_INTERVAL_LEN; @@ -5247,7 +5193,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, struct match 
match; ovs_u128 ufid; int error; - uint64_t cycles = cycles_counter(); + uint64_t cycles = cycles_counter(pmd); match.tun_md.valid = false; miniflow_expand(&key->mf, &match.flow); @@ -5303,7 +5249,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, } if (pmd_perf_metrics_enabled(pmd)) { /* Update upcall stats. */ - cycles = cycles_counter() - cycles; + cycles = cycles_counter(pmd) - cycles; struct pmd_perf_stats *s = &pmd->perf_stats; s->current.upcalls++; s->current.upcall_cycles += cycles;

From patchwork Thu Jan 4 20:02:45 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855874
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:45 +0100
Message-Id: <1515096166-16257-8-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com>
Subject: [ovs-dev] [RFC PATCH 7/8] dpif-netdev: Reset the rxq current cycle counter on
reload.

From: Kevin Traynor

An rxq may have processing cycles counted in the current counter when a reload happens. That could temporarily create a small skew in the stats for an rxq. Reset the counter after reload.

Fixes: 4809891b2e01 ("dpif-netdev: Count the rxq processing cycles for an rxq.") Signed-off-by: Kevin Traynor --- lib/dpif-netdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 5d23128..fc10f8e 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4246,6 +4246,8 @@ reload: VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n", pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx), netdev_rxq_get_queue_id(poll_list[i].rxq->rx)); + /* Reset the rxq current cycles counter. 
*/ + dp_netdev_rxq_set_cycles(poll_list[i].rxq, RXQ_CYCLES_PROC_CURR, 0); } if (!poll_cnt) {

From patchwork Thu Jan 4 20:02:46 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 855880
From: Jan Scheurich
To: dev@openvswitch.org
Date: Thu, 4 Jan 2018 21:02:46 +0100
Message-Id: <1515096166-16257-9-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To:
<1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com> References: <1515096166-16257-1-git-send-email-jan.scheurich@ericsson.com> X-Provags-ID: V03:K0:HAA2XxU3ThRRtNspL16H9P+2My4xoU/XVOj5aKg42LcL46LymL/ eraMbujpa3E53555aAXiE/bpV+eAu/aDlSUND0JwIdfbSwmBBjYMM9w4/InpLLHYfb/fYW4 47dXSbN0ajob1M8nz6KJmmp6nprMjxzRsdwk+4rqBn1i3YUHisDpQuwB0GqClmaAmwh+yte Cc2y3izrZmguJTaS/DEIg== X-UI-Out-Filterresults: notjunk:1; V01:K0:ohOuIxrnaHc=:yCbYf70jtNWI87MmATsO9a +SUY7lS9wWep/977/sxT79C5Ngzgbq5F4Zd/BmYvqih5BWJNuZC6A6N5Fj7n+wZDyiggYRrAQ P0mLQ0dMsAprpHqzOz0IwwOVTrV/1p70zyKTaKbTDpcQ6QXaW5iMblhUKCJGpxcVMBmXmyLeW 4t9bNEjx22O5wQt3btOVO+WktNh8SyFq1rbpaW84To3wYz8s21FQJZGOTafqMiNEexDRanzMt xgtxYflrswQs6OKtXE3H9cR0bP9XfpNeygDaUWVOsiARtTYnuQxHSMfrrYz1PJ8/0GRk5sojV bd/yGgDOSgX9BrHyDOqtmv29Xnj+CTLscf1jQVMtcGIUv003OfX0Ae8sFUvke0k5MzXuIgad+ YqWd4Sfp1qehZGntVSRv3zcz4UIWRrzHKLuJkd6Y41V7BuerQVGW1vQ1ruCaDZgwr8ZjCBMan ijalB8HRucqJH+xAAtKffbVqeX3OVbar5OCmAeEIbMn/WwMUvfMvDkuSssa31zGW93j86Jkf1 pUm8hDR9F/dMenrTXZj55VgRydJoQWlwZZodfVwxWcY9jOX5mEsjOCoNydcZ9ONRk/cWWir5s misBZpLTVK5RvymFSUzZfVqBMr8FjIMV+j5OvcYhwNRTt6jXIAYEqxa+4CdDuRYvmrAdjg5dU aRKslxhHppvqXr2rDiuPjGKCs2ksv51WOmXZ63xsMycRtJBl7IXixjHoR0Tx6Xe+/17iCLDQD B/SShs39io10SiigTY5Hf6CT6cMGXFxqokTQLmArnbm0YCViiu5C2dGPZKNDFAsd+gfExFcN0 taNsxMiIMK4ZVcZqGiHqqjDM7m+/A== X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00, DATE_IN_PAST_03_06, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, T_RP_MATCHES_RCVD autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [ovs-dev] [RFC PATCH 8/8] dpif-netdev: Add percentage of pmd/core used by each rxq. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org It is based on the length of history that is stored about an rxq (currently 1 min). 
    $ ovs-appctl dpif-netdev/pmd-rxq-show
    pmd thread numa_id 0 core_id 4:
      isolated : false
      port: dpdkphy1    queue-id:  0  pmd usage: 70 %
      port: dpdkvhost0  queue-id:  0  pmd usage:  0 %
    pmd thread numa_id 0 core_id 6:
      isolated : false
      port: dpdkphy0    queue-id:  0  pmd usage: 64 %
      port: dpdkvhost1  queue-id:  0  pmd usage:  0 %

These values are what would be used as part of rxq to pmd assignment due
to a reconfiguration event, e.g. adding pmds, adding rxqs, or with the
command:

    ovs-appctl dpif-netdev/pmd-rxq-rebalance

Signed-off-by: Kevin Traynor
Co-authored-by: Kevin Traynor
Signed-off-by: Jan Scheurich
---
 Documentation/howto/dpdk.rst | 12 +++++++
 NEWS                         |  1 +
 lib/dpif-netdev.c            | 85 +++++++++++++++++++++++++++++++++-----------
 tests/pmd.at                 | 51 +++++++++++++++++++-------
 4 files changed, 116 insertions(+), 33 deletions(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 2393c2f..1597e1c 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -139,6 +139,18 @@
 Core 3: Q1 (80%) |
 Core 7: Q4 (70%) | Q5 (10%)
 core 8: Q3 (60%) | Q0 (30%)

+To see the current measured usage history of pmd core cycles for each rxq::
+
+  $ ovs-appctl dpif-netdev/pmd-rxq-show
+
+.. note::
+
+  A history of one minute is recorded and shown for each rxq to allow for
+  traffic pattern spikes. Changes in an rxq's pmd core cycles usage, due to
+  traffic pattern or reconfiguration, will take one minute before they are
+  fully reflected in the stats. In this way the stats show what would be
+  used during a new rxq to pmd assignment.
+
 Rxq to pmds assignment takes place whenever there are configuration changes
 or can be triggered by using::

diff --git a/NEWS b/NEWS
index d9c6641..e2ea776 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,7 @@ Post-v2.8.0
      * Add support for vHost IOMMU
      * New debug appctl command 'netdev-dpdk/get-mempool-info'.
      * All the netdev-dpdk appctl commands described in ovs-vswitchd man page.
+     * Add rxq utilization of pmd cycles to pmd-rxq-show
    - Userspace datapath:
      * Output packet batching support.
    - vswitchd:
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index fc10f8e..4761d3b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -369,6 +369,8 @@ struct dp_netdev_rxq {
     /* We store PMD_RXQ_INTERVAL_MAX intervals of data for an rxq and then
        sum them to yield the cycles used for an rxq. */
     atomic_ullong cycles_intrvl[PMD_RXQ_INTERVAL_MAX];
+    atomic_ullong intrvl_tsc1[PMD_RXQ_INTERVAL_MAX];
+    atomic_ullong intrvl_tsc2[PMD_RXQ_INTERVAL_MAX];
 };

 /* A port in a netdev-based datapath. */
@@ -573,7 +575,7 @@ struct dp_netdev_pmd_thread {
     /* Periodically sort subtable vectors according to hit frequencies */
     long long int next_optimization;
     /* End of the next time interval for which processing cycles
-       are stored for each polled rxq. */
+       are stored for each polled rxq. Same unit as pmd->ctx.now. */
     long long int rxq_next_cycle_store;

     /* Current context of the PMD thread. */
@@ -700,6 +702,8 @@ static inline void
 dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
                            struct polled_queue *poll_list, int poll_cnt);
 static void
+dp_netdev_rxq_cycles_reset(struct dp_netdev_rxq *rx);
+static void
 dp_netdev_rxq_set_cycles(struct dp_netdev_rxq *rx,
                          enum rxq_cycles_counter_type type,
                          unsigned long long cycles);
@@ -708,7 +712,8 @@
 dp_netdev_rxq_get_cycles(struct dp_netdev_rxq *rx,
                          enum rxq_cycles_counter_type type);
 static void
 dp_netdev_rxq_set_intrvl_cycles(struct dp_netdev_rxq *rx,
-                                unsigned long long cycles);
+                                uint64_t tsc_timestamp,
+                                uint64_t cycles);
 static uint64_t
 dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx);
 static void
@@ -981,9 +986,8 @@ static void
 pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
 {
     if (pmd->core_id != NON_PMD_CORE_ID) {
-        const char *prev_name = NULL;
         struct rxq_poll *list;
-        size_t i, n;
+        size_t n_rxq, idx;

         ds_put_format(reply,
                       "pmd thread numa_id %d core_id %u:\n\tisolated : %s\n",
@@ -991,22 +995,41 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
                       ? "true" : "false");

         ovs_mutex_lock(&pmd->port_mutex);
-        sorted_poll_list(pmd, &list, &n);
-        for (i = 0; i < n; i++) {
-            const char *name = netdev_rxq_get_name(list[i].rxq->rx);
-
-            if (!prev_name || strcmp(name, prev_name)) {
-                if (prev_name) {
-                    ds_put_cstr(reply, "\n");
-                }
-                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
+        sorted_poll_list(pmd, &list, &n_rxq);
+
+        for (int i = 0; i < n_rxq; i++) {
+            struct dp_netdev_rxq *rxq = list[i].rxq;
+            const char *name = netdev_rxq_get_name(rxq->rx);
+            uint64_t proc_cycles = 0;
+            uint64_t total_cycles = 0;
+
+            /* Collect the rxq cycle stats. */
+            idx = (rxq->intrvl_idx - 1) % PMD_RXQ_INTERVAL_MAX;
+            if (rxq->intrvl_tsc2[idx] > 0) {
+                /* Only show pmd usage if a full set of interval
+                 * measurements is available. */
+                total_cycles = rxq->intrvl_tsc1[idx] -
+                               rxq->intrvl_tsc2[idx];
+            }
+            for (int j = 0; j < PMD_RXQ_INTERVAL_MAX; j++) {
+                idx = (rxq->intrvl_idx + j) % PMD_RXQ_INTERVAL_MAX;
+                proc_cycles += rxq->cycles_intrvl[idx];
             }
-            ds_put_format(reply, " %d",
+
+            ds_put_format(reply, "\tport: %16s\tqueue-id: %2d", name,
                           netdev_rxq_get_queue_id(list[i].rxq->rx));
-            prev_name = name;
+            ds_put_format(reply, "\tpmd usage: ");
+            if (total_cycles > 0) {
+                ds_put_format(reply, "%2"PRIu64"",
+                              proc_cycles * 100 / total_cycles);
+                ds_put_cstr(reply, " %");
+            } else {
+                ds_put_format(reply, "%s", "NOT AVAIL");
+            }
+            ds_put_cstr(reply, "\n");
         }
         ovs_mutex_unlock(&pmd->port_mutex);
-        ds_put_cstr(reply, "\n");
         free(list);
     }
 }
@@ -3263,6 +3286,18 @@ pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd)
 }

 static void
+dp_netdev_rxq_cycles_reset(struct dp_netdev_rxq *rx)
+{
+    atomic_store_relaxed(&rx->cycles[RXQ_CYCLES_PROC_CURR], 0);
+    for (int i = 0; i < PMD_RXQ_INTERVAL_MAX; i++) {
+        atomic_store_relaxed(&rx->cycles_intrvl[i], 0);
+        atomic_store_relaxed(&rx->intrvl_tsc1[i], 0);
+        atomic_store_relaxed(&rx->intrvl_tsc2[i], 0);
+    }
+    rx->intrvl_idx = 0;
+}
+
+static void
 dp_netdev_rxq_set_cycles(struct dp_netdev_rxq *rx,
                          enum rxq_cycles_counter_type type,
                          unsigned long long cycles)
@@ -3289,10 +3324,17 @@ dp_netdev_rxq_get_cycles(struct dp_netdev_rxq *rx,

 static void
 dp_netdev_rxq_set_intrvl_cycles(struct dp_netdev_rxq *rx,
-                                unsigned long long cycles)
+                                uint64_t tsc_timestamp,
+                                uint64_t cycles)
 {
-    unsigned int idx = rx->intrvl_idx++ % PMD_RXQ_INTERVAL_MAX;
+    uint64_t old_tsc_ts;
+    size_t idx = rx->intrvl_idx % PMD_RXQ_INTERVAL_MAX;
+
     atomic_store_relaxed(&rx->cycles_intrvl[idx], cycles);
+    atomic_read_relaxed(&rx->intrvl_tsc1[idx], &old_tsc_ts);
+    atomic_store_relaxed(&rx->intrvl_tsc2[idx], old_tsc_ts);
+    atomic_store_relaxed(&rx->intrvl_tsc1[idx], tsc_timestamp);
+    rx->intrvl_idx++;
 }

 static uint64_t
@@ -4247,7 +4289,7 @@ reload:
                  pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),
                  netdev_rxq_get_queue_id(poll_list[i].rxq->rx));
         /* Reset the rxq current cycles counter. */
-        dp_netdev_rxq_set_cycles(poll_list[i].rxq, RXQ_CYCLES_PROC_CURR, 0);
+        dp_netdev_rxq_cycles_reset(poll_list[i].rxq);
     }

     if (!poll_cnt) {
@@ -6247,7 +6290,9 @@ dp_netdev_pmd_try_optimize(struct dp_netdev_pmd_thread *pmd,
     for (unsigned i = 0; i < poll_cnt; i++) {
         uint64_t rxq_cyc_curr = dp_netdev_rxq_get_cycles(poll_list[i].rxq,
                                                          RXQ_CYCLES_PROC_CURR);
-        dp_netdev_rxq_set_intrvl_cycles(poll_list[i].rxq, rxq_cyc_curr);
+        dp_netdev_rxq_set_intrvl_cycles(poll_list[i].rxq,
+                                        cycles_counter(pmd),
+                                        rxq_cyc_curr);
         dp_netdev_rxq_set_cycles(poll_list[i].rxq, RXQ_CYCLES_PROC_CURR, 0);
     }
diff --git a/tests/pmd.at b/tests/pmd.at
index 0356f87..430b875 100644
--- a/tests/pmd.at
+++ b/tests/pmd.at
@@ -6,7 +6,15 @@ m4_divert_push([PREPARE_TESTS])
 # of every rxq (one per line) in the form:
 # port_name rxq_id numa_id core_id
 parse_pmd_rxq_show () {
-    awk '/pmd/ {numa=$4; core=substr($6, 1, length($6) - 1)} /\t/{for (i=4; i<=NF; i++) print $2, $i, numa, core}' | sort
+    awk '/pmd thread/ {numa=$4; core=substr($6, 1, length($6) - 1)} /\tport:/ {print $2, $4, numa, core}' | sort
+}
+
+# Given the output of `ovs-appctl dpif-netdev/pmd-rxq-show`,
+# and with queues for each core on one line, prints the rxqs
+# of the core on one line
+# 'port:' port_name 'queue_id:' rxq_id rxq_id rxq_id rxq_id
+parse_pmd_rxq_show_group () {
+    awk '/port:/ {print $1, $2, $3, $4, $12, $20, $28}'
 }

 # Given the output of `ovs-appctl dpctl/dump-flows`, prints a list of flows
@@ -53,7 +61,7 @@ m4_define([CHECK_PMD_THREADS_CREATED], [
 ])

 m4_define([SED_NUMA_CORE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1\2:/"])
-m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/\(numa_id \)[[0-9]]*\( core_id \)[[0-9]]*:/\1\2:/;s/\(queue-id: \)1 2 5 6/\1/;s/\(queue-id: \)0 3 4 7/\1/"])
+m4_define([SED_NUMA_CORE_QUEUE_PATTERN], ["s/1 2 5 6//;s/0 3 4 7//"])
 m4_define([DUMMY_NUMA], [--dummy-numa="0,0,0,0"])

 AT_SETUP([PMD - creating a thread/add-port])
@@ -65,7 +73,7 @@ CHECK_PMD_THREADS_CREATED()

 AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
 pmd thread numa_id  core_id :
  isolated : false
- port: p0 queue-id: 0
+ port: p0 queue-id:  0  pmd usage: NOT AVAIL
 ])

 AT_CHECK([ovs-appctl dpif/show | sed 's/\(tx_queues=\)[[0-9]]*/\1/g'], [0], [dnl
@@ -96,7 +104,14 @@ dummy@ovs-dummy: hit:0 missed:0
 AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
 pmd thread numa_id  core_id :
  isolated : false
- port: p0 queue-id: 0 1 2 3 4 5 6 7
+ port: p0 queue-id:  0  pmd usage: NOT AVAIL
+ port: p0 queue-id:  1  pmd usage: NOT AVAIL
+ port: p0 queue-id:  2  pmd usage: NOT AVAIL
+ port: p0 queue-id:  3  pmd usage: NOT AVAIL
+ port: p0 queue-id:  4  pmd usage: NOT AVAIL
+ port: p0 queue-id:  5  pmd usage: NOT AVAIL
+ port: p0 queue-id:  6  pmd usage: NOT AVAIL
+ port: p0 queue-id:  7  pmd usage: NOT AVAIL
 ])

 OVS_VSWITCHD_STOP
@@ -120,20 +135,23 @@ dummy@ovs-dummy: hit:0 missed:0
 AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
 pmd thread numa_id  core_id :
  isolated : false
- port: p0 queue-id: 0 1 2 3 4 5 6 7
+ port: p0 queue-id:  0  pmd usage: NOT AVAIL
+ port: p0 queue-id:  1  pmd usage: NOT AVAIL
+ port: p0 queue-id:  2  pmd usage: NOT AVAIL
+ port: p0 queue-id:  3  pmd usage: NOT AVAIL
+ port: p0 queue-id:  4  pmd usage: NOT AVAIL
+ port: p0 queue-id:  5  pmd usage: NOT AVAIL
+ port: p0 queue-id:  6  pmd usage: NOT AVAIL
+ port: p0 queue-id:  7  pmd usage: NOT AVAIL
 ])

 TMP=$(cat ovs-vswitchd.log | wc -l | tr -d [[:blank:]])
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x3])
 CHECK_PMD_THREADS_CREATED([2], [], [+$TMP])

-AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_QUEUE_PATTERN], [0], [dnl
-pmd thread numa_id  core_id :
- isolated : false
- port: p0 queue-id:
-pmd thread numa_id  core_id :
- isolated : false
- port: p0 queue-id:
+AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed ':a;/AVAIL$/{N;s/\n//;ba}' | parse_pmd_rxq_show_group | sed SED_NUMA_CORE_QUEUE_PATTERN], [0], [dnl
+port: p0 queue-id:
+port: p0 queue-id:
 ])

 TMP=$(cat ovs-vswitchd.log | wc -l | tr -d [[:blank:]])
@@ -143,7 +161,14 @@ CHECK_PMD_THREADS_CREATED([1], [], [+$TMP])

 AT_CHECK([ovs-appctl dpif-netdev/pmd-rxq-show | sed SED_NUMA_CORE_PATTERN], [0], [dnl
 pmd thread numa_id  core_id :
  isolated : false
- port: p0 queue-id: 0 1 2 3 4 5 6 7
+ port: p0 queue-id:  0  pmd usage: NOT AVAIL
+ port: p0 queue-id:  1  pmd usage: NOT AVAIL
+ port: p0 queue-id:  2  pmd usage: NOT AVAIL
+ port: p0 queue-id:  3  pmd usage: NOT AVAIL
+ port: p0 queue-id:  4  pmd usage: NOT AVAIL
+ port: p0 queue-id:  5  pmd usage: NOT AVAIL
+ port: p0 queue-id:  6  pmd usage: NOT AVAIL
+ port: p0 queue-id:  7  pmd usage: NOT AVAIL
 ])

 OVS_VSWITCHD_STOP