From patchwork Mon Aug 26 12:59:03 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li,Rongqing via dev"
X-Patchwork-Id: 1153171
X-Patchwork-Delegate: i.maximets@samsung.com
To: ovs-dev@openvswitch.org, i.maximets@samsung.com
Date: Mon, 26 Aug 2019 18:29:03 +0530
Message-Id: <1566824343-18855-1-git-send-email-sriram.v@altencalsoftlabs.com>
X-Mailer: git-send-email 2.7.4
Subject: [ovs-dev] [PATCH v6] Detailed packet drop statistics per dpdk and vhostuser ports
X-Patchwork-Original-From: Sriram Vatala via dev
From: "Li,Rongqing via dev"
Reply-To: Sriram Vatala

OVS may be unable to transmit packets for multiple reasons, and today there
is a single counter to track packets dropped due to any of those reasons.
The most common reason is that a VM is unable to read packets fast enough,
causing the vhostuser port transmit queue on the OVS side to become full.
This manifests as a problem with VNFs not receiving all packets. Having a
separate drop counter to track packets dropped because the transmit queue
is full will clearly indicate that the problem is on the VM side and not in
OVS. Similarly, maintaining separate counters for all possible drops helps
identify the cause of packet drops.

This patch adds custom stats counters to track packets dropped at port
level. These counters are displayed along with other stats in the output of
the "ovs-vsctl get interface statistics" command. The detailed stats are
available for both dpdk and vhostuser ports.

Signed-off-by: Sriram Vatala
---
 lib/netdev-dpdk.c                                  | 120 ++++++++++++++++++---
 utilities/bugtool/automake.mk                      |   3 +-
 utilities/bugtool/ovs-bugtool-get-iface-stats      |  25 +++++
 .../bugtool/plugins/network-status/openvswitch.xml |   1 +
 vswitchd/vswitch.xml                               |  24 +++++
 5 files changed, 157 insertions(+), 16 deletions(-)
 create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 4805783..6685f32 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -447,8 +447,14 @@ struct netdev_dpdk {

     PADDED_MEMBERS(CACHE_LINE_SIZE,
         struct netdev_stats stats;
-        /* Custom stat for retries when unable to transmit. */
+        /* Counters for Custom device stats */
+        /* No. of retries when unable to transmit. */
         uint64_t tx_retries;
+        /* Pkts left untransmitted in Tx buffers. Probably Tx queue is full */
+        uint64_t tx_failure_drops;
+        uint64_t tx_mtu_exceeded_drops;
+        uint64_t tx_qos_drops;
+        uint64_t rx_qos_drops;
         /* Protects stats */
         rte_spinlock_t stats_lock;
         /* 4 pad bytes here. */
@@ -2205,6 +2211,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     uint16_t nb_rx = 0;
     uint16_t dropped = 0;
+    uint16_t qos_drops = 0;
     int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
     int vid = netdev_dpdk_get_vid(dev);

@@ -2236,11 +2243,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }

     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
                                          nb_rx, dropped);
+    dev->rx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);

     batch->count = nb_rx;
@@ -2266,6 +2275,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     int nb_rx;
     int dropped = 0;
+    int qos_drops = 0;

     if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
         return EAGAIN;
@@ -2284,12 +2294,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }

     /* Update stats to reflect dropped packets */
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.rx_dropped += dropped;
+        dev->rx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }

@@ -2373,6 +2385,9 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
     unsigned int total_pkts = cnt;
     unsigned int dropped = 0;
+    unsigned int tx_failure;
+    unsigned int mtu_drops;
+    unsigned int qos_drops;
     int i, retries = 0;
     int max_retries = VHOST_ENQ_RETRY_MIN;
     int vid = netdev_dpdk_get_vid(dev);
@@ -2390,9 +2405,12 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     rte_spinlock_lock(&dev->tx_q[qid].tx_lock);

     cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
+    mtu_drops = total_pkts - cnt;
+    qos_drops = cnt;
     /* Check has QoS has been configured for the netdev */
     cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    dropped = total_pkts - cnt;
+    qos_drops -= cnt;
+    dropped = qos_drops + mtu_drops;

     do {
         int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -2417,12 +2435,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         }
     } while (cnt && (retries++ < max_retries));

+    tx_failure = cnt;
     rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);

     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
                                          cnt + dropped);
     dev->tx_retries += MIN(retries, max_retries);
+    dev->tx_failure_drops += tx_failure;
+    dev->tx_mtu_exceeded_drops += mtu_drops;
+    dev->tx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);

 out:
@@ -2447,12 +2469,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
     struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
     uint32_t cnt = batch_cnt;
     uint32_t dropped = 0;
+    uint32_t tx_failure = 0;
+    uint32_t mtu_drops = 0;
+    uint32_t qos_drops = 0;

     if (dev->type != DPDK_DEV_VHOST) {
         /* Check if QoS has been configured for this netdev. */
         cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets,
                                   batch_cnt, false);
-        dropped += batch_cnt - cnt;
+        qos_drops = batch_cnt - cnt;
     }

     uint32_t txcnt = 0;
@@ -2465,13 +2490,13 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
             VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
                          size, dev->max_packet_len);

-            dropped++;
+            mtu_drops++;
             continue;
         }

         pkts[txcnt] = rte_pktmbuf_alloc(dev->dpdk_mp->mp);
         if (OVS_UNLIKELY(!pkts[txcnt])) {
-            dropped += cnt - i;
+            dropped = cnt - i;
             break;
         }

@@ -2488,13 +2513,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
             __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts,
                                      txcnt);
         } else {
-            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
+            tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
         }
     }

+    dropped += qos_drops + mtu_drops + tx_failure;
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.tx_dropped += dropped;
+        dev->tx_failure_drops += tx_failure;
+        dev->tx_mtu_exceeded_drops += mtu_drops;
+        dev->tx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }
 }
@@ -2536,18 +2565,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
         dp_packet_delete_batch(batch, true);
     } else {
         int tx_cnt, dropped;
+        int tx_failure, mtu_drops, qos_drops;
         int batch_cnt = dp_packet_batch_size(batch);
         struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;

         tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt);
+        mtu_drops = batch_cnt - tx_cnt;
+        qos_drops = tx_cnt;
         tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true);
-        dropped = batch_cnt - tx_cnt;
+        qos_drops -= tx_cnt;

-        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
+        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);

+        dropped = tx_failure + mtu_drops + qos_drops;
         if (OVS_UNLIKELY(dropped)) {
             rte_spinlock_lock(&dev->stats_lock);
             dev->stats.tx_dropped += dropped;
+            dev->tx_failure_drops += tx_failure;
+            dev->tx_mtu_exceeded_drops += mtu_drops;
+            dev->tx_qos_drops += qos_drops;
             rte_spinlock_unlock(&dev->stats_lock);
         }
     }
@@ -2816,6 +2852,17 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
     uint32_t i;
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int rte_xstats_ret;
+    uint16_t index = 0;
+
+#define DPDK_CSTATS                         \
+    DPDK_CSTAT(tx_failure_drops)            \
+    DPDK_CSTAT(tx_mtu_exceeded_drops)       \
+    DPDK_CSTAT(tx_qos_drops)                \
+    DPDK_CSTAT(rx_qos_drops)
+
+#define DPDK_CSTAT(NAME) +1
+    custom_stats->size = DPDK_CSTATS;
+#undef DPDK_CSTAT

     ovs_mutex_lock(&dev->mutex);

@@ -2830,9 +2877,10 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         if (rte_xstats_ret > 0 &&
             rte_xstats_ret <= dev->rte_xstats_ids_size) {

-            custom_stats->size = rte_xstats_ret;
+            index = rte_xstats_ret;
+            custom_stats->size += rte_xstats_ret;
             custom_stats->counters =
-                    (struct netdev_custom_counter *) xcalloc(rte_xstats_ret,
+                  (struct netdev_custom_counter *) xcalloc(custom_stats->size,
                             sizeof(struct netdev_custom_counter));

             for (i = 0; i < rte_xstats_ret; i++) {
@@ -2846,7 +2894,6 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
             VLOG_WARN("Cannot get XSTATS values for port: "DPDK_PORT_ID_FMT,
                       dev->port_id);
             custom_stats->counters = NULL;
-            custom_stats->size = 0;
             /* Let's clear statistics cache, so it will be
              * reconfigured */
             netdev_dpdk_clear_xstats(dev);
@@ -2855,6 +2902,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         free(values);
     }

+    if (custom_stats->counters == NULL) {
+        custom_stats->counters =
+            (struct netdev_custom_counter *) xcalloc(custom_stats->size,
+                  sizeof(struct netdev_custom_counter));
+    }
+
+    rte_spinlock_lock(&dev->stats_lock);
+    i = index;
+#define DPDK_CSTAT(NAME)                                                \
+    ovs_strlcpy(custom_stats->counters[i++].name, #NAME,                \
+                NETDEV_CUSTOM_STATS_NAME_SIZE);
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+
+    i = index;
+#define DPDK_CSTAT(NAME)                                \
+    custom_stats->counters[i++].value = dev->NAME;
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+    rte_spinlock_unlock(&dev->stats_lock);
+
     ovs_mutex_unlock(&dev->mutex);

     return 0;
@@ -2865,17 +2933,39 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
                                    struct netdev_custom_stats *custom_stats)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int i;
+
+#define VHOST_CSTATS                        \
+    VHOST_CSTAT(tx_retries)                 \
+    VHOST_CSTAT(tx_failure_drops)           \
+    VHOST_CSTAT(tx_mtu_exceeded_drops)      \
+    VHOST_CSTAT(tx_qos_drops)               \
+    VHOST_CSTAT(rx_qos_drops)
+
+#define VHOST_CSTAT(NAME) +1
+    custom_stats->size = VHOST_CSTATS;
+#undef VHOST_CSTAT

-    ovs_mutex_lock(&dev->mutex);

-    custom_stats->size = VHOST_CUSTOM_STATS_SIZE;
     custom_stats->counters = xcalloc(custom_stats->size,
                                      sizeof *custom_stats->counters);
-    ovs_strlcpy(custom_stats->counters[0].name, VHOST_STAT_TX_RETRIES,
+    i = 0;
+#define VHOST_CSTAT(NAME)                                                \
+    ovs_strlcpy(custom_stats->counters[i++].name, #NAME,                 \
                 NETDEV_CUSTOM_STATS_NAME_SIZE);
+    VHOST_CSTATS;
+#undef VHOST_CSTAT
+
+    ovs_mutex_lock(&dev->mutex);

     rte_spinlock_lock(&dev->stats_lock);
-    custom_stats->counters[0].value = dev->tx_retries;
+
+    i = 0;
+#define VHOST_CSTAT(NAME) \
+    custom_stats->counters[i++].value = dev->NAME;
+    VHOST_CSTATS;
+#undef VHOST_CSTAT
+
     rte_spinlock_unlock(&dev->stats_lock);

     ovs_mutex_unlock(&dev->mutex);
diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk
index 18fa347..9657468 100644
--- a/utilities/bugtool/automake.mk
+++ b/utilities/bugtool/automake.mk
@@ -22,7 +22,8 @@ bugtool_scripts = \
 	utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \
 	utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \
 	utilities/bugtool/ovs-bugtool-qos-configs \
-	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa
+	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \
+	utilities/bugtool/ovs-bugtool-get-iface-stats

 scripts_SCRIPTS += $(bugtool_scripts)

diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats b/utilities/bugtool/ovs-bugtool-get-iface-stats
new file mode 100755
index 0000000..0fe175e
--- /dev/null
+++ b/utilities/bugtool/ovs-bugtool-get-iface-stats
@@ -0,0 +1,25 @@
+#! /bin/bash
+
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of version 2.1 of the GNU Lesser General
+# Public License as published by the Free Software Foundation.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# Copyright (C) 2019 Ericsson AB
+
+for bridge in `ovs-vsctl -- --real list-br`
+do
+    echo -e "\nBridge : ${bridge}\n"
+    for iface in `ovs-vsctl list-ifaces ${bridge}`
+    do
+        echo -e "iface : ${iface}"
+        ovs-vsctl get interface ${iface} statistics
+        echo -e "\n"
+    done
+    echo -e "iface : ${bridge}"
+    ovs-vsctl get interface ${bridge} statistics
+done
diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml b/utilities/bugtool/plugins/network-status/openvswitch.xml
index d39867c..f8c4ff0 100644
--- a/utilities/bugtool/plugins/network-status/openvswitch.xml
+++ b/utilities/bugtool/plugins/network-status/openvswitch.xml
@@ -34,6 +34,7 @@
     ovs-appctl dpctl/dump-flows netdev@ovs-netdev
     ovs-appctl dpctl/dump-flows system@ovs-system
     ovs-appctl dpctl/show -s
+    /usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats
     /usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "show"
     /usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-flows"
     /usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-ports"
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 9a743c0..4a7bcf8 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
           the above.
+
+
+          Total number of transmit retries on a vhost-user or vhost-user-client
+          interface.
+
+
+          Total number of packets dropped because the DPDK transmit API for
+          physical/vhost ports fails to transmit the packets. This happens
+          most likely because the transmit queue is full or has been filled
+          up. There are other reasons as well which are unlikely to happen.
+
+
+          Number of packets dropped due to packet length exceeding the max
+          device MTU.
+
+
+          Total number of packets dropped due to transmission rate exceeding
+          the configured egress policer rate.
+
+
+          Total number of packets dropped due to reception rate exceeding
+          the configured ingress policer rate.
+
+
+
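
Note on the DPDK_CSTATS / VHOST_CSTATS pattern used in this patch: the
counter list is written once and expanded several times with different
definitions of the per-entry macro (an "X-macro"), so the counter count, the
names, and the values cannot drift apart. The following is a minimal
standalone sketch of that idea, not the OVS code itself. The names demo_dev,
demo_counter, DEMO_CSTATS, DEMO_N_CSTATS and demo_fill are hypothetical and
exist only for illustration; locking and allocation are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-port drop counters (not the OVS struct). */
struct demo_dev {
    uint64_t tx_failure_drops;
    uint64_t tx_mtu_exceeded_drops;
    uint64_t tx_qos_drops;
    uint64_t rx_qos_drops;
};

/* Hypothetical name/value pair, similar in spirit to a custom counter. */
struct demo_counter {
    char name[64];
    uint64_t value;
};

/* The single list of counters; every expansion below walks this list. */
#define DEMO_CSTATS                         \
    DEMO_CSTAT(tx_failure_drops)            \
    DEMO_CSTAT(tx_mtu_exceeded_drops)       \
    DEMO_CSTAT(tx_qos_drops)                \
    DEMO_CSTAT(rx_qos_drops)

/* Expanding each entry to "+1" gives the number of entries at compile time. */
#define DEMO_CSTAT(NAME) +1
enum { DEMO_N_CSTATS = DEMO_CSTATS };
#undef DEMO_CSTAT

/* Fill 'out' (at least DEMO_N_CSTATS entries) with names and values. */
static void
demo_fill(const struct demo_dev *dev, struct demo_counter *out)
{
    int i = 0;

    /* #NAME stringifies the field name; dev->NAME reads the same field. */
#define DEMO_CSTAT(NAME)                                        \
    snprintf(out[i].name, sizeof out[i].name, "%s", #NAME);     \
    out[i].value = dev->NAME;                                   \
    i++;
    DEMO_CSTATS
#undef DEMO_CSTAT

    /* The fill expansion and the count expansion must agree. */
    assert(i == DEMO_N_CSTATS);
}
```

A caller would declare an array of DEMO_N_CSTATS entries and pass it to
demo_fill(). Adding a counter then means adding one struct field and one
DEMO_CSTAT() line; the size and both fill passes follow automatically.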