Message ID:   1566997134-10160-1-git-send-email-sriram.v@altencalsoftlabs.com
State:        Changes Requested
Delegated to: Ilya Maximets
Series:       [ovs-dev,v7] Detailed packet drop statistics per dpdk and vhostuser ports
Bleep bloop.  Greetings Sriram Vatala via dev, I am a robot and I have tried
out your patch.  Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.

checkpatch:
WARNING: Line is 138 characters long (recommended limit is 79)
#351 FILE: utilities/bugtool/plugins/network-status/openvswitch.xml:37:
    <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command>

Lines checked: 392, Warnings: 1, Errors: 0

Please check this out.  If you feel there has been an error, please email
aconole@redhat.com

Thanks,
0-day Robot
Hi All,

Please consider this a gentle reminder.

Thanks & Regards,
Sriram.

-----Original Message-----
From: Sriram Vatala <sriram.v@altencalsoftlabs.com>
Sent: 28 August 2019 18:29
To: ovs-dev@openvswitch.org; i.maximets@samsung.com
Cc: blp@ovn.org; ian.stokes@intel.com; Sriram Vatala <sriram.v@altencalsoftlabs.com>
Subject: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser ports

OVS may be unable to transmit packets for multiple reasons, and today there
is a single counter to track packets dropped due to any of those reasons.
The most common reason is that a VM is unable to read packets fast enough,
causing the vhostuser port transmit queue on the OVS side to become full.
This manifests as a problem with VNFs not receiving all packets.  Having a
separate drop counter to track packets dropped because the transmit queue is
full will clearly indicate that the problem is on the VM side and not in
OVS.  Similarly, maintaining separate counters for all possible drop causes
helps identify the actual reason for packet drops.

This patch adds custom stats counters to track packets dropped at the port
level; these counters are displayed along with other stats by the
"ovs-vsctl get interface <iface> statistics" command.  The detailed stats
are available for both dpdk and vhostuser ports.
Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
---
 lib/netdev-dpdk.c                                  | 99 +++++++++++++++++++---
 utilities/bugtool/automake.mk                      |  3 +-
 utilities/bugtool/ovs-bugtool-get-iface-stats      | 25 ++++++
 .../bugtool/plugins/network-status/openvswitch.xml |  1 +
 vswitchd/vswitch.xml                               | 24 ++++++
 5 files changed, 139 insertions(+), 13 deletions(-)
 create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..a679c5b 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -413,8 +413,17 @@ struct netdev_dpdk {
 
     PADDED_MEMBERS(CACHE_LINE_SIZE,
         struct netdev_stats stats;
-        /* Custom stat for retries when unable to transmit. */
+        /* Custom device stats */
+        /* No. of retries when unable to transmit. */
         uint64_t tx_retries;
+        /* Pkts dropped when unable to transmit; probably Tx queue is full */
+        uint64_t tx_failure_drops;
+        /* Pkt len greater than max dev MTU */
+        uint64_t tx_mtu_exceeded_drops;
+        /* Pkt drops in egress policer processing */
+        uint64_t tx_qos_drops;
+        /* Pkt drops in ingress policer processing */
+        uint64_t rx_qos_drops;
         /* Protects stats */
         rte_spinlock_t stats_lock;
         /* 4 pad bytes here. */
@@ -2171,6 +2180,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     uint16_t nb_rx = 0;
     uint16_t dropped = 0;
+    uint16_t qos_drops = 0;
     int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
     int vid = netdev_dpdk_get_vid(dev);
 
@@ -2202,11 +2212,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }
 
     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
                                          nb_rx, dropped);
+    dev->rx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);
 
     batch->count = nb_rx;
@@ -2232,6 +2244,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     int nb_rx;
     int dropped = 0;
+    int qos_drops = 0;
 
     if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
         return EAGAIN;
@@ -2250,12 +2263,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }
 
     /* Update stats to reflect dropped packets */
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.rx_dropped += dropped;
+        dev->rx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }
 
@@ -2339,6 +2354,9 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
     unsigned int total_pkts = cnt;
     unsigned int dropped = 0;
+    unsigned int tx_failure;
+    unsigned int mtu_drops;
+    unsigned int qos_drops;
     int i, retries = 0;
     int max_retries = VHOST_ENQ_RETRY_MIN;
     int vid = netdev_dpdk_get_vid(dev);
@@ -2356,9 +2374,12 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
 
     cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
+    mtu_drops = total_pkts - cnt;
+    qos_drops = cnt;
     /* Check if QoS has been configured for the netdev. */
     cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    dropped = total_pkts - cnt;
+    qos_drops -= cnt;
+    dropped = qos_drops + mtu_drops;
 
     do {
         int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -2383,12 +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         }
     } while (cnt && (retries++ < max_retries));
 
+    tx_failure = cnt;
     rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
 
     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
                                          cnt + dropped);
     dev->tx_retries += MIN(retries, max_retries);
+    dev->tx_failure_drops += tx_failure;
+    dev->tx_mtu_exceeded_drops += mtu_drops;
+    dev->tx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);
 
 out:
@@ -2413,12 +2438,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
     struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
     uint32_t cnt = batch_cnt;
     uint32_t dropped = 0;
+    uint32_t tx_failure = 0;
+    uint32_t mtu_drops = 0;
+    uint32_t qos_drops = 0;
 
     if (dev->type != DPDK_DEV_VHOST) {
         /* Check if QoS has been configured for this netdev. */
         cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets,
                                   batch_cnt, false);
-        dropped += batch_cnt - cnt;
+        qos_drops = batch_cnt - cnt;
     }
 
     uint32_t txcnt = 0;
@@ -2431,7 +2459,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
             VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
                          size, dev->max_packet_len);
 
-            dropped++;
+            mtu_drops++;
             continue;
         }
 
@@ -2454,13 +2482,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
         __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts,
                                  txcnt);
     } else {
-        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
+        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts,
+                                              txcnt);
     }
 
+    dropped += qos_drops + mtu_drops + tx_failure;
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.tx_dropped += dropped;
+        dev->tx_failure_drops += tx_failure;
+        dev->tx_mtu_exceeded_drops += mtu_drops;
+        dev->tx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }
 }
@@ -2502,18 +2534,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
         dp_packet_delete_batch(batch, true);
     } else {
         int tx_cnt, dropped;
+        int tx_failure, mtu_drops, qos_drops;
         int batch_cnt = dp_packet_batch_size(batch);
         struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
 
         tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt);
+        mtu_drops = batch_cnt - tx_cnt;
+        qos_drops = tx_cnt;
         tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true);
-        dropped = batch_cnt - tx_cnt;
+        qos_drops -= tx_cnt;
 
-        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
+        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
 
+        dropped = tx_failure + mtu_drops + qos_drops;
         if (OVS_UNLIKELY(dropped)) {
             rte_spinlock_lock(&dev->stats_lock);
             dev->stats.tx_dropped += dropped;
+            dev->tx_failure_drops += tx_failure;
+            dev->tx_mtu_exceeded_drops += mtu_drops;
+            dev->tx_qos_drops += qos_drops;
             rte_spinlock_unlock(&dev->stats_lock);
         }
     }
@@ -2772,6 +2811,17 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
     uint32_t i;
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int rte_xstats_ret;
+    uint16_t index = 0;
+
+#define DPDK_CSTATS \
+    DPDK_CSTAT(tx_failure_drops) \
+    DPDK_CSTAT(tx_mtu_exceeded_drops) \
+    DPDK_CSTAT(tx_qos_drops) \
+    DPDK_CSTAT(rx_qos_drops)
+
+#define DPDK_CSTAT(NAME) +1
+    custom_stats->size = DPDK_CSTATS;
+#undef DPDK_CSTAT
 
     ovs_mutex_lock(&dev->mutex);
 
@@ -2786,9 +2836,10 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
     if (rte_xstats_ret > 0 &&
         rte_xstats_ret <= dev->rte_xstats_ids_size) {
 
-        custom_stats->size = rte_xstats_ret;
+        index = rte_xstats_ret;
+        custom_stats->size += rte_xstats_ret;
         custom_stats->counters =
-                (struct netdev_custom_counter *) xcalloc(rte_xstats_ret,
+                (struct netdev_custom_counter *) xcalloc(custom_stats->size,
                         sizeof(struct netdev_custom_counter));
 
         for (i = 0; i < rte_xstats_ret; i++) {
@@ -2802,7 +2853,6 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         VLOG_WARN("Cannot get XSTATS values for port: "DPDK_PORT_ID_FMT,
                   dev->port_id);
         custom_stats->counters = NULL;
-        custom_stats->size = 0;
         /* Let's clear statistics cache, so it will be
          * reconfigured */
         netdev_dpdk_clear_xstats(dev);
@@ -2811,6 +2861,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         free(values);
     }
 
+    if (custom_stats->counters == NULL) {
+        custom_stats->counters =
+            (struct netdev_custom_counter *) xcalloc(custom_stats->size,
+                    sizeof(struct netdev_custom_counter));
+    }
+
+    rte_spinlock_lock(&dev->stats_lock);
+    i = index;
+#define DPDK_CSTAT(NAME) \
+    ovs_strlcpy(custom_stats->counters[i++].name, #NAME, \
+                NETDEV_CUSTOM_STATS_NAME_SIZE);
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+
+    i = index;
+#define DPDK_CSTAT(NAME) \
+    custom_stats->counters[i++].value = dev->NAME;
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+    rte_spinlock_unlock(&dev->stats_lock);
+
     ovs_mutex_unlock(&dev->mutex);
 
     return 0;
@@ -2823,8 +2894,12 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int i;
 
-#define VHOST_CSTATS \
-    VHOST_CSTAT(tx_retries)
+#define VHOST_CSTATS \
+    VHOST_CSTAT(tx_retries) \
+    VHOST_CSTAT(tx_failure_drops) \
+    VHOST_CSTAT(tx_mtu_exceeded_drops) \
+    VHOST_CSTAT(tx_qos_drops) \
+    VHOST_CSTAT(rx_qos_drops)
 
 #define VHOST_CSTAT(NAME) + 1
     custom_stats->size = VHOST_CSTATS;
diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk
index 18fa347..9657468 100644
--- a/utilities/bugtool/automake.mk
+++ b/utilities/bugtool/automake.mk
@@ -22,7 +22,8 @@ bugtool_scripts = \
 	utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \
 	utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \
 	utilities/bugtool/ovs-bugtool-qos-configs \
-	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa
+	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \
+	utilities/bugtool/ovs-bugtool-get-iface-stats
 
 scripts_SCRIPTS += $(bugtool_scripts)
 
diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats b/utilities/bugtool/ovs-bugtool-get-iface-stats
new file mode 100755
index 0000000..0fe175e
--- /dev/null
+++ b/utilities/bugtool/ovs-bugtool-get-iface-stats
@@ -0,0 +1,25 @@
+#! /bin/bash
+
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of version 2.1 of the GNU Lesser General
+# Public License as published by the Free Software Foundation.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# Copyright (C) 2019 Ericsson AB
+
+for bridge in `ovs-vsctl -- --real list-br`
+do
+    echo -e "\nBridge : ${bridge}\n"
+    for iface in `ovs-vsctl list-ifaces ${bridge}`
+    do
+        echo -e "iface : ${iface}"
+        ovs-vsctl get interface ${iface} statistics
+        echo -e "\n"
+    done
+    echo -e "iface : ${bridge}"
+    ovs-vsctl get interface ${bridge} statistics
+done
diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml b/utilities/bugtool/plugins/network-status/openvswitch.xml
index d39867c..f8c4ff0 100644
--- a/utilities/bugtool/plugins/network-status/openvswitch.xml
+++ b/utilities/bugtool/plugins/network-status/openvswitch.xml
@@ -34,6 +34,7 @@
     <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command>
     <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command>
     <command label="ovs-appctl-dpctl-show-s" filters="ovs" repeat="2">ovs-appctl dpctl/show -s</command>
+    <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command>
     <command label="ovs-ofctl-show" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "show"</command>
     <command label="ovs-ofctl-dump-flows" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-flows"</command>
     <command label="ovs-ofctl-dump-ports" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-ports"</command>
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 9a743c0..4a7bcf8 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
         the above.
       </column>
     </group>
+    <group title="Statistics: Custom stats">
+      <column name="statistics" key="tx_retries">
+        Total number of transmit retries on a vhost-user or vhost-user-client
+        interface.
+      </column>
+      <column name="statistics" key="tx_failure_drops">
+        Total number of packets dropped because the DPDK transmit API for
+        physical/vhost ports failed to transmit them, most likely because
+        the transmit queue is full.  Other, less likely, causes are possible
+        as well.
+      </column>
+      <column name="statistics" key="tx_mtu_exceeded_drops">
+        Number of packets dropped due to packet length exceeding the max
+        device MTU.
+      </column>
+      <column name="statistics" key="tx_qos_drops">
+        Total number of packets dropped due to transmission rate exceeding
+        the configured egress policer rate.
+      </column>
+      <column name="statistics" key="rx_qos_drops">
+        Total number of packets dropped due to reception rate exceeding
+        the configured ingress policer rate.
+      </column>
+    </group>
   </group>
 
   <group title="Ingress Policing">
-- 
2.7.4
On 28.08.2019 15:58, Sriram Vatala wrote:
> OVS may be unable to transmit packets for multiple reasons and
> today there is a single counter to track packets dropped due to
> any of those reasons.  The most common reason is that a VM is
> unable to read packets fast enough causing the vhostuser port
> transmit queue on the OVS side to become full.  This manifests
> as a problem with VNFs not receiving all packets.  Having a
> separate drop counter to track packets dropped because the
> transmit queue is full will clearly indicate that the problem
> is on the VM side and not in OVS.  Similarly maintaining separate
> counters for all possible drops helps in indicating sensible
> cause for packet drops.
>
> This patch adds custom stats counters to track packets dropped
> at port level and these counters are displayed along with
> other stats in "ovs-vsctl get interface <iface> statistics"
> command.  The detailed stats will be available for both dpdk and
> vhostuser ports.
>
> Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
> ---

Hi.  Thanks for a new version.

It would be good if you included the version history right here, under the
'---' cut line, so it is easier to track changes.  Like this:

    Version 7:
      * Implemented something.
      * Fixed something.
      * Patch renamed.  Previous name: ...
    Version 6:
      * ...

Another general comment is that, IMHO, it's better to rename this patch to
"netdev-dpdk: Detailed packet drop statistics.".  This patch adds statistics
for all the dpdk-based interfaces, so there is no need to list them in the
commit name; the prefix clearly describes the patch area.  (Please do not
drop the version number because of the patch renaming.  Just add a note
about the renaming in the version history.)

More comments inline.

Best regards, Ilya Maximets.
> lib/netdev-dpdk.c                                  | 99 +++++++++++++++++++---
> utilities/bugtool/automake.mk                      |  3 +-
> utilities/bugtool/ovs-bugtool-get-iface-stats      | 25 ++++++
> .../bugtool/plugins/network-status/openvswitch.xml |  1 +
> vswitchd/vswitch.xml                               | 24 ++++++
> 5 files changed, 139 insertions(+), 13 deletions(-)
> create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index bc20d68..a679c5b 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -413,8 +413,17 @@ struct netdev_dpdk {
>
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>          struct netdev_stats stats;
> -        /* Custom stat for retries when unable to transmit. */
> +        /* Custom device stats */
> +        /* No. of retries when unable to transmit. */
>          uint64_t tx_retries;
> +        /* Pkts dropped when unable to transmit; probably Tx queue is full */
> +        uint64_t tx_failure_drops;
> +        /* Pkt len greater than max dev MTU */
> +        uint64_t tx_mtu_exceeded_drops;
> +        /* Pkt drops in egress policer processing */
> +        uint64_t tx_qos_drops;
> +        /* Pkt drops in ingress policer processing */
> +        uint64_t rx_qos_drops;
>          /* Protects stats */
>          rte_spinlock_t stats_lock;
>          /* 4 pad bytes here. */

Padding still needs re-calculation.  Another option is to move all the
counters to a separate structure like 'struct netdev_dpdk_sw_stats' and
keep a pointer to it in 'struct netdev_dpdk'.

Another point is about naming.  We should probably add some prefix to
these stats to distinguish them from HW/driver stats and avoid possible
collisions.  This could be done here in the variable names or while
reporting them.  I guess a 'netdev_dpdk_' prefix might be suitable to
clearly highlight the origin of the counter.  (It's probably better to
add a prefix like this while reporting, because it is not very useful
inside the netdev-dpdk code.)  Suggestions are welcome.

> [...]
>
> +    <group title="Statistics: Custom stats">
> +      <column name="statistics" key="tx_retries">
> +        Total number of transmit retries on a vhost-user or vhost-user-client
> +        interface.
> +      </column>
> +      <column name="statistics" key="tx_failure_drops">
> +        Total number of packets dropped because the DPDK transmit API for
> +        physical/vhost ports failed to transmit them, most likely because
> +        the transmit queue is full.  Other, less likely, causes are possible
> +        as well.
> +      </column>
> +      <column name="statistics" key="tx_mtu_exceeded_drops">
> +        Number of packets dropped due to packet length exceeding the max
> +        device MTU.
> +      </column>
> +      <column name="statistics" key="tx_qos_drops">
> +        Total number of packets dropped due to transmission rate exceeding
> +        the configured egress policer rate.
> +      </column>
> +      <column name="statistics" key="rx_qos_drops">
> +        Total number of packets dropped due to reception rate exceeding
> +        the configured ingress policer rate.
> +      </column>
> +    </group>

I'm not sure it makes sense to declare all of these stats here.  Even the
usual "Statistics" group doesn't have all of the default stats described.
All of the above should be easy to understand just by looking at the name.
Also, these are netdev-dpdk specific and not supported by other netdev
types.  If needed, these stats could be mentioned in the DPDK guides in
Documentation/.

>   </group>
>
>   <group title="Ingress Policing">
Hi Ilya,

Thanks for reviewing the patch and for the suggestions.  I will address
your comments in the next patch.

Thanks & Regards,
Sriram.

-----Original Message-----
From: Ilya Maximets <i.maximets@samsung.com>
Sent: 30 August 2019 18:53
To: Sriram Vatala <sriram.v@altencalsoftlabs.com>; ovs-dev@openvswitch.org
Cc: blp@ovn.org; ian.stokes@intel.com; Kevin Traynor <ktraynor@redhat.com>
Subject: Re: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser ports
qos_drops; > int i, retries = 0; > int max_retries = VHOST_ENQ_RETRY_MIN; > int vid = netdev_dpdk_get_vid(dev); @@ -2356,9 +2374,12 @@ > __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, > rte_spinlock_lock(&dev->tx_q[qid].tx_lock); > > cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt); > + mtu_drops = total_pkts - cnt; > + qos_drops = cnt; > /* Check has QoS has been configured for the netdev */ > cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true); > - dropped = total_pkts - cnt; > + qos_drops -= cnt; > + dropped = qos_drops + mtu_drops; > > do { > int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ; @@ -2383,12 > +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, > } > } while (cnt && (retries++ < max_retries)); > > + tx_failure = cnt; > rte_spinlock_unlock(&dev->tx_q[qid].tx_lock); > > rte_spinlock_lock(&dev->stats_lock); > netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts, > cnt + dropped); > dev->tx_retries += MIN(retries, max_retries); > + dev->tx_failure_drops += tx_failure; > + dev->tx_mtu_exceeded_drops += mtu_drops; > + dev->tx_qos_drops += qos_drops; > rte_spinlock_unlock(&dev->stats_lock); > > out: > @@ -2413,12 +2438,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, > struct dp_packet_batch *batch) > struct rte_mbuf *pkts[PKT_ARRAY_SIZE]; > uint32_t cnt = batch_cnt; > uint32_t dropped = 0; > + uint32_t tx_failure = 0; > + uint32_t mtu_drops = 0; > + uint32_t qos_drops = 0; > > if (dev->type != DPDK_DEV_VHOST) { > /* Check if QoS has been configured for this netdev. 
*/ > cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets, > batch_cnt, false); > - dropped += batch_cnt - cnt; > + qos_drops = batch_cnt - cnt; > } > > uint32_t txcnt = 0; > @@ -2431,7 +2459,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct > dp_packet_batch *batch) > VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d", > size, dev->max_packet_len); > > - dropped++; > + mtu_drops++; > continue; > } > > @@ -2454,13 +2482,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, > struct dp_packet_batch *batch) > __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) > pkts, > txcnt); > } else { > - dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt); > + tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, > + txcnt); > } > } > > + dropped += qos_drops + mtu_drops + tx_failure; > if (OVS_UNLIKELY(dropped)) { > rte_spinlock_lock(&dev->stats_lock); > dev->stats.tx_dropped += dropped; > + dev->tx_failure_drops += tx_failure; > + dev->tx_mtu_exceeded_drops += mtu_drops; > + dev->tx_qos_drops += qos_drops; > rte_spinlock_unlock(&dev->stats_lock); > } > } > @@ -2502,18 +2534,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid, > dp_packet_delete_batch(batch, true); > } else { > int tx_cnt, dropped; > + int tx_failure, mtu_drops, qos_drops; > int batch_cnt = dp_packet_batch_size(batch); > struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets; > > tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt); > + mtu_drops = batch_cnt - tx_cnt; > + qos_drops = tx_cnt; > tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true); > - dropped = batch_cnt - tx_cnt; > + qos_drops -= tx_cnt; > > - dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt); > + tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, > + tx_cnt); > > + dropped = tx_failure + mtu_drops + qos_drops; > if (OVS_UNLIKELY(dropped)) { > rte_spinlock_lock(&dev->stats_lock); > dev->stats.tx_dropped += dropped; > + dev->tx_failure_drops += tx_failure; > + 
dev->tx_mtu_exceeded_drops += mtu_drops; > + dev->tx_qos_drops += qos_drops; > rte_spinlock_unlock(&dev->stats_lock); > } > } > @@ -2772,6 +2811,17 @@ netdev_dpdk_get_custom_stats(const struct netdev > *netdev, > uint32_t i; > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > int rte_xstats_ret; > + uint16_t index = 0; > + > +#define DPDK_CSTATS \ > + DPDK_CSTAT(tx_failure_drops) \ > + DPDK_CSTAT(tx_mtu_exceeded_drops) \ > + DPDK_CSTAT(tx_qos_drops) \ > + DPDK_CSTAT(rx_qos_drops) > + > +#define DPDK_CSTAT(NAME) +1 > + custom_stats->size = DPDK_CSTATS; #undef DPDK_CSTAT > > ovs_mutex_lock(&dev->mutex); > > @@ -2786,9 +2836,10 @@ netdev_dpdk_get_custom_stats(const struct netdev > *netdev, > if (rte_xstats_ret > 0 && > rte_xstats_ret <= dev->rte_xstats_ids_size) { > > - custom_stats->size = rte_xstats_ret; > + index = rte_xstats_ret; > + custom_stats->size += rte_xstats_ret; > custom_stats->counters = > - (struct netdev_custom_counter *) > xcalloc(rte_xstats_ret, > + (struct netdev_custom_counter *) > + xcalloc(custom_stats->size, > sizeof(struct netdev_custom_counter)); > > for (i = 0; i < rte_xstats_ret; i++) { @@ -2802,7 +2853,6 > @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev, > VLOG_WARN("Cannot get XSTATS values for port: > "DPDK_PORT_ID_FMT, > dev->port_id); > custom_stats->counters = NULL; > - custom_stats->size = 0; > /* Let's clear statistics cache, so it will be > * reconfigured */ > netdev_dpdk_clear_xstats(dev); @@ -2811,6 +2861,27 @@ > netdev_dpdk_get_custom_stats(const struct netdev *netdev, > free(values); > } > > + if (custom_stats->counters == NULL) { > + custom_stats->counters = > + (struct netdev_custom_counter *) xcalloc(custom_stats->size, > + sizeof(struct netdev_custom_counter)); > + } > + > + rte_spinlock_lock(&dev->stats_lock); > + i = index; > +#define DPDK_CSTAT(NAME) \ > + ovs_strlcpy(custom_stats->counters[i++].name, #NAME, \ > + NETDEV_CUSTOM_STATS_NAME_SIZE); > + DPDK_CSTATS; > +#undef DPDK_CSTAT > + > + i = index; > 
+#define DPDK_CSTAT(NAME) \ > + custom_stats->counters[i++].value = dev->NAME; > + DPDK_CSTATS; > +#undef DPDK_CSTAT > + rte_spinlock_unlock(&dev->stats_lock); > + > ovs_mutex_unlock(&dev->mutex); > > return 0; > @@ -2823,8 +2894,12 @@ netdev_dpdk_vhost_get_custom_stats(const struct > netdev *netdev, > struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); > int i; > > -#define VHOST_CSTATS \ > - VHOST_CSTAT(tx_retries) > +#define VHOST_CSTATS \ > + VHOST_CSTAT(tx_retries) \ > + VHOST_CSTAT(tx_failure_drops) \ > + VHOST_CSTAT(tx_mtu_exceeded_drops) \ > + VHOST_CSTAT(tx_qos_drops) \ > + VHOST_CSTAT(rx_qos_drops) > > #define VHOST_CSTAT(NAME) + 1 > custom_stats->size = VHOST_CSTATS; diff --git > a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk index > 18fa347..9657468 100644 > --- a/utilities/bugtool/automake.mk > +++ b/utilities/bugtool/automake.mk > @@ -22,7 +22,8 @@ bugtool_scripts = \ > utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \ > utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \ > utilities/bugtool/ovs-bugtool-qos-configs \ > - utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa > + utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \ > + utilities/bugtool/ovs-bugtool-get-iface-stats > > scripts_SCRIPTS += $(bugtool_scripts) > > diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats > b/utilities/bugtool/ovs-bugtool-get-iface-stats > new file mode 100755 > index 0000000..0fe175e > --- /dev/null > +++ b/utilities/bugtool/ovs-bugtool-get-iface-stats > @@ -0,0 +1,25 @@ > +#! /bin/bash > + > +# This library is free software; you can redistribute it and/or # > +modify it under the terms of version 2.1 of the GNU Lesser General # > +Public License as published by the Free Software Foundation. > +# > +# This library is distributed in the hope that it will be useful, # > +but WITHOUT ANY WARRANTY; without even the implied warranty of # > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU # > +Lesser General Public License for more details. > +# > +# Copyright (C) 2019 Ericsson AB > + > +for bridge in `ovs-vsctl -- --real list-br` do > + echo -e "\nBridge : ${bridge}\n" > + for iface in `ovs-vsctl list-ifaces ${bridge}` > + do > + echo -e "iface : ${iface}" > + ovs-vsctl get interface ${iface} statistics > + echo -e "\n" > + done > + echo -e "iface : ${bridge}" > + ovs-vsctl get interface ${bridge} statistics done > diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml > b/utilities/bugtool/plugins/network-status/openvswitch.xml > index d39867c..f8c4ff0 100644 > --- a/utilities/bugtool/plugins/network-status/openvswitch.xml > +++ b/utilities/bugtool/plugins/network-status/openvswitch.xml > @@ -34,6 +34,7 @@ > <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" > repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command> > <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" > repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command> > <command label="ovs-appctl-dpctl-show-s" filters="ovs" > repeat="2">ovs-appctl dpctl/show -s</command> > + <command label="ovs-vsctl-get-interface-statistics" > + filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-s > + tats</command> > <command label="ovs-ofctl-show" > filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges > "show"</command> > <command label="ovs-ofctl-dump-flows" filters="ovs" > repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges > "dump-flows"</command> > <command label="ovs-ofctl-dump-ports" filters="ovs" > repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-o > ver-bridges "dump-ports"</command> diff --git a/vswitchd/vswitch.xml > b/vswitchd/vswitch.xml index 9a743c0..4a7bcf8 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 > type=patch 
options:peer=p1 \ > the above. > </column> > </group> > + <group title="Statistics: Custom stats"> > + <column name="statistics" key="tx_retries"> > + Total number of transmit retries on a vhost-user or > vhost-user-client > + interface. > + </column> > + <column name="statistics" key="tx_failure_drops"> > + Total number of packets dropped because DPDK transmit API for > + physical/vhost ports fails to transmit the packets. This happens > + most likely because the transmit queue is full or has been filled > + up. There are other reasons as well which are unlikely to happen. > + </column> > + <column name="statistics" key="tx_mtu_exceeded_drops"> > + Number of packets dropped due to packet length exceeding the max > + device MTU. > + </column> > + <column name="statistics" key="tx_qos_drops"> > + Total number of packets dropped due to transmission rate > exceeding > + the configured egress policer rate. > + </column> > + <column name="statistics" key="rx_qos_drops"> > + Total number of packets dropped due to reception rate exceeding > + the configured ingress policer rate. > + </column> > + </group> I'm not sure if it makes sense to declare all the stats here. Even the usual "Statistics" group doesn't have all of the default stats described. All of the above should be easy to understand just by looking at the name. Also, these are netdev-dpdk specific and not supported by other netdev types. If needed, these stats could be mentioned in the DPDK guides in Documentation/. > </group> > > <group title="Ingress Policing"> >
Hi Ilya, 1) I was working on addressing the comments provided by you. Had a small query on one of your comments. 2) I am trying to understand the problem of padding bytes in struct netdev_dpdk which you are referring to in your comment. 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" ensures that the memory to be allocated for all 'MEMBERS' is rounded off to nearest multiple of CACHE_LINE_SIZE which is 64 in this case. This macro adds pad bytes to roundoff to multiple of 64. 4) At runtime, I checked the size of stats section of netdev_dpdk without new counters that I have introduced in my patch. It is 384 bytes, out of which struct netdev_stats alone occupies 336 bytes and 8 bytes for tx_retries counter. (I could see there is no padding between netdev_stats and tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 1)/64) * 64]. 5) With my new counters, the size remains same after padding. (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes [((376 +64 -1)/64) *64] at runtime. I want to check with you, if I have correctly understood the problem that you are referring in your comment. If you can explain a bit more on this, it would be helpful for me to understand the problem and think of possible solutions. Following is the snippet of memory layout of netdev_dpdk at runtime : union { OVS_CACHE_LINE_MARKER cacheline1; struct {...}; uint8_t pad50[64]; }; union { struct {...}; uint8_t pad51[192]; }; union { struct {...}; uint8_t pad52[384]; }; union { struct {...}; uint8_t pad53[128]; }; union { struct {...}; uint8_t pad54[64]; }; } Thanks & Regards, Sriram. 
On 04.09.2019 16:31, Sriram Vatala wrote: > Hi Ilya, > 1) I was working on addressing the comments provided by you. Had a small query > on one of your comments. > 2) I am trying to understand the problem of padding bytes in struct > netdev_dpdk which you are referring to in your comment. > 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" > ensures that the memory to be allocated for all 'MEMBERS' is rounded off to > nearest multiple of CACHE_LINE_SIZE which is 64 in this case. This macro adds > pad bytes to roundoff to multiple of 64. > 4) At runtime, I checked the size of stats section of netdev_dpdk without new > counters that I have introduced in my patch. It is 384 bytes, out of which > struct netdev_stats alone occupies 336 bytes and 8 bytes for tx_retries > counter. (I could see there is no padding between netdev_stats and > tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 1)/64) * 64]. > 5) With my new counters, the size remains same after padding. > (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes [((376 +64 -1)/64) > *64] at runtime. > > I want to check with you, if I have correctly understood the problem that you > are referring in your comment. If you can explain a bit more on this, it would > be helpful for me to understand the problem and think of possible solutions. > > Following is the snippet of memory layout of netdev_dpdk at runtime : > union { > OVS_CACHE_LINE_MARKER cacheline1; > struct {...}; > uint8_t pad50[64]; > }; > union { > struct {...}; > uint8_t pad51[192]; > }; > union { > struct {...}; > uint8_t pad52[384]; > }; > union { > struct {...}; > uint8_t pad53[128]; > }; > union { > struct {...}; > uint8_t pad54[64]; > }; > } Hi. The code looks like this: PADDED_MEMBERS(CACHE_LINE_SIZE, struct netdev_stats stats; /* Custom stat for retries when unable to transmit. */ uint64_t tx_retries; /* Protects stats */ rte_spinlock_t stats_lock; /* 4 pad bytes here. */ <-- This comment I'm talking about. 
); The only thing you need to change is to update the comment while adding new structure fields. Your calculations are missing the size of 'stats_lock' which is 4 bytes. So, on current master total size is: 336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352 And it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes. The comment on current master should be "/* 32 pad bytes here. */". BTW, Kevin, how did you calculate 4 here? My pahole output is following: union { struct { struct netdev_stats stats; /* 320 336 */ uint64_t tx_retries; /* 656 8 */ rte_spinlock_t stats_lock; /* 664 4 */ }; /* 352 */ uint8_t pad52[0]; /* 0 */ }; /* 320 384 */ Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new layout takes: 336 bytes for stats + 8 bytes for tx_retries \ + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes. Now the structure takes exactly 384 bytes and you need to remove the comment or change it to "/* 0 pad bytes here. */". Sorry, I didn't check the actual layout until now so I was confused a bit by the comment on current master. Anyway, you need to update that comment. However, It might be still useful to move these stats to a separate structure to avoid big padding in case we'll want to add one more. And I'm still thinking that we need to drop paddings at all for most of the structure fields, but this should be a separate change. Best regards, Ilya Maximets.
On 04/09/2019 15:03, Ilya Maximets wrote: > On 04.09.2019 16:31, Sriram Vatala wrote: >> Hi Ilya, >> 1) I was working on addressing the comments provided by you. Had a small query >> on one of your comments. >> 2) I am trying to understand the problem of padding bytes in struct >> netdev_dpdk which you are referring to in your comment. >> 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" >> ensures that the memory to be allocated for all 'MEMBERS' is rounded off to >> nearest multiple of CACHE_LINE_SIZE which is 64 in this case. This macro adds >> pad bytes to roundoff to multiple of 64. >> 4) At runtime, I checked the size of stats section of netdev_dpdk without new >> counters that I have introduced in my patch. It is 384 bytes, out of which >> struct netdev_stats alone occupies 336 bytes and 8 bytes for tx_retries >> counter. (I could see there is no padding between netdev_stats and >> tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 1)/64) * 64]. >> 5) With my new counters, the size remains same after padding. >> (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes [((376 +64 -1)/64) >> *64] at runtime. >> >> I want to check with you, if I have correctly understood the problem that you >> are referring in your comment. If you can explain a bit more on this, it would >> be helpful for me to understand the problem and think of possible solutions. >> >> Following is the snippet of memory layout of netdev_dpdk at runtime : >> union { >> OVS_CACHE_LINE_MARKER cacheline1; >> struct {...}; >> uint8_t pad50[64]; >> }; >> union { >> struct {...}; >> uint8_t pad51[192]; >> }; >> union { >> struct {...}; >> uint8_t pad52[384]; >> }; >> union { >> struct {...}; >> uint8_t pad53[128]; >> }; >> union { >> struct {...}; >> uint8_t pad54[64]; >> }; >> } > > Hi. > > The code looks like this: > > PADDED_MEMBERS(CACHE_LINE_SIZE, > struct netdev_stats stats; > /* Custom stat for retries when unable to transmit. 
*/ > uint64_t tx_retries; > /* Protects stats */ > rte_spinlock_t stats_lock; > /* 4 pad bytes here. */ <-- This comment I'm talking about. > ); > > The only thing you need to change is to update the comment while adding > new structure fields. > > Your calculations are missing the size of 'stats_lock' which is 4 bytes. > So, on current master total size is: > 336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352 > And it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes. > The comment on current master should be "/* 32 pad bytes here. */". > > BTW, Kevin, how did you calculate 4 here? My pahole output is following: > > union { > struct { > struct netdev_stats stats; /* 320 336 */ > uint64_t tx_retries; /* 656 8 */ > rte_spinlock_t stats_lock; /* 664 4 */ > }; /* 352 */ > uint8_t pad52[0]; /* 0 */ > }; /* 320 384 */ > Hmm, looks like I must have misinterpreted. I will send a patch for branch-2.12 to update the comment. Not sure it's worth to send a patch for master as this patch is changing it anyway, but I can if there is a preference. > Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new layout > takes: > 336 bytes for stats + 8 bytes for tx_retries \ > + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes. > > Now the structure takes exactly 384 bytes and you need to remove the comment > or change it to "/* 0 pad bytes here. */". > > Sorry, I didn't check the actual layout until now so I was confused a bit by the > comment on current master. Anyway, you need to update that comment. > However, It might be still useful to move these stats to a separate structure to > avoid big padding in case we'll want to add one more. And I'm still thinking that > we need to drop paddings at all for most of the structure fields, but this should > be a separate change. > > Best regards, Ilya Maximets. >
Hi Ilya, Thanks a lot for the explanation. As per your suggestion, I will move all the counters (including 'tx_retries')to some structure and place a pointer to it in netdev_dpdk structure so that the padding size will not vary with the introduction of new counters in future. @Kevin Traynor : I will change the comment line for the number of padding bytes in my next patch instead of you sending a new patch just for changing the comment line. Thanks & Regards, Sriram. -----Original Message----- From: Ilya Maximets <i.maximets@samsung.com> Sent: 04 September 2019 19:34 To: Sriram Vatala <sriram.v@altencalsoftlabs.com>; Kevin Traynor <ktraynor@redhat.com> Cc: ovs-dev@openvswitch.org; Stokes, Ian <ian.stokes@intel.com> Subject: Re: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser ports On 04.09.2019 16:31, Sriram Vatala wrote: > Hi Ilya, > 1) I was working on addressing the comments provided by you. Had a > small query on one of your comments. > 2) I am trying to understand the problem of padding bytes in struct > netdev_dpdk which you are referring to in your comment. > 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" > ensures that the memory to be allocated for all 'MEMBERS' is rounded > off to nearest multiple of CACHE_LINE_SIZE which is 64 in this case. > This macro adds pad bytes to roundoff to multiple of 64. > 4) At runtime, I checked the size of stats section of netdev_dpdk > without new counters that I have introduced in my patch. It is 384 > bytes, out of which struct netdev_stats alone occupies 336 bytes and 8 > bytes for tx_retries counter. (I could see there is no padding between > netdev_stats and tx_retries). Total of 344 is rounded to 384 [((344 + 64 - > 1)/64) * 64]. > 5) With my new counters, the size remains same after padding. > (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes [((376 +64 > -1)/64) *64] at runtime. 
> > I want to check with you, if I have correctly understood the problem > that you are referring in your comment. If you can explain a bit more > on this, it would be helpful for me to understand the problem and think of > possible solutions. > > Following is the snippet of memory layout of netdev_dpdk at runtime : > union { > OVS_CACHE_LINE_MARKER cacheline1; > struct {...}; > uint8_t pad50[64]; > }; > union { > struct {...}; > uint8_t pad51[192]; > }; > union { > struct {...}; > uint8_t pad52[384]; > }; > union { > struct {...}; > uint8_t pad53[128]; > }; > union { > struct {...}; > uint8_t pad54[64]; > }; > } Hi. The code looks like this: PADDED_MEMBERS(CACHE_LINE_SIZE, struct netdev_stats stats; /* Custom stat for retries when unable to transmit. */ uint64_t tx_retries; /* Protects stats */ rte_spinlock_t stats_lock; /* 4 pad bytes here. */ <-- This comment I'm talking about. ); The only thing you need to change is to update the comment while adding new structure fields. Your calculations are missing the size of 'stats_lock' which is 4 bytes. So, on current master total size is: 336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352 And it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes. The comment on current master should be "/* 32 pad bytes here. */". BTW, Kevin, how did you calculate 4 here? My pahole output is following: union { struct { struct netdev_stats stats; /* 320 336 */ uint64_t tx_retries; /* 656 8 */ rte_spinlock_t stats_lock; /* 664 4 */ }; /* 352 */ uint8_t pad52[0]; /* 0 */ }; /* 320 384 */ Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new layout takes: 336 bytes for stats + 8 bytes for tx_retries \ + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes. Now the structure takes exactly 384 bytes and you need to remove the comment or change it to "/* 0 pad bytes here. */". Sorry, I didn't check the actual layout until now so I was confused a bit by the comment on current master. 
Anyway, you need to update that comment. However, It might be still useful to move these stats to a separate structure to avoid big padding in case we'll want to add one more. And I'm still thinking that we need to drop paddings at all for most of the structure fields, but this should be a separate change. Best regards, Ilya Maximets.
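The "move the counters to a separate structure" idea proposed above can be sketched as follows. The struct name `netdev_dpdk_sw_stats` and the per-field comments are illustrative assumptions, not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the netdev-dpdk custom counters; the
 * name and comments are illustrative only, not from the patch. */
struct netdev_dpdk_sw_stats {
    uint64_t tx_retries;            /* Vhost enqueue retries.          */
    uint64_t tx_failure_drops;      /* Tx burst / vhost enqueue fails. */
    uint64_t tx_mtu_exceeded_drops; /* Packet longer than device MTU.  */
    uint64_t tx_qos_drops;          /* Egress policer drops.           */
    uint64_t rx_qos_drops;          /* Ingress policer drops.          */
};
```

With only a pointer to such a struct kept inside the PADDED_MEMBERS section, the padded cache-line block carries a fixed 8 bytes for the pointer, so adding a counter later grows this struct instead of shifting the pad-byte comment in netdev_dpdk.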
On 05.09.2019 9:58, Sriram Vatala wrote: > Hi Ilya, > Thanks a lot for the explanation. As per your suggestion, I will move all the > counters (including 'tx_retries')to some structure and place a pointer to it > in netdev_dpdk structure so that the padding size will not vary with the > introduction of new counters in future. > > @Kevin Traynor : I will change the comment line for the number of padding > bytes in my next patch instead of you sending a new patch just for changing > the comment line. > > Thanks & Regards, > Sriram. > > -----Original Message----- > From: Ilya Maximets <i.maximets@samsung.com> > Sent: 04 September 2019 19:34 > To: Sriram Vatala <sriram.v@altencalsoftlabs.com>; Kevin Traynor > <ktraynor@redhat.com> > Cc: ovs-dev@openvswitch.org; Stokes, Ian <ian.stokes@intel.com> > Subject: Re: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser > ports > > On 04.09.2019 16:31, Sriram Vatala wrote: >> Hi Ilya, >> 1) I was working on addressing the comments provided by you. Had a >> small query on one of your comments. >> 2) I am trying to understand the problem of padding bytes in struct >> netdev_dpdk which you are referring to in your comment. >> 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" >> ensures that the memory to be allocated for all 'MEMBERS' is rounded >> off to nearest multiple of CACHE_LINE_SIZE which is 64 in this case. >> This macro adds pad bytes to roundoff to multiple of 64. >> 4) At runtime, I checked the size of stats section of netdev_dpdk >> without new counters that I have introduced in my patch. It is 384 >> bytes, out of which struct netdev_stats alone occupies 336 bytes and 8 >> bytes for tx_retries counter. (I could see there is no padding between >> netdev_stats and tx_retries). Total of 344 is rounded to 384 [((344 + 64 - >> 1)/64) * 64]. >> 5) With my new counters, the size remains same after padding. 
>> (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes [((376 +64 >> -1)/64) *64] at runtime. >> >> I want to check with you, if I have correctly understood the problem >> that you are referring in your comment. If you can explain a bit more >> on this, it would be helpful for me to understand the problem and think of >> possible solutions. >> >> Following is the snippet of memory layout of netdev_dpdk at runtime : >> union { >> OVS_CACHE_LINE_MARKER cacheline1; >> struct {...}; >> uint8_t pad50[64]; >> }; >> union { >> struct {...}; >> uint8_t pad51[192]; >> }; >> union { >> struct {...}; >> uint8_t pad52[384]; >> }; >> union { >> struct {...}; >> uint8_t pad53[128]; >> }; >> union { >> struct {...}; >> uint8_t pad54[64]; >> }; >> } > > Hi. > > The code looks like this: > > PADDED_MEMBERS(CACHE_LINE_SIZE, > struct netdev_stats stats; > /* Custom stat for retries when unable to transmit. */ > uint64_t tx_retries; > /* Protects stats */ > rte_spinlock_t stats_lock; > /* 4 pad bytes here. */ <-- This comment I'm talking about. > ); > > The only thing you need to change is to update the comment while adding new > structure fields. > > Your calculations are missing the size of 'stats_lock' which is 4 bytes. > So, on current master total size is: > 336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352 And > it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes. Sorry, I miscalculated this too.) 336 + 8 + 4 = 348 pahole reports 352, because there is an additional 4 bytes padding inside the unnamed structure. > The comment on current master should be "/* 32 pad bytes here. */". So, the comment should be "/* 36 pad bytes here. */". > > BTW, Kevin, how did you calculate 4 here? 
My pahole output is following: > > union { > struct { > struct netdev_stats stats; /* 320 336 */ > uint64_t tx_retries; /* 656 8 */ > rte_spinlock_t stats_lock; /* 664 4 */ > }; /* 352 */ > uint8_t pad52[0]; /* 0 */ > }; /* 320 384 */ > > Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new > layout > takes: > 336 bytes for stats + 8 bytes for tx_retries \ > + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes. > > Now the structure takes exactly 384 bytes and you need to remove the comment > or change it to "/* 0 pad bytes here. */". And 4 bytes will remain in padding after adding new stats. > > Sorry, I didn't check the actual layout until now so I was confused a bit by > the comment on current master. Anyway, you need to update that comment. > However, It might be still useful to move these stats to a separate structure > to avoid big padding in case we'll want to add one more. And I'm still > thinking that we need to drop paddings at all for most of the structure > fields, but this should be a separate change. > > Best regards, Ilya Maximets. >
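The corrected arithmetic above can be checked in isolation. A minimal sketch of the rounding that PADDED_MEMBERS effectively applies; the member sizes (336 for struct netdev_stats, 8 for tx_retries, 4 for the spinlock) are hard-coded from the figures quoted in this thread, not computed:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64

/* The rounding PADDED_MEMBERS applies: pad the member block up to
 * the next multiple of 'unit', i.e. ((size + unit - 1) / unit) * unit. */
static inline size_t
round_up(size_t size, size_t unit)
{
    return (size + unit - 1) / unit * unit;
}

/* Payload sizes from the thread (bytes). */
enum {
    MASTER_PAYLOAD = 336 + 8 + 4,         /* stats + tx_retries + lock */
    PATCH_PAYLOAD  = 336 + 8 + 4 * 8 + 4, /* plus four new counters    */
};
```

Rounding 348 up to 384 leaves the 36 pad bytes noted above for current master; rounding 380 up to 384 leaves the 4 pad bytes that remain after the patch adds its four counters.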
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index bc20d68..a679c5b 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -413,8 +413,17 @@ struct netdev_dpdk { PADDED_MEMBERS(CACHE_LINE_SIZE, struct netdev_stats stats; - /* Custom stat for retries when unable to transmit. */ + /* Custom device stats */ + /* No. of retries when unable to transmit. */ uint64_t tx_retries; + /* Pkts dropped when unable to transmit; Probably Tx queue is full */ + uint64_t tx_failure_drops; + /* Pkts len greater than max dev MTU */ + uint64_t tx_mtu_exceeded_drops; + /* Pkt drops in egress policer processing */ + uint64_t tx_qos_drops; + /* Pkt drops in ingress policer processing */ + uint64_t rx_qos_drops; /* Protects stats */ rte_spinlock_t stats_lock; /* 4 pad bytes here. */ @@ -2171,6 +2180,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev); uint16_t nb_rx = 0; uint16_t dropped = 0; + uint16_t qos_drops = 0; int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ; int vid = netdev_dpdk_get_vid(dev); @@ -2202,11 +2212,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, (struct rte_mbuf **) batch->packets, nb_rx, true); dropped -= nb_rx; + qos_drops = dropped; } rte_spinlock_lock(&dev->stats_lock); netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets, nb_rx, dropped); + dev->rx_qos_drops += qos_drops; rte_spinlock_unlock(&dev->stats_lock); batch->count = nb_rx; @@ -2232,6 +2244,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch, struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev); int nb_rx; int dropped = 0; + int qos_drops = 0; if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) { return EAGAIN; @@ -2250,12 +2263,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch, (struct rte_mbuf **) batch->packets, nb_rx, true); dropped -= nb_rx; + qos_drops = dropped; } /* Update stats to reflect dropped packets */ if
(OVS_UNLIKELY(dropped)) { rte_spinlock_lock(&dev->stats_lock); dev->stats.rx_dropped += dropped; + dev->rx_qos_drops += qos_drops; rte_spinlock_unlock(&dev->stats_lock); } @@ -2339,6 +2354,9 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts; unsigned int total_pkts = cnt; unsigned int dropped = 0; + unsigned int tx_failure; + unsigned int mtu_drops; + unsigned int qos_drops; int i, retries = 0; int max_retries = VHOST_ENQ_RETRY_MIN; int vid = netdev_dpdk_get_vid(dev); @@ -2356,9 +2374,12 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, rte_spinlock_lock(&dev->tx_q[qid].tx_lock); cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt); + mtu_drops = total_pkts - cnt; + qos_drops = cnt; /* Check has QoS has been configured for the netdev */ cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true); - dropped = total_pkts - cnt; + qos_drops -= cnt; + dropped = qos_drops + mtu_drops; do { int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ; @@ -2383,12 +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid, } } while (cnt && (retries++ < max_retries)); + tx_failure = cnt; rte_spinlock_unlock(&dev->tx_q[qid].tx_lock); rte_spinlock_lock(&dev->stats_lock); netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts, cnt + dropped); dev->tx_retries += MIN(retries, max_retries); + dev->tx_failure_drops += tx_failure; + dev->tx_mtu_exceeded_drops += mtu_drops; + dev->tx_qos_drops += qos_drops; rte_spinlock_unlock(&dev->stats_lock); out: @@ -2413,12 +2438,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch) struct rte_mbuf *pkts[PKT_ARRAY_SIZE]; uint32_t cnt = batch_cnt; uint32_t dropped = 0; + uint32_t tx_failure = 0; + uint32_t mtu_drops = 0; + uint32_t qos_drops = 0; if (dev->type != DPDK_DEV_VHOST) { /* Check if QoS has been configured for this netdev. 
*/ cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets, batch_cnt, false); - dropped += batch_cnt - cnt; + qos_drops = batch_cnt - cnt; } uint32_t txcnt = 0; @@ -2431,7 +2459,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch) VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d", size, dev->max_packet_len); - dropped++; + mtu_drops++; continue; } @@ -2454,13 +2482,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch) __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts, txcnt); } else { - dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt); + tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt); } } + dropped += qos_drops + mtu_drops + tx_failure; if (OVS_UNLIKELY(dropped)) { rte_spinlock_lock(&dev->stats_lock); dev->stats.tx_dropped += dropped; + dev->tx_failure_drops += tx_failure; + dev->tx_mtu_exceeded_drops += mtu_drops; + dev->tx_qos_drops += qos_drops; rte_spinlock_unlock(&dev->stats_lock); } } @@ -2502,18 +2534,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid, dp_packet_delete_batch(batch, true); } else { int tx_cnt, dropped; + int tx_failure, mtu_drops, qos_drops; int batch_cnt = dp_packet_batch_size(batch); struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets; tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt); + mtu_drops = batch_cnt - tx_cnt; + qos_drops = tx_cnt; tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true); - dropped = batch_cnt - tx_cnt; + qos_drops -= tx_cnt; - dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt); + tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt); + dropped = tx_failure + mtu_drops + qos_drops; if (OVS_UNLIKELY(dropped)) { rte_spinlock_lock(&dev->stats_lock); dev->stats.tx_dropped += dropped; + dev->tx_failure_drops += tx_failure; + dev->tx_mtu_exceeded_drops += mtu_drops; + dev->tx_qos_drops += qos_drops; rte_spinlock_unlock(&dev->stats_lock); } } @@ -2772,6 +2811,17 @@ 
netdev_dpdk_get_custom_stats(const struct netdev *netdev, uint32_t i; struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); int rte_xstats_ret; + uint16_t index = 0; + +#define DPDK_CSTATS \ + DPDK_CSTAT(tx_failure_drops) \ + DPDK_CSTAT(tx_mtu_exceeded_drops) \ + DPDK_CSTAT(tx_qos_drops) \ + DPDK_CSTAT(rx_qos_drops) + +#define DPDK_CSTAT(NAME) +1 + custom_stats->size = DPDK_CSTATS; +#undef DPDK_CSTAT ovs_mutex_lock(&dev->mutex); @@ -2786,9 +2836,10 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev, if (rte_xstats_ret > 0 && rte_xstats_ret <= dev->rte_xstats_ids_size) { - custom_stats->size = rte_xstats_ret; + index = rte_xstats_ret; + custom_stats->size += rte_xstats_ret; custom_stats->counters = - (struct netdev_custom_counter *) xcalloc(rte_xstats_ret, + (struct netdev_custom_counter *) xcalloc(custom_stats->size, sizeof(struct netdev_custom_counter)); for (i = 0; i < rte_xstats_ret; i++) { @@ -2802,7 +2853,6 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev, VLOG_WARN("Cannot get XSTATS values for port: "DPDK_PORT_ID_FMT, dev->port_id); custom_stats->counters = NULL; - custom_stats->size = 0; /* Let's clear statistics cache, so it will be * reconfigured */ netdev_dpdk_clear_xstats(dev); @@ -2811,6 +2861,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev, free(values); } + if (custom_stats->counters == NULL) { + custom_stats->counters = + (struct netdev_custom_counter *) xcalloc(custom_stats->size, + sizeof(struct netdev_custom_counter)); + } + + rte_spinlock_lock(&dev->stats_lock); + i = index; +#define DPDK_CSTAT(NAME) \ + ovs_strlcpy(custom_stats->counters[i++].name, #NAME, \ + NETDEV_CUSTOM_STATS_NAME_SIZE); + DPDK_CSTATS; +#undef DPDK_CSTAT + + i = index; +#define DPDK_CSTAT(NAME) \ + custom_stats->counters[i++].value = dev->NAME; + DPDK_CSTATS; +#undef DPDK_CSTAT + rte_spinlock_unlock(&dev->stats_lock); + ovs_mutex_unlock(&dev->mutex); return 0; @@ -2823,8 +2894,12 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev 
*netdev, struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); int i; -#define VHOST_CSTATS \ - VHOST_CSTAT(tx_retries) +#define VHOST_CSTATS \ + VHOST_CSTAT(tx_retries) \ + VHOST_CSTAT(tx_failure_drops) \ + VHOST_CSTAT(tx_mtu_exceeded_drops) \ + VHOST_CSTAT(tx_qos_drops) \ + VHOST_CSTAT(rx_qos_drops) #define VHOST_CSTAT(NAME) + 1 custom_stats->size = VHOST_CSTATS; diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk index 18fa347..9657468 100644 --- a/utilities/bugtool/automake.mk +++ b/utilities/bugtool/automake.mk @@ -22,7 +22,8 @@ bugtool_scripts = \ utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \ utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \ utilities/bugtool/ovs-bugtool-qos-configs \ - utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa + utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \ + utilities/bugtool/ovs-bugtool-get-iface-stats scripts_SCRIPTS += $(bugtool_scripts) diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats b/utilities/bugtool/ovs-bugtool-get-iface-stats new file mode 100755 index 0000000..0fe175e --- /dev/null +++ b/utilities/bugtool/ovs-bugtool-get-iface-stats @@ -0,0 +1,25 @@ +#! /bin/bash + +# This library is free software; you can redistribute it and/or +# modify it under the terms of version 2.1 of the GNU Lesser General +# Public License as published by the Free Software Foundation. +# +# This library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Lesser General Public License for more details. 
+# +# Copyright (C) 2019 Ericsson AB + +for bridge in `ovs-vsctl -- --real list-br` +do + echo -e "\nBridge : ${bridge}\n" + for iface in `ovs-vsctl list-ifaces ${bridge}` + do + echo -e "iface : ${iface}" + ovs-vsctl get interface ${iface} statistics + echo -e "\n" + done + echo -e "iface : ${bridge}" + ovs-vsctl get interface ${bridge} statistics +done diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml b/utilities/bugtool/plugins/network-status/openvswitch.xml index d39867c..f8c4ff0 100644 --- a/utilities/bugtool/plugins/network-status/openvswitch.xml +++ b/utilities/bugtool/plugins/network-status/openvswitch.xml @@ -34,6 +34,7 @@ <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command> <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command> <command label="ovs-appctl-dpctl-show-s" filters="ovs" repeat="2">ovs-appctl dpctl/show -s</command> + <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command> <command label="ovs-ofctl-show" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "show"</command> <command label="ovs-ofctl-dump-flows" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-flows"</command> <command label="ovs-ofctl-dump-ports" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-ports"</command> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 9a743c0..4a7bcf8 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \ the above. 
</column> </group> + <group title="Statistics: Custom stats"> + <column name="statistics" key="tx_retries"> + Total number of transmit retries on a vhost-user or vhost-user-client + interface. + </column> + <column name="statistics" key="tx_failure_drops"> + Total number of packets dropped because the DPDK transmit API for + physical/vhost ports fails to transmit the packets. This most likely + happens because the transmit queue is full; other causes are possible + but unlikely. + </column> + <column name="statistics" key="tx_mtu_exceeded_drops"> + Number of packets dropped due to packet length exceeding the max + device MTU. + </column> + <column name="statistics" key="tx_qos_drops"> + Total number of packets dropped due to transmission rate exceeding + the configured egress policer rate. + </column> + <column name="statistics" key="rx_qos_drops"> + Total number of packets dropped due to reception rate exceeding + the configured ingress policer rate. + </column> + </group> </group> <group title="Ingress Policing">
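The DPDK_CSTATS / VHOST_CSTATS lists in the patch use an X-macro: one list of stat names is expanded several times with different per-entry definitions — once as `+ 1` to count entries, once to emit the name strings, once to read the fields. A minimal standalone illustration of that pattern (the counter names mirror the patch; the toy `struct dev` and harness around them are simplified assumptions):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy device carrying two of the patch's counters. */
struct dev {
    uint64_t tx_retries;
    uint64_t tx_failure_drops;
};

/* X-macro list: the single place the stat names are declared. */
#define CSTATS              \
    CSTAT(tx_retries)       \
    CSTAT(tx_failure_drops)

/* Expansion 1: each entry becomes "+ 1", so CSTATS totals them. */
#define CSTAT(NAME) + 1
enum { N_CSTATS = CSTATS };
#undef CSTAT

/* Expansions 2 and 3: emit the names, then read the device fields. */
static void
fill_cstats(const struct dev *dev, const char **names, uint64_t *values)
{
    int i = 0;
#define CSTAT(NAME) names[i++] = #NAME;
    CSTATS
#undef CSTAT

    i = 0;
#define CSTAT(NAME) values[i++] = dev->NAME;
    CSTATS
#undef CSTAT
}
```

Adding a counter then means adding one `CSTAT(...)` line; the count, the exported names, and the value copy all stay in sync automatically, which is why the patch takes this shape.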
OVS may be unable to transmit packets for multiple reasons, and today there is a single counter to track packets dropped due to any of those reasons. The most common reason is that a VM is unable to read packets fast enough, causing the vhostuser port transmit queue on the OVS side to become full. This manifests as a problem with VNFs not receiving all packets. Having a separate drop counter to track packets dropped because the transmit queue is full will clearly indicate that the problem is on the VM side and not in OVS. Similarly, maintaining separate counters for all possible drop reasons helps identify the likely cause of packet drops. This patch adds custom stats counters to track packets dropped at the port level; these counters are displayed along with other stats in the "ovs-vsctl get interface <iface> statistics" command. The detailed stats will be available for both dpdk and vhostuser ports. Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com> --- lib/netdev-dpdk.c | 99 +++++++++++++++++++--- utilities/bugtool/automake.mk | 3 +- utilities/bugtool/ovs-bugtool-get-iface-stats | 25 ++++++ .../bugtool/plugins/network-status/openvswitch.xml | 1 + vswitchd/vswitch.xml | 24 ++++++ 5 files changed, 139 insertions(+), 13 deletions(-) create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats
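The per-stage accounting the patch performs in the send path can be illustrated in isolation: each stage returns how many packets survive, and the shortfall at each stage is attributed to that stage's counter. The stage functions below are simulated stand-ins, not the real DPDK calls:

```c
#include <assert.h>
#include <stdint.h>

struct drop_stats {
    uint64_t mtu_exceeded;
    uint64_t qos;
    uint64_t tx_failure;
};

/* Simulated stages: each takes a packet count and returns how many
 * pass.  The drop amounts here are arbitrary for illustration. */
static int filter_mtu(int cnt) { return cnt - 1; } /* 1 oversized pkt */
static int run_qos(int cnt)    { return cnt - 2; } /* 2 policed pkts  */
static int tx_burst(int cnt)   { return cnt - 1; } /* 1 not sent      */

/* Mirrors the patch's bookkeeping: charge the loss at each stage to
 * its own counter, then total the drops.  Returns total drops. */
static uint64_t
send_batch(int cnt, struct drop_stats *s)
{
    int after;

    after = filter_mtu(cnt);           /* MTU filter first           */
    s->mtu_exceeded += cnt - after;
    cnt = after;

    after = run_qos(cnt);              /* then the egress policer    */
    s->qos += cnt - after;
    cnt = after;

    after = tx_burst(cnt);             /* finally the actual send    */
    s->tx_failure += cnt - after;

    return s->mtu_exceeded + s->qos + s->tx_failure;
}
```

This is exactly why the patch snapshots the count before each stage (`mtu_drops = total_pkts - cnt; qos_drops = cnt; ... qos_drops -= cnt;`): the single old `tx_dropped` total is preserved while each cause gets its own counter.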