diff mbox series

[ovs-dev,v7] Detailed packet drop statistics per dpdk and vhostuser ports

Message ID 1566997134-10160-1-git-send-email-sriram.v@altencalsoftlabs.com
State Changes Requested
Delegated to: Ilya Maximets
Headers show
Series [ovs-dev,v7] Detailed packet drop statistics per dpdk and vhostuser ports | expand

Commit Message

Sriram Vatala via dev Aug. 28, 2019, 12:58 p.m. UTC
OVS may be unable to transmit packets for multiple reasons and
today there is a single counter to track packets dropped due to
any of those reasons. The most common reason is that a VM is
unable to read packets fast enough causing the vhostuser port
transmit queue on the OVS side to become full. This manifests
as a problem with VNFs not receiving all packets. Having a
separate drop counter to track packets dropped because the
transmit queue is full will clearly indicate that the problem
is on the VM side and not in OVS. Similarly, maintaining
separate counters for all possible drop reasons helps pinpoint
the actual cause of packet drops.

This patch adds custom stats counters to track packets dropped
at port level; these counters are displayed along with other
stats by the "ovs-vsctl get interface <iface> statistics"
command. The detailed stats will be available for both dpdk and
vhostuser ports.

Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
---
 lib/netdev-dpdk.c                                  | 99 +++++++++++++++++++---
 utilities/bugtool/automake.mk                      |  3 +-
 utilities/bugtool/ovs-bugtool-get-iface-stats      | 25 ++++++
 .../bugtool/plugins/network-status/openvswitch.xml |  1 +
 vswitchd/vswitch.xml                               | 24 ++++++
 5 files changed, 139 insertions(+), 13 deletions(-)
 create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats
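
For reference, the patch drives the new counters from a single X-macro
list (DPDK_CSTAT/VHOST_CSTAT in the diff) so that counting, naming, and
reading the counters stay in sync. Below is a minimal, self-contained
sketch of that pattern with simplified names; 'dev_sw_stats' and
'SW_CSTAT' are illustrative identifiers, not taken from the patch.

#include <inttypes.h>
#include <stdio.h>

/* One list defines all counters; each expansion of SW_CSTAT reuses it. */
#define SW_CSTATS                     \
    SW_CSTAT(tx_failure_drops)        \
    SW_CSTAT(tx_mtu_exceeded_drops)   \
    SW_CSTAT(tx_qos_drops)            \
    SW_CSTAT(rx_qos_drops)

struct dev_sw_stats {
#define SW_CSTAT(NAME) uint64_t NAME;
    SW_CSTATS
#undef SW_CSTAT
};

int main(void)
{
    struct dev_sw_stats dev = { .tx_failure_drops = 3, .rx_qos_drops = 1 };

    /* Expanding each entry to "+ 1" counts the list, just like
     * "custom_stats->size = VHOST_CSTATS;" does in the patch. */
#define SW_CSTAT(NAME) + 1
    int n = 0 SW_CSTATS;
#undef SW_CSTAT
    printf("%d counters\n", n);

    /* Expanding again reports every counter by its stringized name. */
#define SW_CSTAT(NAME) printf("%-24s %" PRIu64 "\n", #NAME, dev.NAME);
    SW_CSTATS
#undef SW_CSTAT
    return 0;
}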

Comments

0-day Robot Aug. 28, 2019, 1:59 p.m. UTC | #1
Bleep bloop.  Greetings Sriram Vatala via dev, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 138 characters long (recommended limit is 79)
#351 FILE: utilities/bugtool/plugins/network-status/openvswitch.xml:37:
    <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command>

Lines checked: 392, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email aconole@redhat.com

Thanks,
0-day Robot
Sriram Vatala via dev Aug. 30, 2019, 11:34 a.m. UTC | #2
Hi All,
Please consider this as a gentle reminder.

Thanks & Regards,
Sriram.

Ilya Maximets Aug. 30, 2019, 1:23 p.m. UTC | #3
On 28.08.2019 15:58, Sriram Vatala wrote:
> OVS may be unable to transmit packets for multiple reasons and
> today there is a single counter to track packets dropped due to
> any of those reasons. The most common reason is that a VM is
> unable to read packets fast enough causing the vhostuser port
> transmit queue on the OVS side to become full. This manifests
> as a problem with VNFs not receiving all packets. Having a
> separate drop counter to track packets dropped because the
> transmit queue is full will clearly indicate that the problem
> is on the VM side and not in OVS. Similarly maintaining separate
> counters for all possible drops helps in indicating sensible
> cause for packet drops.
> 
> This patch adds custom stats counters to track packets dropped
> at port level and these counters are displayed along with
> other stats in "ovs-vsctl get interface <iface> statistics"
> command. The detailed stats will be available for both dpdk and
> vhostuser ports.
> 
> Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
> ---

Hi. Thanks for a new version.

It would be good if you include the version history right here under
the '---' cut line, so it'll be easier to track changes.
Like this:

Version 7:
  * Implemented something.
  * Fixed something.
  * Patch renamed. Previous name: ...

Version 6:
  * ...


Another general comment is that, IMHO, it's better to rename this
patch to "netdev-dpdk: Detailed packet drop statistics.".
This patch adds statistics for all the dpdk-based interfaces, so
there is no need to list them in the commit name. The prefix clearly
describes the patch area. (Please do not drop the version number
because of the patch re-naming; just add a note about the renaming in
the version history.)

More comments inline.

Best regards, Ilya Maximets.

>  lib/netdev-dpdk.c                                  | 99 +++++++++++++++++++---
>  utilities/bugtool/automake.mk                      |  3 +-
>  utilities/bugtool/ovs-bugtool-get-iface-stats      | 25 ++++++
>  .../bugtool/plugins/network-status/openvswitch.xml |  1 +
>  vswitchd/vswitch.xml                               | 24 ++++++
>  5 files changed, 139 insertions(+), 13 deletions(-)
>  create mode 100755 utilities/bugtool/ovs-bugtool-get-iface-stats
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index bc20d68..a679c5b 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -413,8 +413,17 @@ struct netdev_dpdk {
>  
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>          struct netdev_stats stats;
> -        /* Custom stat for retries when unable to transmit. */
> +        /* Custom device stats */
> +        /* No. of retries when unable to transmit. */
>          uint64_t tx_retries;
> +        /* Pkts dropped when unable to transmit; Probably Tx queue is full */
> +        uint64_t tx_failure_drops;
> +        /* Pkts len greater than max dev MTU */
> +        uint64_t tx_mtu_exceeded_drops;
> +        /* Pkt drops in egress policer processing */
> +        uint64_t tx_qos_drops;
> +        /* Pkt drops in ingress policer processing */
> +        uint64_t rx_qos_drops;
>          /* Protects stats */
>          rte_spinlock_t stats_lock;
>          /* 4 pad bytes here. */

Padding still needs re-calculation.
Another option is to move all the counters to a separate structure
like 'struct netdev_dpdk_sw_stats' and keep a pointer to it in
'struct netdev_dpdk'.

Another point is about naming. We should probably add some prefix to these
stats to distinguish them from HW/driver stats and avoid possible collisions.
This could be done here in the variable names or while reporting them.
I guess a 'netdev_dpdk_' prefix might be suitable to clearly highlight the
origin of the counter. (It's probably better to add such a prefix while
reporting, since it is not very useful inside the netdev-dpdk code.)
Suggestions are welcome.
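
For illustration, here is a rough sketch of that pointer-based
alternative, assuming the 'struct netdev_dpdk_sw_stats' name suggested
above; the field set, placement, and comments are illustrative, not the
final code.

struct netdev_dpdk_sw_stats {
    uint64_t tx_retries;             /* No. of retries when unable to transmit. */
    uint64_t tx_failure_drops;       /* Tx failures, e.g. the queue was full. */
    uint64_t tx_mtu_exceeded_drops;  /* Packet length exceeded the device MTU. */
    uint64_t tx_qos_drops;           /* Dropped by the egress policer. */
    uint64_t rx_qos_drops;           /* Dropped by the ingress policer. */
};

struct netdev_dpdk {
    /* ... */
    PADDED_MEMBERS(CACHE_LINE_SIZE,
        struct netdev_stats stats;
        /* A pointer keeps this padded block a fixed size no matter how
         * many software counters are added later. */
        struct netdev_dpdk_sw_stats *sw_stats;
        rte_spinlock_t stats_lock;   /* Protects stats and *sw_stats. */
    );
    /* ... */
};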


> @@ -2171,6 +2180,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>      struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>      uint16_t nb_rx = 0;
>      uint16_t dropped = 0;
> +    uint16_t qos_drops = 0;
>      int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
>      int vid = netdev_dpdk_get_vid(dev);
>  
> @@ -2202,11 +2212,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>                                      (struct rte_mbuf **) batch->packets,
>                                      nb_rx, true);
>          dropped -= nb_rx;
> +        qos_drops = dropped;
>      }
>  
>      rte_spinlock_lock(&dev->stats_lock);
>      netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
>                                           nb_rx, dropped);
> +    dev->rx_qos_drops += qos_drops;
>      rte_spinlock_unlock(&dev->stats_lock);
>  
>      batch->count = nb_rx;
> @@ -2232,6 +2244,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
>      struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>      int nb_rx;
>      int dropped = 0;
> +    int qos_drops = 0;
>  
>      if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
>          return EAGAIN;
> @@ -2250,12 +2263,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
>                                      (struct rte_mbuf **) batch->packets,
>                                      nb_rx, true);
>          dropped -= nb_rx;
> +        qos_drops = dropped;
>      }
>  
>      /* Update stats to reflect dropped packets */
>      if (OVS_UNLIKELY(dropped)) {
>          rte_spinlock_lock(&dev->stats_lock);
>          dev->stats.rx_dropped += dropped;
> +        dev->rx_qos_drops += qos_drops;
>          rte_spinlock_unlock(&dev->stats_lock);
>      }
>  
> @@ -2339,6 +2354,9 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>      struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
>      unsigned int total_pkts = cnt;
>      unsigned int dropped = 0;
> +    unsigned int tx_failure;
> +    unsigned int mtu_drops;
> +    unsigned int qos_drops;
>      int i, retries = 0;
>      int max_retries = VHOST_ENQ_RETRY_MIN;
>      int vid = netdev_dpdk_get_vid(dev);
> @@ -2356,9 +2374,12 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>      rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>  
>      cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
> +    mtu_drops = total_pkts - cnt;
> +    qos_drops = cnt;
>      /* Check has QoS has been configured for the netdev */
>      cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
> -    dropped = total_pkts - cnt;
> +    qos_drops -= cnt;
> +    dropped = qos_drops + mtu_drops;
>  
>      do {
>          int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
> @@ -2383,12 +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>          }
>      } while (cnt && (retries++ < max_retries));
>  
> +    tx_failure = cnt;
>      rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>  
>      rte_spinlock_lock(&dev->stats_lock);
>      netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
>                                           cnt + dropped);
>      dev->tx_retries += MIN(retries, max_retries);
> +    dev->tx_failure_drops += tx_failure;
> +    dev->tx_mtu_exceeded_drops += mtu_drops;
> +    dev->tx_qos_drops += qos_drops;
>      rte_spinlock_unlock(&dev->stats_lock);
>  
>  out:
> @@ -2413,12 +2438,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
>      struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
>      uint32_t cnt = batch_cnt;
>      uint32_t dropped = 0;
> +    uint32_t tx_failure = 0;
> +    uint32_t mtu_drops = 0;
> +    uint32_t qos_drops = 0;
>  
>      if (dev->type != DPDK_DEV_VHOST) {
>          /* Check if QoS has been configured for this netdev. */
>          cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets,
>                                    batch_cnt, false);
> -        dropped += batch_cnt - cnt;
> +        qos_drops = batch_cnt - cnt;
>      }
>  
>      uint32_t txcnt = 0;
> @@ -2431,7 +2459,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
>              VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
>                           size, dev->max_packet_len);
>  
> -            dropped++;
> +            mtu_drops++;
>              continue;
>          }
>  
> @@ -2454,13 +2482,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
>              __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts,
>                                       txcnt);
>          } else {
> -            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
> +            tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
>          }
>      }
>  
> +    dropped += qos_drops + mtu_drops + tx_failure;
>      if (OVS_UNLIKELY(dropped)) {
>          rte_spinlock_lock(&dev->stats_lock);
>          dev->stats.tx_dropped += dropped;
> +        dev->tx_failure_drops += tx_failure;
> +        dev->tx_mtu_exceeded_drops += mtu_drops;
> +        dev->tx_qos_drops += qos_drops;
>          rte_spinlock_unlock(&dev->stats_lock);
>      }
>  }
> @@ -2502,18 +2534,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>          dp_packet_delete_batch(batch, true);
>      } else {
>          int tx_cnt, dropped;
> +        int tx_failure, mtu_drops, qos_drops;
>          int batch_cnt = dp_packet_batch_size(batch);
>          struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
>  
>          tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt);
> +        mtu_drops = batch_cnt - tx_cnt;
> +        qos_drops = tx_cnt;
>          tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true);
> -        dropped = batch_cnt - tx_cnt;
> +        qos_drops -= tx_cnt;
>  
> -        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
> +        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
>  
> +        dropped = tx_failure + mtu_drops + qos_drops;
>          if (OVS_UNLIKELY(dropped)) {
>              rte_spinlock_lock(&dev->stats_lock);
>              dev->stats.tx_dropped += dropped;
> +            dev->tx_failure_drops += tx_failure;
> +            dev->tx_mtu_exceeded_drops += mtu_drops;
> +            dev->tx_qos_drops += qos_drops;
>              rte_spinlock_unlock(&dev->stats_lock);
>          }
>      }
> @@ -2772,6 +2811,17 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>      uint32_t i;
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int rte_xstats_ret;
> +    uint16_t index = 0;
> +
> +#define DPDK_CSTATS                         \
> +    DPDK_CSTAT(tx_failure_drops)            \
> +    DPDK_CSTAT(tx_mtu_exceeded_drops)       \
> +    DPDK_CSTAT(tx_qos_drops)                \
> +    DPDK_CSTAT(rx_qos_drops)
> +
> +#define DPDK_CSTAT(NAME) +1
> +    custom_stats->size = DPDK_CSTATS;
> +#undef DPDK_CSTAT
>  
>      ovs_mutex_lock(&dev->mutex);
>  
> @@ -2786,9 +2836,10 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>          if (rte_xstats_ret > 0 &&
>              rte_xstats_ret <= dev->rte_xstats_ids_size) {
>  
> -            custom_stats->size = rte_xstats_ret;
> +            index = rte_xstats_ret;
> +            custom_stats->size += rte_xstats_ret;
>              custom_stats->counters =
> -                    (struct netdev_custom_counter *) xcalloc(rte_xstats_ret,
> +                   (struct netdev_custom_counter *) xcalloc(custom_stats->size,
>                              sizeof(struct netdev_custom_counter));
>  
>              for (i = 0; i < rte_xstats_ret; i++) {
> @@ -2802,7 +2853,6 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>              VLOG_WARN("Cannot get XSTATS values for port: "DPDK_PORT_ID_FMT,
>                        dev->port_id);
>              custom_stats->counters = NULL;
> -            custom_stats->size = 0;
>              /* Let's clear statistics cache, so it will be
>               * reconfigured */
>              netdev_dpdk_clear_xstats(dev);
> @@ -2811,6 +2861,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>          free(values);
>      }
>  
> +    if (custom_stats->counters == NULL) {
> +        custom_stats->counters =
> +            (struct netdev_custom_counter *) xcalloc(custom_stats->size,
> +                                  sizeof(struct netdev_custom_counter));
> +    }
> +
> +    rte_spinlock_lock(&dev->stats_lock);
> +    i = index;
> +#define DPDK_CSTAT(NAME)                                     \
> +    ovs_strlcpy(custom_stats->counters[i++].name, #NAME,     \
> +                NETDEV_CUSTOM_STATS_NAME_SIZE);
> +    DPDK_CSTATS;
> +#undef DPDK_CSTAT
> +
> +    i = index;
> +#define DPDK_CSTAT(NAME)                                     \
> +    custom_stats->counters[i++].value = dev->NAME;
> +    DPDK_CSTATS;
> +#undef DPDK_CSTAT
> +    rte_spinlock_unlock(&dev->stats_lock);
> +
>      ovs_mutex_unlock(&dev->mutex);
>  
>      return 0;
> @@ -2823,8 +2894,12 @@ netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int i;
>  
> -#define VHOST_CSTATS \
> -    VHOST_CSTAT(tx_retries)
> +#define VHOST_CSTATS                        \
> +    VHOST_CSTAT(tx_retries)                 \
> +    VHOST_CSTAT(tx_failure_drops)           \
> +    VHOST_CSTAT(tx_mtu_exceeded_drops)      \
> +    VHOST_CSTAT(tx_qos_drops)               \
> +    VHOST_CSTAT(rx_qos_drops)
>  
>  #define VHOST_CSTAT(NAME) + 1
>      custom_stats->size = VHOST_CSTATS;
> diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk
> index 18fa347..9657468 100644
> --- a/utilities/bugtool/automake.mk
> +++ b/utilities/bugtool/automake.mk
> @@ -22,7 +22,8 @@ bugtool_scripts = \
>  	utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \
>  	utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \
>  	utilities/bugtool/ovs-bugtool-qos-configs \
> -	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa
> +	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \
> +	utilities/bugtool/ovs-bugtool-get-iface-stats
>  
>  scripts_SCRIPTS += $(bugtool_scripts)
>  
> diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats b/utilities/bugtool/ovs-bugtool-get-iface-stats
> new file mode 100755
> index 0000000..0fe175e
> --- /dev/null
> +++ b/utilities/bugtool/ovs-bugtool-get-iface-stats
> @@ -0,0 +1,25 @@
> +#! /bin/bash
> +
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of version 2.1 of the GNU Lesser General
> +# Public License as published by the Free Software Foundation.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# Copyright (C) 2019 Ericsson AB
> +
> +for bridge in `ovs-vsctl -- --real list-br`
> +do
> +    echo -e "\nBridge : ${bridge}\n"
> +    for iface in `ovs-vsctl list-ifaces ${bridge}`
> +    do
> +        echo -e "iface : ${iface}"
> +        ovs-vsctl get interface ${iface} statistics
> +        echo -e "\n"
> +    done
> +    echo -e "iface : ${bridge}"
> +    ovs-vsctl get interface ${bridge} statistics
> +done
> diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml b/utilities/bugtool/plugins/network-status/openvswitch.xml
> index d39867c..f8c4ff0 100644
> --- a/utilities/bugtool/plugins/network-status/openvswitch.xml
> +++ b/utilities/bugtool/plugins/network-status/openvswitch.xml
> @@ -34,6 +34,7 @@
>      <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command>
>      <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command>
>      <command label="ovs-appctl-dpctl-show-s" filters="ovs" repeat="2">ovs-appctl dpctl/show -s</command>
> +    <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command>
>      <command label="ovs-ofctl-show" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "show"</command>
>      <command label="ovs-ofctl-dump-flows" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-flows"</command>
>      <command label="ovs-ofctl-dump-ports" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-ports"</command>
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 9a743c0..4a7bcf8 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>            the above.
>          </column>
>        </group>
> +      <group title="Statistics: Custom stats">
> +        <column name="statistics" key="tx_retries">
> +          Total number of transmit retries on a vhost-user or vhost-user-client
> +          interface.
> +        </column>
> +        <column name="statistics" key="tx_failure_drops">
> +          Total number of packets dropped because the DPDK transmit API for
> +          physical/vhost ports failed to transmit the packets, most likely
> +          because the transmit queue was full. Other causes are possible
> +          but unlikely.
> +        </column>
> +        <column name="statistics" key="tx_mtu_exceeded_drops">
> +          Number of packets dropped due to packet length exceeding the max
> +          device MTU.
> +        </column>
> +        <column name="statistics" key="tx_qos_drops">
> +          Total number of packets dropped due to transmission rate exceeding
> +          the configured egress policer rate.
> +        </column>
> +        <column name="statistics" key="rx_qos_drops">
> +          Total number of packets dropped due to reception rate exceeding
> +          the configured ingress policer rate.
> +        </column>
> +      </group>

I'm not sure it makes sense to document all the stats here. Even the usual
"Statistics" group doesn't have all of the default stats described. All of the
above should be easy to understand just by looking at the name. Also, these are
netdev-dpdk specific and not supported by other netdev types. If needed, these
stats could be mentioned in the DPDK guides in Documentation/ .

>      </group>
>  
>      <group title="Ingress Policing">
>
Sriram Vatala via dev Sept. 2, 2019, 6:48 a.m. UTC | #4
Hi Ilya,
Thanks for reviewing the patch and for the suggestions. I will address your
comments in the next patch.

Thanks & Regards,
Sriram.

Li,Rongqing via dev Sept. 4, 2019, 1:31 p.m. UTC | #5
Hi Ilya,
1) I was working on addressing your comments and had a small query about one
of them.
2) I am trying to understand the padding-bytes problem in struct netdev_dpdk
that you are referring to in your comment.
3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)"
ensures that the memory allocated for all 'MEMBERS' is rounded up to the
nearest multiple of CACHE_LINE_SIZE, which is 64 in this case; the macro adds
pad bytes to reach that multiple.
4) At runtime, I checked the size of the stats section of netdev_dpdk without
the new counters introduced in my patch. It is 384 bytes, of which struct
netdev_stats alone occupies 336 bytes and the tx_retries counter 8 bytes
(I could see there is no padding between netdev_stats and tx_retries). The
total of 344 is rounded up to 384 [((344 + 64 - 1)/64) * 64].
5) With my new counters, the size remains the same after padding:
336 + 8 + 8 + 8 + 8 + 8 = 376, which is rounded up to 384 bytes
[((376 + 64 - 1)/64) * 64] at runtime.

I want to check with you whether I have correctly understood the problem you
are referring to in your comment. If you could explain a bit more, it would
help me understand it and think about possible solutions.

The following is a snippet of the memory layout of netdev_dpdk at runtime:
    union {
        OVS_CACHE_LINE_MARKER cacheline1;
        struct {...};
        uint8_t pad50[64];
    };
    union {
        struct {...};
        uint8_t pad51[192];
    };
    union {
        struct {...};
        uint8_t pad52[384];
    };
    union {
        struct {...};
        uint8_t pad53[128];
    };
    union {
        struct {...};
        uint8_t pad54[64];
    };
}
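
To make the arithmetic concrete, here is a minimal self-contained sketch of
the same union-padding idiom (a sketch only, not the exact OVS macro; the
member sizes are stand-ins chosen to match the numbers above, including the
4-byte spinlock discussed below):

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CACHE_LINE_SIZE 64
    #define ROUND_UP(X, Y) (((X) + (Y) - 1) / (Y) * (Y))

    /* The union forces the member group to occupy a whole number of cache
     * lines: its size is the larger of the members and the pad array. */
    struct stats_block {
        union {
            struct {
                uint64_t stats[42];  /* stand-in for struct netdev_stats (336 B) */
                uint64_t tx_retries; /* 8 B */
                uint32_t stats_lock; /* stand-in for rte_spinlock_t (4 B) */
            };
            uint8_t pad[ROUND_UP(42 * 8 + 8 + 4, CACHE_LINE_SIZE)];
        };
    };

    int main(void)
    {
        /* 336 + 8 + 4 = 348 bytes of members, rounded up to 384. */
        printf("padded size: %zu\n", sizeof(struct stats_block));
        assert(sizeof(struct stats_block) == 384);
        return 0;
    }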

Thanks & Regards,
Sriram.

-----Original Message-----
From: Ilya Maximets <i.maximets@samsung.com>
Sent: 30 August 2019 18:53
To: Sriram Vatala <sriram.v@altencalsoftlabs.com>; ovs-dev@openvswitch.org
Cc: blp@ovn.org; ian.stokes@intel.com; Kevin Traynor <ktraynor@redhat.com>
Subject: Re: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser 
ports

On 28.08.2019 15:58, Sriram Vatala wrote:
> OVS may be unable to transmit packets for multiple reasons and today
> there is a single counter to track packets dropped due to any of those
> reasons. The most common reason is that a VM is unable to read packets
> fast enough causing the vhostuser port transmit queue on the OVS side
> to become full. This manifests as a problem with VNFs not receiving
> all packets. Having a separate drop counter to track packets dropped
> because the transmit queue is full will clearly indicate that the
> problem is on the VM side and not in OVS. Similarly maintaining
> separate counters for all possible drops helps in indicating sensible
> cause for packet drops.
>
> This patch adds custom stats counters to track packets dropped at port
> level and these counters are displayed along with other stats in
> "ovs-vsctl get interface <iface> statistics"
> command. The detailed stats will be available for both dpdk and
> vhostuser ports.
>
> Signed-off-by: Sriram Vatala <sriram.v@altencalsoftlabs.com>
> ---

Hi. Thanks for the new version.

It would be good if you included the version history right here under the
'---' cut line, so it's easier to track changes.
Like this:

Version 7:
  * Implemented something.
  * Fixed something.
  * Patch renamed. Previous name: ...

Version 6:
  * ...


Another general comment is that, IMHO, it's better to rename this patch to
"netdev-dpdk: Detailed packet drop statistics."
The patch adds statistics for all the DPDK-based interfaces, so there is no
need to list them in the commit name; the prefix clearly describes the patch
area. (Please do not reset the version number because of the renaming; just
add a note about it in the version history.)

More comments inline.

Best regards, Ilya Maximets.

>  lib/netdev-dpdk.c                                  | 99 
> +++++++++++++++++++---
>  utilities/bugtool/automake.mk                      |  3 +-
>  utilities/bugtool/ovs-bugtool-get-iface-stats      | 25 ++++++
>  .../bugtool/plugins/network-status/openvswitch.xml |  1 +
>  vswitchd/vswitch.xml                               | 24 ++++++
>  5 files changed, 139 insertions(+), 13 deletions(-)  create mode
> 100755 utilities/bugtool/ovs-bugtool-get-iface-stats
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> bc20d68..a679c5b 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -413,8 +413,17 @@ struct netdev_dpdk {
>
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>          struct netdev_stats stats;
> -        /* Custom stat for retries when unable to transmit. */
> +        /* Custom device stats */
> +        /* No. of retries when unable to transmit. */
>          uint64_t tx_retries;
> +        /* Pkts dropped when unable to transmit; probably Tx queue is full */
> +        uint64_t tx_failure_drops;
> +        /* Pkt len greater than max dev MTU */
> +        uint64_t tx_mtu_exceeded_drops;
> +        /* Pkt drops in egress policer processing */
> +        uint64_t tx_qos_drops;
> +        /* Pkt drops in ingress policer processing */
> +        uint64_t rx_qos_drops;
>          /* Protects stats */
>          rte_spinlock_t stats_lock;
>          /* 4 pad bytes here. */

Padding still needs re-calculation.
Another option is to move all the counters to a separate structure like
'struct netdev_dpdk_sw_stats' and keep a pointer to it in 'struct netdev_dpdk'.

Another point is about naming. We should probably add a prefix to these stats
to distinguish them from HW/driver stats and to avoid possible collisions.
This could be done here in the variable names or while reporting them.
I guess a 'netdev_dpdk_' prefix might be suitable to clearly highlight the
origin of each counter (it's probably better to add the prefix while
reporting, because it is not very useful inside the netdev-dpdk code).
Suggestions are welcome.
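
As a rough illustration of both suggestions ('struct netdev_dpdk_sw_stats' is
the name suggested above; everything else here, including make_counter_name(),
is hypothetical):

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical grouping of the software-maintained counters.  With the
     * counters behind a single pointer, adding one later changes only this
     * struct, not the padded layout of struct netdev_dpdk itself. */
    struct netdev_dpdk_sw_stats {
        uint64_t tx_retries;            /* vhost tx retries. */
        uint64_t tx_failure_drops;      /* DPDK tx API failed to send. */
        uint64_t tx_mtu_exceeded_drops; /* Packet longer than device MTU. */
        uint64_t tx_qos_drops;          /* Dropped by the egress policer. */
        uint64_t rx_qos_drops;          /* Dropped by the ingress policer. */
    };

    /* The origin prefix would be attached only at reporting time. */
    static void
    make_counter_name(char *buf, size_t size, const char *name)
    {
        snprintf(buf, size, "netdev_dpdk_%s", name);
    }

    int main(void)
    {
        char name[64];

        make_counter_name(name, sizeof name, "tx_qos_drops");
        printf("%s\n", name);   /* prints: netdev_dpdk_tx_qos_drops */
        return 0;
    }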


> @@ -2171,6 +2180,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>      struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>      uint16_t nb_rx = 0;
>      uint16_t dropped = 0;
> +    uint16_t qos_drops = 0;
>      int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
>      int vid = netdev_dpdk_get_vid(dev);
>
> @@ -2202,11 +2212,13 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>                                      (struct rte_mbuf **) batch->packets,
>                                      nb_rx, true);
>          dropped -= nb_rx;
> +        qos_drops = dropped;
>      }
>
>      rte_spinlock_lock(&dev->stats_lock);
>      netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
>                                           nb_rx, dropped);
> +    dev->rx_qos_drops += qos_drops;
>      rte_spinlock_unlock(&dev->stats_lock);
>
>      batch->count = nb_rx;
> @@ -2232,6 +2244,7 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
> dp_packet_batch *batch,
>      struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
>      int nb_rx;
>      int dropped = 0;
> +    int qos_drops = 0;
>
>      if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
>          return EAGAIN;
> @@ -2250,12 +2263,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
> dp_packet_batch *batch,
>                                      (struct rte_mbuf **) batch->packets,
>                                      nb_rx, true);
>          dropped -= nb_rx;
> +        qos_drops = dropped;
>      }
>
>      /* Update stats to reflect dropped packets */
>      if (OVS_UNLIKELY(dropped)) {
>          rte_spinlock_lock(&dev->stats_lock);
>          dev->stats.rx_dropped += dropped;
> +        dev->rx_qos_drops += qos_drops;
>          rte_spinlock_unlock(&dev->stats_lock);
>      }
>
> @@ -2339,6 +2354,9 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int 
> qid,
>      struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
>      unsigned int total_pkts = cnt;
>      unsigned int dropped = 0;
> +    unsigned int tx_failure;
> +    unsigned int mtu_drops;
> +    unsigned int qos_drops;
>      int i, retries = 0;
>      int max_retries = VHOST_ENQ_RETRY_MIN;
>      int vid = netdev_dpdk_get_vid(dev);
> @@ -2356,9 +2374,12 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>      rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>
>      cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
> +    mtu_drops = total_pkts - cnt;
> +    qos_drops = cnt;
>      /* Check has QoS has been configured for the netdev */
>      cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
> -    dropped = total_pkts - cnt;
> +    qos_drops -= cnt;
> +    dropped = qos_drops + mtu_drops;
>
>      do {
>          int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
> @@ -2383,12 +2404,16 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>          }
>      } while (cnt && (retries++ < max_retries));
>
> +    tx_failure = cnt;
>      rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>
>      rte_spinlock_lock(&dev->stats_lock);
>      netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
>                                           cnt + dropped);
>      dev->tx_retries += MIN(retries, max_retries);
> +    dev->tx_failure_drops += tx_failure;
> +    dev->tx_mtu_exceeded_drops += mtu_drops;
> +    dev->tx_qos_drops += qos_drops;
>      rte_spinlock_unlock(&dev->stats_lock);
>
>  out:
> @@ -2413,12 +2438,15 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, 
> struct dp_packet_batch *batch)
>      struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
>      uint32_t cnt = batch_cnt;
>      uint32_t dropped = 0;
> +    uint32_t tx_failure = 0;
> +    uint32_t mtu_drops = 0;
> +    uint32_t qos_drops = 0;
>
>      if (dev->type != DPDK_DEV_VHOST) {
>          /* Check if QoS has been configured for this netdev. */
>          cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets,
>                                    batch_cnt, false);
> -        dropped += batch_cnt - cnt;
> +        qos_drops = batch_cnt - cnt;
>      }
>
>      uint32_t txcnt = 0;
> @@ -2431,7 +2459,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct 
> dp_packet_batch *batch)
>              VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
>                           size, dev->max_packet_len);
>
> -            dropped++;
> +            mtu_drops++;
>              continue;
>          }
>
> @@ -2454,13 +2482,17 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, 
> struct dp_packet_batch *batch)
>              __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) 
> pkts,
>                                       txcnt);
>          } else {
> -            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
> +            tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts,
> + txcnt);
>          }
>      }
>
> +    dropped += qos_drops + mtu_drops + tx_failure;
>      if (OVS_UNLIKELY(dropped)) {
>          rte_spinlock_lock(&dev->stats_lock);
>          dev->stats.tx_dropped += dropped;
> +        dev->tx_failure_drops += tx_failure;
> +        dev->tx_mtu_exceeded_drops += mtu_drops;
> +        dev->tx_qos_drops += qos_drops;
>          rte_spinlock_unlock(&dev->stats_lock);
>      }
>  }
> @@ -2502,18 +2534,25 @@ netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
>          dp_packet_delete_batch(batch, true);
>      } else {
>          int tx_cnt, dropped;
> +        int tx_failure, mtu_drops, qos_drops;
>          int batch_cnt = dp_packet_batch_size(batch);
>          struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
>
>          tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt);
> +        mtu_drops = batch_cnt - tx_cnt;
> +        qos_drops = tx_cnt;
>          tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true);
> -        dropped = batch_cnt - tx_cnt;
> +        qos_drops -= tx_cnt;
>
> -        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
> +        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts,
> + tx_cnt);
>
> +        dropped = tx_failure + mtu_drops + qos_drops;
>          if (OVS_UNLIKELY(dropped)) {
>              rte_spinlock_lock(&dev->stats_lock);
>              dev->stats.tx_dropped += dropped;
> +            dev->tx_failure_drops += tx_failure;
> +            dev->tx_mtu_exceeded_drops += mtu_drops;
> +            dev->tx_qos_drops += qos_drops;
>              rte_spinlock_unlock(&dev->stats_lock);
>          }
>      }
> @@ -2772,6 +2811,17 @@ netdev_dpdk_get_custom_stats(const struct netdev 
> *netdev,
>      uint32_t i;
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int rte_xstats_ret;
> +    uint16_t index = 0;
> +
> +#define DPDK_CSTATS                         \
> +    DPDK_CSTAT(tx_failure_drops)            \
> +    DPDK_CSTAT(tx_mtu_exceeded_drops)       \
> +    DPDK_CSTAT(tx_qos_drops)                \
> +    DPDK_CSTAT(rx_qos_drops)
> +
> +#define DPDK_CSTAT(NAME) +1
> +    custom_stats->size = DPDK_CSTATS;
> +#undef DPDK_CSTAT
>
>      ovs_mutex_lock(&dev->mutex);
>
> @@ -2786,9 +2836,10 @@ netdev_dpdk_get_custom_stats(const struct netdev 
> *netdev,
>          if (rte_xstats_ret > 0 &&
>              rte_xstats_ret <= dev->rte_xstats_ids_size) {
>
> -            custom_stats->size = rte_xstats_ret;
> +            index = rte_xstats_ret;
> +            custom_stats->size += rte_xstats_ret;
>              custom_stats->counters =
> -                    (struct netdev_custom_counter *) 
> xcalloc(rte_xstats_ret,
> +                   (struct netdev_custom_counter *)
> + xcalloc(custom_stats->size,
>                              sizeof(struct netdev_custom_counter));
>
>              for (i = 0; i < rte_xstats_ret; i++) {
> @@ -2802,7 +2853,6 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>              VLOG_WARN("Cannot get XSTATS values for port: 
> "DPDK_PORT_ID_FMT,
>                        dev->port_id);
>              custom_stats->counters = NULL;
> -            custom_stats->size = 0;
>              /* Let's clear statistics cache, so it will be
>               * reconfigured */
>              netdev_dpdk_clear_xstats(dev);
> @@ -2811,6 +2861,27 @@ netdev_dpdk_get_custom_stats(const struct netdev *netdev,
>          free(values);
>      }
>
> +    if (custom_stats->counters == NULL) {
> +        custom_stats->counters =
> +            (struct netdev_custom_counter *) xcalloc(custom_stats->size,
> +                                  sizeof(struct netdev_custom_counter));
> +    }
> +
> +    rte_spinlock_lock(&dev->stats_lock);
> +    i = index;
> +#define DPDK_CSTAT(NAME)                                     \
> +    ovs_strlcpy(custom_stats->counters[i++].name, #NAME,     \
> +                NETDEV_CUSTOM_STATS_NAME_SIZE);
> +    DPDK_CSTATS;
> +#undef DPDK_CSTAT
> +
> +    i = index;
> +#define DPDK_CSTAT(NAME)                                     \
> +    custom_stats->counters[i++].value = dev->NAME;
> +    DPDK_CSTATS;
> +#undef DPDK_CSTAT
> +    rte_spinlock_unlock(&dev->stats_lock);
> +
>      ovs_mutex_unlock(&dev->mutex);
>
>      return 0;
> @@ -2823,8 +2894,12 @@ netdev_dpdk_vhost_get_custom_stats(const struct 
> netdev *netdev,
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int i;
>
> -#define VHOST_CSTATS \
> -    VHOST_CSTAT(tx_retries)
> +#define VHOST_CSTATS                        \
> +    VHOST_CSTAT(tx_retries)                 \
> +    VHOST_CSTAT(tx_failure_drops)           \
> +    VHOST_CSTAT(tx_mtu_exceeded_drops)      \
> +    VHOST_CSTAT(tx_qos_drops)               \
> +    VHOST_CSTAT(rx_qos_drops)
>
>  #define VHOST_CSTAT(NAME) + 1
>      custom_stats->size = VHOST_CSTATS;
> diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk
> index 18fa347..9657468 100644
> --- a/utilities/bugtool/automake.mk
> +++ b/utilities/bugtool/automake.mk
> @@ -22,7 +22,8 @@ bugtool_scripts = \
>  	utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \
>  	utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \
>  	utilities/bugtool/ovs-bugtool-qos-configs \
> -	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa
> +	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \
> +	utilities/bugtool/ovs-bugtool-get-iface-stats
>
>  scripts_SCRIPTS += $(bugtool_scripts)
>
> diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats
> b/utilities/bugtool/ovs-bugtool-get-iface-stats
> new file mode 100755
> index 0000000..0fe175e
> --- /dev/null
> +++ b/utilities/bugtool/ovs-bugtool-get-iface-stats
> @@ -0,0 +1,25 @@
> +#! /bin/bash
> +
> +# This library is free software; you can redistribute it and/or
> +# modify it under the terms of version 2.1 of the GNU Lesser General
> +# Public License as published by the Free Software Foundation.
> +#
> +# This library is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +# Lesser General Public License for more details.
> +#
> +# Copyright (C) 2019 Ericsson AB
> +
> +for bridge in `ovs-vsctl -- --real list-br`
> +do
> +    echo -e "\nBridge : ${bridge}\n"
> +    for iface in `ovs-vsctl list-ifaces ${bridge}`
> +    do
> +        echo -e "iface : ${iface}"
> +        ovs-vsctl get interface ${iface} statistics
> +        echo -e "\n"
> +    done
> +    echo -e "iface : ${bridge}"
> +    ovs-vsctl get interface ${bridge} statistics
> +done
> diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml
> b/utilities/bugtool/plugins/network-status/openvswitch.xml
> index d39867c..f8c4ff0 100644
> --- a/utilities/bugtool/plugins/network-status/openvswitch.xml
> +++ b/utilities/bugtool/plugins/network-status/openvswitch.xml
> @@ -34,6 +34,7 @@
>      <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" 
> repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command>
>      <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" 
> repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command>
>      <command label="ovs-appctl-dpctl-show-s" filters="ovs"
> repeat="2">ovs-appctl dpctl/show -s</command>
> +    <command label="ovs-vsctl-get-interface-statistics"
> + filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-s
> + tats</command>
>      <command label="ovs-ofctl-show" 
> filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges 
> "show"</command>
>      <command label="ovs-ofctl-dump-flows" filters="ovs" 
> repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges 
> "dump-flows"</command>
>      <command label="ovs-ofctl-dump-ports" filters="ovs"
> repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-o
> ver-bridges "dump-ports"</command> diff --git a/vswitchd/vswitch.xml
> b/vswitchd/vswitch.xml index 9a743c0..4a7bcf8 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3486,6 +3486,30 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 
> type=patch options:peer=p1 \
>            the above.
>          </column>
>        </group>
> +      <group title="Statistics: Custom stats">
> +        <column name="statistics" key="tx_retries">
> +          Total number of transmit retries on a vhost-user or 
> vhost-user-client
> +          interface.
> +        </column>
> +        <column name="statistics" key="tx_failure_drops">
> +          Total number of packets dropped because the DPDK transmit API for
> +          physical/vhost ports fails to transmit the packets, most likely
> +          because the transmit queue is full. Other causes are possible
> +          but unlikely.
> +        </column>
> +        <column name="statistics" key="tx_mtu_exceeded_drops">
> +          Number of packets dropped due to packet length exceeding the max
> +          device MTU.
> +        </column>
> +        <column name="statistics" key="tx_qos_drops">
> +          Total number of packets dropped due to transmission rate 
> exceeding
> +          the configured egress policer rate.
> +        </column>
> +        <column name="statistics" key="rx_qos_drops">
> +          Total number of packets dropped due to reception rate exceeding
> +          the configured ingress policer rate.
> +        </column>
> +      </group>

I'm not sure it makes sense to declare all the stats here. Even the usual
"Statistics" group doesn't have all of the default stats described. All of
the above should be easy to understand just by looking at the names. Also,
these are netdev-dpdk specific and not supported by other netdev types. If
needed, these stats could be mentioned in the DPDK guides under
Documentation/.

>      </group>
>
>      <group title="Ingress Policing">
>
Ilya Maximets Sept. 4, 2019, 2:03 p.m. UTC | #6
On 04.09.2019 16:31, Sriram Vatala wrote:
> Hi Ilya,
> 1) I was working on addressing the comments provided by you. Had a small query 
> on one of your comments.
> 2) I am trying to understand the problem of padding bytes in struct 
> netdev_dpdk which you are referring to in your comment.
> 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" 
> ensures that the memory to be allocated for all 'MEMBERS' is rounded off to 
> nearest multiple of CACHE_LINE_SIZE which  is 64 in this case. This macro adds 
> pad bytes to roundoff to multiple of 64.
> 4) At runtime, I checked the size of stats section of netdev_dpdk without new 
> counters that I have introduced in my patch. It is 384 bytes, out of which 
> struct netdev_stats alone occupies 336 bytes and 8 bytes for tx_retries 
> counter. (I could see there is no padding between netdev_stats and 
> tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 1)/64) * 64].
> 5) With my new counters, the size remains same after padding.
> (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes  [((376 +64 -1)/64) 
> *64]  at runtime.
> 
> I want to check with you, if I have correctly understood the problem that you 
> are referring in your comment. If you can explain a bit more on this, it would 
> be helpful for me to understand the problem and think of possible solutions.
> 
> Following is the snippet of memory layout of netdev_dpdk at runtime :
>     union {
>         OVS_CACHE_LINE_MARKER cacheline1;
>         struct {...};
>         uint8_t pad50[64];
>     };
>     union {
>         struct {...};
>         uint8_t pad51[192];
>     };
>     union {
>         struct {...};
>         uint8_t pad52[384];
>     };
>     union {
>         struct {...};
>         uint8_t pad53[128];
>     };
>     union {
>         struct {...};
>         uint8_t pad54[64];
>     };
> }

Hi.

The code looks like this:

     PADDED_MEMBERS(CACHE_LINE_SIZE,
         struct netdev_stats stats;
         /* Custom stat for retries when unable to transmit. */
         uint64_t tx_retries;
         /* Protects stats */
         rte_spinlock_t stats_lock;
         /* 4 pad bytes here. */    <-- This comment I'm talking about.
     );

The only thing you need to change is to update the comment while adding
new structure fields.

Your calculations are missing the size of 'stats_lock', which is 4 bytes.
So, on current master total size is:
    336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352
And it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes.
The comment on current master should be "/* 32 pad bytes here. */".

BTW, Kevin, how did you calculate 4 here? My pahole output is the following:

        union {
                struct {
                        struct netdev_stats stats;       /*   320   336 */
                        uint64_t   tx_retries;           /*   656     8 */
                        rte_spinlock_t stats_lock;       /*   664     4 */
                };                                       /*         352 */
                uint8_t            pad52[0];             /*           0 */
        };                                               /*   320   384 */

Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new layout
takes:
   336 bytes for stats + 8 bytes for tx_retries \
   + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes.

Now the structure takes exactly 384 bytes and you need to remove the comment
or change it to "/* 0 pad bytes here. */".

Sorry, I didn't check the actual layout until now, so I was confused a bit by
the comment on current master. Anyway, you need to update that comment.
However, it might still be useful to move these stats to a separate structure
to avoid big padding in case we want to add one more. And I still think that
we should drop the paddings for most of the structure fields altogether, but
that should be a separate change.

Best regards, Ilya Maximets.
Kevin Traynor Sept. 4, 2019, 3:54 p.m. UTC | #7
On 04/09/2019 15:03, Ilya Maximets wrote:
> On 04.09.2019 16:31, Sriram Vatala wrote:
>> Hi Ilya,
>> 1) I was working on addressing the comments provided by you. Had a small query 
>> on one of your comments.
>> 2) I am trying to understand the problem of padding bytes in struct 
>> netdev_dpdk which you are referring to in your comment.
>> 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)" 
>> ensures that the memory to be allocated for all 'MEMBERS' is rounded off to 
>> nearest multiple of CACHE_LINE_SIZE which  is 64 in this case. This macro adds 
>> pad bytes to roundoff to multiple of 64.
>> 4) At runtime, I checked the size of stats section of netdev_dpdk without new 
>> counters that I have introduced in my patch. It is 384 bytes, out of which 
>> struct netdev_stats alone occupies 336 bytes and 8 bytes for tx_retries 
>> counter. (I could see there is no padding between netdev_stats and 
>> tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 1)/64) * 64].
>> 5) With my new counters, the size remains same after padding.
>> (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes  [((376 +64 -1)/64) 
>> *64]  at runtime.
>>
>> I want to check with you, if I have correctly understood the problem that you 
>> are referring in your comment. If you can explain a bit more on this, it would 
>> be helpful for me to understand the problem and think of possible solutions.
>>
>> Following is the snippet of memory layout of netdev_dpdk at runtime :
>>     union {
>>         OVS_CACHE_LINE_MARKER cacheline1;
>>         struct {...};
>>         uint8_t pad50[64];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad51[192];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad52[384];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad53[128];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad54[64];
>>     };
>> }
> 
> Hi.
> 
> The code looks like this:
> 
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>          struct netdev_stats stats;
>          /* Custom stat for retries when unable to transmit. */
>          uint64_t tx_retries;
>          /* Protects stats */
>          rte_spinlock_t stats_lock;
>          /* 4 pad bytes here. */    <-- This comment I'm talking about.
>      );
> 
> The only thing you need to change is to update the comment while adding
> new structure fields.
> 
> Your calculations are missing the size of 'stats_lock' which is 4 bytes.
> So, on current master total size is:
>     336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352
> And it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes.
> The comment on current master should be "/* 32 pad bytes here. */".
> 
> BTW, Kevin, how did you calculate 4 here? My pahole output is following:
> 
>         union {
>                 struct {
>                         struct netdev_stats stats;       /*   320   336 */
>                         uint64_t   tx_retries;           /*   656     8 */
>                         rte_spinlock_t stats_lock;       /*   664     4 */
>                 };                                       /*         352 */
>                 uint8_t            pad52[0];             /*           0 */
>         };                                               /*   320   384 */
> 

Hmm, looks like I must have misinterpreted it. I will send a patch for
branch-2.12 to update the comment. Not sure it's worth sending a patch
for master as this patch is changing it anyway, but I can if there is a
preference.

> Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new layout
> takes:
>    336 bytes for stats + 8 bytes for tx_retries \
>    + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes.
> 
> Now the structure takes exactly 384 bytes and you need to remove the comment
> or change it to "/* 0 pad bytes here. */".
> 
> Sorry, I didn't check the actual layout until now so I was confused a bit by the
> comment on current master. Anyway, you need to update that comment.
> However, It might be still useful to move these stats to a separate structure to
> avoid big padding in case we'll want to add one more. And I'm still thinking that
> we need to drop paddings at all for most of the structure fields, but this should
> be a separate change.
> 
> Best regards, Ilya Maximets.
>
Li,Rongqing via dev Sept. 5, 2019, 6:58 a.m. UTC | #8
Hi Ilya,
Thanks a lot for the explanation. As per your suggestion, I will move all the
counters (including 'tx_retries') into a separate structure and place a
pointer to it in the netdev_dpdk structure, so that the padding size will not
vary with the introduction of new counters in the future.
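
A toy model of that plan (all names are hypothetical; only the allocation
pattern is the point here, not actual OVS code):

    #include <stdlib.h>

    struct sw_stats {
        unsigned long long tx_retries;
        unsigned long long tx_failure_drops;
    };

    /* The device struct grows by exactly one pointer, no matter how many
     * counters are later added to sw_stats. */
    struct dev {
        struct sw_stats *sw_stats;
    };

    static int
    dev_construct(struct dev *dev)
    {
        dev->sw_stats = calloc(1, sizeof *dev->sw_stats); /* zeroed */
        return dev->sw_stats ? 0 : -1;
    }

    static void
    dev_destruct(struct dev *dev)
    {
        free(dev->sw_stats);
        dev->sw_stats = NULL;
    }

    int main(void)
    {
        struct dev dev;

        if (!dev_construct(&dev)) {
            dev.sw_stats->tx_retries++;
            dev_destruct(&dev);
        }
        return 0;
    }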

@Kevin Traynor : I will change the comment line for the number of padding 
bytes in my next patch instead of you sending a new patch just for changing 
the comment line.

Thanks & Regards,
Sriram.

Ilya Maximets Sept. 6, 2019, 1:17 p.m. UTC | #9
On 05.09.2019 9:58, Sriram Vatala wrote:
> Hi Ilya,
> Thanks a lot for the explanation. As per your suggestion, I will move all the 
> counters (including 'tx_retries')to some structure and place a pointer to it 
> in netdev_dpdk structure so that the padding size will not vary with the 
> introduction of new counters in future.
> 
> @Kevin Traynor : I will change the comment line for the number of padding 
> bytes in my next patch instead of you sending a new patch just for changing 
> the comment line.
> 
> Thanks & Regards,
> Sriram.
> 
> -----Original Message-----
> From: Ilya Maximets <i.maximets@samsung.com>
> Sent: 04 September 2019 19:34
> To: Sriram Vatala <sriram.v@altencalsoftlabs.com>; Kevin Traynor 
> <ktraynor@redhat.com>
> Cc: ovs-dev@openvswitch.org; Stokes, Ian <ian.stokes@intel.com>
> Subject: Re: [PATCH v7] Detailed packet drop statistics per dpdk and vhostuser 
> ports
> 
> On 04.09.2019 16:31, Sriram Vatala wrote:
>> Hi Ilya,
>> 1) I was working on addressing the comments provided by you. Had a
>> small query on one of your comments.
>> 2) I am trying to understand the problem of padding bytes in struct
>> netdev_dpdk which you are referring to in your comment.
>> 3) If I understand correctly, the macro "PADDED_MEMBERS(UNIT, MEMBERS)"
>> ensures that the memory to be allocated for all 'MEMBERS' is rounded
>> off to nearest multiple of CACHE_LINE_SIZE which  is 64 in this case.
>> This macro adds pad bytes to roundoff to multiple of 64.
>> 4) At runtime, I checked the size of stats section of netdev_dpdk
>> without new counters that I have introduced in my patch. It is 384
>> bytes, out of which struct netdev_stats alone occupies 336 bytes and 8
>> bytes for tx_retries counter. (I could see there is no padding between
>> netdev_stats and tx_retries). Total of 344 is rounded to 384 [((344 + 64 - 
>> 1)/64) * 64].
>> 5) With my new counters, the size remains same after padding.
>> (336 + 8 + 8 +8 +8 +8 = 376) which is rounded to 384 bytes  [((376 +64
>> -1)/64) *64]  at runtime.
>>
>> I want to check with you, if I have correctly understood the problem
>> that you are referring in your comment. If you can explain a bit more
>> on this, it would be helpful for me to understand the problem and think of 
>> possible solutions.
>>
>> Following is the snippet of memory layout of netdev_dpdk at runtime :
>>     union {
>>         OVS_CACHE_LINE_MARKER cacheline1;
>>         struct {...};
>>         uint8_t pad50[64];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad51[192];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad52[384];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad53[128];
>>     };
>>     union {
>>         struct {...};
>>         uint8_t pad54[64];
>>     };
>> }
> 
> Hi.
> 
> The code looks like this:
> 
>      PADDED_MEMBERS(CACHE_LINE_SIZE,
>          struct netdev_stats stats;
>          /* Custom stat for retries when unable to transmit. */
>          uint64_t tx_retries;
>          /* Protects stats */
>          rte_spinlock_t stats_lock;
>          /* 4 pad bytes here. */    <-- This comment I'm talking about.
>      );
> 
> The only thing you need to change is to update the comment while adding new 
> structure fields.
> 
> Your calculations are missing the size of 'stats_lock' which is 4 bytes.
> So, on current master total size is:
>     336 bytes for stats + 8 bytes for tx_retries + 4 bytes for lock = 352 And 
> it's rounded up to 384 by padding, i.e. we have 384 - 352 = 32 pad bytes.

Sorry, I miscalculated this too.)
336 + 8 + 4 = 348
pahole reports 352 because there is an additional 4 bytes of padding inside
the unnamed structure.


> The comment on current master should be "/* 32 pad bytes here. */".

So, the comment should be "/* 36 pad bytes here. */".

> 
> BTW, Kevin, how did you calculate 4 here? My pahole output is following:
> 
>         union {
>                 struct {
>                         struct netdev_stats stats;       /*   320   336 */
>                         uint64_t   tx_retries;           /*   656     8 */
>                         rte_spinlock_t stats_lock;       /*   664     4 */
>                 };                                       /*         352 */
>                 uint8_t            pad52[0];             /*           0 */
>         };                                               /*   320   384 */
> 
> Returning to the patch. After adding 4 more stats (4 * 8 bytes), the new 
> layout
> takes:
>    336 bytes for stats + 8 bytes for tx_retries \
>    + 4*8 bytes for new stats + 4 bytes for lock = 384 bytes.
> 
> Now the structure takes exactly 384 bytes and you need to remove the comment 
> or change it to "/* 0 pad bytes here. */".

And 4 pad bytes will remain after adding the new stats.
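
For completeness, both layouts can be sanity-checked at compile time; a
throwaway sketch (C11, assuming the member sizes above: netdev_stats 336 B,
rte_spinlock_t 4 B):

    #define CACHE_LINE_SIZE 64
    #define ROUND_UP(X, Y) (((X) + (Y) - 1) / (Y) * (Y))

    /* Current master: 336 + 8 + 4 = 348 bytes of members. */
    _Static_assert(ROUND_UP(336 + 8 + 4, CACHE_LINE_SIZE) == 384,
                   "master layout: 36 pad bytes");

    /* With the four new counters: 336 + 8 + 4 * 8 + 4 = 380 bytes. */
    _Static_assert(ROUND_UP(336 + 8 + 4 * 8 + 4, CACHE_LINE_SIZE) == 384,
                   "patched layout: 4 pad bytes");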

> 
> Sorry, I didn't check the actual layout until now so I was confused a bit by 
> the comment on current master. Anyway, you need to update that comment.
> However, It might be still useful to move these stats to a separate structure 
> to avoid big padding in case we'll want to add one more. And I'm still 
> thinking that we need to drop paddings at all for most of the structure 
> fields, but this should be a separate change.
> 
> Best regards, Ilya Maximets.
>
diff mbox series

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index bc20d68..a679c5b 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -413,8 +413,17 @@  struct netdev_dpdk {
 
     PADDED_MEMBERS(CACHE_LINE_SIZE,
         struct netdev_stats stats;
-        /* Custom stat for retries when unable to transmit. */
+        /* Custom device stats */
+        /* No. of retries when unable to transmit. */
         uint64_t tx_retries;
+        /* Pkts dropped when unable to transmit; probably Tx queue is full */
+        uint64_t tx_failure_drops;
+        /* Pkt len greater than max dev MTU */
+        uint64_t tx_mtu_exceeded_drops;
+        /* Pkt drops in egress policer processing */
+        uint64_t tx_qos_drops;
+        /* Pkt drops in ingress policer processing */
+        uint64_t rx_qos_drops;
         /* Protects stats */
         rte_spinlock_t stats_lock;
         /* 4 pad bytes here. */
@@ -2171,6 +2180,7 @@  netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     uint16_t nb_rx = 0;
     uint16_t dropped = 0;
+    uint16_t qos_drops = 0;
     int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
     int vid = netdev_dpdk_get_vid(dev);
 
@@ -2202,11 +2212,13 @@  netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }
 
     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_rx_counters(&dev->stats, batch->packets,
                                          nb_rx, dropped);
+    dev->rx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);
 
     batch->count = nb_rx;
@@ -2232,6 +2244,7 @@  netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     int nb_rx;
     int dropped = 0;
+    int qos_drops = 0;
 
     if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) {
         return EAGAIN;
@@ -2250,12 +2263,14 @@  netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
                                     (struct rte_mbuf **) batch->packets,
                                     nb_rx, true);
         dropped -= nb_rx;
+        qos_drops = dropped;
     }
 
     /* Update stats to reflect dropped packets */
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.rx_dropped += dropped;
+        dev->rx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }
 
@@ -2339,6 +2354,9 @@  __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     struct rte_mbuf **cur_pkts = (struct rte_mbuf **) pkts;
     unsigned int total_pkts = cnt;
     unsigned int dropped = 0;
+    unsigned int tx_failure;
+    unsigned int mtu_drops;
+    unsigned int qos_drops;
     int i, retries = 0;
     int max_retries = VHOST_ENQ_RETRY_MIN;
     int vid = netdev_dpdk_get_vid(dev);
@@ -2356,9 +2374,12 @@  __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
 
     cnt = netdev_dpdk_filter_packet_len(dev, cur_pkts, cnt);
+    mtu_drops = total_pkts - cnt;
+    qos_drops = cnt;
     /* Check has QoS has been configured for the netdev */
     cnt = netdev_dpdk_qos_run(dev, cur_pkts, cnt, true);
-    dropped = total_pkts - cnt;
+    qos_drops -= cnt;
+    dropped = qos_drops + mtu_drops;
 
     do {
         int vhost_qid = qid * VIRTIO_QNUM + VIRTIO_RXQ;
@@ -2383,12 +2404,16 @@  __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
         }
     } while (cnt && (retries++ < max_retries));
 
+    tx_failure = cnt;
     rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
 
     rte_spinlock_lock(&dev->stats_lock);
     netdev_dpdk_vhost_update_tx_counters(&dev->stats, pkts, total_pkts,
                                          cnt + dropped);
     dev->tx_retries += MIN(retries, max_retries);
+    dev->tx_failure_drops += tx_failure;
+    dev->tx_mtu_exceeded_drops += mtu_drops;
+    dev->tx_qos_drops += qos_drops;
     rte_spinlock_unlock(&dev->stats_lock);
 
 out:
@@ -2413,12 +2438,15 @@  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
     struct rte_mbuf *pkts[PKT_ARRAY_SIZE];
     uint32_t cnt = batch_cnt;
     uint32_t dropped = 0;
+    uint32_t tx_failure = 0;
+    uint32_t mtu_drops = 0;
+    uint32_t qos_drops = 0;
 
     if (dev->type != DPDK_DEV_VHOST) {
         /* Check if QoS has been configured for this netdev. */
         cnt = netdev_dpdk_qos_run(dev, (struct rte_mbuf **) batch->packets,
                                   batch_cnt, false);
-        dropped += batch_cnt - cnt;
+        qos_drops = batch_cnt - cnt;
     }
 
     uint32_t txcnt = 0;
@@ -2431,7 +2459,7 @@  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
             VLOG_WARN_RL(&rl, "Too big size %u max_packet_len %d",
                          size, dev->max_packet_len);
 
-            dropped++;
+            mtu_drops++;
             continue;
         }
 
@@ -2454,13 +2482,17 @@  dpdk_do_tx_copy(struct netdev *netdev, int qid, struct dp_packet_batch *batch)
             __netdev_dpdk_vhost_send(netdev, qid, (struct dp_packet **) pkts,
                                      txcnt);
         } else {
-            dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
+            tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, txcnt);
         }
     }
 
+    dropped += qos_drops + mtu_drops + tx_failure;
     if (OVS_UNLIKELY(dropped)) {
         rte_spinlock_lock(&dev->stats_lock);
         dev->stats.tx_dropped += dropped;
+        dev->tx_failure_drops += tx_failure;
+        dev->tx_mtu_exceeded_drops += mtu_drops;
+        dev->tx_qos_drops += qos_drops;
         rte_spinlock_unlock(&dev->stats_lock);
     }
 }
@@ -2502,18 +2534,25 @@  netdev_dpdk_send__(struct netdev_dpdk *dev, int qid,
         dp_packet_delete_batch(batch, true);
     } else {
         int tx_cnt, dropped;
+        int tx_failure, mtu_drops, qos_drops;
         int batch_cnt = dp_packet_batch_size(batch);
         struct rte_mbuf **pkts = (struct rte_mbuf **) batch->packets;
 
         tx_cnt = netdev_dpdk_filter_packet_len(dev, pkts, batch_cnt);
+        mtu_drops = batch_cnt - tx_cnt;
+        qos_drops = tx_cnt;
         tx_cnt = netdev_dpdk_qos_run(dev, pkts, tx_cnt, true);
-        dropped = batch_cnt - tx_cnt;
+        qos_drops -= tx_cnt;
 
-        dropped += netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
+        tx_failure = netdev_dpdk_eth_tx_burst(dev, qid, pkts, tx_cnt);
 
+        dropped = tx_failure + mtu_drops + qos_drops;
         if (OVS_UNLIKELY(dropped)) {
             rte_spinlock_lock(&dev->stats_lock);
             dev->stats.tx_dropped += dropped;
+            dev->tx_failure_drops += tx_failure;
+            dev->tx_mtu_exceeded_drops += mtu_drops;
+            dev->tx_qos_drops += qos_drops;
             rte_spinlock_unlock(&dev->stats_lock);
         }
     }
@@ -2772,6 +2811,17 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
     uint32_t i;
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int rte_xstats_ret;
+    uint16_t index = 0;
+
+#define DPDK_CSTATS                         \
+    DPDK_CSTAT(tx_failure_drops)            \
+    DPDK_CSTAT(tx_mtu_exceeded_drops)       \
+    DPDK_CSTAT(tx_qos_drops)                \
+    DPDK_CSTAT(rx_qos_drops)
+
+#define DPDK_CSTAT(NAME) +1
+    custom_stats->size = DPDK_CSTATS;
+#undef DPDK_CSTAT
 
     ovs_mutex_lock(&dev->mutex);
 
@@ -2786,9 +2836,10 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         if (rte_xstats_ret > 0 &&
             rte_xstats_ret <= dev->rte_xstats_ids_size) {
 
-            custom_stats->size = rte_xstats_ret;
+            index = rte_xstats_ret;
+            custom_stats->size += rte_xstats_ret;
             custom_stats->counters =
-                    (struct netdev_custom_counter *) xcalloc(rte_xstats_ret,
+                   (struct netdev_custom_counter *) xcalloc(custom_stats->size,
                             sizeof(struct netdev_custom_counter));
 
             for (i = 0; i < rte_xstats_ret; i++) {
@@ -2802,7 +2853,6 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
             VLOG_WARN("Cannot get XSTATS values for port: "DPDK_PORT_ID_FMT,
                       dev->port_id);
             custom_stats->counters = NULL;
-            custom_stats->size = 0;
             /* Let's clear statistics cache, so it will be
              * reconfigured */
             netdev_dpdk_clear_xstats(dev);
@@ -2811,6 +2861,27 @@  netdev_dpdk_get_custom_stats(const struct netdev *netdev,
         free(values);
     }
 
+    if (custom_stats->counters == NULL) {
+        custom_stats->counters =
+            (struct netdev_custom_counter *) xcalloc(custom_stats->size,
+                                  sizeof(struct netdev_custom_counter));
+    }
+
+    rte_spinlock_lock(&dev->stats_lock);
+    i = index;
+#define DPDK_CSTAT(NAME)                                     \
+    ovs_strlcpy(custom_stats->counters[i++].name, #NAME,     \
+                NETDEV_CUSTOM_STATS_NAME_SIZE);
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+
+    i = index;
+#define DPDK_CSTAT(NAME)                                     \
+    custom_stats->counters[i++].value = dev->NAME;
+    DPDK_CSTATS;
+#undef DPDK_CSTAT
+    rte_spinlock_unlock(&dev->stats_lock);
+
     ovs_mutex_unlock(&dev->mutex);
 
     return 0;
@@ -2823,8 +2894,12 @@  netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int i;
 
-#define VHOST_CSTATS \
-    VHOST_CSTAT(tx_retries)
+#define VHOST_CSTATS                        \
+    VHOST_CSTAT(tx_retries)                 \
+    VHOST_CSTAT(tx_failure_drops)           \
+    VHOST_CSTAT(tx_mtu_exceeded_drops)      \
+    VHOST_CSTAT(tx_qos_drops)               \
+    VHOST_CSTAT(rx_qos_drops)
 
 #define VHOST_CSTAT(NAME) + 1
     custom_stats->size = VHOST_CSTATS;
diff --git a/utilities/bugtool/automake.mk b/utilities/bugtool/automake.mk
index 18fa347..9657468 100644
--- a/utilities/bugtool/automake.mk
+++ b/utilities/bugtool/automake.mk
@@ -22,7 +22,8 @@  bugtool_scripts = \
 	utilities/bugtool/ovs-bugtool-ovs-bridge-datapath-type \
 	utilities/bugtool/ovs-bugtool-ovs-vswitchd-threads-affinity \
 	utilities/bugtool/ovs-bugtool-qos-configs \
-	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa
+	utilities/bugtool/ovs-bugtool-get-dpdk-nic-numa \
+	utilities/bugtool/ovs-bugtool-get-iface-stats
 
 scripts_SCRIPTS += $(bugtool_scripts)
 
diff --git a/utilities/bugtool/ovs-bugtool-get-iface-stats b/utilities/bugtool/ovs-bugtool-get-iface-stats
new file mode 100755
index 0000000..0fe175e
--- /dev/null
+++ b/utilities/bugtool/ovs-bugtool-get-iface-stats
@@ -0,0 +1,25 @@ 
+#! /bin/bash
+
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of version 2.1 of the GNU Lesser General
+# Public License as published by the Free Software Foundation.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# Copyright (C) 2019 Ericsson AB
+
+for bridge in `ovs-vsctl -- --real list-br`
+do
+    echo -e "\nBridge : ${bridge}\n"
+    for iface in `ovs-vsctl list-ifaces ${bridge}`
+    do
+        echo -e "iface : ${iface}"
+        ovs-vsctl get interface ${iface} statistics
+        echo -e "\n"
+    done
+    echo -e "iface : ${bridge}"
+    ovs-vsctl get interface ${bridge} statistics
+done
diff --git a/utilities/bugtool/plugins/network-status/openvswitch.xml b/utilities/bugtool/plugins/network-status/openvswitch.xml
index d39867c..f8c4ff0 100644
--- a/utilities/bugtool/plugins/network-status/openvswitch.xml
+++ b/utilities/bugtool/plugins/network-status/openvswitch.xml
@@ -34,6 +34,7 @@ 
     <command label="ovs-appctl-dpctl-dump-flows-netdev" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows netdev@ovs-netdev</command>
     <command label="ovs-appctl-dpctl-dump-flows-system" filters="ovs" repeat="2">ovs-appctl dpctl/dump-flows system@ovs-system</command>
     <command label="ovs-appctl-dpctl-show-s" filters="ovs" repeat="2">ovs-appctl dpctl/show -s</command>
+    <command label="ovs-vsctl-get-interface-statistics" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-get-iface-stats</command>
     <command label="ovs-ofctl-show" filters="ovs">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "show"</command>
     <command label="ovs-ofctl-dump-flows" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-flows"</command>
     <command label="ovs-ofctl-dump-ports" filters="ovs" repeat="2">/usr/share/openvswitch/scripts/ovs-bugtool-ovs-ofctl-loop-over-bridges "dump-ports"</command>
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 9a743c0..4a7bcf8 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -3486,6 +3486,30 @@  ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
           the above.
         </column>
       </group>
+      <group title="Statistics: Custom stats">
+        <column name="statistics" key="tx_retries">
+          Total number of transmit retries on a vhost-user or vhost-user-client
+          interface.
+        </column>
+        <column name="statistics" key="tx_failure_drops">
+          Total number of packets dropped because the DPDK transmit API for
+          physical/vhost ports fails to transmit the packets, most likely
+          because the transmit queue is full. Other causes are possible
+          but unlikely.
+        </column>
+        <column name="statistics" key="tx_mtu_exceeded_drops">
+          Number of packets dropped due to packet length exceeding the max
+          device MTU.
+        </column>
+        <column name="statistics" key="tx_qos_drops">
+          Total number of packets dropped due to transmission rate exceeding
+          the configured egress policer rate.
+        </column>
+        <column name="statistics" key="rx_qos_drops">
+          Total number of packets dropped due to reception rate exceeding
+          the configured ingress policer rate.
+        </column>
+      </group>
     </group>
 
     <group title="Ingress Policing">