| Message ID | 20230105202425.4187792-1-david.marchand@redhat.com |
| --- | --- |
| State | Superseded |
| Series | [ovs-dev,v6,1/2] netdev-dpdk: Add per virtqueue statistics. |

| Context | Check | Description |
| --- | --- | --- |
| ovsrobot/apply-robot | success | apply and check: success |
| ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
| ovsrobot/intel-ovs-compilation | success | test: success |
On 1/5/23 21:24, David Marchand wrote:
> The DPDK vhost-user library maintains more granular per queue stats
> which can replace what OVS was providing for vhost-user ports.
>
> The benefits for OVS:
> - OVS can skip parsing packet sizes on the rx side,
> - dev->stats_lock won't be taken in rx/tx code unless some packet is
>   dropped,
> - vhost-user is aware of which packets are transmitted to the guest,
>   so per *transmitted* packet size stats can be reported,
> - more internal stats from vhost-user may be exposed, without OVS
>   needing to understand them,
>
> Note: the vhost-user library does not provide global stats for a port.
> The proposed implementation is to have the global stats (exposed via
> netdev_get_stats()) computed by querying and aggregating all per queue
> stats.
> Since per queue stats are exposed via another netdev ops
> (netdev_get_custom_stats()), this may lead to some race and small
> discrepancies.
> This issue might already affect other netdev classes.
>
> Example:
> $ ovs-vsctl get interface vhost4 statistics |
>     sed -e 's#[{}]##g' -e 's#, #\n#g' |
>     grep -v =0$
> rx_1_to_64_packets=12
> rx_256_to_511_packets=15
> rx_65_to_127_packets=21
> rx_broadcast_packets=15
> rx_bytes=7497
> rx_multicast_packets=33
> rx_packets=48
> rx_q0_good_bytes=242
> rx_q0_good_packets=3
> rx_q0_guest_notifications=3
> rx_q0_multicast_packets=3
> rx_q0_size_65_127_packets=2
> rx_q0_undersize_packets=1
> rx_q1_broadcast_packets=15
> rx_q1_good_bytes=7255
> rx_q1_good_packets=45
> rx_q1_guest_notifications=45
> rx_q1_multicast_packets=30
> rx_q1_size_256_511_packets=15
> rx_q1_size_65_127_packets=19
> rx_q1_undersize_packets=11
> tx_1_to_64_packets=36
> tx_256_to_511_packets=45
> tx_65_to_127_packets=63
> tx_broadcast_packets=45
> tx_bytes=22491
> tx_multicast_packets=99
> tx_packets=144
> tx_q0_broadcast_packets=30
> tx_q0_good_bytes=14994
> tx_q0_good_packets=96
> tx_q0_guest_notifications=96
> tx_q0_multicast_packets=66
> tx_q0_size_256_511_packets=30
> tx_q0_size_65_127_packets=42
> tx_q0_undersize_packets=24
> tx_q1_broadcast_packets=15
> tx_q1_good_bytes=7497
> tx_q1_good_packets=48
> tx_q1_guest_notifications=48
> tx_q1_multicast_packets=33
> tx_q1_size_256_511_packets=15
> tx_q1_size_65_127_packets=21
> tx_q1_undersize_packets=12
>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> Changes since v5:
> - added missing dev->stats_lock acquire in netdev_dpdk_vhost_get_stats,
> - changed netdev_dpdk_vhost_update_[rt]x_counters to take dev->stats_lock
>   only when some packets got dropped in OVS. Since the rx side won't
>   take the lock unless some QoS configuration is in place, this change
>   will likely have the same effect as separating stats_lock into rx/tx
>   dedicated locks. Testing shows a slight (around 1%) improvement of
>   performance for some V2V setup,

Thanks, David.  With this version I see a performance improvement
over the master branch as well.  Nice!

Maxime, could you take another look at this series?  Or did you
already?  I see your 'Reviewed-by' tag, but I'm not sure if it's
just a carry over.

Best regards, Ilya Maximets.

>
> Changes since v3:
> - rebased to master now that v22.11 landed,
> - fixed error code in stats helper when vhost port is not "running",
> - shortened rx/tx stats macro names,
>
> Changes since RFC v2:
> - dropped the experimental api check (now that the feature is marked
>   stable in DPDK),
> - moved netdev_dpdk_get_carrier() forward declaration next to the
>   function needing it,
> - used per q stats for netdev_get_stats() and removed OVS per packet
>   size accounting logic,
> - fixed small packets counter (see rx_undersized_errors hack),
> - added more Tx stats,
> - added unit tests,
>
> ---
>  lib/netdev-dpdk.c    | 429 ++++++++++++++++++++++++++++++++-----------
>  tests/system-dpdk.at |  33 +++-
>  2 files changed, 349 insertions(+), 113 deletions(-)
On 1/6/23 12:02, Ilya Maximets wrote:
> On 1/5/23 21:24, David Marchand wrote:
>> The DPDK vhost-user library maintains more granular per queue stats
>> which can replace what OVS was providing for vhost-user ports.
>>
>> [... rest of the cover letter and v5 changelog trimmed ...]
>
> Thanks, David.  With this version I see a performance improvement
> over the master branch as well.  Nice!

Great! Thanks David.

> Maxime, could you take another look at this series?  Or did you
> already?  I see your 'Reviewed-by' tag, but I'm not sure if it's
> just a carry over.

I followed David's perf issue investigations, but I will have another
look today.

Thanks,
Maxime

> Best regards, Ilya Maximets.
>
>> [... v3 and RFC v2 changelogs and diffstat trimmed ...]
On 1/5/23 21:24, David Marchand wrote:
> The DPDK vhost-user library maintains more granular per queue stats
> which can replace what OVS was providing for vhost-user ports.
>
> [... cover letter, example stats and changelogs trimmed; see the full
> text quoted in Ilya's reply above ...]
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index fff57f7827..80ba650032 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2363,70 +2363,18 @@ is_vhost_running(struct netdev_dpdk *dev)
>      return (netdev_dpdk_get_vid(dev) >= 0 && dev->vhost_reconfigured);
>  }
>  
> -static inline void
> -netdev_dpdk_vhost_update_rx_size_counters(struct netdev_stats *stats,
> -                                          unsigned int packet_size)
> -{
> -    /* Hard-coded search for the size bucket. */
> -    if (packet_size < 256) {
> -        if (packet_size >= 128) {
> -            stats->rx_128_to_255_packets++;
> -        } else if (packet_size <= 64) {
> -            stats->rx_1_to_64_packets++;
> -        } else {
> -            stats->rx_65_to_127_packets++;
> -        }
> -    } else {
> -        if (packet_size >= 1523) {
> -            stats->rx_1523_to_max_packets++;
> -        } else if (packet_size >= 1024) {
> -            stats->rx_1024_to_1522_packets++;
> -        } else if (packet_size < 512) {
> -            stats->rx_256_to_511_packets++;
> -        } else {
> -            stats->rx_512_to_1023_packets++;
> -        }
> -    }
> -}
> -
>  static inline void
>  netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev,
> -                                     struct dp_packet **packets, int count,
>                                       int qos_drops)
>  {
> -    struct netdev_stats *stats = &dev->stats;
> -    struct dp_packet *packet;
> -    unsigned int packet_size;
> -    int i;
> -
> -    stats->rx_packets += count;
> -    stats->rx_dropped += qos_drops;
> -    for (i = 0; i < count; i++) {
> -        packet = packets[i];
> -        packet_size = dp_packet_size(packet);
> -
> -        if (OVS_UNLIKELY(packet_size < ETH_HEADER_LEN)) {
> -            /* This only protects the following multicast counting from
> -             * too short packets, but it does not stop the packet from
> -             * further processing. */
> -            stats->rx_errors++;
> -            stats->rx_length_errors++;
> -            continue;
> -        }
> -
> -        netdev_dpdk_vhost_update_rx_size_counters(stats, packet_size);
> -
> -        struct eth_header *eh = (struct eth_header *) dp_packet_data(packet);
> -        if (OVS_UNLIKELY(eth_addr_is_multicast(eh->eth_dst))) {
> -            stats->multicast++;
> -        }
> -
> -        stats->rx_bytes += packet_size;
> +    if (OVS_LIKELY(!qos_drops)) {
> +        return;
>      }
>  
> -    if (OVS_UNLIKELY(qos_drops)) {
> -        dev->sw_stats->rx_qos_drops += qos_drops;
> -    }
> +    rte_spinlock_lock(&dev->stats_lock);
> +    dev->stats.rx_dropped += qos_drops;
> +    dev->sw_stats->rx_qos_drops += qos_drops;
> +    rte_spinlock_unlock(&dev->stats_lock);
>  }
>  
>  /*
> @@ -2473,10 +2421,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>          qos_drops -= nb_rx;
>      }
>  
> -    rte_spinlock_lock(&dev->stats_lock);
> -    netdev_dpdk_vhost_update_rx_counters(dev, batch->packets,
> -                                         nb_rx, qos_drops);
> -    rte_spinlock_unlock(&dev->stats_lock);
> +    netdev_dpdk_vhost_update_rx_counters(dev, qos_drops);
>  
>      batch->count = nb_rx;
>      dp_packet_batch_init_packet_fields(batch);
> @@ -2589,34 +2534,27 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
>  
>  static inline void
>  netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
> -                                     struct dp_packet **packets,
> -                                     int attempted,
>                                       struct netdev_dpdk_sw_stats *sw_stats_add)
>  {
>      int dropped = sw_stats_add->tx_mtu_exceeded_drops +
>                    sw_stats_add->tx_qos_drops +
>                    sw_stats_add->tx_failure_drops +
>                    sw_stats_add->tx_invalid_hwol_drops;
> -    struct netdev_stats *stats = &dev->stats;
> -    int sent = attempted - dropped;
> -    int i;
> -
> -    stats->tx_packets += sent;
> -    stats->tx_dropped += dropped;
> +    struct netdev_dpdk_sw_stats *sw_stats;
>  
> -    for (i = 0; i < sent; i++) {
> -        stats->tx_bytes += dp_packet_size(packets[i]);
> +    if (OVS_LIKELY(!dropped)) {

You also need to check there were no tx_retries.
> +        return;
>      }
>  
> -    if (OVS_UNLIKELY(dropped || sw_stats_add->tx_retries)) {
> -        struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;
> -
> -        sw_stats->tx_retries += sw_stats_add->tx_retries;
> -        sw_stats->tx_failure_drops += sw_stats_add->tx_failure_drops;
> -        sw_stats->tx_mtu_exceeded_drops += sw_stats_add->tx_mtu_exceeded_drops;
> -        sw_stats->tx_qos_drops += sw_stats_add->tx_qos_drops;
> -        sw_stats->tx_invalid_hwol_drops += sw_stats_add->tx_invalid_hwol_drops;
> -    }
> +    rte_spinlock_lock(&dev->stats_lock);
> +    sw_stats = dev->sw_stats;
> +    dev->stats.tx_dropped += dropped;
> +    sw_stats->tx_retries += sw_stats_add->tx_retries;
> +    sw_stats->tx_failure_drops += sw_stats_add->tx_failure_drops;
> +    sw_stats->tx_mtu_exceeded_drops += sw_stats_add->tx_mtu_exceeded_drops;
> +    sw_stats->tx_qos_drops += sw_stats_add->tx_qos_drops;
> +    sw_stats->tx_invalid_hwol_drops += sw_stats_add->tx_invalid_hwol_drops;
> +    rte_spinlock_unlock(&dev->stats_lock);
>  }
>  
>  static void
> @@ -2795,13 +2733,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>      int max_retries = VHOST_ENQ_RETRY_MIN;
> -    int cnt, batch_cnt, vhost_batch_cnt;
>      int vid = netdev_dpdk_get_vid(dev);
>      struct netdev_dpdk_sw_stats stats;
> +    int cnt, vhost_batch_cnt;
>      struct rte_mbuf **pkts;
>      int retries;
>  
> -    batch_cnt = cnt = dp_packet_batch_size(batch);
> +    cnt = dp_packet_batch_size(batch);
>      qid = dev->tx_q[qid % netdev->n_txq].map;
>      if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
>                       || !(dev->flags & NETDEV_UP))) {
> @@ -2850,10 +2788,7 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>      stats.tx_failure_drops += cnt;
>      stats.tx_retries = MIN(retries, max_retries);
>  
> -    rte_spinlock_lock(&dev->stats_lock);
> -    netdev_dpdk_vhost_update_tx_counters(dev, batch->packets, batch_cnt,
> -                                         &stats);
> -    rte_spinlock_unlock(&dev->stats_lock);
> +    netdev_dpdk_vhost_update_tx_counters(dev, &stats);
>  
>      pkts = (struct rte_mbuf **) batch->packets;
>      for (int i = 0; i < vhost_batch_cnt; i++) {
> @@ -3007,41 +2942,305 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
>      return 0;
>  }
>  
> -static int
> -netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
> -
>  static int
>  netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
>                              struct netdev_stats *stats)
>  {
> +    struct rte_vhost_stat_name *vhost_stats_names = NULL;
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    struct rte_vhost_stat *vhost_stats = NULL;
> +    int vhost_stats_count;
> +    int err;
> +    int qid;
> +    int vid;
>  
>      ovs_mutex_lock(&dev->mutex);
>  
> +    if (!is_vhost_running(dev)) {
> +        err = EPROTO;
> +        goto out;
> +    }
> +
> +    vid = netdev_dpdk_get_vid(dev);
> +
> +    /* We expect all rxqs have the same number of stats, only query rxq0. */
> +    qid = 0 * VIRTIO_QNUM + VIRTIO_TXQ;
> +    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
> +    if (err < 0) {
> +        err = EPROTO;
> +        goto out;
> +    }
> +
> +    vhost_stats_count = err;
> +    vhost_stats_names = xcalloc(vhost_stats_count, sizeof *vhost_stats_names);
> +    vhost_stats = xcalloc(vhost_stats_count, sizeof *vhost_stats);
> +
> +    err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
> +                                          vhost_stats_count);
> +    if (err != vhost_stats_count) {
> +        err = EPROTO;
> +        goto out;
> +    }
> +
> +#define VHOST_RXQ_STATS \
> +    VHOST_RXQ_STAT(rx_packets, "good_packets") \
> +    VHOST_RXQ_STAT(rx_bytes, "good_bytes") \
> +    VHOST_RXQ_STAT(rx_broadcast_packets, "broadcast_packets") \
> +    VHOST_RXQ_STAT(multicast, "multicast_packets") \
> +    VHOST_RXQ_STAT(rx_undersized_errors, "undersize_packets") \
> +    VHOST_RXQ_STAT(rx_1_to_64_packets, "size_64_packets") \
> +    VHOST_RXQ_STAT(rx_65_to_127_packets, "size_65_127_packets") \
> +    VHOST_RXQ_STAT(rx_128_to_255_packets, "size_128_255_packets") \
> +    VHOST_RXQ_STAT(rx_256_to_511_packets, "size_256_511_packets") \
> +    VHOST_RXQ_STAT(rx_512_to_1023_packets, "size_512_1023_packets") \
> +    VHOST_RXQ_STAT(rx_1024_to_1522_packets, "size_1024_1518_packets") \
> +    VHOST_RXQ_STAT(rx_1523_to_max_packets, "size_1519_max_packets")
> +
> +#define VHOST_RXQ_STAT(MEMBER, NAME) dev->stats.MEMBER = 0;
> +    VHOST_RXQ_STATS;
> +#undef VHOST_RXQ_STAT
> +
> +    for (int q = 0; q < dev->up.n_rxq; q++) {
> +        qid = q * VIRTIO_QNUM + VIRTIO_TXQ;
> +
> +        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
> +                                        vhost_stats_count);
> +        if (err != vhost_stats_count) {
> +            err = EPROTO;
> +            goto out;
> +        }
> +
> +        for (int i = 0; i < vhost_stats_count; i++) {
> +#define VHOST_RXQ_STAT(MEMBER, NAME) \
> +            if (string_ends_with(vhost_stats_names[i].name, NAME)) { \
> +                dev->stats.MEMBER += vhost_stats[i].value; \
> +                continue; \
> +            }
> +            VHOST_RXQ_STATS;
> +#undef VHOST_RXQ_STAT
> +        }
> +    }
> +
> +    /* OVS reports 64 bytes and smaller packets into "rx_1_to_64_packets".
> +     * Since vhost only reports good packets and has no error counter,
> +     * rx_undersized_errors is highjacked (see above) to retrieve
> +     * "undersize_packets". */
> +    dev->stats.rx_1_to_64_packets += dev->stats.rx_undersized_errors;
> +    memset(&dev->stats.rx_undersized_errors, 0xff,
> +           sizeof dev->stats.rx_undersized_errors);
> +
> +#define VHOST_RXQ_STAT(MEMBER, NAME) stats->MEMBER = dev->stats.MEMBER;
> +    VHOST_RXQ_STATS;
> +#undef VHOST_RXQ_STAT
> +
> +    free(vhost_stats_names);
> +    vhost_stats_names = NULL;
> +    free(vhost_stats);
> +    vhost_stats = NULL;
> +
> +    /* We expect all txqs have the same number of stats, only query txq0. */
> +    qid = 0 * VIRTIO_QNUM;
> +    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
> +    if (err < 0) {
> +        err = EPROTO;
> +        goto out;
> +    }
> +
> +    vhost_stats_count = err;
> +    vhost_stats_names = xcalloc(vhost_stats_count, sizeof *vhost_stats_names);
> +    vhost_stats = xcalloc(vhost_stats_count, sizeof *vhost_stats);
> +
> +    err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
> +                                          vhost_stats_count);
> +    if (err != vhost_stats_count) {
> +        err = EPROTO;
> +        goto out;
> +    }
> +
> +#define VHOST_TXQ_STATS \
> +    VHOST_TXQ_STAT(tx_packets, "good_packets") \
> +    VHOST_TXQ_STAT(tx_bytes, "good_bytes") \
> +    VHOST_TXQ_STAT(tx_broadcast_packets, "broadcast_packets") \
> +    VHOST_TXQ_STAT(tx_multicast_packets, "multicast_packets") \
> +    VHOST_TXQ_STAT(rx_undersized_errors, "undersize_packets") \
> +    VHOST_TXQ_STAT(tx_1_to_64_packets, "size_64_packets") \
> +    VHOST_TXQ_STAT(tx_65_to_127_packets, "size_65_127_packets") \
> +    VHOST_TXQ_STAT(tx_128_to_255_packets, "size_128_255_packets") \
> +    VHOST_TXQ_STAT(tx_256_to_511_packets, "size_256_511_packets") \
> +    VHOST_TXQ_STAT(tx_512_to_1023_packets, "size_512_1023_packets") \
> +    VHOST_TXQ_STAT(tx_1024_to_1522_packets, "size_1024_1518_packets") \
> +    VHOST_TXQ_STAT(tx_1523_to_max_packets, "size_1519_max_packets")
> +
> +#define VHOST_TXQ_STAT(MEMBER, NAME) dev->stats.MEMBER = 0;
> +    VHOST_TXQ_STATS;
> +#undef VHOST_TXQ_STAT
> +
> +    for (int q = 0; q < dev->up.n_txq; q++) {
> +        qid = q * VIRTIO_QNUM;
> +
> +        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
> +                                        vhost_stats_count);
> +        if (err != vhost_stats_count) {
> +            err = EPROTO;
> +            goto out;
> +        }
> +
> +        for (int i = 0; i < vhost_stats_count; i++) {
> +#define VHOST_TXQ_STAT(MEMBER, NAME) \
> +            if (string_ends_with(vhost_stats_names[i].name, NAME)) { \
> +                dev->stats.MEMBER += vhost_stats[i].value; \
> +                continue; \
> +            }
> +            VHOST_TXQ_STATS;
> +#undef VHOST_TXQ_STAT
> +        }
> +    }
> +
> +    /* OVS reports 64 bytes and smaller packets into "tx_1_to_64_packets".
> +     * Same as for rx, rx_undersized_errors is highjacked. */
> +    dev->stats.tx_1_to_64_packets += dev->stats.rx_undersized_errors;
> +    memset(&dev->stats.rx_undersized_errors, 0xff,
> +           sizeof dev->stats.rx_undersized_errors);
> +
> +#define VHOST_TXQ_STAT(MEMBER, NAME) stats->MEMBER = dev->stats.MEMBER;
> +    VHOST_TXQ_STATS;
> +#undef VHOST_TXQ_STAT
> +
>      rte_spinlock_lock(&dev->stats_lock);
> -    /* Supported Stats */
> -    stats->rx_packets = dev->stats.rx_packets;
> -    stats->tx_packets = dev->stats.tx_packets;
>      stats->rx_dropped = dev->stats.rx_dropped;
>      stats->tx_dropped = dev->stats.tx_dropped;
> -    stats->multicast = dev->stats.multicast;
> -    stats->rx_bytes = dev->stats.rx_bytes;
> -    stats->tx_bytes = dev->stats.tx_bytes;
> -    stats->rx_errors = dev->stats.rx_errors;
> -    stats->rx_length_errors = dev->stats.rx_length_errors;
> -
> -    stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets;
> -    stats->rx_65_to_127_packets = dev->stats.rx_65_to_127_packets;
> -    stats->rx_128_to_255_packets = dev->stats.rx_128_to_255_packets;
> -    stats->rx_256_to_511_packets = dev->stats.rx_256_to_511_packets;
> -    stats->rx_512_to_1023_packets = dev->stats.rx_512_to_1023_packets;
> -    stats->rx_1024_to_1522_packets = dev->stats.rx_1024_to_1522_packets;
> -    stats->rx_1523_to_max_packets = dev->stats.rx_1523_to_max_packets;
> -
>      rte_spinlock_unlock(&dev->stats_lock);
>  
> +    err = 0;
> +out:
> +
>      ovs_mutex_unlock(&dev->mutex);
>  
> +    free(vhost_stats);
> +    free(vhost_stats_names);
> +
> +    return err;
> +}
> +
> +static int
> +netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
> +                                   struct netdev_custom_stats *custom_stats)
> +{
> +    struct rte_vhost_stat_name *vhost_stats_names = NULL;
> +    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> +    struct rte_vhost_stat *vhost_stats = NULL;
> +    int vhost_rxq_stats_count;
> +    int vhost_txq_stats_count;
> +    int stat_offset;
> +    int err;
> +    int qid;
> +    int vid;
> +
> +    netdev_dpdk_get_sw_custom_stats(netdev, custom_stats);
> +    stat_offset = custom_stats->size;
> +
> +    ovs_mutex_lock(&dev->mutex);
> +
> +    if (!is_vhost_running(dev)) {
> +        goto out;
> +    }
> +
> +    vid = netdev_dpdk_get_vid(dev);
> +
> +    qid = 0 * VIRTIO_QNUM + VIRTIO_TXQ;
> +    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
> +    if (err < 0) {
> +        goto out;
> +    }
> +    vhost_rxq_stats_count = err;
> +
> +    qid = 0 * VIRTIO_QNUM;
> +    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
> +    if (err < 0) {
> +        goto out;
> +    }
> +    vhost_txq_stats_count = err;
> +
> +    stat_offset += dev->up.n_rxq * vhost_rxq_stats_count;
> +    stat_offset += dev->up.n_txq * vhost_txq_stats_count;
> +    custom_stats->counters = xrealloc(custom_stats->counters,
> +                                      stat_offset *
> +                                      sizeof *custom_stats->counters);
> +    stat_offset = custom_stats->size;
> +
> +    vhost_stats_names = xcalloc(vhost_rxq_stats_count,
> +                                sizeof *vhost_stats_names);
> +    vhost_stats = xcalloc(vhost_rxq_stats_count, sizeof *vhost_stats);
> +
> +    for (int q = 0; q < dev->up.n_rxq; q++) {
> +        qid = q * VIRTIO_QNUM + VIRTIO_TXQ;
> +
> +        err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
> +                                              vhost_rxq_stats_count);
> +        if (err != vhost_rxq_stats_count) {
> +            goto out;
> +        }
> +
> +        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
> +                                        vhost_rxq_stats_count);
> +        if (err != vhost_rxq_stats_count) {
> +            goto out;
> +        }
> +
> +        for (int i = 0; i < vhost_rxq_stats_count; i++) {
> +            ovs_strlcpy(custom_stats->counters[stat_offset + i].name,
> +                        vhost_stats_names[i].name,
> +                        NETDEV_CUSTOM_STATS_NAME_SIZE);
> +            custom_stats->counters[stat_offset + i].value =
> +                vhost_stats[i].value;
> +        }
> +        stat_offset += vhost_rxq_stats_count;
> +    }
> +
> +    free(vhost_stats_names);
> +    vhost_stats_names = NULL;
> +    free(vhost_stats);
> +    vhost_stats = NULL;
> +
> +    vhost_stats_names = xcalloc(vhost_txq_stats_count,
> +                                sizeof *vhost_stats_names);
> +    vhost_stats = xcalloc(vhost_txq_stats_count, sizeof *vhost_stats);
> +
> +    for (int q = 0; q < dev->up.n_txq; q++) {
> +        qid = q * VIRTIO_QNUM;
> +
> +        err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
> +                                              vhost_txq_stats_count);
> +        if (err != vhost_txq_stats_count) {
> +            goto out;
> +        }
> +
> +        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
> +                                        vhost_txq_stats_count);
> +        if (err != vhost_txq_stats_count) {
> +            goto out;
> +        }
> +
> +        for (int i = 0; i < vhost_txq_stats_count; i++) {
> +            ovs_strlcpy(custom_stats->counters[stat_offset + i].name,
> +                        vhost_stats_names[i].name,
> +                        NETDEV_CUSTOM_STATS_NAME_SIZE);
> +            custom_stats->counters[stat_offset + i].value =
> +                vhost_stats[i].value;
> +        }
> +        stat_offset += vhost_txq_stats_count;
> +    }
> +
> +    free(vhost_stats_names);
> +    vhost_stats_names = NULL;
> +    free(vhost_stats);
> +    vhost_stats = NULL;
> +
> +out:
> +    ovs_mutex_unlock(&dev->mutex);
> +
> +    custom_stats->size = stat_offset;
> +
>      return 0;
>  }
>  
> @@ -3088,6 +3287,9 @@ netdev_dpdk_convert_xstats(struct netdev_stats *stats,
>  #undef DPDK_XSTATS
>  }
>  
> +static int
> +netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
> +
>  static int
>  netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats *stats)
>  {
> @@ -3536,6 +3738,7 @@ netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
>          if (NETDEV_UP & on) {
>              rte_spinlock_lock(&dev->stats_lock);
>              memset(&dev->stats, 0, sizeof dev->stats);
> +            memset(dev->sw_stats, 0, sizeof *dev->sw_stats);
>              rte_spinlock_unlock(&dev->stats_lock);
>          }
>      }
> @@ -5036,6 +5239,11 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
>          dev->tx_q[0].map = 0;
>      }
>  
> +    rte_spinlock_lock(&dev->stats_lock);
> +    memset(&dev->stats, 0, sizeof dev->stats);
> +    memset(dev->sw_stats, 0, sizeof *dev->sw_stats);
> +    rte_spinlock_unlock(&dev->stats_lock);
> +
>      if (userspace_tso_enabled()) {
>          dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
>          VLOG_DBG("%s: TSO enabled on vhost port", netdev_get_name(&dev->up));
> @@ -5096,6 +5304,9 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
>      /* Register client-mode device. */
>      vhost_flags |= RTE_VHOST_USER_CLIENT;
>  
> +    /* Extended per vq statistics. */
> +    vhost_flags |= RTE_VHOST_USER_NET_STATS_ENABLE;
> +
>      /* There is no support for multi-segments buffers. */
>      vhost_flags |= RTE_VHOST_USER_LINEARBUF_SUPPORT;
>  
> @@ -5574,7 +5785,7 @@ static const struct netdev_class dpdk_vhost_class = {
>      .send = netdev_dpdk_vhost_send,
>      .get_carrier = netdev_dpdk_vhost_get_carrier,
>      .get_stats = netdev_dpdk_vhost_get_stats,
> -    .get_custom_stats = netdev_dpdk_get_sw_custom_stats,
> +    .get_custom_stats = netdev_dpdk_vhost_get_custom_stats,
>      .get_status = netdev_dpdk_vhost_user_get_status,
>      .reconfigure = netdev_dpdk_vhost_reconfigure,
>      .rxq_recv = netdev_dpdk_vhost_rxq_recv,
> @@ -5590,7 +5801,7 @@ static const struct netdev_class dpdk_vhost_client_class = {
>      .send = netdev_dpdk_vhost_send,
>      .get_carrier = netdev_dpdk_vhost_get_carrier,
>      .get_stats = netdev_dpdk_vhost_get_stats,
> -    .get_custom_stats = netdev_dpdk_get_sw_custom_stats,
> +    .get_custom_stats = netdev_dpdk_vhost_get_custom_stats,
>      .get_status = netdev_dpdk_vhost_user_get_status,
>      .reconfigure = netdev_dpdk_vhost_client_reconfigure,
>      .rxq_recv = netdev_dpdk_vhost_rxq_recv,
> diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
> index 8dc187a61d..5ef7f8ccdc 100644
> --- a/tests/system-dpdk.at
> +++ b/tests/system-dpdk.at
> @@ -200,9 +200,10 @@ ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
>  dnl Execute testpmd in background
>  on_exit "pkill -f -x -9 'tail -f /dev/null'"
>  tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
> -    --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
> -    --vdev="net_tap0,iface=tap0" --file-prefix page0 \
> -    --single-file-segments -- -a >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
> +    --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,queues=2,server=1" \
> +    --vdev="net_tap0,iface=tap0" --file-prefix page0 \
> +    --single-file-segments -- -a --nb-cores 2 --rxq 2 --txq 2 \
> +    >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
>  
>  OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
>  OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP])
> @@ -220,9 +221,33 @@ AT_CHECK([ip netns exec ns1 ip addr add 172.31.110.11/24 dev tap0], [],
>  
>  AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
>  AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr])
> -AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout],
> +AT_CHECK([ip netns exec ns1 ping -i 0.1 -c 10 -I tap0 172.31.110.12], [], [stdout],
>           [stderr])
>  
> +AT_CHECK([ip netns exec ns1 ip link set tap0 down], [], [stdout], [stderr])
> +
> +# Wait for stats to be queried ("stats-update-interval")
> +sleep 5
> +AT_CHECK([ovs-vsctl get interface dpdkvhostuserclient0 statistics], [], [stdout], [stderr])
> +
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_packets` -gt 0 -a dnl
> +          `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_bytes` -gt 0])
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_packets` -eq dnl
> +          $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q0_good_packets` + dnl
> +             `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q1_good_packets`))])
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_bytes` -eq dnl
> +          $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q0_good_bytes` + dnl
> +             `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q1_good_bytes`))])
> +
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_packets` -gt 0 -a dnl
> +          `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes` -gt 0])
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_packets` -eq dnl
> +          $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_packets` + dnl
> +             `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_packets`))])
> +AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes` -eq dnl
> +          $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_bytes` + dnl
> +             `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_bytes`))])
> +
>  dnl Clean up the testpmd now
>  pkill -f -x -9 'tail -f /dev/null'
On Fri, Jan 6, 2023 at 2:40 PM Maxime Coquelin <maxime.coquelin@redhat.com> wrote:
> On 1/5/23 21:24, David Marchand wrote:
> > @@ -2589,34 +2534,27 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
> >
> >   static inline void
> >   netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
> > -                                     struct dp_packet **packets,
> > -                                     int attempted,
> >                                       struct netdev_dpdk_sw_stats *sw_stats_add)
> >   {
> >       int dropped = sw_stats_add->tx_mtu_exceeded_drops +
> >                     sw_stats_add->tx_qos_drops +
> >                     sw_stats_add->tx_failure_drops +
> >                     sw_stats_add->tx_invalid_hwol_drops;
> > -    struct netdev_stats *stats = &dev->stats;
> > -    int sent = attempted - dropped;
> > -    int i;
> > -
> > -    stats->tx_packets += sent;
> > -    stats->tx_dropped += dropped;
> > +    struct netdev_dpdk_sw_stats *sw_stats;
> >
> > -    for (i = 0; i < sent; i++) {
> > -        stats->tx_bytes += dp_packet_size(packets[i]);
> > +    if (OVS_LIKELY(!dropped)) {
>
> You also need to check there were no tx_retries.

Indeed, I am testing the fix.

> > +        return;
> >      }
> >
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fff57f7827..80ba650032 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2363,70 +2363,18 @@ is_vhost_running(struct netdev_dpdk *dev)
     return (netdev_dpdk_get_vid(dev) >= 0 && dev->vhost_reconfigured);
 }
 
-static inline void
-netdev_dpdk_vhost_update_rx_size_counters(struct netdev_stats *stats,
-                                          unsigned int packet_size)
-{
-    /* Hard-coded search for the size bucket. */
-    if (packet_size < 256) {
-        if (packet_size >= 128) {
-            stats->rx_128_to_255_packets++;
-        } else if (packet_size <= 64) {
-            stats->rx_1_to_64_packets++;
-        } else {
-            stats->rx_65_to_127_packets++;
-        }
-    } else {
-        if (packet_size >= 1523) {
-            stats->rx_1523_to_max_packets++;
-        } else if (packet_size >= 1024) {
-            stats->rx_1024_to_1522_packets++;
-        } else if (packet_size < 512) {
-            stats->rx_256_to_511_packets++;
-        } else {
-            stats->rx_512_to_1023_packets++;
-        }
-    }
-}
-
 static inline void
 netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev,
-                                     struct dp_packet **packets, int count,
                                      int qos_drops)
 {
-    struct netdev_stats *stats = &dev->stats;
-    struct dp_packet *packet;
-    unsigned int packet_size;
-    int i;
-
-    stats->rx_packets += count;
-    stats->rx_dropped += qos_drops;
-    for (i = 0; i < count; i++) {
-        packet = packets[i];
-        packet_size = dp_packet_size(packet);
-
-        if (OVS_UNLIKELY(packet_size < ETH_HEADER_LEN)) {
-            /* This only protects the following multicast counting from
-             * too short packets, but it does not stop the packet from
-             * further processing. */
-            stats->rx_errors++;
-            stats->rx_length_errors++;
-            continue;
-        }
-
-        netdev_dpdk_vhost_update_rx_size_counters(stats, packet_size);
-
-        struct eth_header *eh = (struct eth_header *) dp_packet_data(packet);
-        if (OVS_UNLIKELY(eth_addr_is_multicast(eh->eth_dst))) {
-            stats->multicast++;
-        }
-
-        stats->rx_bytes += packet_size;
+    if (OVS_LIKELY(!qos_drops)) {
+        return;
     }
 
-    if (OVS_UNLIKELY(qos_drops)) {
-        dev->sw_stats->rx_qos_drops += qos_drops;
-    }
+    rte_spinlock_lock(&dev->stats_lock);
+    dev->stats.rx_dropped += qos_drops;
+    dev->sw_stats->rx_qos_drops += qos_drops;
+    rte_spinlock_unlock(&dev->stats_lock);
 }
 
 /*
@@ -2473,10 +2421,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
         qos_drops -= nb_rx;
     }
 
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_rx_counters(dev, batch->packets,
-                                         nb_rx, qos_drops);
-    rte_spinlock_unlock(&dev->stats_lock);
+    netdev_dpdk_vhost_update_rx_counters(dev, qos_drops);
 
     batch->count = nb_rx;
     dp_packet_batch_init_packet_fields(batch);
@@ -2589,34 +2534,27 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, struct rte_mbuf **pkts,
 
 static inline void
 netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
-                                     struct dp_packet **packets,
-                                     int attempted,
                                      struct netdev_dpdk_sw_stats *sw_stats_add)
 {
     int dropped = sw_stats_add->tx_mtu_exceeded_drops +
                   sw_stats_add->tx_qos_drops +
                   sw_stats_add->tx_failure_drops +
                   sw_stats_add->tx_invalid_hwol_drops;
-    struct netdev_stats *stats = &dev->stats;
-    int sent = attempted - dropped;
-    int i;
-
-    stats->tx_packets += sent;
-    stats->tx_dropped += dropped;
+    struct netdev_dpdk_sw_stats *sw_stats;
 
-    for (i = 0; i < sent; i++) {
-        stats->tx_bytes += dp_packet_size(packets[i]);
+    if (OVS_LIKELY(!dropped)) {
+        return;
     }
 
-    if (OVS_UNLIKELY(dropped || sw_stats_add->tx_retries)) {
-        struct netdev_dpdk_sw_stats *sw_stats = dev->sw_stats;
-
-        sw_stats->tx_retries += sw_stats_add->tx_retries;
-        sw_stats->tx_failure_drops += sw_stats_add->tx_failure_drops;
-        sw_stats->tx_mtu_exceeded_drops += sw_stats_add->tx_mtu_exceeded_drops;
-        sw_stats->tx_qos_drops += sw_stats_add->tx_qos_drops;
-        sw_stats->tx_invalid_hwol_drops += sw_stats_add->tx_invalid_hwol_drops;
-    }
+    rte_spinlock_lock(&dev->stats_lock);
+    sw_stats = dev->sw_stats;
+    dev->stats.tx_dropped += dropped;
+    sw_stats->tx_retries += sw_stats_add->tx_retries;
+    sw_stats->tx_failure_drops += sw_stats_add->tx_failure_drops;
+    sw_stats->tx_mtu_exceeded_drops += sw_stats_add->tx_mtu_exceeded_drops;
+    sw_stats->tx_qos_drops += sw_stats_add->tx_qos_drops;
+    sw_stats->tx_invalid_hwol_drops += sw_stats_add->tx_invalid_hwol_drops;
+    rte_spinlock_unlock(&dev->stats_lock);
 }
 
 static void
@@ -2795,13 +2733,13 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
     int max_retries = VHOST_ENQ_RETRY_MIN;
-    int cnt, batch_cnt, vhost_batch_cnt;
     int vid = netdev_dpdk_get_vid(dev);
     struct netdev_dpdk_sw_stats stats;
+    int cnt, vhost_batch_cnt;
     struct rte_mbuf **pkts;
     int retries;
 
-    batch_cnt = cnt = dp_packet_batch_size(batch);
+    cnt = dp_packet_batch_size(batch);
     qid = dev->tx_q[qid % netdev->n_txq].map;
     if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured || qid < 0
                      || !(dev->flags & NETDEV_UP))) {
@@ -2850,10 +2788,7 @@ netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
     stats.tx_failure_drops += cnt;
     stats.tx_retries = MIN(retries, max_retries);
 
-    rte_spinlock_lock(&dev->stats_lock);
-    netdev_dpdk_vhost_update_tx_counters(dev, batch->packets, batch_cnt,
-                                         &stats);
-    rte_spinlock_unlock(&dev->stats_lock);
+    netdev_dpdk_vhost_update_tx_counters(dev, &stats);
 
     pkts = (struct rte_mbuf **) batch->packets;
     for (int i = 0; i < vhost_batch_cnt; i++) {
@@ -3007,41 +2942,305 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
     return 0;
 }
 
-static int
-netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
-
 static int
 netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
                             struct netdev_stats *stats)
 {
+    struct rte_vhost_stat_name *vhost_stats_names = NULL;
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    struct rte_vhost_stat *vhost_stats = NULL;
+    int vhost_stats_count;
+    int err;
+    int qid;
+    int vid;
 
     ovs_mutex_lock(&dev->mutex);
 
+    if (!is_vhost_running(dev)) {
+        err = EPROTO;
+        goto out;
+    }
+
+    vid = netdev_dpdk_get_vid(dev);
+
+    /* We expect all rxqs have the same number of stats, only query rxq0. */
+    qid = 0 * VIRTIO_QNUM + VIRTIO_TXQ;
+    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
+    if (err < 0) {
+        err = EPROTO;
+        goto out;
+    }
+
+    vhost_stats_count = err;
+    vhost_stats_names = xcalloc(vhost_stats_count, sizeof *vhost_stats_names);
+    vhost_stats = xcalloc(vhost_stats_count, sizeof *vhost_stats);
+
+    err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
+                                          vhost_stats_count);
+    if (err != vhost_stats_count) {
+        err = EPROTO;
+        goto out;
+    }
+
+#define VHOST_RXQ_STATS \
+    VHOST_RXQ_STAT(rx_packets, "good_packets") \
+    VHOST_RXQ_STAT(rx_bytes, "good_bytes") \
+    VHOST_RXQ_STAT(rx_broadcast_packets, "broadcast_packets") \
+    VHOST_RXQ_STAT(multicast, "multicast_packets") \
+    VHOST_RXQ_STAT(rx_undersized_errors, "undersize_packets") \
+    VHOST_RXQ_STAT(rx_1_to_64_packets, "size_64_packets") \
+    VHOST_RXQ_STAT(rx_65_to_127_packets, "size_65_127_packets") \
+    VHOST_RXQ_STAT(rx_128_to_255_packets, "size_128_255_packets") \
+    VHOST_RXQ_STAT(rx_256_to_511_packets, "size_256_511_packets") \
+    VHOST_RXQ_STAT(rx_512_to_1023_packets, "size_512_1023_packets") \
+    VHOST_RXQ_STAT(rx_1024_to_1522_packets, "size_1024_1518_packets") \
+    VHOST_RXQ_STAT(rx_1523_to_max_packets, "size_1519_max_packets")
+
+#define VHOST_RXQ_STAT(MEMBER, NAME) dev->stats.MEMBER = 0;
+    VHOST_RXQ_STATS;
+#undef VHOST_RXQ_STAT
+
+    for (int q = 0; q < dev->up.n_rxq; q++) {
+        qid = q * VIRTIO_QNUM + VIRTIO_TXQ;
+
+        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
+                                        vhost_stats_count);
+        if (err != vhost_stats_count) {
+            err = EPROTO;
+            goto out;
+        }
+
+        for (int i = 0; i < vhost_stats_count; i++) {
+#define VHOST_RXQ_STAT(MEMBER, NAME) \
+            if (string_ends_with(vhost_stats_names[i].name, NAME)) { \
+                dev->stats.MEMBER += vhost_stats[i].value; \
+                continue; \
+            }
+            VHOST_RXQ_STATS;
+#undef VHOST_RXQ_STAT
+        }
+    }
+
+    /* OVS reports 64 bytes and smaller packets into "rx_1_to_64_packets".
+     * Since vhost only reports good packets and has no error counter,
+     * rx_undersized_errors is highjacked (see above) to retrieve
+     * "undersize_packets". */
+    dev->stats.rx_1_to_64_packets += dev->stats.rx_undersized_errors;
+    memset(&dev->stats.rx_undersized_errors, 0xff,
+           sizeof dev->stats.rx_undersized_errors);
+
+#define VHOST_RXQ_STAT(MEMBER, NAME) stats->MEMBER = dev->stats.MEMBER;
+    VHOST_RXQ_STATS;
+#undef VHOST_RXQ_STAT
+
+    free(vhost_stats_names);
+    vhost_stats_names = NULL;
+    free(vhost_stats);
+    vhost_stats = NULL;
+
+    /* We expect all txqs have the same number of stats, only query txq0. */
+    qid = 0 * VIRTIO_QNUM;
+    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
+    if (err < 0) {
+        err = EPROTO;
+        goto out;
+    }
+
+    vhost_stats_count = err;
+    vhost_stats_names = xcalloc(vhost_stats_count, sizeof *vhost_stats_names);
+    vhost_stats = xcalloc(vhost_stats_count, sizeof *vhost_stats);
+
+    err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
+                                          vhost_stats_count);
+    if (err != vhost_stats_count) {
+        err = EPROTO;
+        goto out;
+    }
+
+#define VHOST_TXQ_STATS \
+    VHOST_TXQ_STAT(tx_packets, "good_packets") \
+    VHOST_TXQ_STAT(tx_bytes, "good_bytes") \
+    VHOST_TXQ_STAT(tx_broadcast_packets, "broadcast_packets") \
+    VHOST_TXQ_STAT(tx_multicast_packets, "multicast_packets") \
+    VHOST_TXQ_STAT(rx_undersized_errors, "undersize_packets") \
+    VHOST_TXQ_STAT(tx_1_to_64_packets, "size_64_packets") \
+    VHOST_TXQ_STAT(tx_65_to_127_packets, "size_65_127_packets") \
+    VHOST_TXQ_STAT(tx_128_to_255_packets, "size_128_255_packets") \
+    VHOST_TXQ_STAT(tx_256_to_511_packets, "size_256_511_packets") \
+    VHOST_TXQ_STAT(tx_512_to_1023_packets, "size_512_1023_packets") \
+    VHOST_TXQ_STAT(tx_1024_to_1522_packets, "size_1024_1518_packets") \
+    VHOST_TXQ_STAT(tx_1523_to_max_packets, "size_1519_max_packets")
+
+#define VHOST_TXQ_STAT(MEMBER, NAME) dev->stats.MEMBER = 0;
+    VHOST_TXQ_STATS;
+#undef VHOST_TXQ_STAT
+
+    for (int q = 0; q < dev->up.n_txq; q++) {
+        qid = q * VIRTIO_QNUM;
+
+        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
+                                        vhost_stats_count);
+        if (err != vhost_stats_count) {
+            err = EPROTO;
+            goto out;
+        }
+
+        for (int i = 0; i < vhost_stats_count; i++) {
+#define VHOST_TXQ_STAT(MEMBER, NAME) \
+            if (string_ends_with(vhost_stats_names[i].name, NAME)) { \
+                dev->stats.MEMBER += vhost_stats[i].value; \
+                continue; \
+            }
+            VHOST_TXQ_STATS;
+#undef VHOST_TXQ_STAT
+        }
+    }
+
+    /* OVS reports 64 bytes and smaller packets into "tx_1_to_64_packets".
+     * Same as for rx, rx_undersized_errors is highjacked. */
+    dev->stats.tx_1_to_64_packets += dev->stats.rx_undersized_errors;
+    memset(&dev->stats.rx_undersized_errors, 0xff,
+           sizeof dev->stats.rx_undersized_errors);
+
+#define VHOST_TXQ_STAT(MEMBER, NAME) stats->MEMBER = dev->stats.MEMBER;
+    VHOST_TXQ_STATS;
+#undef VHOST_TXQ_STAT
+
     rte_spinlock_lock(&dev->stats_lock);
-    /* Supported Stats */
-    stats->rx_packets = dev->stats.rx_packets;
-    stats->tx_packets = dev->stats.tx_packets;
     stats->rx_dropped = dev->stats.rx_dropped;
     stats->tx_dropped = dev->stats.tx_dropped;
-    stats->multicast = dev->stats.multicast;
-    stats->rx_bytes = dev->stats.rx_bytes;
-    stats->tx_bytes = dev->stats.tx_bytes;
-    stats->rx_errors = dev->stats.rx_errors;
-    stats->rx_length_errors = dev->stats.rx_length_errors;
-
-    stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets;
-    stats->rx_65_to_127_packets = dev->stats.rx_65_to_127_packets;
-    stats->rx_128_to_255_packets = dev->stats.rx_128_to_255_packets;
-    stats->rx_256_to_511_packets = dev->stats.rx_256_to_511_packets;
-    stats->rx_512_to_1023_packets = dev->stats.rx_512_to_1023_packets;
-    stats->rx_1024_to_1522_packets = dev->stats.rx_1024_to_1522_packets;
-    stats->rx_1523_to_max_packets = dev->stats.rx_1523_to_max_packets;
-
     rte_spinlock_unlock(&dev->stats_lock);
 
+    err = 0;
+out:
     ovs_mutex_unlock(&dev->mutex);
 
+    free(vhost_stats);
+    free(vhost_stats_names);
+
+    return err;
+}
+
+static int
+netdev_dpdk_vhost_get_custom_stats(const struct netdev *netdev,
+                                   struct netdev_custom_stats *custom_stats)
+{
+    struct rte_vhost_stat_name *vhost_stats_names = NULL;
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    struct rte_vhost_stat *vhost_stats = NULL;
+    int vhost_rxq_stats_count;
+    int vhost_txq_stats_count;
+    int stat_offset;
+    int err;
+    int qid;
+    int vid;
+
+    netdev_dpdk_get_sw_custom_stats(netdev, custom_stats);
+    stat_offset = custom_stats->size;
+
+    ovs_mutex_lock(&dev->mutex);
+
+    if (!is_vhost_running(dev)) {
+        goto out;
+    }
+
+    vid = netdev_dpdk_get_vid(dev);
+
+    qid = 0 * VIRTIO_QNUM + VIRTIO_TXQ;
+    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
+    if (err < 0) {
+        goto out;
+    }
+    vhost_rxq_stats_count = err;
+
+    qid = 0 * VIRTIO_QNUM;
+    err = rte_vhost_vring_stats_get_names(vid, qid, NULL, 0);
+    if (err < 0) {
+        goto out;
+    }
+    vhost_txq_stats_count = err;
+
+    stat_offset += dev->up.n_rxq * vhost_rxq_stats_count;
+    stat_offset += dev->up.n_txq * vhost_txq_stats_count;
+    custom_stats->counters = xrealloc(custom_stats->counters,
+                                      stat_offset *
+                                      sizeof *custom_stats->counters);
+    stat_offset = custom_stats->size;
+
+    vhost_stats_names = xcalloc(vhost_rxq_stats_count,
+                                sizeof *vhost_stats_names);
+    vhost_stats = xcalloc(vhost_rxq_stats_count, sizeof *vhost_stats);
+
+    for (int q = 0; q < dev->up.n_rxq; q++) {
+        qid = q * VIRTIO_QNUM + VIRTIO_TXQ;
+
+        err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
+                                              vhost_rxq_stats_count);
+        if (err != vhost_rxq_stats_count) {
+            goto out;
+        }
+
+        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
+                                        vhost_rxq_stats_count);
+        if (err != vhost_rxq_stats_count) {
+            goto out;
+        }
+
+        for (int i = 0; i < vhost_rxq_stats_count; i++) {
+            ovs_strlcpy(custom_stats->counters[stat_offset + i].name,
+                        vhost_stats_names[i].name,
+                        NETDEV_CUSTOM_STATS_NAME_SIZE);
+            custom_stats->counters[stat_offset + i].value =
+                vhost_stats[i].value;
+        }
+        stat_offset += vhost_rxq_stats_count;
+    }
+
+    free(vhost_stats_names);
+    vhost_stats_names = NULL;
+    free(vhost_stats);
+    vhost_stats = NULL;
+
+    vhost_stats_names = xcalloc(vhost_txq_stats_count,
+                                sizeof *vhost_stats_names);
+    vhost_stats = xcalloc(vhost_txq_stats_count, sizeof *vhost_stats);
+
+    for (int q = 0; q < dev->up.n_txq; q++) {
+        qid = q * VIRTIO_QNUM;
+
+        err = rte_vhost_vring_stats_get_names(vid, qid, vhost_stats_names,
+                                              vhost_txq_stats_count);
+        if (err != vhost_txq_stats_count) {
+            goto out;
+        }
+
+        err = rte_vhost_vring_stats_get(vid, qid, vhost_stats,
+                                        vhost_txq_stats_count);
+        if (err != vhost_txq_stats_count) {
+            goto out;
+        }
+
+        for (int i = 0; i < vhost_txq_stats_count; i++) {
+            ovs_strlcpy(custom_stats->counters[stat_offset + i].name,
+                        vhost_stats_names[i].name,
+                        NETDEV_CUSTOM_STATS_NAME_SIZE);
+            custom_stats->counters[stat_offset + i].value =
+                vhost_stats[i].value;
+        }
+        stat_offset += vhost_txq_stats_count;
+    }
+
+    free(vhost_stats_names);
+    vhost_stats_names = NULL;
+    free(vhost_stats);
+    vhost_stats = NULL;
+
+out:
+    ovs_mutex_unlock(&dev->mutex);
+
+    custom_stats->size = stat_offset;
+
     return 0;
 }
 
@@ -3088,6 +3287,9 @@ netdev_dpdk_convert_xstats(struct netdev_stats *stats,
 #undef DPDK_XSTATS
 }
 
+static int
+netdev_dpdk_get_carrier(const struct netdev *netdev, bool *carrier);
+
 static int
 netdev_dpdk_get_stats(const struct netdev *netdev, struct netdev_stats *stats)
 {
@@ -3536,6 +3738,7 @@ netdev_dpdk_update_flags__(struct netdev_dpdk *dev,
         if (NETDEV_UP & on) {
             rte_spinlock_lock(&dev->stats_lock);
             memset(&dev->stats, 0, sizeof dev->stats);
+            memset(dev->sw_stats, 0, sizeof *dev->sw_stats);
             rte_spinlock_unlock(&dev->stats_lock);
         }
     }
@@ -5036,6 +5239,11 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
         dev->tx_q[0].map = 0;
     }
 
+    rte_spinlock_lock(&dev->stats_lock);
+    memset(&dev->stats, 0, sizeof dev->stats);
+    memset(dev->sw_stats, 0, sizeof *dev->sw_stats);
+    rte_spinlock_unlock(&dev->stats_lock);
+
     if (userspace_tso_enabled()) {
         dev->hw_ol_features |= NETDEV_TX_TSO_OFFLOAD;
         VLOG_DBG("%s: TSO enabled on vhost port", netdev_get_name(&dev->up));
@@ -5096,6 +5304,9 @@ netdev_dpdk_vhost_client_reconfigure(struct netdev *netdev)
     /* Register client-mode device. */
     vhost_flags |= RTE_VHOST_USER_CLIENT;
 
+    /* Extended per vq statistics. */
+    vhost_flags |= RTE_VHOST_USER_NET_STATS_ENABLE;
+
     /* There is no support for multi-segments buffers. */
     vhost_flags |= RTE_VHOST_USER_LINEARBUF_SUPPORT;
 
@@ -5574,7 +5785,7 @@ static const struct netdev_class dpdk_vhost_class = {
     .send = netdev_dpdk_vhost_send,
     .get_carrier = netdev_dpdk_vhost_get_carrier,
     .get_stats = netdev_dpdk_vhost_get_stats,
-    .get_custom_stats = netdev_dpdk_get_sw_custom_stats,
+    .get_custom_stats = netdev_dpdk_vhost_get_custom_stats,
     .get_status = netdev_dpdk_vhost_user_get_status,
     .reconfigure = netdev_dpdk_vhost_reconfigure,
     .rxq_recv = netdev_dpdk_vhost_rxq_recv,
@@ -5590,7 +5801,7 @@ static const struct netdev_class dpdk_vhost_client_class = {
     .send = netdev_dpdk_vhost_send,
     .get_carrier = netdev_dpdk_vhost_get_carrier,
     .get_stats = netdev_dpdk_vhost_get_stats,
-    .get_custom_stats = netdev_dpdk_get_sw_custom_stats,
+    .get_custom_stats = netdev_dpdk_vhost_get_custom_stats,
     .get_status = netdev_dpdk_vhost_user_get_status,
     .reconfigure = netdev_dpdk_vhost_client_reconfigure,
     .rxq_recv = netdev_dpdk_vhost_rxq_recv,
diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 8dc187a61d..5ef7f8ccdc 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -200,9 +200,10 @@ ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
 dnl Execute testpmd in background
 on_exit "pkill -f -x -9 'tail -f /dev/null'"
 tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
-    --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1" \
-    --vdev="net_tap0,iface=tap0" --file-prefix page0 \
-    --single-file-segments -- -a >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
+    --vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,queues=2,server=1" \
+    --vdev="net_tap0,iface=tap0" --file-prefix page0 \
+    --single-file-segments -- -a --nb-cores 2 --rxq 2 --txq 2 \
+    >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
 
 OVS_WAIT_UNTIL([grep "virtio is now ready for processing" ovs-vswitchd.log])
 OVS_WAIT_UNTIL([ip link show dev tap0 | grep -qw LOWER_UP])
@@ -220,9 +221,33 @@ AT_CHECK([ip netns exec ns1 ip addr add 172.31.110.11/24 dev tap0], [],
 
 AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
 AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr])
-AT_CHECK([ip netns exec ns1 ping -c 4 -I tap0 172.31.110.12], [], [stdout],
+AT_CHECK([ip netns exec ns1 ping -i 0.1 -c 10 -I tap0 172.31.110.12], [], [stdout],
          [stderr])
 
+AT_CHECK([ip netns exec ns1 ip link set tap0 down], [], [stdout], [stderr])
+
+# Wait for stats to be queried ("stats-update-interval")
+sleep 5
+AT_CHECK([ovs-vsctl get interface dpdkvhostuserclient0 statistics], [], [stdout], [stderr])
+
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_packets` -gt 0 -a dnl
+               `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_bytes` -gt 0])
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_packets` -eq dnl
+               $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q0_good_packets` + dnl
+                  `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q1_good_packets`))])
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_bytes` -eq dnl
+               $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q0_good_bytes` + dnl
+                  `ovs-vsctl get interface dpdkvhostuserclient0 statistics:rx_q1_good_bytes`))])
+
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_packets` -gt 0 -a dnl
+               `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes` -gt 0])
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_packets` -eq dnl
+               $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_packets` + dnl
+                  `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_packets`))])
+AT_CHECK([test `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_bytes` -eq dnl
+               $((`ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q0_good_bytes` + dnl
+                  `ovs-vsctl get interface dpdkvhostuserclient0 statistics:tx_q1_good_bytes`))])
+
 dnl Clean up the testpmd now
 pkill -f -x -9 'tail -f /dev/null'