| Message ID | 20260408171644.1404735-1-i.maximets@ovn.org |
|---|---|
| State | New |
| Delegated to: | Kevin Traynor |
| Series | [ovs-dev] dpif-netdev: Remove pmd-stats-show in favor of pmd-perf-show. |
| Context | Check | Description |
|---|---|---|
| ovsrobot/apply-robot | success | apply and check: success |
| ovsrobot/cirrus-robot | success | cirrus build: passed |
| ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On 8 Apr 2026, at 19:16, Ilya Maximets wrote:

> The 'pmd-perf-show' command provides all the same information and more.
> It is also better visually structured and easier to read as a result.
>
> Let's remove the old 'pmd-stats-show' command, as there is no real need
> to have two commands reporting the same data.
>
> The only difference until now was that 'pmd-perf-show' didn't provide
> information for the "main" thread. This change makes it report the
> statistics for the aggregated "main" thread as well, omitting things
> related to CPU cycles, as we can't collect those for threads that are
> not pinned. For the same reason histograms are also always disabled.
> Omission is done by checking the total number of iterations to be zero.
> "main" thread doesn't start/end iterations.
>
> The actual unixctl command is preserved undocumented and serves as an
> alias for 'pmd-perf-show'. This should allow old scripts that are just
> capturing the output for humans (or LLMs?) to read to keep working.
> Note, however, that the exact output format for unixctl commands was
> never a guarantee, so scripts that attempt to parse the output may
> still break.
>
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
>
> Note: I believe the change in system-dpdk-offloads.at is correct,
> but I didn't run the testsuite, as I have no hardware for it.

Not a review, but I want to confirm that the offload testsuite runs fine.

//Eelco
On 4/8/26 6:16 PM, Ilya Maximets wrote:

> The 'pmd-perf-show' command provides all the same information and more.
> It is also better visually structured and easier to read as a result.
>
> Let's remove the old 'pmd-stats-show' command, as there is no real need
> to have two commands reporting the same data.
>
> The only difference until now was that 'pmd-perf-show' didn't provide
> information for the "main" thread. This change makes it report the
> statistics for the aggregated "main" thread as well, omitting things
> related to CPU cycles, as we can't collect those for threads that are
> not pinned. For the same reason histograms are also always disabled.
> Omission is done by checking the total number of iterations to be zero.
> "main" thread doesn't start/end iterations.
>
> The actual unixctl command is preserved undocumented and serves as an
> alias for 'pmd-perf-show'. This should allow old scripts that are just
> capturing the output for humans (or LLMs?) to read to keep working.
> Note, however, that the exact output format for unixctl commands was
> never a guarantee, so scripts that attempt to parse the output may
> still break.
>

Hi Ilya, thanks for this, couple of comments below, otherwise LGTM.

> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
>
> Note: I believe the change in system-dpdk-offloads.at is correct,
> but I didn't run the testsuite, as I have no hardware for it.
> > Documentation/intro/install/afxdp.rst | 2 +- > Documentation/intro/install/dpdk.rst | 2 +- > Documentation/topics/dpdk/bridge.rst | 4 +- > Documentation/topics/dpdk/pmd.rst | 4 - > NEWS | 4 + > lib/dpif-netdev-perf.c | 39 +++-- > lib/dpif-netdev-perf.h | 2 +- > lib/dpif-netdev-unixctl.man | 62 ++++--- > lib/dpif-netdev.c | 158 ++++-------------- > tests/dpif-netdev.at | 20 +-- > tests/pmd.at | 47 +++--- > tests/system-dpdk-offloads.at | 8 +- > .../plugins/system-logs/openvswitch.xml | 2 +- > 13 files changed, 136 insertions(+), 218 deletions(-) > > diff --git a/Documentation/intro/install/afxdp.rst b/Documentation/intro/install/afxdp.rst > index 63a10e328..07225a885 100644 > --- a/Documentation/intro/install/afxdp.rst > +++ b/Documentation/intro/install/afxdp.rst > @@ -273,7 +273,7 @@ Measure your system call rate by doing:: > > Or, use OVS pmd tool:: > > - ovs-appctl dpif-netdev/pmd-stats-show > + ovs-appctl dpif-netdev/pmd-perf-show > > > Example Script > diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst > index 6f4687bde..d5c897e8b 100644 > --- a/Documentation/intro/install/dpdk.rst > +++ b/Documentation/intro/install/dpdk.rst > @@ -709,7 +709,7 @@ level: > > The average number of packets per output batch can be checked in PMD stats:: > > - $ ovs-appctl dpif-netdev/pmd-stats-show > + $ ovs-appctl dpif-netdev/pmd-perf-show > > Limitations > ------------ > diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst > index 03c4dd4e3..4468b904b 100644 > --- a/Documentation/topics/dpdk/bridge.rst > +++ b/Documentation/topics/dpdk/bridge.rst > @@ -103,7 +103,7 @@ the packet itself and others (for example, VLAN tag or Ethernet type) can be > extracted without fully parsing the packet. This allows OVS to significantly > speed up packet forwarding for these flows with simple match criteria. 
> Statistics on the number of packets matched in this way can be found in a > -`simple match hits` counter of `ovs-appctl dpif-netdev/pmd-stats-show` command. > +`Simple Match hits` counter of `ovs-appctl dpif-netdev/pmd-perf-show` command. > > EMC Insertion Probability > ------------------------- > @@ -127,7 +127,7 @@ If ``N`` is set to 1, an insertion will be performed for every flow. If set to > With default ``N`` set to 100, higher megaflow hits will occur initially as > observed with pmd stats:: > > - $ ovs-appctl dpif-netdev/pmd-stats-show > + $ ovs-appctl dpif-netdev/pmd-perf-show > > For certain traffic profiles with many parallel flows, it's recommended to set > ``N`` to '0' to achieve higher forwarding performance. > diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst > index 2e8cf5edb..1589d521c 100644 > --- a/Documentation/topics/dpdk/pmd.rst > +++ b/Documentation/topics/dpdk/pmd.rst > @@ -57,10 +57,6 @@ PMD Thread Statistics > > To show current stats:: > > - $ ovs-appctl dpif-netdev/pmd-stats-show > - > -or:: > - > $ ovs-appctl dpif-netdev/pmd-perf-show > > Detailed performance metrics for ``pmd-perf-show`` can also be enabled:: > diff --git a/NEWS b/NEWS > index 1a3044cbf..b35bcff6e 100644 > --- a/NEWS > +++ b/NEWS > @@ -3,6 +3,10 @@ Post-v3.7.0 > - Userspace datapath: > * ARP/ND lookups for native tunnel are now rate limited. The holdout > timer can be configured with 'tnl/neigh/retrans_time'. > + - ovs-appctl: > + * 'dpif-netdev/pmd-stats-show' command was removed in favor of the more > + informative and better structured 'dpif-netdev/pmd-perf-show', which > + now also provides statistics for the "main" thread. 
> > > v3.7.0 - 16 Feb 2026 > diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c > index 1cd4ee084..ba370d7c1 100644 > --- a/lib/dpif-netdev-perf.c > +++ b/lib/dpif-netdev-perf.c > @@ -233,7 +233,8 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, > uint64_t sleep_iter = stats[PMD_SLEEP_ITER]; > uint64_t tot_sleep_cycles = stats[PMD_CYCLES_SLEEP]; > > - ds_put_format(str, > + if (tot_iter) { While the change is to cater for main thread, it also changes display for pmd cores with no iterations. Previously there was: pmd thread numa_id 1 core_id 9: Iterations: 0 (0.00 us/it) - Used TSC cycles: 0 ( 0.0 % of total cycles) - idle iterations: 0 ( 0.0 % of used cycles) - busy iterations: 0 ( 0.0 % of used cycles) - sleep iterations: 0 ( 0.0 % of iterations) Sleep time (us): 0 ( 0 us/iteration avg.) Rx packets: 0 Tx packets: 0 That is now changed so that iterations of zero are implicit. pmd thread numa_id 1 core_id 11: Rx packets: 0 Tx packets: 0 Considering we can also have a case where there are iterations but no packets like below, i think we should leave zero iterations explicit for pmd thread as it currently is. pmd thread numa_id 0 core_id 8: Iterations: 17369522 (0.15 us/it) - Used TSC cycles: 6816002305 ( 80.1 % of total cycles) - idle iterations: 17369522 (100.0 % of used cycles) - busy iterations: 0 ( 0.0 % of used cycles) - sleep iterations: 0 ( 0.0 % of iterations) Sleep time (us): 0 ( 0 us/iteration avg.) Rx packets: 0 Tx packets: 0 I agree that for main thread it doesn't make sense to display it, so maybe we could add a bool to pmd_perf_format_overall_stats() args to make showing the iteration section conditional, then set differently when the non pmd core in pmd_info_show_perf(). 
> + ds_put_format(str, > " Iterations: %12"PRIu64" (%.2f us/it)\n" > " - Used TSC cycles: %12"PRIu64" (%5.1f %% of total cycles)\n" aside from comments above, with current version there is no need to re-check for tot_iter (not shown in diff) > " - idle iterations: %12"PRIu64" (%5.1f %% of used cycles)\n" > @@ -252,9 +253,18 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, > sleep_iter, tot_iter ? 100.0 * sleep_iter / tot_iter : 0, > tot_sleep_cycles * us_per_cycle, > sleep_iter ? (tot_sleep_cycles * us_per_cycle) / sleep_iter : 0); > + } > if (rx_packets > 0) { > ds_put_format(str, > - " Rx packets: %12"PRIu64" (%.0f Kpps, %.0f cycles/pkt)\n" > + " Rx packets: %12"PRIu64" (%.0f Kpps", > + rx_packets, (rx_packets / duration) / 1000); > + if (tot_iter) { > + ds_put_format(str, ", %.0f cycles/pkt", > + 1.0 * stats[PMD_CYCLES_ITER_BUSY] / rx_packets); > + } > + ds_put_cstr(str, ")\n"); > + > + ds_put_format(str, > " Datapath passes: %12"PRIu64" (%.2f passes/pkt)\n" > " - PHWOL hits: %12"PRIu64" (%5.1f %%)\n" > " - MFEX Opt hits: %12"PRIu64" (%5.1f %%)\n" > @@ -262,11 +272,7 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, > " - EMC hits: %12"PRIu64" (%5.1f %%)\n" > " - SMC hits: %12"PRIu64" (%5.1f %%)\n" > " - Megaflow hits: %12"PRIu64" (%5.1f %%, %.2f " > - "subtbl lookups/hit)\n" > - " - Upcalls: %12"PRIu64" (%5.1f %%, %.1f us/upcall)\n" > - " - Lost upcalls: %12"PRIu64" (%5.1f %%)\n", > - rx_packets, (rx_packets / duration) / 1000, > - 1.0 * stats[PMD_CYCLES_ITER_BUSY] / rx_packets, > + "subtbl lookups/hit)\n", > passes, 1.0 * passes / rx_packets, > stats[PMD_STAT_PHWOL_HIT], > 100.0 * stats[PMD_STAT_PHWOL_HIT] / passes, > @@ -282,11 +288,20 @@ pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, > 100.0 * stats[PMD_STAT_MASKED_HIT] / passes, > stats[PMD_STAT_MASKED_HIT] > ? 
1.0 * stats[PMD_STAT_MASKED_LOOKUP] / stats[PMD_STAT_MASKED_HIT] > - : 0, > - upcalls, 100.0 * upcalls / passes, > - upcalls ? (upcall_cycles * us_per_cycle) / upcalls : 0, > - stats[PMD_STAT_LOST], > - 100.0 * stats[PMD_STAT_LOST] / passes); > + : 0); > + > + ds_put_format(str, > + " - Upcalls: %12"PRIu64" (%5.1f %%", > + upcalls, 100.0 * upcalls / passes); > + if (tot_iter) { > + ds_put_format(str, ", %.1f us/upcall", > + upcalls ? (upcall_cycles * us_per_cycle) / upcalls : 0); > + } > + ds_put_cstr(str, ")\n"); > + > + ds_put_format(str, > + " - Lost upcalls: %12"PRIu64" (%5.1f %%)\n", > + stats[PMD_STAT_LOST], 100.0 * stats[PMD_STAT_LOST] / passes); > } else { > ds_put_format(str, > " Rx packets: %12d\n", 0); > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h > index 84beced15..8a41afa8a 100644 > --- a/lib/dpif-netdev-perf.h > +++ b/lib/dpif-netdev-perf.h > @@ -317,7 +317,7 @@ void pmd_perf_read_counters(struct pmd_perf_stats *s, > * NON-PMD they might be updated from multiple threads, but we can live > * with losing a rare update as 100% accuracy is not required. > * However, as counters are read for display from outside the PMD thread > - * with e.g. pmd-stats-show, we make sure that the 64-bit read and store > + * with e.g. pmd-perf-show, we make sure that the 64-bit read and store > * operations are atomic also on 32-bit systems so that readers cannot > * not read garbage. On 64-bit systems this incurs no overhead. */ > > diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man > index 8cd847416..3d5ab437c 100644 > --- a/lib/dpif-netdev-unixctl.man > +++ b/lib/dpif-netdev-unixctl.man > @@ -6,44 +6,22 @@ argument can be omitted. By default the commands present data for all pmd > threads in the datapath. By specifying the "-pmd Core" option one can filter > the output for a single pmd in the datapath. > . 
> -.IP "\fBdpif-netdev/pmd-stats-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" > -Shows performance statistics for one or all pmd threads of the datapath > -\fIdp\fR. The special thread "main" sums up the statistics of every non pmd > -thread. > - > -The sum of "phwol hits", "simple match hits", "emc hits", "smc hits", > -"megaflow hits" and "miss" is the number of packet lookups performed by the > -datapath. Beware that a recirculated packet experiences one additional lookup > -per recirculation, so there may be more lookups than forwarded packets in the > -datapath. > - > -The MFEX Opt hits displays the number of packets that are processed by the > -optimized miniflow extract implementations. > - > -Cycles are counted using the TSC or similar facilities (when available on > -the platform). The duration of one cycle depends on the processing platform. > - > -"idle cycles" refers to cycles spent in PMD iterations not forwarding any > -any packets. "processing cycles" refers to cycles spent in PMD iterations > -forwarding at least one packet, including the cost for polling, processing and > -transmitting said packets. > - > -To reset these counters use \fBdpif-netdev/pmd-stats-clear\fR. > -. > .IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]" > Resets to zero the per pmd thread performance numbers shown by the > -\fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands. > -It will NOT reset datapath or bridge statistics, only the values shown by > -the above commands. > +\fBdpif-netdev/pmd-perf-show\fR command. It will NOT reset datapath or bridge > +statistics, only the values shown by the above command. > . > .IP "\fBdpif-netdev/pmd-perf-show\fR [\fB-nh\fR] [\fB-it\fR \fIiter_len\fR] \ > [\fB-ms\fR \fIms_len\fR] [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" > Shows detailed performance metrics for one or all pmds threads of the > -user space datapath. > +user space datapath. The special thread "main" sums up the statistics of every > +non pmd thread. 
> > -The collection of detailed statistics can be controlled by a new > -configuration parameter "other_config:pmd-perf-metrics". By default it > -is disabled. The run-time overhead, when enabled, is in the order of 1%. > +The collection of additional detailed statistics can be controlled by a > +configuration parameter \fBother-config:pmd-perf-metrics\fR. By default it is > +disabled. The run-time overhead, when enabled, is in the order of 1%. > + > +Collected statistics include: > > .RS > .IP > @@ -153,8 +131,26 @@ pmd thread numa_id 0 core_id 1: > .RE > .IP > Here "Rx packets" actually reflects the number of packets forwarded by the > -datapath. "Datapath passes" matches the number of packet lookups as > -reported by the \fBdpif-netdev/pmd-stats-show\fR command. > +datapath. > + > +The sum of "PHWOL hits", "Simple Match hits", "EMC hits", "SMC hits", > +"Megaflow hits" and "Upcalls" is the number of packet lookups performed by the > +datapath and it is reported as "Datapath passes". Beware that a recirculated > +packet experiences one additional lookup per recirculation, so there may be > +more lookups than forwarded packets in the datapath. > + > +The "MFEX Opt hits" displays the number of packets that are processed by the > +optimized miniflow extract implementations. > + > +Cycles are counted using the TSC or similar facilities (when available on > +the platform). The duration of one cycle depends on the processing platform. > +Statistics based on cycles are not reported for the "main" thread, since the > +accurate accounting of CPU cycles is not possible in this case. > + > +"idle iterations" refers to PMD iterations that didn't not result in processing typo, "didn't not" thanks, Kevin. > +any packets. "busy iterations" refers to PMD iterations that included > +processing of at least one packet. The reported used TSC cycles include the > +cost for polling, processing and transmitting said packets. 
> > To reset the counters and start a new measurement use > \fBdpif-netdev/pmd-stats-clear\fR. > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 9df05c4c2..db5823a91 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -620,8 +620,7 @@ get_dp_netdev(const struct dpif *dpif) > } > > enum pmd_info_type { > - PMD_INFO_SHOW_STATS, /* Show how cpu cycles are spent. */ > - PMD_INFO_CLEAR_STATS, /* Set the cycles count to 0. */ > + PMD_INFO_CLEAR_STATS, /* Set the cycle and the packet counters to 0. */ > PMD_INFO_SHOW_RXQ, /* Show poll lists of pmd threads. */ > PMD_INFO_PERF_SHOW, /* Show pmd performance details. */ > PMD_INFO_SLEEP_SHOW, /* Show max sleep configuration details. */ > @@ -641,127 +640,42 @@ format_pmd_thread(struct ds *reply, struct dp_netdev_pmd_thread *pmd) > ds_put_cstr(reply, ":\n"); > } > > -static void > -pmd_info_show_stats(struct ds *reply, > - struct dp_netdev_pmd_thread *pmd) > -{ > - uint64_t stats[PMD_N_STATS]; > - uint64_t total_cycles, total_packets; > - double passes_per_pkt = 0; > - double lookups_per_hit = 0; > - double packets_per_batch = 0; > - > - pmd_perf_read_counters(&pmd->perf_stats, stats); > - total_cycles = stats[PMD_CYCLES_ITER_IDLE] > - + stats[PMD_CYCLES_ITER_BUSY]; > - total_packets = stats[PMD_STAT_RECV]; > - > - format_pmd_thread(reply, pmd); > - > - if (total_packets > 0) { > - passes_per_pkt = (total_packets + stats[PMD_STAT_RECIRC]) > - / (double) total_packets; > - } > - if (stats[PMD_STAT_MASKED_HIT] > 0) { > - lookups_per_hit = stats[PMD_STAT_MASKED_LOOKUP] > - / (double) stats[PMD_STAT_MASKED_HIT]; > - } > - if (stats[PMD_STAT_SENT_BATCHES] > 0) { > - packets_per_batch = stats[PMD_STAT_SENT_PKTS] > - / (double) stats[PMD_STAT_SENT_BATCHES]; > - } > - > - ds_put_format(reply, > - " packets received: %"PRIu64"\n" > - " packet recirculations: %"PRIu64"\n" > - " avg. 
datapath passes per packet: %.02f\n" > - " phwol hits: %"PRIu64"\n" > - " mfex opt hits: %"PRIu64"\n" > - " simple match hits: %"PRIu64"\n" > - " emc hits: %"PRIu64"\n" > - " smc hits: %"PRIu64"\n" > - " megaflow hits: %"PRIu64"\n" > - " avg. subtable lookups per megaflow hit: %.02f\n" > - " miss with success upcall: %"PRIu64"\n" > - " miss with failed upcall: %"PRIu64"\n" > - " avg. packets per output batch: %.02f\n", > - total_packets, stats[PMD_STAT_RECIRC], > - passes_per_pkt, stats[PMD_STAT_PHWOL_HIT], > - stats[PMD_STAT_MFEX_OPT_HIT], > - stats[PMD_STAT_SIMPLE_HIT], > - stats[PMD_STAT_EXACT_HIT], > - stats[PMD_STAT_SMC_HIT], > - stats[PMD_STAT_MASKED_HIT], > - lookups_per_hit, stats[PMD_STAT_MISS], stats[PMD_STAT_LOST], > - packets_per_batch); > - > - if (total_cycles == 0) { > - return; > - } > - > - ds_put_format(reply, > - " idle cycles: %"PRIu64" (%.02f%%)\n" > - " processing cycles: %"PRIu64" (%.02f%%)\n", > - stats[PMD_CYCLES_ITER_IDLE], > - stats[PMD_CYCLES_ITER_IDLE] / (double) total_cycles * 100, > - stats[PMD_CYCLES_ITER_BUSY], > - stats[PMD_CYCLES_ITER_BUSY] / (double) total_cycles * 100); > - > - if (total_packets == 0) { > - return; > - } > - > - ds_put_format(reply, > - " avg cycles per packet: %.02f (%"PRIu64"/%"PRIu64")\n", > - total_cycles / (double) total_packets, > - total_cycles, total_packets); > - > - ds_put_format(reply, > - " avg processing cycles per packet: " > - "%.02f (%"PRIu64"/%"PRIu64")\n", > - stats[PMD_CYCLES_ITER_BUSY] / (double) total_packets, > - stats[PMD_CYCLES_ITER_BUSY], total_packets); > -} > - > static void > pmd_info_show_perf(struct ds *reply, > struct dp_netdev_pmd_thread *pmd, > struct pmd_perf_params *par) > { > - if (pmd->core_id != NON_PMD_CORE_ID) { > - char *time_str = > - xastrftime_msec("%H:%M:%S.###", time_wall_msec(), true); > - long long now = time_msec(); > - double duration = (now - pmd->perf_stats.start_ms) / 1000.0; > - > - ds_put_cstr(reply, "\n"); > - ds_put_format(reply, "Time: %s\n", time_str); > 
- ds_put_format(reply, "Measurement duration: %.3f s\n", duration); > - ds_put_cstr(reply, "\n"); > - format_pmd_thread(reply, pmd); > - ds_put_cstr(reply, "\n"); > - pmd_perf_format_overall_stats(reply, &pmd->perf_stats, duration); > - if (pmd_perf_metrics_enabled(pmd)) { > - /* Prevent parallel clearing of perf metrics. */ > - ovs_mutex_lock(&pmd->perf_stats.clear_mutex); > - if (par->histograms) { > - ds_put_cstr(reply, "\n"); > - pmd_perf_format_histograms(reply, &pmd->perf_stats); > - } > - if (par->iter_hist_len > 0) { > - ds_put_cstr(reply, "\n"); > - pmd_perf_format_iteration_history(reply, &pmd->perf_stats, > - par->iter_hist_len); > - } > - if (par->ms_hist_len > 0) { > - ds_put_cstr(reply, "\n"); > - pmd_perf_format_ms_history(reply, &pmd->perf_stats, > - par->ms_hist_len); > - } > - ovs_mutex_unlock(&pmd->perf_stats.clear_mutex); > + char *time_str = xastrftime_msec("%H:%M:%S.###", time_wall_msec(), true); > + long long now = time_msec(); > + double duration = (now - pmd->perf_stats.start_ms) / 1000.0; > + > + ds_put_cstr(reply, "\n"); > + ds_put_format(reply, "Time: %s\n", time_str); > + ds_put_format(reply, "Measurement duration: %.3f s\n", duration); > + ds_put_cstr(reply, "\n"); > + format_pmd_thread(reply, pmd); > + ds_put_cstr(reply, "\n"); > + pmd_perf_format_overall_stats(reply, &pmd->perf_stats, duration); > + if (pmd_perf_metrics_enabled(pmd) && pmd->core_id != NON_PMD_CORE_ID) { > + /* Prevent parallel clearing of perf metrics. 
*/ > + ovs_mutex_lock(&pmd->perf_stats.clear_mutex); > + if (par->histograms) { > + ds_put_cstr(reply, "\n"); > + pmd_perf_format_histograms(reply, &pmd->perf_stats); > } > - free(time_str); > + if (par->iter_hist_len > 0) { > + ds_put_cstr(reply, "\n"); > + pmd_perf_format_iteration_history(reply, &pmd->perf_stats, > + par->iter_hist_len); > + } > + if (par->ms_hist_len > 0) { > + ds_put_cstr(reply, "\n"); > + pmd_perf_format_ms_history(reply, &pmd->perf_stats, > + par->ms_hist_len); > + } > + ovs_mutex_unlock(&pmd->perf_stats.clear_mutex); > } > + free(time_str); > } > > static int > @@ -1443,8 +1357,6 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], > pmd_info_show_rxq(&reply, pmd, secs); > } else if (type == PMD_INFO_CLEAR_STATS) { > pmd_perf_stats_clear(&pmd->perf_stats); > - } else if (type == PMD_INFO_SHOW_STATS) { > - pmd_info_show_stats(&reply, pmd); > } else if (type == PMD_INFO_PERF_SHOW) { > pmd_info_show_perf(&reply, pmd, (struct pmd_perf_params *)aux); > } else if (type == PMD_INFO_SLEEP_SHOW) { > @@ -1554,14 +1466,10 @@ dpif_netdev_bond_show(struct unixctl_conn *conn, int argc, > static int > dpif_netdev_init(void) > { > - static enum pmd_info_type show_aux = PMD_INFO_SHOW_STATS, > - clear_aux = PMD_INFO_CLEAR_STATS, > + static enum pmd_info_type clear_aux = PMD_INFO_CLEAR_STATS, > poll_aux = PMD_INFO_SHOW_RXQ, > sleep_aux = PMD_INFO_SLEEP_SHOW; > > - unixctl_command_register("dpif-netdev/pmd-stats-show", "[-pmd core] [dp]", > - 0, 3, dpif_netdev_pmd_info, > - (void *)&show_aux); > unixctl_command_register("dpif-netdev/pmd-stats-clear", "[-pmd core] [dp]", > 0, 3, dpif_netdev_pmd_info, > (void *)&clear_aux); > @@ -1578,6 +1486,10 @@ dpif_netdev_init(void) > " [-pmd core] [dp]", > 0, 8, pmd_perf_show_cmd, > NULL); > + /* 'pmd-stats-show' is just an undocumented alias for 'pmd-perf-show', > + * for compatibility with old muscle memory. 
*/ > + unixctl_command_register("dpif-netdev/pmd-stats-show", NULL, > + 0, 8, pmd_perf_show_cmd, NULL); > unixctl_command_register("dpif-netdev/pmd-rxq-rebalance", "[dp]", > 0, 1, dpif_netdev_pmd_rebalance, > NULL); > diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at > index 231197970..405094856 100644 > --- a/tests/dpif-netdev.at > +++ b/tests/dpif-netdev.at > @@ -979,32 +979,32 @@ AT_CHECK([cat good_frame | sed -e "s/6b72/dead/" > bad_frame]) > > CHECK_FWD_PACKET(p1, p2, , [bad_frame], [bad_frame]) > dnl First packet, no simple matching. > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl > - simple match hits: 0 > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl > + - Simple Match hits: 0 ( 0.0 %) > ]) > > dnl No Rx flag. > CHECK_FWD_PACKET(p1, p2, , [bad_frame], [bad_frame]) > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl > - simple match hits: 1 > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl > + - Simple Match hits: 1 ( 50.0 %) > ]) > > dnl Flag as Rx good. > CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_good, [bad_frame], [bad_frame]) > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl > - simple match hits: 2 > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl > + - Simple Match hits: 2 ( 66.7 %) > ]) > > dnl Flag as Rx bad. > CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_bad, [bad_frame], [bad_frame]) > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl > - simple match hits: 3 > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl > + - Simple Match hits: 3 ( 75.0 %) > ]) > > dnl Flag as Rx partial. 
> CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_partial, [bad_frame], [good_frame]) > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl > - simple match hits: 4 > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl > + - Simple Match hits: 4 ( 80.0 %) > ]) > > OVS_VSWITCHD_STOP > diff --git a/tests/pmd.at b/tests/pmd.at > index 8254ac3b0..e8590044a 100644 > --- a/tests/pmd.at > +++ b/tests/pmd.at > @@ -440,20 +440,12 @@ dummy@ovs-dummy: hit:0 missed:0 > p0 7/1: (dummy-pmd: n_rxq=4, n_txq=1, numa_id=0) > ]) > > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 12], [0], [dnl > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | sed SED_NUMA_CORE_PATTERN \ > + | sed '/cycles/d' | sed '/[[Ii]]teration/d' | grep pmd -A 3], [0], [dnl > pmd thread numa_id <cleared> core_id <cleared>: > - packets received: 0 > - packet recirculations: 0 > - avg. datapath passes per packet: 0.00 > - phwol hits: 0 > - mfex opt hits: 0 > - simple match hits: 0 > - emc hits: 0 > - smc hits: 0 > - megaflow hits: 0 > - avg. 
subtable lookups per megaflow hit: 0.00 > - miss with success upcall: 0 > - miss with failed upcall: 0 > + > + Rx packets: 0 > + Tx packets: 0 > ]) > > ovs-appctl time/stop > @@ -474,20 +466,23 @@ AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl > recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:77,dst=50:54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: <del> > ]) > > -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 12], [0], [dnl > +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | sed SED_NUMA_CORE_PATTERN \ > + | sed '/cycles/d' | sed '/[[Ii]]teration/d' \ > + | sed 's/, .* us/, <cleared> us/' \ > + | sed 's/[[0-9]]* Kpps/<cleared> Kpps/' | grep pmd -A 12], [0], [dnl > pmd thread numa_id <cleared> core_id <cleared>: > - packets received: 20 > - packet recirculations: 0 > - avg. datapath passes per packet: 1.00 > - phwol hits: 0 > - mfex opt hits: 0 > - simple match hits: 0 > - emc hits: 19 > - smc hits: 0 > - megaflow hits: 0 > - avg. 
subtable lookups per megaflow hit: 0.00 > - miss with success upcall: 1 > - miss with failed upcall: 0 > + > + Datapath passes: 20 (1.00 passes/pkt) > + - PHWOL hits: 0 ( 0.0 %) > + - MFEX Opt hits: 0 ( 0.0 %) > + - Simple Match hits: 0 ( 0.0 %) > + - EMC hits: 19 ( 95.0 %) > + - SMC hits: 0 ( 0.0 %) > + - Megaflow hits: 0 ( 0.0 %, 0.00 subtbl lookups/hit) > + - Upcalls: 1 ( 5.0 %, <cleared> us/upcall) > + - Lost upcalls: 0 ( 0.0 %) > + Tx packets: 20 (<cleared> Kpps) > + Tx batches: 20 (1.00 pkts/batch) > ]) > > OVS_VSWITCHD_STOP > diff --git a/tests/system-dpdk-offloads.at b/tests/system-dpdk-offloads.at > index 81ab89b2f..09bdaf639 100644 > --- a/tests/system-dpdk-offloads.at > +++ b/tests/system-dpdk-offloads.at > @@ -144,8 +144,8 @@ AT_CHECK([ovs-appctl dpctl/dump-flows type=dpdk,partially-offloaded \ > in_port(2),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:954, used:0.0s, actions:check_pkt_len(size=200,gt(4),le(5)),3 > ]) > > -AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-stats-show | \ > - awk '/phwol hits:/ {sum += $3} END {print sum}') -ge 8]) > +AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-perf-show | \ > + awk '/PHWOL hits:/ {sum += $4} END {print sum}') -ge 8]) > > OVS_TRAFFIC_VSWITCHD_STOP > AT_CLEANUP > @@ -216,8 +216,8 @@ OVS_WAIT_UNTIL_EQUAL( > in_port(ovs-p0),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:50, bytes:3000, used:0.0s, actions:ovs-p1 > in_port(ovs-p1),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:50, bytes:3000, used:0.0s, actions:ovs-p0]) > > -AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-stats-show | \ > - awk '/packets received:/ {sum += $3} END {print sum}') -lt 10]) > +AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-perf-show | \ > + awk '/Rx packets:/ {sum += $3} END {print sum}') -lt 10]) > > OVS_TRAFFIC_VSWITCHD_STOP > AT_CLEANUP > diff --git a/utilities/bugtool/plugins/system-logs/openvswitch.xml b/utilities/bugtool/plugins/system-logs/openvswitch.xml > index 46c731812..0f17add75 100644 > --- 
a/utilities/bugtool/plugins/system-logs/openvswitch.xml > +++ b/utilities/bugtool/plugins/system-logs/openvswitch.xml > @@ -20,7 +20,7 @@ > <directory label="ovsdb-backups" filters="ovs" pattern=".*/conf.db.backup[0-9][^/]*$">/etc/openvswitch</directory> > <directory label="ovsdb-backups2" filters="ovs" pattern=".*/conf.db.backup[0-9][^/]*$">/var/lib/openvswitch</directory> > <command label="system_memory_status" filters="ovs">df -h</command> > - <command label="check_number_of_pmds" filters="ovs">ovs-appctl dpif-netdev/pmd-stats-show | grep pmd</command> > + <command label="check_number_of_pmds" filters="ovs">ovs-appctl dpif-netdev/pmd-perf-show | grep pmd</command> > <command label="ovs-appctl-vlog-list" filters="ovs">ovs-appctl vlog/list</command> > <command label="journalctl" filters="ovs">journalctl</command> > <command label="user_limits" filters="ovs">ulimit -a</command>
(upcall_cycles * us_per_cycle) / upcalls : 0); + } + ds_put_cstr(str, ")\n"); + + ds_put_format(str, + " - Lost upcalls: %12"PRIu64" (%5.1f %%)\n", + stats[PMD_STAT_LOST], 100.0 * stats[PMD_STAT_LOST] / passes); } else { ds_put_format(str, " Rx packets: %12d\n", 0); diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h index 84beced15..8a41afa8a 100644 --- a/lib/dpif-netdev-perf.h +++ b/lib/dpif-netdev-perf.h @@ -317,7 +317,7 @@ void pmd_perf_read_counters(struct pmd_perf_stats *s, * NON-PMD they might be updated from multiple threads, but we can live * with losing a rare update as 100% accuracy is not required. * However, as counters are read for display from outside the PMD thread - * with e.g. pmd-stats-show, we make sure that the 64-bit read and store + * with e.g. pmd-perf-show, we make sure that the 64-bit read and store * operations are atomic also on 32-bit systems so that readers cannot * not read garbage. On 64-bit systems this incurs no overhead. */ diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man index 8cd847416..3d5ab437c 100644 --- a/lib/dpif-netdev-unixctl.man +++ b/lib/dpif-netdev-unixctl.man @@ -6,44 +6,22 @@ argument can be omitted. By default the commands present data for all pmd threads in the datapath. By specifying the "-pmd Core" option one can filter the output for a single pmd in the datapath. . -.IP "\fBdpif-netdev/pmd-stats-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" -Shows performance statistics for one or all pmd threads of the datapath -\fIdp\fR. The special thread "main" sums up the statistics of every non pmd -thread. - -The sum of "phwol hits", "simple match hits", "emc hits", "smc hits", -"megaflow hits" and "miss" is the number of packet lookups performed by the -datapath. Beware that a recirculated packet experiences one additional lookup -per recirculation, so there may be more lookups than forwarded packets in the -datapath. 
- -The MFEX Opt hits displays the number of packets that are processed by the -optimized miniflow extract implementations. - -Cycles are counted using the TSC or similar facilities (when available on -the platform). The duration of one cycle depends on the processing platform. - -"idle cycles" refers to cycles spent in PMD iterations not forwarding any -any packets. "processing cycles" refers to cycles spent in PMD iterations -forwarding at least one packet, including the cost for polling, processing and -transmitting said packets. - -To reset these counters use \fBdpif-netdev/pmd-stats-clear\fR. -. .IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]" Resets to zero the per pmd thread performance numbers shown by the -\fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands. -It will NOT reset datapath or bridge statistics, only the values shown by -the above commands. +\fBdpif-netdev/pmd-perf-show\fR command. It will NOT reset datapath or bridge +statistics, only the values shown by the above command. . .IP "\fBdpif-netdev/pmd-perf-show\fR [\fB-nh\fR] [\fB-it\fR \fIiter_len\fR] \ [\fB-ms\fR \fIms_len\fR] [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]" Shows detailed performance metrics for one or all pmds threads of the -user space datapath. +user space datapath. The special thread "main" sums up the statistics of every +non pmd thread. -The collection of detailed statistics can be controlled by a new -configuration parameter "other_config:pmd-perf-metrics". By default it -is disabled. The run-time overhead, when enabled, is in the order of 1%. +The collection of additional detailed statistics can be controlled by a +configuration parameter \fBother-config:pmd-perf-metrics\fR. By default it is +disabled. The run-time overhead, when enabled, is in the order of 1%. + +Collected statistics include: .RS .IP @@ -153,8 +131,26 @@ pmd thread numa_id 0 core_id 1: .RE .IP Here "Rx packets" actually reflects the number of packets forwarded by the -datapath. 
"Datapath passes" matches the number of packet lookups as -reported by the \fBdpif-netdev/pmd-stats-show\fR command. +datapath. + +The sum of "PHWOL hits", "Simple Match hits", "EMC hits", "SMC hits", +"Megaflow hits" and "Upcalls" is the number of packet lookups performed by the +datapath and it is reported as "Datapath passes". Beware that a recirculated +packet experiences one additional lookup per recirculation, so there may be +more lookups than forwarded packets in the datapath. + +The "MFEX Opt hits" displays the number of packets that are processed by the +optimized miniflow extract implementations. + +Cycles are counted using the TSC or similar facilities (when available on +the platform). The duration of one cycle depends on the processing platform. +Statistics based on cycles are not reported for the "main" thread, since +accurate accounting of CPU cycles is not possible in this case. + +"idle iterations" refers to PMD iterations that did not result in processing +any packets. "busy iterations" refers to PMD iterations that included +processing of at least one packet. The reported used TSC cycles include the +cost for polling, processing and transmitting said packets. To reset the counters and start a new measurement use \fBdpif-netdev/pmd-stats-clear\fR. diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 9df05c4c2..db5823a91 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -620,8 +620,7 @@ get_dp_netdev(const struct dpif *dpif) } enum pmd_info_type { - PMD_INFO_SHOW_STATS, /* Show how cpu cycles are spent. */ - PMD_INFO_CLEAR_STATS, /* Set the cycles count to 0. */ + PMD_INFO_CLEAR_STATS, /* Set the cycle and the packet counters to 0. */ PMD_INFO_SHOW_RXQ, /* Show poll lists of pmd threads. */ PMD_INFO_PERF_SHOW, /* Show pmd performance details. */ PMD_INFO_SLEEP_SHOW, /* Show max sleep configuration details. 
*/ @@ -641,127 +640,42 @@ format_pmd_thread(struct ds *reply, struct dp_netdev_pmd_thread *pmd) ds_put_cstr(reply, ":\n"); } -static void -pmd_info_show_stats(struct ds *reply, - struct dp_netdev_pmd_thread *pmd) -{ - uint64_t stats[PMD_N_STATS]; - uint64_t total_cycles, total_packets; - double passes_per_pkt = 0; - double lookups_per_hit = 0; - double packets_per_batch = 0; - - pmd_perf_read_counters(&pmd->perf_stats, stats); - total_cycles = stats[PMD_CYCLES_ITER_IDLE] - + stats[PMD_CYCLES_ITER_BUSY]; - total_packets = stats[PMD_STAT_RECV]; - - format_pmd_thread(reply, pmd); - - if (total_packets > 0) { - passes_per_pkt = (total_packets + stats[PMD_STAT_RECIRC]) - / (double) total_packets; - } - if (stats[PMD_STAT_MASKED_HIT] > 0) { - lookups_per_hit = stats[PMD_STAT_MASKED_LOOKUP] - / (double) stats[PMD_STAT_MASKED_HIT]; - } - if (stats[PMD_STAT_SENT_BATCHES] > 0) { - packets_per_batch = stats[PMD_STAT_SENT_PKTS] - / (double) stats[PMD_STAT_SENT_BATCHES]; - } - - ds_put_format(reply, - " packets received: %"PRIu64"\n" - " packet recirculations: %"PRIu64"\n" - " avg. datapath passes per packet: %.02f\n" - " phwol hits: %"PRIu64"\n" - " mfex opt hits: %"PRIu64"\n" - " simple match hits: %"PRIu64"\n" - " emc hits: %"PRIu64"\n" - " smc hits: %"PRIu64"\n" - " megaflow hits: %"PRIu64"\n" - " avg. subtable lookups per megaflow hit: %.02f\n" - " miss with success upcall: %"PRIu64"\n" - " miss with failed upcall: %"PRIu64"\n" - " avg. 
packets per output batch: %.02f\n", - total_packets, stats[PMD_STAT_RECIRC], - passes_per_pkt, stats[PMD_STAT_PHWOL_HIT], - stats[PMD_STAT_MFEX_OPT_HIT], - stats[PMD_STAT_SIMPLE_HIT], - stats[PMD_STAT_EXACT_HIT], - stats[PMD_STAT_SMC_HIT], - stats[PMD_STAT_MASKED_HIT], - lookups_per_hit, stats[PMD_STAT_MISS], stats[PMD_STAT_LOST], - packets_per_batch); - - if (total_cycles == 0) { - return; - } - - ds_put_format(reply, - " idle cycles: %"PRIu64" (%.02f%%)\n" - " processing cycles: %"PRIu64" (%.02f%%)\n", - stats[PMD_CYCLES_ITER_IDLE], - stats[PMD_CYCLES_ITER_IDLE] / (double) total_cycles * 100, - stats[PMD_CYCLES_ITER_BUSY], - stats[PMD_CYCLES_ITER_BUSY] / (double) total_cycles * 100); - - if (total_packets == 0) { - return; - } - - ds_put_format(reply, - " avg cycles per packet: %.02f (%"PRIu64"/%"PRIu64")\n", - total_cycles / (double) total_packets, - total_cycles, total_packets); - - ds_put_format(reply, - " avg processing cycles per packet: " - "%.02f (%"PRIu64"/%"PRIu64")\n", - stats[PMD_CYCLES_ITER_BUSY] / (double) total_packets, - stats[PMD_CYCLES_ITER_BUSY], total_packets); -} - static void pmd_info_show_perf(struct ds *reply, struct dp_netdev_pmd_thread *pmd, struct pmd_perf_params *par) { - if (pmd->core_id != NON_PMD_CORE_ID) { - char *time_str = - xastrftime_msec("%H:%M:%S.###", time_wall_msec(), true); - long long now = time_msec(); - double duration = (now - pmd->perf_stats.start_ms) / 1000.0; - - ds_put_cstr(reply, "\n"); - ds_put_format(reply, "Time: %s\n", time_str); - ds_put_format(reply, "Measurement duration: %.3f s\n", duration); - ds_put_cstr(reply, "\n"); - format_pmd_thread(reply, pmd); - ds_put_cstr(reply, "\n"); - pmd_perf_format_overall_stats(reply, &pmd->perf_stats, duration); - if (pmd_perf_metrics_enabled(pmd)) { - /* Prevent parallel clearing of perf metrics. 
*/ - ovs_mutex_lock(&pmd->perf_stats.clear_mutex); - if (par->histograms) { - ds_put_cstr(reply, "\n"); - pmd_perf_format_histograms(reply, &pmd->perf_stats); - } - if (par->iter_hist_len > 0) { - ds_put_cstr(reply, "\n"); - pmd_perf_format_iteration_history(reply, &pmd->perf_stats, - par->iter_hist_len); - } - if (par->ms_hist_len > 0) { - ds_put_cstr(reply, "\n"); - pmd_perf_format_ms_history(reply, &pmd->perf_stats, - par->ms_hist_len); - } - ovs_mutex_unlock(&pmd->perf_stats.clear_mutex); + char *time_str = xastrftime_msec("%H:%M:%S.###", time_wall_msec(), true); + long long now = time_msec(); + double duration = (now - pmd->perf_stats.start_ms) / 1000.0; + + ds_put_cstr(reply, "\n"); + ds_put_format(reply, "Time: %s\n", time_str); + ds_put_format(reply, "Measurement duration: %.3f s\n", duration); + ds_put_cstr(reply, "\n"); + format_pmd_thread(reply, pmd); + ds_put_cstr(reply, "\n"); + pmd_perf_format_overall_stats(reply, &pmd->perf_stats, duration); + if (pmd_perf_metrics_enabled(pmd) && pmd->core_id != NON_PMD_CORE_ID) { + /* Prevent parallel clearing of perf metrics. 
*/ + ovs_mutex_lock(&pmd->perf_stats.clear_mutex); + if (par->histograms) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_histograms(reply, &pmd->perf_stats); } - free(time_str); + if (par->iter_hist_len > 0) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_iteration_history(reply, &pmd->perf_stats, + par->iter_hist_len); + } + if (par->ms_hist_len > 0) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_ms_history(reply, &pmd->perf_stats, + par->ms_hist_len); + } + ovs_mutex_unlock(&pmd->perf_stats.clear_mutex); } + free(time_str); } static int @@ -1443,8 +1357,6 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], pmd_info_show_rxq(&reply, pmd, secs); } else if (type == PMD_INFO_CLEAR_STATS) { pmd_perf_stats_clear(&pmd->perf_stats); - } else if (type == PMD_INFO_SHOW_STATS) { - pmd_info_show_stats(&reply, pmd); } else if (type == PMD_INFO_PERF_SHOW) { pmd_info_show_perf(&reply, pmd, (struct pmd_perf_params *)aux); } else if (type == PMD_INFO_SLEEP_SHOW) { @@ -1554,14 +1466,10 @@ dpif_netdev_bond_show(struct unixctl_conn *conn, int argc, static int dpif_netdev_init(void) { - static enum pmd_info_type show_aux = PMD_INFO_SHOW_STATS, - clear_aux = PMD_INFO_CLEAR_STATS, + static enum pmd_info_type clear_aux = PMD_INFO_CLEAR_STATS, poll_aux = PMD_INFO_SHOW_RXQ, sleep_aux = PMD_INFO_SLEEP_SHOW; - unixctl_command_register("dpif-netdev/pmd-stats-show", "[-pmd core] [dp]", - 0, 3, dpif_netdev_pmd_info, - (void *)&show_aux); unixctl_command_register("dpif-netdev/pmd-stats-clear", "[-pmd core] [dp]", 0, 3, dpif_netdev_pmd_info, (void *)&clear_aux); @@ -1578,6 +1486,10 @@ dpif_netdev_init(void) " [-pmd core] [dp]", 0, 8, pmd_perf_show_cmd, NULL); + /* 'pmd-stats-show' is just an undocumented alias for 'pmd-perf-show', + * for compatibility with old muscle memory. 
*/ + unixctl_command_register("dpif-netdev/pmd-stats-show", NULL, + 0, 8, pmd_perf_show_cmd, NULL); unixctl_command_register("dpif-netdev/pmd-rxq-rebalance", "[dp]", 0, 1, dpif_netdev_pmd_rebalance, NULL); diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at index 231197970..405094856 100644 --- a/tests/dpif-netdev.at +++ b/tests/dpif-netdev.at @@ -979,32 +979,32 @@ AT_CHECK([cat good_frame | sed -e "s/6b72/dead/" > bad_frame]) CHECK_FWD_PACKET(p1, p2, , [bad_frame], [bad_frame]) dnl First packet, no simple matching. -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl - simple match hits: 0 +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl + - Simple Match hits: 0 ( 0.0 %) ]) dnl No Rx flag. CHECK_FWD_PACKET(p1, p2, , [bad_frame], [bad_frame]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl - simple match hits: 1 +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl + - Simple Match hits: 1 ( 50.0 %) ]) dnl Flag as Rx good. CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_good, [bad_frame], [bad_frame]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl - simple match hits: 2 +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl + - Simple Match hits: 2 ( 66.7 %) ]) dnl Flag as Rx bad. CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_bad, [bad_frame], [bad_frame]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl - simple match hits: 3 +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl + - Simple Match hits: 3 ( 75.0 %) ]) dnl Flag as Rx partial. 
CHECK_FWD_PACKET(p1, p2, ol_l4_rx_csum_set_partial, [bad_frame], [good_frame]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | grep 'simple match hits'], [0], [dnl - simple match hits: 4 +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | grep 'Simple Match hits'], [0], [dnl + - Simple Match hits: 4 ( 80.0 %) ]) OVS_VSWITCHD_STOP diff --git a/tests/pmd.at b/tests/pmd.at index 8254ac3b0..e8590044a 100644 --- a/tests/pmd.at +++ b/tests/pmd.at @@ -440,20 +440,12 @@ dummy@ovs-dummy: hit:0 missed:0 p0 7/1: (dummy-pmd: n_rxq=4, n_txq=1, numa_id=0) ]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 12], [0], [dnl +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | sed SED_NUMA_CORE_PATTERN \ + | sed '/cycles/d' | sed '/[[Ii]]teration/d' | grep pmd -A 3], [0], [dnl pmd thread numa_id <cleared> core_id <cleared>: - packets received: 0 - packet recirculations: 0 - avg. datapath passes per packet: 0.00 - phwol hits: 0 - mfex opt hits: 0 - simple match hits: 0 - emc hits: 0 - smc hits: 0 - megaflow hits: 0 - avg. subtable lookups per megaflow hit: 0.00 - miss with success upcall: 0 - miss with failed upcall: 0 + + Rx packets: 0 + Tx packets: 0 ]) ovs-appctl time/stop @@ -474,20 +466,23 @@ AT_CHECK([cat ovs-vswitchd.log | filter_flow_install | strip_xout], [0], [dnl recirc_id(0),in_port(1),packet_type(ns=0,id=0),eth(src=50:54:00:00:00:77,dst=50:54:00:00:01:78),eth_type(0x0800),ipv4(frag=no), actions: <del> ]) -AT_CHECK([ovs-appctl dpif-netdev/pmd-stats-show | sed SED_NUMA_CORE_PATTERN | sed '/cycles/d' | grep pmd -A 12], [0], [dnl +AT_CHECK([ovs-appctl dpif-netdev/pmd-perf-show | sed SED_NUMA_CORE_PATTERN \ + | sed '/cycles/d' | sed '/[[Ii]]teration/d' \ + | sed 's/, .* us/, <cleared> us/' \ + | sed 's/[[0-9]]* Kpps/<cleared> Kpps/' | grep pmd -A 12], [0], [dnl pmd thread numa_id <cleared> core_id <cleared>: - packets received: 20 - packet recirculations: 0 - avg. 
datapath passes per packet: 1.00 - phwol hits: 0 - mfex opt hits: 0 - simple match hits: 0 - emc hits: 19 - smc hits: 0 - megaflow hits: 0 - avg. subtable lookups per megaflow hit: 0.00 - miss with success upcall: 1 - miss with failed upcall: 0 + + Datapath passes: 20 (1.00 passes/pkt) + - PHWOL hits: 0 ( 0.0 %) + - MFEX Opt hits: 0 ( 0.0 %) + - Simple Match hits: 0 ( 0.0 %) + - EMC hits: 19 ( 95.0 %) + - SMC hits: 0 ( 0.0 %) + - Megaflow hits: 0 ( 0.0 %, 0.00 subtbl lookups/hit) + - Upcalls: 1 ( 5.0 %, <cleared> us/upcall) + - Lost upcalls: 0 ( 0.0 %) + Tx packets: 20 (<cleared> Kpps) + Tx batches: 20 (1.00 pkts/batch) ]) OVS_VSWITCHD_STOP diff --git a/tests/system-dpdk-offloads.at b/tests/system-dpdk-offloads.at index 81ab89b2f..09bdaf639 100644 --- a/tests/system-dpdk-offloads.at +++ b/tests/system-dpdk-offloads.at @@ -144,8 +144,8 @@ AT_CHECK([ovs-appctl dpctl/dump-flows type=dpdk,partially-offloaded \ in_port(2),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:954, used:0.0s, actions:check_pkt_len(size=200,gt(4),le(5)),3 ]) -AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-stats-show | \ - awk '/phwol hits:/ {sum += $3} END {print sum}') -ge 8]) +AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-perf-show | \ + awk '/PHWOL hits:/ {sum += $4} END {print sum}') -ge 8]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP @@ -216,8 +216,8 @@ OVS_WAIT_UNTIL_EQUAL( in_port(ovs-p0),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:50, bytes:3000, used:0.0s, actions:ovs-p1 in_port(ovs-p1),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:50, bytes:3000, used:0.0s, actions:ovs-p0]) -AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-stats-show | \ - awk '/packets received:/ {sum += $3} END {print sum}') -lt 10]) +AT_CHECK([test $(ovs-appctl dpif-netdev/pmd-perf-show | \ + awk '/Rx packets:/ {sum += $3} END {print sum}') -lt 10]) OVS_TRAFFIC_VSWITCHD_STOP AT_CLEANUP diff --git a/utilities/bugtool/plugins/system-logs/openvswitch.xml b/utilities/bugtool/plugins/system-logs/openvswitch.xml index 
46c731812..0f17add75 100644 --- a/utilities/bugtool/plugins/system-logs/openvswitch.xml +++ b/utilities/bugtool/plugins/system-logs/openvswitch.xml @@ -20,7 +20,7 @@ <directory label="ovsdb-backups" filters="ovs" pattern=".*/conf.db.backup[0-9][^/]*$">/etc/openvswitch</directory> <directory label="ovsdb-backups2" filters="ovs" pattern=".*/conf.db.backup[0-9][^/]*$">/var/lib/openvswitch</directory> <command label="system_memory_status" filters="ovs">df -h</command> - <command label="check_number_of_pmds" filters="ovs">ovs-appctl dpif-netdev/pmd-stats-show | grep pmd</command> + <command label="check_number_of_pmds" filters="ovs">ovs-appctl dpif-netdev/pmd-perf-show | grep pmd</command> <command label="ovs-appctl-vlog-list" filters="ovs">ovs-appctl vlog/list</command> <command label="journalctl" filters="ovs">journalctl</command> <command label="user_limits" filters="ovs">ulimit -a</command>
The 'pmd-perf-show' command provides all the same information and more.
It is also better visually structured and easier to read as a result.

Let's remove the old 'pmd-stats-show' command, as there is no real need
to have two commands reporting the same data.

The only difference until now was that 'pmd-perf-show' didn't provide
information for the "main" thread.  This change makes it report the
statistics for the aggregated "main" thread as well, omitting things
related to CPU cycles, as we can't collect those for threads that are
not pinned.  For the same reason histograms are also always disabled.
Omission is done by checking that the total number of iterations is
zero; the "main" thread doesn't start/end iterations.

The actual unixctl command is preserved undocumented and serves as an
alias for 'pmd-perf-show'.  This should keep old scripts working, as
long as they merely capture the output for humans (or LLMs?) to read.
Note, however, that the exact output format for unixctl commands was
never a guarantee, so scripts that attempt to parse the output may
still break.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---

Note: I believe the change in system-dpdk-offloads.at is correct,
but I didn't run the testsuite, as I have no hardware for it.

 Documentation/intro/install/afxdp.rst    |   2 +-
 Documentation/intro/install/dpdk.rst     |   2 +-
 Documentation/topics/dpdk/bridge.rst     |   4 +-
 Documentation/topics/dpdk/pmd.rst        |   4 -
 NEWS                                     |   4 +
 lib/dpif-netdev-perf.c                   |  39 +++--
 lib/dpif-netdev-perf.h                   |   2 +-
 lib/dpif-netdev-unixctl.man              |  62 ++++---
 lib/dpif-netdev.c                        | 158 ++++--------------
 tests/dpif-netdev.at                     |  20 +--
 tests/pmd.at                             |  47 +++---
 tests/system-dpdk-offloads.at            |   8 +-
 .../plugins/system-logs/openvswitch.xml  |   2 +-
 13 files changed, 136 insertions(+), 218 deletions(-)