diff mbox series

[ovs-dev] dpif-netdev: not skip cycle metric for no-packet rxqs

Message ID 1660720255-25371-1-git-send-email-lic121@chinatelecom.cn
State Rejected
Headers show
Series [ovs-dev] dpif-netdev: not skip cycle metric for no-packet rxqs | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation fail test: fail

Commit Message

Cheng Li Aug. 17, 2022, 7:10 a.m. UTC
In PMD, cycle cost for each rxq is saved so that we know each rxq's
load. But rxq that doesn't receive a packet is skipped. In fact,
polling no-packet rxq costs cycles as well. In my test, 100w pps
rxq(vhostuser) cost cycles 87,553,850,086 while no-packet rxq costs
cycles 353,402,306.

In asign mode "group", ovs always pick the lowest load pmd. If we
have too many(let's say 100) no-packet rxqs, these rxqs will be
asigned to the same pmd. Because no load increase when a no-packet
rxq to pmd.

To avoid this, this patch count cycles for no-packet rxq as well.

Signed-off-by: lic121 <lic121@chinatelecom.cn>
---
 lib/dpif-netdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Kevin Traynor Aug. 25, 2022, 12:34 p.m. UTC | #1
Hi,

On 17/08/2022 08:10, lic121 wrote:
> In PMD, cycle cost for each rxq is saved so that we know each rxq's
> load. But rxq that doesn't receive a packet is skipped. In fact,
> polling no-packet rxq costs cycles as well. In my test, 100w pps
> rxq(vhostuser) cost cycles 87,553,850,086 while no-packet rxq costs
> cycles 353,402,306.
> 
> In asign mode "group", ovs always pick the lowest load pmd. If we
> have too many(let's say 100) no-packet rxqs, these rxqs will be
> asigned to the same pmd. Because no load increase when a no-packet
> rxq to pmd.
> 

You are correct that the estimated pmd load would not be increased when 
rxqs without measured load are assigned. For this reason rxqs without 
measured load are not assigned to the pmd with lowest estimated load.

They are assigned to whichever (available) pmd has the lowest number of 
rxqs assigned. Assigning this rxq will increase the number of rxqs on 
that pmd and that will be factored in when deciding where to assign the 
next rxq with no measured cycles.

The decision is made here:
https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L5760-L5766

There is also a few lines about it in this article:
https://developers.redhat.com/articles/2021/11/19/improve-multicore-scaling-open-vswitch-dpdk#the_group_rxq_to_pmd_assignment_type

If you are seeing a case where rxqs without load are not being assigned 
in the intended way as above, please give more details.

thanks,
Kevin.

> To avoid this, this patch count cycles for no-packet rxq as well.
> 
> Signed-off-by: lic121 <lic121@chinatelecom.cn>
> ---
>   lib/dpif-netdev.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a45b460..b30a098 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -5335,8 +5335,8 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
>   
>           dp_netdev_pmd_flush_output_packets(pmd, false);
>       } else {
> -        /* Discard cycles. */
> -        cycle_timer_stop(&pmd->perf_stats, &timer);
> +        cycles = cycle_timer_stop(&pmd->perf_stats, &timer);
> +        dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles);
>           if (error != EAGAIN && error != EOPNOTSUPP) {
>               static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>
diff mbox series

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a45b460..b30a098 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5335,8 +5335,8 @@  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
 
         dp_netdev_pmd_flush_output_packets(pmd, false);
     } else {
-        /* Discard cycles. */
-        cycle_timer_stop(&pmd->perf_stats, &timer);
+        cycles = cycle_timer_stop(&pmd->perf_stats, &timer);
+        dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles);
         if (error != EAGAIN && error != EOPNOTSUPP) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);