[ovs-dev,2/2] alb.at: Increase time/warp.

Message ID 20211123140023.3509644-2-ktraynor@redhat.com
State Accepted
Series [ovs-dev,1/2] alb.at: Check for log from correct line number.

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Kevin Traynor Nov. 23, 2021, 2 p.m. UTC
It seems that on slow system with high concurrency and cpu contention
time/warp is not accurate enough for the ALB unit tests with the minimum
time/warp that was used to hit an amount of events. This results in some
intermittent test failures.

As those tests are just waiting for a certain amount of events to occur
and there is no functional change during that time let's do the time/warp
again with higher values.

With this no failures are seen in several hundred runs.

Fixes: a83a406096e9 ("dpif-netdev: Sync PMD ALB state with user commands.")
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>

---
GHA: https://github.com/kevintraynor/ovs/actions/runs/1494941804
---
 tests/alb.at | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
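
As a rough illustration of what the change from "time/warp 60000 10000" to "time/warp 600000 10000" means, the sketch below computes how many warp steps each invocation takes. It assumes that the two-argument form advances time by the first value in total, in increments of the second value; warp_steps is a hypothetical helper written for this note and is not part of OVS or its test suite.

```sh
#!/bin/sh
# Hypothetical helper (not part of OVS): number of steps taken by
# "ovs-appctl time/warp LARGE_MSECS MSECS", assuming the total warp of
# LARGE_MSECS is applied in increments of MSECS.
warp_steps() {
    large_msecs=$1
    msecs=$2
    echo $(( large_msecs / msecs ))
}

echo "before patch: $(warp_steps 60000 10000) steps of 10000 ms"
echo "after patch:  $(warp_steps 600000 10000) steps of 10000 ms"
```

Under that assumption, the patched tests warp ten times further per check, which gives the awaited log message more opportunities to appear on a contended system.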

Comments

David Marchand Dec. 6, 2021, 5:11 p.m. UTC | #1
On Tue, Nov 23, 2021 at 3:01 PM Kevin Traynor <ktraynor@redhat.com> wrote:
>
> It seems that on slow system with high concurrency and cpu contention
> time/warp is not accurate enough for the ALB unit tests with the minimum
> time/warp that was used to hit an amount of events. This results in some
> intermittent test failures.
>
> As those tests are just waiting for a certain amount of events to occur
> and there is no functional change during that time let's do the time/warp
> again with higher values.
>
> With this no failures are seen in several hundred runs.
>
> Fixes: a83a406096e9 ("dpif-netdev: Sync PMD ALB state with user commands.")
> Reported-by: Ilya Maximets <i.maximets@ovn.org>

Fwiw, I managed to reproduce with below commands (test failed in 7
runs out of 10 on my laptop before patch).

In separate terminals:
$ taskset -c 3 sh -c 'while true; do true; done'
$ taskset -c 3 make -C master check TESTSUITEFLAGS="-d 1026"
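
A failure rate such as "7 runs out of 10" can be collected with a small loop. The following is only a sketch: count_failures is a hypothetical helper, not an OVS tool, and it treats any non-zero exit status of the command as a failed run.

```sh
#!/bin/sh
# Hypothetical helper: run a command N times and print how many runs
# exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Possible use with the command above, while the busy loop keeps the
# same CPU occupied in another terminal:
#   count_failures 10 taskset -c 3 make -C master check TESTSUITEFLAGS="-d 1026"
```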

> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>

Reviewed-by: David Marchand <david.marchand@redhat.com>

I let the test run ~50 times, no issue with patch.

Ilya Maximets Dec. 7, 2021, 2:32 p.m. UTC | #2
On 12/6/21 18:11, David Marchand wrote:
> On Tue, Nov 23, 2021 at 3:01 PM Kevin Traynor <ktraynor@redhat.com> wrote:
>>
>> It seems that on slow system with high concurrency and cpu contention
>> time/warp is not accurate enough for the ALB unit tests with the minimum
>> time/warp that was used to hit an amount of events. This results in some
>> intermittent test failures.
>>
>> As those tests are just waiting for a certain amount of events to occur
>> and there is no functional change during that time let's do the time/warp
>> again with higher values.
>>
>> With this no failures are seen in several hundred runs.
>>
>> Fixes: a83a406096e9 ("dpif-netdev: Sync PMD ALB state with user commands.")
>> Reported-by: Ilya Maximets <i.maximets@ovn.org>
> 
> Fwiw, I managed to reproduce with below commands (test failed in 7
> runs out of 10 on my laptop before patch).
> 
> In separate terminals:
> $ taskset -c 3 sh -c 'while true; do true; done'
> $ taskset -c 3 make -C master check TESTSUITEFLAGS="-d 1026"
> 
>> Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
> 
> Reviewed-by: David Marchand <david.marchand@redhat.com>
> 
> I let the test run ~50 times, no issue with patch.

Thanks, Kevin and David!  Applied.

Best regards, Ilya Maximets.

Patch

diff --git a/tests/alb.at b/tests/alb.at
index 25c91f158..2bef06f39 100644
--- a/tests/alb.at
+++ b/tests/alb.at
@@ -66,5 +66,5 @@  AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
 # 1 pmds 2 rxqs
 get_log_next_line_num
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance nothing to do, not enough non-isolated PMDs or RxQs."])
 
@@ -73,5 +73,5 @@  get_log_next_line_num
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x3])
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "There are 2 pmd threads on numa node"])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance nothing to do, not enough non-isolated PMDs or RxQs."])
 
@@ -79,5 +79,5 @@  OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance
 get_log_next_line_num
 AT_CHECK([ovs-vsctl set interface p0 options:n_rxq=3])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."])
 
@@ -86,10 +86,10 @@  get_log_next_line_num
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1])
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "There are 1 pmd threads on numa node"])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance nothing to do, not enough non-isolated PMDs or RxQs."])
 
 # Same config as last time
 get_log_next_line_num
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance nothing to do, no configuration changes since last check."])
 
@@ -147,5 +147,5 @@  AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group])
 get_log_next_line_num
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."])
 
@@ -153,5 +153,5 @@  get_log_next_line_num
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=cycles])
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'cycles'"])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."])
 
@@ -159,5 +159,5 @@  get_log_next_line_num
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=roundrobin])
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'roundrobin'"])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance nothing to do, pmd-rxq-assign=roundrobin assignment type configured."])
 
@@ -165,5 +165,5 @@  get_log_next_line_num
 AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=group])
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "mode changed to: 'group'"])
-ovs-appctl time/warp 60000 10000
+ovs-appctl time/warp 600000 10000
 OVS_WAIT_UNTIL([tail -n +$LINENUM ovs-vswitchd.log | grep "PMD auto load balance performing dry run."])