Message ID | cb6f36ca2ce674d7fb189e2968d9b551ea8716b6.1493306656.git.tredaelli@redhat.com |
---|---|
State | Superseded |
Headers | show |
On 05/03/2017 10:28 PM, Ben Pfaff wrote: > On Thu, Apr 27, 2017 at 05:24:22PM +0200, Timothy Redaelli wrote: >> Port commit a6d1a2997db4: >> ofproto.at: Workaround a race >> >> While a barrier serializes requests from the same connection, >> it doesn't wait for requests from other connections to the switch. >> Replace the barrier with infamous "sleep 1" to workaround the problem. >> >> to the following tests: >> "ofproto - asynchronous message control (OpenFlow 1.0)", >> "ofproto - asynchronous message control (OpenFlow 1.3)", >> "ofproto - asynchronous message control (OpenFlow 1.4)" and >> "ofproto - asynchronous message control (OpenFlow 1.5)" >> >> Sometimes one of these tests fails because the OFPT_BARRIER_REPLY is >> printed before the other message we expect to have. >> >> Suggested-by: Lance Richardson <lrichard@redhat.com> >> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> > > Thanks for working to make the tests more reliable. > > Adding "sleep 1" makes tests slower, especially when it's in a loop as > it is here. > > What if we instead make the test wait until monitor.log grows large > enough? We already have OVS_WAIT_UNTIL, which waits only a short time > when it can. > > What do you think of this? The problem is not that the logfile is not large enough, the problem is that we have "reversed" output: --- expout 2017-05-04 06:22:03.352044646 -0400 +++ /root/ovs/tests/testsuite.dir/at-groups/944/stdout 2017-05-04 06:22:03.386044787 -0400 @@ -5,5 +5,5 @@ state: LIVE speed: 0 Mbps now, 0 Mbps max OFPT_FLOW_REMOVED (OF1.4): reason=delete table_id=0 -OFPT_FLOW_REMOVED (OF1.4): reason=group_delete table_id=0 OFPT_BARRIER_REPLY (OF1.4): +OFPT_FLOW_REMOVED (OF1.4): reason=group_delete table_id=0 Complete logs at https://gist.github.com/drizzt/dd5a3c4b2bea0d83e756f26c88fda418
On Thu, May 04, 2017 at 12:37:41PM +0200, Timothy M. Redaelli wrote: > On 05/03/2017 10:28 PM, Ben Pfaff wrote: > > On Thu, Apr 27, 2017 at 05:24:22PM +0200, Timothy Redaelli wrote: > >> Port commit a6d1a2997db4: > >> ofproto.at: Workaround a race > >> > >> While a barrier serializes requests from the same connection, > >> it doesn't wait for requests from other connections to the switch. > >> Replace the barrier with infamous "sleep 1" to workaround the problem. > >> > >> to the following tests: > >> "ofproto - asynchronous message control (OpenFlow 1.0)", > >> "ofproto - asynchronous message control (OpenFlow 1.3)", > >> "ofproto - asynchronous message control (OpenFlow 1.4)" and > >> "ofproto - asynchronous message control (OpenFlow 1.5)" > >> > >> Sometimes one of these tests fails because the OFPT_BARRIER_REPLY is > >> printed before the other message we expect to have. > >> > >> Suggested-by: Lance Richardson <lrichard@redhat.com> > >> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> > > > > Thanks for working to make the tests more reliable. > > > > Adding "sleep 1" makes tests slower, especially when it's in a loop as > > it is here. > > > > What if we instead make the test wait until monitor.log grows large > > enough? We already have OVS_WAIT_UNTIL, which waits only a short time > > when it can. > > > > What do you think of this? > > The problem is not that the logfile is not large enough, the problem is > that we have "reversed" output: > > --- expout 2017-05-04 06:22:03.352044646 -0400 > +++ /root/ovs/tests/testsuite.dir/at-groups/944/stdout 2017-05-04 > 06:22:03.386044787 -0400 > @@ -5,5 +5,5 @@ > state: LIVE > speed: 0 Mbps now, 0 Mbps max > OFPT_FLOW_REMOVED (OF1.4): reason=delete table_id=0 > -OFPT_FLOW_REMOVED (OF1.4): reason=group_delete table_id=0 > OFPT_BARRIER_REPLY (OF1.4): > +OFPT_FLOW_REMOVED (OF1.4): reason=group_delete table_id=0 > Right, and the patch I'm suggesting also drops the OFPT_BARRIER_REPLY. The only difference from your patch is that mine generally waits less time.
diff --git a/tests/ofproto.at b/tests/ofproto.at index 4431c15..69916a4 100644 --- a/tests/ofproto.at +++ b/tests/ofproto.at @@ -3212,8 +3212,7 @@ udp,vlan_tci=0x0000,dl_src=00:26:b9:8c:b0:f9,dl_dst=00:25:83:df:b4:00,nw_src=172 fi AT_FAIL_IF([test X"$1" != X]) - ovs-appctl -t ovs-ofctl ofctl/barrier - echo >>expout "OFPT_BARRIER_REPLY:" + sleep 1 AT_CHECK( [[sed ' @@ -3435,8 +3434,7 @@ udp,vlan_tci=0x0000,dl_src=00:26:b9:8c:b0:f9,dl_dst=00:25:83:df:b4:00,nw_src=172 AT_FAIL_IF([test X"$1" != X]) - ovs-appctl -t ovs-ofctl ofctl/barrier - echo >>expout "OFPT_BARRIER_REPLY (OF1.3):" + sleep 1 AT_CHECK( [[sed ' @@ -3645,8 +3643,7 @@ table_desc:- AT_FAIL_IF([test X"$1" != X]) - ovs-appctl -t ovs-ofctl ofctl/barrier - echo >>expout "OFPT_BARRIER_REPLY (OF1.4):" + sleep 1 AT_CHECK( [[sed ' @@ -3734,8 +3731,7 @@ OFPT_PORT_STATUS (OF1.5): MOD: 2(test): addr:aa:55:aa:55:00:0x AT_FAIL_IF([test X"$1" != X]) - ovs-appctl -t ovs-ofctl ofctl/barrier - echo >>expout "OFPT_BARRIER_REPLY (OF1.5):" + sleep 1 AT_CHECK( [[sed '
Port commit a6d1a2997db4: ofproto.at: Workaround a race While a barrier serializes requests from the same connection, it doesn't wait for requests from other connections to the switch. Replace the barrier with infamous "sleep 1" to workaround the problem. to the following tests: "ofproto - asynchronous message control (OpenFlow 1.0)", "ofproto - asynchronous message control (OpenFlow 1.3)", "ofproto - asynchronous message control (OpenFlow 1.4)" and "ofproto - asynchronous message control (OpenFlow 1.5)" Sometimes one of these tests fails because the OFPT_BARRIER_REPLY is printed before the other message we expect to have. Suggested-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> --- tests/ofproto.at | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)