diff mbox

[ovs-dev,v2] testsuite: exit gracefully if it fails.

Message ID 20170609155857.20374-1-fbl@redhat.com
State Accepted
Headers show

Commit Message

Flavio Leitner June 9, 2017, 3:58 p.m. UTC
The daemon is killed leaving resources behind when a test fails.
This fixes to first signal the daemon to exit gracefully.

Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
Suggested-by: Joe Stringer <joe@ovn.org>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
---
 tests/ofproto-macros.at |  2 +-
 tests/ovs-macros.at     | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

Changelog:
- V2
  - Using a function instead of a single line as suggested by Ben.

Comments

Joe Stringer June 12, 2017, 11:21 p.m. UTC | #1
On 9 June 2017 at 08:58, Flavio Leitner <fbl@redhat.com> wrote:
> The daemon is killed leaving resources behind when a test fails.
> This fixes to first signal the daemon to exit gracefully.
>
> Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> Suggested-by: Joe Stringer <joe@ovn.org>
> Co-authored-by: Ben Pfaff <blp@ovn.org>
> Signed-off-by: Flavio Leitner <fbl@redhat.com>
> ---

Thanks for the re-spin. Ben, as co-author were you planning to commit this?

Acked-by: Joe Stringer <joe@ovn.org>

As a side note, in the case that the "kill" gets hit, there will be
state left lying around such as tap devices. To truly handle that, we
would have to consider keeping track of all of the bridges and devices
configured during the tests (or prefixing them with a common prefix
like "ovs_" or "at_") then blindly clearing these out in this failure
case. I don't know how big of a problem this is in practice, since it
requires ovs-vswitchd to become unresponsive.
Ben Pfaff June 13, 2017, 4:33 a.m. UTC | #2
On Mon, Jun 12, 2017 at 04:21:04PM -0700, Joe Stringer wrote:
> On 9 June 2017 at 08:58, Flavio Leitner <fbl@redhat.com> wrote:
> > The daemon is killed leaving resources behind when a test fails.
> > This fixes to first signal the daemon to exit gracefully.
> >
> > Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> > Suggested-by: Joe Stringer <joe@ovn.org>
> > Co-authored-by: Ben Pfaff <blp@ovn.org>
> > Signed-off-by: Flavio Leitner <fbl@redhat.com>
> > ---
> 
> Thanks for the re-spin. Ben, as co-author were you planning to commit this?
> 
> Acked-by: Joe Stringer <joe@ovn.org>
> 
> As a side note, in the case that the "kill" gets hit, there will be
> state left lying around such as tap devices. To truly handle that, we
> would have to consider keeping track of all of the bridges and devices
> configured during the tests (or prefixing them with a common prefix
> like "ovs_" or "at_") then blindly clearing these out in this failure
> case. I don't know how big of a problem this is in practice, since it
> requires ovs-vswitchd to become unresponsive.

It sounds like it would be a good idea to prefix them, at least in the
long run, but I understand if this is difficult to do quickly.
Ben Pfaff June 13, 2017, 4:36 a.m. UTC | #3
On Mon, Jun 12, 2017 at 09:33:50PM -0700, Ben Pfaff wrote:
> On Mon, Jun 12, 2017 at 04:21:04PM -0700, Joe Stringer wrote:
> > On 9 June 2017 at 08:58, Flavio Leitner <fbl@redhat.com> wrote:
> > > The daemon is killed leaving resources behind when a test fails.
> > > This fixes to first signal the daemon to exit gracefully.
> > >
> > > Fixes: 0f28164be02ac ("netdev-linux: make tap devices persistent")
> > > Suggested-by: Joe Stringer <joe@ovn.org>
> > > Co-authored-by: Ben Pfaff <blp@ovn.org>
> > > Signed-off-by: Flavio Leitner <fbl@redhat.com>
> > > ---
> > 
> > Thanks for the re-spin. Ben, as co-author were you planning to commit this?
> > 
> > Acked-by: Joe Stringer <joe@ovn.org>

I applied this to master.
diff mbox

Patch

diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index faff5b0..30a8b21 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -322,7 +322,7 @@  m4_define([_OVS_VSWITCHD_START],
    dnl Start ovs-vswitchd.
    AT_CHECK([ovs-vswitchd $1 --detach --no-chdir --pidfile --log-file -vvconn -vofproto_dpif -vunixctl], [0], [], [stderr])
    AT_CAPTURE_FILE([ovs-vswitchd.log])
-   on_exit "kill `cat ovs-vswitchd.pid`"
+   on_exit "kill_ovs_vswitchd `cat ovs-vswitchd.pid`"
    AT_CHECK([[sed < stderr '
 /ovs_numa|INFO|Discovered /d
 /vlog|INFO|opened log file/d
diff --git a/tests/ovs-macros.at b/tests/ovs-macros.at
index 047d0dd..dbce0a5 100644
--- a/tests/ovs-macros.at
+++ b/tests/ovs-macros.at
@@ -115,6 +115,39 @@  parent_pid () {
     fi
 }
 
+# kill_ovs_vswitchd [PID]
+#
+# Signal the ovs-vswitchd daemon to exit gracefully and wait for it to
+# terminate or kill it if that takes too long.
+#
+# It is used to cleanup all sorts of tests and results. It can't assume
+# any state, including the availability of PID file which can be provided.
+kill_ovs_vswitchd () {
+    # Use provided PID or save the current PID if available.
+    TMPPID=$1
+    if test -z "$TMPPID"; then
+        TMPPID=$(cat $OVS_RUNDIR/ovs-vswitchd.pid 2>/dev/null)
+    fi
+
+    # Tell the daemon to terminate gracefully
+    ovs-appctl --timeout=10 -t ovs-vswitchd exit --cleanup 2>/dev/null
+
+    # Nothing else to be done if there is no PID
+    test -z "$TMPPID" && return
+
+    for i in 1 2 3 4 5 6 7 8 9; do
+        # Check if the daemon is alive.
+        kill -0 $TMPPID 2>/dev/null || return
+
+        # Fallback to whole number since POSIX doesn't require
+        # fractional times to work.
+        sleep 0.1 || sleep 1
+    done
+
+    # Make sure it is terminated.
+    kill $TMPPID
+}
+
 # Normalize the output of 'wc' to match POSIX.
 # POSIX says 'wc' should print "%d %d %d", but GNU prints "%7d %7d %7d".
 # POSIX says 'wc -l' should print "%d %s", but BSD prints "%8d".