From patchwork Wed Apr 10 01:21:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Han Zhou
X-Patchwork-Id: 1083055
From: Han Zhou
To: dev@openvswitch.org
Date: Tue, 9 Apr 2019 18:21:20 -0700
Message-Id: <1554859282-15144-5-git-send-email-hzhou8@ebay.com>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1554859282-15144-1-git-send-email-hzhou8@ebay.com>
References: <1554859282-15144-1-git-send-email-hzhou8@ebay.com>
Subject: [ovs-dev] [PATCH 5/7] ovsdb raft: Test cases for cluster failures when there are pending transactions.

From: Han Zhou

Implement test cases for the failure scenarios in which a node crashes
while client transactions are pending. The test cases cover different
combinations of conditions, using the test commands and options for
cluster mode that were added in previous patches. The conditions include:

- Connected node from which the client transaction is executed: leader,
  follower
- Crashed node: leader, follower that is connected, or the other follower
- Crash point:
    - For leader:
        - before/after receiving execute_command_request
        - before/after sending append_request
        - before/after sending execute_command_reply
    - For follower:
        - before/after sending execute_command_request
        - after receiving append_request

There are 16 test cases in total. 9 of them are deliberately skipped to
avoid CI failures, because they expose bugs that are not fixed yet. They
will be enabled in coming patches when the corresponding bugs are fixed.

Signed-off-by: Han Zhou
---
 tests/ovsdb-cluster.at | 173 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 173 insertions(+)

diff --git a/tests/ovsdb-cluster.at b/tests/ovsdb-cluster.at
index 5550a19..4e88766 100644
--- a/tests/ovsdb-cluster.at
+++ b/tests/ovsdb-cluster.at
@@ -62,6 +62,179 @@ m4_define([OVSDB_CHECK_EXECUTION],
    AT_CLEANUP])
 EXECUTION_EXAMPLES
 
+
+OVS_START_SHELL_HELPERS
+# ovsdb_cluster_failure_test REMOTE_1 REMOTE_2 CRASH_NODE CRASH_COMMAND [NEW_LEADER]
+ovsdb_cluster_failure_test () {
+    # Initial state: s1 is leader, s2 and s3 are followers
+    remote_1=$1
+    remote_2=$2
+    crash_node=$3
+    crash_command=$4
+    if test "$crash_node" = "1"; then
+        new_leader=$5
+    fi
+
+    cp $top_srcdir/ovn/ovn-nb.ovsschema schema
+    schema=`ovsdb-tool schema-name schema`
+    AT_CHECK([ovsdb-tool '-vPATTERN:console:%c|%p|%m' create-cluster s1.db schema unix:s1.raft], [0], [], [dnl
+ovsdb|WARN|schema: changed 2 columns in 'OVN_Northbound' database from ephemeral to persistent, including 'status' column in 'Connection' table, because clusters do not support ephemeral columns
+])
+
+    n=3
+    join_cluster() {
+        local i=$1
+        others=
+        for j in `seq 1 $n`; do
+            if test $i != $j; then
+                others="$others unix:s$j.raft"
+            fi
+        done
+        AT_CHECK([ovsdb-tool join-cluster s$i.db $schema unix:s$i.raft $others])
+    }
+    start_server() {
+        local i=$1
+        printf "\ns$i: starting\n"
+        AT_CHECK([ovsdb-server -vjsonrpc -vconsole:off -vsyslog:off --detach --no-chdir --log-file=s$i.log --pidfile=s$i.pid --unixctl=s$i --remote=punix:s$i.ovsdb s$i.db])
+    }
+    connect_server() {
+        local i=$1
+        printf "\ns$i: waiting to connect to storage\n"
+        AT_CHECK([ovsdb_client_wait --log-file=connect$i.log unix:s$i.ovsdb $schema connected])
+    }
+    cid=`ovsdb-tool db-cid s1.db`
+    for i in `seq 2 $n`; do join_cluster $i; done
+
+    on_exit 'kill `cat *.pid`'
+    for i in `seq $n`; do start_server $i; done
+    for i in `seq $n`; do connect_server $i; done
+
+    export OVN_NB_DB=unix:s$remote_1.ovsdb,unix:s$remote_2.ovsdb
+
+    # To ensure that $new_leader becomes the new leader, delay the election
+    # timer on the other follower.
+    if test -n "$new_leader"; then
+        if test "$new_leader" = "2"; then
+            delay_election_node=3
+        else
+            delay_election_node=2
+        fi
+        AT_CHECK([ovs-appctl -t "`pwd`"/s$delay_election_node cluster/failure-test delay-election], [0], [ignore])
+    fi
+    AT_CHECK([ovs-appctl -t "`pwd`"/s$crash_node cluster/failure-test $crash_command], [0], [ignore])
+    AT_CHECK([ovn-nbctl -v --timeout=10 --no-leader-only --no-shuffle-remotes create logical_switch name=ls1], [0], [ignore], [ignore])
+
+    # Make sure that the node really crashed.
+    AT_CHECK([ls s$crash_node.ovsdb], [2], [ignore], [ignore])
+    # XXX: Client will fail if remotes contain a unix socket that doesn't exist (killed).
+    if test "$remote_1" = "$crash_node"; then
+        export OVN_NB_DB=unix:s$remote_2.ovsdb
+    fi
+    AT_CHECK([ovn-nbctl --no-leader-only ls-list | awk '{ print $2 }'], [0], [(ls1)
+])
+}
+OVS_END_SHELL_HELPERS
+AT_BANNER([OVSDB - cluster failure with pending transaction])
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash before sending appendReq, follower-2 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: fix bug before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 3 1 crash-before-sending-append-request 2
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash before sending appendReq, follower-3 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 2 3 1 crash-before-sending-append-request 3
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash before sending execRep, follower-2 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: fix bug before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 3 1 crash-before-sending-execute-command-reply 2
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash before sending execRep, follower-3 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: fix bug before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 3 1 crash-before-sending-execute-command-reply 3
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash after sending execRep, follower-2 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: fix bug before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 3 1 crash-after-sending-execute-command-reply 2
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, leader crash after sending execRep, follower-3 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 2 3 1 crash-after-sending-execute-command-reply 3
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on leader, leader crash before sending appendReq, follower-2 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: fix bug before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 1 2 1 crash-before-sending-append-request 2
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on leader, leader crash before sending appendReq, follower-3 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 1 2 1 crash-before-sending-append-request 3
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on leader, leader crash after sending appendReq, follower-2 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: Detect and skip repeated transaction before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 1 2 1 crash-after-sending-append-request 2
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on leader, leader crash after sending appendReq, follower-3 becomes leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: Detect and skip repeated transaction before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 1 2 1 crash-after-sending-append-request 3
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, follower-2 crash before sending execReq, reconnect to follower-3])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 2 3 2 crash-before-sending-execute-command-request
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, follower-2 crash before sending execReq, reconnect to leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 2 1 2 crash-before-sending-execute-command-request
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, follower-2 crash after sending execReq, reconnect to follower-3])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: Detect and skip repeated transaction before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 3 2 crash-after-sending-execute-command-request
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, follower-2 crash after sending execReq, reconnect to leader])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+# XXX: Detect and skip repeated transaction before enabling this test
+AT_CHECK([exit 77])
+ovsdb_cluster_failure_test 2 1 2 crash-after-sending-execute-command-request
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on leader, follower-2 crash after receiving appendReq for the update])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 1 1 2 crash-after-receiving-append-request-update
+AT_CLEANUP
+
+AT_SETUP([OVSDB cluster - txn on follower-2, follower-3 crash after receiving appendReq for the update])
+AT_KEYWORDS([ovsdb server negative unix cluster pending-txn])
+ovsdb_cluster_failure_test 2 2 3 crash-after-receiving-append-request-update
+AT_CLEANUP
+
+
 AT_BANNER([OVSDB - cluster tests])
 
 # Torture test.
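
Note for readers: two pieces of logic in ovsdb_cluster_failure_test are easy to misread in diff form. One is how join_cluster builds the list of peer Raft addresses. The other is how the helper picks which follower gets the delay-election command so that the requested node wins the election. The following is a minimal standalone sketch of both in plain POSIX sh. The function names raft_peers and delay_election_node are hypothetical and do not appear in the patch, and the sketch assumes the same three-node layout (s1 as the initial leader, s2 and s3 as followers).

```shell
#!/bin/sh
# Illustrative sketch only: raft_peers and delay_election_node are
# hypothetical names, not functions from the patch.

# Print the Raft addresses of every server except server $1 in a cluster
# of $2 servers, in the unix:sN.raft form used by join_cluster.
raft_peers() {
    self=$1
    n=$2
    peers=
    j=1
    while [ "$j" -le "$n" ]; do
        if [ "$j" != "$self" ]; then
            peers="$peers unix:s$j.raft"
        fi
        j=$((j + 1))
    done
    # Strip the leading space left by the first append.
    echo "${peers# }"
}

# Given the follower (2 or 3) that should become leader after s1 crashes,
# print the other follower, whose election timer has to be delayed.
delay_election_node() {
    if [ "$1" = "2" ]; then
        echo 3
    else
        echo 2
    fi
}

raft_peers 2 3
delay_election_node 2
```

In the patch itself the peer list is passed to ovsdb-tool join-cluster, and the selected node receives cluster/failure-test delay-election through ovs-appctl. The sketch only reproduces the selection logic, not those commands.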