From patchwork Wed Aug 14 15:47:07 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michele Baldessari
X-Patchwork-Id: 1147110
From: Michele Baldessari
To: dev@openvswitch.org
Cc: Michele Baldessari
Date: Wed, 14 Aug 2019 17:47:07 +0200
Message-Id: <20190814154707.15023-1-michele@acksyn.org>
X-Mailer: git-send-email 2.21.0
Subject: [ovs-dev] [PATCH v2] Make pid_exists() more robust against empty pid argument

In some of our destructive testing of ovn-dbs inside containers managed
by pacemaker, we reached a situation where /var/run/openvswitch had
empty .pid files. The current code does not deal well with them:
pidfile_is_running() returns true in such a case, which confuses the
OCF resource agent.

- Before this change:
  Inside a container run:
    killall ovsdb-server;
    echo -n '' > /var/run/openvswitch/ovnnb_db.pid; echo -n '' > /var/run/openvswitch/ovnsb_db.pid

  The cluster is then never able to recover: it believes the ovn
  processes are running when they really aren't, and eventually just
  fails:
   podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
     ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller-0
     ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Stopped controller-1
     ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave controller-2

Let's make sure pid_exists() returns false when the pid is an empty
string.

- After this change the cluster is able to recover from this state and
  correctly start the resource:
   podman container set: ovn-dbs-bundle [192.168.24.1:8787/rhosp15/openstack-ovn-northd:pcmklatest]
     ovn-dbs-bundle-0 (ocf::ovn:ovndb-servers): Master controller-0
     ovn-dbs-bundle-1 (ocf::ovn:ovndb-servers): Slave controller-1
     ovn-dbs-bundle-2 (ocf::ovn:ovndb-servers): Slave controller-2

Fixes: 3028ce2595c8 ("ovs-lib: Allow "status" command to work as non-root.")
Signed-off-by: Michele Baldessari
---
v1 -> v2
========
- Implemented Ilya's suggestion and moved the check from
  pidfile_is_running() to pid_exists(), then re-ran my tests
---
 utilities/ovs-lib.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index fa840ec637f5..dc485413ef0c 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -127,7 +127,7 @@ fi
 pid_exists () {
     # This is better than "kill -0" because it doesn't require permission to
     # send a signal (so daemon_status in particular works as non-root).
-    test -d /proc/"$1"
+    test -n "$1" && test -d /proc/"$1"
 }
 
 pid_comm_check () {
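
For anyone wanting to see why the old check misfires: with an empty
argument, /proc/"$1" expands to plain /proc/, which is a directory on any
Linux system with procfs mounted, so test -d succeeds. The sketch below
is only an illustration and is not part of the patch. The names
pid_exists_old, pid_exists_new and describe are made up for this example;
the two function bodies are copied from the removed and added lines of
the hunk above.

```shell
#!/bin/sh
# Illustration only: these function names are hypothetical and do not
# exist in ovs-lib. The bodies mirror the "-" and "+" lines of the hunk.

pid_exists_old () {
    # With an empty "$1" this tests /proc/ itself.
    test -d /proc/"$1"
}

pid_exists_new () {
    # Reject an empty pid before looking at /proc.
    test -n "$1" && test -d /proc/"$1"
}

# Hypothetical helper: run a command and print "yes" or "no"
# depending on its exit status.
describe () {
    if "$@"; then echo yes; else echo no; fi
}

old_empty=$(describe pid_exists_old "")
new_empty=$(describe pid_exists_new "")
new_self=$(describe pid_exists_new "$$")

echo "old check, empty pid: $old_empty"
echo "new check, empty pid: $new_empty"
echo "new check, own pid:   $new_self"
```

On a host with /proc mounted, the old check should report the empty pid
as existing while the new one should not; the new check should still
accept the pid of the running shell.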