From patchwork Fri Sep 18 22:33:21 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gurucharan Shetty
X-Patchwork-Id: 1367084
From: Gurucharan Shetty
To: dev@openvswitch.org
Date: Fri, 18 Sep 2020 15:33:21 -0700
Message-Id: <1600468401-3300-1-git-send-email-guru@ovn.org>
X-Mailer: git-send-email 1.9.1
Subject: [ovs-dev] [PATCH] ovs-lib: Handle daemon segfaults
 during exit.

Currently, we terminate a daemon by trying "ovs-appctl exit",
"SIGTERM" and finally "SIGKILL".  But the logic fails if the daemon
crashes (segfaults) during "ovs-appctl exit".  The monitor will
automatically restart the daemon with a new pid.  The current logic
of checking the non-existence of the old pid succeeds and we proceed
with the assumption that the daemon is dead.

This is a problem during OVS upgrades as we will continue to run
the older version of OVS.

With this commit, we take care of this situation.  If there is a
segfault, the pidfile is not deleted.  So, we wait a little to give
time for the monitor to restart the daemon (which is usually
instantaneous) and then re-read the pidfile.

VMware-BZ: #2633995
Signed-off-by: Gurucharan Shetty
Acked-by: Yi-Hung Wei
---
 utilities/ovs-lib.in | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index d646b44..f7e9756 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -255,20 +255,36 @@ stop_daemon () {
             if version_geq "$version" "2.5.90"; then
                 actions="$graceful $actions"
             fi
+            actiontype=""
             for action in $actions; do
                 if pid_exists "$pid" >/dev/null 2>&1; then :; else
-                    return 0
+                    # pid does not exist.
+                    if [ -n "$actiontype" ]; then
+                        return 0
+                    fi
+                    # But, does the file exist? We may have had a daemon
+                    # segfault with `ovs-appctl exit`. Check one more time
+                    # before deciding that the daemon is dead.
+                    [ -e "$rundir/$1.pid" ] && sleep 2 && pid=`cat "$rundir/$1.pid"` 2>/dev/null
+                    if pid_exists "$pid" >/dev/null 2>&1; then :; else
+                        return 0
+                    fi
                 fi
                 case $action in
                     EXIT)
                         action "Exiting $1 ($pid)" \
                             ${bindir}/ovs-appctl -T 1 -t $rundir/$1.$pid.ctl exit $2
+                        # The above command could have resulted in delayed
+                        # daemon segfault. And if a monitor is running, it
+                        # would restart the daemon giving it a new pid.
                         ;;
                     TERM)
                         action "Killing $1 ($pid)" kill $pid
+                        actiontype="force"
                         ;;
                     KILL)
                         action "Killing $1 ($pid) with SIGKILL" kill -9 $pid
+                        actiontype="force"
                         ;;
                     FAIL)
                         log_failure_msg "Killing $1 ($pid) failed"