From patchwork Mon Apr 16 17:58:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gregory Rose
X-Patchwork-Id: 898848
From: Greg Rose
To: dev@openvswitch.org
Cc: fbl@sysclose.org
Date: Mon, 16 Apr 2018 10:58:53 -0700
Message-Id: <1523901533-3510-1-git-send-email-gvrose8192@gmail.com>
X-Mailer: git-send-email 1.8.3.1
Subject: [ovs-dev] [PATCH] datapath: Prevent panic

On RHEL 7.x kernels we observe a panic induced by a paging error. A
timer kicks off a job that subsequently accesses memory that belonged
to the openvswitch kernel module, after the module has been unloaded,
hence the paging error.

The panic can be induced on any RHEL 7.x kernel with the following
test:

while `true`
do
	make check-kmod TESTSUITEFLAGS="-k \!gre"
done

On the systems I've been testing on it generally takes anywhere from
1 to 15 minutes to reproduce, but never longer than that. Similar
results have been seen by other testers.

This patch does not fix the underlying bug, which still needs to be
investigated and fixed, but it does prevent the panic from occurring.
We would like to prevent customer systems from panicking while we do
further investigation to find the root cause.

Signed-off-by: Greg Rose
---
 datapath/datapath.c         | 10 ++++++++++
 tests/system-kmod-macros.at |  1 +
 utilities/ovs-lib.in        |  1 +
 3 files changed, 12 insertions(+)

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 3ea240a..43f0d74 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -2478,6 +2478,16 @@ error:
 
 static void dp_cleanup(void)
 {
+#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)
+	/* On RHEL 7.x kernels we hit a kernel paging error without
+	 * this barrier and subsequent hefty delay. A process will
+	 * attempt to access openvswitch memory after it has been
+	 * unloaded. Further debugging is needed on that but for
+	 * now let's not let customer machines panic.
+	 */
+	rcu_barrier();
+	msleep(3000);
+#endif
 	dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
 	ovs_netdev_exit();
 	unregister_netdevice_notifier(&ovs_dp_device_notifier);
diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index f23a406..2b9b691 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
                on_exit 'modprobe -q -r mod'
               ])
    on_exit 'ovs-dpctl del-dp ovs-system'
+   on_exit 'ovs-appctl dpctl/flush-conntrack'
    _OVS_VSWITCHD_START([])
    dnl Add bridges, ports, etc.
    AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2])
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 4dc3151..4c3ad0f 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -616,6 +616,7 @@ force_reload_kmod () {
     for dp in `ovs-dpctl dump-dps`; do
         action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
     done
+    action "ovs-appctl dpctl/flush-conntrack"
 
     for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
         action "Removing $vport module" rmmod $vport
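
As a rough illustration of how the version guard in the dp_cleanup()
hunk evaluates, here is a small userspace sketch. It assumes that
RHEL_RELEASE_VERSION(a, b) expands to (((a) << 8) + (b)), as RHEL
kernel headers define it. needs_unload_workaround() is a hypothetical
helper used only for illustration and is not part of the patch.

```c
/* Assumption: mirrors the macro that RHEL kernels provide in
 * <linux/version.h>. Defined here only so the sketch builds in
 * userspace.
 */
#ifndef RHEL_RELEASE_VERSION
#define RHEL_RELEASE_VERSION(a, b) (((a) << 8) + (b))
#endif

/* Hypothetical helper (not in the patch): returns nonzero when a
 * kernel with the given RHEL release code would take the
 * rcu_barrier() + msleep(3000) path guarded by
 * "#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)".
 */
static int needs_unload_workaround(int rhel_release_code)
{
	return rhel_release_code < RHEL_RELEASE_VERSION(8, 0);
}
```

Under that assumption, needs_unload_workaround(RHEL_RELEASE_VERSION(7, 5))
is nonzero and needs_unload_workaround(RHEL_RELEASE_VERSION(8, 0)) is
zero, so only release codes below 8.0 take the barrier-and-sleep path.
How the guard behaves on non-RHEL kernels depends on fallback
definitions that this patch does not show.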