From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Cc: gpiccoli@canonical.com
Subject: [SRU X][PATCH v2 0/3] Add kernel parameter 'pci=clearmsi' to clear MSI(X)s early on boot
Date: Thu, 8 Nov 2018 12:38:48 -0200
Message-Id: <20181108143851.15758-1-mfo@canonical.com>
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 994913

BugLink: https://bugs.launchpad.net/bugs/1797990

(Note: the patch sets for later releases will be sent out shortly
today. All patch sets are logically identical; the only differences
among releases are context lines.)

[Changelog]

 * v2:
   - Reorder patch 1 as 3 to allow for the next change:
   - Gate the bus-scan differences with the cmdline option
     (patch 3 only). Now all functional changes are gated.

[Impact]

 * A kexec/crash kernel might get stuck and fail to boot
   (for a crash kernel, kdump fails to collect a crashdump)
   if a PCI device is buggy/stuck/looping and triggers a
   continuous flood of MSI(X) interrupts (that the kernel
   does not yet know about).

 * This fix made it possible to obtain crashdumps when debugging
   a heavy-load scenario, in which a heavily loaded network
   adapter wouldn't stop triggering MSI-X interrupts even after
   panic()->kdump kicked in.

 * This fix disables MSI(X) in all PCI devices on early boot
   (this is OK as it's (re-)enabled normally later) when a kernel
   cmdline parameter is given (disabled by default).

[Test Case]

 * A synthetic test-case is not yet available; however,
   this particular system/workload triggered the problem
   consistently, and it was used for development/testing.

 * We'll update this bug once a synthetic test-case is
   available; we're working on patching QEMU for this.

 * With the option enabled, the kernel log shows the quirk:

   $ dmesg | grep 'Clearing MSI'
   [    0.000000] Clearing MSI/MSI-X enable bits early in boot (quirk)

 * The output of 'dmesg -t | sort' has been compared between
   option disabled/enabled in normal boot & kexec modes, and only
   expected differences were found (MHz, PIDs, MIPS).

[Regression Potential]

 * The potential area for regressions is early boot, particularly
   the effects of applying quirks during the PCI bus scan, which
   is changed/broader with these patches.

 * However, all quirks are applied based on PCI ID matching,
   so a quirk would only apply on a newly scanned bus if it
   actually targets a device found there.

 * Moreover, the new quirk is only applied based on a kernel
   cmdline parameter that is disabled by default, which further
   constrains when this is actually in effect.

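For reviewers less familiar with the mechanism described under
[Impact], the following is a minimal, self-contained sketch of what
"clearing the MSI/MSI-X enable bits" amounts to at the config-space
level. It is not the code from the patches: it operates on a
hypothetical in-memory copy of one function's config space
(struct fake_pci_dev), and the names find_cap_sketch() and
clear_msi_sketch() are made up for illustration. Only the register
offsets, capability IDs and enable-bit positions are the standard
PCI ones.

```c
#include <stdint.h>

/* Standard PCI config-space offsets, capability IDs and flag bits. */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10    /* device has a capability list */
#define PCI_CAPABILITY_LIST   0x34    /* offset of first capability */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define PCI_MSG_CTRL          2       /* message control, relative to cap */
#define PCI_MSI_FLAGS_ENABLE  0x0001  /* MSI enable: bit 0 */
#define PCI_MSIX_FLAGS_ENABLE 0x8000  /* MSI-X enable: bit 15 */

/* Hypothetical stand-in for one PCI function's 256-byte config space. */
struct fake_pci_dev {
	uint8_t cfg[256];
};

static uint16_t cfg_read16(const struct fake_pci_dev *d, unsigned int off)
{
	return (uint16_t)(d->cfg[off] | (d->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_pci_dev *d, unsigned int off, uint16_t v)
{
	d->cfg[off] = (uint8_t)(v & 0xff);
	d->cfg[off + 1] = (uint8_t)(v >> 8);
}

/* Walk the capability list; return the capability's offset, or 0. */
static unsigned int find_cap_sketch(const struct fake_pci_dev *d, uint8_t cap)
{
	unsigned int pos;
	int ttl = 48;	/* bound the walk in case the list loops */

	if (!(cfg_read16(d, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	pos = d->cfg[PCI_CAPABILITY_LIST] & ~3u;
	while (pos >= 0x40 && ttl-- > 0) {
		if (d->cfg[pos] == 0xff)	/* invalid entry, stop */
			break;
		if (d->cfg[pos] == cap)
			return pos;
		pos = d->cfg[pos + 1] & ~3u;	/* next-capability pointer */
	}
	return 0;
}

/* Clear the MSI and MSI-X enable bits, leaving other flag bits intact. */
void clear_msi_sketch(struct fake_pci_dev *d)
{
	unsigned int pos;
	uint16_t flags;

	pos = find_cap_sketch(d, PCI_CAP_ID_MSI);
	if (pos) {
		flags = cfg_read16(d, pos + PCI_MSG_CTRL);
		cfg_write16(d, pos + PCI_MSG_CTRL,
			    (uint16_t)(flags & ~PCI_MSI_FLAGS_ENABLE));
	}

	pos = find_cap_sketch(d, PCI_CAP_ID_MSIX);
	if (pos) {
		flags = cfg_read16(d, pos + PCI_MSG_CTRL);
		cfg_write16(d, pos + PCI_MSG_CTRL,
			    (uint16_t)(flags & ~PCI_MSIX_FLAGS_ENABLE));
	}
}
```

In the real series the equivalent logic runs from the early PCI quirk
path and uses the kernel's early config-space accessors rather than
an array; the sketch only illustrates why the operation is cheap and
why drivers can safely re-enable MSI(X) later.
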
[Other Info]

 * The patch series is still under review/discussion upstream,
   but it's relatively important for Ubuntu users at this point,
   and after internal discussions we decided to submit it for SRU.

 * These are links to the linux-pci archive with the patches [1, 2, 3]

   [1] [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks
       https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com/

   [2] [PATCH 2/3] x86/PCI: Export find_cap() to be used in early PCI code
       https://lore.kernel.org/linux-pci/20181018183721.27467-2-gpiccoli@canonical.com/

   [3] [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
       https://lore.kernel.org/linux-pci/20181018183721.27467-3-gpiccoli@canonical.com/

Guilherme G. Piccoli (3):
  UBUNTU: SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
  UBUNTU: SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
  UBUNTU: SAUCE: x86/quirks: Scan all busses for early PCI quirks

 Documentation/kernel-parameters.txt |  6 +++++
 arch/x86/include/asm/pci-direct.h   |  2 ++
 arch/x86/kernel/aperture_64.c       | 30 ++-------------------
 arch/x86/kernel/early-quirks.c      | 41 +++++++++++++++++++++++++++++
 arch/x86/pci/common.c               |  4 +++
 arch/x86/pci/early.c                | 25 ++++++++++++++++++
 6 files changed, 80 insertions(+), 28 deletions(-)

Acked-by: Thadeu Lima de Souza Cascardo
Acked-by: Khalid Elmously