From patchwork Thu Nov 8 17:05:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 995041
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Cc: gpiccoli@canonical.com
Subject: [D][PATCH v2 0/3] Add kernel parameter 'pci=clearmsi' to clear
 MSI(X)s early on boot
Date: Thu, 8 Nov 2018 15:05:55 -0200
Message-Id: <20181108170558.17534-1-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
List-Id: Kernel team discussions

BugLink: https://bugs.launchpad.net/bugs/1797990

(Note: the patch sets for Cosmic and Disco should be identical, as both
trees' master-next branches are at the 4.18.0-11.12 level, with some more
patches in Cosmic. Disco is submitted as well, for the sake of process
guidelines and completeness.)

[Changelog]

 * v2:
   - Reorder patch 1 as 3 to allow for the next change:
   - Gate the bus-scan differences with the cmdline option
     (patch 3 only). Now all functional changes are gated.

[Impact]

 * A kexec/crash kernel might get stuck and fail to boot
   (for a crash kernel, kdump fails to collect a crashdump)
   if a PCI device is buggy/stuck/looping and triggers a
   continuous flood of MSI(X) interrupts (that the kernel
   does not yet know about).

 * This fix made it possible to obtain crashdumps when
   debugging a heavy-load scenario, in which a heavily
   loaded network adapter wouldn't stop triggering MSI-X
   interrupts even after panic()->kdump kicked in.

 * This fix disables MSI(X) in all PCI devices early in
   boot (this is OK, as it is re-enabled normally later).
   It only takes effect when the kernel cmdline parameter
   'pci=clearmsi' is given (disabled by default).

[Test Case]

 * A synthetic test case is not yet available. However,
   this particular system/workload triggered the problem
   consistently, and it was used for development/testing.

 * We'll update this bug once a synthetic test case is
   available; we're working on patching QEMU for this.

 * $ dmesg | grep 'Clearing MSI'
   [    0.000000] Clearing MSI/MSI-X enable bits early in boot (quirk)

 * The output of 'dmesg -t | sort' has been compared between
   boots with the option disabled and enabled, in both normal
   boot and kexec modes, and only the expected differences
   were found (MHz, PIDs, MIPS).

[Regression Potential]

 * The potential area for regressions is early boot,
   particularly the effects of applying quirks during the
   PCI bus scan, which is changed/broader with these patches.

 * However, all quirks are applied based on PCI ID matching,
   so a quirk would only newly apply if it actually targets
   a device that is now reached by the scan.

 * Moreover, the new quirk is only applied based on a kernel
   cmdline parameter that is disabled by default, which
   constrains even further when this is actually in effect.

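For illustration only (this is not the code from the patches), the
mechanism described under [Impact] can be sketched in userspace C: walk
a device's PCI capability list and clear the MSI and MSI-X enable bits,
but only when the cmdline option was requested. The register offsets and
bit values are the standard PCI ones; 'struct fake_pci_dev', the
'sketch_*' helpers and the in-memory config space are hypothetical
stand-ins for the kernel's early config-space accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard PCI config-space offsets and bits. */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10   /* device has a capability list */
#define PCI_CAPABILITY_LIST   0x34   /* offset of the first capability */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define PCI_MSI_FLAGS         0x02   /* Message Control, relative to cap */
#define PCI_MSI_FLAGS_ENABLE  0x0001
#define PCI_MSIX_FLAGS_ENABLE 0x8000

/* Hypothetical stand-in for one PCI function's 256-byte config space. */
struct fake_pci_dev {
	uint8_t cfg[256];
};

/* Little-endian 16-bit accessors; callers keep off + 1 within cfg[]. */
static uint16_t cfg_read16(const struct fake_pci_dev *dev, unsigned int off)
{
	return (uint16_t)(dev->cfg[off] | (dev->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_pci_dev *dev, unsigned int off,
			uint16_t val)
{
	dev->cfg[off] = (uint8_t)(val & 0xff);
	dev->cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Return the offset of capability cap_id, or 0 if it is not present. */
static unsigned int sketch_find_cap(const struct fake_pci_dev *dev,
				    uint8_t cap_id)
{
	unsigned int pos;
	int i;

	if (!(cfg_read16(dev, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	pos = dev->cfg[PCI_CAPABILITY_LIST];
	/* Bound the walk so a malformed, looping list cannot hang us. */
	for (i = 0; i < 48 && pos >= 0x40; i++) {
		uint8_t id;

		pos &= ~3u;               /* capabilities are dword-aligned */
		id = dev->cfg[pos];
		if (id == 0xff)
			break;
		if (id == cap_id)
			return pos;
		pos = dev->cfg[pos + 1];  /* next-capability pointer */
	}
	return 0;
}

/* Clear one enable bit if set; returns 1 if a bit was actually cleared. */
static int sketch_clear_bit(struct fake_pci_dev *dev, uint8_t cap_id,
			    uint16_t enable_bit)
{
	unsigned int pos = sketch_find_cap(dev, cap_id);
	uint16_t ctrl;

	if (!pos)
		return 0;
	ctrl = cfg_read16(dev, pos + PCI_MSI_FLAGS);
	if (!(ctrl & enable_bit))
		return 0;
	cfg_write16(dev, pos + PCI_MSI_FLAGS, (uint16_t)(ctrl & ~enable_bit));
	return 1;
}

/*
 * Gated entry point: does nothing unless the (hypothetical) flag derived
 * from the cmdline option is set. Returns how many enable bits it cleared.
 */
static int sketch_clear_msi(struct fake_pci_dev *dev, bool clearmsi_requested)
{
	if (!clearmsi_requested)
		return 0;
	return sketch_clear_bit(dev, PCI_CAP_ID_MSI, PCI_MSI_FLAGS_ENABLE) +
	       sketch_clear_bit(dev, PCI_CAP_ID_MSIX, PCI_MSIX_FLAGS_ENABLE);
}
```

The gate is checked before any config-space access, which mirrors the
point made under [Regression Potential]: with the option off, the sketch
touches nothing.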
[Other Info]

 * The patch series is still under review/discussion upstream,
   but it's relatively important for Ubuntu users at this point,
   and after internal discussions we decided to submit it for SRU.

 * These are links to the linux-pci archive with the patches [1, 2, 3]

   [1] [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks
       https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com/

   [2] [PATCH 2/3] x86/PCI: Export find_cap() to be used in early PCI code
       https://lore.kernel.org/linux-pci/20181018183721.27467-2-gpiccoli@canonical.com/

   [3] [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
       https://lore.kernel.org/linux-pci/20181018183721.27467-3-gpiccoli@canonical.com/

Guilherme G. Piccoli (3):
  UBUNTU: SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
  UBUNTU: SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
  UBUNTU: SAUCE: x86/quirks: Scan all busses for early PCI quirks

 .../admin-guide/kernel-parameters.txt |  6 +++
 arch/x86/include/asm/pci-direct.h     |  2 +
 arch/x86/kernel/aperture_64.c         | 30 +-------------
 arch/x86/kernel/early-quirks.c        | 41 +++++++++++++++++++
 arch/x86/pci/common.c                 |  4 ++
 arch/x86/pci/early.c                  | 25 +++++++++++
 6 files changed, 80 insertions(+), 28 deletions(-)