From patchwork Thu Nov 8 16:26:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 994995
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Cc: gpiccoli@canonical.com
Subject: [SRU C][PATCH v2 0/3] Add kernel parameter 'pci=clearmsi' to clear MSI(X)s early on boot
Date: Thu, 8 Nov 2018 14:26:35 -0200
Message-Id: <20181108162638.17137-1-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
List-Id: Kernel team discussions

BugLink: https://bugs.launchpad.net/bugs/1797990

[Changelog]

 * v2:
   - Reorder patch 1 as 3 to allow for the next change:
   - Gate the bus-scan differences with the cmdline option (patch 3 only).
     Now all functional changes are gated.

[Impact]

 * A kexec/crash kernel might get stuck and fail to boot (for a crash
   kernel, kdump fails to collect a crashdump) if a PCI device is
   buggy/stuck/looping and triggers a continuous flood of MSI(X)
   interrupts (that the kernel does not yet know about).

 * This fix made it possible to obtain crashdumps when debugging a
   heavy-load scenario, in which a heavily loaded network adapter
   wouldn't stop triggering MSI-X interrupts even after
   panic()->kdump kicked in.

 * This fix disables MSI(X) in all PCI devices on early boot (this is
   OK, as it is (re-)enabled normally later). It is controlled by a
   kernel cmdline parameter, which is disabled by default.

[Test Case]

 * A synthetic test case is not yet available; however, this particular
   system/workload triggered the problem consistently, and it was used
   for development/testing.

 * We'll update this bug once a synthetic test case is available;
   we're working on patching QEMU for this.

 * $ dmesg | grep 'Clearing MSI'
   [    0.000000] Clearing MSI/MSI-X enable bits early in boot (quirk)

 * The output of 'dmesg -t | sort' has been compared between the option
   disabled and enabled, in both boot and kexec modes, and only the
   expected differences were found (MHz, PIDs, MIPS).

[Regression Potential]

 * The potential area for regressions is early boot, particularly the
   effects of applying quirks during the PCI bus scan, which these
   patches change/broaden.

 * However, all quirks are applied based on PCI ID matching, so a quirk
   would only take effect on a newly scanned device if it actually
   targets that device.

 * Moreover, the new quirk is only applied when a kernel cmdline
   parameter is given, and that parameter is disabled by default, which
   further constrains when this is actually in effect.

[Other Info]

 * The patch series is still under review/discussion upstream, but it
   is relatively important for Ubuntu users at this point, and after
   internal discussions we decided to submit it for SRU.

 * These are links to the linux-pci archive with the patches [1, 2, 3]

   [1] [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks
       https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@canonical.com/

   [2] [PATCH 2/3] x86/PCI: Export find_cap() to be used in early PCI code
       https://lore.kernel.org/linux-pci/20181018183721.27467-2-gpiccoli@canonical.com/

   [3] [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
       https://lore.kernel.org/linux-pci/20181018183721.27467-3-gpiccoli@canonical.com/

Guilherme G. Piccoli (3):
  UBUNTU: SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
  UBUNTU: SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
  UBUNTU: SAUCE: x86/quirks: Scan all busses for early PCI quirks

 .../admin-guide/kernel-parameters.txt |  6 +++
 arch/x86/include/asm/pci-direct.h     |  2 +
 arch/x86/kernel/aperture_64.c         | 30 +-------------
 arch/x86/kernel/early-quirks.c        | 41 +++++++++++++++++++
 arch/x86/pci/common.c                 |  4 ++
 arch/x86/pci/early.c                  | 25 +++++++++++
 6 files changed, 80 insertions(+), 28 deletions(-)

Acked-by: Khalid Elmously
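
For readers unfamiliar with what "clearing the MSI(X) enable bits"
means at the config-space level, the idea can be sketched in userspace
C. This is only an illustration and not the code from the patches: it
works on a simulated 256-byte config space instead of real early PCI
accessors, and the helper names (cfg_read16, cfg_write16,
sketch_find_cap, sketch_clear_msi, make_demo_device) are hypothetical.
The register offsets and bit masks are the standard PCI ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE              256     /* conventional PCI config space */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10    /* device has a capability list */
#define PCI_CAPABILITY_LIST   0x34    /* offset of first capability */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define PCI_MSI_FLAGS_ENABLE  0x0001  /* in Message Control, cap + 2 */
#define PCI_MSIX_FLAGS_ENABLE 0x8000  /* in Message Control, cap + 2 */

/* Little-endian 16-bit accessors on the simulated config space. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned off, uint16_t val)
{
    cfg[off] = (uint8_t)(val & 0xff);
    cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Walk the capability list; return the offset of cap_id, or 0. */
static unsigned sketch_find_cap(const uint8_t *cfg, uint8_t cap_id)
{
    unsigned pos;
    int ttl;

    if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return 0;

    pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
    /* Bounded walk, so a corrupt (looping) list cannot hang us. */
    for (ttl = 48; ttl > 0 && pos >= 0x40; ttl--) {
        if (cfg[pos] == 0xff)
            break;
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & ~3u;
    }
    return 0;
}

/* Clear the MSI and MSI-X enable bits; return how many were cleared. */
static int sketch_clear_msi(uint8_t *cfg)
{
    int cleared = 0;
    unsigned pos;
    uint16_t flags;

    assert(cfg != NULL);

    pos = sketch_find_cap(cfg, PCI_CAP_ID_MSI);
    if (pos) {
        flags = cfg_read16(cfg, pos + 2);
        if (flags & PCI_MSI_FLAGS_ENABLE) {
            cfg_write16(cfg, pos + 2,
                        (uint16_t)(flags & ~PCI_MSI_FLAGS_ENABLE));
            cleared++;
        }
    }

    pos = sketch_find_cap(cfg, PCI_CAP_ID_MSIX);
    if (pos) {
        flags = cfg_read16(cfg, pos + 2);
        if (flags & PCI_MSIX_FLAGS_ENABLE) {
            cfg_write16(cfg, pos + 2,
                        (uint16_t)(flags & ~PCI_MSIX_FLAGS_ENABLE));
            cleared++;
        }
    }
    return cleared;
}

/* Made-up device: MSI at 0x50 and MSI-X at 0x70, both enabled. */
static void make_demo_device(uint8_t cfg[CFG_SIZE])
{
    memset(cfg, 0, CFG_SIZE);
    cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST;
    cfg[PCI_CAPABILITY_LIST] = 0x50;
    cfg[0x50] = PCI_CAP_ID_MSI;
    cfg[0x51] = 0x70;                 /* next capability */
    cfg_write16(cfg, 0x52, 0x0081);   /* 64-bit capable + enabled */
    cfg[0x70] = PCI_CAP_ID_MSIX;
    cfg[0x71] = 0x00;                 /* end of list */
    cfg_write16(cfg, 0x72, 0x8003);   /* table size field + enabled */
}
```

Calling sketch_clear_msi() on a buffer filled by make_demo_device()
clears both enable bits and leaves the other Message Control bits
untouched; a second call finds nothing left to clear. In the actual
series the equivalent step runs as an early quirk during the PCI bus
scan, and only when the cmdline parameter is given.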