From patchwork Wed Oct 7 21:16:38 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kelsey Skunberg
X-Patchwork-Id: 1378289
From: Kelsey Skunberg
To: kernel-team@lists.ubuntu.com
Subject: [bionic/linux-azure-4.15][PATCH 3/3] PCI: hv: Retry PCI bus D0 entry on invalid device state
Date: Wed, 7 Oct 2020 15:16:38 -0600
Message-Id: <20201007211640.60573-4-kelsey.skunberg@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20201007211640.60573-1-kelsey.skunberg@canonical.com>
References: <20201007211640.60573-1-kelsey.skunberg@canonical.com>
List-Id: Kernel team discussions

From: Wei Hu

BugLink: https://bugs.launchpad.net/bugs/1883261

When kdump is triggered, some PCI devices may have not been shut down
cleanly before the kdump kernel starts.

This causes the initial attempt to enter D0 state in the kdump kernel to
fail with invalid device state returned from Hyper-V host.

When this happens, explicitly call hv_pci_bus_exit() and retry to
enter the D0 state.

Link: https://lore.kernel.org/r/20200507050300.10974-1-weh@microsoft.com
Signed-off-by: Wei Hu
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi
Reviewed-by: Michael Kelley
(backported from commit c81992e7f4aa19a055dbff5bd6c6d5ff9408f2fb)
[KelseyS: Changes made in drivers/pci/host/ instead of drivers/pci/controller]
Signed-off-by: Kelsey Skunberg
---
 drivers/pci/host/pci-hyperv.c | 39 +++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index f15642e16886..33d430e5d94b 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -2489,6 +2489,8 @@ static void hv_free_config_window(struct hv_pcibus_device *hbus)
 	vmbus_free_mmio(hbus->mem_config->start, PCI_CONFIG_MMIO_LENGTH);
 }
 
+static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs);
+
 /**
  * hv_pci_enter_d0() - Bring the "bus" into the D0 power state
  * @hdev:	VMBus's tracking struct for this root PCI bus
@@ -2501,8 +2503,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
 	struct pci_bus_d0_entry *d0_entry;
 	struct hv_pci_compl comp_pkt;
 	struct pci_packet *pkt;
+	bool retry = true;
 	int ret;
 
+enter_d0_retry:
 	/*
 	 * Tell the host that the bus is ready to use, and moved into the
 	 * powered-on state. This includes telling the host which region
@@ -2528,6 +2532,37 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
 
 	if (ret)
 		goto exit;
+	/*
+	 * In certain case (Kdump) the pci device of interest was
+	 * not cleanly shut down and resource is still held on host
+	 * side, the host could return invalid device status.
+	 * We need to explicitly request host to release the resource
+	 * and try to enter D0 again.
+	 */
+	if (comp_pkt.completion_status < 0 && retry) {
+		retry = false;
+
+		dev_err(&hdev->device, "Retrying D0 Entry\n");
+
+		/*
+		 * Hv_pci_bus_exit() calls hv_send_resource_released()
+		 * to free up resources of its child devices.
+		 * In the kdump kernel we need to set the
+		 * wslot_res_allocated to 255 so it scans all child
+		 * devices to release resources allocated in the
+		 * normal kernel before panic happened.
+		 */
+		hbus->wslot_res_allocated = 255;
+
+		ret = hv_pci_bus_exit(hdev, true);
+
+		if (ret == 0) {
+			kfree(pkt);
+			goto enter_d0_retry;
+		}
+		dev_err(&hdev->device,
+			"Retrying D0 failed with ret %d\n", ret);
+	}
 
 	if (comp_pkt.completion_status < 0) {
 		dev_err(&hdev->device,
@@ -2909,7 +2944,7 @@ static int hv_pci_probe(struct hv_device *hdev,
 	return ret;
 }
 
-static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating)
+static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs)
 {
 	struct hv_pcibus_device *hbus = hv_get_drvdata(hdev);
 	struct {
@@ -2927,7 +2962,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool hibernating)
 	if (hdev->channel->rescind)
 		return 0;
 
-	if (!hibernating) {
+	if (!keep_devs) {
 		/* Delete any children which might still exist. */
 		dr = kzalloc(sizeof(*dr), GFP_KERNEL);
 		if (dr && hv_pci_start_relations_work(hbus, dr))
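
For readers who want to see the control flow in isolation, the following is a minimal user-space sketch of the retry-once pattern the patch adds to hv_pci_enter_d0(). It is not kernel code: struct fake_host, fake_request_d0(), fake_bus_exit() and enter_d0_with_retry() are hypothetical names invented for illustration, and the plain -1 error value is an assumption standing in for whatever the driver actually returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Hyper-V host; not a kernel structure. */
struct fake_host {
	bool stale_resources;	/* resources left over from the crashed kernel */
	int d0_requests;	/* number of D0 entry requests seen */
	int bus_exits;		/* number of bus-exit requests seen */
};

/* Simulated D0 entry: negative status while stale resources remain. */
static int fake_request_d0(struct fake_host *host)
{
	host->d0_requests++;
	return host->stale_resources ? -1 : 0;
}

/* Simulated bus exit: the host releases its resources; 0 means success. */
static int fake_bus_exit(struct fake_host *host)
{
	host->bus_exits++;
	host->stale_resources = false;
	return 0;
}

/* Enter D0, retrying at most once after forcing a bus exit. */
static int enter_d0_with_retry(struct fake_host *host)
{
	bool retry = true;
	int status;

	assert(host != NULL);

enter_d0_retry:
	status = fake_request_d0(host);

	if (status < 0 && retry) {
		retry = false;	/* guarantees the label is revisited only once */
		if (fake_bus_exit(host) == 0)
			goto enter_d0_retry;
	}

	return status < 0 ? -1 : 0;
}
```

With a host that starts out holding stale resources, this sketch issues two D0 requests and one bus exit; with a clean host it issues a single D0 request and no bus exit. Clearing the retry flag before jumping back is what keeps the goto from looping indefinitely if the host keeps rejecting the request.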