From patchwork Tue Jul 3 09:05:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hari Vyas X-Patchwork-Id: 938530 X-Patchwork-Delegate: bhelgaas@google.com Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=broadcom.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=broadcom.com header.i=@broadcom.com header.b="ZHhicGdG"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 41KdWg5LnDz9s2R for ; Tue, 3 Jul 2018 19:05:51 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932394AbeGCJFu (ORCPT ); Tue, 3 Jul 2018 05:05:50 -0400 Received: from mail-qt0-f193.google.com ([209.85.216.193]:42039 "EHLO mail-qt0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932388AbeGCJFt (ORCPT ); Tue, 3 Jul 2018 05:05:49 -0400 Received: by mail-qt0-f193.google.com with SMTP id y31-v6so899320qty.9 for ; Tue, 03 Jul 2018 02:05:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LHuJg+lE8bxnoaJBUdW9xwq8a2ZljvewQMSihcsEZMU=; b=ZHhicGdGN9ngTVsGT4hFOP1WNUSjbeHVk0grwqUEoTmmt+92YTnNpRrZL5jPgIWsKs xIbrUyyWjvovw0vDe/s+MzH08aOqSSrggoQpeaNGcbBXlomgrp5psFh3yq16K0/M8zMK LtNWipAjfdV3uK8EpTp5PfBksH6ntB3yZJz/A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LHuJg+lE8bxnoaJBUdW9xwq8a2ZljvewQMSihcsEZMU=; b=pzWL7wbG9sedTol/lvh3FyWGmVq9Kj81X6DuN07RCdyQEUGZkYHDkpaT9elpiVd32d 59dNBMrNyOFupZzV+wHkEX1mdKxrDNC5kp333WYBj6C1VMMPefbVaVKvDGBV8aO1fvw8 AWdW1hxPyBHR+2Wus99xeWt93HL1LbCyfpVfnl9C5KjPDoYX3vHJHKeNfQ8RSAUNmhh0 20qwT0HPTGziUuHWRMpNzGn5LsJQ1FuuQOag1d0weu05PI5aaXLc5dxZAnJf1+SY5rB3 5e7I/FvyjBuXxvf33JsYRtAigFNpJpJLttMGlxc1NC8GxYVQ/yMyZC9/ptWj2+rN9Gym 8TXA== X-Gm-Message-State: APt69E2yCuRn2crJJ2KN2kpKJOij3/BJK+IJzUiloHaUrtCzkc6C7+ga xboZLkFAQpP5nU+i84qD3ZiWUA== X-Google-Smtp-Source: AAOMgpdSBeGoCr3e22w8SnQshunSMwtcWb5zyXTAxeOJAxaP8GhLuNqYXaGs9NXymBpnTBHnCeBkWg== X-Received: by 2002:a0c:f20b:: with SMTP id h11-v6mr3988357qvk.190.1530608748572; Tue, 03 Jul 2018 02:05:48 -0700 (PDT) Received: from hariv-server.dhcp.broadcom.net ([192.19.234.250]) by smtp.gmail.com with ESMTPSA id 41-v6sm408159qto.73.2018.07.03.02.05.46 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 03 Jul 2018 02:05:48 -0700 (PDT) From: Hari Vyas To: bhelgaas@google.com, benh@kernel.crashing.org Cc: linux-pci@vger.kernel.org, ray.jui@broadcom.com, Hari Vyas Subject: [PATCH v3] PCI: Data corruption happening due to race condition Date: Tue, 3 Jul 2018 14:35:41 +0530 Message-Id: <1530608741-30664-2-git-send-email-hari.vyas@broadcom.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1530608741-30664-1-git-send-email-hari.vyas@broadcom.com> References: <1530608741-30664-1-git-send-email-hari.vyas@broadcom.com> Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When a pci device is detected, a variable is_added is set to 1 in pci device structure and proc, sys entries are created. When a pci device is removed, first is_added is checked for one and then device is detached with clearing of proc and sys entries and at end, is_added is set to 0. is_added and is_busmaster are bit fields in pci_dev structure sharing same memory location. A strange issue was observed with multiple times removal and rescan of a pcie nvme device using sysfs commands where is_added flag was observed as zero instead of one while removing device and proc,sys entries are not cleared. This causes issue in later device addition with warning message "proc_dir_entry" already registered. Debugging revealed a race condition between pcie core driver enabling is_added bit(pci_bus_add_device()) and nvme driver reset work-queue enabling is_busmaster bit (by pci_set_master()). As both fields are not handled in atomic manner and that clears is_added bit. Fix moves device addition is_added bit to separate private flag variable and use different atomic functions to set and retrieve device addition state. As is_added shares different memory location so race condition is avoided. Signed-off-by: Hari Vyas Reviewed-by: Lukas Wunner Acked-by: Michael Ellerman --- arch/powerpc/kernel/pci-common.c | 4 +++- arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++- arch/powerpc/platforms/pseries/setup.c | 3 ++- drivers/pci/bus.c | 6 +++--- drivers/pci/hotplug/acpiphp_glue.c | 2 +- drivers/pci/pci.h | 11 +++++++++++ drivers/pci/probe.c | 4 ++-- drivers/pci/remove.c | 5 +++-- include/linux/pci.h | 1 - 9 files changed, 27 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index fe9733f..471aac3 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -42,6 +42,8 @@ #include #include +#include "../../../drivers/pci/pci.h" + /* hose_spinlock protects accesses to the the phb_bitmap. */ static DEFINE_SPINLOCK(hose_spinlock); LIST_HEAD(hose_list); @@ -1014,7 +1016,7 @@ void pcibios_setup_bus_devices(struct pci_bus *bus) /* Cardbus can call us to add new devices to a bus, so ignore * those who are already fully discovered */ - if (dev->is_added) + if (pci_dev_is_added(dev)) continue; pcibios_setup_device(dev); diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 5bd0eb6..70b2e1e 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -46,6 +46,7 @@ #include "powernv.h" #include "pci.h" +#include "../../../../drivers/pci/pci.h" #define PNV_IODA1_M64_NUM 16 /* Number of M64 BARs */ #define PNV_IODA1_M64_SEGS 8 /* Segments per M64 BAR */ @@ -3138,7 +3139,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev) struct pci_dn *pdn; int mul, total_vfs; - if (!pdev->is_physfn || pdev->is_added) + if (!pdev->is_physfn || pci_dev_is_added(pdev)) return; pdn = pci_get_pdn(pdev); diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 139f0af..8a4868a 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -71,6 +71,7 @@ #include #include "pseries.h" +#include "../../../../drivers/pci/pci.h" int CMO_PrPSP = -1; int CMO_SecPSP = -1; @@ -664,7 +665,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev) const int *indexes; struct device_node *dn = pci_device_to_OF_node(pdev); - if (!pdev->is_physfn || pdev->is_added) + if (!pdev->is_physfn || pci_dev_is_added(pdev)) return; /*Firmware must support open sriov otherwise dont configure*/ indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL); diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c index 35b7fc8..5cb40b2 100644 --- a/drivers/pci/bus.c +++ b/drivers/pci/bus.c @@ -330,7 +330,7 @@ void pci_bus_add_device(struct pci_dev *dev) return; } - dev->is_added = 1; + pci_dev_assign_added(dev, true); } EXPORT_SYMBOL_GPL(pci_bus_add_device); @@ -347,14 +347,14 @@ void pci_bus_add_devices(const struct pci_bus *bus) list_for_each_entry(dev, &bus->devices, bus_list) { /* Skip already-added devices */ - if (dev->is_added) + if (pci_dev_is_added(dev)) continue; pci_bus_add_device(dev); } list_for_each_entry(dev, &bus->devices, bus_list) { /* Skip if device attach failed */ - if (!dev->is_added) + if (!pci_dev_is_added(dev)) continue; child = dev->subordinate; if (child) diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c index 3a17b29..ef0b1b6 100644 --- a/drivers/pci/hotplug/acpiphp_glue.c +++ b/drivers/pci/hotplug/acpiphp_glue.c @@ -509,7 +509,7 @@ static void enable_slot(struct acpiphp_slot *slot) list_for_each_entry(dev, &bus->devices, bus_list) { /* Assume that newly added devices are powered on already. */ - if (!dev->is_added) + if (!pci_dev_is_added(dev)) dev->current_state = PCI_D0; } diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index 882f1f9..0881725 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -288,6 +288,7 @@ struct pci_sriov { /* pci_dev priv_flags */ #define PCI_DEV_DISCONNECTED 0 +#define PCI_DEV_ADDED 1 static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused) { @@ -300,6 +301,16 @@ static inline bool pci_dev_is_disconnected(const struct pci_dev *dev) return test_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags); } +static inline void pci_dev_assign_added(struct pci_dev *dev, bool added) +{ + assign_bit(PCI_DEV_ADDED, &dev->priv_flags, added); +} + +static inline bool pci_dev_is_added(const struct pci_dev *dev) +{ + return test_bit(PCI_DEV_ADDED, &dev->priv_flags); +} + #ifdef CONFIG_PCI_ATS void pci_restore_ats_state(struct pci_dev *dev); #else diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index ac876e3..611adcd 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -2433,13 +2433,13 @@ int pci_scan_slot(struct pci_bus *bus, int devfn) dev = pci_scan_single_device(bus, devfn); if (!dev) return 0; - if (!dev->is_added) + if (!pci_dev_is_added(dev)) nr++; for (fn = next_fn(bus, dev, 0); fn > 0; fn = next_fn(bus, dev, fn)) { dev = pci_scan_single_device(bus, devfn + fn); if (dev) { - if (!dev->is_added) + if (!pci_dev_is_added(dev)) nr++; dev->multifunction = 1; } diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c index 6f072ea..5e3d0dc 100644 --- a/drivers/pci/remove.c +++ b/drivers/pci/remove.c @@ -19,11 +19,12 @@ static void pci_stop_dev(struct pci_dev *dev) { pci_pme_active(dev, false); - if (dev->is_added) { + if (pci_dev_is_added(dev)) { device_release_driver(&dev->dev); pci_proc_detach_device(dev); pci_remove_sysfs_dev_files(dev); - dev->is_added = 0; + + pci_dev_assign_added(dev, false); } if (dev->bus->self) diff --git a/include/linux/pci.h b/include/linux/pci.h index 340029b..506125b 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -368,7 +368,6 @@ struct pci_dev { unsigned int transparent:1; /* Subtractive decode bridge */ unsigned int multifunction:1; /* Multi-function device */ - unsigned int is_added:1; unsigned int is_busmaster:1; /* Is busmaster */ unsigned int no_msi:1; /* May not use MSI */ unsigned int no_64bit_msi:1; /* May only use 32-bit MSIs */