3.1-rc4: spectacular kernel errors / filesystem crash

Message ID CAMaF-rMxTWRAO7WRdYoEM57k66MC6Vuwk1mcfgCgMO6Z+KBvzQ@mail.gmail.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Jon Mason Sept. 13, 2011, 3:51 p.m. UTC
On Tue, Sep 13, 2011 at 10:42 AM, Justin Piszcz <jpiszcz@lucidpixels.com> wrote:
>
>
> On Tue, 13 Sep 2011, Jon Mason wrote:
>
>> On Tue, Sep 13, 2011 at 9:54 AM, Justin Piszcz <jpiszcz@lucidpixels.com>
>> wrote:
>>>
>>>
>>> On Tue, 13 Sep 2011, Eric Dumazet wrote:
>>>
>>>> Please Justin make sure you pulled commit
>>>> commit ed2888e906b56769b4ffabb9c577190438aa68b8
>>>> Author: Jon Mason <mason@myri.com>
>>>> Date:   Thu Sep 8 16:41:18 2011 -0500
>>>>
>>>>   PCI: Remove MRRS modification from MPS setting code
>>>>
>>>>   Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
>>>>   massive negative ramifications on some devices.  Without knowing which
>>>>   devices have this issue, do not modify from the default value when
>>>>   walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
>>>>   the default procedure.
>>>>
>>>>   Tested-by: Sven Schnelle <svens@stackframe.org>
>>>>   Tested-by: Simon Kirby <sim@hostway.ca>
>>>>   Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
>>>>   Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
>>>>   Reported-and-tested-by: Niels Ole Salscheider
>>>> <niels_ole@salscheider-online.
>>>>   References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
>>>>   Signed-off-by: Jon Mason <mason@myri.com>
>>>>   Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
>>>>   Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>>>
>>> Hello,
>>>
>>> I found this commit here:
>>> http://permalink.gmane.org/gmane.linux.kernel.pci/11700
>>
>> This is an early version of the patch.  This is the patch that you want:
>>
>> https://github.com/torvalds/linux/commit/ed2888e906b56769b4ffabb9c577190438aa68b8
>>
>> It appears that this patch didn't make it to lkml or linux-pci list
>> due to kernel.org DNS being down when it was sent.
>>
>> Thanks,
>> Jon
>
> I need to learn how to use git at some point, can you please provide plain
> text patches so I can apply them and reboot?
>
> Justin.

I've attached the 2 patches I asked Linus to include in 3.1-rc6.
Let me know if there are any issues.

Thanks,
Jon

Comments

Justin Piszcz Sept. 13, 2011, 4:32 p.m. UTC | #1
On Tue, 13 Sep 2011, Jon Mason wrote:

Thanks,

# patch -p1 < ../0001-Fix-pointer-dereference-before-call-to-pcie_bus_conf.patch
patching file arch/x86/pci/acpi.c
patching file drivers/pci/hotplug/pcihp_slot.c
patching file drivers/pci/probe.c
# patch -p1 < ../0002-PCI-Remove-MRRS-modification-from-MPS-setting-code.patch
patching file drivers/pci/pci.c
patching file drivers/pci/probe.c
#

Rebooted and running 3.1-rc4 with the new patches.
I will let you know if there are any further issues. I wonder whether
this will fix the RCU/SLAB issues too. Thanks.

Justin.

Patch

From 74d81235f8e4bd60859d539a27e51d3a09d183cf Mon Sep 17 00:00:00 2001
From: Jon Mason <mason@myri.com>
Date: Thu, 8 Sep 2011 12:59:00 -0500
Subject: [PATCH 2/2] PCI: Remove MRRS modification from MPS setting code

Modifying the Maximum Read Request Size to 0 (value of 128Bytes) has
massive negative ramifications on some devices.  Without knowing which
devices have this issue, do not modify from the default value when
walking the PCI-E bus in pcie_bus_safe mode.  Also, make pcie_bus_safe
the default procedure.

Tested-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Simon Kirby <sim@hostway.ca>
Tested-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-and-tested-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
References: https://bugzilla.kernel.org/show_bug.cgi?id=42162
Signed-off-by: Jon Mason <mason@myri.com>
---
 drivers/pci/pci.c   |    2 +-
 drivers/pci/probe.c |   41 ++++++++++++++++++++++-------------------
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 0ce6742..4e84fd4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -77,7 +77,7 @@  unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
 unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
 unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
 
-enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PERFORMANCE;
+enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_SAFE;
 
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0820fc1..b1187ff 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1396,34 +1396,37 @@  static void pcie_write_mps(struct pci_dev *dev, int mps)
 
 static void pcie_write_mrrs(struct pci_dev *dev, int mps)
 {
-	int rc, mrrs;
+	int rc, mrrs, dev_mpss;
 
-	if (pcie_bus_config == PCIE_BUS_PERFORMANCE) {
-		int dev_mpss = 128 << dev->pcie_mpss;
+	/* In the "safe" case, do not configure the MRRS.  There appear to be
+	 * issues with setting MRRS to 0 on a number of devices.
+	 */
 
-		/* For Max performance, the MRRS must be set to the largest
-		 * supported value.  However, it cannot be configured larger
-		 * than the MPS the device or the bus can support.  This assumes
-		 * that the largest MRRS available on the device cannot be
-		 * smaller than the device MPSS.
-		 */
-		mrrs = mps < dev_mpss ? mps : dev_mpss;
-	} else
-		/* In the "safe" case, configure the MRRS for fairness on the
-		 * bus by making all devices have the same size
-		 */
-		mrrs = mps;
+	if (pcie_bus_config != PCIE_BUS_PERFORMANCE)
+		return;
+
+	dev_mpss = 128 << dev->pcie_mpss;
 
+	/* For Max performance, the MRRS must be set to the largest supported
+	 * value.  However, it cannot be configured larger than the MPS the
+	 * device or the bus can support.  This assumes that the largest MRRS
+	 * available on the device cannot be smaller than the device MPSS.
+	 */
+	mrrs = min(mps, dev_mpss);
 
 	/* MRRS is a R/W register.  Invalid values can be written, but a
-	 * subsiquent read will verify if the value is acceptable or not.
+	 * subsequent read will verify if the value is acceptable or not.
 	 * If the MRRS value provided is not acceptable (e.g., too large),
 	 * shrink the value until it is acceptable to the HW.
  	 */
 	while (mrrs != pcie_get_readrq(dev) && mrrs >= 128) {
+		dev_warn(&dev->dev, "Attempting to modify the PCI-E MRRS value"
+			 " to %d.  If any issues are encountered, please try "
+			 "running with pci=pcie_bus_safe\n", mrrs);
 		rc = pcie_set_readrq(dev, mrrs);
 		if (rc)
-			dev_err(&dev->dev, "Failed attempting to set the MRRS\n");
+			dev_err(&dev->dev,
+				"Failed attempting to set the MRRS\n");
 
 		mrrs /= 2;
 	}
@@ -1436,13 +1439,13 @@  static int pcie_bus_configure_set(struct pci_dev *dev, void *data)
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	dev_info(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
+	dev_dbg(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
 		 pcie_get_mps(dev), 128<<dev->pcie_mpss, pcie_get_readrq(dev));
 
 	pcie_write_mps(dev, mps);
 	pcie_write_mrrs(dev, mps);
 
-	dev_info(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
+	dev_dbg(&dev->dev, "Dev MPS %d MPSS %d MRRS %d\n",
 		 pcie_get_mps(dev), 128<<dev->pcie_mpss, pcie_get_readrq(dev));
 
 	return 0;
-- 
1.7.6
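
For readers who want to follow the MRRS selection described in the patch without building a kernel, here is a minimal user-space sketch. It is not the kernel code: `struct fake_dev`, `fake_set_readrq()`, `decode_size()` and `write_mrrs()` are hypothetical stand-ins invented for illustration, and the fake device simply ignores any read request size larger than its `max_readrq`. The sketch assumes the intent stated in the patch comments: in any mode other than "performance" the MRRS is left alone, otherwise start from min(MPS, device MPSS) and halve until the device accepts the value on read-back.

```c
#include <stdio.h>

/* Hypothetical model of a PCIe device; not a kernel structure. */
struct fake_dev {
	int pcie_mpss;   /* encoded Max Payload Size Supported field */
	int max_readrq;  /* largest MRRS (bytes) the fake hardware accepts */
	int readrq;      /* MRRS currently latched by the device, in bytes */
};

enum bus_config { BUS_SAFE, BUS_PERFORMANCE };

/* Size fields are encoded as powers of two: 0 means 128 bytes. */
static int decode_size(int field)
{
	return 128 << field;
}

/* Stand-in for a register write: too-large values are silently ignored. */
static void fake_set_readrq(struct fake_dev *dev, int rq)
{
	if (rq <= dev->max_readrq)
		dev->readrq = rq;
}

/* Sketch of the selection logic, under the assumptions stated above. */
static void write_mrrs(struct fake_dev *dev, int mps, enum bus_config cfg)
{
	int mrrs, dev_mpss;

	/* "Safe" case: leave the MRRS at whatever the device already has. */
	if (cfg != BUS_PERFORMANCE)
		return;

	dev_mpss = decode_size(dev->pcie_mpss);
	mrrs = mps < dev_mpss ? mps : dev_mpss;

	/* Try a value, read it back, and shrink until it is accepted. */
	while (mrrs >= 128) {
		fake_set_readrq(dev, mrrs);
		if (dev->readrq == mrrs)
			break;
		printf("MRRS %d rejected, trying %d\n", mrrs, mrrs / 2);
		mrrs /= 2;
	}
}
```

Calling `write_mrrs()` with `BUS_SAFE` leaves `readrq` untouched, which is the behavior the commit message describes for the new default. Note that this sketch stops as soon as the read-back matches; that is one reading of the comment in the patch, not a line-for-line copy of its loop.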