diff mbox

d63e2e1f3df breaks sparc/T5-8

Message ID CAE9FiQUVSRkKNRzh99JA7V8dqsY78z_GtnMnwTrpSFXkJV9CKA@mail.gmail.com
State Not Applicable
Headers show

Commit Message

Yinghai Lu March 31, 2015, 1:06 a.m. UTC
On Mon, Mar 30, 2015 at 3:54 PM, David Ahern <david.ahern@oracle.com> wrote:
> On 3/29/15 2:07 PM, Yinghai Lu wrote:
>>
>> [  286.647560] PCI: scan_bus[/pci@300/pci@1/pci@0/pci@6] bus no 8
>> [  286.921232] PCI: Claiming 0000:00:01.0: Resource 15:
>> 0000800100000000..00008004afffffff [220c]
>> [  287.229190] PCI: Claiming 0000:01:00.0: Resource 15:
>> 0000800100000000..00008004afffffff [220c]
>> [  287.533428] PCI: Claiming 0000:02:04.0: Resource 15:
>> 0000800100000000..000080012fffffff [220c]
>> [  288.149831] PCI: Claiming 0000:03:00.0: Resource 15:
>> 0000800100000000..000080012fffffff [220c]
>> [  288.252466] PCI: Claiming 0000:04:06.0: Resource 14:
>> 0000800100000000..000080010fffffff [220c]
>> [  288.867196] PCI: Claiming 0000:05:00.0: Resource 0:
>> 0000800100000000..0000800100001fff [204]
>> [  288.968221] pci 0000:05:00.0: can't claim BAR 0 [mem
>> 0x800100000000-0x800100001fff]: no compatible bridge window
>>
>> the bridge resource has IORESOURCE_PREFETCH, but the device doesn't have
>> that.
>
> # lspci -vvxxx -s 0000:05:00.0
> 0000:05:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host
> Controller (rev 03) (prog-if 30 [XHCI])
>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR- FastB2B- DisINTx+
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 00000004
>         Region 0: Memory at 100000000 (64-bit, non-prefetchable) [size=8K]

ok, that is really non-pref mmio 64bit.
We can workaround the problem by honoring firmware setting, according
to https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
page 13

Please check attached updated patches that should fix the regression
and kill those "no compatible window" warnings.

Thanks

Yinghai

Comments

David Ahern March 31, 2015, 4:10 a.m. UTC | #1
On 3/30/15 7:06 PM, Yinghai Lu wrote:
> ok, that is really non-pref mmio 64bit.
> We can workaround the problem by honoring firmware setting, according
> to https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
> page 13
>
> Please check attached updated patches that should fix the regression
> and kill those "no compatible window" warnings.

To make sure I am on the same page: these are a new round of patches? 
clear out the old, apply these?

David

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 31, 2015, 3:06 p.m. UTC | #2
From: Yinghai Lu <yinghai@kernel.org>
Date: Mon, 30 Mar 2015 18:06:02 -0700

> ok, that is really non-pref mmio 64bit.
> We can workaround the problem by honoring firmware setting, according
> to https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
> page 13
> 
> Please check attached updated patches that should fix the regression
> and kill those "no compatible window" warnings.

Your patch only allows the condition behind resources that have 64-bits
of significance, but that is not what the document above says about
when this situation is allowed.

Please implement the check either exactly as stated in the errata
document, or more loosely if that is not possible, rather than more
strictly than allowed.

Your overly strict and restrictive checks are what got us into this
predicament in the first place. :-(
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu March 31, 2015, 4:53 p.m. UTC | #3
On Mon, Mar 30, 2015 at 9:10 PM, David Ahern <david.ahern@oracle.com> wrote:
> On 3/30/15 7:06 PM, Yinghai Lu wrote:
> To make sure I am on the same page: these are a new round of patches? clear
> out the old, apply these?

Clear out the old and apply these new ones.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Ahern March 31, 2015, 5:04 p.m. UTC | #4
On 3/31/15 10:53 AM, Yinghai Lu wrote:
> On Mon, Mar 30, 2015 at 9:10 PM, David Ahern <david.ahern@oracle.com> wrote:
>> On 3/30/15 7:06 PM, Yinghai Lu wrote:
>> To make sure I am on the same page: these are a new round of patches? clear
>> out the old, apply these?
>
> Clear out the old and apply these new ones.

I take DaveM's response to mean the patches (3rd one?) needs another 
version.

I will be on PTO Wed-Fri with limited access through Sunday. If you have 
something to try out later today I can do that; else it needs to wait 
until next week. Given the likelihood that Linus will release 4.0 this 
weekend that means both 3.19 and 4.0 will be broken for these systems.

David

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu March 31, 2015, 6:16 p.m. UTC | #5
On Tue, Mar 31, 2015 at 8:06 AM, David Miller <davem@davemloft.net> wrote:
> Your patch only allows the condition behind resources that have 64-bits
> of significance, but that is not what the document above says about
> when this situation is allowed.
>
> Please implement the check either exactly as stated in the errata
> document, or more loosely if that is not possible, rather than more
> strictly than allowed.
>
> Your overly strict and restrictive checks are what got us into this
> predicament in the first place. :-(

From that errata:
---
Here are criteria that are sufficient to guarantee correctness for a
given candidate BAR:
‰
The entire path from the host to the adapter is over PCI Express.
‰
No conventional PCI or PCI-X devices do peer-t o-peer reads to the
range mapped by the BAR.
‰
The PCI Express Host Bridge does no byte merging. (This is believed to
be true on most
platforms.)
‰
Any locations with read side-effects are never the target of Memory
Reads with the TH bit Set.
See Section 2.2.5
---

We can verify first one that we have all pcie device all the way to
the hostbridge.

But we can not verify or guarantee other three.

System should get better about the constraints with system design.
So if it would assign 64bit (and above 4G) mmio to those non-pref 64bit BAR,
that means it already make sure system design will follow those criteria.
and then we can safely set pref bit in resource flags.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 31, 2015, 6:19 p.m. UTC | #6
From: Yinghai Lu <yinghai@kernel.org>

Date: Tue, 31 Mar 2015 11:16:11 -0700

> On Tue, Mar 31, 2015 at 8:06 AM, David Miller <davem@davemloft.net> wrote:

>> Your patch only allows the condition behind resources that have 64-bits

>> of significance, but that is not what the document above says about

>> when this situation is allowed.

>>

>> Please implement the check either exactly as stated in the errata

>> document, or more loosely if that is not possible, rather than more

>> strictly than allowed.

>>

>> Your overly strict and restrictive checks are what got us into this

>> predicament in the first place. :-(

> 

> From that errata:

> ---

> Here are criteria that are sufficient to guarantee correctness for a

> given candidate BAR:

> ‰

> The entire path from the host to the adapter is over PCI Express.

> ‰

> No conventional PCI or PCI-X devices do peer-t o-peer reads to the

> range mapped by the BAR.

> ‰

> The PCI Express Host Bridge does no byte merging. (This is believed to

> be true on most

> platforms.)

> ‰

> Any locations with read side-effects are never the target of Memory

> Reads with the TH bit Set.

> See Section 2.2.5

> ---

> 

> We can verify first one that we have all pcie device all the way to

> the hostbridge.

> 

> But we can not verify or guarantee other three.


Lack of peer-to-peer reads we can assume, the byte merging we can
add as a per-controller attribute in the future if it turns out there
are some that do and it matters (I doubt this will ever be necessary)
and the side-effect issue is outside the scope of the PCI layer.

> System should get better about the constraints with system design.

> So if it would assign 64bit (and above 4G) mmio to those non-pref 64bit BAR,

> that means it already make sure system design will follow those criteria.

> and then we can safely set pref bit in resource flags.


I totally disagree, I think your test is too stringent and should be
significantly relaxed if not removed entirely.
Yinghai Lu March 31, 2015, 6:25 p.m. UTC | #7
On Tue, Mar 31, 2015 at 11:19 AM, David Miller <davem@davemloft.net> wrote:

> I totally disagree, I think your test is too stringent and should be
> significantly relaxed if not removed entirely.

ok, will produce one version that only verify first criteria.

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Subject: [RFC PATCH] PCI: Set pref for mem64 resource of pcie device

We still get "no compatible bridge window" warning on sparc T5-8
after we add support for 64bit resource for root bus.

[  286.647560] PCI: scan_bus[/pci@300/pci@1/pci@0/pci@6] bus no 8
[  286.921232] PCI: Claiming 0000:00:01.0: Resource 15: 0000800100000000..00008004afffffff [220c]
[  287.229190] PCI: Claiming 0000:01:00.0: Resource 15: 0000800100000000..00008004afffffff [220c]
[  287.533428] PCI: Claiming 0000:02:04.0: Resource 15: 0000800100000000..000080012fffffff [220c]
[  288.149831] PCI: Claiming 0000:03:00.0: Resource 15: 0000800100000000..000080012fffffff [220c]
[  288.252466] PCI: Claiming 0000:04:06.0: Resource 14: 0000800100000000..000080010fffffff [220c]
[  288.867196] PCI: Claiming 0000:05:00.0: Resource 0: 0000800100000000..0000800100001fff [204]
[  288.968221] pci 0000:05:00.0: can't claim BAR 0 [mem 0x800100000000-0x800100001fff]: no compatible bridge window

All the bridges have pref mem 64-bit resource, but the device resource does not
have pref set, then we can not find parent for the device resource,
as we can not put non-pref mem under pref mem.

According to pcie spec errta
https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
page 13, in some case it is ok to mark some as pref.

only set the bit when the mmio is above 4G by BIOS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 drivers/pci/probe.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

Index: linux-2.6/drivers/pci/probe.c
===================================================================
--- linux-2.6.orig/drivers/pci/probe.c
+++ linux-2.6/drivers/pci/probe.c
@@ -1508,6 +1508,43 @@  static void pci_init_capabilities(struct
 	pci_enable_acs(dev);
 }
 
+/*
+ * According to
+ * https://www.pcisig.com/specifications/pciexpress/base2/PCIe_Base_r2.1_Errata_08Jun10.pdf
+ * page 13, system firmware could put some 64bit non-pref under 64bit pref,
+ * on some cases.
+ * Let's set pref bit when pci bus address is above 4G.
+ */
+static void set_pcie_64bit_pref(struct pci_dev *dev)
+{
+	int i;
+
+	if (!pci_is_pcie(dev))
+		return;
+
+	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++) {
+		struct resource *res = &dev->resource[i];
+		struct pci_bus_region r;
+		enum pci_bar_type type;
+		int reg;
+
+		if (!(res->flags & IORESOURCE_MEM_64))
+			continue;
+
+		if (res->flags & IORESOURCE_PREFETCH)
+			continue;
+
+		pcibios_resource_to_bus(dev->bus, &r, res);
+		if (r.start < 0xffffffff)
+			continue;
+
+		reg = pci_resource_bar(dev, i, &type);
+		dev_printk(KERN_DEBUG, &dev->dev, "reg %d %pR + pref\n",
+			   reg, res);
+		res->flags |= IORESOURCE_PREFETCH;
+	}
+}
+
 void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
 {
 	int ret;
@@ -1538,6 +1575,9 @@  void pci_device_add(struct pci_dev *dev,
 	/* Initialize various capabilities */
 	pci_init_capabilities(dev);
 
+	/* After pcie_cap is assigned and sriov bar is probed */
+	set_pcie_64bit_pref(dev);
+
 	/*
 	 * Add the device to our list of discovered devices
 	 * and the bus list for fixup functions, etc.