diff mbox series

drivers: pci: dwc: dynamically set pci region limit

Message ID 20240523001046.1413579-1-shyamsaini@linux.microsoft.com
State New
Headers show
Series drivers: pci: dwc: dynamically set pci region limit | expand

Commit Message

Shyam Saini May 23, 2024, 12:10 a.m. UTC
commit 89473aa9ab26 ("PCI: dwc: Add iATU regions size detection procedure")
hardcodes the pci region limit to 4G. This causes regression on
systems with PCI memory region higher than 4G.

Fix this by dynamically setting pci region limit based on maximum
size of memory ranges in the PCI device tree node.

Fixes: 89473aa9ab26 ("PCI: dwc: Add iATU regions size detection procedure")
Signed-off-by: Shyam Saini <shyamsaini@linux.microsoft.com>
---
 drivers/pci/controller/dwc/pcie-designware.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Comments

Serge Semin May 23, 2024, 10:02 a.m. UTC | #1
On Wed, May 22, 2024 at 05:10:46PM -0700, Shyam Saini wrote:
> commit 89473aa9ab26 ("PCI: dwc: Add iATU regions size detection procedure")
> hardcodes the pci region limit to 4G.

In what part does it _harcode_ the region limit to 4G? The procedure
_auto-detects_ the actual iATU region limits. The limits aren't
hardcoded.  The upper bound is 4G for DW PCIe IP-core older than
v4.60. If the IP-core is younger than that the upper bound will be
determined by the CX_ATU_MAX_REGION_SIZE IP-core synthesize parameter.
The auto-detection procedure is about the CX_ATU_MAX_REGION_SIZE
parameter detection.

> This causes regression on
> systems with PCI memory region higher than 4G.

I am sure it doesn't. If it did we would have got multiple bug-reports
right after the patch was merged into the mainline kernel repo. So
please provide a comprehensive description of the problem you have.

> 
> Fix this by dynamically setting pci region limit based on maximum
> size of memory ranges in the PCI device tree node.

It seems to me that your patch is an attempt to workaround some
problem you met. Give more insight about the problem in order to find
a proper fix. The justification you've provided so far seems incorrect.

Note you can't use the ranges DT-property specified on your platform
to determine the actual iATU regions size, because the later entity is
a primary/root parameter of the PCIe controller. The DT-node memory
ranges could be defined with a size greater than the actual iATU
region size. In that case the address translation logic will be broken
in the current driver implementation. AFAICS from the DW PCIe IP-core
HW-manuals the IO-transaction will be passed further to the PCIe bus with
no address translated and with the TLP fields filled in with the data
retrieved on the application interface (XALI/AXI):

"3.10.5.6 No Address Match Result Overview: When there is no address
match then the address is untranslated but the TLP header information
(for fields that are programmable) comes from the relevant fields on
the application transmit interface XALI* or AXI slave."

That isn't what could be allowed, because it may cause unpredictable
results up to the system crash, for instance, if the TLPs with the
untranslated TLPs reached a device they weren't targeted to.

If what you met in your system was a memory range greater than the
permitted iATU region limit, a proper fix would have been to allocate
a one more iATU region for the out of bounds part of the memory range.

-Serge(y)

> 
> Fixes: 89473aa9ab26 ("PCI: dwc: Add iATU regions size detection procedure")
> Signed-off-by: Shyam Saini <shyamsaini@linux.microsoft.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index 250cf7f40b85..9b8975b35dd9 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -783,6 +783,9 @@ static void dw_pcie_link_set_max_link_width(struct dw_pcie *pci, u32 num_lanes)
>  void dw_pcie_iatu_detect(struct dw_pcie *pci)
>  {
>  	int max_region, ob, ib;
> +	struct dw_pcie_rp *pp = &pci->pp;
> +	struct resource_entry *entry;
> +	u64 max_mem_sz = 0;
>  	u32 val, min, dir;
>  	u64 max;
>  
> @@ -832,10 +835,25 @@ void dw_pcie_iatu_detect(struct dw_pcie *pci)
>  		max = 0;
>  	}
>  
> +	resource_list_for_each_entry(entry, &pp->bridge->windows) {
> +		if (resource_type(entry->res) != IORESOURCE_MEM)
> +			continue;
> +		max_mem_sz = (max_mem_sz < resource_size(entry->res)) ?
> +						resource_size(entry->res) : max_mem_sz;
> +	}
> +
> +	if (max_mem_sz <= SZ_4G)
> +		pci->region_limit = (max << 32) | (SZ_4G - 1);
> +	else if ((max_mem_sz > SZ_4G) && (max_mem_sz <= SZ_8G))
> +		pci->region_limit = (max << 32) | (SZ_8G - 1);
> +	else if ((max_mem_sz > SZ_8G) && (max_mem_sz <= SZ_16G))
> +		pci->region_limit = (max << 32) | (SZ_16G - 1);
> +	else
> +		pci->region_limit = (max << 32) | (SZ_32G - 1);
> +
>  	pci->num_ob_windows = ob;
>  	pci->num_ib_windows = ib;
>  	pci->region_align = 1 << fls(min);
> -	pci->region_limit = (max << 32) | (SZ_4G - 1);
>  
>  	dev_info(pci->dev, "iATU: unroll %s, %u ob, %u ib, align %uK, limit %lluG\n",
>  		 dw_pcie_cap_is(pci, IATU_UNROLL) ? "T" : "F",
> -- 
> 2.34.1
> 
>
Shyam Saini June 10, 2024, 11:20 p.m. UTC | #2
Hi Serge,

On Wed, May 22, 2024 at 05:10:46PM -0700, Shyam Saini wrote:
> commit 89473aa9ab26 ("PCI: dwc: Add iATU regions size detection procedure")
>> hardcodes the pci region limit to 4G.

> In what part does it _harcode_ the region limit to 4G? The procedure
> _auto-detects_ the actual iATU region limits. The limits aren't
> hardcoded.  The upper bound is 4G for DW PCIe IP-core older than
> v4.60. If the IP-core is younger than that the upper bound will be
> determined by the CX_ATU_MAX_REGION_SIZE IP-core synthesize parameter.
> The auto-detection procedure is about the CX_ATU_MAX_REGION_SIZE
> parameter detection.

>> This causes regression on
>> systems with PCI memory region higher than 4G.

> I am sure it doesn't. If it did we would have got multiple bug-reports
> right after the patch was merged into the mainline kernel repo. So
> please provide a comprehensive description of the problem you have.
> 
Sorry, I am certainly not a PCI expert but here is how I got here:

With [1] this change I started to see PCI driver throwing "Failed to set MEM range" error messages
and as consequence the driver probe fails with " error -22"

When I tracked the code, I found [1] this check was causing the driver probe failure 
and pci->region_limit was set to 4G in [2] this commit

So, to fix this issue I prepared this patch and it solves the problem I was having.
Based on your reply it seems problem is some where else.

I didn't see any problem with PCI memory <= 4G

>> 
>> Fix this by dynamically setting pci region limit based on maximum
>> size of memory ranges in the PCI device tree node.

> It seems to me that your patch is an attempt to workaround some
> problem you met. Give more insight about the problem in order to find
> a proper fix. The justification you've provided so far seems incorrect.
> 
> Note you can't use the ranges DT-property specified on your platform
> to determine the actual iATU regions size, because the later entity is
> a primary/root parameter of the PCIe controller. The DT-node memory
> ranges could be defined with a size greater than the actual iATU
> region size. In that case the address translation logic will be broken
> in the current driver implementation. AFAICS from the DW PCIe IP-core
> HW-manuals the IO-transaction will be passed further to the PCIe bus with
> no address translated and with the TLP fields filled in with the data
> retrieved on the application interface (XALI/AXI):
> 
> "3.10.5.6 No Address Match Result Overview: When there is no address
> match then the address is untranslated but the TLP header information
> (for fields that are programmable) comes from the relevant fields on
> the application transmit interface XALI* or AXI slave."
> 
> That isn't what could be allowed, because it may cause unpredictable
> results up to the system crash, for instance, if the TLPs with the
> untranslated TLPs reached a device they weren't targeted to.

> If what you met in your system was a memory range greater than the
> permitted iATU region limit, a proper fix would have been to allocate
> a one more iATU region for the out of bounds part of the memory range.

based on your suggestions I have prepared a new patch, which i will send shortly.
Thank you for the reviews.

[1] https://elixir.bootlin.com/linux/v6.9.1/source/drivers/pci/controller/dwc/pcie-designware.c#L480
[2] https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?id=89473aa9ab261948ed13b16effe841a675efed77
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 250cf7f40b85..9b8975b35dd9 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -783,6 +783,9 @@  static void dw_pcie_link_set_max_link_width(struct dw_pcie *pci, u32 num_lanes)
 void dw_pcie_iatu_detect(struct dw_pcie *pci)
 {
 	int max_region, ob, ib;
+	struct dw_pcie_rp *pp = &pci->pp;
+	struct resource_entry *entry;
+	u64 max_mem_sz = 0;
 	u32 val, min, dir;
 	u64 max;
 
@@ -832,10 +835,25 @@  void dw_pcie_iatu_detect(struct dw_pcie *pci)
 		max = 0;
 	}
 
+	resource_list_for_each_entry(entry, &pp->bridge->windows) {
+		if (resource_type(entry->res) != IORESOURCE_MEM)
+			continue;
+		max_mem_sz = (max_mem_sz < resource_size(entry->res)) ?
+						resource_size(entry->res) : max_mem_sz;
+	}
+
+	if (max_mem_sz <= SZ_4G)
+		pci->region_limit = (max << 32) | (SZ_4G - 1);
+	else if ((max_mem_sz > SZ_4G) && (max_mem_sz <= SZ_8G))
+		pci->region_limit = (max << 32) | (SZ_8G - 1);
+	else if ((max_mem_sz > SZ_8G) && (max_mem_sz <= SZ_16G))
+		pci->region_limit = (max << 32) | (SZ_16G - 1);
+	else
+		pci->region_limit = (max << 32) | (SZ_32G - 1);
+
 	pci->num_ob_windows = ob;
 	pci->num_ib_windows = ib;
 	pci->region_align = 1 << fls(min);
-	pci->region_limit = (max << 32) | (SZ_4G - 1);
 
 	dev_info(pci->dev, "iATU: unroll %s, %u ob, %u ib, align %uK, limit %lluG\n",
 		 dw_pcie_cap_is(pci, IATU_UNROLL) ? "T" : "F",