diff mbox series

[v2] PCI: keystone: Fix ks_pcie_v3_65_add_bus() for AM654x SoC

Message ID 20231018075038.2740534-1-s-vadapalli@ti.com
State New
Headers show
Series [v2] PCI: keystone: Fix ks_pcie_v3_65_add_bus() for AM654x SoC | expand

Commit Message

Siddharth Vadapalli Oct. 18, 2023, 7:50 a.m. UTC
The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
applicable to it.

The commit which added support for the AM654x SoC has reused majority
of the functions with the help of the "is_am6" flag to handle AM654x
separately where applicable. Thus, make use of the "is_am6" flag and
change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.

Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
---
Hello,

This patch is based on linux-next tagged next-20231018.

The v1 of this patch is at:
https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/

While there are a lot of changes since v1 and this patch could have been
posted as a v1 patch itself, I decided to post it as the v2 of the patch
mentioned above since it aims to address the issue described by the v1
patch and is similar in that sense. However, the solution to the issue
described in the v1 patch appears to be completely different from what
was implemented in the v1 patch. Thus, the commit message and subject of
this patch have been modified accordingly.

Changes since v1:
- Updated patch subject and commit message.
- Determined that issue is not with the absence of Link as mentioned in
  v1 patch. Even with Link up and endpoint device connected, if
  ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
  MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
  AER and PME services. The all Fs return value indicates that the MSI-X
  configuration is failing even if Endpoint device is connected. This is
  because the ks_pcie_v3_65_add_bus() function is not applicable to the
  AM654x SoC which uses DW PCIe IP-core version 4.90a.

Regards,
Siddharth.

 drivers/pci/controller/dwc/pci-keystone.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ravi Gunasekaran Oct. 18, 2023, 8:19 a.m. UTC | #1
Siddharth,


On 10/18/23 1:20 PM, Siddharth Vadapalli wrote:
> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
> applicable to it.
> 
> The commit which added support for the AM654x SoC has reused majority
> of the functions with the help of the "is_am6" flag to handle AM654x
> separately where applicable. Thus, make use of the "is_am6" flag and
> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
> 
> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")

6ab15b5e7057 (PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus)
is that one that seems to have introduced this issue. 

ks_pcie_v3_65_scan_bus() was for IP version 3.65 and this was renamed and
added to "ks_pcie_ops" which is used by other IP versions as well.


> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> ---
> Hello,
> 
> This patch is based on linux-next tagged next-20231018.
> 
> The v1 of this patch is at:
> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/
> 
> While there are a lot of changes since v1 and this patch could have been
> posted as a v1 patch itself, I decided to post it as the v2 of the patch
> mentioned above since it aims to address the issue described by the v1
> patch and is similar in that sense. However, the solution to the issue
> described in the v1 patch appears to be completely different from what
> was implemented in the v1 patch. Thus, the commit message and subject of
> this patch have been modified accordingly.
> 
> Changes since v1:
> - Updated patch subject and commit message.
> - Determined that issue is not with the absence of Link as mentioned in
>   v1 patch. Even with Link up and endpoint device connected, if
>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
>   AER and PME services. The all Fs return value indicates that the MSI-X
>   configuration is failing even if Endpoint device is connected. This is
>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
>   AM654x SoC which uses DW PCIe IP-core version 4.90a.
> 
> Regards,
> Siddharth.
> 
>  drivers/pci/controller/dwc/pci-keystone.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 0def919f89fa..3abd59335574 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -459,7 +459,7 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>  
> -	if (!pci_is_root_bus(bus))
> +	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)
>  		return 0;
>  
>  	/* Configure and set up BAR0 */
Siddharth Vadapalli Oct. 18, 2023, 8:24 a.m. UTC | #2
Ravi,

On 18/10/23 13:49, Ravi Gunasekaran wrote:
> Siddharth,
> 
> 
> On 10/18/23 1:20 PM, Siddharth Vadapalli wrote:
>> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
>> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
>> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
>> applicable to it.
>>
>> The commit which added support for the AM654x SoC has reused majority
>> of the functions with the help of the "is_am6" flag to handle AM654x
>> separately where applicable. Thus, make use of the "is_am6" flag and
>> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
>>
>> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
> 
> 6ab15b5e7057 (PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus)
> is that one that seems to have introduced this issue. 
> 
> ks_pcie_v3_65_scan_bus() was for IP version 3.65 and this was renamed and
> added to "ks_pcie_ops" which is used by other IP versions as well.

Thank you for reviewing the patch and pointing this out. I will update the
commit message with the correct Fixes tag as well as the appropriate description
and post the v3 patch if there is no further feedback from others on this patch.

> 
> 
>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>> ---
>> Hello,
>>
>> This patch is based on linux-next tagged next-20231018.
>>
>> The v1 of this patch is at:
>> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/

...

>>  		return 0;
>>  
>>  	/* Configure and set up BAR0 */
>
Serge Semin Oct. 18, 2023, 11:11 a.m. UTC | #3
On Wed, Oct 18, 2023 at 01:20:38PM +0530, Siddharth Vadapalli wrote:
> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
> applicable to it.

I'll copy my message from v1 regarding the IP-core version. Are you
really sure that it's 4.90a? Here is what my DW PCIe RC
_v4.90_ HW databook says about the BARs:

"Base Address Registers (Offset: 0x10-x14) The Synopsys core does not
implement the optional BARs for the RC product. This is based on the
assumption that the RC host probably has registers on some other
internal bus and has knowledge and setup access to these registers
already."

You cited the next text:

"3.5.7.2 RC Mode. Two BARs are present but are not expected to be
used. You should disable them or else they will be unnecessarily
assigned memory during device enumeration. If you do use a BAR, then
you should program it to capture TLPs that are targeted to your local
non-application memory space residing on TRGT1, and not for the
application on TRGT1. The BAR range must be outside of the three
Base/Limit regions."

AFAICS from the v5.20a, v5.30a, v5.40a databooks, it resides in the
_v5.x_ databooks only meanwhile the older ones (v4.21, v4.60a, v4.70a
and _v4.90a_) describe the BARs as I cited earlier. It makes my
thinking that in your case the IP-core isn't of 4.90a version. Could
you please clarify how did you detect the version? Did you use the
PCIE_VERSION_NUMBER_OFF register available in the PORT_LOGIC CSRs
space? If your judgment was based on the Keyston PCIe driver code,
then the driver might get to be wrong in that matter.

> 
> The commit which added support for the AM654x SoC has reused majority
> of the functions with the help of the "is_am6" flag to handle AM654x
> separately where applicable. Thus, make use of the "is_am6" flag and
> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
> 
> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> ---
> Hello,
> 
> This patch is based on linux-next tagged next-20231018.
> 
> The v1 of this patch is at:
> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/
> 
> While there are a lot of changes since v1 and this patch could have been
> posted as a v1 patch itself, I decided to post it as the v2 of the patch
> mentioned above since it aims to address the issue described by the v1
> patch and is similar in that sense. However, the solution to the issue
> described in the v1 patch appears to be completely different from what
> was implemented in the v1 patch. Thus, the commit message and subject of
> this patch have been modified accordingly.
> 
> Changes since v1:
> - Updated patch subject and commit message.
> - Determined that issue is not with the absence of Link as mentioned in
>   v1 patch. Even with Link up and endpoint device connected, if
>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
>   AER and PME services. The all Fs return value indicates that the MSI-X
>   configuration is failing even if Endpoint device is connected. This is
>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
>   AM654x SoC which uses DW PCIe IP-core version 4.90a.
> 
> Regards,
> Siddharth.
> 
>  drivers/pci/controller/dwc/pci-keystone.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 0def919f89fa..3abd59335574 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -459,7 +459,7 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>  

> -	if (!pci_is_root_bus(bus))
> +	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)

1. IMO defining two versions of the pci_ops struct instances would look
more readable:
static struct pci_ops ks_pcie_v3_65_ops = {
        .map_bus = dw_pcie_own_conf_map_bus,
        .read = pci_generic_config_read,
        .write = pci_generic_config_write,
        .add_bus = ks_pcie_v3_65_add_bus,
};

static struct pci_ops ks_pcie_ops = {
        .map_bus = dw_pcie_own_conf_map_bus,
        .read = pci_generic_config_read,
        .write = pci_generic_config_write,
};

2. In case of 1. implemented, ks_pcie_ops will look the same as
ks_child_pcie_ops, so the later could be dropped.

3. I'm not that fluent in the PCIe core details, but based on the
pci_host_bridge.child_ops and pci_host_bridge.ops names, the first
ones will be utilized for the child PCIe buses, meanwhile the later
ones - for the root bus only (see pci_alloc_child_bus()). Bjorn?
If so then the pci_is_root_bus() check can be dropped from the
ks_pcie_v3_65_add_bus() method. After doing that I would have also
changed the name to ks_pcie_v3_65_add_root_bus(). In anyway I would
double-check the suggestion first.

-Serge(y)

>  		return 0;
>  
>  	/* Configure and set up BAR0 */
> -- 
> 2.34.1
>
Siddharth Vadapalli Oct. 18, 2023, 11:56 a.m. UTC | #4
On 18/10/23 16:41, Serge Semin wrote:
> On Wed, Oct 18, 2023 at 01:20:38PM +0530, Siddharth Vadapalli wrote:
>> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
>> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
>> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
>> applicable to it.
> 
> I'll copy my message from v1 regarding the IP-core version. Are you
> really sure that it's 4.90a? Here is what my DW PCIe RC

The HW databook I have is named:
"PCI Express DM Controller Databook"
and has the following footnote on every page of the databook:
4.90a
December 2016

> _v4.90_ HW databook says about the BARs:
> 
> "Base Address Registers (Offset: 0x10-x14) The Synopsys core does not
> implement the optional BARs for the RC product. This is based on the
> assumption that the RC host probably has registers on some other
> internal bus and has knowledge and setup access to these registers
> already."
> 
> You cited the next text:
> 
> "3.5.7.2 RC Mode. Two BARs are present but are not expected to be
> used. You should disable them or else they will be unnecessarily
> assigned memory during device enumeration. If you do use a BAR, then
> you should program it to capture TLPs that are targeted to your local
> non-application memory space residing on TRGT1, and not for the
> application on TRGT1. The BAR range must be outside of the three
> Base/Limit regions."

The above text is present in the "PCI Express DM Controller Databook" I
mentioned above that I have with me which I have cited as-is. I am certain that
the Databook version I have with me is v4.90a. Additionally, please refer to the
commit which added support to the PCI Keystone driver for AM654x Platform:
https://github.com/torvalds/linux/commit/18b0415bc802#diff-1a6482b4ef0eddbe6238232ad346930283832b99270ebabd67385571a619c267R977
On line 977 in the diff:
.version = 0x490A,

> 
> AFAICS from the v5.20a, v5.30a, v5.40a databooks, it resides in the
> _v5.x_ databooks only meanwhile the older ones (v4.21, v4.60a, v4.70a
> and _v4.90a_) describe the BARs as I cited earlier. It makes my
> thinking that in your case the IP-core isn't of 4.90a version. Could
> you please clarify how did you detect the version? Did you use the
> PCIE_VERSION_NUMBER_OFF register available in the PORT_LOGIC CSRs

The value of the register PCIE_RC_PCIE_VERSION_NUMBER_OFF is:
3439302a
which in ASCII is 490*
The value of the register PCIE_RC_PCIE_VERSION_TYPE_OFF is:
6c703039
which in ASCII is lp09.

So I am certain that the Controller is actually version 4.90.


> space? If your judgment was based on the Keyston PCIe driver code,
> then the driver might get to be wrong in that matter.
> 
>>
>> The commit which added support for the AM654x SoC has reused majority
>> of the functions with the help of the "is_am6" flag to handle AM654x
>> separately where applicable. Thus, make use of the "is_am6" flag and
>> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
>>
>> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>> ---
>> Hello,
>>
>> This patch is based on linux-next tagged next-20231018.
>>
>> The v1 of this patch is at:
>> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/
>>
>> While there are a lot of changes since v1 and this patch could have been
>> posted as a v1 patch itself, I decided to post it as the v2 of the patch
>> mentioned above since it aims to address the issue described by the v1
>> patch and is similar in that sense. However, the solution to the issue
>> described in the v1 patch appears to be completely different from what
>> was implemented in the v1 patch. Thus, the commit message and subject of
>> this patch have been modified accordingly.
>>
>> Changes since v1:
>> - Updated patch subject and commit message.
>> - Determined that issue is not with the absence of Link as mentioned in
>>   v1 patch. Even with Link up and endpoint device connected, if
>>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
>>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
>>   AER and PME services. The all Fs return value indicates that the MSI-X
>>   configuration is failing even if Endpoint device is connected. This is
>>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
>>   AM654x SoC which uses DW PCIe IP-core version 4.90a.
>>
>> Regards,
>> Siddharth.
>>
>>  drivers/pci/controller/dwc/pci-keystone.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
>> index 0def919f89fa..3abd59335574 100644
>> --- a/drivers/pci/controller/dwc/pci-keystone.c
>> +++ b/drivers/pci/controller/dwc/pci-keystone.c
>> @@ -459,7 +459,7 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>>  
> 
>> -	if (!pci_is_root_bus(bus))
>> +	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)
> 
> 1. IMO defining two versions of the pci_ops struct instances would look
> more readable:
> static struct pci_ops ks_pcie_v3_65_ops = {
>         .map_bus = dw_pcie_own_conf_map_bus,
>         .read = pci_generic_config_read,
>         .write = pci_generic_config_write,
>         .add_bus = ks_pcie_v3_65_add_bus,
> };
> 
> static struct pci_ops ks_pcie_ops = {
>         .map_bus = dw_pcie_own_conf_map_bus,
>         .read = pci_generic_config_read,
>         .write = pci_generic_config_write,
> };
> 
> 2. In case of 1. implemented, ks_pcie_ops will look the same as
> ks_child_pcie_ops, so the later could be dropped.
> 
> 3. I'm not that fluent in the PCIe core details, but based on the
> pci_host_bridge.child_ops and pci_host_bridge.ops names, the first
> ones will be utilized for the child PCIe buses, meanwhile the later
> ones - for the root bus only (see pci_alloc_child_bus()). Bjorn?
> If so then the pci_is_root_bus() check can be dropped from the
> ks_pcie_v3_65_add_bus() method. After doing that I would have also
> changed the name to ks_pcie_v3_65_add_root_bus(). In anyway I would
> double-check the suggestion first.

Please note that irrespective of anything else, with no dependence on any prior
discussions, what remains true is that commit 6ab15b5e7057:
PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus
introduced a bug. Therefore, I believe that the fix will be to attempt to
restore the driver's state back to what it was prior to commit 6ab15b5e7057, as
gracefully as possible, without having to revert commit 6ab15b5e7057.

In the process of converting .scan_bus() callbacks to .add_bus(),
.scan_bus method was removed from:
static const struct dw_pcie_host_ops ks_pcie_host_ops = {
	.host_init = ks_pcie_host_init,
	.msi_host_init = ks_pcie_msi_host_init,
	.scan_bus = ks_pcie_v3_65_scan_bus,
};

and added as a .add_bus method to:
static struct pci_ops ks_pcie_ops = {
	.map_bus = dw_pcie_own_conf_map_bus,
	.read = pci_generic_config_read,
	.write = pci_generic_config_write,
	.add_bus = ks_pcie_v3_65_add_bus,
};
in addition to renaming the method.

Even before commit 6ab15b5e7057, with support for AM654x (using version 4.90a
DWC PCIe IP-core) already present, ks_pcie_ops was being used for AM654x. This
indicates that prior to commit 6ab15b5e7057, the ks_pcie_ops was applicable to
AM654x and there's no problem. However, when commit 6ab15b5e7057 converted the
.scan_bus() method in ks_pcie_host_ops (which wasn't used for AM654x) to
.add_bus(), it added the method within the shared ks_pcie_ops structure which
implicitly, *mistakenly*, assumes that a function named "ks_pcie_v3_65_add_bus"
is also applicable to other controller versions (4.90a in this case). Paying
attention to the name of the function, it becomes clear that it was added for
controller version "3.65", hence the name "...v3_65...". The assumption that the
function/method is applicable to other controller versions as well, was
incorrect which led to the current issue that can be observed. The commit
6ab15b5e7057 ended up adding a new method for a controller version which *never*
used the method. This therefore is a bug.

The simplest fix I see is that of exiting ks_pcie_v3_65_add_bus() if:
ks_pcie->is_am6 is set, since such checks have been scattered throughout the
same driver prior to the addition of commit 6ab15b5e7057 as well.

I am open to other implementations of fixing this issue as well. Kindly let me
know in case of any suggestions.

> 
> -Serge(y)
> 
>>  		return 0;
>>  
>>  	/* Configure and set up BAR0 */
>> -- 
>> 2.34.1
>>
Serge Semin Oct. 18, 2023, 12:15 p.m. UTC | #5
On Wed, Oct 18, 2023 at 05:26:53PM +0530, Siddharth Vadapalli wrote:
> 
> 
> On 18/10/23 16:41, Serge Semin wrote:
> > On Wed, Oct 18, 2023 at 01:20:38PM +0530, Siddharth Vadapalli wrote:
> >> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
> >> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
> >> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
> >> applicable to it.
> > 
> > I'll copy my message from v1 regarding the IP-core version. Are you
> > really sure that it's 4.90a? Here is what my DW PCIe RC
> 
> The HW databook I have is named:
> "PCI Express DM Controller Databook"
> and has the following footnote on every page of the databook:
> 4.90a
> December 2016
> 
> > _v4.90_ HW databook says about the BARs:
> > 
> > "Base Address Registers (Offset: 0x10-x14) The Synopsys core does not
> > implement the optional BARs for the RC product. This is based on the
> > assumption that the RC host probably has registers on some other
> > internal bus and has knowledge and setup access to these registers
> > already."
> > 
> > You cited the next text:
> > 
> > "3.5.7.2 RC Mode. Two BARs are present but are not expected to be
> > used. You should disable them or else they will be unnecessarily
> > assigned memory during device enumeration. If you do use a BAR, then
> > you should program it to capture TLPs that are targeted to your local
> > non-application memory space residing on TRGT1, and not for the
> > application on TRGT1. The BAR range must be outside of the three
> > Base/Limit regions."
> 
> The above text is present in the "PCI Express DM Controller Databook" I
> mentioned above that I have with me which I have cited as-is. I am certain that
> the Databook version I have with me is v4.90a. Additionally, please refer to the
> commit which added support to the PCI Keystone driver for AM654x Platform:
> https://github.com/torvalds/linux/commit/18b0415bc802#diff-1a6482b4ef0eddbe6238232ad346930283832b99270ebabd67385571a619c267R977
> On line 977 in the diff:
> .version = 0x490A,
> 
> > 
> > AFAICS from the v5.20a, v5.30a, v5.40a databooks, it resides in the
> > _v5.x_ databooks only meanwhile the older ones (v4.21, v4.60a, v4.70a
> > and _v4.90a_) describe the BARs as I cited earlier. It makes my
> > thinking that in your case the IP-core isn't of 4.90a version. Could
> > you please clarify how did you detect the version? Did you use the
> > PCIE_VERSION_NUMBER_OFF register available in the PORT_LOGIC CSRs
> 
> The value of the register PCIE_RC_PCIE_VERSION_NUMBER_OFF is:
> 3439302a
> which in ASCII is 490*
> The value of the register PCIE_RC_PCIE_VERSION_TYPE_OFF is:
> 6c703039
> which in ASCII is lp09.
> 
> So I am certain that the Controller is actually version 4.90.

Ok. Thanks for clarification. Then the semantic of the BARs in DW PCIe
RCs only has been changed since v5.10a. Dial-mode controllers had the
BARs available even in the older IP-cores. Good to know discovery
actually.

> 
> 
> > space? If your judgment was based on the Keyston PCIe driver code,
> > then the driver might get to be wrong in that matter.
> > 
> >>
> >> The commit which added support for the AM654x SoC has reused majority
> >> of the functions with the help of the "is_am6" flag to handle AM654x
> >> separately where applicable. Thus, make use of the "is_am6" flag and
> >> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
> >>
> >> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
> >> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> >> ---
> >> Hello,
> >>
> >> This patch is based on linux-next tagged next-20231018.
> >>
> >> The v1 of this patch is at:
> >> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/
> >>
> >> While there are a lot of changes since v1 and this patch could have been
> >> posted as a v1 patch itself, I decided to post it as the v2 of the patch
> >> mentioned above since it aims to address the issue described by the v1
> >> patch and is similar in that sense. However, the solution to the issue
> >> described in the v1 patch appears to be completely different from what
> >> was implemented in the v1 patch. Thus, the commit message and subject of
> >> this patch have been modified accordingly.
> >>
> >> Changes since v1:
> >> - Updated patch subject and commit message.
> >> - Determined that issue is not with the absence of Link as mentioned in
> >>   v1 patch. Even with Link up and endpoint device connected, if
> >>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
> >>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
> >>   AER and PME services. The all Fs return value indicates that the MSI-X
> >>   configuration is failing even if Endpoint device is connected. This is
> >>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
> >>   AM654x SoC which uses DW PCIe IP-core version 4.90a.
> >>
> >> Regards,
> >> Siddharth.
> >>
> >>  drivers/pci/controller/dwc/pci-keystone.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> >> index 0def919f89fa..3abd59335574 100644
> >> --- a/drivers/pci/controller/dwc/pci-keystone.c
> >> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> >> @@ -459,7 +459,7 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
> >>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> >>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
> >>  
> > 
> >> -	if (!pci_is_root_bus(bus))
> >> +	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)
> > 
> > 1. IMO defining two versions of the pci_ops struct instances would look
> > more readable:
> > static struct pci_ops ks_pcie_v3_65_ops = {
> >         .map_bus = dw_pcie_own_conf_map_bus,
> >         .read = pci_generic_config_read,
> >         .write = pci_generic_config_write,
> >         .add_bus = ks_pcie_v3_65_add_bus,
> > };
> > 
> > static struct pci_ops ks_pcie_ops = {
> >         .map_bus = dw_pcie_own_conf_map_bus,
> >         .read = pci_generic_config_read,
> >         .write = pci_generic_config_write,
> > };
> > 
> > 2. In case of 1. implemented, ks_pcie_ops will look the same as
> > ks_child_pcie_ops, so the later could be dropped.
> > 
> > 3. I'm not that fluent in the PCIe core details, but based on the
> > pci_host_bridge.child_ops and pci_host_bridge.ops names, the first
> > ones will be utilized for the child PCIe buses, meanwhile the later
> > ones - for the root bus only (see pci_alloc_child_bus()). Bjorn?
> > If so then the pci_is_root_bus() check can be dropped from the
> > ks_pcie_v3_65_add_bus() method. After doing that I would have also
> > changed the name to ks_pcie_v3_65_add_root_bus(). In anyway I would
> > double-check the suggestion first.
> 
> Please note that irrespective of anything else, with no dependence on any prior
> discussions, what remains true is that commit 6ab15b5e7057:
> PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus
> introduced a bug. Therefore, I believe that the fix will be to attempt to
> restore the driver's state back to what it was prior to commit 6ab15b5e7057, as
> gracefully as possible, without having to revert commit 6ab15b5e7057.
> 
> In the process of converting .scan_bus() callbacks to .add_bus(),
> .scan_bus method was removed from:
> static const struct dw_pcie_host_ops ks_pcie_host_ops = {
> 	.host_init = ks_pcie_host_init,
> 	.msi_host_init = ks_pcie_msi_host_init,
> 	.scan_bus = ks_pcie_v3_65_scan_bus,
> };
> 
> and added as a .add_bus method to:
> static struct pci_ops ks_pcie_ops = {
> 	.map_bus = dw_pcie_own_conf_map_bus,
> 	.read = pci_generic_config_read,
> 	.write = pci_generic_config_write,
> 	.add_bus = ks_pcie_v3_65_add_bus,
> };
> in addition to renaming the method.
> 
> Even before commit 6ab15b5e7057, with support for AM654x (using version 4.90a
> DWC PCIe IP-core) already present, ks_pcie_ops was being used for AM654x. This
> indicates that prior to commit 6ab15b5e7057, the ks_pcie_ops was applicable to
> AM654x and there's no problem. However, when commit 6ab15b5e7057 converted the
> .scan_bus() method in ks_pcie_host_ops (which wasn't used for AM654x) to
> .add_bus(), it added the method within the shared ks_pcie_ops structure which
> implicitly, *mistakenly*, assumes that a function named "ks_pcie_v3_65_add_bus"
> is also applicable to other controller versions (4.90a in this case). Paying
> attention to the name of the function, it becomes clear that it was added for
> controller version "3.65", hence the name "...v3_65...". The assumption that the
> function/method is applicable to other controller versions as well, was
> incorrect which led to the current issue that can be observed. The commit
> 6ab15b5e7057 ended up adding a new method for a controller version which *never*
> used the method. This therefore is a bug.
> 

> The simplest fix I see is that of exiting ks_pcie_v3_65_add_bus() if:
> ks_pcie->is_am6 is set, since such checks have been scattered throughout the
> same driver prior to the addition of commit 6ab15b5e7057 as well.
> 
> I am open to other implementations of fixing this issue as well. Kindly let me
> know in case of any suggestions.

A simplest fix isn't always the best one. As you said yourself the
ks_pcie_v3_65_add_bus() implies having it called for v3.65. Calling it
for the newer controllers was a mistake causing the bug. Thus having
the ks_pcie_ops utilized for both old and new controllers was also a
mistake. Therefor my suggestion to fix the problem by defining the two
pci_ops structure instances would be more correct at least from the
code readability point of view. It would make the
ks_pcie_v3_65_add_bus() function body and name coherent.

Meanwhile your fix look more like a workaround. The
ks_pcie_v3_65_add_bus() function will be still called for the AM6x
v4.90 controllers, which based on its semantic would be and will be
wrong in any case. So instead of noop-ing the function it would be
better to just drop it being called for the new controllers.

-Serge(y)

> 
> > 
> > -Serge(y)
> > 
> >>  		return 0;
> >>  
> >>  	/* Configure and set up BAR0 */
> >> -- 
> >> 2.34.1
> >>
> 
> -- 
> Regards,
> Siddharth.
Bjorn Helgaas Oct. 18, 2023, 4:36 p.m. UTC | #6
[+cc Serge (please cc people who have commented on previous revisions)]

On Wed, Oct 18, 2023 at 01:20:38PM +0530, Siddharth Vadapalli wrote:
> The ks_pcie_v3_65_add_bus() member of "ks_pcie_ops" was added for
> platforms using DW PCIe IP-core version 3.65a. The AM654x SoC uses
> DW PCIe IP-core version 4.90a and ks_pcie_v3_65_add_bus() is not
> applicable to it.
> 
> The commit which added support for the AM654x SoC has reused majority
> of the functions with the help of the "is_am6" flag to handle AM654x
> separately where applicable. Thus, make use of the "is_am6" flag and
> change ks_pcie_v3_65_add_bus() to no-op for AM654x SoC.
> 
> Fixes: 18b0415bc802 ("PCI: keystone: Add support for PCIe RC in AM654x Platforms")
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> ---
> Hello,
> 
> This patch is based on linux-next tagged next-20231018.
> 
> The v1 of this patch is at:
> https://lore.kernel.org/r/20231011123451.34827-1-s-vadapalli@ti.com/
> 
> While there are a lot of changes since v1 and this patch could have been
> posted as a v1 patch itself, I decided to post it as the v2 of the patch
> mentioned above since it aims to address the issue described by the v1
> patch and is similar in that sense. However, the solution to the issue
> described in the v1 patch appears to be completely different from what
> was implemented in the v1 patch. Thus, the commit message and subject of
> this patch have been modified accordingly.
> 
> Changes since v1:
> - Updated patch subject and commit message.
> - Determined that issue is not with the absence of Link as mentioned in
>   v1 patch. Even with Link up and endpoint device connected, if
>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
>   AER and PME services. The all Fs return value indicates that the MSI-X
>   configuration is failing even if Endpoint device is connected. This is
>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
>   AM654x SoC which uses DW PCIe IP-core version 4.90a.

Thanks for verifying that this doesn't actually depend on whether the
link is up.

I think that means we should be able to get rid of the
ks_pcie_v3_65_add_bus() callback altogether and instead do this along
with the rest of the Root Port init.

>  drivers/pci/controller/dwc/pci-keystone.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 0def919f89fa..3abd59335574 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -459,7 +459,7 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>  
> -	if (!pci_is_root_bus(bus))
> +	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)
>  		return 0;
>  
>  	/* Configure and set up BAR0 */
> -- 
> 2.34.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Siddharth Vadapalli Oct. 19, 2023, 4:37 a.m. UTC | #7
On 18/10/23 17:45, Serge Semin wrote:
> On Wed, Oct 18, 2023 at 05:26:53PM +0530, Siddharth Vadapalli wrote:
>>
>>

...

>>
>> Even before commit 6ab15b5e7057, with support for AM654x (using version 4.90a
>> DWC PCIe IP-core) already present, ks_pcie_ops was being used for AM654x. This
>> indicates that prior to commit 6ab15b5e7057, the ks_pcie_ops was applicable to
>> AM654x and there's no problem. However, when commit 6ab15b5e7057 converted the
>> .scan_bus() method in ks_pcie_host_ops (which wasn't used for AM654x) to
>> .add_bus(), it added the method within the shared ks_pcie_ops structure which
>> implicitly, *mistakenly*, assumes that a function named "ks_pcie_v3_65_add_bus"
>> is also applicable to other controller versions (4.90a in this case). Paying
>> attention to the name of the function, it becomes clear that it was added for
>> controller version "3.65", hence the name "...v3_65...". The assumption that the
>> function/method is applicable to other controller versions as well, was
>> incorrect which led to the current issue that can be observed. The commit
>> 6ab15b5e7057 ended up adding a new method for a controller version which *never*
>> used the method. This therefore is a bug.
>>
> 
>> The simplest fix I see is that of exiting ks_pcie_v3_65_add_bus() if:
>> ks_pcie->is_am6 is set, since such checks have been scattered throughout the
>> same driver prior to the addition of commit 6ab15b5e7057 as well.
>>
>> I am open to other implementations of fixing this issue as well. Kindly let me
>> know in case of any suggestions.
> 
> A simplest fix isn't always the best one. As you said yourself the
> ks_pcie_v3_65_add_bus() implies having it called for v3.65. Calling it
> for the newer controllers was a mistake causing the bug. Thus having
> the ks_pcie_ops utilized for both old and new controllers was also a

At the point in time when support for the new controller was added, ks_pcie_ops
was defined as:
static struct pci_ops ks_pcie_ops = {
	.map_bus = dw_pcie_own_conf_map_bus,
	.read = pci_generic_config_read,
	.write = pci_generic_config_write,
};
which did not have anything version specific.

It is only after commit 6ab15b5e7057 that a version specific .add_bus method was
added to ks_pcie_ops, making ks_pcie_ops version-specific since then.

> mistake. Therefor my suggestion to fix the problem by defining the two
> pci_ops structure instances would be more correct at least from the
> code readability point of view. It would make the
> ks_pcie_v3_65_add_bus() function body and name coherent.

Sure. Thank you for the suggestion. I will leave ks_pcie_ops as-is for the older
3.65 controller while adding the ks_pcie_am6_ops without the .add_bus method for
the newer 4.90 controller. I assume this should be acceptable since the
pci-keystone.c driver only has two controller versions, namely 3.65a and 4.90a,
with the new 4.90a controller only applicable to AM654x SoC which is already
being distinguished in the driver using the is_am6 flag.

In the v3 patch, I will add the following:

static struct pci_ops ks_pcie_am6_ops = {
	.map_bus = dw_pcie_own_conf_map_bus,
	.read = pci_generic_config_read,
	.write = pci_generic_config_write,
};

and also update ks_pcie_host_init() to the following:
if(ks_pcie->is_am6)
	pp->bridge->ops = &ks_pcie_am6_ops;
else
	pp->bridge->ops = &ks_pcie_ops;

> 
> Meanwhile your fix look more like a workaround. The
> ks_pcie_v3_65_add_bus() function will be still called for the AM6x
> v4.90 controllers, which based on its semantic would be and will be
> wrong in any case. So instead of noop-ing the function it would be
> better to just drop it being called for the new controllers.

Yes, I will drop it for the new 4.90a controller rather than making it a no-op.

> 
> -Serge(y)
> 
>>
>>>
>>> -Serge(y)
>>>
>>>>  		return 0;
>>>>  
>>>>  	/* Configure and set up BAR0 */
>>>> -- 
>>>> 2.34.1
>>>>
>>
>> -- 
>> Regards,
>> Siddharth.
Siddharth Vadapalli Oct. 19, 2023, 4:44 a.m. UTC | #8
Hello Bjorn,

On 18/10/23 22:06, Bjorn Helgaas wrote:
> [+cc Serge (please cc people who have commented on previous revisions)]

Sure, I will do so.

> 
> On Wed, Oct 18, 2023 at 01:20:38PM +0530, Siddharth Vadapalli wrote:

...

>>
>> Changes since v1:
>> - Updated patch subject and commit message.
>> - Determined that issue is not with the absence of Link as mentioned in
>>   v1 patch. Even with Link up and endpoint device connected, if
>>   ks_pcie_v3_65_add_bus() is invoked and executed, all reads to the
>>   MSI-X offsets return 0xffffffff when pcieport driver attempts to setup
>>   AER and PME services. The all Fs return value indicates that the MSI-X
>>   configuration is failing even if Endpoint device is connected. This is
>>   because the ks_pcie_v3_65_add_bus() function is not applicable to the
>>   AM654x SoC which uses DW PCIe IP-core version 4.90a.
> 
> Thanks for verifying that this doesn't actually depend on whether the
> link is up.
> 
> I think that means we should be able to get rid of the
> ks_pcie_v3_65_add_bus() callback altogether and instead do this along
> with the rest of the Root Port init.

Yes, I will follow Serge's suggestion of adding a new pci_ops structure for the
AM654x SoC which uses the new 4.90a controller. I have described it at:
https://lore.kernel.org/r/ba217723-1501-4e72-b143-e0047266ea9a@ti.com/
and am summarizing it below:

I will add the following:
static struct pci_ops ks_pcie_am6_ops = {
	.map_bus = dw_pcie_own_conf_map_bus,
	.read = pci_generic_config_read,
	.write = pci_generic_config_write,
};
which shall be used for AM654x SoC

I will also modify the contents of ks_pcie_host_init() as follows:
if(ks_pcie->is_am6)
	pp->bridge->ops = &ks_pcie_am6_ops;
else
	pp->bridge->ops = &ks_pcie_ops;

which will ensure that the .add_bus() method is no longer applicable to the
AM654x SoC, which was the case prior to commit 6ab15b5e7057.

I shall post the v3 patch with the above changes and also Cc Serge.
Siddharth Vadapalli Oct. 19, 2023, 8:17 a.m. UTC | #9
Hello Serge,

On 19/10/23 10:07, Siddharth Vadapalli wrote:
> 
> 
> On 18/10/23 17:45, Serge Semin wrote:
>> On Wed, Oct 18, 2023 at 05:26:53PM +0530, Siddharth Vadapalli wrote:

...

> 
> Sure. Thank you for the suggestion. I will leave ks_pcie_ops as-is for the older
> 3.65 controller while adding the ks_pcie_am6_ops without the .add_bus method for
> the newer 4.90 controller. I assume this should be acceptable since the
> pci-keystone.c driver only has two controller versions, namely 3.65a and 4.90a,
> with the new 4.90a controller only applicable to AM654x SoC which is already
> being distinguished in the driver using the is_am6 flag.
> 
> In the v3 patch, I will add the following:
> 
> static struct pci_ops ks_pcie_am6_ops = {
> 	.map_bus = dw_pcie_own_conf_map_bus,
> 	.read = pci_generic_config_read,
> 	.write = pci_generic_config_write,
> };
> 
> and also update ks_pcie_host_init() to the following:
> if(ks_pcie->is_am6)
> 	pp->bridge->ops = &ks_pcie_am6_ops;
> else
> 	pp->bridge->ops = &ks_pcie_ops;
> 
>>
>> Meanwhile your fix look more like a workaround. The
>> ks_pcie_v3_65_add_bus() function will be still called for the AM6x
>> v4.90 controllers, which based on its semantic would be and will be
>> wrong in any case. So instead of noop-ing the function it would be
>> better to just drop it being called for the new controllers.
> 
> Yes, I will drop it for the new 4.90a controller rather than making it a no-op.

I have posted the v3 patch at:
https://lore.kernel.org/r/20231019081330.2975470-1-s-vadapalli@ti.com/
implementing your suggestion of adding a new pci_ops structure.

Please review it and let me know in case of any feedback.
Siddharth Vadapalli Oct. 19, 2023, 8:22 a.m. UTC | #10
On 19/10/23 10:14, Siddharth Vadapalli wrote:
> Hello Bjorn,
> 
> On 18/10/23 22:06, Bjorn Helgaas wrote:
>> [+cc Serge (please cc people who have commented on previous revisions)]
> 
> Sure, I will do so.
> 

...

> 
> Yes, I will follow Serge's suggestion of adding a new pci_ops structure for the
> AM654x SoC which uses the new 4.90a controller. I have described it at:
> https://lore.kernel.org/r/ba217723-1501-4e72-b143-e0047266ea9a@ti.com/
> and am summarizing it below:
> 
> I will add the following:
> static struct pci_ops ks_pcie_am6_ops = {
> 	.map_bus = dw_pcie_own_conf_map_bus,
> 	.read = pci_generic_config_read,
> 	.write = pci_generic_config_write,
> };
> which shall be used for AM654x SoC
> 
> I will also modify the contents of ks_pcie_host_init() as follows:
> if(ks_pcie->is_am6)
> 	pp->bridge->ops = &ks_pcie_am6_ops;
> else
> 	pp->bridge->ops = &ks_pcie_ops;
> 
> which will ensure that the .add_bus() method is no longer applicable to the
> AM654x SoC, which was the case prior to commit 6ab15b5e7057.
> 
> I shall post the v3 patch with the above changes and also Cc Serge.

I have posted the v3 patch at:
https://lore.kernel.org/r/20231019081330.2975470-1-s-vadapalli@ti.com/
and have copied Serge in the mail.

>
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
index 0def919f89fa..3abd59335574 100644
--- a/drivers/pci/controller/dwc/pci-keystone.c
+++ b/drivers/pci/controller/dwc/pci-keystone.c
@@ -459,7 +459,7 @@  static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
 
-	if (!pci_is_root_bus(bus))
+	if (!pci_is_root_bus(bus) || ks_pcie->is_am6)
 		return 0;
 
 	/* Configure and set up BAR0 */