
PCI: keystone: Don't enable BAR0 if link is not detected

Message ID 20231011123451.34827-1-s-vadapalli@ti.com
State New

Commit Message

Siddharth Vadapalli Oct. 11, 2023, 12:34 p.m. UTC
Since the function dw_pcie_host_init() ignores the absence of link under
the assumption that link can come up later, it is possible that the
pci_host_probe(bridge) function is invoked even when no endpoint device
is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
configures BAR0 when the link is not up, resulting in Completion Timeouts
during the MSI configuration performed later by the PCI Express Port driver
to set up AER, PME and other services. Thus, leave BAR0 disabled if link is
not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.

Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
---

Hello,

This patch is based on linux-next tagged next-20231011.

Regards,
Siddharth.

 drivers/pci/controller/dwc/pci-keystone.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Bjorn Helgaas Oct. 11, 2023, 1:46 p.m. UTC | #1
Hi Siddharth,

On Wed, Oct 11, 2023 at 06:04:51PM +0530, Siddharth Vadapalli wrote:
> Since the function dw_pcie_host_init() ignores the absence of link under
> the assumption that link can come up later, it is possible that the
> pci_host_probe(bridge) function is invoked even when no endpoint device
> is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
> configures BAR0 when the link is not up, resulting in Completion Timeouts
> during the MSI configuration performed later by the PCI Express Port driver
> to setup AER, PME and other services. Thus, leave BAR0 disabled if link is
> not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.

I'm trying to make sense of this.  In this path:

  pci_host_probe
    pci_scan_root_bus_bridge
      pci_register_host_bridge
	bus = pci_alloc_bus(NULL)    # root bus
	bus->ops->add_bus(bus)
	  ks_pcie_v3_65_add_bus

The BAR0 in question must belong to a Root Port.  And it sounds like
the issue must be related to MSI-X, since the original MSI doesn't
involve any BARs.

I don't understand why the Root Port's BAR0 is related to the link
being up.  MSI-X configuration of the Root Port (presumably using
BAR0) shouldn't involve any transactions to devices *below* the Root
Port, and I would expect that BAR to be available (readable and
writable) regardless of whether the link is up.

If we skip the BAR0 configuration because the link is down at the time
of pci_host_probe(), when *do* we do that configuration?  I don't see
another path to ks_pcie_v3_65_add_bus() for the root bus later.

Do you know what exactly causes the Completion Timeout?  Is this a
read to BAR0, or some attempted access to a downstream device, or
something else?

Keystone is the only driver that uses .add_bus() for this, so it seems
a little weird, but maybe this is related to some Keystone-specific
hardware design.

> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> ---
> 
> Hello,
> 
> This patch is based on linux-next tagged next-20231011.
> 
> Regards,
> Siddharth.
> 
>  drivers/pci/controller/dwc/pci-keystone.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
> index 49aea6ce3e87..ac2ad112d616 100644
> --- a/drivers/pci/controller/dwc/pci-keystone.c
> +++ b/drivers/pci/controller/dwc/pci-keystone.c
> @@ -459,7 +459,8 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>  
> -	if (!pci_is_root_bus(bus))
> +	/* Don't enable BAR0 if link is not yet up at this point */
> +	if (!pci_is_root_bus(bus) || !dw_pcie_link_up(pci))
>  		return 0;
>  
>  	/* Configure and set up BAR0 */
> -- 
> 2.34.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Siddharth Vadapalli Oct. 12, 2023, 4:45 a.m. UTC | #2
Hello Bjorn,

Thank you for reviewing the patch.

On 11/10/23 19:16, Bjorn Helgaas wrote:
> Hi Siddharth,
> 
> On Wed, Oct 11, 2023 at 06:04:51PM +0530, Siddharth Vadapalli wrote:
>> Since the function dw_pcie_host_init() ignores the absence of link under
>> the assumption that link can come up later, it is possible that the
>> pci_host_probe(bridge) function is invoked even when no endpoint device
>> is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
>> configures BAR0 when the link is not up, resulting in Completion Timeouts
>> during the MSI configuration performed later by the PCI Express Port driver
>> to setup AER, PME and other services. Thus, leave BAR0 disabled if link is
>> not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.
> 
> I'm trying to make sense of this.  In this path:
> 
>   pci_host_probe
>     pci_scan_root_bus_bridge
>       pci_register_host_bridge
> 	bus = pci_alloc_bus(NULL)    # root bus
> 	bus->ops->add_bus(bus)
> 	  ks_pcie_v3_65_add_bus
> 
> The BAR0 in question must belong to a Root Port.  And it sounds like
> the issue must be related to MSI-X, since the original MSI doesn't
> involve any BARs.

Yes, the issue is related to MSI-X. I will list the exact chain of function
calls below, along with the point where the completion timeout first occurs:
ks_pcie_probe
  dw_pcie_host_init
    pci_host_probe
      pci_bus_add_devices
        pci_bus_add_device
          device_attach
            __device_attach
              bus_for_each_drv
                __device_attach_driver (invoked using fn(drv, data))
                  driver_probe_device
                    __driver_probe_device
                      really_probe
                        pci_device_probe
                          pcie_portdrv_probe
                            pcie_port_device_register
                              pcie_init_service_irqs
                                pcie_port_enable_irq_vec
                                  pci_alloc_irq_vectors
                                    pci_alloc_irq_vectors_affinity
                                      __pci_enable_msix_range
                                        msix_capability_init
                                          msix_setup_interrupts
                                            msix_setup_msi_descs
                                              msix_prepare_msi_desc
In msix_prepare_msi_desc(), the following readl() causes the completion
timeout:
		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
The completion timeout with the readl() is only observed when the link is down
(no Endpoint device is actually connected to the PCIe connector slot).

The symptoms of the above completion timeout show up as a 45 second delay
during boot if no Endpoint device is connected. The delay arises because each
readl(), which normally takes about 4 milliseconds when an Endpoint device is
present, takes around 40 milliseconds while waiting for the completion. Also,
if I disable Completion Timeout in the PCIe controller, Linux hangs at the
readl() mentioned above, which is the very first readl() to cause a completion
timeout.

> 
> I don't understand why the Root Port's BAR0 is related to the link
> being up.  MSI-X configuration of the Root Port (presumably using
> BAR0) shouldn't involve any transactions to devices *below* the Root
> Port, and I would expect that BAR to be available (readable and
> writable) regardless of whether the link is up.
> 
> If we skip the BAR0 configuration because the link is down at the time
> of pci_host_probe(), when *do* we do that configuration?  I don't see
> another path to ks_pcie_v3_65_add_bus() for the root bus later.
> 
> Do you know what exactly causes the Completion Timeout?  Is this a
> read to BAR0, or some attempted access to a downstream device, or
> something else?
> 
> Keystone is the only driver that uses .add_bus() for this, so it seems
> a little weird, but maybe this is related to some Keystone-specific
> hardware design.

Yes, I am not fully sure myself why enabling BAR0 causes the issue. I will
debug further within ks_pcie_v3_65_add_bus() to see which section of it causes
problems when the link is down. What I am certain of, however, is that exiting
ks_pcie_v3_65_add_bus() when the link is down fixes the completion timeouts
observed above with the readl(), thereby making the 45 second boot delay
vanish when no endpoint device is connected.

Please let me know in case of any suggestions to fix this issue.

> 
>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>> ---
>>
>> Hello,
>>
>> This patch is based on linux-next tagged next-20231011.
>>
>> Regards,
>> Siddharth.
>>
>>  drivers/pci/controller/dwc/pci-keystone.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
>> index 49aea6ce3e87..ac2ad112d616 100644
>> --- a/drivers/pci/controller/dwc/pci-keystone.c
>> +++ b/drivers/pci/controller/dwc/pci-keystone.c
>> @@ -459,7 +459,8 @@ static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
>>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
>>  	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
>>  
>> -	if (!pci_is_root_bus(bus))
>> +	/* Don't enable BAR0 if link is not yet up at this point */
>> +	if (!pci_is_root_bus(bus) || !dw_pcie_link_up(pci))
>>  		return 0;
>>  
>>  	/* Configure and set up BAR0 */
>> -- 
>> 2.34.1
>>
>>
Bjorn Helgaas Oct. 12, 2023, 4:43 p.m. UTC | #3
On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
> Hello Bjorn,
> 
> Thank you for reviewing the patch.
> 
> On 11/10/23 19:16, Bjorn Helgaas wrote:
> > Hi Siddharth,
> > 
> > On Wed, Oct 11, 2023 at 06:04:51PM +0530, Siddharth Vadapalli wrote:
> >> Since the function dw_pcie_host_init() ignores the absence of link under
> >> the assumption that link can come up later, it is possible that the
> >> pci_host_probe(bridge) function is invoked even when no endpoint device
> >> is connected. In such a situation, the ks_pcie_v3_65_add_bus() function
> >> configures BAR0 when the link is not up, resulting in Completion Timeouts
> >> during the MSI configuration performed later by the PCI Express Port driver
> >> to setup AER, PME and other services. Thus, leave BAR0 disabled if link is
> >> not yet detected when the ks_pcie_v3_65_add_bus() function is invoked.
> > 
> > I'm trying to make sense of this.  In this path:
> > 
> >   pci_host_probe
> >     pci_scan_root_bus_bridge
> >       pci_register_host_bridge
> > 	bus = pci_alloc_bus(NULL)    # root bus
> > 	bus->ops->add_bus(bus)
> > 	  ks_pcie_v3_65_add_bus
> > 
> > The BAR0 in question must belong to a Root Port.  And it sounds like
> > the issue must be related to MSI-X, since the original MSI doesn't
> > involve any BARs.
> 
> Yes, the issue is related to MSI-X. I will list down the exact set of function
> calls below as well as the place where the completion timeout first occurs:
> ks_pcie_probe
>   dw_pcie_host_init
>     pci_host_probe
>       pci_bus_add_devices
>         pci_bus_add_device
>           device_attach
>             __device_attach
>               bus_for_each_drv
>                 __device_attach_driver (invoked using fn(drv, data))
>                   driver_probe_device
>                     __driver_probe_device
>                       really_probe
>                         pci_device_probe
>                           pcie_portdrv_probe
>                             pcie_port_device_register
>                               pcie_init_service_irqs
>                                 pcie_port_enable_irq_vec
>                                   pci_alloc_irq_vectors
>                                     pci_alloc_irq_vectors_affinity
>                                       __pci_enable_msix_range
>                                         msix_capability_init
>                                           msix_setup_interrupts
>                                             msix_setup_msi_descs
>                                               msix_prepare_msi_desc
> In this function: msix_prepare_msi_desc, the following readl()
> causes completion timeout:
> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
> The completion timeout with the readl is only observed when the link
> is down (No Endpoint device is actually connected to the PCIe
> connector slot).

Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
like it should be:

  desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE

and desc->pci.mask_base should be dev->msix_base, which we got from
msix_map_region(), which ioremaps part of the BAR indicated by the
MSI-X Table Offset/Table BIR register.
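
The address arithmetic described above can be sketched as a small user-space
check. This is only an illustration of pci_msix_desc_addr()'s computation; the
two constants mirror the kernel's MSI-X definitions (16-byte table entries,
Vector Control as the last dword of an entry):

```c
#include <stdint.h>

/* Mirrors of the kernel's MSI-X layout constants: each table entry is
 * 16 bytes, and the Vector Control dword sits at offset 12 within it. */
#define PCI_MSIX_ENTRY_SIZE        16
#define PCI_MSIX_ENTRY_VECTOR_CTRL 12

/* Sketch of pci_msix_desc_addr(): the virtual address of entry
 * msi_index's slot in the ioremapped MSI-X table. */
static uintptr_t msix_desc_addr(uintptr_t mask_base, unsigned int msi_index)
{
	return mask_base + (uintptr_t)msi_index * PCI_MSIX_ENTRY_SIZE;
}
```

With msi_index 0 the descriptor address equals mask_base, so the failing
readl() targets mask_base + PCI_MSIX_ENTRY_VECTOR_CTRL.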

I wonder if this readl() is being handled as an MMIO access to a
downstream device instead of a Root Port BAR access because it's
inside the Root Port's MMIO window.

Could you dump out these values just before the readl()?

  phys_addr inside msix_map_region()
  dev->msix_base
  desc->pci.mask_base
  desc->msi_index
  addr
  call early_dump_pci_device() on the Root Port

Bjorn
Siddharth Vadapalli Oct. 13, 2023, 5:03 a.m. UTC | #4
On 12/10/23 22:13, Bjorn Helgaas wrote:
> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
>> Hello Bjorn,
>>
>> Thank you for reviewing the patch.
>>
>> On 11/10/23 19:16, Bjorn Helgaas wrote:
>>> Hi Siddharth,
>>>

...

>>                                               msix_prepare_msi_desc
>> In this function: msix_prepare_msi_desc, the following readl()
>> causes completion timeout:
>> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>> The completion timeout with the readl is only observed when the link
>> is down (No Endpoint device is actually connected to the PCIe
>> connector slot).
> 
> Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
> like it should be:
> 
>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
> 
> and desc->pci.mask_base should be dev->msix_base, which we got from
> msix_map_region(), which ioremaps part of the BAR indicated by the
> MSI-X Table Offset/Table BIR register.
> 
> I wonder if this readl() is being handled as an MMIO access to a
> downstream device instead of a Root Port BAR access because it's
> inside the Root Port's MMIO window.
> 
> Could you dump out these values just before the readl()?
> 
>   phys_addr inside msix_map_region()
>   dev->msix_base
>   desc->pci.mask_base
>   desc->msi_index
>   addr

phys_addr: 0x10102000
msix_base: 0xffff80000997a000
mask_base: 0xffff80000997a000
msi_index: 0
addr: 0xffff80000997a000

Also, the details of BAR allocation from the logs are:
keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
keystone-pcie 5500000.pcie:       IO 0x0010020000..0x001002ffff -> 0x0000000000
keystone-pcie 5500000.pcie:      MEM 0x0010030000..0x0017ffffff -> 0x0010030000
keystone-pcie 5500000.pcie: iATU unroll: enabled
keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
keystone-pcie 5500000.pcie: Phy link never came up
keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
pci 0000:00:00.0: PCI bridge to [bus 01-ff]

The value of phys_addr lies within the range allocated to BAR0.
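
As a quick sanity check on the numbers above (a trivial sketch using the
logged values; the helper name is mine):

```c
#include <stdint.h>
#include <stdbool.h>

/* Logged values: BAR 0 assigned [mem 0x10100000-0x101fffff], root bus
 * window [mem 0x10030000-0x17ffffff], MSI-X phys_addr 0x10102000. */
static bool in_range(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr <= end;
}
```

phys_addr 0x10102000 falls inside both BAR0 and the root bus memory window,
which is consistent with the theory that the readl() could be routed as an
MMIO access through the bridge window rather than a BAR0 access.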

>   call early_dump_pci_device() on the Root Port

I invoked early_dump_pci_device() within the pci_setup_device() function in
drivers/pci/probe.c and the output is:

pci 0000:00:00.0: config space:
00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
00000040: 01 50 c3 5b 08 00 00 00 00 00 00 00 00 00 00 00
00000050: 05 70 80 01 00 00 00 00 00 00 00 00 00 00 00 00
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000070: 10 b0 42 00 01 80 00 00 10 20 00 00 22 3c 73 00
00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 00 00 01 00
00000090: 00 00 00 00 1f 04 00 00 00 00 00 00 06 00 00 00
000000a0: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
000000b0: 11 00 00 01 00 20 00 00 00 40 00 00 00 00 00 00
000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


> 
> Bjorn
Siddharth Vadapalli Oct. 13, 2023, 5:06 a.m. UTC | #5
On 13/10/23 10:33, Siddharth Vadapalli wrote:
> 
> 
> On 12/10/23 22:13, Bjorn Helgaas wrote:
>> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
>>> Hello Bjorn,
>>>
>>> Thank you for reviewing the patch.
>>>
>>> On 11/10/23 19:16, Bjorn Helgaas wrote:
>>>> Hi Siddharth,
>>>>
> 
> ...
> 
>>>                                               msix_prepare_msi_desc
>>> In this function: msix_prepare_msi_desc, the following readl()
>>> causes completion timeout:
>>> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>>> The completion timeout with the readl is only observed when the link
>>> is down (No Endpoint device is actually connected to the PCIe
>>> connector slot).
>>
>> Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
>> like it should be:
>>
>>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
>>
>> and desc->pci.mask_base should be dev->msix_base, which we got from
>> msix_map_region(), which ioremaps part of the BAR indicated by the
>> MSI-X Table Offset/Table BIR register.
>>
>> I wonder if this readl() is being handled as an MMIO access to a
>> downstream device instead of a Root Port BAR access because it's
>> inside the Root Port's MMIO window.
>>
>> Could you dump out these values just before the readl()?
>>
>>   phys_addr inside msix_map_region()
>>   dev->msix_base
>>   desc->pci.mask_base
>>   desc->msi_index
>>   addr
> 
> phys_addr: 0x10102000
> msix_base: 0xffff80000997a000
> mask_base: 0xffff80000997a000
> msi_index: 0
> addr: 0xffff80000997a000
> 
> Also, the details of BAR allocation from the logs are:
> keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
> keystone-pcie 5500000.pcie:       IO 0x0010020000..0x001002ffff -> 0x0000000000
> keystone-pcie 5500000.pcie:      MEM 0x0010030000..0x0017ffffff -> 0x0010030000
> keystone-pcie 5500000.pcie: iATU unroll: enabled
> keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
> keystone-pcie 5500000.pcie: Phy link never came up
> keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [bus 00-ff]
> pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
> pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
> pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
> pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
> pci 0000:00:00.0: supports D1
> pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
> pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
> pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> 
> The value of phys_addr lies within the range allocated to BAR0.
> 
>>   call early_dump_pci_device() on the Root Port
> 
> I invoked early_dump_pci_device() within the pci_setup_device() function in
> drivers/pci/probe.c and the output is:
> 
> pci 0000:00:00.0: config space:
> 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
> 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00000030: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
> 00000040: 01 50 c3 5b 08 00 00 00 00 00 00 00 00 00 00 00
> 00000050: 05 70 80 01 00 00 00 00 00 00 00 00 00 00 00 00
> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00000070: 10 b0 42 00 01 80 00 00 10 20 00 00 22 3c 73 00
> 00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 00 00 01 00
> 00000090: 00 00 00 00 1f 04 00 00 00 00 00 00 06 00 00 00
> 000000a0: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000b0: 11 00 00 01 00 20 00 00 00 40 00 00 00 00 00 00
> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I also noticed that the value of desc->pci.msix_ctrl obtained from the readl()
is always 0xffffffff, irrespective of whether or not an endpoint device is
connected. That isn't expected, right? The only difference between the cases
where an endpoint device is and isn't connected is the completion timeout.
Bjorn Helgaas Oct. 13, 2023, 6:49 p.m. UTC | #6
On Fri, Oct 13, 2023 at 10:36:01AM +0530, Siddharth Vadapalli wrote:
> On 13/10/23 10:33, Siddharth Vadapalli wrote:
> > On 12/10/23 22:13, Bjorn Helgaas wrote:
> >> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
> >>> On 11/10/23 19:16, Bjorn Helgaas wrote:
> > ...
> >>>                                               msix_prepare_msi_desc
> >>> In this function: msix_prepare_msi_desc, the following readl()
> >>> causes completion timeout:
> >>> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
> >>> The completion timeout with the readl is only observed when the link
> >>> is down (No Endpoint device is actually connected to the PCIe
> >>> connector slot).
> >>
> >> Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
> >> like it should be:
> >>
> >>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
> >>
> >> and desc->pci.mask_base should be dev->msix_base, which we got from
> >> msix_map_region(), which ioremaps part of the BAR indicated by the
> >> MSI-X Table Offset/Table BIR register.
> >>
> >> I wonder if this readl() is being handled as an MMIO access to a
> >> downstream device instead of a Root Port BAR access because it's
> >> inside the Root Port's MMIO window.
> >>
> >> Could you dump out these values just before the readl()?
> >>
> >>   phys_addr inside msix_map_region()
> >>   dev->msix_base
> >>   desc->pci.mask_base
> >>   desc->msi_index
> >>   addr
> > 
> > phys_addr: 0x10102000
> > msix_base: 0xffff80000997a000
> > mask_base: 0xffff80000997a000
> > msi_index: 0
> > addr: 0xffff80000997a000
> > 
> > Also, the details of BAR allocation from the logs are:
> > keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
> > keystone-pcie 5500000.pcie:       IO 0x0010020000..0x001002ffff -> 0x0000000000
> > keystone-pcie 5500000.pcie:      MEM 0x0010030000..0x0017ffffff -> 0x0010030000
> > keystone-pcie 5500000.pcie: iATU unroll: enabled
> > keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
> > keystone-pcie 5500000.pcie: Phy link never came up
> > keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
> > pci_bus 0000:00: root bus resource [bus 00-ff]
> > pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> > pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
> > pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
> > pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
> > pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
> > pci 0000:00:00.0: supports D1
> > pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> > pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
> > pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
> > pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> > 
> > The value of phys_addr lies within the range allocated to BAR0.
> > 
> >>   call early_dump_pci_device() on the Root Port
> > 
> > I invoked early_dump_pci_device() within the pci_setup_device() function in
> > drivers/pci/probe.c and the output is:

It'd be better to dump the config space immediately before the readl()
since the PCI core did change some things in the interim.

> > pci 0000:00:00.0: config space:
> > 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00

  PCI_COMMAND = 0x0107
    PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER | PCI_COMMAND_SERR
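
The decode above can be checked mechanically; this sketch uses the
PCI_COMMAND bit values from include/uapi/linux/pci_regs.h and reassembles the
register from the little-endian dump bytes "07 01" at offset 4:

```c
#include <stdint.h>

/* PCI_COMMAND bits, as in include/uapi/linux/pci_regs.h. */
#define PCI_COMMAND_IO      0x001
#define PCI_COMMAND_MEMORY  0x002
#define PCI_COMMAND_MASTER  0x004
#define PCI_COMMAND_SERR    0x100

/* Config space dumps print bytes in little-endian order; rebuild a
 * 16-bit register from two dump bytes. */
static uint16_t cfg_word(uint8_t lo, uint8_t hi)
{
	return (uint16_t)(lo | (uint16_t)hi << 8);
}
```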

> > 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
> > 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

  PCI_IO_BASE            =       0x00 low 4 bits indicate 16-bit addressing
  PCI_IO_LIMIT           =       0x00
  PCI_MEMORY_BASE        =     0x0000
  PCI_MEMORY_LIMIT       =     0x0000
  PCI_PREF_MEMORY_BASE   =     0x0000 low 4 bits indicate 32-bit pref
  PCI_PREF_MEMORY_LIMIT  =     0x0000
  PCI_PREF_BASE_UPPER32  = 0x00000000
  PCI_PREF_LIMIT_UPPER32 = 0x00000000

We can't tell from this whether the prefetchable window is implemented
(but I'm sure keystone *does* implement it).  If PCI_PREF_MEMORY_BASE
and PCI_PREF_MEMORY_LIMIT are read-only zeros, it is not implemented.
If they are writable, it is enabled at the same range as the
non-prefetchable window.

Similarly for the I/O window; we can't tell whether the base/limit are
read-only zero or writable.

So we have these windows that look like they're probably enabled:

  io window   at [io  0x0000-0x0fff]
  mem window  at [mem 0x00000000-0x000fffff]
  pref window at [mem 0x00000000-0x000fffff pref]

No idea whether it makes a difference, but these windows seem
misconfigured.  The default should probably be to make them all
disabled (as in f73eedc90bf7 ("PCI: vmd: Disable bridge window for
domain reset")):

  PCI_IO_BASE            = 0xf0
  PCI_IO_LIMIT           = 0x00
  PCI_MEMORY_BASE        = 0xfff0
  PCI_MEMORY_LIMIT       = 0x0000
  PCI_PREF_MEMORY_BASE   = 0xfff0
  PCI_PREF_MEMORY_LIMIT  = 0x0000
  PCI_PREF_BASE_UPPER32  = 0xffffffff
  PCI_PREF_LIMIT_UPPER32 = 0x00000000

The PCI core should reconfigure and enable them as needed by
downstream devices.
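
The non-prefetchable window decode used above can be sketched as follows
(struct and helper names are mine; the register layout is the standard Type 1
bridge one: upper 12 bits of each 16-bit register give address bits 31:20,
1 MB granularity, and base above limit means the window is disabled):

```c
#include <stdint.h>
#include <stdbool.h>

struct mem_window {
	uint32_t base, limit;
	bool enabled;
};

/* Decode a Type 1 bridge memory window from PCI_MEMORY_BASE and
 * PCI_MEMORY_LIMIT register values. */
static struct mem_window decode_mem_window(uint16_t base_reg, uint16_t limit_reg)
{
	struct mem_window w;

	w.base = (uint32_t)(base_reg & 0xfff0) << 16;
	w.limit = ((uint32_t)(limit_reg & 0xfff0) << 16) | 0xfffff;
	w.enabled = w.base <= w.limit;
	return w;
}
```

Base/limit values of 0x0000/0x0000 decode to the suspicious
[mem 0x00000000-0x000fffff] window noted above, while the 0xfff0/0x0000
values from f73eedc90bf7 decode as disabled.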

> I also noticed that the value of desc->pci.msix_ctrl obtained from
> the readl is always 0xffffffff irrespective of whether or not an
> endpoint device is connected. This isn't expected right? The only
> difference between the cases where endpoint device is connected and
> isn't connected is the completion timeout.

Right, I wouldn't expect that.  PCI_MSIX_ENTRY_VECTOR_CTRL has a bunch
of reserved bits that should be zero.

I assume MSI-X actually does work for downstream endpoints?  I
wouldn't think anybody would have bothered with
ks_pcie_v3_65_add_bus() unless MSI-X works.

Bjorn
Siddharth Vadapalli Oct. 14, 2023, 5:52 a.m. UTC | #7
On 14/10/23 00:19, Bjorn Helgaas wrote:
> On Fri, Oct 13, 2023 at 10:36:01AM +0530, Siddharth Vadapalli wrote:
>> On 13/10/23 10:33, Siddharth Vadapalli wrote:
>>> On 12/10/23 22:13, Bjorn Helgaas wrote:
>>>> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
>>>>> On 11/10/23 19:16, Bjorn Helgaas wrote:
>>> ...
>>>>>                                               msix_prepare_msi_desc
>>>>> In this function: msix_prepare_msi_desc, the following readl()
>>>>> causes completion timeout:
>>>>> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>>>>> The completion timeout with the readl is only observed when the link
>>>>> is down (No Endpoint device is actually connected to the PCIe
>>>>> connector slot).
>>>>
>>>> Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
>>>> like it should be:
>>>>
>>>>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
>>>>
>>>> and desc->pci.mask_base should be dev->msix_base, which we got from
>>>> msix_map_region(), which ioremaps part of the BAR indicated by the
>>>> MSI-X Table Offset/Table BIR register.
>>>>
>>>> I wonder if this readl() is being handled as an MMIO access to a
>>>> downstream device instead of a Root Port BAR access because it's
>>>> inside the Root Port's MMIO window.
>>>>
>>>> Could you dump out these values just before the readl()?
>>>>
>>>>   phys_addr inside msix_map_region()
>>>>   dev->msix_base
>>>>   desc->pci.mask_base
>>>>   desc->msi_index
>>>>   addr
>>>
>>> phys_addr: 0x10102000
>>> msix_base: 0xffff80000997a000
>>> mask_base: 0xffff80000997a000
>>> msi_index: 0
>>> addr: 0xffff80000997a000
>>>
>>> Also, the details of BAR allocation from the logs are:
>>> keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
>>> keystone-pcie 5500000.pcie:       IO 0x0010020000..0x001002ffff -> 0x0000000000
>>> keystone-pcie 5500000.pcie:      MEM 0x0010030000..0x0017ffffff -> 0x0010030000
>>> keystone-pcie 5500000.pcie: iATU unroll: enabled
>>> keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
>>> keystone-pcie 5500000.pcie: Phy link never came up
>>> keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
>>> pci_bus 0000:00: root bus resource [bus 00-ff]
>>> pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
>>> pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
>>> pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
>>> pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
>>> pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
>>> pci 0000:00:00.0: supports D1
>>> pci 0000:00:00.0: PME# supported from D0 D1 D3hot
>>> pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
>>> pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
>>> pci 0000:00:00.0: PCI bridge to [bus 01-ff]
>>>
>>> The value of phys_addr lies within the range allocated to BAR0.
>>>
>>>>   call early_dump_pci_device() on the Root Port
>>>
>>> I invoked early_dump_pci_device() within the pci_setup_device() function in
>>> drivers/pci/probe.c and the output is:
> 
> It'd be better to dump the config space immediately before the readl()
> since the PCI core did change some things in the interim.

I dumped the config space just before the readl() and it is:

pcieport 0000:00:00.0: config space:
00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
00000010: 00 00 10 10 00 00 00 00 00 01 ff 00 f0 00 00 00
00000020: f0 ff 00 00 f0 ff 00 00 00 00 00 00 00 00 00 00
00000030: 00 00 00 00 40 00 00 00 00 00 00 00 17 01 02 00
00000040: 01 50 c3 5b 08 00 00 00 00 00 00 00 00 00 00 00
00000050: 05 70 80 01 00 00 00 00 00 00 00 00 00 00 00 00
00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000070: 10 b0 42 00 01 80 00 00 1f 20 00 00 22 3c 73 00
00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 10 00 01 00
00000090: 00 00 00 00 1f 04 00 00 00 00 00 00 06 00 00 00
000000a0: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
000000b0: 11 00 00 c1 00 20 00 00 00 40 00 00 00 00 00 00
000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

BAR0 was configured with the value 0x05500000 within ks_pcie_v3_65_add_bus() at:
dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, ks_pcie->app.start);
which can be verified from the previous config space dump I shared from within
pci_setup_device(). But in the dump above, taken just before the readl(), its
value has changed to 0x10100000.

I am listing down the regions of config space which have changed, based on the
comparison of dumps taken within pci_setup_device() and just before the readl():

Within pci_setup_device():
00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
Before readl():
00000010: 00 00 10 10 00 00 00 00 00 01 ff 00 f0 00 00 00

Within pci_setup_device():
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Before readl():
00000020: f0 ff 00 00 f0 ff 00 00 00 00 00 00 00 00 00 00

Within pci_setup_device():
00000030: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
Before readl():
00000030: 00 00 00 00 40 00 00 00 00 00 00 00 17 01 02 00

Within pci_setup_device():
00000070: 10 b0 42 00 01 80 00 00 10 20 00 00 22 3c 73 00
Before readl():
00000070: 10 b0 42 00 01 80 00 00 1f 20 00 00 22 3c 73 00

Within pci_setup_device():
00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 00 00 01 00
Before readl():
00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 10 00 01 00

Within pci_setup_device():
000000b0: 11 00 00 01 00 20 00 00 00 40 00 00 00 00 00 00
Before readl():
000000b0: 11 00 00 c1 00 20 00 00 00 40 00 00 00 00 00 00
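The dword-level comparison can also be done mechanically; a small sketch using the offset-0x10 rows from the two dumps above, which recovers the BAR0 change described earlier:

```python
def parse_row(line):
    """Parse one 'OFFSET: b0 b1 ... b15' config-space dump row."""
    off_s, rest = line.split(":")
    return int(off_s, 16), bytes(int(b, 16) for b in rest.split())

def dword_le(dump, addr):
    """Little-endian 32-bit read at config-space offset addr."""
    off, data = dump
    i = addr - off
    return int.from_bytes(data[i:i + 4], "little")

# The offset-0x10 rows from the two dumps compared above
in_setup  = parse_row("00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00")
pre_readl = parse_row("00000010: 00 00 10 10 00 00 00 00 00 01 ff 00 f0 00 00 00")

bar0_setup = dword_le(in_setup, 0x10)   # BAR0 as written by ks_pcie_v3_65_add_bus()
bar0_now   = dword_le(pre_readl, 0x10)  # BAR0 after the PCI core reassigned it

print(hex(bar0_setup), hex(bar0_now))
```

This confirms the 0x05500000 -> 0x10100000 reassignment is exactly the "BAR 0: assigned [mem 0x10100000-0x101fffff]" step from the enumeration log.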

> 
>>> pci 0000:00:00.0: config space:
>>> 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
> 
>   PCI_COMMAND = 0x0107
>     PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER | PCI_COMMAND_SERR
> 
>>> 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
>>> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
>   PCI_IO_BASE            =       0x00 low 4 bits indicate 16-bit addressing
>   PCI_IO_LIMIT           =       0x00
>   PCI_MEMORY_BASE        =     0x0000
>   PCI_MEMORY_LIMIT       =     0x0000
>   PCI_PREF_MEMORY_BASE   =     0x0000 low 4 bits indicate 32-bit pref
>   PCI_PREF_MEMORY_LIMIT  =     0x0000
>   PCI_PREF_BASE_UPPER32  = 0x00000000
>   PCI_PREF_LIMIT_UPPER32 = 0x00000000
> 
> We can't tell from this whether the prefetchable window is implemented
> (but I'm sure keystone *does* implement it).  If PCI_PREF_MEMORY_BASE
> and PCI_PREF_MEMORY_LIMIT are read-only zeros, it is not implemented.
> If they are writable, it is enabled at the same range as the
> non-prefetchable window.

Yes, PCI_PREF_MEMORY_BASE and PCI_PREF_MEMORY_LIMIT are writable.

> 
> Similarly for the I/O window; we can't tell whether the base/limit are
> read-only zero or writable.

PCI_IO_BASE and PCI_IO_LIMIT are also writable.

> 
> So we have these windows that look like they're probably enabled:
> 
>   io window   at [io  0x0000-0x0fff]
>   mem window  at [mem 0x00000000-0x000fffff]
>   pref window at [mem 0x00000000-0x000fffff pref]
> 
> No idea whether it makes a difference, but these windows seem
> misconfigured.  The default should probably be to make them all
> disabled (as in f73eedc90bf7 ("PCI: vmd: Disable bridge window for
> domain reset")):
> 
>   PCI_IO_BASE            = 0xf0
>   PCI_IO_LIMIT           = 0x00
>   PCI_MEMORY_BASE        = 0xfff0
>   PCI_MEMORY_LIMIT       = 0x0000
>   PCI_PREF_MEMORY_BASE   = 0xfff0
>   PCI_PREF_MEMORY_LIMIT  = 0x0000
>   PCI_PREF_BASE_UPPER32  = 0xffffffff
>   PCI_PREF_LIMIT_UPPER32 = 0x00000000
> 
> The PCI core should reconfigure and enable them as needed by
> downstream devices.
> 
>> I also noticed that the value of desc->pci.msix_ctrl obtained from
>> the readl is always 0xffffffff irrespective of whether or not an
>> endpoint device is connected. This isn't expected right? The only
>> difference between the cases where endpoint device is connected and
>> isn't connected is the completion timeout.
> 
> Right, I wouldn't expect that.  PCI_MSIX_ENTRY_VECTOR_CTRL has a bunch
> of reserved bits that should be zero.
> 
> I assume MSI-X actually does work for downstream endpoints?  I
> wouldn't think anybody would have bothered with
> ks_pcie_v3_65_add_bus() unless MSI-X works.

Yes, I think it is supposed to work, but it doesn't seem to be working right now
considering that even with Endpoint device connected, the readl() returns all Fs.
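For reference, the address the failing readl() hits follows directly from the values reported earlier in this thread; a sketch of the arithmetic (PCI_MSIX_ENTRY_SIZE and PCI_MSIX_ENTRY_VECTOR_CTRL are the kernel's MSI-X table layout constants). The all-Fs result itself is the usual ~0 pattern an aborted/timed-out PCIe read produces, so it is at least consistent with the Completion Timeout:

```python
# MSI-X table layout constants (from the kernel's PCI MSI-X definitions)
PCI_MSIX_ENTRY_SIZE = 16          # bytes per table entry
PCI_MSIX_ENTRY_VECTOR_CTRL = 12   # Vector Control dword within an entry

mask_base = 0xffff80000997a000    # desc->pci.mask_base reported in this thread
msi_index = 0

# pci_msix_desc_addr(): base of this descriptor's table entry
addr = mask_base + msi_index * PCI_MSIX_ENTRY_SIZE   # matches the 'addr' dumped above

# msix_prepare_msi_desc() then reads the Vector Control dword of that entry
read_target = addr + PCI_MSIX_ENTRY_VECTOR_CTRL

print(hex(addr), hex(read_target))
```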
Serge Semin Oct. 16, 2023, 9:29 p.m. UTC | #8
Hi Siddharth

On Sat, Oct 14, 2023 at 11:22:48AM +0530, Siddharth Vadapalli wrote:
> 
> 
> On 14/10/23 00:19, Bjorn Helgaas wrote:
> > On Fri, Oct 13, 2023 at 10:36:01AM +0530, Siddharth Vadapalli wrote:
> >> On 13/10/23 10:33, Siddharth Vadapalli wrote:
> >>> On 12/10/23 22:13, Bjorn Helgaas wrote:
> >>>> On Thu, Oct 12, 2023 at 10:15:09AM +0530, Siddharth Vadapalli wrote:
> >>>>> On 11/10/23 19:16, Bjorn Helgaas wrote:
> >>> ...
> >>>>>                                               msix_prepare_msi_desc
> >>>>> In this function: msix_prepare_msi_desc, the following readl()
> >>>>> causes completion timeout:
> >>>>> 		desc->pci.msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
> >>>>> The completion timeout with the readl is only observed when the link
> >>>>> is down (No Endpoint device is actually connected to the PCIe
> >>>>> connector slot).
> >>>>
> >>>> Do you know the address ("addr")?  From pci_msix_desc_addr(), it looks
> >>>> like it should be:
> >>>>
> >>>>   desc->pci.mask_base + desc->msi_index * PCI_MSIX_ENTRY_SIZE
> >>>>
> >>>> and desc->pci.mask_base should be dev->msix_base, which we got from
> >>>> msix_map_region(), which ioremaps part of the BAR indicated by the
> >>>> MSI-X Table Offset/Table BIR register.
> >>>>
> >>>> I wonder if this readl() is being handled as an MMIO access to a
> >>>> downstream device instead of a Root Port BAR access because it's
> >>>> inside the Root Port's MMIO window.
> >>>>
> >>>> Could you dump out these values just before the readl()?
> >>>>
> >>>>   phys_addr inside msix_map_region()
> >>>>   dev->msix_base
> >>>>   desc->pci.mask_base
> >>>>   desc->msi_index
> >>>>   addr
> >>>
> >>> phys_addr: 0x10102000
> >>> msix_base: 0xffff80000997a000
> >>> mask_base: 0xffff80000997a000
> >>> msi_index: 0
> >>> addr: 0xffff80000997a000
> >>>
> >>> Also, the details of BAR allocation from the logs are:
> >>> keystone-pcie 5500000.pcie: host bridge /bus@100000/pcie@5500000 ranges:
> >>> keystone-pcie 5500000.pcie:       IO 0x0010020000..0x001002ffff -> 0x0000000000
> >>> keystone-pcie 5500000.pcie:      MEM 0x0010030000..0x0017ffffff -> 0x0010030000
> >>> keystone-pcie 5500000.pcie: iATU unroll: enabled
> >>> keystone-pcie 5500000.pcie: iATU regions: 8 ob, 8 ib, align 64K, limit 4G
> >>> keystone-pcie 5500000.pcie: Phy link never came up
> >>> keystone-pcie 5500000.pcie: PCI host bridge to bus 0000:00
> >>> pci_bus 0000:00: root bus resource [bus 00-ff]
> >>> pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
> >>> pci_bus 0000:00: root bus resource [mem 0x10030000-0x17ffffff]
> >>> pci 0000:00:00.0: [104c:b00c] type 01 class 0x060400
> >>> pci 0000:00:00.0: reg 0x10: [mem 0x05500000-0x055fffff]
> >>> pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]
> >>> pci 0000:00:00.0: supports D1
> >>> pci 0000:00:00.0: PME# supported from D0 D1 D3hot
> >>> pci 0000:00:00.0: BAR 0: assigned [mem 0x10100000-0x101fffff]
> >>> pci 0000:00:00.0: BAR 6: assigned [mem 0x10030000-0x1003ffff pref]
> >>> pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> >>>
> >>> The value of phys_addr lies within the range allocated to BAR0.
> >>>
> >>>>   call early_dump_pci_device() on the Root Port
> >>>
> >>> I invoked early_dump_pci_device() within the pci_setup_device() function in
> >>> drivers/pci/probe.c and the output is:
> > 
> > It'd be better to dump the config space immediately before the readl()
> > since the PCI core did change some things in the interim.
> 
> I dumped the config space just before the readl() and it is:
> 
> pcieport 0000:00:00.0: config space:
> 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
> 00000010: 00 00 10 10 00 00 00 00 00 01 ff 00 f0 00 00 00
> 00000020: f0 ff 00 00 f0 ff 00 00 00 00 00 00 00 00 00 00
> 00000030: 00 00 00 00 40 00 00 00 00 00 00 00 17 01 02 00
> 00000040: 01 50 c3 5b 08 00 00 00 00 00 00 00 00 00 00 00
> 00000050: 05 70 80 01 00 00 00 00 00 00 00 00 00 00 00 00
> 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00000070: 10 b0 42 00 01 80 00 00 1f 20 00 00 22 3c 73 00
> 00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 10 00 01 00
> 00000090: 00 00 00 00 1f 04 00 00 00 00 00 00 06 00 00 00
> 000000a0: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000b0: 11 00 00 c1 00 20 00 00 00 40 00 00 00 00 00 00
> 000000c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 000000f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 
> BAR0 was configured with the value: 0x05500000 within ks_pcie_v3_65_add_bus() at:
> dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, ks_pcie->app.start);
> and can be verified based on the previous config space dump I had shared within
> the pci_setup_device() function. But, looking at the dump above which is just
> before the readl(), its value seems to have changed to 0x10100000.
> 
> I am listing down the regions of config space which have changed, based on the
> comparison of dumps taken within pci_setup_device() and just before the readl():
> 
> Within pci_setup_device():
> 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
> Before readl():
> 00000010: 00 00 10 10 00 00 00 00 00 01 ff 00 f0 00 00 00
> 
> Within pci_setup_device():
> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Before readl():
> 00000020: f0 ff 00 00 f0 ff 00 00 00 00 00 00 00 00 00 00
> 
> Within pci_setup_device():
> 00000030: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
> Before readl():
> 00000030: 00 00 00 00 40 00 00 00 00 00 00 00 17 01 02 00
> 
> Within pci_setup_device():
> 00000070: 10 b0 42 00 01 80 00 00 10 20 00 00 22 3c 73 00
> Before readl():
> 00000070: 10 b0 42 00 01 80 00 00 1f 20 00 00 22 3c 73 00
> 
> Within pci_setup_device():
> 00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 00 00 01 00
> Before readl():
> 00000080: 00 00 11 10 00 00 00 00 c0 03 40 00 10 00 01 00
> 
> Within pci_setup_device():
> 000000b0: 11 00 00 01 00 20 00 00 00 40 00 00 00 00 00 00
> Before readl():
> 000000b0: 11 00 00 c1 00 20 00 00 00 40 00 00 00 00 00 00
> 
> > 
> >>> pci 0000:00:00.0: config space:
> >>> 00000000: 4c 10 0c b0 07 01 10 00 01 00 04 06 00 00 01 00
> > 
> >   PCI_COMMAND = 0x0107
> >     PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER | PCI_COMMAND_SERR
> > 
> >>> 00000010: 00 00 50 05 00 00 00 00 00 01 ff 00 00 00 00 00
> >>> 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > 
> >   PCI_IO_BASE            =       0x00 low 4 bits indicate 16-bit addressing
> >   PCI_IO_LIMIT           =       0x00
> >   PCI_MEMORY_BASE        =     0x0000
> >   PCI_MEMORY_LIMIT       =     0x0000
> >   PCI_PREF_MEMORY_BASE   =     0x0000 low 4 bits indicate 32-bit pref
> >   PCI_PREF_MEMORY_LIMIT  =     0x0000
> >   PCI_PREF_BASE_UPPER32  = 0x00000000
> >   PCI_PREF_LIMIT_UPPER32 = 0x00000000
> > 
> > We can't tell from this whether the prefetchable window is implemented
> > (but I'm sure keystone *does* implement it).  If PCI_PREF_MEMORY_BASE
> > and PCI_PREF_MEMORY_LIMIT are read-only zeros, it is not implemented.
> > If they are writable, it is enabled at the same range as the
> > non-prefetchable window.
> 
> Yes, PCI_PREF_MEMORY_BASE and PCI_PREF_MEMORY_LIMIT are writable.
> 
> > 
> > Similarly for the I/O window; we can't tell whether the base/limit are
> > read-only zero or writable.
> 
> PCI_IO_BASE and PCI_IO_LIMIT are also writable.
> 
> > 
> > So we have these windows that look like they're probably enabled:
> > 
> >   io window   at [io  0x0000-0x0fff]
> >   mem window  at [mem 0x00000000-0x000fffff]
> >   pref window at [mem 0x00000000-0x000fffff pref]
> > 
> > No idea whether it makes a difference, but these windows seem
> > misconfigured.  The default should probably be to make them all
> > disabled (as in f73eedc90bf7 ("PCI: vmd: Disable bridge window for
> > domain reset")):
> > 
> >   PCI_IO_BASE            = 0xf0
> >   PCI_IO_LIMIT           = 0x00
> >   PCI_MEMORY_BASE        = 0xfff0
> >   PCI_MEMORY_LIMIT       = 0x0000
> >   PCI_PREF_MEMORY_BASE   = 0xfff0
> >   PCI_PREF_MEMORY_LIMIT  = 0x0000
> >   PCI_PREF_BASE_UPPER32  = 0xffffffff
> >   PCI_PREF_LIMIT_UPPER32 = 0x00000000
> > 
> > The PCI core should reconfigure and enable them as needed by
> > downstream devices.
> > 
> >> I also noticed that the value of desc->pci.msix_ctrl obtained from
> >> the readl is always 0xffffffff irrespective of whether or not an
> >> endpoint device is connected. This isn't expected right? The only
> >> difference between the cases where endpoint device is connected and
> >> isn't connected is the completion timeout.
> > 
> > Right, I wouldn't expect that.  PCI_MSIX_ENTRY_VECTOR_CTRL has a bunch
> > of reserved bits that should be zero.
> > 
> > I assume MSI-X actually does work for downstream endpoints?  I
> > wouldn't think anybody would have bothered with
> > ks_pcie_v3_65_add_bus() unless MSI-X works.
> 
> Yes, I think it is supposed to work, but it doesn't seem to be working right now
> considering that even with Endpoint device connected, the readl() returns all Fs.

Could you please have a look at what DW PCIe IP-core version is utilized
in the Keystone PCIe host controller? If it's of v5.x then here is
what HW databook says about the BARs initialization: "If you do use a
BAR, then you should program it to capture TLPs that are targeted to
your local non-application memory space residing on TRGT1, and not for
the application on TRGT0 (dbi). The BAR range must be outside of the
three Base/Limit regions."

I have no idea whether the BAR being set with an address within the
Base/Limit regions could have caused the lags you see, but I would
have at least checked that.
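For what it's worth, the containment relations are easy to confirm from the log excerpts quoted earlier in the thread (a sketch; whether this placement is what actually trips the core remains the open question):

```python
def contains(outer, inner):
    """True if the inclusive range 'inner' lies entirely within 'outer'."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

phys_addr  = 0x10102000                # MSI-X table address from msix_map_region()
bar0       = (0x10100000, 0x101fffff)  # "BAR 0: assigned [mem 0x10100000-0x101fffff]"
bridge_mem = (0x10030000, 0x17ffffff)  # root bus resource [mem 0x10030000-0x17ffffff]

in_bar0 = bar0[0] <= phys_addr <= bar0[1]    # the readl() target sits inside BAR0
bar0_in_window = contains(bridge_mem, bar0)  # BAR0 sits inside the bridge aperture

print(in_bar0, bar0_in_window)
```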

-Serge(y)

> 
> -- 
> Regards,
> Siddharth.
Siddharth Vadapalli Oct. 17, 2023, 4:14 a.m. UTC | #9
Hello,

On 17/10/23 02:59, Serge Semin wrote:
> Hi Siddharth
> 

...

>>>
>>> I assume MSI-X actually does work for downstream endpoints?  I
>>> wouldn't think anybody would have bothered with
>>> ks_pcie_v3_65_add_bus() unless MSI-X works.
>>
>> Yes, I think it is supposed to work, but it doesn't seem to be working right now
>> considering that even with Endpoint device connected, the readl() returns all Fs.
> 
> Could you please have a look at what DW PCIe IP-core version is utilized
> in the Keystone PCIe host controller? If it's of v5.x then here is

The DW PCIe IP-core version is 4.90a.

> what HW databook says about the BARs initialization: "If you do use a
> BAR, then you should program it to capture TLPs that are targeted to
> your local non-application memory space residing on TRGT1, and not for
> the application on TRGT0 (dbi). The BAR range must be outside of the
> three Base/Limit regions."

Yes, it's the same even in the DW PCIe IP-core version 4.90a Databook:

3.5.7.2 RC Mode

Two BARs are present but are not expected to be used. You should disable them or
else they will be unnecessarily assigned memory during device enumeration. If
you do use a BAR, then you should program it to capture TLPs that are targeted
to your local non-application memory space residing on TRGT1, and not for the
application on TRGT0 (dbi). The BAR range must be outside of the three Base/Limit regions.

> 
> I have no idea whether the BAR being set with an address within the
> Base/Limit regions could have caused the lags you see, but I would
> have at least checked that.

I will check that. Thank you for sharing your feedback.

> 
> -Serge(y)
> 
>>
>> -- 
>> Regards,
>> Siddharth.
Serge Semin Oct. 18, 2023, 10:42 a.m. UTC | #10
On Tue, Oct 17, 2023 at 09:44:51AM +0530, Siddharth Vadapalli wrote:
> Hello,
> 
> On 17/10/23 02:59, Serge Semin wrote:
> > Hi Siddharth
> > 
> 
> ...
> 
> >>>
> >>> I assume MSI-X actually does work for downstream endpoints?  I
> >>> wouldn't think anybody would have bothered with
> >>> ks_pcie_v3_65_add_bus() unless MSI-X works.
> >>
> >> Yes, I think it is supposed to work, but it doesn't seem to be working right now
> >> considering that even with Endpoint device connected, the readl() returns all Fs.
> > 
> > Could you please have a look at what DW PCIe IP-core version is utilized
> > in the Keystone PCIe host controller? If it's of v5.x then here is
> 

> The DW PCIe IP-core version is 4.90a.
> 
> > what HW databook says about the BARs initialization: "If you do use a
> > BAR, then you should program it to capture TLPs that are targeted to
> > your local non-application memory space residing on TRGT1, and not for
> > the application on TRGT0 (dbi). The BAR range must be outside of the
> > three Base/Limit regions."
> 
> Yes, it's the same even in the DW PCIe IP-core version 4.90a Databook:
> 
> 3.5.7.2 RC Mode
> 
> Two BARs are present but are not expected to be used. You should disable them or
> else they will be unnecessarily assigned memory during device enumeration. If
> you do use a BAR, then you should program it to capture TLPs that are targeted
> to your local non-application memory space residing on TRGT1, and not for the
> application on TRGT0 (dbi). The BAR range must be outside of the three Base/Limit regions.

Are you really sure that it's 4.90a? Here is what my DW PCIe RC
_v4.90_ HW databook says about the BARs:

"Base Address Registers (Offset: 0x10-0x14) The Synopsys core does not
implement the optional BARs for the RC product. This is based on the
assumption that the RC host probably has registers on some other
internal bus and has knowledge and setup access to these registers
already."

What you cited resides in the _v5.x_ databooks. That makes me think
that in your case the IP-core isn't actually of the 4.90a version.

-Serge(y)

> 
> > 
> > I have no idea whether the BAR being set with an address within the
> > Base/Limit regions could have caused the lags you see, but I would
> > have at least checked that.
> 
> I will check that. Thank you for sharing your feedback.
> 
> > 
> > -Serge(y)
> > 
> >>
> >> -- 
> >> Regards,
> >> Siddharth.
> 
> -- 
> Regards,
> Siddharth.
Siddharth Vadapalli Oct. 18, 2023, 10:49 a.m. UTC | #11
On 18/10/23 16:12, Serge Semin wrote:
> On Tue, Oct 17, 2023 at 09:44:51AM +0530, Siddharth Vadapalli wrote:
>> Hello,
>>
>> On 17/10/23 02:59, Serge Semin wrote:
>>> Hi Siddharth
>>>
>>
>> ...
>>

...

> 
> Are you really sure that it's 4.90a? Here is what my DW PCIe RC
> _v4.90_ HW databook says about the BARs:
> 
> "Base Address Registers (Offset: 0x10-x14) The Synopsys core does not
> implement the optional BARs for the RC product. This is based on the
> assumption that the RC host probably has registers on some other
> internal bus and has knowledge and setup access to these registers
> already."
> 
> What you cited resides in the _v5.x_ databooks. It makes my thinking
> that in your case the IP-core isn't of 4.90a version.

I reviewed the function ks_pcie_v3_65_add_bus() and it appears clear to me now
that it is applicable only to the 3.65a version. The IP-core, however, is 4.90a. I
have posted the v2 patch at:
https://lore.kernel.org/r/20231018075038.2740534-1-s-vadapalli@ti.com/

Also, as pointed out by Ravi on the v2 patch's thread at:
https://lore.kernel.org/r/c546f8e9-f6ba-41b8-7dff-4a7921b6705f@ti.com/
the culprit turned out to be:
6ab15b5e7057 (PCI: dwc: keystone: Convert .scan_bus() callback to use add_bus)
which, in the process of converting the .scan_bus() callback to .add_bus(), namely:
ks_pcie_v3_65_scan_bus() -> ks_pcie_v3_65_add_bus()
added the .add_bus() method to "struct pci_ops ks_pcie_ops", which is
actually shared with the 4.90a controller as well. So an "is_am6" check should
have also been added to make it a no-op for non-3.65a controllers.

I will be posting the v3 patch implementing the above fix if there is no further
feedback on the v2 patch from others.
diff mbox series

Patch

diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c
index 49aea6ce3e87..ac2ad112d616 100644
--- a/drivers/pci/controller/dwc/pci-keystone.c
+++ b/drivers/pci/controller/dwc/pci-keystone.c
@@ -459,7 +459,8 @@  static int ks_pcie_v3_65_add_bus(struct pci_bus *bus)
 	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
 	struct keystone_pcie *ks_pcie = to_keystone_pcie(pci);
 
-	if (!pci_is_root_bus(bus))
+	/* Don't enable BAR0 if link is not yet up at this point */
+	if (!pci_is_root_bus(bus) || !dw_pcie_link_up(pci))
 		return 0;
 
 	/* Configure and set up BAR0 */