[RFC] PCI: pci-imx6: Add delay to workaround kernel hang

Message ID 1403637507-9424-1-git-send-email-festevam@gmail.com
State Rejected

Commit Message

Fabio Estevam June 24, 2014, 7:18 p.m. UTC
From: Fabio Estevam <fabio.estevam@freescale.com>

When the mx6 PCI controller is initialized in the bootloader we see a kernel
hang inside imx6_add_pcie_port().

Adding a 30ms delay allows the kernel to boot.

Suggested-by: David Müller <d.mueller@elsoft.ch>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
---
I am happy to get feedback on how to properly fix this.

Thanks

 drivers/pci/host/pci-imx6.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Marek Vasut June 25, 2014, 9:28 p.m. UTC | #1
On Tuesday, June 24, 2014 at 09:18:27 PM, Fabio Estevam wrote:
> From: Fabio Estevam <fabio.estevam@freescale.com>
> 
> When the mx6 PCI controller is initialized in the bootloader we see a
> kernel hang inside imx6_add_pcie_port().
> 
> Adding a 30ms delay allows the kernel to boot.
> 
> Suggested-by: David Müller <d.mueller@elsoft.ch>
> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
> ---
> I am happy to get feedback on how to properly fix this.

Honestly, I have no clue. I would expect Freescale to help us in solving this as 
they are the chip vendor, but they cannot help either :-(

Apologies, I really have no idea.

Best regards,
Marek Vasut
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shawn Guo June 26, 2014, 3:12 a.m. UTC | #2
On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> From: Fabio Estevam <fabio.estevam@freescale.com>
> 
> When the mx6 PCI controller is initialized in the bootloader we see a kernel
> hang inside imx6_add_pcie_port().
> 
> Adding a 30ms delay allows the kernel to boot.

We may not want to add a random delay into the driver before we
understand the root cause of the issue.

Do you see this issue with FSL kernel?

Shawn

> 
> Suggested-by: David Müller <d.mueller@elsoft.ch>
> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
> ---
> I am happy to get feedback on how to properly fix this.
> 
> Thanks
> 
>  drivers/pci/host/pci-imx6.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/host/pci-imx6.c b/drivers/pci/host/pci-imx6.c
> index a568efa..669f771 100644
> --- a/drivers/pci/host/pci-imx6.c
> +++ b/drivers/pci/host/pci-imx6.c
> @@ -507,6 +507,8 @@ static int __init imx6_add_pcie_port(struct pcie_port *pp,
>  	pp->root_bus_nr = -1;
>  	pp->ops = &imx6_pcie_host_ops;
>  
> +	usleep_range(25000, 30000);
> +
>  	ret = dw_pcie_host_init(pp);
>  	if (ret) {
>  		dev_err(&pdev->dev, "failed to initialize host\n");
> -- 
> 1.8.3.2
> 
Fabio Estevam June 26, 2014, 3:43 a.m. UTC | #3
On Thu, Jun 26, 2014 at 12:12 AM, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
>> From: Fabio Estevam <fabio.estevam@freescale.com>
>>
>> When the mx6 PCI controller is initialized in the bootloader we see a kernel
>> hang inside imx6_add_pcie_port().
>>
>> Adding a 30ms delay allows the kernel to boot.
>
> We may not want to add a random delay into the driver before we
> understand the root cause of the issue.

Yes, that's why I sent this as RFC and also explained it below the ---
line that I am actually trying to get some help with this issue.

>
> Do you see this issue with FSL kernel?

Yes, it also hangs.

It is reproducible in 100% of the boots. Just need to use mainline
U-boot (which has PCI driver enabled by default).
I am using an Intel Wifi 7260 PCI card. This was also reported by
other folks in the U-boot list.

Below is the log with linux-next kernel and earlyprintk enabled:

U-Boot 2014.07-rc1-15766-g5f46552 (Jun 24 2014 - 17:41:55)

CPU:   Freescale i.MX6Q rev1.2 at 792 MHz
Reset cause: POR
Board: MX6-SabreSD
I2C:   ready
DRAM:  1 GiB
MMC:   FSL_SDHC: 0, FSL_SDHC: 1, FSL_SDHC: 2
  00:01.0     - 16c3:abcd - Bridge device
   01:00.0    - 8086:08b1 - Network controller
No panel detected: default to Hannstar-XGA
Display: Hannstar-XGA (1024x768)
In:    serial
Out:   serial
Err:   serial
PMIC:  PFUZE100 ID=0x10
Net:   FEC [PRIME]
Hit any key to stop autoboot:  0
Booting from net ...
FEC Waiting for PHY auto negotiation to complete.. done
Using FEC device
TFTP from server 192.168.0.2; our IP address is 192.168.0.8
Filename 'zImage'.
Load address: 0x12000000
Loading: #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #################################################################
         #############################################################
         1.7 MiB/s
done
Bytes transferred = 5302272 (50e800 hex)
Using FEC device
TFTP from server 192.168.0.2; our IP address is 192.168.0.8
Filename 'imx6q-sabresd.dtb'.
Load address: 0x18000000
Loading: #######
         1.5 MiB/s
done
Bytes transferred = 33744 (83d0 hex)
Kernel image @ 0x12000000 [ 0x000000 - 0x50e800 ]
## Flattened Device Tree blob at 18000000
   Booting using the fdt blob at 0x18000000
   Using Device Tree in place at 18000000, end 1800b3cf

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
Booting Linux on physical CPU 0x0
Linux version 3.16.0-rc2-next-20140625 (fabio@fabio-Latitude-E6410)
(gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1) ) #1440 SMP Thu
Jun 26 00:38:33 BRT 2014
CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Freescale i.MX6 Quad SABRE Smart Device Board
bootconsole [earlycon0] enabled
cma: Reserved 16 MiB at 4f000000
Memory policy: Data cache writealloc
PERCPU: Embedded 8 pages/cpu @be7a6000 s8896 r8192 d15680 u32768
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 260096
Kernel command line: console=ttymxc0,115200 root=/dev/nfs ip=dhcp
nfsroot=192.168.0.2:/tftpboot/rfs,v3,tcp earlyprintk
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1004880K/1048576K available (6579K kernel code, 406K rwdata,
2264K rodata, 340K init, 8331K bss, 43696K reserved, 0K highmem)
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xffc00000 - 0xffe00000   (2048 kB)
    vmalloc : 0xc0800000 - 0xff000000   (1000 MB)
    lowmem  : 0x80000000 - 0xc0000000   (1024 MB)
    pkmap   : 0x7fe00000 - 0x80000000   (   2 MB)
    modules : 0x7f000000 - 0x7fe00000   (  14 MB)
      .text : 0x80008000 - 0x808aafbc   (8844 kB)
      .init : 0x808ab000 - 0x809002c0   ( 341 kB)
      .data : 0x80902000 - 0x80967a40   ( 407 kB)
       .bss : 0x80967a48 - 0x8118aa10   (8332 kB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Hierarchical RCU implementation.
NR_IRQS:16 nr_irqs:16 16
L2C-310 erratum 769419 enabled
L2C-310 enabling early BRESP for Cortex-A9
L2C-310 full line of zeros enabled for Cortex-A9
L2C-310 ID prefetch enabled, offset 1 lines
L2C-310 dynamic clock gating enabled, standby mode enabled
L2C-310 cache controller enabled, 16 ways, 1024 kB
L2C-310: CACHE_ID 0x410000c7, AUX_CTRL 0x76070001
Switching to timer-based delay loop, resolution 15ns
sched_clock: 32 bits at 66MHz, resolution 15ns, wraps every 65075262448ns
Console: colour dummy device 80x30
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8191
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     32768
... MAX_LOCKDEP_CHAINS:      65536
... CHAINHASH_SIZE:          32768
 memory used by lock dependency info: 5167 kB
 per task-struct memory footprint: 1152 bytes
Calibrating delay loop (skipped), value calculated using timer
frequency.. 132.00 BogoMIPS (lpj=660000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x10672328 - 0x10672398
CPU1: Booted secondary processor
CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
CPU2: Booted secondary processor
CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
CPU3: Booted secondary processor
CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
Brought up 4 CPUs
SMP: Total of 4 processors activated.
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
pinctrl core: initialized pinctrl subsystem
regulator-dummy: no parameters
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
CPU identified as i.MX6Q, silicon rev 1.2
vdd1p1: 800 <--> 1375 mV at 1100 mV
vdd3p0: 2800 <--> 3150 mV at 3000 mV
vdd2p5: 2000 <--> 2750 mV at 2400 mV
vddarm: 725 <--> 1450 mV at 1100 mV
vddpu: 725 <--> 1450 mV at 1100 mV
vddsoc: 725 <--> 1450 mV at 1175 mV
hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 4 bytes.
imx6q-pinctrl 20e0000.iomuxc: initialized IMX pinctrl driver
mxs-dma 110000.dma-apbh: initialized
usb_otg_vbus: 5000 mV
usb_h1_vbus: 5000 mV
wm8962-supply: no parameters
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
i2c i2c-0: IMX I2C adapter registered
i2c i2c-1: IMX I2C adapter registered
i2c i2c-2: IMX I2C adapter registered
Linux video capture interface: v2.00
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
<giometti@linux.it>
PTP clock support registered
Advanced Linux Sound Architecture Driver Initialized.
cfg80211: Calling CRDA to update world regulatory domain
Switched to clocksource mxc_timer1
Shawn Guo June 26, 2014, 5:49 a.m. UTC | #4
On Thu, Jun 26, 2014 at 12:43:19AM -0300, Fabio Estevam wrote:
> On Thu, Jun 26, 2014 at 12:12 AM, Shawn Guo <shawn.guo@freescale.com> wrote:
> > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> >> From: Fabio Estevam <fabio.estevam@freescale.com>
> >>
> >> When the mx6 PCI controller is initialized in the bootloader we see a kernel
> >> hang inside imx6_add_pcie_port().
> >>
> >> Adding a 30ms delay allows the kernel to boot.
> >
> > We may not want to add a random delay into the driver before we
> > understand the root cause of the issue.
> 
> Yes, that's why I sent this as RFC and also explained it below the ---
> line that I am actually trying to get some help with this issue.
> 
> >
> > Do you see this issue with FSL kernel?
> 
> Yes, it also hangs.
> 
> It is reproducible in 100% of the boots. Just need to use mainline
> U-boot (which has PCI driver enabled by default).
> I am using an Intel Wifi 7260 PCI card. This was also reported by
> other folks in the U-boot list.

Richard,

Can you schedule some time to look at this issue?  I think it will come
to us sooner or later if any our customer enables PCIe before launching
kernel?

Shawn

Juergen Borleis June 26, 2014, 7:32 a.m. UTC | #5
Hi Fabio,

On Thursday 26 June 2014 05:43:19 Fabio Estevam wrote:
> [...]
> Below is the log with linux-next kernel and earlyprintk enabled:

Does it also hang when you disable "earlyprintk"? A few weeks ago we had a 
similar issue with i.MX6/PCI and "earlyprintk" (at the end it was caused by a 
clock change for the UART when changing from the early to the regular 
console).

Regards,
Juergen
Lucas Stach June 26, 2014, 8:41 a.m. UTC | #6
Hi Fabio,

Am Dienstag, den 24.06.2014, 16:18 -0300 schrieb Fabio Estevam:
> From: Fabio Estevam <fabio.estevam@freescale.com>
> 
> When the mx6 PCI controller is initialized in the bootloader we see a kernel
> hang inside imx6_add_pcie_port().
> 
> Adding a 30ms delay allows the kernel to boot.
> 
> Suggested-by: David Müller <d.mueller@elsoft.ch>
> Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
> ---
> I am happy to get feedback on how to properly fix this.
> 
> Thanks
> 
>  drivers/pci/host/pci-imx6.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pci/host/pci-imx6.c b/drivers/pci/host/pci-imx6.c
> index a568efa..669f771 100644
> --- a/drivers/pci/host/pci-imx6.c
> +++ b/drivers/pci/host/pci-imx6.c
> @@ -507,6 +507,8 @@ static int __init imx6_add_pcie_port(struct pcie_port *pp,
>  	pp->root_bus_nr = -1;
>  	pp->ops = &imx6_pcie_host_ops;
>  
> +	usleep_range(25000, 30000);
> +
>  	ret = dw_pcie_host_init(pp);
>  	if (ret) {
>  		dev_err(&pdev->dev, "failed to initialize host\n");

I would suspect the issue to be somewhere in imx6_pcie_host_init(). Can
you move the delay there (and to different positions in this function)
to narrow down where the hang happens?
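
The narrowing-down approach suggested here can be pictured with a small userspace model. This is only a sketch under stated assumptions: the step numbering, `struct hang_model`, and both helper functions are hypothetical and are not part of pci-imx6.c. The model assumes the hang is avoided only when the delay runs after some "trigger" step and before a "victim" step, so trying the delay at each position reveals the window between the two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of an init sequence with steps 0..num_steps-1.
 * Assumption: the boot succeeds only if the delay runs after step
 * `trigger` has executed and before step `victim` starts.
 */
struct hang_model {
	size_t num_steps;
	size_t trigger;
	size_t victim;
};

/* Delay position p means "insert the delay just before step p". */
static bool boots_with_delay_before(const struct hang_model *m, size_t p)
{
	assert(m->trigger < m->victim && m->victim < m->num_steps);
	return p > m->trigger && p <= m->victim;
}

/*
 * Try the delay at every position and report the first and last
 * positions that boot. Returns false if no position helps.
 */
static bool narrow_window(const struct hang_model *m,
			  size_t *first, size_t *last)
{
	bool found = false;

	for (size_t p = 0; p < m->num_steps; p++) {
		if (!boots_with_delay_before(m, p))
			continue;
		if (!found) {
			*first = p;
			found = true;
		}
		*last = p;
	}
	return found;
}
```

In this model the step just before the first working position is the one that starts whatever needs time to settle, and the last working position is the step that touches the hardware too early. On real hardware each trial is a rebuild and reboot, so a binary search over positions may be preferable to the linear scan shown.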

Regards,
Lucas
Marek Vasut June 26, 2014, 9:13 a.m. UTC | #7
On Thursday, June 26, 2014 at 07:49:38 AM, Shawn Guo wrote:
> On Thu, Jun 26, 2014 at 12:43:19AM -0300, Fabio Estevam wrote:
> > On Thu, Jun 26, 2014 at 12:12 AM, Shawn Guo <shawn.guo@freescale.com> wrote:
> > > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> > >> From: Fabio Estevam <fabio.estevam@freescale.com>
> > >> 
> > >> When the mx6 PCI controller is initialized in the bootloader we see a
> > >> kernel hang inside imx6_add_pcie_port().
> > >> 
> > >> Adding a 30ms delay allows the kernel to boot.
> > > 
> > > We may not want to add a random delay into the driver before we
> > > understand the root cause of the issue.
> > 
> > Yes, that's why I sent this as RFC and also explained it below the ---
> > line that I am actually trying to get some help with this issue.
> > 
> > > Do you see this issue with FSL kernel?
> > 
> > Yes, it also hangs.
> > 
> > It is reproducible in 100% of the boots. Just need to use mainline
> > U-boot (which has PCI driver enabled by default).
> > I am using an Intel Wifi 7260 PCI card. This was also reported by
> > other folks in the U-boot list.
> 
> Richard,
> 
> Can you schedule some time to look at this issue?  I think it will come
> to us sooner or later if any our customer enables PCIe before launching
> kernel?

I have this problem on MX6DL SabreSDP as well. Apparently, this issue happens 
more often on 6DL than it happens on 6Q. I'm clueless here.

I presume FSL won't be willing to release how the PCIe block is exactly wired 
into the GPR registers, right ? Also, I don't think there's some magic register 
which allows controlling the PCIe core and PCIe PHY reset lines directly (like 
on exynos), or is there please?

Best regards,
Marek Vasut
Marek Vasut June 26, 2014, 9:17 a.m. UTC | #8
On Thursday, June 26, 2014 at 09:32:30 AM, Juergen Borleis wrote:
> Hi Fabio,
> 
> On Thursday 26 June 2014 05:43:19 Fabio Estevam wrote:
> > [...]
> 
> > Below is the log with linux-next kernel and earlyprintk enabled:
> Does it also hang when you disable "earlyprintk"? A few weeks ago we had a
> similar issue with i.MX6/PCI and "earlyprintk" (at the end it was caused by
> a clock change for the UART when changing from the early to the regular
> console).

How can the UART clock have impact on the PCIe please ?

Best regards,
Marek Vasut
Juergen Borleis June 26, 2014, 9:26 a.m. UTC | #9
Hi Marek,

On Thursday 26 June 2014 11:17:12 Marek Vasut wrote:
> On Thursday, June 26, 2014 at 09:32:30 AM, Juergen Borleis wrote:
> > On Thursday 26 June 2014 05:43:19 Fabio Estevam wrote:
> > > [...]
> > >
> > > Below is the log with linux-next kernel and earlyprintk enabled:
> >
> > Does it also hang when you disable "earlyprintk"? A few weeks ago we had
> > a similar issue with i.MX6/PCI and "earlyprintk" (at the end it was
> > caused by a clock change for the UART when changing from the early to the
> > regular console).
>
> How can the UART clock have impact on the PCIe please ?

Not to the PCIe hardware, it just hanged the kernel when the PCIe driver came 
up (due to some of its messages it tries to output at this moment).

jbe
Marek Vasut June 26, 2014, 9:50 a.m. UTC | #10
On Thursday, June 26, 2014 at 11:26:09 AM, Juergen Borleis wrote:
> Hi Marek,
> 
> On Thursday 26 June 2014 11:17:12 Marek Vasut wrote:
> > On Thursday, June 26, 2014 at 09:32:30 AM, Juergen Borleis wrote:
> > > On Thursday 26 June 2014 05:43:19 Fabio Estevam wrote:
> > > > [...]
> > > 
> > > > Below is the log with linux-next kernel and earlyprintk enabled:
> > > Does it also hang when you disable "earlyprintk"? A few weeks ago we
> > > had a similar issue with i.MX6/PCI and "earlyprintk" (at the end it
> > > was caused by a clock change for the UART when changing from the early
> > > to the regular console).
> > 
> > How can the UART clock have impact on the PCIe please ?
> 
> Not to the PCIe hardware, it just hanged the kernel when the PCIe driver
> came up (due to some of its messages it tries to output at this moment).

Another hint, can a blocking msleep() hang the driver ? The driver probes in 
fs_initcall stage, I'm not sure if it cannot hit some kind of a problem when 
using the msleep() ? It's unlikely though ...

Best regards,
Marek Vasut
Fabio Estevam June 26, 2014, 11:43 a.m. UTC | #11
Hi Juergen,

On Thu, Jun 26, 2014 at 4:32 AM, Juergen Borleis <jbe@pengutronix.de> wrote:
> Hi Fabio,
>
> On Thursday 26 June 2014 05:43:19 Fabio Estevam wrote:
>> [...]
>> Below is the log with linux-next kernel and earlyprintk enabled:
>
> Does it also hang when you disable "earlyprintk"? A few weeks ago we had a
> similar issue with i.MX6/PCI and "earlyprintk" (at the end it was caused by a
> clock change for the UART when changing from the early to the regular
> console).

Yes, it also hangs without earlyprintk.

Regards,

Fabio Estevam
Tim Harvey June 27, 2014, 12:29 a.m. UTC | #12
On Wed, Jun 25, 2014 at 10:49 PM, Shawn Guo <shawn.guo@freescale.com> wrote:
> On Thu, Jun 26, 2014 at 12:43:19AM -0300, Fabio Estevam wrote:
>> On Thu, Jun 26, 2014 at 12:12 AM, Shawn Guo <shawn.guo@freescale.com> wrote:
>> > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
>> >> From: Fabio Estevam <fabio.estevam@freescale.com>
>> >>
>> >> When the mx6 PCI controller is initialized in the bootloader we see a kernel
>> >> hang inside imx6_add_pcie_port().
>> >>
>> >> Adding a 30ms delay allows the kernel to boot.
>> >
>> > We may not want to add a random delay into the driver before we
>> > understand the root cause of the issue.
>>
>> Yes, that's why I sent this as RFC and also explained it below the ---
>> line that I am actually trying to get some help with this issue.
>>
>> >
>> > Do you see this issue with FSL kernel?
>>
>> Yes, it also hangs.
>>
>> It is reproducible in 100% of the boots. Just need to use mainline
>> U-boot (which has PCI driver enabled by default).
>> I am using an Intel Wifi 7260 PCI card. This was also reported by
>> other folks in the U-boot list.
>
> Richard,
>
> Can you schedule some time to look at this issue?  I think it will come
> to us sooner or later if any our customer enables PCIe before launching
> kernel?
>
> Shawn
>

Shawn / Richard,

I am also affected by this issue on IMX6 boards that I support. If I
enable PCI in the bootloader I see similar hangs.

I have the following hardware configurations on my bench:
  1. IMX6DL + i210 (same PCI setup as Fabio's above, but DL instead of Q)
  2. IMX6Q + ath9k device
  3. IMX6DL + PLX PEX860x PCIe-to-PCIe bridge with various devices
behind the bridge, using a clock buffer from IMX6 PCIe clock
  4. IMX6Q + PLX PEX860x PCIe-to-PCIe bridge with various devices
behind the bridge, using a clock buffer from IMX6 PCIe clock
  5. IMX6Q + PLX PEX860x PCIe-to-PCIe bridge with various devices
behind the bridge using a clock generator (always on, ignoring the
PCIe clock)

For all of the above I have no PCI issues using
3.14/3.15/3.16-rc2/vendor 3.10.17_1.0.0_ga unless I enable PCI in the
bootloader. When I do so, all of the above configurations hang
somewhere around PCI init/enumeration. The same occurs with the most
recent vendor kernel 3.10.17_1.0.0_ga (works when PCI is
disabled in the bootloader, hangs otherwise).

When I apply Fabio's patch above to the 3.16-rc2 kernel I find that
scenarios #4 and #5 above then work, #3 boots but the PLX bridge fails
all config cycles (0xff's), #2 boots but with no PCIe link, and #1
above still hangs. Previously, when I have dug into this particular
'hang' issue on 3.15 I found that the delay needed to be between
imx6_pcie_probe() requesting and asserting reset_gpio low, and before
setting IOMUX_GPR1:18 to power down the PCIe PHY (note here, that the
PHY is currently enabled in the bootloader when PCI is enabled there).
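
For readers unfamiliar with the "IOMUX_GPR1:18" notation, it refers to bit 18 of the 32-bit GPR1 register. The sketch below shows only the bit arithmetic on a plain integer. The macro and function names are made up for illustration, and no hardware register is touched. Per the description above, setting the bit powers the PHY down.

```c
#include <assert.h>
#include <stdint.h>

/* Build a single-bit mask; n must be a valid bit index for 32 bits. */
static uint32_t bit32(unsigned int n)
{
	assert(n < 32);
	return UINT32_C(1) << n;
}

/* Hypothetical name for bit 18 of GPR1, as described in the text. */
#define GPR1_PCIE_PHY_PD_BIT 18u

/* Return the GPR1 value with the PHY power-down bit set. */
static uint32_t gpr1_phy_power_down(uint32_t gpr1)
{
	return gpr1 | bit32(GPR1_PCIE_PHY_PD_BIT);
}

/* Return the GPR1 value with the PHY power-down bit cleared. */
static uint32_t gpr1_phy_power_up(uint32_t gpr1)
{
	return gpr1 & ~bit32(GPR1_PCIE_PHY_PD_BIT);
}

/* Non-zero if the power-down bit is set in the given GPR1 value. */
static int gpr1_phy_is_powered_down(uint32_t gpr1)
{
	return (gpr1 & bit32(GPR1_PCIE_PHY_PD_BIT)) != 0;
}
```

In a driver this would be a read-modify-write on the real register, and the ordering described above (reset GPIO asserted, then the delay, then the power-down write) is the part that matters for the hang.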

When I apply Fabio's patch above to the most recent vendor kernel
3.10.17_1.0.0_ga I still hang in all cases.

So while I agree there is something horribly wrong with IMX6 PCI
still, I don't think Fabio's patch is the right solution and I don't
have anything better at this point in time. I'm happy to share any
hardware with anyone that can work through this issue.

Thanks,

Tim
Richard Zhu June 27, 2014, 10:06 a.m. UTC | #13
Hi


> -----Original Message-----
> From: Tim Harvey [mailto:tharvey@gateworks.com]
> Sent: Friday, June 27, 2014 8:29 AM
> To: Guo Shawn-R65073; Zhu Richard-R65037
> Cc: Fabio Estevam; Bjorn Helgaas; Marek Vašut; David Müller (ELSOFT AG);
> Sascha Hauer; linux-arm-kernel@lists.infradead.org; linux-pci@vger.kernel.org;
> Estevam Fabio-R49496
> Subject: Re: [RFC] PCI: pci-imx6: Add delay to workaround kernel hang
>
> On Wed, Jun 25, 2014 at 10:49 PM, Shawn Guo <shawn.guo@freescale.com> wrote:
> > On Thu, Jun 26, 2014 at 12:43:19AM -0300, Fabio Estevam wrote:
> >> On Thu, Jun 26, 2014 at 12:12 AM, Shawn Guo <shawn.guo@freescale.com> wrote:
> >> > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> >> >> From: Fabio Estevam <fabio.estevam@freescale.com>
> >> >>
> >> >> When the mx6 PCI controller is initialized in the bootloader we
> >> >> see a kernel hang inside imx6_add_pcie_port().
> >> >>
> >> >> Adding a 30ms delay allows the kernel to boot.
> >> >
> >> > We may not want to add a random delay into the driver before we
> >> > understand the root cause of the issue.
> >>
> >> Yes, that's why I sent this as RFC and also explained it below the
> >> --- line that I am actually trying to get some help with this issue.
> >>
> >> >
> >> > Do you see this issue with FSL kernel?
> >>
> >> Yes, it also hangs.
> >>
> >> It is reproducible in 100% of the boots. Just need to use mainline
> >> U-boot (which has PCI driver enabled by default).
> >> I am using an Intel Wifi 7260 PCI card. This was also reported by
> >> other folks in the U-boot list.
> >
> > Richard,
> >
> > Can you schedule some time to look at this issue?  I think it will
> > come to us sooner or later if any our customer enables PCIe before
> > launching kernel?
> >
> > Shawn
> >
>

[Richard] I tested this use case, with i.MX6 PCIe enabled in both U-Boot and the kernel,
and I see the system hang too.

Here are the latest updates and some clues from my side:
- A delay of about 200us is required after ltssm_en is set to '1' on my side.
Otherwise, the system hangs when the driver accesses pcie_phy_debug_r1 to check
whether the link is up.

- After that, the system hangs when the RC tries to access the config space of the EP device,
although the PCIe link has been set up. I don't know the root cause yet.

- The kernel boots successfully when I mask the config read/write callbacks in the
pcie-designware.c driver for debugging purposes.

Note:
The PCIe initialization sequence should be adjusted because of a newly
discovered bug (the PCIe link may, rarely and at random, go down after a system warm reset).
* ref_ssp_en (bit 16 of the GPR1 register) should be set after the other PCIe clocks are enabled:
Enable PCIe-related clocks --> delay about 10us for the clocks to stabilize --> set ref_ssp_en (bit 16 of the GPR1 register).

Debugging is still ongoing.
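
The clock ordering in the note above can be sketched as a small
user-space C model. This is an illustration under stated assumptions,
not driver code: struct fake_pcie and the fake_*() helpers are
hypothetical, the GPR1 register is a plain variable, and the only facts
taken from the note are the bit position (16) and the approximate 10us
wait.

```c
#include <stdint.h>

/* ref_ssp_en, bit 16 of the GPR1 register, as stated in the note above. */
#define GPR1_REF_SSP_EN (1u << 16)

/* Steps recorded by the model so the ordering can be inspected. */
enum step { STEP_CLKS_ON, STEP_DELAY, STEP_REF_SSP_EN };

/* Hypothetical stand-in for controller state; no real hardware access. */
struct fake_pcie {
	uint32_t gpr1;       /* simulated GPR1 register */
	int clks_enabled;    /* nonzero once the PCIe clocks are "on" */
	unsigned waited_us;  /* total simulated delay */
	enum step log[3];    /* order in which the steps ran */
	int nlog;
};

static void fake_enable_clks(struct fake_pcie *p)
{
	p->clks_enabled = 1;
	p->log[p->nlog++] = STEP_CLKS_ON;
}

static void fake_udelay(struct fake_pcie *p, unsigned us)
{
	p->waited_us += us;
	p->log[p->nlog++] = STEP_DELAY;
}

static void fake_set_ref_ssp_en(struct fake_pcie *p)
{
	p->gpr1 |= GPR1_REF_SSP_EN;
	p->log[p->nlog++] = STEP_REF_SSP_EN;
}

/* Clocks first, then a short wait, and only then ref_ssp_en. */
static void pcie_clk_bringup(struct fake_pcie *p)
{
	fake_enable_clks(p);
	fake_udelay(p, 10);
	fake_set_ref_ssp_en(p);
}
```

Calling pcie_clk_bringup() on a zero-initialized struct fake_pcie leaves
the three steps in the log in the order given in the note, which makes
the sequence easy to check in a unit test before touching the driver.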

> Shawn / Richard,
>
> I am also affected by this issue on IMX6 boards that I support. If I enable
> PCI in the bootloader I see similar hangs.
>
> I have the following hardware configurations on my bench:
>   1. IMX6DL + i210 (same PCI setup as Fabio's above, but DL instead of Q)
>   2. IMX6Q + ath9k device
>   3. IMX6DL + PLX PEX860x PCIe-to-PCIe bridge with various devices behind the
>      bridge, using a clock buffer from IMX6 PCIe clock
>   4. IMX6Q + PLX PEX860x PCIe-to-PCIe bridge with various devices behind the
>      bridge, using a clock buffer from IMX6 PCIe clock
>   5. IMX6Q + PLX PEX860x PCIe-to-PCIe bridge with various devices behind the
>      bridge using a clock generator (always on, ignoring the PCIe clock)
>
> For all of the above I have no PCI issues using 3.14/3.15/3.16-rc2/vendor
> 3.10.17_1.0.0_ga unless I enable PCI in the bootloader. When I do so, all of
> the above configurations hang somewhere around PCI init/enumeration. The same
> occurs with the most recent vendor kernel 3.10.17_1.0.0_ga kernel (works when
> PCI is disabled in the bootloader, hangs otherwise).
>
> When I apply Fabio's patch above to the 3.16-rc2 kernel I find that scenarios
> #4 and #5 above then work, #3 boots but the PLX bridge fails all config cycles
> (0xff's), #2 boots but with no PCIe link, and #1 above still hangs. Previously,
> when I have dug into this particular 'hang' issue on 3.15 I found that the
> delay needed to be between
> imx6_pcie_probe() requesting and asserting reset_gpio low, and before setting
> IOMUX_GPR1:18 to power down the PCIe PHY (note here, that the PHY is currently
> enabled in the bootloader when PCI is enabled there).
>
> When I apply Fabio's patch above to the most recent vendor kernel
> 3.10.17_1.0.0_ga I still hang in all cases.
>
> So while I agree there is something horribly wrong with IMX6 PCI still, I
> don't think Fabio's patch is the right solution and I don't have anything
> better at this point in time. I'm happy to share any hardware with anyone that
> can work through this issue.
>
> Thanks,
>
> Tim


Best Regards
Richard Zhu
Tim Harvey July 17, 2014, 12:28 a.m. UTC | #14
On Fri, Jun 27, 2014 at 3:06 AM, Hong-Xing.Zhu@freescale.com
<Hong-Xing.Zhu@freescale.com> wrote:
> [Richard] I did the tests refer to this use-case, enable imx6 pcie on both u-boot and kernel,
> and I encounter the system hang too.
>
> Here are the latest updates and some clues from my side:
> - About 200us delay is required after the ltssm_en is set to be '1' at my side.
> Otherwise, system would be hang when driver access the pcie_phy_debug_r1 to check
> the link is up or not.
>
> - After that, system hang when rc trying to access the cfg space of ep device,
> Although the pcie link had been setup and I don't have know the root cause yet.
>
> - The kernel can boot up successfully, when I mask the cfg read/write call-back in pcie-designware.c
> Driver for debug purpose.
>
> Note:
> The sequence of the pcie initialization should be adjusted refer to the newly
>  discovered bug(pcie link maybe rarely random down after the system warm-reset).
> * Ref_ssp_en(bit16 of gpr1 register) should be set after the pcie others clks are enable.
> Enable pcie related clks --> delay for about ~10us waiting for the clks stable-->set ref_ssp_en(bit16 of gpr1 register).
>
> Debug is still on-going.
>

Any update on this? Are you or others at Freescale actively working on this?

Regards,

Tim
Uwe Kleine-König July 17, 2014, 6:51 a.m. UTC | #15
Hello,

On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> From: Fabio Estevam <fabio.estevam@freescale.com>
> 
> When the mx6 PCI conctroller is initialized in the bootloader we see a kernel 
> hang inside imx6_add_pcie_port().
> 
> Adding a 30ms delay allows the kernel to boot.
Just a thought on how to debug this: I'd try to bisect the PCI init
routine in the boot loader. I.e., first do only the first half of the
initialisation in U-Boot. Depending on whether Linux is able to boot,
initialize more or less on the next run.

Maybe there is a single register write that makes Linux fail?!
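
The bisection idea above can be sketched in C as a binary search over
the number of init steps the bootloader performs. This is only a sketch
under assumptions: the init routine is treated as a linear list of n
steps, failure is assumed to be monotonic (once a step breaks Linux,
running further steps does not fix it), and boots_fn, first_bad_step()
and example_boots() are hypothetical names. In practice the "boots"
check is a manual rebuild of U-Boot followed by a boot attempt.

```c
#include <stddef.h>

/*
 * Returns nonzero if Linux boots when the bootloader runs only the
 * first `steps` init steps. Hypothetical callback; a real test would
 * be a rebuild-and-boot cycle on the board.
 */
typedef int (*boots_fn)(size_t steps, void *ctx);

/*
 * Returns the 1-based index of the first init step whose inclusion
 * makes Linux fail, or 0 if Linux boots even with all n steps.
 * Assumes Linux boots with 0 steps and that failure is monotonic.
 */
static size_t first_bad_step(size_t n, boots_fn boots, void *ctx)
{
	size_t lo = 0, hi = n;

	if (boots(n, ctx))
		return 0;

	/* Invariant: Linux boots with lo steps and fails with hi steps. */
	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (boots(mid, ctx))
			lo = mid;
		else
			hi = mid;
	}
	return hi;
}

/*
 * Stand-in for the real boot test: pretends that step number *ctx
 * (1-based) is the one that breaks Linux.
 */
static int example_boots(size_t steps, void *ctx)
{
	size_t bad = *(size_t *)ctx;

	return steps < bad;
}
```

With n init steps this needs roughly log2(n) boot attempts after the
initial full run, which matters when each attempt is a manual
reflash-and-boot cycle.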

Best regards
Uwe
Marek Vasut July 17, 2014, 8:23 a.m. UTC | #16
On Thursday, July 17, 2014 at 08:51:48 AM, Uwe Kleine-König wrote:
> Hello,
> 
> On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> > From: Fabio Estevam <fabio.estevam@freescale.com>
> > 
> > When the mx6 PCI conctroller is initialized in the bootloader we see a
> > kernel hang inside imx6_add_pcie_port().
> > 
> > Adding a 30ms delay allows the kernel to boot.
> 
> Just my thought on how to debug that: I'd try to bisect the pci init
> routine in the boot loader. I.e. first only do the first half of the
> initialisation in U-Boot. Depending on Linux being able to boot or not
> initialize more or less on the next run.
> 
> Maybe there is a single register write that makes Linux fail?!

I am still hell-bent on thinking that the missing PCIe block reset is what makes
Linux fail. A missing block reset is always a problem. Or do we now have a
means to reset the PCIe block and its PHY from software?

Best regards,
Marek Vasut
Shawn Guo July 17, 2014, 3:27 p.m. UTC | #17
On Thu, Jul 17, 2014 at 10:23:10AM +0200, Marek Vasut wrote:
> On Thursday, July 17, 2014 at 08:51:48 AM, Uwe Kleine-König wrote:
> > Hello,
> > 
> > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> > > From: Fabio Estevam <fabio.estevam@freescale.com>
> > > 
> > > When the mx6 PCI conctroller is initialized in the bootloader we see a
> > > kernel hang inside imx6_add_pcie_port().
> > > 
> > > Adding a 30ms delay allows the kernel to boot.
> > 
> > Just my thought on how to debug that: I'd try to bisect the pci init
> > routine in the boot loader. I.e. first only do the first half of the
> > initialisation in U-Boot. Depending on Linux being able to boot or not
> > initialize more or less on the next run.
> > 
> > Maybe there is a single register write that makes Linux fail?!
> 
> I am still hell-bent on thinking that the missing PCIe block reset is what makes 
> the Linux fail. Missing block reset is always a problem.

Indeed.  We're missing a hardware reset for PCIe on i.MX6Q and i.MX6DL.
Such a reset is available on i.MX6SX, so this problem does not exist for
i.MX6SX PCIe.

> Or do we now have a 
> mean to reset the PCIe block and it's PHY from software?

Richard is trying to find a SW workaround for it, but we're not really
sure if it's possible.

Shawn
Marek Vasut July 18, 2014, 8:46 p.m. UTC | #18
On Thursday, July 17, 2014 at 05:27:09 PM, Shawn Guo wrote:
> On Thu, Jul 17, 2014 at 10:23:10AM +0200, Marek Vasut wrote:
> > On Thursday, July 17, 2014 at 08:51:48 AM, Uwe Kleine-König wrote:
> > > Hello,
> > > 
> > > On Tue, Jun 24, 2014 at 04:18:27PM -0300, Fabio Estevam wrote:
> > > > From: Fabio Estevam <fabio.estevam@freescale.com>
> > > > 
> > > > When the mx6 PCI conctroller is initialized in the bootloader we see
> > > > a kernel hang inside imx6_add_pcie_port().
> > > > 
> > > > Adding a 30ms delay allows the kernel to boot.
> > > 
> > > Just my thought on how to debug that: I'd try to bisect the pci init
> > > routine in the boot loader. I.e. first only do the first half of the
> > > initialisation in U-Boot. Depending on Linux being able to boot or not
> > > initialize more or less on the next run.
> > > 
> > > Maybe there is a single register write that makes Linux fail?!
> > 
> > I am still hell-bent on thinking that the missing PCIe block reset is
> > what makes the Linux fail. Missing block reset is always a problem.
> 
> Indeed.  We're missing a hardware reset for PCIe on i.MX6Q and i.MX6DL.
> Such reset is available on i.MX6SX, so there is no this problem for
> i.MX6SX PCIe.
> 
> > Or do we now have a
> > mean to reset the PCIe block and it's PHY from software?
> 
> Richard is trying to find a SW workaround for it, but we're not really
> sure if it's possible.

I hate to ask this, but does that mean all but MX6SX are irreparably broken
and will never have a reliable PCIe implementation?

Best regards,
Marek Vasut

Patch

diff --git a/drivers/pci/host/pci-imx6.c b/drivers/pci/host/pci-imx6.c
index a568efa..669f771 100644
--- a/drivers/pci/host/pci-imx6.c
+++ b/drivers/pci/host/pci-imx6.c
@@ -507,6 +507,8 @@  static int __init imx6_add_pcie_port(struct pcie_port *pp,
 	pp->root_bus_nr = -1;
 	pp->ops = &imx6_pcie_host_ops;
 
+	usleep_range(25000, 30000);
+
 	ret = dw_pcie_host_init(pp);
 	if (ret) {
 		dev_err(&pdev->dev, "failed to initialize host\n");