Message ID | 6d0ae269-6104-02e8-be21-d3840cd6b327@web.de |
---|---|
State | Accepted |
Delegated to: | Lokesh Vutla |
Series | am654_sdhci: mmc fail to send stop cmd |
Hi Jan,

> Subject: am654_sdhci: mmc fail to send stop cmd
>
> Hi all,
>
> on one device with one specific SD-card (possibly an aging one), I'm seeing
> frequent "mmc fail to send stop cmd" messages, followed by read errors
> when loading kernel and dtb. -ETIMEDOUT is returned by mmc_send_cmd.
> However, I can always resolve this by simply retrying the stop command like
> this:
>
> diff --git a/drivers/mmc/mmc.c b/drivers/mmc/mmc.c
> index f36d11ddc8..9019d9f2ed 100644
> --- a/drivers/mmc/mmc.c
> +++ b/drivers/mmc/mmc.c
> @@ -406,7 +406,11 @@ static int mmc_read_blocks(struct mmc *mmc, void *dst, lbaint_t start,
>  #if !defined(CONFIG_SPL_BUILD) || defined(CONFIG_SPL_LIBCOMMON_SUPPORT)
>  			pr_err("mmc fail to send stop cmd\n");
>  #endif
> -			return 0;
> +			pr_err("retrying...\n");
> +			if (mmc_send_cmd(mmc, &cmd, NULL)) {
> +				pr_err("failed again\n");
> +				return 0;
> +			}
>  		}
>  	}
>
>
> Hardware is our IOT2050, baseline is today's master (1c4b5038afcc) with
> board-enabling and a bunch of patches from your tree [1]. However, already
> 4d6da10ce611 exposes the problem.
>
> What could cause this?

Where does the timeout happen in the driver?

Did you try enlarging the timeout value?

Regards,
Peng.

>
> Jan
>
> [1]
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
> com%2Fsiemens%2Fu-boot%2Fcommits%2Fjan%2Fiot2050&data=02%7
> C01%7CPeng.Fan%40nxp.com%7Cda088100ee5a46cdc37008d82b29779f%7
> C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C63730680439552710
> 6&sdata=oiS6nOxxAQykjMyTecz%2FTJY4OW8WiZ2CbszR2mBrXuI%3D&
> amp;reserved=0
On 20.07.20 03:21, Peng Fan wrote:
> [...]
>> What could cause this?
>
> Where does the timeout happen in the driver?
>
> Did you try enlarging the timeout value?

Not sure yet where I could do that. The timeout is detected and reported
by the hardware via SDHCI_INT_STATUS (= 0x18000 in case of an error).

Thanks,
Jan

>> [1]
>> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
                 ^^^^^^^^^
Welcome to the club. If you are on TB, I can recommend "Unmangle Outlook
Safelinks" to get rid of this insecurity measure, at least on the client
side.

>> [...]
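For reference, the status values reported in this thread (0x18000 here, and 0x208000 in the boot log further down) can be decoded against the SDHCI interrupt status bit layout. The sketch below is a standalone illustration, not driver code. The bit values are copied from the SDHCI_INT_* definitions found in the U-Boot and Linux sdhci.h headers; the helper name sdhci_describe_status and the short text labels are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Error-related interrupt status bits, as defined in sdhci.h. */
#define SDHCI_INT_ERROR        0x00008000u /* summary: some error bit is set */
#define SDHCI_INT_TIMEOUT      0x00010000u /* command timeout */
#define SDHCI_INT_CRC          0x00020000u /* command CRC error */
#define SDHCI_INT_END_BIT      0x00040000u /* command end bit error */
#define SDHCI_INT_INDEX        0x00080000u /* command index error */
#define SDHCI_INT_DATA_TIMEOUT 0x00100000u
#define SDHCI_INT_DATA_CRC     0x00200000u
#define SDHCI_INT_DATA_END_BIT 0x00400000u

struct bit_name {
	uint32_t bit;
	const char *name;	/* label chosen for this example only */
};

static const struct bit_name err_bits[] = {
	{ SDHCI_INT_ERROR,        "ERROR" },
	{ SDHCI_INT_TIMEOUT,      "CMD_TIMEOUT" },
	{ SDHCI_INT_CRC,          "CMD_CRC" },
	{ SDHCI_INT_END_BIT,      "CMD_END_BIT" },
	{ SDHCI_INT_INDEX,        "CMD_INDEX" },
	{ SDHCI_INT_DATA_TIMEOUT, "DATA_TIMEOUT" },
	{ SDHCI_INT_DATA_CRC,     "DATA_CRC" },
	{ SDHCI_INT_DATA_END_BIT, "DATA_END_BIT" },
};

/*
 * Hypothetical helper: write the names of all known error bits set in
 * 'stat' into 'buf', separated by '|'. Returns the number of characters
 * written (excluding the terminating NUL). Unknown bits are ignored.
 */
size_t sdhci_describe_status(uint32_t stat, char *buf, size_t len)
{
	size_t used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(err_bits) / sizeof(err_bits[0]); i++) {
		int n;

		if (!(stat & err_bits[i].bit))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", err_bits[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial label */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

Under these definitions, 0x18000 is the error summary bit plus the command timeout bit, which matches the -ETIMEDOUT return, and 0x208000 is the error summary bit plus the data CRC bit.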
On 7/20/20 10:21 AM, Peng Fan wrote:
> [...]
>> What could cause this?
>
> Where does the timeout happen in the driver?
>
> Did you try enlarging the timeout value?

How about adding SDHCI_QUIRK_WAIT_SEND_CMD?

And, as Peng commented, we need to find where the driver code returns the error.

Best Regards,
Jaehoon Chung
On 21.07.20 01:23, Jaehoon Chung wrote:
> [...]
> How about adding SDHCI_QUIRK_WAIT_SEND_CMD?

I tried that already, but the result was even worse, a non-working mmc.

> And, as Peng commented, we need to find where the driver code returns the error.

As written in my other reply:
https://gitlab.denx.de/u-boot/u-boot/-/blob/f12341a9529540113f01989149bbbeb68662a829/drivers/mmc/sdhci.c#L385

Thus, it's reported by the hw.

Thanks,
Jan
Jan,

On 21/07/20 12:06 pm, Jan Kiszka wrote:
> [...]
> As written in my other reply:
> https://gitlab.denx.de/u-boot/u-boot/-/blob/f12341a9529540113f01989149bbbeb68662a829/drivers/mmc/sdhci.c#L385
>
> Thus, it's reported by the hw.

It's a command timeout, for which we cannot program a higher timeout.

Can you send a full failure log?

Also, does the same card + board combination work in the kernel? That should help us point to hardware vs. U-Boot.

Thanks,
Faiz
On 21.07.20 19:03, Faiz Abbas wrote:
> [...]
> It's a command timeout, for which we cannot program a higher timeout.
>
> Can you send a full failure log?

[unrelated fsbl, spl stuff]

U-Boot 2020.07-00883-g4d6da10ce6-dirty (Jul 20 2020 - 06:30:08 +0200)

Model: Siemens IOT2050 Advanced Base Board
DRAM: 2 GiB
MMC: sdhci@4f80000: 1, sdhci@04FA0000: 0
Loading Environment from SPI Flash... SF: Detected w25q128 with page size 256 Bytes, erase size 64 KiB, total 16 MiB
OK
In: serial
Out: serial
Err: serial
Hit any key to stop autoboot: 0
stat: 18000
stat: 18000
stat: 208000
switch to partitions #0, OK
mmc1(part 0) is current device
** No partition table - mmc 1 **
switch to partitions #0, OK
mmc0 is current device
Scanning mmc 0:1...
Found U-Boot script /boot/boot.scr
784 bytes read in 2 ms (382.8 KiB/s)
## Executing script at 83000000
65329 bytes read in 11 ms (5.7 MiB/s)
stat: 18000
mmc fail to send stop cmd, -110
retrying...
17113096 bytes read in 1409 ms (11.6 MiB/s)
Moving Image from 0x80080000 to 0x80200000, end=812c0000
## Flattened Device Tree blob at 82000000
Booting using the fdt blob at 0x82000000
Loading Device Tree to 00000000fdf0f000, end 00000000fdf21f30 ... OK

[kernel boot]

The diff I'm carrying on top of [1] is below.

> Also, does the same card + board combination work in the kernel? That should help us point to hardware vs. U-Boot.

The same card on the same board works without complaints with the kernel
driver (5.8-rc5 at the moment). Even more strange, the same card on a
different board (IOT2050 Basic, same SoC series, slightly different
type) does not throw those errors with the same U-Boot.

Note that we are still carrying those clock swapping changes in [2].
I've also tried to remove it, but it has no impact on this issue.

Thanks!
Jan

[1] https://github.com/siemens/u-boot/commits/4d6da10ce611484befd4cebbf294c89bffe927b3
[2] https://github.com/siemens/u-boot/commit/4d6da10ce611484befd4cebbf294c89bffe927b3#diff-acb4b23f67868f5c9dcd2c30c0e92dffR58

diff --git a/drivers/mmc/mmc.c b/drivers/mmc/mmc.c
index f36d11ddc8..c855e3075e 100644
--- a/drivers/mmc/mmc.c
+++ b/drivers/mmc/mmc.c
@@ -402,11 +402,16 @@ static int mmc_read_blocks(struct mmc *mmc, void *dst, lbaint_t start,
 		cmd.cmdidx = MMC_CMD_STOP_TRANSMISSION;
 		cmd.cmdarg = 0;
 		cmd.resp_type = MMC_RSP_R1b;
-		if (mmc_send_cmd(mmc, &cmd, NULL)) {
+		int ret = mmc_send_cmd(mmc, &cmd, NULL);
+		if (ret) {
 #if !defined(CONFIG_SPL_BUILD) || defined(CONFIG_SPL_LIBCOMMON_SUPPORT)
-			pr_err("mmc fail to send stop cmd\n");
+			pr_err("mmc fail to send stop cmd, %d\n", ret);
 #endif
-			return 0;
+			pr_err("retrying...\n");
+			if (mmc_send_cmd(mmc, &cmd, NULL)) {
+				pr_err("failed again\n");
+				return 0;
+			}
 		}
 	}
 
diff --git a/drivers/mmc/sdhci.c b/drivers/mmc/sdhci.c
index f4eb655f6e..faefe6c8c9 100644
--- a/drivers/mmc/sdhci.c
+++ b/drivers/mmc/sdhci.c
@@ -381,6 +381,7 @@ static int sdhci_send_command(struct mmc *mmc, struct mmc_cmd *cmd,
 
 	sdhci_reset(host, SDHCI_RESET_CMD);
 	sdhci_reset(host, SDHCI_RESET_DATA);
+	printf("stat: %x\n", stat);
 	if (stat & SDHCI_INT_TIMEOUT)
 		return -ETIMEDOUT;
 	else
Jan,

On 21/07/20 10:52 pm, Jan Kiszka wrote:
> [...]
> The same card on the same board works without complaints with the kernel
> driver (5.8-rc5 at the moment). Even more strange, the same card on a
> different board (IOT2050 Basic, same SoC series, slightly different
> type) does not throw those errors with the same U-Boot.

Was this card working with an older U-Boot version and only failing in mainline?

> Note that we are still carrying those clock swapping changes in [2].
> I've also tried to remove it, but it has no impact on this issue.

One more thing to try is to reduce the speed mode to default, as we are already gating the frequency to 25 MHz. Can you modify the sdhci-caps-mask to the following for sdhci1?

sdhci-caps-mask = <0x7 0x200000>;

Thanks,
Faiz
Jan,

On 23/07/20 8:55 am, Faiz Abbas wrote:
> [...]
> One more thing to try is to reduce the speed mode to default as we are already gating frequency
> to 25 MHz. Can you modify the sdhci-caps-mask to the following for sdhci1?
>
> sdhci-caps-mask = <0x7 0x200000>;

You'll need to apply this fix for this mask to work:

https://patchwork.ozlabs.org/project/uboot/patch/20200723041219.2438-1-faiz_abbas@ti.com/

Thanks,
Faiz
On 23.07.20 05:25, Faiz Abbas wrote:
> [...]
> Was this card working with an older U-Boot version and only failing in mainline?

Good point: Just tested our legacy firmware that was based on
https://git.ti.com/cgit/processor-sdk/processor-sdk-u-boot/log/?h=029e4c009aaeaee2d06aa8271dbd3a9e73a28aa7
(https://github.com/siemens/meta-iot2050/blob/master/recipes-bsp/u-boot/u-boot-iot2050-2019.01-ti-sdk.inc),
and it does not expose the issue so far. If I look at the transfer rate,
2.8 MiB/s with the old firmware vs. 11.x MiB/s with upstream, your
suggestion below may make the difference.

> One more thing to try is to reduce the speed mode to default as we are already gating frequency
> to 25 MHz. Can you modify the sdhci-caps-mask to the following for sdhci1?
>
> sdhci-caps-mask = <0x7 0x200000>;

Trying that out now...

Jan
On 23.07.20 07:25, Jan Kiszka wrote:
> On 23.07.20 05:25, Faiz Abbas wrote:
> [...]
>> One more thing to try is to reduce the speed mode to default as we are
>> already gating frequency
>> to 25 MHz. Can you modify the sdhci-caps-mask to the following for
>> sdhci1?
>>
>> sdhci-caps-mask = <0x7 0x200000>;
>
> Trying that out now...

Yep, that works as well (and it does not even degrade the read
performance: still 11 MiB/s with this card).

What does it tell us?

Jan
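As a rough illustration of what the suggested mask requests: a 64-bit device tree property such as sdhci-caps-mask is encoded as two 32-bit cells, upper word first, and each bit set in the mask is cleared from the controller's capability registers before they are used. The sketch below assumes that behaviour and uses the capability bit values from sdhci.h (SDHCI_CAN_DO_HISPD in the lower word; SDR50, SDR104 and DDR50 support in the upper word). The struct and function names are hypothetical, and the actual driver code may be organised differently.

```c
#include <assert.h>
#include <stdint.h>

/* Capability bits as defined in sdhci.h. */
#define SDHCI_CAN_DO_HISPD   0x00200000u /* lower capability word, bit 21 */
#define SDHCI_SUPPORT_SDR50  0x00000001u /* upper capability word, bit 0 */
#define SDHCI_SUPPORT_SDR104 0x00000002u /* upper capability word, bit 1 */
#define SDHCI_SUPPORT_DDR50  0x00000004u /* upper capability word, bit 2 */

/* Hypothetical container for the two 32-bit capability words. */
struct sdhci_caps {
	uint32_t caps;		/* lower 32 bits of the capabilities register */
	uint32_t caps_1;	/* upper 32 bits of the capabilities register */
};

/* Combine the two cells of <hi lo> into one 64-bit mask. */
uint64_t caps_mask_from_cells(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Clear every capability bit that is set in the mask. */
struct sdhci_caps apply_caps_mask(struct sdhci_caps hw, uint64_t mask)
{
	struct sdhci_caps out;

	out.caps = hw.caps & ~(uint32_t)mask;
	out.caps_1 = hw.caps_1 & ~(uint32_t)(mask >> 32);

	/* No masked bit may survive. */
	assert((out.caps & (uint32_t)mask) == 0);
	assert((out.caps_1 & (uint32_t)(mask >> 32)) == 0);
	return out;
}
```

With the cells <0x7 0x200000>, the upper word 0x7 covers the SDR50, SDR104 and DDR50 support bits and the lower word 0x200000 covers the high-speed bit, so under these assumptions the controller no longer advertises any of those modes.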
On 23.07.20 06:14, Faiz Abbas wrote:
> [...]
> You'll need to apply this fix for this mask to work:
>
> https://patchwork.ozlabs.org/project/uboot/patch/20200723041219.2438-1-faiz_abbas@ti.com/

BTW, could this be queued for upstream? We depend on it now.

Thanks,
Jan

PS: Subject has a typo ("correspnding").
diff --git a/drivers/mmc/mmc.c b/drivers/mmc/mmc.c
index f36d11ddc8..9019d9f2ed 100644
--- a/drivers/mmc/mmc.c
+++ b/drivers/mmc/mmc.c
@@ -406,7 +406,11 @@ static int mmc_read_blocks(struct mmc *mmc, void *dst, lbaint_t start,
 #if !defined(CONFIG_SPL_BUILD) || defined(CONFIG_SPL_LIBCOMMON_SUPPORT)
 			pr_err("mmc fail to send stop cmd\n");
 #endif
-			return 0;
+			pr_err("retrying...\n");
+			if (mmc_send_cmd(mmc, &cmd, NULL)) {
+				pr_err("failed again\n");
+				return 0;
+			}
 		}
 	}
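The patch above retries the stop command exactly once. The same idea can be written as a bounded retry loop; the following standalone sketch models it with a mock sender so it can be compiled and exercised outside U-Boot. Nothing here is U-Boot API: send_with_retry, flaky_card and flaky_send are hypothetical names, and the mock simply fails with -ETIMEDOUT a fixed number of times before succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* A command sender returns 0 on success or a negative errno value. */
typedef int (*send_fn)(void *ctx);

/*
 * Call 'send' up to 'max_attempts' times and stop at the first success.
 * Returns 0 on success, otherwise the error from the last attempt.
 */
int send_with_retry(send_fn send, void *ctx, int max_attempts)
{
	int ret = -EINVAL;

	assert(send != NULL && max_attempts > 0);

	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		ret = send(ctx);
		if (!ret)
			return 0;
		fprintf(stderr, "stop cmd failed (%d), attempt %d of %d\n",
			ret, attempt, max_attempts);
	}
	return ret;
}

/* Mock card that times out a configurable number of times. */
struct flaky_card {
	int failures_left;
	int calls;
};

int flaky_send(void *ctx)
{
	struct flaky_card *card = ctx;

	card->calls++;
	if (card->failures_left > 0) {
		card->failures_left--;
		return -ETIMEDOUT;
	}
	return 0;
}
```

Calling send_with_retry(flaky_send, &card, 2) with one pending failure corresponds to the behaviour seen in the log: one timeout, one successful retry. Whether a retry is the right fix, as opposed to restricting the speed mode as discussed in the thread, is a separate question.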