[3/3] mmc: tegra: prevent ACMD23 on Tegra 3

Message ID 20180712073904.4705-3-stefan@agner.ch
State New
Series
  • [1/3] mmc: tegra: prevent HS200 on Tegra 3

Commit Message

Stefan Agner July 12, 2018, 7:39 a.m.
It seems that SD3.0 advertisement needs to be set for higher eMMC
speed modes (namely DDR52) as well. The TRM states that the SD3.0
advertisement bit should be set for all controller instances, even
for those not supporting UHS-I mode...

When specifying vqmmc-supply as a fixed 1.8V regulator on a Tegra
SD/MMC instance which is connected to an eMMC device, the stack
enables SD3.0. However, enabling it has consequences: If SDHCI 3.0
support is advertised the stack enables Auto-CMD23. Unfortunately
Auto-CMD23 seems not to work well with Tegra 3 currently. It leads
to regular warnings:
  mmc2: Got command interrupt 0x00010000 even though no command operation was in progress.

It is not entirely clear why those errors happen. It seems that
a Linux 3.1 based downstream kernel which has Auto-CMD23 support
does not show those warnings.

Use quirk SDHCI_QUIRK2_ACMD23_BROKEN to prevent Auto-CMD23 being
used for now. With this the eMMC works stably in high-speed mode
while still announcing SD3.0.

This allows using mmc-ddr-1_8v to enable DDR52 mode. In DDR52
mode read speed improves from about 42MiB/s to 72MiB/s on an
Apalis T30.

Signed-off-by: Stefan Agner <stefan@agner.ch>
---
 drivers/mmc/host/sdhci-tegra.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Peter Geis July 26, 2018, 1:56 p.m. | #1
On 07/12/2018 03:39 AM, Stefan Agner wrote:
> It seems that SD3.0 advertisement needs to be set for higher eMMC
> speed modes (namely DDR52) as well. The TRM states that the SD3.0
> advertisement bit should be set for all controller instances, even
> for those not supporting UHS-I mode...
> 
> When specifying vqmmc-supply as a fixed 1.8V regulator on a Tegra
> SD/MMC instance which is connected to a eMMC device, the stack
> enables SD3.0. However, enabling it has consequences: If SDHCI 3.0
> support is advertised the stack enables Auto-CMD23. Unfortunately
> Auto-CMD23 seems not to work well with Tegra 3 currently. It leads
> to regular warnings:
>    mmc2: Got command interrupt 0x00010000 even though no command operation was in progress.
> 
> It is not entirely clear why those errors happens. It seems that
> a Linux 3.1 based downstream kernel which has Auto-CMD23 support
> does not show those warnings.
> 
> Use quirk SDHCI_QUIRK2_ACMD23_BROKEN to prevent Auto-CMD23 being
> used for now. With this the eMMC works stable on high-speed mode
> while still announcing SD3.0.
> 
> This allows to use mmc-ddr-1_8v to enables DDR52 mode. In DDR52
> mode read speed improves from about 42MiB/s to 72MiB/s on an
> Apalis T30.
> 
> Signed-off-by: Stefan Agner <stefan@agner.ch>
> ---
>   drivers/mmc/host/sdhci-tegra.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index 888a1ad511db..11c0b2069c7c 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -336,7 +336,15 @@ static const struct sdhci_pltfm_data sdhci_tegra30_pdata = {
>   		  SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC |
>   		  SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
>   	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
> -		   SDHCI_QUIRK2_BROKEN_HS200,
> +		   SDHCI_QUIRK2_BROKEN_HS200 |
> +		   /*
> +		    * Auto-CMD23 leads to "Got command interrupt 0x00010000 even
> +		    * though no command operation was in progress."
> +		    *
> +		    * The exact reason is unknown, as the same hardware seems
> +		    * to support Auto CMD23 on a downstream 3.1 kernel.
> +		    */
> +		   SDHCI_QUIRK2_ACMD23_BROKEN,
>   	.ops  = &tegra_sdhci_ops,
>   };
>   

I finally got around to testing this on the Ouya (Tegra 3).

I found that the "Got command interrupt 0x00010000 even though no 
command operation was in progress." error occurred when the interface is 
unstable.
I've had a lot of problems with sdmmc4 stability on the Ouya above 34 
MHz, probably due to the fact that they are using the internal cmd and 
clock pull-up resistors, against the TRM's instruction.
At 39 MHz, I saw the error this patch corrects.
With the patch, the error went away, but the interface is still unstable 
under load.

Lowering the clock to 32 MHz, there are no errors even without the patch.

--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stefan Agner July 26, 2018, 2:47 p.m. | #2
On 26.07.2018 15:56, Peter Geis wrote:
> I finally got around to testing this on the Ouya (Tegra 3).

Thanks for testing!

> 
> I found that the "Got command interrupt 0x00010000 even though no
> command operation was in progress." error occurred when the interface
> is unstable.
> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
> Mhz, probably due to the fact that they are using the internal cmd and
> clock pull-up resistors, against the TRM's instruction.
> At 39Mhz, I saw the error this patch corrects.
> With the patch, the error went away, but the interface is still
> unstable under load.

How does this instability manifest exactly?

> 
> Lowering down to 32Mhz, without the patch there are no errors.

So the patch does not make it less stable, right?

--
Stefan
Peter Geis July 26, 2018, 3:12 p.m. | #3
On 07/26/2018 10:47 AM, Stefan Agner wrote:
> On 26.07.2018 15:56, Peter Geis wrote:
>> I finally got around to testing this on the Ouya (Tegra 3).
> 
> Thanks for testing!
> 
>>
>> I found that the "Got command interrupt 0x00010000 even though no
>> command operation was in progress." error occurred when the interface
>> is unstable.
>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
>> Mhz, probably due to the fact that they are using the internal cmd and
>> clock pull-up resistors, against the TRM's instruction.
>> At 39Mhz, I saw the error this patch corrects.
>> With the patch, the error went away, but the interface is still
>> unstable under load.
> 
> How does this instability manifest exactly?
> 

At the very edge of stability, you see write errors under heavy load.
As clock rate increases, the write errors occur more frequently.
At a certain point, you start getting read errors.
Following that you get constant I/O errors during card probing.
Eventually the eMMC will fail to initialize, with errors 87 or 110.

I've been tweaking the pull up/down values to try and improve the 
stability, but without access to anything but the TRM it's a lot of 
trial and error.

>>
>> Lowering down to 32Mhz, without the patch there are no errors.
> 
> So the patch does not make it less stable right?
> 

No, it did not affect stability.
Although I'd conduct some performance testing to check for degradation.
Of course I'm nowhere near the limits of the controller, so it is 
doubtful I'd see a hit.

> --
> Stefan
> 
Stefan Agner July 26, 2018, 3:51 p.m. | #4
On 26.07.2018 17:12, Peter Geis wrote:
> On 07/26/2018 10:47 AM, Stefan Agner wrote:
>> On 26.07.2018 15:56, Peter Geis wrote:
>>> I finally got around to testing this on the Ouya (Tegra 3).
>>
>> Thanks for testing!
>>
>>>
>>> I found that the "Got command interrupt 0x00010000 even though no
>>> command operation was in progress." error occurred when the interface
>>> is unstable.
>>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
>>> Mhz, probably due to the fact that they are using the internal cmd and
>>> clock pull-up resistors, against the TRM's instruction.
>>> At 39Mhz, I saw the error this patch corrects.
>>> With the patch, the error went away, but the interface is still
>>> unstable under load.
>>
>> How does this instability manifest exactly?
>>
> 
> At the very edge of stability, you see write errors under heavy load.
> As clock rate increases, the write errors occur more frequently.
> At a certain point, you start getting read errors.
> Following that you get constant io errors during card probing.
> Eventually the emmc will fail to initialize, with errors 87 or 110.

What mode are you running at actually? E.g. what is the ios file saying?
cat /sys/kernel/debug/mmcX/ios

> 
> I've been tweaking the pull up/down values to try and improve the
> stability, but without access to anything but the TRM it's a lot of
> trial and error.
> 

Hm, maybe Marcel's recent fixes in our device tree are helpful?
https://lkml.org/lkml/2018/7/22/165

Also make sure to have a complete pinmux such that alternative pins for
sdmmc4 are *not* muxed as sdmmc4.

>>>
>>> Lowering down to 32Mhz, without the patch there are no errors.
>>
>> So the patch does not make it less stable right?
>>
> 
> No, it did not affect stability.
> Although I'd conduct some performance testing to check for degradation.
> Of course I'm nowhere near the limits of the controller, so it is
> doubtful I'd see a hit.

Ok, and this is with the complete patchset applied, correct?

Btw, what device tree are you using? Ouya is not upstream as far as I
can tell?

--
Stefan

> 
>> --
>> Stefan
>>
Peter Geis July 26, 2018, 4:39 p.m. | #5
>>>> I finally got around to testing this on the Ouya (Tegra 3).
>>>
>>> Thanks for testing!
>>>
>>>>
>>>> I found that the "Got command interrupt 0x00010000 even though no
>>>> command operation was in progress." error occurred when the interface
>>>> is unstable.
>>>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
>>>> Mhz, probably due to the fact that they are using the internal cmd and
>>>> clock pull-up resistors, against the TRM's instruction.
>>>> At 39Mhz, I saw the error this patch corrects.
>>>> With the patch, the error went away, but the interface is still
>>>> unstable under load.
>>>
>>> How does this instability manifest exactly?
>>>
>>
>> At the very edge of stability, you see write errors under heavy load.
>> As clock rate increases, the write errors occur more frequently.
>> At a certain point, you start getting read errors.
>> Following that you get constant io errors during card probing.
>> Eventually the emmc will fail to initialize, with errors 87 or 110.
> 
> What mode are you running at actually? E.g. what is the ios file saying?
> cat /sys/kernel/debug/mmcX/ios

This is the best functionality I've been able to get prior to the patches:
root@ouya:~# cat /sys/kernel/debug/mmc0/ios
clock:          30000000 Hz
actual clock:   29142858 Hz
vdd:            21 (3.3 ~ 3.4 V)
bus mode:       2 (push-pull)
chip select:    0 (don't care)
power mode:     2 (on)
bus width:      3 (8 bits)
timing spec:    9 (mmc HS200)
signal voltage: 1 (1.80 V)
driver type:    0 (driver type B)

Now I am trying DDR, but even with the patches I'm not able to remain 
stable above 17 MHz (34 MHz clock).

I've also tried just straight mmc-hs mode, but even that makes no 
difference.

> 
>>
>> I've been tweaking the pull up/down values to try and improve the
>> stability, but without access to anything but the TRM it's a lot of
>> trial and error.
>>
> 
> Hm, maybe Marcel's recent fixes in our device tree are helpful?
> https://lkml.org/lkml/2018/7/22/165
> 
> Also make sure to have a complete pinmux such that alternative pins for
> sdmmc4 are *not* muxed as sdmmc4.

That was my first issue, which was preventing sdmmc4 from working at all.
Just double checked all of the spare function pins, they are all 
assigned elsewhere.

> 
>>>>
>>>> Lowering down to 32Mhz, without the patch there are no errors.
>>>
>>> So the patch does not make it less stable right?
>>>
>>
>> No, it did not affect stability.
>> Although I'd conduct some performance testing to check for degradation.
>> Of course I'm nowhere near the limits of the controller, so it is
>> doubtful I'd see a hit.
> 
> Ok, and this is with the complete patchset applied correct?
> 
> Btw, what device tree are you using? Ouya is not upstream as far as I
> can tell?

Indeed, I have the full patchset.

Ouya is an old Android game console that I've been working on getting 
mainline running on.
I've written most of the device tree, with contributions from Matt Merhar.
It's almost bit for bit a Cardhu dev board, but with everything not 
necessary to function removed.
They cut a lot of corners with the board design.
The last stable kernel was 3.2, and it ran fine at 52 MHz; mind you, it 
reported it was running mode 5.

To get this speed, I have the pins all driven down at 4, and up at 24.
Default is 2 down and 18 up from driver init.
The pin pull ups are exactly as the original kernel, all pins pulled up 
except reset, which is pulled down.

> 
> --
> Stefan
> 
>>
>>> --
>>> Stefan
>>>
Stefan Agner July 26, 2018, 5:36 p.m. | #6
On 26.07.2018 18:39, Peter Geis wrote:
>>>>> I finally got around to testing this on the Ouya (Tegra 3).
>>>>
>>>> Thanks for testing!
>>>>
>>>>>
>>>>> I found that the "Got command interrupt 0x00010000 even though no
>>>>> command operation was in progress." error occurred when the interface
>>>>> is unstable.
>>>>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
>>>>> Mhz, probably due to the fact that they are using the internal cmd and
>>>>> clock pull-up resistors, against the TRM's instruction.
>>>>> At 39Mhz, I saw the error this patch corrects.
>>>>> With the patch, the error went away, but the interface is still
>>>>> unstable under load.
>>>>
>>>> How does this instability manifest exactly?
>>>>
>>>
>>> At the very edge of stability, you see write errors under heavy load.
>>> As clock rate increases, the write errors occur more frequently.
>>> At a certain point, you start getting read errors.
>>> Following that you get constant io errors during card probing.
>>> Eventually the emmc will fail to initialize, with errors 87 or 110.
>>
>> What mode are you running at actually? E.g. what is the ios file saying?
>> cat /sys/kernel/debug/mmcX/ios
> 
> This is the best functionality I've been able to get prior to the patches:
> root@ouya:~# cat /sys/kernel/debug/mmc0/ios
> clock:          30000000 Hz
> actual clock:   29142858 Hz
> vdd:            21 (3.3 ~ 3.4 V)
> bus mode:       2 (push-pull)
> chip select:    0 (don't care)
> power mode:     2 (on)
> bus width:      3 (8 bits)
> timing spec:    9 (mmc HS200)
> signal voltage: 1 (1.80 V)
> driver type:    0 (driver type B)
> 

Yeah, HS200 is definitely not supported by the controller and really
should not be used.

> Now I am trying DDR, but even with the patches I'm not able to remain
> stable above 17Mhz (34Mhz clock).
> 
> I've also tried just straight mmc-hs mode, but even that makes no difference.
> 

So you tried timing spec 1 (mmc HS)?

How exactly did you enable mmc-hs mode?

I suggest to *not set* vqmmc and apply patch 1. It will report that
signaling voltage is 3.3V, but that did not really matter in our case.
This was our baseline and always worked stable on mainline. I also would
use that mode when tweaking pinmux etc...

>>
>>>
>>> I've been tweaking the pull up/down values to try and improve the
>>> stability, but without access to anything but the TRM it's a lot of
>>> trial and error.
>>>
>>
>> Hm, maybe Marcel's recent fixes in our device tree are helpful?
>> https://lkml.org/lkml/2018/7/22/165
>>
>> Also make sure to have a complete pinmux such that alternative pins for
>> sdmmc4 are *not* muxed as sdmmc4.
> 
> That was my first issue, which was preventing sdmmc4 from working at all.
> Just double checked all of the spare function pins, they are all
> assigned elsewhere.
> 

Ok.

>>
>>>>>
>>>>> Lowering down to 32Mhz, without the patch there are no errors.
>>>>
>>>> So the patch does not make it less stable right?
>>>>
>>>
>>> No, it did not affect stability.
>>> Although I'd conduct some performance testing to check for degradation.
>>> Of course I'm nowhere near the limits of the controller, so it is
>>> doubtful I'd see a hit.
>>
>> Ok, and this is with the complete patchset applied correct?
>>
>> Btw, what device tree are you using? Ouya is not upstream as far as I
>> can tell?
> 
> Indeed, I have the full patchset.
> 
> Ouya is an old android game console that I've been working on getting
> mainline working on.

I know, I have one sitting here too. I only tried to tinker a bit at the
very beginning...

> I've written most of the device tree, with contributions from Matt Merhar.
> It's almost bit for bit a cardhu dev board, but with everything not
> necessary to function removed.
> They cut a lot of corners with the board design.
> Last stable kernel was 3.2, but it ran fine at 52mhz, mind you it
> reported it was running mode 5.

That is what we saw too. With Apalis/Colibri T30 L4T downstream kernel
(which is 3.1 with quite some patches) 52MHz DDR worked fine,
surprisingly even with ACMD23. However, speed is slightly slower than
mainline 52MHz without ACMD23...

--
Stefan

> 
> To get this speed, I have the pins all driven down at 4, and up at 24.
> Default is 2 down and 18 up from driver init.
> The pin pull ups are exactly as the original kernel, all pins pulled
> up except reset, which is pulled down.
> 
>>
>> --
>> Stefan
>>
>>>
>>>> --
>>>> Stefan
>>>>
Peter Geis July 26, 2018, 5:48 p.m. | #7
On 07/26/2018 01:36 PM, Stefan Agner wrote:
> On 26.07.2018 18:39, Peter Geis wrote:
>>>>>> I finally got around to testing this on the Ouya (Tegra 3).
>>>>>
>>>>> Thanks for testing!
>>>>>
>>>>>>
>>>>>> I found that the "Got command interrupt 0x00010000 even though no
>>>>>> command operation was in progress." error occurred when the interface
>>>>>> is unstable.
>>>>>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
>>>>>> Mhz, probably due to the fact that they are using the internal cmd and
>>>>>> clock pull-up resistors, against the TRM's instruction.
>>>>>> At 39Mhz, I saw the error this patch corrects.
>>>>>> With the patch, the error went away, but the interface is still
>>>>>> unstable under load.
>>>>>
>>>>> How does this instability manifest exactly?
>>>>>
>>>>
>>>> At the very edge of stability, you see write errors under heavy load.
>>>> As clock rate increases, the write errors occur more frequently.
>>>> At a certain point, you start getting read errors.
>>>> Following that you get constant io errors during card probing.
>>>> Eventually the emmc will fail to initialize, with errors 87 or 110.
>>>
>>> What mode are you running at actually? E.g. what is the ios file saying?
>>> cat /sys/kernel/debug/mmcX/ios
>>
>> This is the best functionality I've been able to get prior to the patches:
>> root@ouya:~# cat /sys/kernel/debug/mmc0/ios
>> clock:          30000000 Hz
>> actual clock:   29142858 Hz
>> vdd:            21 (3.3 ~ 3.4 V)
>> bus mode:       2 (push-pull)
>> chip select:    0 (don't care)
>> power mode:     2 (on)
>> bus width:      3 (8 bits)
>> timing spec:    9 (mmc HS200)
>> signal voltage: 1 (1.80 V)
>> driver type:    0 (driver type B)
>>
> 
> Yeah HS200 is definilty not supported by the controller and really
> should not be used.
> 
>> Now I am trying DDR, but even with the patches I'm not able to remain
>> stable above 17Mhz (34Mhz clock).
>>
>> I've also tried just straight mmc-hs mode, but even that makes no difference.
>>
> 
> So you tried timing spec 1 (mmc HS)?
> 
> How did you exactly enable mmc-hs mode?

cap-mmc-highspeed;

> 
> I suggest to *not set* vqmmc and apply patch 1. It will report that
> signaling voltage is 3.3V, but that did not really matter in our case.
> This was our baseline and always worked stable on mainline. I also would
> use that mode when tweaking pinmux etc...

Will do, thanks.

> 
>>>
>>>>
>>>> I've been tweaking the pull up/down values to try and improve the
>>>> stability, but without access to anything but the TRM it's a lot of
>>>> trial and error.
>>>>
>>>
>>> Hm, maybe Marcel's recent fixes in our device tree are helpful?
>>> https://lkml.org/lkml/2018/7/22/165
>>>
>>> Also make sure to have a complete pinmux such that alternative pins for
>>> sdmmc4 are *not* muxed as sdmmc4.
>>
>> That was my first issue, which was preventing sdmmc4 from working at all.
>> Just double checked all of the spare function pins, they are all
>> assigned elsewhere.
>>
> 
> Ok.
> 
>>>
>>>>>>
>>>>>> Lowering down to 32Mhz, without the patch there are no errors.
>>>>>
>>>>> So the patch does not make it less stable right?
>>>>>
>>>>
>>>> No, it did not affect stability.
>>>> Although I'd conduct some performance testing to check for degradation.
>>>> Of course I'm nowhere near the limits of the controller, so it is
>>>> doubtful I'd see a hit.
>>>
>>> Ok, and this is with the complete patchset applied correct?
>>>
>>> Btw, what device tree are you using? Ouya is not upstream as far as I
>>> can tell?
>>
>> Indeed, I have the full patchset.
>>
>> Ouya is an old android game console that I've been working on getting
>> mainline working on.
> 
> I know, I have one sitting here too. I only tried to tinker a bit at the
> very beginning...

It runs Xubuntu very well now with mainline.
I've got most everything roughly supported with the exception of audio.

> 
>> I've written most of the device tree, with contributions from Matt Merhar.
>> It's almost bit for bit a cardhu dev board, but with everything not
>> necessary to function removed.
>> They cut a lot of corners with the board design.
>> Last stable kernel was 3.2, but it ran fine at 52mhz, mind you it
>> reported it was running mode 5.
> 
> That is what we saw too. With Apalis/Colibri T30 L4T downstream kernel
> (which is 3.1 with quite some patches) 52MHz DDR worked fine,
> surprisingly even with ACMD23. However, speed is slightly slower than
> mainline 52MHz without ACMD23...

I noticed the same thing: speed with the original kernel on the MMC was 
worse at 52 MHz than it was at 34 MHz in HS200 mode on mainline.
I'd be happy with it where it is, but the fact that it worked at 52 MHz 
before makes me believe something isn't quite there yet.
I selected HS200 mode just to force 1.8 V mode.

> 
> --
> Stefan
> 
>>
>> To get this speed, I have the pins all driven down at 4, and up at 24.
>> Default is 2 down and 18 up from driver init.
>> The pin pull ups are exactly as the original kernel, all pins pulled
>> up except reset, which is pulled down.
>>
>>>
>>> --
>>> Stefan
>>>
>>>>
>>>>> --
>>>>> Stefan
>>>>>
Dmitry Osipenko July 27, 2018, 7:52 p.m. | #8
On Thursday, 26 July 2018 20:48:55 MSK Peter Geis wrote:
> On 07/26/2018 01:36 PM, Stefan Agner wrote:
> > On 26.07.2018 18:39, Peter Geis wrote:
> >>>>>> I finally got around to testing this on the Ouya (Tegra 3).
> >>>>> 
> >>>>> Thanks for testing!
> >>>>> 
> >>>>>> I found that the "Got command interrupt 0x00010000 even though no
> >>>>>> command operation was in progress." error occurred when the interface
> >>>>>> is unstable.
> >>>>>> I've had a lot of problems with sdmmc4 stability on the Ouya above 34
> >>>>>> Mhz, probably due to the fact that they are using the internal cmd
> >>>>>> and
> >>>>>> clock pull-up resistors, against the TRM's instruction.
> >>>>>> At 39Mhz, I saw the error this patch corrects.
> >>>>>> With the patch, the error went away, but the interface is still
> >>>>>> unstable under load.
> >>>>> 
> >>>>> How does this instability manifest exactly?
> >>>> 
> >>>> At the very edge of stability, you see write errors under heavy load.
> >>>> As clock rate increases, the write errors occur more frequently.
> >>>> At a certain point, you start getting read errors.
> >>>> Following that you get constant io errors during card probing.
> >>>> Eventually the emmc will fail to initialize, with errors 87 or 110.
> >>> 
> >>> What mode are you running at actually? E.g. what is the ios file saying?
> >>> cat /sys/kernel/debug/mmcX/ios
> >> 
> >> This is the best functionality I've been able to get prior to the
> >> patches:
> >> root@ouya:~# cat /sys/kernel/debug/mmc0/ios
> >> clock:          30000000 Hz
> >> actual clock:   29142858 Hz
> >> vdd:            21 (3.3 ~ 3.4 V)
> >> bus mode:       2 (push-pull)
> >> chip select:    0 (don't care)
> >> power mode:     2 (on)
> >> bus width:      3 (8 bits)
> >> timing spec:    9 (mmc HS200)
> >> signal voltage: 1 (1.80 V)
> >> driver type:    0 (driver type B)
> > 
> > Yeah, HS200 is definitely not supported by the controller and really
> > should not be used.
> > 
> >> Now I am trying DDR, but even with the patches I'm not able to remain
> >> stable above 17Mhz (34Mhz clock).
> >> 
> >> I've also tried just straight mmc-hs mode, but even that makes no
> >> difference.
> > 
> > So you tried timing spec 1 (mmc HS)?
> > 
> > How did you exactly enable mmc-hs mode?
> 
> cap-mmc-highspeed;
> 
> > I suggest to *not set* vqmmc and apply patch 1. It will report that
> > signaling voltage is 3.3V, but that did not really matter in our case.
> > This was our baseline and always worked stable on mainline. I also would
> > use that mode when tweaking pinmux etc...
> 
> Will do, thanks.
> 
> >>>> I've been tweaking the pull up/down values to try and improve the
> >>>> stability, but without access to anything but the TRM it's a lot of
> >>>> trial and error.
> >>> 
> >>> Hm, maybe Marcel's recent fixes in our device tree are helpful?
> >>> https://lkml.org/lkml/2018/7/22/165
> >>> 
> >>> Also make sure to have a complete pinmux such that alternative pins for
> >>> sdmmc4 are *not* muxed as sdmmc4.
> >> 
> >> That was my first issue, which was preventing sdmmc4 from working at all.
> >> Just double checked all of the spare function pins, they are all
> >> assigned elsewhere.
> > 
> > Ok.
> > 
> >>>>>> Lowering down to 32Mhz, without the patch there are no errors.
> >>>>> 
> >>>>> So the patch does not make it less stable right?
> >>>> 
> >>>> No, it did not affect stability.
> >>>> Although I'd conduct some performance testing to check for degradation.
> >>>> Of course I'm nowhere near the limits of the controller, so it is
> >>>> doubtful I'd see a hit.
> >>> 
> >>> Ok, and this is with the complete patchset applied correct?
> >>> 
> >>> Btw, what device tree are you using? Ouya is not upstream as far as I
> >>> can tell?
> >> 
> >> Indeed, I have the full patchset.
> >> 
> >> Ouya is an old android game console that I've been working on getting
> >> mainline working on.
> > 
> > I know, I have one sitting here too. I only tried to tinker a bit at the
> > very beginning...
> 
> It runs Xubuntu very well now with mainline.
> I've got most everything roughly supported with the exception of audio.
> 
> >> I've written most of the device tree, with contributions from Matt
> >> Merhar.
> >> It's almost bit for bit a cardhu dev board, but with everything not
> >> necessary to function removed.
> >> They cut a lot of corners with the board design.
> >> Last stable kernel was 3.2, but it ran fine at 52mhz, mind you it
> >> reported it was running mode 5.
> > 
> > That is what we saw too. With Apalis/Colibri T30 L4T downstream kernel
> > (which is 3.1 with quite some patches) 52MHz DDR worked fine,
> > surprisingly even with ACMD23. However, speed is slightly slower than
> > mainline 52MHz without ACMD23...
> 
> I noticed the same thing, speed with the original kernel on the MMC was
> worse at 52Mhz than it was at 34Mhz in HS-200 mode on mainline.
> I'd be happy with it where it is, but the fact that it worked at 52Mhz
> before makes me believe something isn't quite there yet.
> I selected HS-200 mode just to force 1.8v mode.

Which eMMC model does your Ouya have?




Peter Geis July 27, 2018, 8:19 p.m. | #9
Kingston KE4CN3K6A.
Though I am pretty sure I've figured out the instability.
I brought it in to work and hooked it up to a scope.
I couldn't find the clock, but CMD and all eight data lines are running at
1.2 volts.
I got the same results with the bootloader, the original kernel, and my
mainline kernel.
I also noticed that even at the slowest slew rate there is significant
ringing and an overshoot of 0.15 volts.



Dmitry Osipenko July 28, 2018, 10:13 a.m. | #10
On Friday, 27 July 2018 23:19:53 MSK Peter Geis wrote:
> Kingston KE4CN3K6A.
> Though I am pretty sure I've figured out the instability.
> Brought it in to work and hooked it to a scope.
> Couldn't find clock, but cmd and all eight bits are running at 1.2 volts.
> Repeated the results with the bootloader, the original kernel, and my
> mainline.
> Also noticed that even on the slowest slew rate there is significant
> ringing and overshoot of .15 volts.

Okay, but eMMC is working fine with the original kernel, isn't it?


Peter Geis July 28, 2018, 12:03 p.m. | #11
On 7/28/2018 6:13 AM, Dmitry Osipenko wrote:
> On Friday, 27 July 2018 23:19:53 MSK Peter Geis wrote:
>> Kingston KE4CN3K6A.
>> Though I am pretty sure I've figured out the instability.
>> Brought it in to work and hooked it to a scope.
>> Couldn't find clock, but cmd and all eight bits are running at 1.2 volts.
>> Repeated the results with the bootloader, the original kernel, and my
>> mainline.
>> Also noticed that even on the slowest slew rate there is significant
>> ringing and overshoot of .15 volts.
> 
> Okay, but eMMC is working fine with the original kernel, isn't it?
> 
> 
Correct, at roughly double the speed of the mainline kernel.
According to the MMC spec, in high-voltage mode the supply should be
2.7-3.3 volts and the minimum signal voltage should be 0.75 of that, which
works out to a minimum of just over 2 volts.
In low-voltage mode, the supply should be 1.8 volts, with the signals
between 0.3 and 0.7 volts.
A signal voltage of 1.3 volts is roughly 0.75 of 1.8, which means we have a
low-voltage supply but high-voltage signaling.
This is the case even with HS-200 or DDR-1.8 selected as the controller
mode, so for some reason we aren't switching modes.
The original kernel was running with SDR25 signaling, but it didn't try to
switch either.
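
To make the arithmetic easy to re-check, here is a small sketch that applies
the 0.75 ratio from above to a supply voltage, in millivolts to avoid
floating point. The ratio is the one quoted in this message and has not been
re-checked against the eMMC specification; both function names are made up
for illustration.

```c
/*
 * Minimum "high" signal level for a given supply, using the 0.75 ratio
 * quoted in this thread (an assumption, not verified against the spec).
 * All values are in millivolts.
 */
static unsigned int min_high_level_mv(unsigned int supply_mv)
{
	return supply_mv * 3 / 4;
}

/* Returns 1 if the measured high level meets the ratio for this supply. */
static int high_level_ok(unsigned int measured_mv, unsigned int supply_mv)
{
	return measured_mv >= min_high_level_mv(supply_mv);
}
```

This gives 2025 mV for a 2.7 V supply, 2475 mV for 3.3 V and 1350 mV for
1.8 V. If the 1.2 V scope reading is taken as the steady high level, it is
well below the high-voltage figures and also slightly below the 1.8 V one.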

Digging through the old code and the TRM, there are three ideas stuck in
my head.
The original pinmux driver had settings such as a low-voltage divider and
two values for each pull direction.
The pinmux driver does an IO reset on exactly the pads we are using,
discarding any values the bootloader had set.
Is it possible the tap configuration is not working?

Patch

diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
index 888a1ad511db..11c0b2069c7c 100644
--- a/drivers/mmc/host/sdhci-tegra.c
+++ b/drivers/mmc/host/sdhci-tegra.c
@@ -336,7 +336,15 @@  static const struct sdhci_pltfm_data sdhci_tegra30_pdata = {
 		  SDHCI_QUIRK_BROKEN_ADMA_ZEROLEN_DESC |
 		  SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
 	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
-		   SDHCI_QUIRK2_BROKEN_HS200,
+		   SDHCI_QUIRK2_BROKEN_HS200 |
+		   /*
+		    * Auto-CMD23 leads to "Got command interrupt 0x00010000 even
+		    * though no command operation was in progress."
+		    *
+		    * The exact reason is unknown, as the same hardware seems
+		    * to support Auto CMD23 on a downstream 3.1 kernel.
+		    */
+		   SDHCI_QUIRK2_ACMD23_BROKEN,
 	.ops  = &tegra_sdhci_ops,
 };
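
For readers not familiar with the SDHCI core, the following is a simplified
model of how the quirk takes effect. The condition is written from memory of
mainline sdhci_setup_host() and should be checked against the tree. The
struct, the macro values and the function names are stand-ins for
illustration, not the kernel's definitions. The second helper tests the bit
printed in the warning: 0x00010000 is the command timeout bit of the SDHCI
interrupt status register.

```c
#include <stdint.h>

/* Stand-in values for illustration; see drivers/mmc/host/sdhci.h. */
#define SPEC_300              2u
#define USE_SDMA              (1u << 0)
#define USE_ADMA              (1u << 1)
#define AUTO_CMD23            (1u << 7)
#define QUIRK2_ACMD23_BROKEN  (1u << 14)

/* Command timeout bit in the 32-bit SDHCI interrupt status register. */
#define INT_TIMEOUT           0x00010000u

/* Hypothetical, minimal model of the host state used by the check. */
struct host_model {
	unsigned int version;
	uint32_t flags;
	uint32_t quirks2;
};

/*
 * Auto-CMD23 is only turned on for hosts that advertise SDHCI 3.0,
 * use ADMA or PIO, and do not carry the ACMD23_BROKEN quirk.
 */
static void setup_auto_cmd23(struct host_model *h)
{
	int adma_or_pio = (h->flags & USE_ADMA) || !(h->flags & USE_SDMA);

	if (h->version >= SPEC_300 && adma_or_pio &&
	    !(h->quirks2 & QUIRK2_ACMD23_BROKEN))
		h->flags |= AUTO_CMD23;
}

/* Returns 1 if the interrupt mask has the command timeout bit set. */
static int is_cmd_timeout(uint32_t intmask)
{
	return (intmask & INT_TIMEOUT) != 0;
}
```

In this model a host that advertises 3.0 and has the quirk set never gets
the Auto-CMD23 flag, which matches the intent of the patch: keep announcing
SD3.0 while leaving Auto-CMD23 off.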