diff mbox series

[v2,2/2] ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs

Message ID fe2ff41a52ce2647fd12f69a0d6eba8e3cf05b06.1647720651.git.chunkeey@gmail.com
State New
Headers show
Series [v2,1/2] ata: sata_dwc_460ex: Fix crash due to OOB write | expand

Commit Message

Christian Lamparter March 19, 2022, 8:11 p.m. UTC
Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
the a message: "READ LOG DMA EXT failed, trying PIO" during boot.

Initially this was discovered because it caused a crash
with the sata_dwc_460ex controller on a WD MyBook Live DUO.

The reporter "Tice Rex" which has the unique opportunity that he
has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
which booted fine and didn't expose "READ LOG DMA EXT". But the
newer/latest firmware "EXT0DB6Q" caused the headaches.

BugLink: https://github.com/openwrt/openwrt/issues/9505
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
v1 -> v2: removed Reported-by Tag (Damien)
	  opened up a issue + added BugLink (Andy)
---
 drivers/ata/libata-core.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Reimar Döffinger March 19, 2022, 8:31 p.m. UTC | #1
> On 19 Mar 2022, at 21:11, Christian Lamparter <chunkeey@gmail.com> wrote:
> 
> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.

I don't see any info on which kernel this happened with anywhere.
Because there was a bug that tried to use READ LOG DMA EXT even though DMA was not enabled.
That was fixed by a patch from me for 5.16 (and backports).
The behaviour you describe matches the possible symptoms of that bug.
So it would be good to know we're not blaming the drive for an already fixed bug in the kernel...
(I am still seeing some issues that READ LOG EXT - the non-DMA version - times out with interrupt lost for some device/controller combinations, but at least that "only" adds an extra 15 seconds to the boot. It's still a bit silly because in combination with e.g. IDE controllers READ LOG provides absolutely no useful functionality as far as I can tell).

Best regards,
Reimar
Reimar Döffinger March 19, 2022, 8:42 p.m. UTC | #2
> On 19 Mar 2022, at 21:31, Reimar Döffinger <Reimar.Doeffinger@gmx.de> wrote:
> 
> 
> 
>> On 19 Mar 2022, at 21:11, Christian Lamparter <chunkeey@gmail.com> wrote:
>> 
>> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
>> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> I don't see any info on which kernel this happened with anywhere.
> Because there was a bug that tried to use READ LOG DMA EXT even though DMA was not enabled.
> That was fixed by a patch from me for 5.16 (and backports).
> The behaviour you describe matches the possible symptoms of that bug.
> So it would be good to know we're not blaming the drive for an already fixed bug in the kernel...

Ok, seems not the case, fix is in 5.4.160 and the first report was from 5.4.179 it seems.
Sergey Shtylyov March 21, 2022, 8 a.m. UTC | #3
Hello!

On 3/19/22 11:11 PM, Christian Lamparter wrote:

> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> Initially this was discovered because it caused a crash
> with the sata_dwc_460ex controller on a WD MyBook Live DUO.
> 
> The reporter "Tice Rex" which has the unique opportunity that he
> has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
> which booted fine and didn't expose "READ LOG DMA EXT". But the
> newer/latest firmware "EXT0DB6Q" caused the headaches.
> 
> BugLink: https://github.com/openwrt/openwrt/issues/9505
> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> ---
> v1 -> v2: removed Reported-by Tag (Damien)
> 	  opened up a issue + added BugLink (Andy)
> ---
>  drivers/ata/libata-core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 0c854aebfe0b..760c0d81d148 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4014,6 +4014,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>  	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
> +	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
> +						ATA_HORKAGE_NO_DMA_LOG |
> +						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>  	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },

  Shouldn't this entry be modified instead?

[...]

MBR, Sergey
Andy Shevchenko March 21, 2022, 10:43 a.m. UTC | #4
On Mon, Mar 21, 2022 at 11:00:51AM +0300, Sergey Shtylyov wrote:
> On 3/19/22 11:11 PM, Christian Lamparter wrote:

...

> > +	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
> > +						ATA_HORKAGE_NO_DMA_LOG |
> > +						ATA_HORKAGE_ZERO_AFTER_TRIM, },
> >  	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
> >  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
> 
>   Shouldn't this entry be modified instead?

Wouldn't it modify much more devices with unknown outcome?
I would be on the safer side as done by this patch.
Damien Le Moal March 21, 2022, 12:38 p.m. UTC | #5
On 2022/03/21 19:43, Andy Shevchenko wrote:
> On Mon, Mar 21, 2022 at 11:00:51AM +0300, Sergey Shtylyov wrote:
>> On 3/19/22 11:11 PM, Christian Lamparter wrote:
> 
> ...
> 
>>> +	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
>>> +						ATA_HORKAGE_NO_DMA_LOG |
>>> +						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>>>  	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
>>>  						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>>
>>   Shouldn't this entry be modified instead?
> 
> Wouldn't it modify much more devices with unknown outcome?
> I would be on the safer side as done by this patch.
> 

Yes, I think so too. A little googling shows that there are "840 xxx", "840 PRO
xxx" and "840 EVO xxx", at least as product names. Not sure how all these are
grouped with the actual device reported model.
Damien Le Moal March 21, 2022, 12:48 p.m. UTC | #6
On 2022/03/20 5:31, Reimar Döffinger wrote:
> 
> 
>> On 19 Mar 2022, at 21:11, Christian Lamparter <chunkeey@gmail.com> wrote:
>> 
>> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with the a
>> message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> I don't see any info on which kernel this happened with anywhere. Because
> there was a bug that tried to use READ LOG DMA EXT even though DMA was not
> enabled. That was fixed by a patch from me for 5.16 (and backports). The
> behaviour you describe matches the possible symptoms of that bug. So it would
> be good to know we're not blaming the drive for an already fixed bug in the
> kernel... (I am still seeing some issues that READ LOG EXT - the non-DMA
> version - times out with interrupt lost for some device/controller
> combinations, but at least that "only" adds an extra 15 seconds to the boot.
> It's still a bit silly because in combination with e.g. IDE controllers READ
> LOG provides absolutely no useful functionality as far as I can tell).

Yes, the added 15s is a longer timeout to wait for READ LOG EXT reply on resume.
Some drives are slow to respond and that was causing drives to disapear on
resume. I agree that for old IDE/PATA drive, we could disable READ LOG [DMA[ EXT
entirely as all the information for the drive can be found in the IDENTIFY data.
So no point. All the pata drivers could set the nolog horkage. There are still
many more recent drives that have weird behavior with read log though. This is a
pain to sort out and can be done only if a bug report is filed.

I am preparing a series to improve the libata.force boot parameter to allow
setting any horkage/link flag to disable things for drives that do not behave
nicely. That will allow users to tune systems without having to wait for the
kernel to be patched.

> 
> Best regards, Reimar
Damien Le Moal April 4, 2022, 1 a.m. UTC | #7
On 3/20/22 05:11, Christian Lamparter wrote:
> Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with
> the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
> 
> Initially this was discovered because it caused a crash
> with the sata_dwc_460ex controller on a WD MyBook Live DUO.
> 
> The reporter "Tice Rex" which has the unique opportunity that he
> has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q"
> which booted fine and didn't expose "READ LOG DMA EXT". But the
> newer/latest firmware "EXT0DB6Q" caused the headaches.
> 
> BugLink: https://github.com/openwrt/openwrt/issues/9505
> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> ---
> v1 -> v2: removed Reported-by Tag (Damien)
> 	  opened up a issue + added BugLink (Andy)
> ---
>   drivers/ata/libata-core.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 0c854aebfe0b..760c0d81d148 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -4014,6 +4014,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
>   						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>   	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
>   						ATA_HORKAGE_ZERO_AFTER_TRIM, },
> +	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
> +						ATA_HORKAGE_NO_DMA_LOG |
> +						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>   	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
>   						ATA_HORKAGE_ZERO_AFTER_TRIM, },
>   	{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |

Applied to for-5.18-fixes. Thanks !
diff mbox series

Patch

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0c854aebfe0b..760c0d81d148 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4014,6 +4014,9 @@  static const struct ata_blacklist_entry ata_device_blacklist [] = {
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Crucial_CT*MX100*",		"MU01",	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
+	{ "Samsung SSD 840 EVO*",	NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
+						ATA_HORKAGE_NO_DMA_LOG |
+						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 840*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |
 						ATA_HORKAGE_ZERO_AFTER_TRIM, },
 	{ "Samsung SSD 850*",		NULL,	ATA_HORKAGE_NO_NCQ_TRIM |