diff mbox series

Samsung MZ7KM SSD zero after TRIM

Message ID CAFR8UphYgPiVmw8Hu3vTEdTMZUcicEK9B47VC8kfAa8jsDNnkQ@mail.gmail.com
State Not Applicable
Delegated to: David Miller
Headers show
Series Samsung MZ7KM SSD zero after TRIM | expand

Commit Message

Juha-Matti Tilli Dec. 1, 2018, 6:21 a.m. UTC
Hello,

(I'm not on the list, please CC me if you want a reply. Sorry if this is a
repost, I think my first message didn't get through as it was multipart and not
plain text.)

The Samsung MZ7KM series devices that support ZERO after TRIM do not support
TRIM in RAID4/5/6 even if the special module parameter is enabled. The reason
is that the device name includes SAMSUNG but it does not include SSD even
though these server-grade storage devices are actually SSDs. The whitelist
contains only Samsung*SSD* and SAMSUNG*SSD*. The whitelist should contain
SAMSUNG*MZ7KM* as well. An alternative would be SAMSUNG* but then that would
match HDDs as well. Not sure if this has negative effects; in my current patch,
I only whitelist the MZ7KM devices.

We tested with RAID1 that these SSDs genuinely read zero after TRIM by writing
a large file with random data, observing the data is present on the physical
disks, removing the file and running fstrim, and observing the data is now zero
on the physical disks. The device reports to read zero after TRIM to the
operating system, as well, but the operating system doesn't believe the device.
Because of our tests, I think the device should be believed to report the flag
correctly. For RAID1, the reads zero after TRIM flag is not needed, but
unfortunately, we have too much data to economically store it in RAID-1 with
three disks per mirror, and two disks per mirror would be too dangerous because
two failures could disable the array.

As far as I know, this problem affects all versions of the Linux kernel.
Currently we have to run a custom manually compiled kernel with the patch,
because our use case is severely affected by lack of TRIM support (lots of data
stored, lots of I/O, nearly full disks, less than megabyte average file size, 1
DWPD order of magnitude, uneconomical to use RAID-1 on the storage server).

I reported this to Red Hat Bugzilla, but they wanted me to report this first to
this list, before the patch can be applied to Red Hat.

Patch is here, sorry I use Gmail so I can't send the patch as a separate
git-send-email message given my current mail system:


From: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Date: Wed, 21 Nov 2018 10:30:00 +0200
Subject: [PATCH] Samsung MZ7KM SSDs read zero after TRIM

---
 drivers/ata/libata-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jens Axboe Dec. 1, 2018, 6:15 p.m. UTC | #1
On 11/30/18 11:21 PM, Juha-Matti Tilli wrote:
> Hello,
> 
> (I'm not on the list, please CC me if you want a reply. Sorry if this is a
> repost, I think my first message didn't get through as it was multipart and not
> plain text.)
> 
> The Samsung MZ7KM series devices that support ZERO after TRIM do not support
> TRIM in RAID4/5/6 even if the special module parameter is enabled. The reason
> is that the device name includes SAMSUNG but it does not include SSD even
> though these server-grade storage devices are actually SSDs. The whitelist
> contains only Samsung*SSD* and SAMSUNG*SSD*. The whitelist should contain
> SAMSUNG*MZ7KM* as well. An alternative would be SAMSUNG* but then that would
> match HDDs as well. Not sure if this has negative effects; in my current patch,
> I only whitelist the MZ7KM devices.
> 
> We tested with RAID1 that these SSDs genuinely read zero after TRIM by writing
> a large file with random data, observing the data is present on the physical
> disks, removing the file and running fstrim, and observing the data is now zero
> on the physical disks. The device reports to read zero after TRIM to the
> operating system, as well, but the operating system doesn't believe the device.
> Because of our tests, I think the device should be believed to report the flag
> correctly. For RAID1, the reads zero after TRIM flag is not needed, but
> unfortunately, we have too much data to economically store it in RAID-1 with
> three disks per mirror, and two disks per mirror would be too dangerous because
> two failures could disable the array.
> 
> As far as I know, this problem affects all versions of the Linux kernel.
> Currently we have to run a custom manually compiled kernel with the patch,
> because our use case is severely affected by lack of TRIM support (lots of data
> stored, lots of I/O, nearly full disks, less than megabyte average file size, 1
> DWPD order of magnitude, uneconomical to use RAID-1 on the storage server).
> 
> I reported this to Red Hat Bugzilla, but they wanted me to report this first to
> this list, before the patch can be applied to Red Hat.
> 
> Patch is here, sorry I use Gmail so I can't send the patch as a separate
> git-send-email message given my current mail system:

This patch is broken, can't be applied as-is. gmail works just fine with
git send-email, that's what I use. My .gitconfig has this section:

[sendemail]
from = Jens Axboe <axboe@kernel.dk>
smtpserver = smtp.gmail.com
smtpuser = axboe@kernel.dk
smtpencryption = tls
smtppass = XXXXXXXXXXXXXXXXXX
smtpserverport = 587
diff mbox series

Patch

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index b289128..9d25932 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4247,6 +4247,7 @@ 
     { "SSD*INTEL*",            NULL,    ATA_HORKAGE_ZERO_AFTER_TRIM, },
     { "Samsung*SSD*",        NULL,    ATA_HORKAGE_ZERO_AFTER_TRIM, },
     { "SAMSUNG*SSD*",        NULL,    ATA_HORKAGE_ZERO_AFTER_TRIM, },
+    { "SAMSUNG*MZ7KM*",        NULL,    ATA_HORKAGE_ZERO_AFTER_TRIM, },
     { "ST[1248][0248]0[FH]*",    NULL,    ATA_HORKAGE_ZERO_AFTER_TRIM, },

     /*