e2fsprogs: detect zoned disks and prevent their raw use

Message ID 20180615205630.7989-1-mcgrof@kernel.org
State Rejected, archived

Commit Message

Luis Chamberlain June 15, 2018, 8:56 p.m. UTC
Filesystems need special handling to use raw zoned disks, and only f2fs
currently supports this. No other filesystem supports dealing with zoned
disks directly.

As such, using raw zoned disks is not supported by e2fsprogs. To use them you
need to use dm-zoned-tools, format them with dmzadm, set the scheduler to
deadline, and then use dmsetup to set up a device-mapper target of type
zoned, and somehow set this up on every boot to live a semi-happy life for now.

Even if you use dmsetup on every boot, the zoned disk is still exposed,
and a user may still think they have to run mkfs.ext[234] on it instead
of the /dev/mapper/ disk, and then mount it by mistake.

In either case you may believe your disk works and only eventually
end up with alignment issues and perhaps lose your data. For instance, the
below was observed with XFS, but it's expected that ext[234] users would see
the same:

[10869.959501] device-mapper: zoned reclaim: (sda): Align zone 865 wp 28349 to 30842 (wp+2493) blocks failed -5
[10870.014488] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[10870.016137] sd 0:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current]
[10870.017696] sd 0:0:0:0: [sda] tag#0 Add. Sense: Unaligned write command

We have to prevent these mistakes by avoiding mkfs use on zoned disks.

Note that this is not enough yet: if users are on old AHCI controllers,
the disks may not be detected as zoned. More work through udev may be
required to detect this situation on old parent PCI IDs for zoned
disks, and then prevent their use somehow.

If you are stuck on using ext[234] there is a udev rule out there [0]. This is
far from perfect, and not fully what we want done upstream on Linux
distributions long term, but it should at least help developers for now
enjoy their shiny big fat zoned disks with ext[234].

This check should help avoid having folks shoot themselves in the foot
for now with zoned disks. If you make the mistake of using mkfs.ext4
on a zoned disk, you will now get:

 # mkfs.ext4  /dev/sda
mke2fs 1.44.2 (14-May-2018)
/dev/sda: zoned disk detected, refer to dm-zoned-tools for how to use while setting up superblock

[0] https://lkml.kernel.org/r/20180614001147.1545-1-mcgrof@kernel.org

Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 lib/ext2fs/ext2_err.et.in |  3 +++
 lib/ext2fs/initialize.c   | 30 ++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

Comments

Luis Chamberlain June 15, 2018, 9:06 p.m. UTC | #1
On Fri, Jun 15, 2018 at 01:56:30PM -0700, Luis R. Rodriguez wrote:
> +static errcode_t is_zoned_disk(const char *path)
> +{
> +	char str[PATH_MAX];
> +	char *devname = basename(path);
> +	FILE *file;
> +	int len;
> +
> +	len = snprintf(str, sizeof(str), "/sys/block/%s/queue/zoned", devname);
> +
> +	/* Indicates truncation */
> +	if (len >= PATH_MAX)
> +		return EXT2_ET_INVALID_ARGUMENT;
> +
> +	file = fopen(str, "r");
> +	if (!file)
> +		return 0;
> +
> +	fclose(file);
> +
> +	return EXT2_ET_ZONE_UNSUPPORTED;
> +}


Seems this needs to be extended to ensure this is host-managed only, will spin
a v2 later.

  Luis
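
To illustrate the host-managed check mentioned above, here is a minimal
sketch; it is not the v2 patch. It assumes the kernel exposes the zone model
in /sys/block/<dev>/queue/zoned as one of "none", "host-aware", or
"host-managed". The enum and the helper names parse_zoned_model() and
read_zoned_model() are hypothetical and do not come from e2fsprogs.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 1024
#endif

/* Hypothetical names, used for illustration only. */
enum zoned_model { ZONED_NONE, ZONED_HOST_AWARE, ZONED_HOST_MANAGED };

/*
 * Map the text read from queue/zoned to a zone model. A trailing
 * newline is ignored; anything unrecognized is treated as "none".
 */
static enum zoned_model parse_zoned_model(const char *value)
{
	size_t len = strcspn(value, "\n");

	if (len == strlen("host-managed") &&
	    !strncmp(value, "host-managed", len))
		return ZONED_HOST_MANAGED;
	if (len == strlen("host-aware") &&
	    !strncmp(value, "host-aware", len))
		return ZONED_HOST_AWARE;
	return ZONED_NONE;
}

/*
 * Read the zone model for a whole-disk name such as "sda".
 * Assumption: a missing or unreadable attribute means "not zoned".
 */
static enum zoned_model read_zoned_model(const char *devname)
{
	char path[PATH_MAX];
	char value[32];
	FILE *file;
	int len;

	len = snprintf(path, sizeof(path), "/sys/block/%s/queue/zoned",
		       devname);
	/* A negative result is an error; len >= size indicates truncation. */
	if (len < 0 || (size_t)len >= sizeof(path))
		return ZONED_NONE;

	file = fopen(path, "r");
	if (!file)
		return ZONED_NONE;

	if (!fgets(value, sizeof(value), file))
		value[0] = '\0';
	fclose(file);

	return parse_zoned_model(value);
}
```

A caller along the lines of is_zoned_disk() could then return
EXT2_ET_ZONE_UNSUPPORTED only when the model is ZONED_HOST_MANAGED, so that
disks whose queue/zoned attribute exists but reads "none" are not rejected.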
Theodore Ts'o June 16, 2018, 12:25 a.m. UTC | #2
On Fri, Jun 15, 2018 at 11:06:01PM +0200, Luis R. Rodriguez wrote:
> 
> Seems this needs to be extended to ensure this is host-managed only, will spin
> a v2 later.

Thanks, I was about to ask, "but what about host-aware drives?"

	      	       	    	      - Ted
Andreas Dilger June 16, 2018, 12:28 a.m. UTC | #3
On Jun 15, 2018, at 2:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> 
> Filesystems need special handling to use raw zoned disks, and only f2fs
> currently supports this. No other filesystem supports dealing with zoned
> disks directly.
> 
> As such, using raw zoned disks is not supported by e2fsprogs. To use them you
> need to use dm-zoned-tools, format them with dmzadm, set the scheduler to
> deadline, and then use dmsetup to set up a device-mapper target of type
> zoned, and somehow set this up on every boot to live a semi-happy life for now.
> 
> Even if you use dmsetup on every boot, the zoned disk is still exposed,
> and a user may still think they have to run mkfs.ext[234] on it instead
> of the /dev/mapper/ disk, and then mount it by mistake.
> 
> In either case you may believe your disk works and only eventually
> end up with alignment issues and perhaps lose your data. For instance, the
> below was observed with XFS, but it's expected that ext[234] users would see
> the same:

If you are interested in ext4 and SMR drives, there were some patches
developed to allow ext4 to work on zoned disks, essentially converting
it to be a "lazy-journal log-structured" filesystem.  That makes almost
all of the filesystem IO linear (though we could avoid duplicate journal
writes by writing large IOs directly to disk).

This was presented at FAST'17 as "ext4-lazy", though I'm unable to find the
patch that implemented this feature (it has not landed in the kernel yet).

Maybe Ted could send out a URL for the patch, even if it is a WIP?

Cheers, Andreas


> [10869.959501] device-mapper: zoned reclaim: (sda): Align zone 865 wp 28349 to 30842 (wp+2493) blocks failed -5
> [10870.014488] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [10870.016137] sd 0:0:0:0: [sda] tag#0 Sense Key : Illegal Request [current]
> [10870.017696] sd 0:0:0:0: [sda] tag#0 Add. Sense: Unaligned write command
> 
> We have to prevent these mistakes by avoiding mkfs use on zoned disks.
> 
> Note that this is not enough yet: if users are on old AHCI controllers,
> the disks may not be detected as zoned. More work through udev may be
> required to detect this situation on old parent PCI IDs for zoned
> disks, and then prevent their use somehow.
> 
> If you are stuck on using ext[234] there is a udev rule out there [0]. This is
> far from perfect, and not fully what we want done upstream on Linux
> distributions long term, but it should at least help developers for now
> enjoy their shiny big fat zoned disks with ext[234].
> 
> This check should help avoid having folks shoot themselves in the foot
> for now with zoned disks. If you make the mistake of using mkfs.ext4
> on a zoned disk, you will now get:
> 
> # mkfs.ext4  /dev/sda
> mke2fs 1.44.2 (14-May-2018)
> /dev/sda: zoned disk detected, refer to dm-zoned-tools for how to use while setting up superblock
> 
> [0] https://lkml.kernel.org/r/20180614001147.1545-1-mcgrof@kernel.org
> 
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Bart Van Assche <Bart.VanAssche@wdc.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
> lib/ext2fs/ext2_err.et.in |  3 +++
> lib/ext2fs/initialize.c   | 30 ++++++++++++++++++++++++++++++
> 2 files changed, 33 insertions(+)
> 
> diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
> index 16abd23d..34ee5793 100644
> --- a/lib/ext2fs/ext2_err.et.in
> +++ b/lib/ext2fs/ext2_err.et.in
> @@ -545,4 +545,7 @@ ec	EXT2_ET_INODE_CORRUPTED,
> ec	EXT2_ET_EA_INODE_CORRUPTED,
> 	"Inode containing extended attribute value is corrupted"
> 
> +ec	EXT2_ET_ZONE_UNSUPPORTED,
> +	"zoned disk detected, refer to dm-zoned-tools for how to use"
> +
> 	end
> diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
> index dbe798b2..cef8ebb2 100644
> --- a/lib/ext2fs/initialize.c
> +++ b/lib/ext2fs/initialize.c
> @@ -32,6 +32,10 @@
> #define O_BINARY 0
> #endif
> 
> +#ifndef PATH_MAX
> +#define PATH_MAX 1024
> +#endif
> +
> #if defined(__linux__)    &&	defined(EXT2_OS_LINUX)
> #define CREATOR_OS EXT2_OS_LINUX
> #else
> @@ -85,6 +89,28 @@ static unsigned int calc_reserved_gdt_blocks(ext2_filsys fs)
> 	return rsv_gdb;
> }
> 
> +static errcode_t is_zoned_disk(const char *path)
> +{
> +	char str[PATH_MAX];
> +	char *devname = basename(path);
> +	FILE *file;
> +	int len;
> +
> +	len = snprintf(str, sizeof(str), "/sys/block/%s/queue/zoned", devname);
> +
> +	/* Indicates truncation */
> +	if (len >= PATH_MAX)
> +		return EXT2_ET_INVALID_ARGUMENT;
> +
> +	file = fopen(str, "r");
> +	if (!file)
> +		return 0;
> +
> +	fclose(file);
> +
> +	return EXT2_ET_ZONE_UNSUPPORTED;
> +}
> +
> errcode_t ext2fs_initialize(const char *name, int flags,
> 			    struct ext2_super_block *param,
> 			    io_manager manager, ext2_filsys *ret_fs)
> @@ -112,6 +138,10 @@ errcode_t ext2fs_initialize(const char *name, int flags,
> 	if (!param || !ext2fs_blocks_count(param))
> 		return EXT2_ET_INVALID_ARGUMENT;
> 
> +	retval = is_zoned_disk(name);
> +	if (retval != 0)
> +		return retval;
> +
> 	retval = ext2fs_get_mem(sizeof(struct struct_ext2_filsys), &fs);
> 	if (retval)
> 		return retval;
> --
> 2.17.1
> 


Cheers, Andreas
Andreas Dilger June 16, 2018, 12:36 a.m. UTC | #4
On Jun 15, 2018, at 6:28 PM, Andreas Dilger <adilger@dilger.ca> wrote:
> 
> On Jun 15, 2018, at 2:56 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>> 
>> Filesystems need special handling to use raw zoned disks, and only f2fs
>> currently supports this. No other filesystem supports dealing with zoned
>> disks directly.
>> 
>> As such, using raw zoned disks is not supported by e2fsprogs. To use them you
>> need to use dm-zoned-tools, format them with dmzadm, set the scheduler to
>> deadline, and then use dmsetup to set up a device-mapper target of type
>> zoned, and somehow set this up on every boot to live a semi-happy life for now.
>> 
>> Even if you use dmsetup on every boot, the zoned disk is still exposed,
>> and a user may still think they have to run mkfs.ext[234] on it instead
>> of the /dev/mapper/ disk, and then mount it by mistake.
>> 
>> In either case you may believe your disk works and only eventually
>> end up with alignment issues and perhaps lose your data. For instance, the
>> below was observed with XFS, but it's expected that ext[234] users would see
>> the same:
> 
> If you are interested in ext4 and SMR drives, there were some patches
> developed to allow ext4 to work on zoned disks, essentially converting
> it to be a "lazy-journal log-structured" filesystem.  That makes almost
> all of the filesystem IO linear (though we could avoid duplicate journal
> writes by writing large IOs directly to disk).
> 
> This was presented at FAST'17 as "ext4-lazy", though I'm unable to find the
> patch that implemented this feature (it has not landed in the kernel yet).
> 
> Maybe Ted could send out a URL for the patch, even if it is a WIP?

To reply to my own email, I found the patch in question:

https://github.com/tytso/ext4-patch-queue/blob/master/add-ext4-journal-lazy-mount-option

It wasn't showing up in any of my searches.


Cheers, Andreas
Patch

diff --git a/lib/ext2fs/ext2_err.et.in b/lib/ext2fs/ext2_err.et.in
index 16abd23d..34ee5793 100644
--- a/lib/ext2fs/ext2_err.et.in
+++ b/lib/ext2fs/ext2_err.et.in
@@ -545,4 +545,7 @@  ec	EXT2_ET_INODE_CORRUPTED,
 ec	EXT2_ET_EA_INODE_CORRUPTED,
 	"Inode containing extended attribute value is corrupted"
 
+ec	EXT2_ET_ZONE_UNSUPPORTED,
+	"zoned disk detected, refer to dm-zoned-tools for how to use"
+
 	end
diff --git a/lib/ext2fs/initialize.c b/lib/ext2fs/initialize.c
index dbe798b2..cef8ebb2 100644
--- a/lib/ext2fs/initialize.c
+++ b/lib/ext2fs/initialize.c
@@ -32,6 +32,10 @@ 
 #define O_BINARY 0
 #endif
 
+#ifndef PATH_MAX
+#define PATH_MAX 1024
+#endif
+
 #if defined(__linux__)    &&	defined(EXT2_OS_LINUX)
 #define CREATOR_OS EXT2_OS_LINUX
 #else
@@ -85,6 +89,28 @@  static unsigned int calc_reserved_gdt_blocks(ext2_filsys fs)
 	return rsv_gdb;
 }
 
+static errcode_t is_zoned_disk(const char *path)
+{
+	char str[PATH_MAX];
+	char *devname = basename(path);
+	FILE *file;
+	int len;
+
+	len = snprintf(str, sizeof(str), "/sys/block/%s/queue/zoned", devname);
+
+	/* Indicates truncation */
+	if (len >= PATH_MAX)
+		return EXT2_ET_INVALID_ARGUMENT;
+
+	file = fopen(str, "r");
+	if (!file)
+		return 0;
+
+	fclose(file);
+
+	return EXT2_ET_ZONE_UNSUPPORTED;
+}
+
 errcode_t ext2fs_initialize(const char *name, int flags,
 			    struct ext2_super_block *param,
 			    io_manager manager, ext2_filsys *ret_fs)
@@ -112,6 +138,10 @@  errcode_t ext2fs_initialize(const char *name, int flags,
 	if (!param || !ext2fs_blocks_count(param))
 		return EXT2_ET_INVALID_ARGUMENT;
 
+	retval = is_zoned_disk(name);
+	if (retval != 0)
+		return retval;
+
 	retval = ext2fs_get_mem(sizeof(struct struct_ext2_filsys), &fs);
 	if (retval)
 		return retval;