[RESEND,v2,21/25] ext4: Add encoding mount options

Message ID 20180924215655.3676-22-krisman@collabora.co.uk
State New
Headers show
Series
  • Ext4 Encoding and Case-insensitive support
Related show

Commit Message

Gabriel Krisman Bertazi Sept. 24, 2018, 9:56 p.m.
This patch implements two new mount options for ext4: encoding and
encoding_flags.

The encoding option receives a string that identifies the NLS encoding
to be used when mounting the filesystem.  The user can optionally ask
for a specific version by appending the version string after a dash.

The encoding_flags argument allows the user to specify how the NLS
charset must behave.  The exact behavior of the flags are defined at
ext4.h.

encoding_flags is ignored if the user didn't provide an encoding.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
---
 fs/ext4/super.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)

Comments

Theodore Y. Ts'o Oct. 7, 2018, 7:22 p.m. | #1
On Mon, Sep 24, 2018 at 05:56:51PM -0400, Gabriel Krisman Bertazi wrote:
> This patch implements two new mount options for ext4: encoding and
> encoding_flags.
> 
> The encoding option receives a string that identifies the NLS encoding
> to be used when mounting the filesystem.  The user can optionally ask
> for a specific version by appending the version string after a dash.
> 
> The encoding_flags argument allows the user to specify how the NLS
> charset must behave.  The exact behavior of the flags are defined at
> ext4.h.
> 
> encoding_flags is ignored if the user didn't provide an encoding.
> 
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>

It seems to me that adding support for setting the encoding parameters
via mount options is a bad idea.  The encoding is going to impact
directory hash; which means if the file system has directories created
using, say, ASCII as its encoding, and then the encoding changes to
UTF-8, directory lookups won't work correctly.  So I think this commit
needs to be dropped, and support for setting the encoding needs to be
added to e2fsprogs as the primary way encoding settings are made.

We need e2fsprogs support before this feature is ready for production
use, since e2fsck needs to be able to properly rebuild directories.

Do you agree?

						- Ted
Gabriel Krisman Bertazi Oct. 9, 2018, 2:53 p.m. | #2
"Theodore Y. Ts'o" <tytso@mit.edu> writes:

> On Mon, Sep 24, 2018 at 05:56:51PM -0400, Gabriel Krisman Bertazi wrote:

> It seems to me that adding support for setting the encoding parameters
> via mount options is a bad idea.  The encoding is going to impact
> directory hash; which means if the file system has directories created
> using, say, ASCII as its encoding, and then the encoding changes to
> UTF-8, directory lookups won't work correctly.  So I think this commit
> needs to be dropped, and support for setting the encoding needs to be
> added to e2fsprogs as the primary way encoding settings are made.

The main reason I had this patch is debugging, so I am ok with dropping
it.

> We need e2fsprogs support before this feature is ready for production
> use, since e2fsck needs to be able to properly rebuild directories.
>
> Do you agree?

I have support for e2fsprogs in the branch I mentioned in the cover
letter, I will rebase and start submitting it in the next iteration.

I am only supporting encoding selection at mkfs time, so I don't need to
rebuild the hashes, (except for fsck errors).  I'm not sure I care about
changing the encoding after creation time, since this complicates
things, for instance by increasing the risk of file name collisions.
I'm also only allowing case-sensitiveness configuration changes on empty
directories for the same reason.

I will start by submitting kernel/e2fsprogs patches to reserve the
superblock bits.  Do we agree on the current superblock format?

Also, what is the best list to send the NLS patches?  fsdevel?

Patch

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ccf4742fea3b..f70c947c9850 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1409,6 +1409,7 @@  enum {
 	Opt_dioread_nolock, Opt_dioread_lock,
 	Opt_discard, Opt_nodiscard, Opt_init_itable, Opt_noinit_itable,
 	Opt_max_dir_size_kb, Opt_nojournal_checksum, Opt_nombcache,
+	Opt_encoding, Opt_encoding_flags
 };
 
 static const match_table_t tokens = {
@@ -1493,6 +1494,8 @@  static const match_table_t tokens = {
 	{Opt_noinit_itable, "noinit_itable"},
 	{Opt_max_dir_size_kb, "max_dir_size_kb=%u"},
 	{Opt_test_dummy_encryption, "test_dummy_encryption"},
+	{Opt_encoding, "encoding=%s"},
+	{Opt_encoding_flags, "encoding_flags=%u"},
 	{Opt_nombcache, "nombcache"},
 	{Opt_nombcache, "no_mbcache"},	/* for backward compatibility */
 	{Opt_removed, "check=none"},	/* mount option from ext2/3 */
@@ -1705,6 +1708,8 @@  static const struct mount_opts {
 	{Opt_max_dir_size_kb, 0, MOPT_GTE0},
 	{Opt_test_dummy_encryption, 0, MOPT_GTE0},
 	{Opt_nombcache, EXT4_MOUNT_NO_MBCACHE, MOPT_SET},
+	{Opt_encoding, 0, MOPT_EXT4_ONLY | MOPT_STRING},
+	{Opt_encoding_flags, 0, MOPT_EXT4_ONLY},
 	{Opt_err, 0, 0}
 };
 
@@ -1731,6 +1736,27 @@  ext4_sb_read_encoding(const struct ext4_super_block *es,
 
 	return 0;
 }
+
+static struct ext4_sb_encodings *ext4_parse_encoding_opt(char *arg)
+{
+	char *split;
+	struct ext4_sb_encodings *info;
+	ssize_t len = strlen(arg) + 1;
+
+	info = kzalloc(sizeof(struct ext4_sb_encodings) + len, GFP_KERNEL);
+	if (!info)
+		return NULL;
+
+	info->name = ((char*) &info[1]);
+	strncpy(info->name, arg, len);
+
+	split = strchr(info->name, '-');
+	if (split && *(split+1)) {
+		*split = '\0';
+		info->version = split+1;
+	}
+	return info;
+}
 #endif
 
 static int handle_mount_opt(struct super_block *sb, char *opt, int token,
@@ -1965,6 +1991,42 @@  static int handle_mount_opt(struct super_block *sb, char *opt, int token,
 		sbi->s_mount_opt |= m->mount_opt;
 	} else if (token == Opt_data_err_ignore) {
 		sbi->s_mount_opt &= ~m->mount_opt;
+	} else if (token == Opt_encoding) {
+		int ret = -1;
+#ifdef CONFIG_NLS
+		char *encoding = match_strdup(&args[0]);
+
+		if (!encoding)
+			return -ENOMEM;
+
+		if (ext4_has_feature_encrypt(sb)) {
+			ext4_msg(sb, KERN_ERR,
+				 "Can't mount with both encoding and encryption");
+			goto encoding_out;
+		}
+
+		sbi->encoding_info = ext4_parse_encoding_opt(encoding);
+		if (!sbi->encoding_info) {
+			ext4_msg(sb, KERN_ERR,
+				 "Encoding %s not supported by ext4", encoding);
+			goto encoding_out;
+		}
+
+		ret = 0;
+encoding_out:
+		kfree(encoding);
+#else
+		ext4_msg(sb, KERN_INFO, "encoding option not supported");
+#endif
+		return ret;
+	} else if (token == Opt_encoding_flags) {
+#ifdef CONFIG_NLS
+		sbi->encoding_flags = arg;
+		return 0;
+#else
+		ext4_msg(sb, KERN_INFO, "encoding flags option not supported");
+		return -1;
+#endif
 	} else {
 		if (!args->from)
 			arg = 1;
@@ -2051,6 +2113,35 @@  static int parse_options(char *options, struct super_block *sb,
 			return 0;
 		}
 	}
+
+#ifdef CONFIG_NLS
+	if (sbi->encoding_info) {
+		sbi->encoding = load_nls_version(sbi->encoding_info->name,
+						 sbi->encoding_info->version,
+						 sbi->encoding_flags);
+		if (IS_ERR(sbi->encoding)) {
+			ext4_msg(sb, KERN_ERR,
+				 "Cannot load encoding: %s %s with flags 0x%hx",
+				 sbi->encoding_info->name,
+				 sbi->encoding_info->version?:"\b",
+				 sbi->encoding_flags & EXT4_ENC_NLS_FL_MASK);
+
+			sbi->encoding = NULL;
+			sbi->encoding_flags = 0;
+			kfree(sbi->encoding_info);
+
+			return 0;
+		}
+		ext4_msg(sb, KERN_INFO,"Using encoding defined by mount option: "
+			 "%s %s with flags 0x%hx", sbi->encoding_info->name,
+			 sbi->encoding_info->version?:"\b", sbi->encoding_flags);
+	} else {
+		/* Make sure the flags are zeroed if the the user didn't
+		   provide the encoding name. */
+		sbi->encoding_flags = 0;
+	}
+
+#endif
 	return 1;
 }