From patchwork Wed Mar 25 21:18:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261654 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngvh4Ry2z9sPR for ; Thu, 26 Mar 2020 08:18:24 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727358AbgCYVSY (ORCPT ); Wed, 25 Mar 2020 17:18:24 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39544 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSY (ORCPT ); Wed, 25 Mar 2020 17:18:24 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 21D5C28666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 01/11] tune2fs: Allow enabling casefold feature after fs creation Date: Wed, 25 Mar 2020 17:18:01 -0400 Message-Id: <20200325211812.2971787-2-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org The main reason we didn't allow this before was because !CASEFOLDED directories were expected to be normalized(). 
Since this is no longer the case, and as long as the encrypt feature is not enabled, it should be safe to enable this feature. Disabling the feature is trickier, since we need to make sure there are no existing +F directories in the filesystem. Leave that for a future patch. Also, enabling strict mode requires some filesystem-wide verification, so ignore that for now. Signed-off-by: Gabriel Krisman Bertazi --- misc/tune2fs.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/misc/tune2fs.c b/misc/tune2fs.c index a0448f63d1d5..656389c61281 100644 --- a/misc/tune2fs.c +++ b/misc/tune2fs.c @@ -161,7 +161,8 @@ static __u32 ok_features[3] = { EXT4_FEATURE_INCOMPAT_64BIT | EXT4_FEATURE_INCOMPAT_ENCRYPT | EXT4_FEATURE_INCOMPAT_CSUM_SEED | - EXT4_FEATURE_INCOMPAT_LARGEDIR, + EXT4_FEATURE_INCOMPAT_LARGEDIR | + EXT4_FEATURE_INCOMPAT_CASEFOLD, /* R/O compat */ EXT2_FEATURE_RO_COMPAT_LARGE_FILE | EXT4_FEATURE_RO_COMPAT_HUGE_FILE| @@ -1462,6 +1463,19 @@ mmp_error: } } + if (FEATURE_ON(E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_CASEFOLD)) { + if (ext2fs_has_feature_encrypt(sb)) { + fputs(_("Cannot enable casefold feature on filesystems " + "with the encrypt feature enabled.\n"), + stderr); + return 1; + } + + sb->s_encoding = EXT4_ENC_UTF8_12_1; + sb->s_encoding_flags = e2p_get_encoding_flags(sb->s_encoding); + } + + if (sb->s_rev_level == EXT2_GOOD_OLD_REV && (sb->s_feature_compat || sb->s_feature_ro_compat || sb->s_feature_incompat)) From patchwork Wed Mar 25 21:18:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261655 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) 
Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngvm1f15z9sPR for ; Thu, 26 Mar 2020 08:18:28 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727384AbgCYVS1 (ORCPT ); Wed, 25 Mar 2020 17:18:27 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39548 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVS1 (ORCPT ); Wed, 25 Mar 2020 17:18:27 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 5AE7828666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 02/11] tune2fs: Fix casefold+encrypt error message Date: Wed, 25 Mar 2020 17:18:02 -0400 Message-Id: <20200325211812.2971787-3-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Referring to EXT4_INCOMPAT_CASEFOLD as encoding is not as meaningful as saying casefold.
Signed-off-by: Gabriel Krisman Bertazi --- misc/tune2fs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/misc/tune2fs.c b/misc/tune2fs.c index 656389c61281..ffd1d39f2e9c 100644 --- a/misc/tune2fs.c +++ b/misc/tune2fs.c @@ -1420,7 +1420,7 @@ mmp_error: if (FEATURE_ON(E2P_FEATURE_INCOMPAT, EXT4_FEATURE_INCOMPAT_ENCRYPT)) { if (ext2fs_has_feature_casefold(sb)) { fputs(_("Cannot enable encrypt feature on filesystems " - "with the encoding feature enabled.\n"), + "with the casefold feature enabled.\n"), stderr); return 1; } From patchwork Wed Mar 25 21:18:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261656 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngvr2GMbz9sPR for ; Thu, 26 Mar 2020 08:18:32 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727389AbgCYVSc (ORCPT ); Wed, 25 Mar 2020 17:18:32 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39566 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSb (ORCPT ); Wed, 25 Mar 2020 17:18:31 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 42AEC28666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 03/11] ext2fs: Add method to validate casefolded strings Date: Wed, 25 Mar 2020 17:18:03 -0400 Message-Id: 
<20200325211812.2971787-4-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org This is exported to be used by fsck. Signed-off-by: Gabriel Krisman Bertazi --- lib/ext2fs/ext2fs.h | 2 ++ lib/ext2fs/ext2fsP.h | 2 ++ lib/ext2fs/nls_utf8.c | 36 ++++++++++++++++++++++++++++++++++++ 3 files changed, 40 insertions(+) diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h index 93ecf29c568d..bf54130f4edb 100644 --- a/lib/ext2fs/ext2fs.h +++ b/lib/ext2fs/ext2fs.h @@ -1611,6 +1611,8 @@ extern errcode_t ext2fs_new_dir_inline_data(ext2_filsys fs, ext2_ino_t dir_ino, /* nls_utf8.c */ extern const struct ext2fs_nls_table *ext2fs_load_nls_table(int encoding); +extern int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table, + char *s, size_t len, char **pos); /* mkdir.c */ extern errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum, diff --git a/lib/ext2fs/ext2fsP.h b/lib/ext2fs/ext2fsP.h index ad8b7d52d77c..30564ded1e2b 100644 --- a/lib/ext2fs/ext2fsP.h +++ b/lib/ext2fs/ext2fsP.h @@ -104,6 +104,8 @@ struct ext2fs_nls_ops { int (*casefold)(const struct ext2fs_nls_table *charset, const unsigned char *str, size_t len, unsigned char *dest, size_t dlen); + int (*validate)(const struct ext2fs_nls_table *table, + char *s, size_t len, char **pos); }; /* Function prototypes */ diff --git a/lib/ext2fs/nls_utf8.c b/lib/ext2fs/nls_utf8.c index e4c4e7a30990..f59484142e19 100644 --- a/lib/ext2fs/nls_utf8.c +++ b/lib/ext2fs/nls_utf8.c @@ -920,8 +920,38 @@ invalid_seq: return -EINVAL; } + +static int utf8_validate(const struct ext2fs_nls_table *table, + char *s, size_t len, char **pos) +{ + const struct utf8data *data = utf8nfdicf(table->version); + utf8leaf_t *leaf; + size_t ret = 0; + unsigned char 
hangul[UTF8HANGULLEAF]; + + if (!data) + return -1; + while (len && *s) { + leaf = utf8nlookup(data, hangul, s, len); + if (!leaf) { + *pos = s; + return 1; + } + if (utf8agetab[LEAF_GEN(leaf)] > data->maxage) + ret += utf8clen(s); + else if (LEAF_CCC(leaf) == DECOMPOSE) + ret += strlen(LEAF_STR(leaf)); + else + ret += utf8clen(s); + len -= utf8clen(s); + s += utf8clen(s); + } + return 0; +} + static const struct ext2fs_nls_ops utf8_ops = { .casefold = utf8_casefold, + .validate = utf8_validate, }; static const struct ext2fs_nls_table nls_utf8 = { @@ -936,3 +966,9 @@ const struct ext2fs_nls_table *ext2fs_load_nls_table(int encoding) return NULL; } + +int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table, + char *name, size_t len, char **pos) +{ + return table->ops->validate(table, name, len, pos); +} From patchwork Wed Mar 25 21:18:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261657 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngvv6ssTz9sPR for ; Thu, 26 Mar 2020 08:18:35 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727395AbgCYVSf (ORCPT ); Wed, 25 Mar 2020 17:18:35 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39570 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSf (ORCPT ); Wed, 25 Mar 2020 17:18:35 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: 
krisman) with ESMTPSA id D03F528666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 04/11] ext2fs: Implement faster CI comparison of strings Date: Wed, 25 Mar 2020 17:18:04 -0400 Message-Id: <20200325211812.2971787-5-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Instead of calling casefold twice and running memcmp on the results, which requires allocating a temporary buffer for each casefolded version, add a strcasecmp-like method that compares each code point during the casefold itself. This method is exposed because it needs to be used directly by fsck. Signed-off-by: Gabriel Krisman Bertazi --- lib/ext2fs/ext2fs.h | 4 ++++ lib/ext2fs/ext2fsP.h | 4 ++++ lib/ext2fs/nls_utf8.c | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 41 insertions(+) diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h index bf54130f4edb..c5815c37bbb6 100644 --- a/lib/ext2fs/ext2fs.h +++ b/lib/ext2fs/ext2fs.h @@ -1613,6 +1613,10 @@ extern errcode_t ext2fs_new_dir_inline_data(ext2_filsys fs, ext2_ino_t dir_ino, extern const struct ext2fs_nls_table *ext2fs_load_nls_table(int encoding); extern int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table, char *s, size_t len, char **pos); +extern int ext2fs_casefold_cmp(const struct ext2fs_nls_table *table, + const unsigned char *str1, size_t len1, + const unsigned char *str2, size_t len2); + /* mkdir.c */ extern errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum, diff --git a/lib/ext2fs/ext2fsP.h b/lib/ext2fs/ext2fsP.h index 30564ded1e2b..99239be007f2 100644 --- a/lib/ext2fs/ext2fsP.h +++ b/lib/ext2fs/ext2fsP.h @@ -106,6 +106,10 @@ struct ext2fs_nls_ops { unsigned
char *dest, size_t dlen); int (*validate)(const struct ext2fs_nls_table *table, char *s, size_t len, char **pos); + int (*casefold_cmp)(const struct ext2fs_nls_table *table, + const unsigned char *str1, size_t len1, + const unsigned char *str2, size_t len2); + }; /* Function prototypes */ diff --git a/lib/ext2fs/nls_utf8.c b/lib/ext2fs/nls_utf8.c index f59484142e19..f85b8e77e47b 100644 --- a/lib/ext2fs/nls_utf8.c +++ b/lib/ext2fs/nls_utf8.c @@ -949,9 +949,36 @@ static int utf8_validate(const struct ext2fs_nls_table *table, return 0; } +static int utf8_casefold_cmp(const struct ext2fs_nls_table *table, + const unsigned char *str1, size_t len1, + const unsigned char *str2, size_t len2) +{ + const struct utf8data *data = utf8nfdicf(table->version); + int c1, c2; + struct utf8cursor cur1, cur2; + + if (utf8ncursor(&cur1, data, (const char *) str1, len1) < 0) + return -1; + if (utf8ncursor(&cur2, data, (const char *) str2, len2) < 0) + return -1; + + do { + c1 = utf8byte(&cur1); + c2 = utf8byte(&cur2); + + if (c1 < 0 || c2 < 0) + return -1; + if (c1 != c2) + return c1 - c2; + } while (c1); + + return 0; +} + static const struct ext2fs_nls_ops utf8_ops = { .casefold = utf8_casefold, .validate = utf8_validate, + .casefold_cmp = utf8_casefold_cmp, }; static const struct ext2fs_nls_table nls_utf8 = { @@ -972,3 +999,9 @@ int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table, { return table->ops->validate(table, name, len, pos); } +int ext2fs_casefold_cmp(const struct ext2fs_nls_table *table, + const unsigned char *str1, size_t len1, + const unsigned char *str2, size_t len2) +{ + return table->ops->casefold_cmp(table, str1, len1, str2, len2); +} From patchwork Wed Mar 25 21:18:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261658 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org 
Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngvy5Q4rz9sR4 for ; Thu, 26 Mar 2020 08:18:38 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727399AbgCYVSi (ORCPT ); Wed, 25 Mar 2020 17:18:38 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39576 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSi (ORCPT ); Wed, 25 Mar 2020 17:18:38 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id DBF1528ACCC From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 05/11] e2fsck: Fix entries with invalid encoded characters Date: Wed, 25 Mar 2020 17:18:05 -0400 Message-Id: <20200325211812.2971787-6-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org In strict mode, invalid Unicode sequences are not permitted. This patch adds a verification step to pass2 that detects entries containing such sequences and overwrites the name, from the first invalid sequence onward, with '.', the same replacement character used for non-encoding directories. After the encoding test, we still want to check the name for the usual problems: '\0' or '/' in the middle of the sequence.
Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/e2fsck.c | 4 ++++ e2fsck/e2fsck.h | 1 + e2fsck/pass1.c | 17 +++++++++++++++++ e2fsck/pass2.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 4 files changed, 64 insertions(+), 1 deletion(-) diff --git a/e2fsck/e2fsck.c b/e2fsck/e2fsck.c index d8be566fbe97..dc4b45e25657 100644 --- a/e2fsck/e2fsck.c +++ b/e2fsck/e2fsck.c @@ -75,6 +75,10 @@ errcode_t e2fsck_reset_context(e2fsck_t ctx) ext2fs_free_block_bitmap(ctx->block_found_map); ctx->block_found_map = 0; } + if (ctx->inode_casefold_map) { + ext2fs_free_inode_bitmap(ctx->inode_casefold_map); + ctx->inode_casefold_map = 0; + } if (ctx->inode_link_info) { ext2fs_free_icount(ctx->inode_link_info); ctx->inode_link_info = 0; diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h index 954bc9822ed2..335a5e4c6dca 100644 --- a/e2fsck/e2fsck.h +++ b/e2fsck/e2fsck.h @@ -262,6 +262,7 @@ struct e2fsck_struct { ext2fs_inode_bitmap inode_bb_map; /* Inodes which are in bad blocks */ ext2fs_inode_bitmap inode_imagic_map; /* AFS inodes */ ext2fs_inode_bitmap inode_reg_map; /* Inodes which are regular files*/ + ext2fs_inode_bitmap inode_casefold_map; /* Inodes which are casefolded */ ext2fs_block_bitmap block_found_map; /* Blocks which are in use */ ext2fs_block_bitmap block_dup_map; /* Blks referenced more than once */ diff --git a/e2fsck/pass1.c b/e2fsck/pass1.c index a57c1c0670e6..8e61f110fd7a 100644 --- a/e2fsck/pass1.c +++ b/e2fsck/pass1.c @@ -1270,6 +1270,20 @@ void e2fsck_pass1(e2fsck_t ctx) ctx->flags |= E2F_FLAG_ABORT; return; } + if (casefold_fs) { + pctx.errcode = + e2fsck_allocate_inode_bitmap(fs, + _("inode casefold map"), + EXT2FS_BMAP64_RBTREE, + "inode_casefold_map", + &ctx->inode_casefold_map); + if (pctx.errcode) { + pctx.num = 1; + fix_problem(ctx, PR_1_ALLOCATE_IBITMAP_ERROR, &pctx); + ctx->flags |= E2F_FLAG_ABORT; + return; + } + } pctx.errcode = e2fsck_setup_icount(ctx, "inode_link_info", 0, NULL, &ctx->inode_link_info); if (pctx.errcode) { @@ -1888,6 +1902,9 @@ void
e2fsck_pass1(e2fsck_t ctx) add_encrypted_file(ctx, &pctx) < 0) goto clear_inode; + if (casefold_fs && inode->i_flags & EXT4_CASEFOLD_FL) + ext2fs_mark_inode_bitmap2(ctx->inode_casefold_map, ino); + if (LINUX_S_ISDIR(inode->i_mode)) { ext2fs_mark_inode_bitmap2(ctx->inode_dir_map, ino); e2fsck_add_dir_info(ctx, ino, 0); diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c index d3f21017234c..c85ece1ce817 100644 --- a/e2fsck/pass2.c +++ b/e2fsck/pass2.c @@ -36,11 +36,13 @@ * - The inode_bad_map bitmap * - The inode_dir_map bitmap * - The encrypted_file_info + * - The inode_casefold_map bitmap * * Pass 2 frees the following data structures * - The inode_bad_map bitmap * - The inode_reg_map bitmap * - The encrypted_file_info + * - The inode_casefold_map bitmap */ #define _GNU_SOURCE 1 /* get strnlen() */ @@ -286,6 +288,10 @@ void e2fsck_pass2(e2fsck_t ctx) ext2fs_free_inode_bitmap(ctx->inode_reg_map); ctx->inode_reg_map = 0; } + if (ctx->inode_casefold_map) { + ext2fs_free_inode_bitmap(ctx->inode_casefold_map); + ctx->inode_casefold_map = 0; + } destroy_encrypted_file_info(ctx); clear_problem_context(&pctx); @@ -514,6 +520,30 @@ static int encrypted_check_name(e2fsck_t ctx, return 0; } +static int encoded_check_name(e2fsck_t ctx, + struct ext2_dir_entry *dirent, + struct problem_context *pctx) +{ + const struct ext2fs_nls_table *tbl = ctx->fs->encoding; + int ret; + int len = ext2fs_dirent_name_len(dirent); + char *pos, *end; + + ret = ext2fs_check_encoded_name(tbl, dirent->name, len, &pos); + if (ret < 0) { + fatal_error(ctx, _("NLS is broken.")); + } else if(ret > 0) { + ret = fix_problem(ctx, PR_2_BAD_NAME, pctx); + if (ret) { + end = &dirent->name[len]; + for (; *pos && pos != end; pos++) + *pos = '.'; + } + } + + return (ret || check_name(ctx, dirent, pctx)); +} + /* * Check the directory filetype (if present) */ @@ -997,11 +1027,18 @@ static int check_dir_block(ext2_filsys fs, size_t max_block_size; int hash_flags = 0; static char *eop_read_dirblock = NULL; + int cf_dir = 
0; cd = (struct check_dir_struct *) priv_data; ibuf = buf = cd->buf; ctx = cd->ctx; + /* We only want filename encoding verification on strict + * mode. */ + if (ext2fs_test_inode_bitmap2(ctx->inode_casefold_map, ino) && + (ctx->fs->super->s_encoding_flags & EXT4_ENC_STRICT_MODE_FL)) + cf_dir = 1; + if (ctx->flags & E2F_FLAG_RUN_RETURN) return DIRENT_ABORT; @@ -1482,7 +1519,11 @@ skip_checksum: if (check_filetype(ctx, dirent, ino, &cd->pctx)) dir_modified++; - if (dir_encpolicy_id == NO_ENCRYPTION_POLICY) { + if (cf_dir) { + /* casefolded directory */ + if (encoded_check_name(ctx, dirent, &cd->pctx)) + dir_modified++; + } else if (dir_encpolicy_id == NO_ENCRYPTION_POLICY) { /* Unencrypted directory */ if (check_name(ctx, dirent, &cd->pctx)) dir_modified++; From patchwork Wed Mar 25 21:18:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261659 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngw23Kvqz9sPR for ; Thu, 26 Mar 2020 08:18:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727401AbgCYVSm (ORCPT ); Wed, 25 Mar 2020 17:18:42 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39578 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSm (ORCPT ); Wed, 25 Mar 2020 17:18:42 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id A964028666B From: Gabriel 
Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 06/11] e2fsck: Support casefold directories when rehashing Date: Wed, 25 Mar 2020 17:18:06 -0400 Message-Id: <20200325211812.2971787-7-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org When rehashing a +F directory, the casefold comparison needs to be performed in order to identify duplicated filenames. Like the -F version, this is done in two steps: first adapt the qsort comparison to consider casefolded directories, then iterate over the sorted list fixing dups. Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/rehash.c | 88 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 72 insertions(+), 16 deletions(-) diff --git a/e2fsck/rehash.c b/e2fsck/rehash.c index 54bc680388c3..9254e2810ad2 100644 --- a/e2fsck/rehash.c +++ b/e2fsck/rehash.c @@ -211,6 +211,23 @@ static EXT2_QSORT_TYPE ino_cmp(const void *a, const void *b) return (he_a->ino - he_b->ino); } +struct name_cmp_ctx +{ + int casefold; + const struct ext2fs_nls_table *tbl; +}; + + +static int same_name(const struct name_cmp_ctx *cmp_ctx, char *s1, + int len1, char *s2, int len2) +{ + if (!cmp_ctx->casefold) + return (len1 == len2 && !memcmp(s1, s2, len1)); + else + return !ext2fs_casefold_cmp(cmp_ctx->tbl, + s1, len1, s2, len2); +} + /* Used for sorting the hash entry */ static EXT2_QSORT_TYPE name_cmp(const void *a, const void *b) { @@ -237,9 +254,35 @@ static EXT2_QSORT_TYPE name_cmp(const void *a, const void *b) return ret; } +static EXT2_QSORT_TYPE name_cf_cmp(const struct name_cmp_ctx *ctx, + const void *a, const void *b) +{ + const struct hash_entry *he_a = (const struct hash_entry *) a; + const struct hash_entry *he_b
= (const struct hash_entry *) b; + unsigned int he_a_len, he_b_len, min_len; + int ret; + + he_a_len = ext2fs_dirent_name_len(he_a->dir); + he_b_len = ext2fs_dirent_name_len(he_b->dir); + + ret = ext2fs_casefold_cmp(ctx->tbl, he_a->dir->name, he_a_len, + he_b->dir->name, he_b_len); + if (ret == 0) { + if (he_a_len > he_b_len) + ret = 1; + else if (he_a_len < he_b_len) + ret = -1; + else + ret = he_b->dir->inode - he_a->dir->inode; + } + return ret; +} + + /* Used for sorting the hash entry */ -static EXT2_QSORT_TYPE hash_cmp(const void *a, const void *b) +static EXT2_QSORT_TYPE hash_cmp(const void *a, const void *b, void *arg) { + const struct name_cmp_ctx *ctx = (struct name_cmp_ctx *) arg; const struct hash_entry *he_a = (const struct hash_entry *) a; const struct hash_entry *he_b = (const struct hash_entry *) b; int ret; @@ -253,8 +296,12 @@ static EXT2_QSORT_TYPE hash_cmp(const void *a, const void *b) ret = 1; else if (he_a->minor_hash < he_b->minor_hash) ret = -1; - else - ret = name_cmp(a, b); + else { + if (ctx->casefold) + ret = name_cf_cmp(ctx, a, b); + else + ret = name_cmp(a, b); + } } return ret; } @@ -376,7 +423,8 @@ static void mutate_name(char *str, unsigned int *len) static int duplicate_search_and_fix(e2fsck_t ctx, ext2_filsys fs, ext2_ino_t ino, - struct fill_dir_struct *fd) + struct fill_dir_struct *fd, + const struct name_cmp_ctx *cmp_ctx) { struct problem_context pctx; struct hash_entry *ent, *prev; @@ -399,11 +447,12 @@ static int duplicate_search_and_fix(e2fsck_t ctx, ext2_filsys fs, ent = fd->harray + i; prev = ent - 1; if (!ent->dir->inode || - (ext2fs_dirent_name_len(ent->dir) != - ext2fs_dirent_name_len(prev->dir)) || - memcmp(ent->dir->name, prev->dir->name, - ext2fs_dirent_name_len(ent->dir))) + !same_name(cmp_ctx, ent->dir->name, + ext2fs_dirent_name_len(ent->dir), + prev->dir->name, + ext2fs_dirent_name_len(prev->dir))) continue; + pctx.dirent = ent->dir; if ((ent->dir->inode == prev->dir->inode) && fix_problem(ctx, 
PR_2_DUPLICATE_DIRENT, &pctx)) { @@ -422,10 +471,11 @@ static int duplicate_search_and_fix(e2fsck_t ctx, ext2_filsys fs, mutate_name(new_name, &new_len); for (j=0; j < fd->num_array; j++) { if ((i==j) || - (new_len != - (unsigned) ext2fs_dirent_name_len(fd->harray[j].dir)) || - memcmp(new_name, fd->harray[j].dir->name, new_len)) + !same_name(cmp_ctx, new_name, new_len, + fd->harray[j].dir->name, + ext2fs_dirent_name_len(fd->harray[j].dir))) { continue; + } mutate_name(new_name, &new_len); j = -1; @@ -890,6 +940,7 @@ errcode_t e2fsck_rehash_dir(e2fsck_t ctx, ext2_ino_t ino, struct fill_dir_struct fd = { NULL, NULL, 0, 0, 0, NULL, 0, 0, 0, 0, 0, 0 }; struct out_dir outdir = { 0, 0, 0, 0 }; + struct name_cmp_ctx name_cmp_ctx = {0, NULL}; e2fsck_read_inode(ctx, ino, &inode, "rehash_dir"); @@ -917,6 +968,11 @@ errcode_t e2fsck_rehash_dir(e2fsck_t ctx, ext2_ino_t ino, fd.compress = 1; fd.parent = 0; + if (fs->encoding && (inode.i_flags & EXT4_CASEFOLD_FL)) { + name_cmp_ctx.casefold = 1; + name_cmp_ctx.tbl = fs->encoding; + } + retry_nohash: /* Read in the entire directory into memory */ retval = ext2fs_block_iterate3(fs, ino, 0, 0, @@ -945,16 +1001,16 @@ retry_nohash: /* Sort the list */ resort: if (fd.compress && fd.num_array > 1) - qsort(fd.harray+2, fd.num_array-2, sizeof(struct hash_entry), - hash_cmp); + qsort_r(fd.harray+2, fd.num_array-2, sizeof(struct hash_entry), + hash_cmp, &name_cmp_ctx); else - qsort(fd.harray, fd.num_array, sizeof(struct hash_entry), - hash_cmp); + qsort_r(fd.harray, fd.num_array, sizeof(struct hash_entry), + hash_cmp, &name_cmp_ctx); /* * Look for duplicates */ - if (duplicate_search_and_fix(ctx, fs, ino, &fd)) + if (duplicate_search_and_fix(ctx, fs, ino, &fd, &name_cmp_ctx)) goto resort; if (ctx->options & E2F_OPT_NO) { From patchwork Wed Mar 25 21:18:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261660 Return-Path: 
X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngw63JLvz9sPR for ; Thu, 26 Mar 2020 08:18:46 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727402AbgCYVSq (ORCPT ); Wed, 25 Mar 2020 17:18:46 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39582 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSq (ORCPT ); Wed, 25 Mar 2020 17:18:46 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 2FE7428666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 07/11] dict: Support comparison with context Date: Wed, 25 Mar 2020 17:18:07 -0400 Message-Id: <20200325211812.2971787-8-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/pass1b.c | 2 +- e2fsck/pass2.c | 2 +- lib/support/dict.c | 22 ++++++++++++++++------ lib/support/dict.h | 4 +++- lib/support/mkquota.c | 2 +- 5 files changed, 22 insertions(+), 10 deletions(-) diff --git a/e2fsck/pass1b.c b/e2fsck/pass1b.c index bca701cab94f..65df309ecb36 100644 --- a/e2fsck/pass1b.c +++ b/e2fsck/pass1b.c @@ -104,7 +104,7 @@ static dict_t clstr_dict, ino_dict; static 
ext2fs_inode_bitmap inode_dup_map; -static int dict_int_cmp(const void *a, const void *b) +static int dict_int_cmp(const void* cmp_ctx, const void *a, const void *b) { intptr_t ia, ib; diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c index c85ece1ce817..bc2c5b90bc97 100644 --- a/e2fsck/pass2.c +++ b/e2fsck/pass2.c @@ -327,7 +327,7 @@ static int htree_depth(struct dx_dir_info *dx_dir, return depth; } -static int dict_de_cmp(const void *a, const void *b) +static int dict_de_cmp(const void *cmp_ctx, const void *a, const void *b) { const struct ext2_dir_entry *de_a, *de_b; int a_len, b_len; diff --git a/lib/support/dict.c b/lib/support/dict.c index 6a5c15ce8273..f8277c4afdf0 100644 --- a/lib/support/dict.c +++ b/lib/support/dict.c @@ -267,6 +267,7 @@ dict_t *dict_create(dictcount_t maxcount, dict_comp_t comp) new->allocnode = dnode_alloc; new->freenode = dnode_free; new->context = NULL; + new->cmp_ctx = NULL; new->nodecount = 0; new->maxcount = maxcount; new->nilnode.left = &new->nilnode; @@ -294,6 +295,14 @@ void dict_set_allocator(dict_t *dict, dnode_alloc_t al, dict->context = context; } +void dict_set_cmp_context(dict_t *dict, void *cmp_ctx) +{ + dict_assert (!dict->cmp_ctx); + dict_assert (dict_count(dict) == 0); + + dict->cmp_ctx = cmp_ctx; +} + #ifdef E2FSCK_NOTUSED /* * Free a dynamically allocated dictionary object. 
Removing the nodes @@ -467,7 +476,7 @@ dnode_t *dict_lookup(dict_t *dict, const void *key) /* simple binary search adapted for trees that contain duplicate keys */ while (root != nil) { - result = dict->compare(key, root->key); + result = dict->compare(dict->cmp_ctx, key, root->key); if (result < 0) root = root->left; else if (result > 0) @@ -479,7 +488,8 @@ dnode_t *dict_lookup(dict_t *dict, const void *key) do { saved = root; root = root->left; - while (root != nil && dict->compare(key, root->key)) + while (root != nil + && dict->compare(dict->cmp_ctx, key, root->key)) root = root->right; } while (root != nil); return saved; @@ -503,7 +513,7 @@ dnode_t *dict_lower_bound(dict_t *dict, const void *key) dnode_t *tentative = 0; while (root != nil) { - int result = dict->compare(key, root->key); + int result = dict->compare(dict->cmp_ctx, key, root->key); if (result > 0) { root = root->right; @@ -535,7 +545,7 @@ dnode_t *dict_upper_bound(dict_t *dict, const void *key) dnode_t *tentative = 0; while (root != nil) { - int result = dict->compare(key, root->key); + int result = dict->compare(dict->cmp_ctx, key, root->key); if (result < 0) { root = root->left; @@ -580,7 +590,7 @@ void dict_insert(dict_t *dict, dnode_t *node, const void *key) while (where != nil) { parent = where; - result = dict->compare(key, where->key); + result = dict->compare(dict->cmp_ctx, key, where->key); /* trap attempts at duplicate key insertion unless it's explicitly allowed */ dict_assert (dict->dupes || result != 0); if (result < 0) @@ -1261,7 +1271,7 @@ static int tokenize(char *string, ...) 
return tokcount; } -static int comparef(const void *key1, const void *key2) +static int comparef(const void *cmp_ctx, const void *key1, const void *key2) { return strcmp(key1, key2); } diff --git a/lib/support/dict.h b/lib/support/dict.h index 838079d6c85d..d9462a33f671 100644 --- a/lib/support/dict.h +++ b/lib/support/dict.h @@ -56,7 +56,7 @@ typedef struct dnode_t { #endif } dnode_t; -typedef int (*dict_comp_t)(const void *, const void *); +typedef int (*dict_comp_t)(const void *, const void *, const void *); typedef dnode_t *(*dnode_alloc_t)(void *); typedef void (*dnode_free_t)(dnode_t *, void *); @@ -69,6 +69,7 @@ typedef struct dict_t { dnode_alloc_t dict_allocnode; dnode_free_t dict_freenode; void *dict_context; + void *cmp_ctx; int dict_dupes; #else int dict_dummmy; @@ -88,6 +89,7 @@ typedef struct dict_load_t { extern dict_t *dict_create(dictcount_t, dict_comp_t); extern void dict_set_allocator(dict_t *, dnode_alloc_t, dnode_free_t, void *); +extern void dict_set_cmp_context(dict_t *, void *); extern void dict_destroy(dict_t *); extern void dict_free_nodes(dict_t *); extern void dict_free(dict_t *); diff --git a/lib/support/mkquota.c b/lib/support/mkquota.c index 6f7ae6d6ad45..fbc3833aee98 100644 --- a/lib/support/mkquota.c +++ b/lib/support/mkquota.c @@ -234,7 +234,7 @@ out: /* Helper functions for computing quota in memory. 
*/ /******************************************************************/ -static int dict_uint_cmp(const void *a, const void *b) +static int dict_uint_cmp(const void *cmp_ctx, const void *a, const void *b) { unsigned int c, d; From patchwork Wed Mar 25 21:18:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261661 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngw93jQkz9sPR for ; Thu, 26 Mar 2020 08:18:49 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727406AbgCYVSt (ORCPT ); Wed, 25 Mar 2020 17:18:49 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39586 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSt (ORCPT ); Wed, 25 Mar 2020 17:18:49 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 154D928666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 08/11] e2fsck: Detect duplicated casefolded direntries for rehash Date: Wed, 25 Mar 2020 17:18:08 -0400 Message-Id: <20200325211812.2971787-9-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: 
linux-ext4@vger.kernel.org In pass 2, use a casefold-aware comparison when looking for duplicated entries in casefolded directories. Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/pass2.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c index bc2c5b90bc97..3b9a2ac78b00 100644 --- a/e2fsck/pass2.c +++ b/e2fsck/pass2.c @@ -343,6 +343,20 @@ static int dict_de_cmp(const void *cmp_ctx, const void *a, const void *b) return memcmp(de_a->name, de_b->name, a_len); } +static int dict_de_cf_cmp(const void *cmp_ctx, const void *a, const void *b) +{ + const struct ext2fs_nls_table *tbl = cmp_ctx; + const struct ext2_dir_entry *de_a, *de_b; + int a_len, b_len; + + de_a = (const struct ext2_dir_entry *) a; + a_len = ext2fs_dirent_name_len(de_a); + de_b = (const struct ext2_dir_entry *) b; + b_len = ext2fs_dirent_name_len(de_b); + + return ext2fs_casefold_cmp(tbl, de_a->name, a_len, de_b->name, b_len); +} + /* * This is special sort function that makes sure that directory blocks * with a dirblock of zero are sorted to the beginning of the list.
@@ -1254,7 +1268,13 @@ skip_checksum: dir_encpolicy_id = find_encryption_policy(ctx, ino); - dict_init(&de_dict, DICTCOUNT_T_MAX, dict_de_cmp); + if (cf_dir) { + dict_init(&de_dict, DICTCOUNT_T_MAX, dict_de_cf_cmp); + dict_set_cmp_context(&de_dict, (void *)ctx->fs->encoding); + } else { + dict_init(&de_dict, DICTCOUNT_T_MAX, dict_de_cmp); + } + prev = 0; do { dgrp_t group; From patchwork Wed Mar 25 21:18:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261662 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngwF54TBz9sPR for ; Thu, 26 Mar 2020 08:18:53 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727417AbgCYVSx (ORCPT ); Wed, 25 Mar 2020 17:18:53 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39590 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVSx (ORCPT ); Wed, 25 Mar 2020 17:18:53 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 6898028666B From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 09/11] e2fsck: Add option to force encoded filename verification Date: Wed, 25 Mar 2020 17:18:09 -0400 Message-Id: <20200325211812.2971787-10-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: 
<20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org This is interesting for !strict filesystems as part of the encoding update procedure. Once the filesystem is known to not have badly encoded filenames, the update is trivial, thanks to the stability of assigned code points in the unicode specification. Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/e2fsck.h | 1 + e2fsck/pass2.c | 5 +++-- e2fsck/unix.c | 4 ++++ 3 files changed, 8 insertions(+), 2 deletions(-) diff --git a/e2fsck/e2fsck.h b/e2fsck/e2fsck.h index 335a5e4c6dca..f0d206a4cba0 100644 --- a/e2fsck/e2fsck.h +++ b/e2fsck/e2fsck.h @@ -177,6 +177,7 @@ struct resource_track { #define E2F_OPT_ICOUNT_FULLMAP 0x20000 /* use an array for inode counts */ #define E2F_OPT_UNSHARE_BLOCKS 0x40000 #define E2F_OPT_CLEAR_UNINIT 0x80000 /* Hack to clear the uninit bit */ +#define E2F_OPT_CHECK_ENCODING 0x100000 /* Force verification of encoded filenames */ /* * E2fsck flags diff --git a/e2fsck/pass2.c b/e2fsck/pass2.c index 3b9a2ac78b00..17e35d08c2b2 100644 --- a/e2fsck/pass2.c +++ b/e2fsck/pass2.c @@ -1048,9 +1048,10 @@ static int check_dir_block(ext2_filsys fs, ctx = cd->ctx; /* We only want filename encoding verification on strict - * mode. */ + * mode or if explicitly requested by user. 
*/ if (ext2fs_test_inode_bitmap2(ctx->inode_casefold_map, ino) && - (ctx->fs->super->s_encoding_flags & EXT4_ENC_STRICT_MODE_FL)) + ((ctx->fs->super->s_encoding_flags & EXT4_ENC_STRICT_MODE_FL) || + (ctx->options & E2F_OPT_CHECK_ENCODING))) cf_dir = 1; if (ctx->flags & E2F_FLAG_RUN_RETURN) diff --git a/e2fsck/unix.c b/e2fsck/unix.c index b3ef0f22b866..168b4784d65e 100644 --- a/e2fsck/unix.c +++ b/e2fsck/unix.c @@ -753,6 +753,9 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts) ctx->options |= E2F_OPT_UNSHARE_BLOCKS; ctx->options |= E2F_OPT_FORCE; continue; + } else if (strcmp(token, "check_encoding") == 0) { + ctx->options |= E2F_OPT_CHECK_ENCODING; + continue; #ifdef CONFIG_DEVELOPER_FEATURES } else if (strcmp(token, "clear_all_uninit_bits") == 0) { ctx->options |= E2F_OPT_CLEAR_UNINIT; @@ -784,6 +787,7 @@ static void parse_extended_opts(e2fsck_t ctx, const char *opts) fputs("\tbmap2extent\n", stderr); fputs("\tunshare_blocks\n", stderr); fputs("\tfixes_only\n", stderr); + fputs("\tcheck_encoding\n", stderr); fputc('\n', stderr); exit(1); } From patchwork Wed Mar 25 21:18:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261663 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngwK2jxQz9sPR for ; Thu, 26 Mar 2020 08:18:57 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727420AbgCYVS5 (ORCPT ); Wed, 25 Mar 2020 17:18:57 -0400 Received: from 
bhuna.collabora.co.uk ([46.235.227.227]:39606 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVS4 (ORCPT ); Wed, 25 Mar 2020 17:18:56 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id 7056A292B0F From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 10/11] e2fsck.8.in: Document check_encoding extended option Date: Wed, 25 Mar 2020 17:18:10 -0400 Message-Id: <20200325211812.2971787-11-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Signed-off-by: Gabriel Krisman Bertazi --- e2fsck/e2fsck.8.in | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/e2fsck/e2fsck.8.in b/e2fsck/e2fsck.8.in index 4e3890b2d3e1..019a34ecdff6 100644 --- a/e2fsck/e2fsck.8.in +++ b/e2fsck/e2fsck.8.in @@ -267,6 +267,10 @@ Only fix damaged metadata; do not optimize htree directories or compress extent trees. This option is incompatible with the -D and -E bmap2extent options. .TP +.BI check_encoding +Force verification of encoded filenames in case-insensitive directories. +This is the default mode if the filesystem has the strict flag enabled. 
+.TP .BI unshare_blocks If the filesystem has shared blocks, with the shared blocks read-only feature enabled, then this will unshare all shared blocks and unset the read-only From patchwork Wed Mar 25 21:18:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gabriel Krisman Bertazi X-Patchwork-Id: 1261664 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-ext4-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=collabora.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 48ngwN6Z48z9sPR for ; Thu, 26 Mar 2020 08:19:00 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727430AbgCYVTA (ORCPT ); Wed, 25 Mar 2020 17:19:00 -0400 Received: from bhuna.collabora.co.uk ([46.235.227.227]:39610 "EHLO bhuna.collabora.co.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727351AbgCYVTA (ORCPT ); Wed, 25 Mar 2020 17:19:00 -0400 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: krisman) with ESMTPSA id AE03A292B0F From: Gabriel Krisman Bertazi To: tytso@mit.edu Cc: linux-ext4@vger.kernel.org, Gabriel Krisman Bertazi Subject: [PATCH e2fsprogs 11/11] tests: f_bad_fname: Test fixes of invalid filenames and duplicates Date: Wed, 25 Mar 2020 17:18:11 -0400 Message-Id: <20200325211812.2971787-12-krisman@collabora.com> X-Mailer: git-send-email 2.25.0 In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com> References: <20200325211812.2971787-1-krisman@collabora.com> MIME-Version: 1.0 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Signed-off-by: Gabriel 
Krisman Bertazi --- tests/f_bad_fname/expect.1 | 22 ++++++++++++++++++++++ tests/f_bad_fname/expect.2 | 7 +++++++ tests/f_bad_fname/image.gz | Bin 0 -> 802 bytes tests/f_bad_fname/name | 1 + 4 files changed, 30 insertions(+) create mode 100644 tests/f_bad_fname/expect.1 create mode 100644 tests/f_bad_fname/expect.2 create mode 100644 tests/f_bad_fname/image.gz create mode 100644 tests/f_bad_fname/name diff --git a/tests/f_bad_fname/expect.1 b/tests/f_bad_fname/expect.1 new file mode 100644 index 000000000000..1d860b2247a4 --- /dev/null +++ b/tests/f_bad_fname/expect.1 @@ -0,0 +1,22 @@ +Pass 1: Checking inodes, blocks, and sizes +Pass 2: Checking directory structure +Entry 'AM-^?' in /ci_dir (12) has illegal characters in its name. +Fix? yes + +Entry 'AM-~' in /ci_dir (12) has illegal characters in its name. +Fix? yes + +Duplicate entry 'A.' found. + Marking /ci_dir (12) to be rebuilt. + +Pass 3: Checking directory connectivity +Pass 3A: Optimizing directories +Entry 'A.' in /ci_dir (12) has a non-unique filename. +Rename to A.~0? 
yes + +Pass 4: Checking reference counts +Pass 5: Checking group summary information + +test_filesys: ***** FILE SYSTEM WAS MODIFIED ***** +test_filesys: 14/16 files (0.0% non-contiguous), 22/100 blocks +Exit status is 1 diff --git a/tests/f_bad_fname/expect.2 b/tests/f_bad_fname/expect.2 new file mode 100644 index 000000000000..13de1c0806b7 --- /dev/null +++ b/tests/f_bad_fname/expect.2 @@ -0,0 +1,7 @@ +Pass 1: Checking inodes, blocks, and sizes +Pass 2: Checking directory structure +Pass 3: Checking directory connectivity +Pass 4: Checking reference counts +Pass 5: Checking group summary information +test_filesys: 14/16 files (0.0% non-contiguous), 22/100 blocks +Exit status is 0 diff --git a/tests/f_bad_fname/image.gz b/tests/f_bad_fname/image.gz new file mode 100644 index 0000000000000000000000000000000000000000..a8b3fc6b8397a7859d9697c462f24f498bb57fd8 GIT binary patch literal 802 zcmb2|=HU3ZwK|T0IWspgJ(c0@9p4PuP!Wa)#-G*9mUQmd6)h1gP)&N{wkF^Lhf?9g zMNtKcnpXpOe4{cJM+7ff`jtKW--4vV=~{YsIv=@RXj&kBGDu*_q5yNH8;i;k=a=78 z@%4!p{B((@Y#;x**}v1?o!R+$Msd2?XQhT^yX=m-bnO%AUveTi<+L?Vc;L6)A5Olh zkgUC!Xa6hu>if4lZ+<+wuKRbcPoc`uZ6eD**?bcdtk2o`_gzv|PMx0hpO>@$rmu^( z?AyKU``I-=Pp&oC_H)|fqtg6-`@Wz0F1_p<&)<1R{~GSt@b*UM($5jzk<&|Eqb!S5 zC;dI25?e2O?evSo@-lf#pZaR%b#MCjIm7kh&2Lv9-_JR}c*edv`_`Z5uit04m}T9R zvg*S}s~^PYF7Q0{{O`dNpFUM?$$I|tgvVE&{hDl_R{mvAe|KZ^|CzPI7iX&P%Zr_C zZ@M!(>662efM5S2S5LLK+`4<&P40-^kkj^ie!o3`-2B-7*Z-%7h5kQqb@iJ6?SA`q zEzOyqebrVf!FKA``uEYh>$GmalAq=J*SUCQeTLpw{+2WHWLy0EcfE6{wREm~%q26LNIByx0HF zdH5@Ouk^p<-#3bF)m~g&Irr}r`pP?ZBEOZ}g`cy3`d{tq z*BSM{UoQRmzSZQvY*fe5KXYH}*Z*qVr}y*p&-cgVKG&N_7JkaOdT;h8y<rT_?(Xb literal 0 HcmV?d00001 diff --git a/tests/f_bad_fname/name b/tests/f_bad_fname/name new file mode 100644 index 000000000000..675165a67c25 --- /dev/null +++ b/tests/f_bad_fname/name @@ -0,0 +1 @@ +Case-insensitive directory with broken file names