From patchwork Wed Jul 23 23:47:18 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: JP Abgrall
X-Patchwork-Id: 373079
From: JP Abgrall
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, JP Abgrall, Geremy Condra
Subject: [PATCH] ext4: Add support for FIDTRIM, a best-effort ioctl for deep discard trim
Date: Wed, 23 Jul 2014 16:47:18 -0700
Message-Id: <1406159238-7557-1-git-send-email-jpa@google.com>
X-Mailer: git-send-email 2.0.0.526.g5318336
X-Mailing-List: linux-ext4@vger.kernel.org

* What
This provides an interface for issuing an FITRIM which uses the
secure discard instead of just a discard.
Only the eMMC command is "secure", and not how the FS uses it:
because the FS might reassign a region somewhere else, the original
deleted data will not be affected by the "trim", which only handles
unused regions.
So we'll just call it "deep discard", and note that this is a
"best effort" cleanup.

* Why
We want to be able to clean up most of the unused blocks.
We don't want to constantly secure-discard via a mount option.
Per the eMMC spec, secure trim tells the device to really get rid of
all the data for the specified blocks, not just put them back into the
pool of free ones.
The spec says the secure trim handling must make sure the data (and
metadata) is no longer available.

JEDEC Standard No. 84-A441
  7.6.9 Secure Erase
  7.6.10 Secure Trim

Signed-off-by: Geremy Condra
Signed-off-by: JP Abgrall
---
 fs/ext4/ext4.h          |  3 ++-
 fs/ext4/ioctl.c         |  6 +++++-
 fs/ext4/mballoc.c       | 28 ++++++++++++++++++----------
 include/uapi/linux/fs.h |  1 +
 4 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 7cc5a0e..5156fd6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2082,7 +2082,8 @@ extern int ext4_mb_add_groupinfo(struct super_block *sb,
 		ext4_group_t i, struct ext4_group_desc *desc);
 extern int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
 				ext4_fsblk_t block, unsigned long count);
-extern int ext4_trim_fs(struct super_block *, struct fstrim_range *);
+extern int ext4_trim_fs(struct super_block *, struct fstrim_range *,
+				unsigned long blkdev_flags);
 
 /* inode.c */
 struct buffer_head *ext4_getblk(handle_t *, struct inode *,
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 0f2252e..60bb5bc 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -580,11 +580,13 @@ resizefs_out:
 		return err;
 	}
 
+	case FIDTRIM:
 	case FITRIM:
 	{
 		struct request_queue *q = bdev_get_queue(sb->s_bdev);
 		struct fstrim_range range;
 		int ret = 0;
+		int flags = cmd == FIDTRIM ? BLKDEV_DISCARD_SECURE : 0;
 
 		if (!capable(CAP_SYS_ADMIN))
 			return -EPERM;
@@ -592,13 +594,15 @@ resizefs_out:
 		if (!blk_queue_discard(q))
 			return -EOPNOTSUPP;
+		if ((flags & BLKDEV_DISCARD_SECURE) && !blk_queue_secdiscard(q))
+			return -EOPNOTSUPP;
 
 		if (copy_from_user(&range, (struct fstrim_range __user *)arg,
 		    sizeof(range)))
 			return -EFAULT;
 
 		range.minlen = max((unsigned int)range.minlen,
 				   q->limits.discard_granularity);
-		ret = ext4_trim_fs(sb, &range);
+		ret = ext4_trim_fs(sb, &range, flags);
 		if (ret < 0)
 			return ret;
 
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 2dcb936..48caed9 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2742,7 +2742,8 @@ int ext4_mb_release(struct super_block *sb)
 }
 
 static inline int ext4_issue_discard(struct super_block *sb,
-		ext4_group_t block_group, ext4_grpblk_t cluster, int count)
+		ext4_group_t block_group, ext4_grpblk_t cluster, int count,
+		unsigned long flags)
 {
 	ext4_fsblk_t discard_block;
 
@@ -2751,7 +2752,7 @@ static inline int ext4_issue_discard(struct super_block *sb,
 	count = EXT4_C2B(EXT4_SB(sb), count);
 	trace_ext4_discard_blocks(sb,
 			(unsigned long long) discard_block, count);
-	return sb_issue_discard(sb, discard_block, count, GFP_NOFS, 0);
+	return sb_issue_discard(sb, discard_block, count, GFP_NOFS, flags);
 }
 
 /*
@@ -2773,7 +2774,7 @@ static void ext4_free_data_callback(struct super_block *sb,
 	if (test_opt(sb, DISCARD)) {
 		err = ext4_issue_discard(sb, entry->efd_group,
 					 entry->efd_start_cluster,
-					 entry->efd_count);
+					 entry->efd_count, 0);
 		if (err && err != -EOPNOTSUPP)
 			ext4_msg(sb, KERN_WARNING, "discard request in"
 				 " group:%d block:%d count:%d failed"
@@ -4812,7 +4813,8 @@ do_more:
 		 * them with group lock_held
 		 */
 		if (test_opt(sb, DISCARD)) {
-			err = ext4_issue_discard(sb, block_group, bit, count);
+			err = ext4_issue_discard(sb, block_group, bit, count,
+						 0);
 			if (err && err != -EOPNOTSUPP)
 				ext4_msg(sb, KERN_WARNING, "discard request in"
 					 " group:%d block:%d count:%lu failed"
@@ -5019,13 +5021,15 @@ error_return:
  * @count:	number of blocks to TRIM
  * @group:	alloc. group we are working with
  * @e4b:	ext4 buddy for the group
+ * @blkdev_flags: flags for the block device
  *
  * Trim "count" blocks starting at "start" in the "group". To assure that no
  * one will allocate those blocks, mark it as used in buddy bitmap. This must
  * be called with under the group lock.
  */
 static int ext4_trim_extent(struct super_block *sb, int start, int count,
-			     ext4_group_t group, struct ext4_buddy *e4b)
+			     ext4_group_t group, struct ext4_buddy *e4b,
+			     unsigned long blkdev_flags)
 __releases(bitlock)
 __acquires(bitlock)
 {
@@ -5046,7 +5050,7 @@ __acquires(bitlock)
 	 */
 	mb_mark_used(e4b, &ex);
 	ext4_unlock_group(sb, group);
-	ret = ext4_issue_discard(sb, group, start, count);
+	ret = ext4_issue_discard(sb, group, start, count, blkdev_flags);
 	ext4_lock_group(sb, group);
 	mb_free_blocks(NULL, e4b, start, ex.fe_len);
 	return ret;
@@ -5059,6 +5063,7 @@ __acquires(bitlock)
  * @start:		first group block to examine
  * @max:		last group block to examine
  * @minblocks:		minimum extent block count
+ * @blkdev_flags:	flags for the block device
  *
  * ext4_trim_all_free walks through group's buddy bitmap searching for free
  * extents. When the free block is found, ext4_trim_extent is called to TRIM
@@ -5073,7 +5078,7 @@ __acquires(bitlock)
 static ext4_grpblk_t
 ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
 		   ext4_grpblk_t start, ext4_grpblk_t max,
-		   ext4_grpblk_t minblocks)
+		   ext4_grpblk_t minblocks, unsigned long blkdev_flags)
 {
 	void *bitmap;
 	ext4_grpblk_t next, count = 0, free_count = 0;
@@ -5106,7 +5111,8 @@ ext4_trim_all_free(struct super_block *sb, ext4_group_t group,
 
 		if ((next - start) >= minblocks) {
 			ret = ext4_trim_extent(sb, start,
-					       next - start, group, &e4b);
+					       next - start, group, &e4b,
+					       blkdev_flags);
 			if (ret && ret != -EOPNOTSUPP)
 				break;
 			ret = 0;
@@ -5148,6 +5154,7 @@ out:
  * ext4_trim_fs() -- trim ioctl handle function
  * @sb:			superblock for filesystem
  * @range:		fstrim_range structure
+ * @blkdev_flags:	flags for the block device
  *
  * start:	First Byte to trim
  * len:		number of Bytes to trim from start
@@ -5156,7 +5163,8 @@ out:
  * start to start+len. For each such a group ext4_trim_all_free function
  * is invoked to trim all free space.
  */
-int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
+int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range,
+			unsigned long blkdev_flags)
 {
 	struct ext4_group_info *grp;
 	ext4_group_t group, first_group, last_group;
@@ -5212,7 +5220,7 @@ int ext4_trim_fs(struct super_block *sb, struct fstrim_range *range)
 
 		if (grp->bb_free >= minlen) {
 			cnt = ext4_trim_all_free(sb, group, first_cluster,
-						end, minlen);
+						end, minlen, blkdev_flags);
 			if (cnt < 0) {
 				ret = cnt;
 				break;
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index ca1a11b..a1816ca 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -156,6 +156,7 @@ struct inodes_stat_t {
 #define FIFREEZE	_IOWR('X', 119, int)	/* Freeze */
 #define FITHAW		_IOWR('X', 120, int)	/* Thaw */
 #define FITRIM		_IOWR('X', 121, struct fstrim_range)	/* Trim */
+#define FIDTRIM	_IOWR('X', 122, struct fstrim_range)	/* Deep discard trim */
 
 #define	FS_IOC_GETFLAGS			_IOR('f', 1, long)
 #define	FS_IOC_SETFLAGS			_IOW('f', 2, long)
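
For illustration only, here is a minimal userspace sketch of how a caller
might issue the new ioctl. It assumes a kernel with this patch applied.
deep_trim() is a hypothetical helper and is not part of the patch. FIDTRIM
is defined locally with the value from the fs.h hunk, since installed
headers may not carry it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* struct fstrim_range, FITRIM */

#ifndef FIDTRIM
/* Assumption: same number as in the patch; absent from stock headers. */
#define FIDTRIM	_IOWR('X', 122, struct fstrim_range)
#endif

/*
 * Hypothetical helper: ask the filesystem mounted at @mountpoint to
 * deep-discard its free space in [start, start + len).
 * Returns 0 and stores the number of bytes trimmed in *trimmed (if
 * non-NULL), or a negative errno value on failure.
 */
static int deep_trim(const char *mountpoint, uint64_t start, uint64_t len,
		     uint64_t minlen, uint64_t *trimmed)
{
	struct fstrim_range range = {
		.start = start,
		.len = len,
		.minlen = minlen,
	};
	int fd, ret;

	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -errno;

	ret = ioctl(fd, FIDTRIM, &range);
	if (ret < 0)
		ret = -errno;		/* save errno before close() */
	else if (trimmed)
		*trimmed = range.len;	/* kernel writes back bytes trimmed */

	close(fd);
	return ret;
}
```

A caller would typically pass start = 0 and len = UINT64_MAX to cover the
whole filesystem. With this patch, -EPERM is expected without CAP_SYS_ADMIN
and -EOPNOTSUPP when the device lacks discard or secure discard support. On
a kernel without the patch, ext4 would most likely reject the unknown ioctl
number (probably with -ENOTTY), in which case a caller could fall back to
plain FITRIM.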