From patchwork Sun May 13 17:56:24 2018
X-Patchwork-Submitter: Eric Whitney
X-Patchwork-Id: 912552
From: Eric Whitney
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, Eric Whitney
Subject: [RFC PATCH 5/5] ext4: don't release delalloc clusters when invalidating page
Date: Sun, 13 May 2018 13:56:24 -0400
Message-Id: <20180513175624.12887-6-enwlinux@gmail.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180513175624.12887-1-enwlinux@gmail.com>
References: <20180513175624.12887-1-enwlinux@gmail.com>

With the preceding two patches, it's now possible to accurately determine
the number of delayed-allocated clusters to be released when removing a
block range from the extent tree and the extents status tree in
ext4_ext_truncate(), ext4_punch_hole(), and ext4_collapse_range().  Since
it's not possible to do this when invalidating pages in
ext4_da_page_release_reservation(), remove those operations from it.  It's
still appropriate to clear the delayed bits for invalidated buffers there.

Removal of block ranges in ext4_ext_truncate() and the other functions
above makes removal at page invalidation time redundant, and removing a
block range as a single unit should generally incur less CPU overhead than
removing it page by page (block by block).

Note: this change does result in a regression for generic/036 when running
at least the 4k, bigalloc, and bigalloc_1k test cases using the
xfstests-bld test appliance.  The test passes, but new kernel error
messages of the form "Page cache invalidation failure on direct I/O.
Possible data corruption due to collision with buffered I/O!" appear in
dmesg.  It's likely this patch violates a direct I/O implementation
requirement, perhaps making DIO vulnerable to read races with buffered
I/O.  To be addressed.

Signed-off-by: Eric Whitney
---
 fs/ext4/inode.c | 51 +++++++++++++--------------------------------------
 1 file changed, 13 insertions(+), 38 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a55b4db4a29c..6a903a850d95 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1628,62 +1628,37 @@ void ext4_da_release_space(struct inode *inode, int to_free)
 	dquot_release_reservation_block(inode, EXT4_C2B(sbi, to_free));
 }
 
+/*
+ * This code doesn't work and needs to be replaced. Not deleting delayed
+ * blocks from the extents status tree here and deferring until blocks
+ * are deleted from the extent tree causes problems for DIO. One avenue
+ * to explore is a partial reversion of the code here, altering the
+ * calls to delete blocks from the extents status tree to calls to
+ * invalidate those blocks, hopefully avoiding problems with the DIO code.
+ * They would then be deleted and accounted for when the extent tree is
+ * modified.
+ */
 static void ext4_da_page_release_reservation(struct page *page,
 					     unsigned int offset,
 					     unsigned int length)
 {
-	int to_release = 0, contiguous_blks = 0;
 	struct buffer_head *head, *bh;
 	unsigned int curr_off = 0;
-	struct inode *inode = page->mapping->host;
-	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	unsigned int stop = offset + length;
-	int num_clusters;
-	ext4_fsblk_t lblk;
+	unsigned int next_off;
 
 	BUG_ON(stop > PAGE_SIZE || stop < length);
 
 	head = page_buffers(page);
 	bh = head;
 	do {
-		unsigned int next_off = curr_off + bh->b_size;
-
+		next_off = curr_off + bh->b_size;
 		if (next_off > stop)
 			break;
-
-		if ((offset <= curr_off) && (buffer_delay(bh))) {
-			to_release++;
-			contiguous_blks++;
+		if ((offset <= curr_off) && (buffer_delay(bh)))
 			clear_buffer_delay(bh);
-		} else if (contiguous_blks) {
-			lblk = page->index <<
-			       (PAGE_SHIFT - inode->i_blkbits);
-			lblk += (curr_off >> inode->i_blkbits) -
-				contiguous_blks;
-			ext4_es_remove_extent(inode, lblk, contiguous_blks);
-			contiguous_blks = 0;
-		}
 		curr_off = next_off;
 	} while ((bh = bh->b_this_page) != head);
-
-	if (contiguous_blks) {
-		lblk = page->index << (PAGE_SHIFT - inode->i_blkbits);
-		lblk += (curr_off >> inode->i_blkbits) - contiguous_blks;
-		ext4_es_remove_extent(inode, lblk, contiguous_blks);
-	}
-
-	/* If we have released all the blocks belonging to a cluster, then we
-	 * need to release the reserved space for that cluster. */
-	num_clusters = EXT4_NUM_B2C(sbi, to_release);
-	while (num_clusters > 0) {
-		lblk = (page->index << (PAGE_SHIFT - inode->i_blkbits)) +
-			((num_clusters - 1) << sbi->s_cluster_bits);
-		if (sbi->s_cluster_ratio == 1 ||
-		    !ext4_find_delalloc_cluster(inode, lblk))
-			ext4_da_release_space(inode, 1);
-
-		num_clusters--;
-	}
 }
 
 /*
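
For reviewers' reference, here is a sketch of how ext4_da_page_release_reservation()
reads once the hunk above is applied.  It is reconstructed from the diff itself;
whitespace is approximate, and the annotation comments are mine rather than part of
the patch.  The function now only walks the page's buffers and clears the delayed
bit on invalidated buffers; the extents status tree updates and per-cluster
reservation releases are deferred to the extent tree removal paths named in the
commit message.

static void ext4_da_page_release_reservation(struct page *page,
					     unsigned int offset,
					     unsigned int length)
{
	struct buffer_head *head, *bh;
	unsigned int curr_off = 0;
	unsigned int stop = offset + length;
	unsigned int next_off;

	BUG_ON(stop > PAGE_SIZE || stop < length);

	/* Walk the page's buffers covering [offset, offset + length). */
	head = page_buffers(page);
	bh = head;
	do {
		next_off = curr_off + bh->b_size;
		if (next_off > stop)
			break;
		/*
		 * Only the delayed bit is cleared on an invalidated buffer;
		 * its extents status tree entry and its cluster reservation
		 * are left in place, to be released when the block range is
		 * removed from the extent tree.
		 */
		if ((offset <= curr_off) && (buffer_delay(bh)))
			clear_buffer_delay(bh);
		curr_off = next_off;
	} while ((bh = bh->b_this_page) != head);
}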