From patchwork Fri Sep 28 15:44:10 2012
X-Patchwork-Submitter: Dmitry Monakhov
X-Patchwork-Id: 187817
From: Dmitry Monakhov
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, jack@suse.cz, lczerner@redhat.com, Dmitry Monakhov
Subject: [PATCH 10/11] ext4: punch_hole should wait for DIO writers V2
Date: Fri, 28 Sep 2012 19:44:10 +0400
Message-Id: <1348847051-6746-11-git-send-email-dmonakhov@openvz.org>
X-Mailer: git-send-email 1.7.7.6
In-Reply-To: <1348847051-6746-1-git-send-email-dmonakhov@openvz.org>
References: <1348847051-6746-1-git-send-email-dmonakhov@openvz.org>

punch_hole is the place where we have to wait for all existing writers
(writeback, aio, dio), but currently we simply flush pending end_io
requests, which is not sufficient. Another issue is that punch_hole is
performed without i_mutex held, which obviously results in dangerous
data corruption due to write-after-free.

This patch makes the following changes:
 - Guard punch_hole with i_mutex
 - Recheck inode flags under i_mutex
 - Block all new dio readers in order to prevent an information leak
   caused by the read-after-free pattern
 - punch_hole now waits for all writers in flight

NOTE: XXX: A write-after-free race is still possible because new dirty
pages may appear due to mmap(), and currently there is no easy way to
stop writeback while punch_hole is in progress.
Changes from V1:
 - Add flag checks once we hold i_mutex

Signed-off-by: Dmitry Monakhov
---
 fs/ext4/extents.c |   50 +++++++++++++++++++++++++++++++++-----------------
 1 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 70ba122..a1d16eb 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4568,9 +4568,33 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 	loff_t first_page_offset, last_page_offset;
 	int credits, err = 0;
 
+	/*
+	 * Write out all dirty pages to avoid race conditions
+	 * Then release them.
+	 */
+	if (mapping->nrpages && mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
+		err = filemap_write_and_wait_range(mapping,
+			offset, offset + length - 1);
+
+		if (err)
+			return err;
+	}
+
+	mutex_lock(&inode->i_mutex);
+	/* Need to recheck file flags under i_mutex */
+	/* It is not possible to punch a hole in an append-only file */
+	if (IS_APPEND(inode) || IS_IMMUTABLE(inode)) {
+		err = -EPERM;
+		goto out_mutex;
+	}
+	if (IS_SWAPFILE(inode)) {
+		err = -ETXTBSY;
+		goto out_mutex;
+	}
+
 	/* No need to punch hole beyond i_size */
 	if (offset >= inode->i_size)
-		return 0;
+		goto out_mutex;
 
 	/*
 	 * If the hole extends beyond i_size, set the hole
@@ -4588,33 +4612,25 @@ int ext4_ext_punch_hole(struct file *file, loff_t offset, loff_t length)
 	first_page_offset = first_page << PAGE_CACHE_SHIFT;
 	last_page_offset = last_page << PAGE_CACHE_SHIFT;
 
-	/*
-	 * Write out all dirty pages to avoid race conditions
-	 * Then release them.
-	 */
-	if (mapping->nrpages && mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
-		err = filemap_write_and_wait_range(mapping,
-			offset, offset + length - 1);
-
-		if (err)
-			return err;
-	}
-
 	/* Now release the pages */
 	if (last_page_offset > first_page_offset) {
 		truncate_pagecache_range(inode, first_page_offset,
					 last_page_offset - 1);
 	}
 
-	/* finish any pending end_io work */
+	/* Wait all existing dio workers, newcomers will block on i_mutex */
+	ext4_inode_block_unlocked_dio(inode);
+	inode_dio_wait(inode);
 	err = ext4_flush_completed_IO(inode);
 	if (err)
-		return err;
+		goto out_dio;
 
 	credits = ext4_writepage_trans_blocks(inode);
 	handle = ext4_journal_start(inode, credits);
-	if (IS_ERR(handle))
-		return PTR_ERR(handle);
+	if (IS_ERR(handle)) {
+		err = PTR_ERR(handle);
+		goto out_dio;
+	}
 
 	/*
@@ -4706,6 +4722,10 @@ out:
 	inode->i_mtime = inode->i_ctime = ext4_current_time(inode);
 	ext4_mark_inode_dirty(handle, inode);
 	ext4_journal_stop(handle);
+out_dio:
+	ext4_inode_resume_unlocked_dio(inode);
+out_mutex:
+	mutex_unlock(&inode->i_mutex);
 	return err;
 }
 
 int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,