From patchwork Mon Dec 24 07:55:40 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zheng Liu
X-Patchwork-Id: 208036
From: Zheng Liu
To: linux-ext4@vger.kernel.org
Cc: Zheng Liu
Subject: [RFC][PATCH 7/9 v1] ext4: add a new convert function to convert an
 unwritten extent in extent status tree
Date: Mon, 24 Dec 2012 15:55:40 +0800
Message-Id: <1356335742-11793-8-git-send-email-wenqing.lz@taobao.com>
X-Mailer: git-send-email 1.7.12.rc2.18.g61b472e
In-Reply-To: <1356335742-11793-1-git-send-email-wenqing.lz@taobao.com>
References: <1356335742-11793-1-git-send-email-wenqing.lz@taobao.com>
X-Mailing-List: linux-ext4@vger.kernel.org

From: Zheng Liu

A new function, ext4_es_convert_unwritten_extents(), is defined to convert
a range of unwritten extents to written in the extent status tree.  It is
intended to improve unwritten extent conversion in DIO end_io.  Meanwhile,
all users of i_es_lock now save and restore irq flags, because DIO end_io
runs in irq context.

Signed-off-by: Zheng Liu
---
 fs/ext4/extents_status.c | 161 ++++++++++++++++++++++++++++++++++++++++++++---
 fs/ext4/extents_status.h |   2 +
 2 files changed, 155 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index ccd940c..9db9e05 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -239,10 +239,11 @@ ext4_lblk_t ext4_es_find_extent(struct inode *inode, struct extent_status *es)
 	struct extent_status *es1 = NULL;
 	struct rb_node *node;
 	ext4_lblk_t ret = EXT_MAX_BLOCKS;
+	unsigned long flags;
 
 	trace_ext4_es_find_extent_enter(inode, es->es_lblk);
 
-	read_lock(&EXT4_I(inode)->i_es_lock);
+	read_lock_irqsave(&EXT4_I(inode)->i_es_lock, flags);
 	tree = &EXT4_I(inode)->i_es_tree;
 
 	/* find delay extent in cache firstly */
@@ -273,7 +274,7 @@ out:
 		}
 	}
 
-	read_unlock(&EXT4_I(inode)->i_es_lock);
+	read_unlock_irqrestore(&EXT4_I(inode)->i_es_lock, flags);
 
 	trace_ext4_es_find_extent_exit(inode, es, ret);
 	return ret;
@@ -426,6 +427,7 @@ int ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	struct ext4_es_tree *tree;
 	struct extent_status newes;
 	ext4_lblk_t end = lblk + len - 1;
+	unsigned long flags;
 	int err = 0;
 
 	es_debug("add [%u/%u) %llu %d to extent status tree of inode %lu\n",
@@ -439,7 +441,7 @@ int ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	newes.es_status = status;
 	trace_ext4_es_insert_extent(inode, &newes);
 
-	write_lock(&EXT4_I(inode)->i_es_lock);
+	write_lock_irqsave(&EXT4_I(inode)->i_es_lock, flags);
 	tree = &EXT4_I(inode)->i_es_tree;
 	err = __es_remove_extent(tree, lblk, end);
 	if (err != 0)
@@ -447,7 +449,7 @@ int ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	err = __es_insert_extent(tree, &newes);
 
 error:
-	write_unlock(&EXT4_I(inode)->i_es_lock);
+	write_unlock_irqrestore(&EXT4_I(inode)->i_es_lock, flags);
 
 	ext4_es_print_tree(inode);
 
@@ -466,12 +468,13 @@ int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es)
 	struct ext4_es_tree *tree;
 	struct extent_status *es1;
 	struct rb_node *node;
+	unsigned long flags;
 	int found = 0;
 
 	es_debug("lookup extent in block %u\n", es->es_lblk);
 
 	tree = &EXT4_I(inode)->i_es_tree;
-	read_lock(&EXT4_I(inode)->i_es_lock);
+	read_lock_irqsave(&EXT4_I(inode)->i_es_lock, flags);
 
 	/* find delay extent in cache firstly */
 	if (tree->cache_es) {
@@ -506,7 +509,7 @@ out:
 		es->es_status = es1->es_status;
 	}
 
-	read_unlock(&EXT4_I(inode)->i_es_lock);
+	read_unlock_irqrestore(&EXT4_I(inode)->i_es_lock, flags);
 
 	return found;
 }
@@ -605,6 +608,7 @@ int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 {
 	struct ext4_es_tree *tree;
 	ext4_lblk_t end;
+	unsigned long flags;
 	int err = 0;
 
 	trace_ext4_es_remove_extent(inode, lblk, len);
@@ -616,9 +620,150 @@ int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 
 	tree = &EXT4_I(inode)->i_es_tree;
 
-	write_lock(&EXT4_I(inode)->i_es_lock);
+	write_lock_irqsave(&EXT4_I(inode)->i_es_lock, flags);
 	err = __es_remove_extent(tree, lblk, end);
-	write_unlock(&EXT4_I(inode)->i_es_lock);
+	write_unlock_irqrestore(&EXT4_I(inode)->i_es_lock, flags);
+	ext4_es_print_tree(inode);
+	return err;
+}
+
+int ext4_es_convert_unwritten_extents(struct inode *inode, loff_t offset,
+				      size_t size)
+{
+	struct ext4_es_tree *tree;
+	struct rb_node *node;
+	struct extent_status *es, orig_es, conv_es;
+	ext4_lblk_t end, len1, len2;
+	ext4_lblk_t lblk = 0, len = 0;
+	unsigned long flags;
+	unsigned int blkbits;
+	int err = 0;
+
+	/* add trace point and debug */
+	blkbits = inode->i_blkbits;
+	lblk = offset >> blkbits;
+	len = (EXT4_BLOCK_ALIGN(offset + size, blkbits) >> blkbits) - lblk;
+
+	end = lblk + len - 1;
+	BUG_ON(end < lblk);
+
+	tree = &EXT4_I(inode)->i_es_tree;
+
+	write_lock_irqsave(&EXT4_I(inode)->i_es_lock, flags);
+
+	es = __es_tree_search(&tree->root, lblk);
+	if (!es)
+		goto out;
+	if (es->es_lblk > end)
+		goto out;
+
+	tree->cache_es = NULL;
+
+	orig_es.es_lblk = es->es_lblk;
+	orig_es.es_len = es->es_len;
+	orig_es.es_pblk = es->es_pblk;
+	orig_es.es_status = es->es_status;
+
+	len1 = lblk > es->es_lblk ? lblk - es->es_lblk : 0;
+	len2 = extent_status_end(es) > end ?
+	       extent_status_end(es) - end : 0;
+	if (len1 > 0)
+		es->es_len = len1;
+	if (len2 > 0) {
+		if (len1 > 0) {
+			struct extent_status newes;
+
+			newes.es_lblk = end + 1;
+			newes.es_len = len2;
+			newes.es_pblk = orig_es.es_pblk + orig_es.es_len - len2;
+			newes.es_status = orig_es.es_status;
+			/*BUG_ON(newes.es_status != EXTENT_STATUS_UNWRITTEN);*/
+			err = __es_insert_extent(tree, &newes);
+			if (err) {
+				es->es_lblk = orig_es.es_lblk;
+				es->es_len = orig_es.es_len;
+				goto out;
+			}
+
+			conv_es.es_lblk = orig_es.es_lblk + len1;
+			conv_es.es_len = orig_es.es_len - len1 - len2;
+			conv_es.es_pblk = orig_es.es_pblk + len1;
+			conv_es.es_status = EXTENT_STATUS_WRITTEN;
+			err = __es_insert_extent(tree, &conv_es);
+			if (err) {
+				int err2;
+				err2 = __es_remove_extent(tree, newes.es_lblk,
+						extent_status_end(&newes));
+				if (err2)
+					goto out;
+				es->es_lblk = orig_es.es_lblk;
+				es->es_len = orig_es.es_len;
+				goto out;
+			}
+		} else {
+			es->es_lblk = end + 1;
+			es->es_len = len2;
+			es->es_pblk = orig_es.es_pblk + orig_es.es_len - len2;
+			/*BUG_ON(newes.es_status != EXTENT_STATUS_UNWRITTEN);*/
+
+			conv_es.es_lblk = orig_es.es_lblk;
+			conv_es.es_len = orig_es.es_len - len2;
+			conv_es.es_pblk = orig_es.es_pblk;
+			conv_es.es_status = EXTENT_STATUS_WRITTEN;
+			err = __es_insert_extent(tree, &conv_es);
+			if (err) {
+				es->es_lblk = orig_es.es_lblk;
+				es->es_len = orig_es.es_len;
+				es->es_pblk = orig_es.es_pblk;
+			}
+		}
+
+		goto out;
+	}
+
+	if (len1 > 0) {
+		node = rb_next(&es->rb_node);
+		if (node)
+			es = rb_entry(node, struct extent_status, rb_node);
+		else
+			es = NULL;
+	}
+
+	while (es && extent_status_end(es) <= end) {
+		node = rb_next(&es->rb_node);
+		es->es_status = EXTENT_STATUS_WRITTEN;
+		if (!node) {
+			es = NULL;
+			break;
+		}
+		es = rb_entry(node, struct extent_status, rb_node);
+	}
+
+	if (es && es->es_lblk < end + 1) {
+		ext4_lblk_t orig_len = es->es_len;
+
+		/*
+		 * Set conv_es first so that the value of es does not have
+		 * to be copied to a temporary variable.
+		 */
+		len1 = extent_status_end(es) - end;
+		conv_es.es_lblk = es->es_lblk;
+		conv_es.es_len = es->es_len - len1;
+		conv_es.es_pblk = es->es_pblk;
+		conv_es.es_status = EXTENT_STATUS_WRITTEN;
+
+		es->es_lblk = end + 1;
+		es->es_len = len1;
+		es->es_pblk = es->es_pblk + orig_len - len1;
+
+		err = __es_insert_extent(tree, &conv_es);
+		if (err)
+			goto out;
+	}
+
+out:
+	write_unlock_irqrestore(&EXT4_I(inode)->i_es_lock, flags);
+	ext4_es_print_tree(inode);
 	return err;
 }
 
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 1890f80..9069ecf 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -51,6 +51,8 @@ extern int ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 extern ext4_lblk_t ext4_es_find_extent(struct inode *inode,
 				       struct extent_status *es);
 extern int ext4_es_lookup_extent(struct inode *inode, struct extent_status *es);
+extern int ext4_es_convert_unwritten_extents(struct inode *inode,
+					     loff_t offset, size_t size);
 
 static inline int ext4_es_is_written(struct extent_status *es)
 {
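
For readers following the len1/len2 arithmetic in ext4_es_convert_unwritten_extents(), here is a small user-space sketch of how a single extent is split around the converted range [lblk, end]. It is not kernel code and not part of the patch. The names es_range, es_split, es_range_end() and es_split_range() are hypothetical stand-ins. The sketch assumes that extent_status_end() returns the last block of the extent (inclusive), which is how the patch appears to use it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct extent_status. */
struct es_range {
	uint32_t lblk;	/* first logical block */
	uint32_t len;	/* number of blocks */
	uint64_t pblk;	/* first physical block */
};

/* Result of splitting one extent around a converted range. */
struct es_split {
	struct es_range head;	/* stays unwritten; len may be 0 */
	struct es_range conv;	/* becomes written */
	struct es_range tail;	/* stays unwritten; len may be 0 */
};

/* Last logical block of the extent, inclusive (assumed semantics). */
static uint32_t es_range_end(const struct es_range *es)
{
	return es->lblk + es->len - 1;
}

/*
 * Split one extent around [lblk, end].  The caller guarantees that the
 * extent is non-empty and overlaps the range.
 */
static struct es_split es_split_range(const struct es_range *es,
				      uint32_t lblk, uint32_t end)
{
	struct es_split s;
	uint32_t len1, len2;

	assert(end >= lblk);	/* mirrors BUG_ON(end < lblk) */
	assert(es->len > 0 && es->lblk <= end && es_range_end(es) >= lblk);

	/* Blocks before the range and blocks after it. */
	len1 = lblk > es->lblk ? lblk - es->lblk : 0;
	len2 = es_range_end(es) > end ? es_range_end(es) - end : 0;

	s.head = (struct es_range){ es->lblk, len1, es->pblk };
	s.conv = (struct es_range){ es->lblk + len1,
				    es->len - len1 - len2,
				    es->pblk + len1 };
	s.tail = (struct es_range){ es->lblk + es->len - len2, len2,
				    es->pblk + es->len - len2 };
	return s;
}
```

For an extent covering logical blocks 10..29 at physical block 1000, converting [15, 19] leaves a 5-block head at 10, a 5-block written extent at 15, and a 10-block tail at 20. The tail starts at end + 1 with its physical block advanced by the same amount, matching newes in the patch.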