From patchwork Fri Jul 15 21:41:54 2011
X-Patchwork-Submitter: Dan Ehrenberg
X-Patchwork-Id: 104914
From: Dan Ehrenberg
To: "Theodore Ts'o", Andreas Dilger, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, Eric Sandeen
Cc: Dan Ehrenberg
Subject: [PATCH v2 1/2] ext4: Preallocation is a multiple of stripe size
Date: Fri, 15 Jul 2011
14:41:54 -0700
Message-Id: <1310766115-4164-1-git-send-email-dehrenberg@google.com>
X-Mailer: git-send-email 1.7.3.1
List-ID: linux-ext4@vger.kernel.org

Previously, if a stripe width was provided, then it would be used as
the preallocation granularity, with no sanity checking and no way to
override this. Now, mb_prealloc_size defaults to the smallest multiple
of stripe size that is greater than or equal to the old default
mb_prealloc_size, and this can be overridden with the sysfs interface.

Signed-off-by: Dan Ehrenberg
---
 fs/ext4/mballoc.c |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 6ed859d..754eb29 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -128,12 +128,13 @@
  * we are doing a group prealloc we try to normalize the request to
  * sbi->s_mb_group_prealloc. Default value of s_mb_group_prealloc is
  * 512 blocks. This can be tuned via
- * /sys/fs/ext4//mb_group_prealloc. The value is represented in
+ * /sys/fs/ext4//mb_group_prealloc. The value is represented in
  * terms of number of blocks. If we have mounted the file system with -O
  * stripe= option the group prealloc request is normalized to the
- * stripe value (sbi->s_stripe)
+ * the smallest multiple of the stripe value (sbi->s_stripe) which is
+ * greater than the default mb_group_prealloc.
  *
- * The regular allocator(using the buddy cache) supports few tunables.
+ * The regular allocator (using the buddy cache) supports a few tunables.
 *
 * /sys/fs/ext4//mb_min_to_scan
 * /sys/fs/ext4//mb_max_to_scan
@@ -2472,6 +2473,18 @@ int ext4_mb_init(struct super_block *sb, int needs_recovery)
 	sbi->s_mb_stream_request = MB_DEFAULT_STREAM_THRESHOLD;
 	sbi->s_mb_order2_reqs = MB_DEFAULT_ORDER2_REQS;
 	sbi->s_mb_group_prealloc = MB_DEFAULT_GROUP_PREALLOC;
+	/*
+	 * If there is a s_stripe > 1, then we set the s_mb_group_prealloc
+	 * to the lowest multiple of s_stripe which is bigger than
+	 * the s_mb_group_prealloc as determined above. We want
+	 * the preallocation size to be an exact multiple of the
+	 * RAID stripe size so that preallocations don't fragment
+	 * the stripes.
+	 */
+	if (sbi->s_stripe > 1) {
+		sbi->s_mb_group_prealloc = roundup(
+			sbi->s_mb_group_prealloc, sbi->s_stripe);
+	}

 	sbi->s_locality_groups = alloc_percpu(struct ext4_locality_group);
 	if (sbi->s_locality_groups == NULL) {
@@ -2830,8 +2843,9 @@ out_err:

 /*
  * here we normalize request for locality group
- * Group request are normalized to s_strip size if we set the same via mount
- * option. If not we set it to s_mb_group_prealloc which can be configured via
+ * Group request are normalized to s_mb_group_prealloc, which goes to
+ * s_strip if we set the same via mount option.
+ * s_mb_group_prealloc can be configured via
  * /sys/fs/ext4//mb_group_prealloc
  *
  * XXX: should we try to preallocate more than the group has now?
@@ -2842,10 +2856,7 @@ static void ext4_mb_normalize_group_request(struct ext4_allocation_context *ac)
 	struct ext4_locality_group *lg = ac->ac_lg;

 	BUG_ON(lg == NULL);
-	if (EXT4_SB(sb)->s_stripe)
-		ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_stripe;
-	else
-		ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc;
+	ac->ac_g_ex.fe_len = EXT4_SB(sb)->s_mb_group_prealloc;
 	mb_debug(1, "#%u: goal %u blocks for locality group\n",
 		 current->pid, ac->ac_g_ex.fe_len);
 }