From patchwork Thu Mar 21 15:50:45 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lukas Czerner
X-Patchwork-Id: 229732
From: Lukas Czerner
To: linux-ext4@vger.kernel.org
Cc: gharm@google.com, Lukas Czerner
Subject: [PATCH] ext4: Do not normalize request from fallocate
Date: Thu, 21 Mar 2013 16:50:45 +0100
Message-Id: <1363881045-21673-1-git-send-email-lczerner@redhat.com>
Sender: linux-ext4-owner@vger.kernel.org
X-Mailing-List: linux-ext4@vger.kernel.org

Block requests from fallocate were originally normalized. Commit
556b27abf73833923d5cd4be80006292e1b31662 changed this so that they were
not normalized, and commit 3c6fe77017bc6ce489f231c35fed3220b6691836
changed it again so that they were. The fact is that we _never_ want to
normalize a request from fallocate.
We know exactly how much space we're going to use, we do not want anyone
to mess with the request, and there is no point in doing so.

Commit 3c6fe77017bc6ce489f231c35fed3220b6691836 mentioned that large
fallocate requests were not physically contiguous. However, it is
important to see why that is the case. Because the request is so big,
the allocator tries to find a free group to allocate from, skipping
block groups which are in use, which is fine. However, it will only
allocate extents of 2^15-1 blocks (the limit on uninitialized extent
size), which leaves one block free in each block group. This makes the
extent tree physically non-contiguous, but _only_ by one block, which is
perfectly fine.

This never happens when we normalize the request, because for some
reason (maybe a bug) it is normalized to a much smaller request (2048
blocks), and those extents are then merged together without leaving any
free block in between, hence physically contiguous. However, the fact
that we're splitting huge requests into a ton of smaller ones and then
merging the extents together is very _very_ bad for fallocate
performance.

The situation is even worse since commit
ec22ba8edb507395c95fbc617eea26a6b2d98797, because we no longer merge
uninitialized extents, so we end up with an absolutely _huge_ extent
tree for bigger fallocate requests. That is bad for performance not only
during fallocate itself, but also when working with the file later on.

Fix this by disabling normalization for fallocate. From my simple
testing, with this commit fallocate is much faster on a non-fragmented
file system. On my system, fallocate of 15T is almost 3x faster with
this patch and removing the file is almost 2x faster (tested on real
hardware).

Signed-off-by: Lukas Czerner
Reviewed-by: Dmitry Monakhov
---
 fs/ext4/extents.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e2bb929..a40a602 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4422,16 +4422,18 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 		trace_ext4_fallocate_exit(inode, offset, max_blocks, ret);
 		return ret;
 	}
-	flags = EXT4_GET_BLOCKS_CREATE_UNINIT_EXT;
-	if (mode & FALLOC_FL_KEEP_SIZE)
-		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
+
 	/*
-	 * Don't normalize the request if it can fit in one extent so
-	 * that it doesn't get unnecessarily split into multiple
-	 * extents.
+	 * We do NOT want the requests from fallocate to be normalized
+	 * ever!. We know exactly how much we want to allocate and
+	 * we do not need to do any mumbo-jumbo with it. Requests bigger
+	 * than uninit extent size, will be divided automatically into
+	 * biggest possible extents.
 	 */
-	if (len <= EXT_UNINIT_MAX_LEN << blkbits)
-		flags |= EXT4_GET_BLOCKS_NO_NORMALIZE;
+	flags = EXT4_GET_BLOCKS_CREATE_UNINIT_EXT |
+		EXT4_GET_BLOCKS_NO_NORMALIZE;
+	if (mode & FALLOC_FL_KEEP_SIZE)
+		flags |= EXT4_GET_BLOCKS_KEEP_SIZE;
 retry:
 	while (ret >= 0 && ret < max_blocks) {
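
The extent-count argument in the commit message can be checked with a
little arithmetic. The following is only a rough sketch, not kernel
code. It assumes 4 KiB file system blocks, reads "15T" as 15 TiB, and
assumes that no extents are merged (the situation the message describes
after ec22ba8edb507395c95fbc617eea26a6b2d98797). The macro and function
names are made up for this illustration and do not exist in fs/ext4.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the names are hypothetical, not kernel symbols. */
#define SKETCH_BLOCK_SIZE        4096ULL             /* assumed 4 KiB blocks */
#define SKETCH_UNINIT_EXTENT_MAX ((1ULL << 15) - 1)  /* 2^15-1 blocks, from the commit message */
#define SKETCH_NORMALIZED_CHUNK  2048ULL             /* normalized request size, from the commit message */

/* Convert a length in bytes to a block count, rounding up. */
static uint64_t sketch_bytes_to_blocks(uint64_t len_bytes)
{
	return (len_bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/*
 * Number of extents needed when each allocation covers at most
 * chunk_blocks blocks and nothing is merged afterwards.
 */
static uint64_t sketch_extents_needed(uint64_t total_blocks, uint64_t chunk_blocks)
{
	assert(chunk_blocks > 0);
	return (total_blocks + chunk_blocks - 1) / chunk_blocks;
}
```

Under these assumptions, sketch_bytes_to_blocks(15ULL << 40) gives
4026531840 blocks. Splitting that into 2048-block requests needs 1966080
extents, while maximum-size uninitialized extents need 122884, roughly
16 times fewer. This is only an estimate of the extent tree size; it
says nothing about the timings reported above.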