From patchwork Sat May 27 13:28:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Zhuravlev
X-Patchwork-Id: 1786743
From: Alex Zhuravlev
To: "linux-ext4@vger.kernel.org"
Subject: [RFC] merge extent blocks when possible
Date: Sat, 27 May 2023 13:28:38 +0000
Message-ID: <7A2B8861-96AA-4815-BB58-180F63F62436@whamcloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Hi,

Please have a look at this patch, which attempts to handle the problem of a
deep extent tree. There are cases (corner cases, but still) where a lot of
extents are created initially and then get merged over time, but there is no
way to merge the extent blocks themselves. Here is a simple example: a file is
written synchronously, all even blocks first, then the odd blocks. You may end
up with an extent tree like this (data from debugfs):

EXTENTS:
(ETB0):33796 (ETB1):33795 (0-677):2588672-2589349 (ETB1):2590753
(678):2589350 (ETB1):2590720 (679-1357):2589351-2590029 (ETB1):2590752
(1358):2590030 (ETB1):2590721 (1359-2037):2590031-2590709 (ETB1):2590751
(2038):2590710 (ETB1):2590722 (2039-2047):2590711-2590719
(2048-2717):2592768-2593437 (ETB1):2590750 (2718):2593438 (ETB1):2590723
(2719-3397):2593439-2594117 (ETB1):2590749 (3398):2594118 (ETB1):2590724
(3399-4077):2594119-2594797 (ETB1):2590748 (4078):2594798 (ETB1):2590725
(4079-4757):2594799-2595477 (ETB1):2590747 (4758):2595478 (ETB1):2590726
…

Notice that most of the leaf blocks hold just a single extent, which doesn’t
look very optimal. With the patch applied (0.6% slower):

EXTENTS:
(ETB0):33796 (ETB1):2590736 (0-2047):2588672-2590719 (2048-11999):2592768-2602719

Originally the problem was hit with a real application operating on huge
datasets: with just 27371 extents, the "inode has invalid extent depth: 6"
error occurred. With the patch applied the application succeeded, ending up
with 73637 extents in a 3-level tree.

Thanks, Alex

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 35703dce23a3..8a885ef73509 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -1885,7 +1885,7 @@ static void ext4_ext_try_to_merge_up(handle_t *handle,
  * This function tries to merge the @ex extent to neighbours in the tree, then
  * tries to collapse the extent tree into the inode.
  */
-static void ext4_ext_try_to_merge(handle_t *handle,
+static int ext4_ext_try_to_merge(handle_t *handle,
 				  struct inode *inode,
 				  struct ext4_ext_path *path,
 				  struct ext4_extent *ex)
@@ -1902,9 +1902,177 @@ static void ext4_ext_try_to_merge(handle_t *handle,
 		merge_done = ext4_ext_try_to_merge_right(inode, path, ex - 1);
 
 	if (!merge_done)
-		(void) ext4_ext_try_to_merge_right(inode, path, ex);
+		merge_done = ext4_ext_try_to_merge_right(inode, path, ex);
 
 	ext4_ext_try_to_merge_up(handle, inode, path);
+
+	return merge_done;
+}
+
+/*
+ * This function tries to merge blocks from @path into @npath
+ */
+static int ext4_ext_merge_blocks(handle_t *handle,
+				struct inode *inode,
+				struct ext4_ext_path *path,
+				struct ext4_ext_path *npath)
+{
+	unsigned int depth = ext_depth(inode);
+	int used, nused, free, i, k, err;
+	ext4_lblk_t next;
+
+	if (path[depth].p_hdr == npath[depth].p_hdr)
+		return 0;
+
+	used = le16_to_cpu(path[depth].p_hdr->eh_entries);
+	free = le16_to_cpu(npath[depth].p_hdr->eh_max) -
+		le16_to_cpu(npath[depth].p_hdr->eh_entries);
+	if (free < used)
+		return 0;
+
+	err = ext4_ext_get_access(handle, inode, path + depth);
+	if (err)
+		return err;
+	err = ext4_ext_get_access(handle, inode, npath + depth);
+	if (err)
+		return err;
+
+	/* move entries from the current leaf to the next one */
+	nused = le16_to_cpu(npath[depth].p_hdr->eh_entries);
+	memmove(EXT_FIRST_EXTENT(npath[depth].p_hdr) + used,
+		EXT_FIRST_EXTENT(npath[depth].p_hdr),
+		nused * sizeof(struct ext4_extent));
+	memcpy(EXT_FIRST_EXTENT(npath[depth].p_hdr),
+		EXT_FIRST_EXTENT(path[depth].p_hdr),
+		used * sizeof(struct ext4_extent));
+	le16_add_cpu(&npath[depth].p_hdr->eh_entries, used);
+	le16_add_cpu(&path[depth].p_hdr->eh_entries, -used);
+	ext4_ext_try_to_merge_right(inode, npath,
+					EXT_FIRST_EXTENT(npath[depth].p_hdr));
+
+	err = ext4_ext_dirty(handle, inode, path + depth);
+	if (err)
+		return err;
+	err = ext4_ext_dirty(handle, inode, npath + depth);
+	if (err)
+		return err;
+
+	/* otherwise the index won't get corrected */
+	npath[depth].p_ext = EXT_FIRST_EXTENT(npath[depth].p_hdr);
+	err = ext4_ext_correct_indexes(handle, inode, npath);
+	if (err)
+		return err;
+
+	for (i = depth - 1; i >= 0; i--) {
+
+		next = ext4_idx_pblock(path[i].p_idx);
+		ext4_free_blocks(handle, inode, NULL, next, 1,
+				EXT4_FREE_BLOCKS_METADATA |
+				EXT4_FREE_BLOCKS_FORGET);
+		err = ext4_ext_get_access(handle, inode, path + i);
+		if (err)
+			return err;
+		le16_add_cpu(&path[i].p_hdr->eh_entries, -1);
+		if (le16_to_cpu(path[i].p_hdr->eh_entries) == 0) {
+			/* whole index block collapsed, go up */
+			continue;
+		}
+		/* remove index pointer */
+		used = EXT_LAST_INDEX(path[i].p_hdr) - path[i].p_idx + 1;
+		memmove(path[i].p_idx, path[i].p_idx + 1,
+			used * sizeof(struct ext4_extent_idx));
+
+		err = ext4_ext_dirty(handle, inode, path + i);
+		if (err)
+			return err;
+
+		if (path[i].p_hdr == npath[i].p_hdr)
+			break;
+
+		/* try to move index pointers */
+		used = le16_to_cpu(path[i].p_hdr->eh_entries);
+		free = le16_to_cpu(npath[i].p_hdr->eh_max) -
+			le16_to_cpu(npath[i].p_hdr->eh_entries);
+		if (used > free)
+			break;
+		err = ext4_ext_get_access(handle, inode, npath + i);
+		if (err)
+			return err;
+		memmove(EXT_FIRST_INDEX(npath[i].p_hdr) + used,
+			EXT_FIRST_INDEX(npath[i].p_hdr),
+			npath[i].p_hdr->eh_entries * sizeof(struct ext4_extent_idx));
+		memcpy(EXT_FIRST_INDEX(npath[i].p_hdr), EXT_FIRST_INDEX(path[i].p_hdr),
+			used * sizeof(struct ext4_extent_idx));
+		le16_add_cpu(&path[i].p_hdr->eh_entries, -used);
+		le16_add_cpu(&npath[i].p_hdr->eh_entries, used);
+		err = ext4_ext_dirty(handle, inode, path + i);
+		if (err)
+			return err;
+		err = ext4_ext_dirty(handle, inode, npath + i);
+		if (err)
+			return err;
+
+		/* correct index above */
+		for (k = i; k > 0; k--) {
+			err = ext4_ext_get_access(handle, inode, npath + k - 1);
+			if (err)
+				return err;
+			npath[k-1].p_idx->ei_block =
+				EXT_FIRST_INDEX(npath[k].p_hdr)->ei_block;
+			err = ext4_ext_dirty(handle, inode, npath + k - 1);
+			if (err)
+				return err;
+		}
+	}
+
+	/*
+	 * TODO: given we've got two paths, it should be possible to
+	 * collapse those two blocks into the root one in some cases
+	 */
+	return 1;
+}
+
+static int ext4_ext_try_to_merge_blocks(handle_t *handle,
+		struct inode *inode,
+		struct ext4_ext_path *path)
+{
+	struct ext4_ext_path *npath = NULL;
+	unsigned int depth = ext_depth(inode);
+	ext4_lblk_t next;
+	int used, rc = 0;
+
+	if (depth == 0)
+		return 0;
+
+	used = le16_to_cpu(path[depth].p_hdr->eh_entries);
+	/* XXX: think of a good value here */
+	if (used > 100)
+		return 0;
+
+	/* try to merge to the next block */
+	next = ext4_ext_next_leaf_block(path);
+	if (next == EXT_MAX_BLOCKS)
+		return 0;
+	npath = ext4_find_extent(inode, next, NULL, 0);
+	if (IS_ERR(npath))
+		return 0;
+	rc = ext4_ext_merge_blocks(handle, inode, path, npath);
+	ext4_ext_drop_refs(npath);
+	kfree(npath);
+	if (rc)
+		return rc > 0 ? 0 : rc;
+
+	/* try to merge with the previous block */
+	if (EXT_FIRST_EXTENT(path[depth].p_hdr)->ee_block == 0)
+		return 0;
+	next = EXT_FIRST_EXTENT(path[depth].p_hdr)->ee_block - 1;
+	npath = ext4_find_extent(inode, next, NULL, 0);
+	if (IS_ERR(npath))
+		return 0;
+	rc = ext4_ext_merge_blocks(handle, inode, npath, path);
+	ext4_ext_drop_refs(npath);
+	kfree(npath);
+	return rc > 0 ? 0 : rc;
 }
 
 /*
@@ -1976,6 +2144,7 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
 	int depth, len, err;
 	ext4_lblk_t next;
 	int mb_flags = 0, unwritten;
+	int merged = 0;
 
 	if (gb_flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
 		mb_flags |= EXT4_MB_DELALLOC_RESERVED;
@@ -2167,8 +2336,7 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
 merge:
 	/* try to merge extents */
 	if (!(gb_flags & EXT4_GET_BLOCKS_PRE_IO))
-		ext4_ext_try_to_merge(handle, inode, path, nearex);
-
+		merged = ext4_ext_try_to_merge(handle, inode, path, nearex);
 
 	/* time to correct all indexes above */
 	err = ext4_ext_correct_indexes(handle, inode, path);
@@ -2176,6 +2344,8 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
 		goto cleanup;
 
 	err = ext4_ext_dirty(handle, inode, path + path->p_depth);
+	if (!err && merged)
+		err = ext4_ext_try_to_merge_blocks(handle, inode, path);
 
 cleanup:
 	ext4_free_ext_path(npath);
@@ -3766,7 +3936,8 @@ static int ext4_convert_unwritten_extents_endio(handle_t *handle,
 	/* note: ext4_ext_correct_indexes() isn't needed here because
 	 * borders are not changed
 	 */
-	ext4_ext_try_to_merge(handle, inode, path, ex);
+	if (ext4_ext_try_to_merge(handle, inode, path, ex))
+		ext4_ext_try_to_merge_blocks(handle, inode, path);
 
 	/* Mark modified extent as dirty */
 	err = ext4_ext_dirty(handle, inode, path + path->p_depth);
@@ -3829,7 +4000,8 @@ convert_initialized_extent(handle_t *handle, struct inode *inode,
 	/* note: ext4_ext_correct_indexes() isn't needed here because
 	 * borders are not changed
 	 */
-	ext4_ext_try_to_merge(handle, inode, path, ex);
+	if (ext4_ext_try_to_merge(handle, inode, path, ex))
+		ext4_ext_try_to_merge_blocks(handle, inode, path);
 
 	/* Mark modified extent as dirty */
 	err = ext4_ext_dirty(handle, inode, path + path->p_depth);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 18611241f451..7421f2af9cf2 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -513,6 +513,7 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
 		}
 		rsv_handle->h_reserved = 1;
 		rsv_handle->h_journal = journal;
+		rsv_handle->h_revoke_credits = revoke_records;
 		handle->h_rsv_handle = rsv_handle;
 	}
 	handle->h_revoke_credits = revoke_records;
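
Note for readers: the leaf-level step of ext4_ext_merge_blocks() (check for
free slots, shift the next leaf's entries right, copy the current leaf's
entries in front of them) can be illustrated with a small user-space sketch.
This is only a toy model under stated assumptions, not kernel code: the names
toy_extent, toy_leaf, TOY_LEAF_MAX and toy_merge_leaf are hypothetical, fields
are kept in host byte order, and journaling, index correction and block
freeing are left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an extent entry (not the on-disk format). */
struct toy_extent {
	uint32_t lblk;	/* first logical block covered */
	uint32_t len;	/* number of blocks covered */
	uint64_t pblk;	/* first physical block */
};

#define TOY_LEAF_MAX 8	/* assumed capacity, chosen only for illustration */

/* Hypothetical stand-in for a leaf block: a header plus an extent array. */
struct toy_leaf {
	uint16_t entries;	/* plays the role of eh_entries */
	uint16_t max;		/* plays the role of eh_max */
	struct toy_extent ext[TOY_LEAF_MAX];
};

/*
 * Move every extent of @cur to the front of @next if @next has room.
 * Returns 1 if the entries were moved (leaving @cur empty), 0 otherwise.
 */
static int toy_merge_leaf(struct toy_leaf *cur, struct toy_leaf *next)
{
	uint16_t used = cur->entries;
	uint16_t free_slots;

	assert(next->max <= TOY_LEAF_MAX && next->entries <= next->max);
	free_slots = next->max - next->entries;

	if (cur == next || free_slots < used)
		return 0;

	/* shift the existing entries of @next right by @used slots */
	memmove(next->ext + used, next->ext,
		next->entries * sizeof(struct toy_extent));
	/* copy the entries of @cur into the gap, preserving logical order */
	memcpy(next->ext, cur->ext, used * sizeof(struct toy_extent));

	next->entries += used;
	cur->entries = 0;
	return 1;
}
```

With a single-extent leaf such as (678):2589350 followed by a leaf holding
(679-1357):2589351-2590029, the sketch leaves the first leaf empty and the
second with two entries in logical order. In the patch the emptied leaf block
is then freed and its index pointer removed.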