From patchwork Tue Aug 31 14:49:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Zhuravlev
X-Patchwork-Id: 1522671
From: Alex Zhuravlev
To: linux-ext4
Subject: [RFC] replace revoke hash table with rhashtable
Date: Tue, 31 Aug 2021 14:49:36 +0000
Message-ID: <96FE09AA-171C-49B1-B434-505C15FEB435@whamcloud.com>
X-Mailing-List: linux-ext4@vger.kernel.org

Hi,

Not so long ago we noticed that journal replay can take a very long time
(hours) in cases where many journaled blocks were freed during a short
period. I benchmarked the hash table used by the revoke code; the test is
basically lookup+insert, as jbd2 does at replay:
1048576 records - 95 seconds
2097152 records - 580 seconds

Then I benchmarked rhashtable:
1048576 records - 2 seconds
2097152 records - 3 seconds
4194304 records - 7 seconds

So, here is a patch replacing the existing fixed-size hash table with
rhashtable; please have a look.

Thanks, Alex

From 64b9161db7fee4eea665833765221d8a7e5903b6 Mon Sep 17 00:00:00 2001
From: Alex Zhuravlev
Date: Tue, 31 Aug 2021 11:53:09 +0300
Subject: [PATCH] jbd2 to replace fixed-size revoke hashtable with rhashtable

Signed-off-by: Alex Zhuravlev
---
 fs/jbd2/journal.c    |   5 +-
 fs/jbd2/revoke.c     | 255 +++++++++++++++----------------------------
 include/linux/jbd2.h |   8 +-
 3 files changed, 90 insertions(+), 178 deletions(-)

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 35302bc192eb..6e3a2cc9dfd2 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1370,7 +1370,7 @@ static journal_t *journal_init_common(struct block_device *bdev,
 	journal->j_flags = JBD2_ABORT;
 
 	/* Set up a default-sized revoke table for the new mount. */
-	err = jbd2_journal_init_revoke(journal, JOURNAL_REVOKE_DEFAULT_HASH);
+	err = jbd2_journal_init_revoke(journal);
 	if (err)
 		goto err_cleanup;
 
@@ -3136,8 +3136,6 @@ static int __init journal_init_caches(void)
 	int ret;
 
 	ret = jbd2_journal_init_revoke_record_cache();
-	if (ret == 0)
-		ret = jbd2_journal_init_revoke_table_cache();
 	if (ret == 0)
 		ret = jbd2_journal_init_journal_head_cache();
 	if (ret == 0)
@@ -3152,7 +3150,6 @@ static int __init journal_init_caches(void)
 static void jbd2_journal_destroy_caches(void)
 {
 	jbd2_journal_destroy_revoke_record_cache();
-	jbd2_journal_destroy_revoke_table_cache();
 	jbd2_journal_destroy_journal_head_cache();
 	jbd2_journal_destroy_handle_cache();
 	jbd2_journal_destroy_inode_cache();
diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c
index fa608788b93d..4533bd49e879 100644
--- a/fs/jbd2/revoke.c
+++ b/fs/jbd2/revoke.c
@@ -90,10 +90,10 @@
 #include
 #include
 #include
+#include
 #endif
 
 static struct kmem_cache *jbd2_revoke_record_cache;
-static struct kmem_cache *jbd2_revoke_table_cache;
 
 /* Each revoke record represents one single revoked block. During
    journal replay, this involves recording the transaction ID of the
@@ -101,23 +101,17 @@ static struct kmem_cache *jbd2_revoke_table_cache;
 
 struct jbd2_revoke_record_s
 {
-	struct list_head hash;
+	struct rhash_head linkage;
 	tid_t sequence;	/* Used for recovery only */
 	unsigned long long blocknr;
 };
 
-
-/* The revoke table is just a simple hash table of revoke records. */
-struct jbd2_revoke_table_s
-{
-	/* It is conceivable that we might want a larger hash table
-	 * for recovery. Must be a power of two. */
-	int hash_size;
-	int hash_shift;
-	struct list_head *hash_table;
+static const struct rhashtable_params revoke_rhashtable_params = {
+	.key_len = sizeof(unsigned long long),
+	.key_offset = offsetof(struct jbd2_revoke_record_s, blocknr),
+	.head_offset = offsetof(struct jbd2_revoke_record_s, linkage),
 };
 
-
 #ifdef __KERNEL__
 static void write_one_revoke_record(transaction_t *,
 				    struct list_head *,
@@ -126,18 +120,10 @@ static void write_one_revoke_record(transaction_t *,
 static void flush_descriptor(journal_t *, struct buffer_head *, int);
 #endif
 
-/* Utility functions to maintain the revoke table */
-
-static inline int hash(journal_t *journal, unsigned long long block)
-{
-	return hash_64(block, journal->j_revoke->hash_shift);
-}
-
 static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 			      tid_t seq)
 {
-	struct list_head *hash_list;
-	struct jbd2_revoke_record_s *record;
+	struct jbd2_revoke_record_s *record, *old;
 	gfp_t gfp_mask = GFP_NOFS;
 
 	if (journal_oom_retry)
@@ -148,10 +134,12 @@ static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 
 	record->sequence = seq;
 	record->blocknr = blocknr;
-	hash_list = &journal->j_revoke->hash_table[hash(journal, blocknr)];
-	spin_lock(&journal->j_revoke_lock);
-	list_add(&record->hash, hash_list);
-	spin_unlock(&journal->j_revoke_lock);
+	old = rhashtable_lookup_get_insert_fast(journal->j_revoke,
+			&record->linkage, revoke_rhashtable_params);
+	if (old) {
+		BUG_ON(record->sequence != seq);
+		kmem_cache_free(jbd2_revoke_record_cache, record);
+	}
 	return 0;
 }
 
@@ -160,22 +148,8 @@ static int insert_revoke_hash(journal_t *journal, unsigned long long blocknr,
 static struct jbd2_revoke_record_s *find_revoke_record(journal_t *journal,
 						      unsigned long long blocknr)
 {
-	struct list_head *hash_list;
-	struct jbd2_revoke_record_s *record;
-
-	hash_list = &journal->j_revoke->hash_table[hash(journal, blocknr)];
-
-	spin_lock(&journal->j_revoke_lock);
-	record = (struct jbd2_revoke_record_s *) hash_list->next;
-	while (&(record->hash) != hash_list) {
-		if (record->blocknr == blocknr) {
-			spin_unlock(&journal->j_revoke_lock);
-			return record;
-		}
-		record = (struct jbd2_revoke_record_s *) record->hash.next;
-	}
-	spin_unlock(&journal->j_revoke_lock);
-	return NULL;
+	return rhashtable_lookup_fast(journal->j_revoke, &blocknr,
+				      revoke_rhashtable_params);
 }
 
 void jbd2_journal_destroy_revoke_record_cache(void)
@@ -184,12 +158,6 @@ void jbd2_journal_destroy_revoke_record_cache(void)
 	jbd2_revoke_record_cache = NULL;
 }
 
-void jbd2_journal_destroy_revoke_table_cache(void)
-{
-	kmem_cache_destroy(jbd2_revoke_table_cache);
-	jbd2_revoke_table_cache = NULL;
-}
-
 int __init jbd2_journal_init_revoke_record_cache(void)
 {
 	J_ASSERT(!jbd2_revoke_record_cache);
@@ -203,85 +171,27 @@ int __init jbd2_journal_init_revoke_record_cache(void)
 	return 0;
 }
 
-int __init jbd2_journal_init_revoke_table_cache(void)
-{
-	J_ASSERT(!jbd2_revoke_table_cache);
-	jbd2_revoke_table_cache = KMEM_CACHE(jbd2_revoke_table_s,
-					     SLAB_TEMPORARY);
-	if (!jbd2_revoke_table_cache) {
-		pr_emerg("JBD2: failed to create revoke_table cache\n");
-		return -ENOMEM;
-	}
-	return 0;
-}
-
-static struct jbd2_revoke_table_s *jbd2_journal_init_revoke_table(int hash_size)
-{
-	int shift = 0;
-	int tmp = hash_size;
-	struct jbd2_revoke_table_s *table;
-
-	table = kmem_cache_alloc(jbd2_revoke_table_cache, GFP_KERNEL);
-	if (!table)
-		goto out;
-
-	while((tmp >>= 1UL) != 0UL)
-		shift++;
-
-	table->hash_size = hash_size;
-	table->hash_shift = shift;
-	table->hash_table =
-		kmalloc_array(hash_size, sizeof(struct list_head), GFP_KERNEL);
-	if (!table->hash_table) {
-		kmem_cache_free(jbd2_revoke_table_cache, table);
-		table = NULL;
-		goto out;
-	}
-
-	for (tmp = 0; tmp < hash_size; tmp++)
-		INIT_LIST_HEAD(&table->hash_table[tmp]);
-
-out:
-	return table;
-}
-
-static void jbd2_journal_destroy_revoke_table(struct jbd2_revoke_table_s *table)
-{
-	int i;
-	struct list_head *hash_list;
-
-	for (i = 0; i < table->hash_size; i++) {
-		hash_list = &table->hash_table[i];
-		J_ASSERT(list_empty(hash_list));
-	}
-
-	kfree(table->hash_table);
-	kmem_cache_free(jbd2_revoke_table_cache, table);
-}
-
 /* Initialise the revoke table for a given journal to a given size. */
-int jbd2_journal_init_revoke(journal_t *journal, int hash_size)
+int jbd2_journal_init_revoke(journal_t *journal)
 {
-	J_ASSERT(journal->j_revoke_table[0] == NULL);
-	J_ASSERT(is_power_of_2(hash_size));
+	int rc;
 
-	journal->j_revoke_table[0] = jbd2_journal_init_revoke_table(hash_size);
-	if (!journal->j_revoke_table[0])
+	rc = rhashtable_init(&journal->j_revoke_table[0], &revoke_rhashtable_params);
+	if (rc)
 		goto fail0;
 
-	journal->j_revoke_table[1] = jbd2_journal_init_revoke_table(hash_size);
-	if (!journal->j_revoke_table[1])
+	rc = rhashtable_init(&journal->j_revoke_table[1], &revoke_rhashtable_params);
+	if (rc)
 		goto fail1;
 
-	journal->j_revoke = journal->j_revoke_table[1];
+	journal->j_revoke = &journal->j_revoke_table[1];
 
 	spin_lock_init(&journal->j_revoke_lock);
 
 	return 0;
 
 fail1:
-	jbd2_journal_destroy_revoke_table(journal->j_revoke_table[0]);
-	journal->j_revoke_table[0] = NULL;
+	rhashtable_destroy(&journal->j_revoke_table[0]);
 fail0:
 	return -ENOMEM;
 }
@@ -290,10 +200,8 @@ int jbd2_journal_init_revoke(journal_t *journal, int hash_size)
 void jbd2_journal_destroy_revoke(journal_t *journal)
 {
 	journal->j_revoke = NULL;
-	if (journal->j_revoke_table[0])
-		jbd2_journal_destroy_revoke_table(journal->j_revoke_table[0]);
-	if (journal->j_revoke_table[1])
-		jbd2_journal_destroy_revoke_table(journal->j_revoke_table[1]);
+	rhashtable_destroy(&journal->j_revoke_table[0]);
+	rhashtable_destroy(&journal->j_revoke_table[1]);
 }
 
 
@@ -446,9 +354,8 @@ int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh)
 		if (record) {
 			jbd_debug(4, "cancelled existing revoke on "
 				  "blocknr %llu\n", (unsigned long long)bh->b_blocknr);
-			spin_lock(&journal->j_revoke_lock);
-			list_del(&record->hash);
-			spin_unlock(&journal->j_revoke_lock);
+			rhashtable_remove_fast(journal->j_revoke, &record->linkage,
+					revoke_rhashtable_params);
 			kmem_cache_free(jbd2_revoke_record_cache, record);
 			did_revoke = 1;
 		}
@@ -483,27 +390,29 @@ int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh)
  */
 void jbd2_clear_buffer_revoked_flags(journal_t *journal)
 {
-	struct jbd2_revoke_table_s *revoke = journal->j_revoke;
-	int i = 0;
-
-	for (i = 0; i < revoke->hash_size; i++) {
-		struct list_head *hash_list;
-		struct list_head *list_entry;
-		hash_list = &revoke->hash_table[i];
-
-		list_for_each(list_entry, hash_list) {
-			struct jbd2_revoke_record_s *record;
-			struct buffer_head *bh;
-			record = (struct jbd2_revoke_record_s *)list_entry;
-			bh = __find_get_block(journal->j_fs_dev,
-					      record->blocknr,
-					      journal->j_blocksize);
-			if (bh) {
-				clear_buffer_revoked(bh);
-				__brelse(bh);
-			}
+	struct rhashtable *revoke = journal->j_revoke;
+	struct jbd2_revoke_record_s *record;
+	struct rhashtable_iter iter;
+
+	rhashtable_walk_enter(revoke, &iter);
+	rhashtable_walk_start(&iter);
+	while ((record = rhashtable_walk_next(&iter)) != NULL) {
+		struct buffer_head *bh;
+
+		if (IS_ERR(record))
+			continue;
+		rhashtable_walk_stop(&iter);
+		bh = __find_get_block(journal->j_fs_dev,
+				      record->blocknr,
+				      journal->j_blocksize);
+		if (bh) {
+			clear_buffer_revoked(bh);
+			__brelse(bh);
 		}
-	}
+		rhashtable_walk_start(&iter);
+	}
+	rhashtable_walk_stop(&iter);
+	rhashtable_walk_exit(&iter);
 }
 
 /* journal_switch_revoke table select j_revoke for next transaction
@@ -512,15 +421,12 @@ void jbd2_clear_buffer_revoked_flags(journal_t *journal)
  */
 void jbd2_journal_switch_revoke_table(journal_t *journal)
 {
-	int i;
-
-	if (journal->j_revoke == journal->j_revoke_table[0])
-		journal->j_revoke = journal->j_revoke_table[1];
+	if (journal->j_revoke == &journal->j_revoke_table[0])
+		journal->j_revoke = &journal->j_revoke_table[1];
 	else
-		journal->j_revoke = journal->j_revoke_table[0];
+		journal->j_revoke = &journal->j_revoke_table[0];
 
-	for (i = 0; i < journal->j_revoke->hash_size; i++)
-		INIT_LIST_HEAD(&journal->j_revoke->hash_table[i]);
+	/* XXX: check rhashtable is empty? reinitialize it? */
 }
 
 /*
@@ -533,31 +439,36 @@ void jbd2_journal_write_revoke_records(transaction_t *transaction,
 	journal_t *journal = transaction->t_journal;
 	struct buffer_head *descriptor;
 	struct jbd2_revoke_record_s *record;
-	struct jbd2_revoke_table_s *revoke;
-	struct list_head *hash_list;
-	int i, offset, count;
+	struct rhashtable_iter iter;
+	struct rhashtable *revoke;
+	int offset, count;
 
 	descriptor = NULL;
 	offset = 0;
 	count = 0;
 
 	/* select revoke table for committing transaction */
-	revoke = journal->j_revoke == journal->j_revoke_table[0] ?
-		journal->j_revoke_table[1] : journal->j_revoke_table[0];
-
-	for (i = 0; i < revoke->hash_size; i++) {
-		hash_list = &revoke->hash_table[i];
-
-		while (!list_empty(hash_list)) {
-			record = (struct jbd2_revoke_record_s *)
-				hash_list->next;
+	revoke = journal->j_revoke == &journal->j_revoke_table[0] ?
+		&journal->j_revoke_table[1] : &journal->j_revoke_table[0];
+
+	rhashtable_walk_enter(revoke, &iter);
+	rhashtable_walk_start(&iter);
+	while ((record = rhashtable_walk_next(&iter)) != NULL) {
+		if (IS_ERR(record))
+			continue;
+		if (rhashtable_remove_fast(revoke,
+					   &record->linkage,
+					   revoke_rhashtable_params) == 0) {
+			rhashtable_walk_stop(&iter);
 			write_one_revoke_record(transaction, log_bufs,
 						&descriptor, &offset, record);
+			rhashtable_walk_start(&iter);
 			count++;
-			list_del(&record->hash);
 			kmem_cache_free(jbd2_revoke_record_cache, record);
 		}
 	}
+	rhashtable_walk_stop(&iter);
+	rhashtable_walk_exit(&iter);
 	if (descriptor)
 		flush_descriptor(journal, descriptor, offset);
 	jbd_debug(1, "Wrote %d revoke records\n", count);
@@ -725,19 +636,23 @@ int jbd2_journal_test_revoke(journal_t *journal,
 
 void jbd2_journal_clear_revoke(journal_t *journal)
 {
-	int i;
-	struct list_head *hash_list;
+	struct rhashtable_iter iter;
 	struct jbd2_revoke_record_s *record;
-	struct jbd2_revoke_table_s *revoke;
+	struct rhashtable *revoke;
 
 	revoke = journal->j_revoke;
 
-	for (i = 0; i < revoke->hash_size; i++) {
-		hash_list = &revoke->hash_table[i];
-		while (!list_empty(hash_list)) {
-			record = (struct jbd2_revoke_record_s*) hash_list->next;
-			list_del(&record->hash);
+	rhashtable_walk_enter(revoke, &iter);
+	rhashtable_walk_start(&iter);
+	while ((record = rhashtable_walk_next(&iter)) != NULL) {
+		if (IS_ERR(record))
+			continue;
+		if (rhashtable_remove_fast(revoke,
+					   &record->linkage,
+					   revoke_rhashtable_params) == 0) {
 			kmem_cache_free(jbd2_revoke_record_cache, record);
-		}
-	}
+		}
+	}
+	rhashtable_walk_stop(&iter);
+	rhashtable_walk_exit(&iter);
 }
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index fd933c45281a..dcde4329cdd1 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #endif
 
 #define journal_oom_retry 1
@@ -1135,12 +1136,12 @@ struct journal_s
 	 * The revoke table - maintains the list of revoked blocks in the
 	 * current transaction.
 	 */
-	struct jbd2_revoke_table_s *j_revoke;
+	struct rhashtable *j_revoke;
 
 	/**
 	 * @j_revoke_table: Alternate revoke tables for j_revoke.
 	 */
-	struct jbd2_revoke_table_s *j_revoke_table[2];
+	struct rhashtable j_revoke_table[2];
 
 	/**
 	 * @j_wbuf: Array of bhs for jbd2_journal_commit_transaction.
@@ -1625,8 +1626,7 @@ static inline void jbd2_free_inode(struct jbd2_inode *jinode)
 }
 
 /* Primary revoke support */
-#define JOURNAL_REVOKE_DEFAULT_HASH 256
-extern int jbd2_journal_init_revoke(journal_t *, int);
+extern int jbd2_journal_init_revoke(journal_t *);
 extern void jbd2_journal_destroy_revoke_record_cache(void);
 extern void jbd2_journal_destroy_revoke_table_cache(void);
 extern int __init jbd2_journal_init_revoke_record_cache(void);
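
For anyone who wants to see why a fixed-size table scales so badly under
the lookup+insert pattern that replay uses, here is a rough user-space
sketch in C. It is not the benchmark behind the numbers in this RFC and it
is not jbd2 code: struct entry, struct table, lookup_insert() and
count_steps() are names made up for illustration. It only counts how many
chain entries the lookups visit, for a chained table that starts with 256
buckets (the old JOURNAL_REVOKE_DEFAULT_HASH) and either stays fixed or
doubles once it holds more entries than buckets, which is a crude stand-in
for the automatic resizing rhashtable provides.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space model of a revoke record; not the jbd2 struct. */
struct entry {
	struct entry *next;
	uint64_t blocknr;
};

struct table {
	struct entry **buckets;
	unsigned int bits;	/* table has 1 << bits buckets */
	size_t count;		/* entries stored */
	int resizable;		/* 0: fixed size, 1: double when count > buckets */
	uint64_t steps;		/* chain entries visited by all lookups */
};

/* Multiplicative hash keeping the top 'bits' bits (same idea as hash_64()). */
static size_t bucket_of(const struct table *t, uint64_t blocknr)
{
	return (size_t)((blocknr * 0x9E3779B97F4A7C15ULL) >> (64 - t->bits));
}

static void table_init(struct table *t, unsigned int bits, int resizable)
{
	t->buckets = calloc((size_t)1 << bits, sizeof(*t->buckets));
	assert(t->buckets);
	t->bits = bits;
	t->count = 0;
	t->resizable = resizable;
	t->steps = 0;
}

/* Double the bucket array and rehash every entry into it. */
static void table_grow(struct table *t)
{
	size_t old_n = (size_t)1 << t->bits, i;
	struct entry **old = t->buckets;

	t->bits++;
	t->buckets = calloc((size_t)1 << t->bits, sizeof(*t->buckets));
	assert(t->buckets);
	for (i = 0; i < old_n; i++) {
		struct entry *e = old[i], *next;

		for (; e; e = next) {
			size_t b = bucket_of(t, e->blocknr);

			next = e->next;
			e->next = t->buckets[b];
			t->buckets[b] = e;
		}
	}
	free(old);
}

/* Look blocknr up, insert it if absent. Returns 1 if inserted, 0 if found. */
static int lookup_insert(struct table *t, uint64_t blocknr)
{
	size_t b = bucket_of(t, blocknr);
	struct entry *e;

	for (e = t->buckets[b]; e; e = e->next) {
		t->steps++;
		if (e->blocknr == blocknr)
			return 0;
	}
	e = malloc(sizeof(*e));
	assert(e);
	e->blocknr = blocknr;
	e->next = t->buckets[b];
	t->buckets[b] = e;
	t->count++;
	if (t->resizable && t->count > ((size_t)1 << t->bits))
		table_grow(t);
	return 1;
}

static void table_destroy(struct table *t)
{
	size_t n = (size_t)1 << t->bits, i;

	for (i = 0; i < n; i++) {
		struct entry *e = t->buckets[i], *next;

		for (; e; e = next) {
			next = e->next;
			free(e);
		}
	}
	free(t->buckets);
}

/* Insert n distinct block numbers; return the chain entries visited. */
static uint64_t count_steps(size_t n, int resizable)
{
	struct table t;
	uint64_t steps;
	size_t i;

	table_init(&t, 8, resizable);	/* 256 buckets to start with */
	for (i = 0; i < n; i++)
		lookup_insert(&t, (uint64_t)i);
	steps = t.steps;
	table_destroy(&t);
	return steps;
}
```

With the bucket count fixed at 256 and n distinct block numbers, every
insert first walks its whole chain, so the step count grows roughly as
n^2/512, while with doubling the chains stay short. Comparing
count_steps(1 << 16, 0) against count_steps(1 << 16, 1) from a small
main() is one way to see the difference.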