From patchwork Fri Mar 22 16:18:42 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Monakhov X-Patchwork-Id: 230112 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 09A692C00C2 for ; Sat, 23 Mar 2013 03:19:50 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933792Ab3CVQTt (ORCPT ); Fri, 22 Mar 2013 12:19:49 -0400 Received: from mailhub.sw.ru ([195.214.232.25]:38707 "EHLO relay.sw.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933741Ab3CVQTr (ORCPT ); Fri, 22 Mar 2013 12:19:47 -0400 Received: from mct-mail.qa.sw.ru ([10.29.1.112]) by relay.sw.ru (8.13.4/8.13.4) with ESMTP id r2MGIlji024813; Fri, 22 Mar 2013 20:18:51 +0400 (MSK) From: Dmitry Monakhov To: linux-ext4@vger.kernel.org Cc: tytso@mit.edu, jack@suse.cz, wenqing.lz@taobao.com, Dmitry Monakhov Subject: [PATCH 2/3] ext4: fix journal callback list traversal Date: Fri, 22 Mar 2013 20:18:42 +0400 Message-Id: <1363969123-16852-2-git-send-email-dmonakhov@openvz.org> X-Mailer: git-send-email 1.7.7.6 In-Reply-To: <1363969123-16852-1-git-send-email-dmonakhov@openvz.org> References: <1363969123-16852-1-git-send-email-dmonakhov@openvz.org> Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org It is incorrect to use list_for_each_entry_safe() for journal callback traversial because ->next may be removed by other task: ->ext4_mb_free_metadata() ->ext4_mb_free_metadata() ->ext4_journal_callback_del() This result in following issue: WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [] warn_slowpath_common+0xad/0xf0 [] warn_slowpath_fmt+0x46/0x50 [] ? ext4_journal_commit_callback+0x99/0xc0 [] __list_del_entry+0x1c0/0x250 [] ext4_journal_commit_callback+0x6f/0xc0 [] jbd2_journal_commit_transaction+0x23a6/0x2570 [] ? try_to_del_timer_sync+0x82/0xa0 [] ? del_timer_sync+0x91/0x1e0 [] kjournald2+0x19f/0x6a0 [] ? wake_up_bit+0x40/0x40 [] ? bit_spin_lock+0x80/0x80 [] kthread+0x10e/0x120 [] ? __init_kthread_worker+0x70/0x70 [] ret_from_fork+0x7c/0xb0 [] ? __init_kthread_worker+0x70/0x70 This patch fix the issue like follows: - ext4_journal_commit_callback() make list truly traversial safe simply by always starting from list_head - fix race between two ext4_journal_callback_del() and ext4_journal_callback_try_del() Signed-off-by: Dmitry Monakhov Reviewed-by: Jan Kara --- fs/ext4/ext4_jbd2.h | 6 +++++- fs/ext4/mballoc.c | 8 ++++---- fs/ext4/super.c | 4 +++- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h index 4c216b1..aeed0ba 100644 --- a/fs/ext4/ext4_jbd2.h +++ b/fs/ext4/ext4_jbd2.h @@ -194,16 +194,20 @@ static inline void ext4_journal_callback_add(handle_t *handle, * ext4_journal_callback_del: delete a registered callback * @handle: active journal transaction handle on which callback was registered * @jce: registered journal callback entry to unregister + * Return true if object was sucessfully removed */ -static inline void ext4_journal_callback_del(handle_t *handle, +static inline bool ext4_journal_callback_try_del(handle_t *handle, struct ext4_journal_cb_entry *jce) { + bool deleted; struct ext4_sb_info *sbi = EXT4_SB(handle->h_transaction->t_journal->j_private); spin_lock(&sbi->s_md_lock); + deleted = !list_empty(&jce->jce_list); list_del_init(&jce->jce_list); spin_unlock(&sbi->s_md_lock); + return deleted; } int diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index ee6614b..2e5196a 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4420,11 +4420,11 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b, node = rb_prev(new_node); if (node) { entry = rb_entry(node, struct ext4_free_data, efd_node); - if (can_merge(entry, new_entry)) { + if (can_merge(entry, new_entry) && + ext4_journal_callback_try_del(handle, &entry->efd_jce)) { new_entry->efd_start_cluster = entry->efd_start_cluster; new_entry->efd_count += entry->efd_count; rb_erase(node, &(db->bb_free_root)); - ext4_journal_callback_del(handle, &entry->efd_jce); kmem_cache_free(ext4_free_data_cachep, entry); } } @@ -4432,10 +4432,10 @@ ext4_mb_free_metadata(handle_t *handle, struct ext4_buddy *e4b, node = rb_next(new_node); if (node) { entry = rb_entry(node, struct ext4_free_data, efd_node); - if (can_merge(new_entry, entry)) { + if (can_merge(new_entry, entry) && + ext4_journal_callback_try_del(handle, &entry->efd_jce)) { new_entry->efd_count += entry->efd_count; rb_erase(node, &(db->bb_free_root)); - ext4_journal_callback_del(handle, &entry->efd_jce); kmem_cache_free(ext4_free_data_cachep, entry); } } diff --git a/fs/ext4/super.c b/fs/ext4/super.c index d1ee6a8..c7e1509 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -352,7 +352,9 @@ static void ext4_journal_commit_callback(journal_t *journal, transaction_t *txn) struct ext4_journal_cb_entry *jce, *tmp; spin_lock(&sbi->s_md_lock); - list_for_each_entry_safe(jce, tmp, &txn->t_private_list, jce_list) { + while (!list_empty(&txn->t_private_list)) { + jce = list_entry(txn->t_private_list.next, + struct ext4_journal_cb_entry, jce_list); list_del_init(&jce->jce_list); spin_unlock(&sbi->s_md_lock); jce->jce_func(sb, jce, error);