Message ID | 20160617180434.GA427@quack2.suse.cz |
---|---|
State | Not Applicable, archived |
Headers | show |
On Fri 17-06-16 20:04:34, Jan Kara wrote: > On Fri 17-06-16 13:40:33, Calvin Owens wrote: > > On Thursday 06/16 at 10:33 +0200, Jan Kara wrote: > > > Hi Calvin, > > > > > > On Wed 15-06-16 16:49:39, Calvin Owens wrote: > > > > I'm hitting the following warning on a 4.0 kernel: > > > > > > > > WARNING: CPU: 15 PID: 1005611 at fs/ext4/inode.c:233 ext4_evict_inode+0x4be/0x4d0() > > > > CPU: 15 PID: 1005611 Comm: rocksdb:bg0 Not tainted 4.0.9-60_fbk10_rc1_3974_g796b9b6 #1 > > > > Call Trace: > > > > [<ffffffff8176af8e>] dump_stack+0x4d/0x63 > > > > [<ffffffff8106c4dc>] warn_slowpath_common+0x8c/0xd0 > > > > [<ffffffff8106c5da>] warn_slowpath_null+0x1a/0x20 > > > > [<ffffffff81242c0e>] ext4_evict_inode+0x4be/0x4d0 > > > > [<ffffffff811d248b>] evict+0xbb/0x190 > > > > [<ffffffff811d2d6d>] iput+0x17d/0x1e0 > > > > [<ffffffff811ce0e0>] __dentry_kill+0x190/0x1e0 > > > > [<ffffffff811ce2d1>] dput+0x1a1/0x1f0 > > > > [<ffffffff811b952a>] __fput+0x17a/0x210 > > > > [<ffffffff811b960e>] ____fput+0xe/0x10 > > > > [<ffffffff81086dbf>] task_work_run+0xbf/0x100 > > > > [<ffffffff81002bcc>] do_notify_resume+0x7c/0x90 > > > > [<ffffffff81771529>] int_signal+0x12/0x17 > > > > > > > > Commit 822dbba ("ext4: fix warning in ext4_evict_inode()") proportedly fixed > > > > this in 3.11. The check was entirely removed in 4.6. > > > > > > OK, so this is the warning: > > > > > > WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count)); > > > > > > It was removed in 4.6 since we maintained i_ioend_count only to be able to > > > do this check and it didn't trigger for a long time. So it is interesting > > > that it actually triggered for you with 4.0. > > > > > > > Is it interesting to you that this triggers on 4.0? I can revert 600be30 and > > > > see if I can reproduce it on upstream, but since the check got removed I > > > > was wondering if there was post-4.0 work that makes it obsolete? > > > > > > It would be great. I'm attaching a revert and an additional debug patch. If > > > you can run with these two on the latest kernel (or even just apply the debug > > > patch on top of 4.0) and reproduce the issue with it, I would be grateful. > > > > Forgot to attach the patches? Or did my mailserver eat them? :) > > Sorry, probably forgot to attach. Here they are. Any luck with the patch? Honza
On 06/30/2016 12:00 AM, Jan Kara wrote: > On Fri 17-06-16 20:04:34, Jan Kara wrote: >> On Fri 17-06-16 13:40:33, Calvin Owens wrote: >>> On Thursday 06/16 at 10:33 +0200, Jan Kara wrote: >>>> Hi Calvin, >>>> >>>> On Wed 15-06-16 16:49:39, Calvin Owens wrote: >>>>> I'm hitting the following warning on a 4.0 kernel: >>>>> >>>>> WARNING: CPU: 15 PID: 1005611 at fs/ext4/inode.c:233 ext4_evict_inode+0x4be/0x4d0() >>>>> CPU: 15 PID: 1005611 Comm: rocksdb:bg0 Not tainted 4.0.9-60_fbk10_rc1_3974_g796b9b6 #1 >>>>> Call Trace: >>>>> [<ffffffff8176af8e>] dump_stack+0x4d/0x63 >>>>> [<ffffffff8106c4dc>] warn_slowpath_common+0x8c/0xd0 >>>>> [<ffffffff8106c5da>] warn_slowpath_null+0x1a/0x20 >>>>> [<ffffffff81242c0e>] ext4_evict_inode+0x4be/0x4d0 >>>>> [<ffffffff811d248b>] evict+0xbb/0x190 >>>>> [<ffffffff811d2d6d>] iput+0x17d/0x1e0 >>>>> [<ffffffff811ce0e0>] __dentry_kill+0x190/0x1e0 >>>>> [<ffffffff811ce2d1>] dput+0x1a1/0x1f0 >>>>> [<ffffffff811b952a>] __fput+0x17a/0x210 >>>>> [<ffffffff811b960e>] ____fput+0xe/0x10 >>>>> [<ffffffff81086dbf>] task_work_run+0xbf/0x100 >>>>> [<ffffffff81002bcc>] do_notify_resume+0x7c/0x90 >>>>> [<ffffffff81771529>] int_signal+0x12/0x17 >>>>> >>>>> Commit 822dbba ("ext4: fix warning in ext4_evict_inode()") proportedly fixed >>>>> this in 3.11. The check was entirely removed in 4.6. >>>> >>>> OK, so this is the warning: >>>> >>>> WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count)); >>>> >>>> It was removed in 4.6 since we maintained i_ioend_count only to be able to >>>> do this check and it didn't trigger for a long time. So it is interesting >>>> that it actually triggered for you with 4.0. >>>> >>>>> Is it interesting to you that this triggers on 4.0? I can revert 600be30 and >>>>> see if I can reproduce it on upstream, but since the check got removed I >>>>> was wondering if there was post-4.0 work that makes it obsolete? >>>> >>>> It would be great. I'm attaching a revert and an additional debug patch. If >>>> you can run with these two on the latest kernel (or even just apply the debug >>>> patch on top of 4.0) and reproduce the issue with it, I would be grateful. >>> >>> Forgot to attach the patches? Or did my mailserver eat them? :) >> >> Sorry, probably forgot to attach. Here they are. > > Any luck with the patch? No luck at all :/ Whatever triggers this is extraordinarily rare, and I haven't been able to pin it down: I see the WARN from 1-2 random hosts each day across all of FB's servers. I'll keep at it and let you know if I get anything, hopefully soon. Thanks, Calvin > Honza > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From 51f4400d9c0c7d3be9be84ed8ebadd8ea27490ec Mon Sep 17 00:00:00 2001 From: Jan Kara <jack@suse.cz> Date: Wed, 16 Oct 2013 20:32:58 +0200 Subject: [PATCH 2/2] ext4: Debug outstanding io_ends Signed-off-by: Jan Kara <jack@suse.cz> --- fs/ext4/ext4.h | 3 +++ fs/ext4/inode.c | 24 ++++++++++++++++++++++-- fs/ext4/page-io.c | 11 +++++++++++ 3 files changed, 36 insertions(+), 2 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index d64fc259590f..9b56f6981b0a 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -189,6 +189,7 @@ struct ext4_map_blocks { */ typedef struct ext4_io_end { struct list_head list; /* per-file finished IO list */ + struct list_head full_list; handle_t *handle; /* handle reserved for extent * conversion */ struct inode *inode; /* file being written to */ @@ -198,6 +199,7 @@ typedef struct ext4_io_end { atomic_t count; /* reference counter */ loff_t offset; /* offset in the file */ ssize_t size; /* size of the extent */ + unsigned long created_at; } ext4_io_end_t; struct ext4_io_submit { @@ -1063,6 +1065,7 @@ struct ext4_inode_info { * transaction reserved */ struct list_head i_rsv_conversion_list; + struct list_head i_ioend_list; /* * Completed IOs that need unwritten extents handling and don't have * transaction reserved diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index cb00e5949bc3..187d63b5a1eb 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -176,6 +176,20 @@ int ext4_truncate_restart_trans(handle_t *handle, struct inode *inode, return ret; } +static void dump_ioends(struct inode *inode, struct list_head *head) +{ + ext4_io_end_t *io; + unsigned long flags; + + spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags); + list_for_each_entry(io, head, full_list) { + printk("ioend %p, created at 0x%lx: handle=%p, bio=%p, flag=%u, offset=%lu, len=%u, count=%d\n", + io, io->created_at, io->handle, io->bio, io->flag, (unsigned long)io->offset, + (unsigned)io->size, (int)atomic_read(&io->count)); + } + spin_unlock_irqrestore(&EXT4_I(inode)->i_completed_io_lock, flags); +} + /* * Called at the last iput() if i_nlink is zero. */ @@ -216,7 +230,10 @@ void ext4_evict_inode(struct inode *inode) } truncate_inode_pages_final(&inode->i_data); - WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count)); + if (WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count))) { + printk("ioend_count=%d, i_unwritten=%d\n", (int)atomic_read(&EXT4_I(inode)->i_ioend_count), (int)atomic_read(&EXT4_I(inode)->i_unwritten)); + dump_ioends(inode, &EXT4_I(inode)->i_ioend_list); + } goto no_delete; } @@ -228,7 +245,10 @@ void ext4_evict_inode(struct inode *inode) ext4_begin_ordered_truncate(inode, 0); truncate_inode_pages_final(&inode->i_data); - WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count)); + if (WARN_ON(atomic_read(&EXT4_I(inode)->i_ioend_count))) { + printk("ioend_count=%d, i_unwritten=%d\n", (int)atomic_read(&EXT4_I(inode)->i_ioend_count), (int)atomic_read(&EXT4_I(inode)->i_unwritten)); + dump_ioends(inode, &EXT4_I(inode)->i_ioend_list); + } /* * Protect us against freezing - iput() caller didn't have to have any diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index cc5f529e328e..9a90dedc8e7b 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -124,6 +124,7 @@ static void ext4_finish_bio(struct bio *bio) static void ext4_release_io_end(ext4_io_end_t *io_end) { struct bio *bio, *next_bio; + unsigned long flags; BUG_ON(!list_empty(&io_end->list)); BUG_ON(io_end->flag & EXT4_IO_END_UNWRITTEN); @@ -132,6 +133,10 @@ static void ext4_release_io_end(ext4_io_end_t *io_end) if (atomic_dec_and_test(&EXT4_I(io_end->inode)->i_ioend_count)) wake_up_all(ext4_ioend_wq(io_end->inode)); + spin_lock_irqsave(&EXT4_I(io_end->inode)->i_completed_io_lock, flags); + list_del(&io_end->full_list); + spin_unlock_irqrestore(&EXT4_I(io_end->inode)->i_completed_io_lock, flags); + for (bio = io_end->bio; bio; bio = next_bio) { next_bio = bio->bi_private; ext4_finish_bio(bio); @@ -256,10 +261,16 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags) { ext4_io_end_t *io = kmem_cache_zalloc(io_end_cachep, flags); if (io) { + unsigned long flags; + + spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags); + list_add(&io->full_list, &EXT4_I(inode)->i_ioend_list); + spin_unlock_irqrestore(&EXT4_I(inode)->i_completed_io_lock, flags); atomic_inc(&EXT4_I(inode)->i_ioend_count); io->inode = inode; INIT_LIST_HEAD(&io->list); atomic_set(&io->count, 1); + io->created_at = _RET_IP_; } return io; } -- 2.6.6