[Xenial,SRU] writeback: Write dirty times for WB_SYNC_ALL writeback

Message ID 1471532331-21138-1-git-send-email-tim.gardner@canonical.com
State New
Commit Message

Tim Gardner Aug. 18, 2016, 2:58 p.m. UTC
From: Jan Kara <jack@suse.cz>

BugLink: http://bugs.launchpad.net/bugs/1614565

Currently we take care to handle I_DIRTY_TIME in vfs_fsync() and
queue_io() so that inodes which have only dirty timestamps are properly
written on fsync(2) and sync(2). However, there are other call sites,
most notably those going through write_inode_now(), which expect the
inode to be clean after WB_SYNC_ALL writeback. This is not currently
true, because __writeback_single_inode() does not clear I_DIRTY_TIME in
all cases, even for WB_SYNC_ALL writeback. This resulted in the
following oops: bdev_write_inode() did not clean the inode, and the
writeback code later stumbled over a dirty inode with a detached wb.

  general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
  Modules linked in:
  CPU: 3 PID: 32 Comm: kworker/u10:1 Not tainted 4.6.0-rc3+ #349
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Workqueue: writeback wb_workfn (flush-11:0)
  task: ffff88006ccf1840 ti: ffff88006cda8000 task.ti: ffff88006cda8000
  RIP: 0010:[<ffffffff818884d2>]  [<ffffffff818884d2>]
  locked_inode_to_wb_and_lock_list+0xa2/0x750
  RSP: 0018:ffff88006cdaf7d0  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88006ccf2050
  RDX: 0000000000000000 RSI: 000000114c8a8484 RDI: 0000000000000286
  RBP: ffff88006cdaf820 R08: ffff88006ccf1840 R09: 0000000000000000
  R10: 000229915090805f R11: 0000000000000001 R12: ffff88006a72f5e0
  R13: dffffc0000000000 R14: ffffed000d4e5eed R15: ffffffff8830cf40
  FS:  0000000000000000(0000) GS:ffff88006d500000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000003301bf8 CR3: 000000006368f000 CR4: 00000000000006e0
  DR0: 0000000000001ec9 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
  Stack:
   ffff88006a72f680 ffff88006a72f768 ffff8800671230d8 03ff88006cdaf948
   ffff88006a72f668 ffff88006a72f5e0 ffff8800671230d8 ffff88006cdaf948
   ffff880065b90cc8 ffff880067123100 ffff88006cdaf970 ffffffff8188e12e
  Call Trace:
   [<     inline     >] inode_to_wb_and_lock_list fs/fs-writeback.c:309
   [<ffffffff8188e12e>] writeback_sb_inodes+0x4de/0x1250 fs/fs-writeback.c:1554
   [<ffffffff8188efa4>] __writeback_inodes_wb+0x104/0x1e0 fs/fs-writeback.c:1600
   [<ffffffff8188f9ae>] wb_writeback+0x7ce/0xc90 fs/fs-writeback.c:1709
   [<     inline     >] wb_do_writeback fs/fs-writeback.c:1844
   [<ffffffff81891079>] wb_workfn+0x2f9/0x1000 fs/fs-writeback.c:1884
   [<ffffffff813bcd1e>] process_one_work+0x78e/0x15c0 kernel/workqueue.c:2094
   [<ffffffff813bdc2b>] worker_thread+0xdb/0xfc0 kernel/workqueue.c:2228
   [<ffffffff813cdeef>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
   [<ffffffff867bc5d2>] ret_from_fork+0x22/0x50 arch/x86/entry/entry_64.S:392
  Code: 05 94 4a a8 06 85 c0 0f 85 03 03 00 00 e8 07 15 d0 ff 41 80 3e
  00 0f 85 64 06 00 00 49 8b 9c 24 88 01 00 00 48 89 d8 48 c1 e8 03 <42>
  80 3c 28 00 0f 85 17 06 00 00 48 8b 03 48 83 c0 50 48 39 c3
  RIP  [<     inline     >] wb_get include/linux/backing-dev-defs.h:212
  RIP  [<ffffffff818884d2>] locked_inode_to_wb_and_lock_list+0xa2/0x750
  fs/fs-writeback.c:281
   RSP <ffff88006cdaf7d0>
  ---[ end trace 986a4d314dcb2694 ]---

Fix the problem by making sure __writeback_single_inode() also writes
inodes that have only dirty timestamps when in WB_SYNC_ALL mode.
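
To make the effect of the added condition concrete, the check can be
modelled in userspace C. This is only a sketch: the flag values, the
should_write_dirty_time() helper, the timeout_passed parameter (standing
in for the time_after() test) and the patched switch are all assumptions
made for illustration and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; these are NOT the kernel's definitions. */
enum {
	I_DIRTY_SYNC         = 1 << 0,
	I_DIRTY_DATASYNC     = 1 << 1,
	I_DIRTY_TIME         = 1 << 2,
	I_DIRTY_TIME_EXPIRED = 1 << 3,
};

enum sync_mode { WB_SYNC_NONE, WB_SYNC_ALL };

/*
 * Hypothetical helper: should lazily dirtied timestamps be written now?
 * 'timeout_passed' stands in for the time_after(jiffies, ...) check and
 * 'patched' selects whether the WB_SYNC_ALL condition is part of the test.
 */
static bool should_write_dirty_time(unsigned int i_state, enum sync_mode mode,
				    bool timeout_passed, bool patched)
{
	if (!(i_state & I_DIRTY_TIME))
		return false;

	return (i_state & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) ||
	       (patched && mode == WB_SYNC_ALL) ||
	       (i_state & I_DIRTY_TIME_EXPIRED) ||
	       timeout_passed;
}

/* Walk through the case from the report: an inode with only dirty times. */
static void model_self_check(void)
{
	/* Without the new condition, WB_SYNC_ALL skips the timestamp write. */
	assert(!should_write_dirty_time(I_DIRTY_TIME, WB_SYNC_ALL, false, false));
	/* With it, WB_SYNC_ALL always writes the timestamps. */
	assert(should_write_dirty_time(I_DIRTY_TIME, WB_SYNC_ALL, false, true));
	/* WB_SYNC_NONE keeps the lazy behaviour either way. */
	assert(!should_write_dirty_time(I_DIRTY_TIME, WB_SYNC_NONE, false, true));
}
```

In this model, an inode carrying only I_DIRTY_TIME is skipped by the
unpatched test even under WB_SYNC_ALL, which matches the state that
write_inode_now() callers did not expect. WB_SYNC_NONE behaviour is the
same with and without the extra condition.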

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit dc5ff2b1d66f21c27a4c37236636dff6946437e4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 fs/fs-writeback.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Kamal Mostafa Aug. 18, 2016, 5:23 p.m. UTC | #1

Chris J Arges Aug. 18, 2016, 7:21 p.m. UTC | #2
Kamal Mostafa Aug. 18, 2016, 7:47 p.m. UTC | #3

Patch

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 60d6fc2..337afad 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1292,6 +1292,7 @@  __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 	dirty = inode->i_state & I_DIRTY;
 	if (inode->i_state & I_DIRTY_TIME) {
 		if ((dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) ||
+		    wbc->sync_mode == WB_SYNC_ALL ||
 		    unlikely(inode->i_state & I_DIRTY_TIME_EXPIRED) ||
 		    unlikely(time_after(jiffies,
 					(inode->dirtied_time_when +