[Trusty,SRU] ext4: prevent bugon on race between write/fcntl
diff mbox

Message ID 1423452335-22324-1-git-send-email-ming.lei@canonical.com
State New
Headers show

Commit Message

Ming Lei Feb. 9, 2015, 3:25 a.m. UTC
From: Dmitry Monakhov <dmonakhov@openvz.org>

Buglink:
	https://bugs.launchpad.net/bugs/1418284

Upstream commit:
	a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)

Patch for 3.14: 
	http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch

From: Dmitry Monakhov <dmonakhov@openvz.org>

commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.

O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.

Let's initialize iocb->private unconditionally.

TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/

#TYPICAL STACK TRACE:
kernel BUG at fs/ext4/inode.c:2960!
invalid opcode: 0000 [#1] SMP
Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
Stack:
 ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
 0000000000000200 0000000000000200 0000000000000001 0000000000000200
 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
Call Trace:
 [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
 [<ffffffff811be030>] SyS_io_submit+0x10/0x20
 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
 RSP <ffff88080f90bb58>

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hujianyang: Backported to 3.10
 - Move initialization of iocb->private to ext4_file_write() as we don't
   have ext4_file_write_iter(), which is introduced by commit 9b884164.
 - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
   as ext4_file_dio_write() is not move into ext4_file_write()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>

---
 fs/ext4/file.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Chris J Arges Feb. 9, 2015, 2:14 p.m. UTC | #1
The stable patch is a bit more complex then the upstream fix, but this
is what has been applied/reviewed into stable so I think this is a good
candidate for an SRU. In addition the problem is easily testable, and I
don't see other instances of where 'overwrite' is being used.

Acked-by: <chris.j.arges@canonical.com>

--

On 02/08/2015 09:25 PM, Ming Lei wrote:
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> Buglink:
> 	https://bugs.launchpad.net/bugs/1418284
> 
> Upstream commit:
> 	a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)
> 
> Patch for 3.14: 
> 	http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch
> 
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
> 
> O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
> twice inside ext4_file_write_iter() and __generic_file_write() which
> result in BUG_ON inside ext4_direct_IO.
> 
> Let's initialize iocb->private unconditionally.
> 
> TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
> 
> #TYPICAL STACK TRACE:
> kernel BUG at fs/ext4/inode.c:2960!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
> CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
> Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
> task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
> RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
> RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
> RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
> RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
> RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
> R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
> FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
> Stack:
>  ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
>  0000000000000200 0000000000000200 0000000000000001 0000000000000200
>  ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
> Call Trace:
>  [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
>  [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
>  [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
>  [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
>  [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
>  [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
>  [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
>  [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
>  [<ffffffff811be030>] SyS_io_submit+0x10/0x20
>  [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
> Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
> 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
> RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
>  RSP <ffff88080f90bb58>
> 
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> [hujianyang: Backported to 3.10
>  - Move initialization of iocb->private to ext4_file_write() as we don't
>    have ext4_file_write_iter(), which is introduced by commit 9b884164.
>  - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
>    as ext4_file_dio_write() is not move into ext4_file_write()]
> Signed-off-by: hujianyang <hujianyang@huawei.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> 
> ---
>  fs/ext4/file.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -100,7 +100,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>  	struct blk_plug plug;
>  	int unaligned_aio = 0;
>  	ssize_t ret;
> -	int overwrite = 0;
> +	int *overwrite = iocb->private;
>  	size_t length = iov_length(iov, nr_segs);
>  
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
> @@ -118,8 +118,6 @@ ext4_file_dio_write(struct kiocb *iocb,
>  	mutex_lock(&inode->i_mutex);
>  	blk_start_plug(&plug);
>  
> -	iocb->private = &overwrite;
> -
>  	/* check whether we do a DIO overwrite or not */
>  	if (ext4_should_dioread_nolock(inode) && !unaligned_aio &&
>  	    !file->f_mapping->nrpages && pos + length <= i_size_read(inode)) {
> @@ -143,7 +141,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>  		 * So we should check these two conditions.
>  		 */
>  		if (err == len && (map.m_flags & EXT4_MAP_MAPPED))
> -			overwrite = 1;
> +			*overwrite = 1;
>  	}
>  
>  	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
> @@ -170,6 +168,7 @@ ext4_file_write(struct kiocb *iocb, cons
>  {
>  	struct inode *inode = file_inode(iocb->ki_filp);
>  	ssize_t ret;
> +	int overwrite = 0;
>  
>  	/*
>  	 * If we have encountered a bitmap-format file, the size limit
> @@ -190,6 +189,7 @@ ext4_file_write(struct kiocb *iocb, cons
>  		}
>  	}
>  
> +	iocb->private = &overwrite;
>  	if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
>  		ret = ext4_file_dio_write(iocb, iov, nr_segs, pos);
>  	else
>
Seth Forshee Feb. 9, 2015, 3:03 p.m. UTC | #2
On Mon, Feb 09, 2015 at 11:25:35AM +0800, Ming Lei wrote:
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> Buglink:
> 	https://bugs.launchpad.net/bugs/1418284
> 
> Upstream commit:
> 	a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)
> 
> Patch for 3.14: 
> 	http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch
> 
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
> 
> O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
> twice inside ext4_file_write_iter() and __generic_file_write() which
> result in BUG_ON inside ext4_direct_IO.
> 
> Let's initialize iocb->private unconditionally.
> 
> TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
> 
> #TYPICAL STACK TRACE:
> kernel BUG at fs/ext4/inode.c:2960!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
> CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
> Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
> task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
> RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
> RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
> RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
> RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
> RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
> R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
> FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
> Stack:
>  ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
>  0000000000000200 0000000000000200 0000000000000001 0000000000000200
>  ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
> Call Trace:
>  [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
>  [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
>  [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
>  [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
>  [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
>  [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
>  [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
>  [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
>  [<ffffffff811be030>] SyS_io_submit+0x10/0x20
>  [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
> Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
> 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
> RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
>  RSP <ffff88080f90bb58>
> 
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> [hujianyang: Backported to 3.10
>  - Move initialization of iocb->private to ext4_file_write() as we don't
>    have ext4_file_write_iter(), which is introduced by commit 9b884164.
>  - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
>    as ext4_file_dio_write() is not move into ext4_file_write()]
> Signed-off-by: hujianyang <hujianyang@huawei.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>

This is a bit nasty, but it seems to do what's intended. As Chris says
we get the benefit of upstream review, so it makes sense to use the
backport to 3.10. Seems that it would also make sense for the 3.13 ckt
kernel tree as well though.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Brad Figg Feb. 10, 2015, 4:03 p.m. UTC | #3
On Mon, Feb 09, 2015 at 11:25:35AM +0800, Ming Lei wrote:
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> Buglink:
> 	https://bugs.launchpad.net/bugs/1418284
> 
> Upstream commit:
> 	a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)
> 
> Patch for 3.14: 
> 	http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch
> 
> From: Dmitry Monakhov <dmonakhov@openvz.org>
> 
> commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
> 
> O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
> twice inside ext4_file_write_iter() and __generic_file_write() which
> result in BUG_ON inside ext4_direct_IO.
> 
> Let's initialize iocb->private unconditionally.
> 
> TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
> 
> #TYPICAL STACK TRACE:
> kernel BUG at fs/ext4/inode.c:2960!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
> CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
> Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
> task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
> RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
> RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
> RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
> RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
> RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
> R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
> FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
> Stack:
>  ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
>  0000000000000200 0000000000000200 0000000000000001 0000000000000200
>  ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
> Call Trace:
>  [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
>  [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
>  [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
>  [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
>  [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>  [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
>  [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
>  [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
>  [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
>  [<ffffffff811be030>] SyS_io_submit+0x10/0x20
>  [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
> Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
> 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
> RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
>  RSP <ffff88080f90bb58>
> 
> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> [hujianyang: Backported to 3.10
>  - Move initialization of iocb->private to ext4_file_write() as we don't
>    have ext4_file_write_iter(), which is introduced by commit 9b884164.
>  - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
>    as ext4_file_dio_write() is not move into ext4_file_write()]
> Signed-off-by: hujianyang <hujianyang@huawei.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> 
> ---
>  fs/ext4/file.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -100,7 +100,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>  	struct blk_plug plug;
>  	int unaligned_aio = 0;
>  	ssize_t ret;
> -	int overwrite = 0;
> +	int *overwrite = iocb->private;
>  	size_t length = iov_length(iov, nr_segs);
>  
>  	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
> @@ -118,8 +118,6 @@ ext4_file_dio_write(struct kiocb *iocb,
>  	mutex_lock(&inode->i_mutex);
>  	blk_start_plug(&plug);
>  
> -	iocb->private = &overwrite;
> -
>  	/* check whether we do a DIO overwrite or not */
>  	if (ext4_should_dioread_nolock(inode) && !unaligned_aio &&
>  	    !file->f_mapping->nrpages && pos + length <= i_size_read(inode)) {
> @@ -143,7 +141,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>  		 * So we should check these two conditions.
>  		 */
>  		if (err == len && (map.m_flags & EXT4_MAP_MAPPED))
> -			overwrite = 1;
> +			*overwrite = 1;
>  	}
>  
>  	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
> @@ -170,6 +168,7 @@ ext4_file_write(struct kiocb *iocb, cons
>  {
>  	struct inode *inode = file_inode(iocb->ki_filp);
>  	ssize_t ret;
> +	int overwrite = 0;
>  
>  	/*
>  	 * If we have encountered a bitmap-format file, the size limit
> @@ -190,6 +189,7 @@ ext4_file_write(struct kiocb *iocb, cons
>  		}
>  	}
>  
> +	iocb->private = &overwrite;
>  	if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
>  		ret = ext4_file_dio_write(iocb, iov, nr_segs, pos);
>  	else
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team

This has been queued in the 3.16 stable tree. We'll pick this up that way.
Ming Lei Feb. 10, 2015, 11:12 p.m. UTC | #4
On Wed, Feb 11, 2015 at 12:03 AM, Brad Figg <brad.figg@canonical.com> wrote:
> On Mon, Feb 09, 2015 at 11:25:35AM +0800, Ming Lei wrote:
>> From: Dmitry Monakhov <dmonakhov@openvz.org>
>>
>> Buglink:
>>       https://bugs.launchpad.net/bugs/1418284
>>
>> Upstream commit:
>>       a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)
>>
>> Patch for 3.14:
>>       http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch
>>
>> From: Dmitry Monakhov <dmonakhov@openvz.org>
>>
>> commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
>>
>> O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
>> twice inside ext4_file_write_iter() and __generic_file_write() which
>> result in BUG_ON inside ext4_direct_IO.
>>
>> Let's initialize iocb->private unconditionally.
>>
>> TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
>>
>> #TYPICAL STACK TRACE:
>> kernel BUG at fs/ext4/inode.c:2960!
>> invalid opcode: 0000 [#1] SMP
>> Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
>> CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
>> Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
>> task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
>> RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
>> RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
>> RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
>> RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
>> RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
>> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
>> R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
>> FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
>> Stack:
>>  ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
>>  0000000000000200 0000000000000200 0000000000000001 0000000000000200
>>  ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
>> Call Trace:
>>  [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
>>  [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
>>  [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
>>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
>>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>>  [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
>>  [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
>>  [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
>>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
>>  [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
>>  [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
>>  [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
>>  [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
>>  [<ffffffff811be030>] SyS_io_submit+0x10/0x20
>>  [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
>> Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
>> 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
>> RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
>>  RSP <ffff88080f90bb58>
>>
>> Reported-by: Sasha Levin <sasha.levin@oracle.com>
>> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
>> [hujianyang: Backported to 3.10
>>  - Move initialization of iocb->private to ext4_file_write() as we don't
>>    have ext4_file_write_iter(), which is introduced by commit 9b884164.
>>  - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
>>    as ext4_file_dio_write() is not move into ext4_file_write()]
>> Signed-off-by: hujianyang <hujianyang@huawei.com>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Signed-off-by: Ming Lei <ming.lei@canonical.com>
>>
>> ---
>>  fs/ext4/file.c |    8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> --- a/fs/ext4/file.c
>> +++ b/fs/ext4/file.c
>> @@ -100,7 +100,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>>       struct blk_plug plug;
>>       int unaligned_aio = 0;
>>       ssize_t ret;
>> -     int overwrite = 0;
>> +     int *overwrite = iocb->private;
>>       size_t length = iov_length(iov, nr_segs);
>>
>>       if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
>> @@ -118,8 +118,6 @@ ext4_file_dio_write(struct kiocb *iocb,
>>       mutex_lock(&inode->i_mutex);
>>       blk_start_plug(&plug);
>>
>> -     iocb->private = &overwrite;
>> -
>>       /* check whether we do a DIO overwrite or not */
>>       if (ext4_should_dioread_nolock(inode) && !unaligned_aio &&
>>           !file->f_mapping->nrpages && pos + length <= i_size_read(inode)) {
>> @@ -143,7 +141,7 @@ ext4_file_dio_write(struct kiocb *iocb,
>>                * So we should check these two conditions.
>>                */
>>               if (err == len && (map.m_flags & EXT4_MAP_MAPPED))
>> -                     overwrite = 1;
>> +                     *overwrite = 1;
>>       }
>>
>>       ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
>> @@ -170,6 +168,7 @@ ext4_file_write(struct kiocb *iocb, cons
>>  {
>>       struct inode *inode = file_inode(iocb->ki_filp);
>>       ssize_t ret;
>> +     int overwrite = 0;
>>
>>       /*
>>        * If we have encountered a bitmap-format file, the size limit
>> @@ -190,6 +189,7 @@ ext4_file_write(struct kiocb *iocb, cons
>>               }
>>       }
>>
>> +     iocb->private = &overwrite;
>>       if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
>>               ret = ext4_file_dio_write(iocb, iov, nr_segs, pos);
>>       else
>>
>> --
>> kernel-team mailing list
>> kernel-team@lists.ubuntu.com
>> https://lists.ubuntu.com/mailman/listinfo/kernel-team
>
> This has been queued in the 3.16 stable tree. We'll pick this up that way.

If you mean the patch in the link:

         https://patchwork.ozlabs.org/patch/409372/

it is basically same with upstream commit, and it can't be applied
on 3.13 at all because it is against ext3 write_iter changes and the
"Clean up ext4_file_write" patchset.


Thanks,
Ming Lei
Brad Figg Feb. 11, 2015, 1:36 a.m. UTC | #5
On Wed, Feb 11, 2015 at 07:12:37AM +0800, Ming Lei wrote:
> On Wed, Feb 11, 2015 at 12:03 AM, Brad Figg <brad.figg@canonical.com> wrote:
> > On Mon, Feb 09, 2015 at 11:25:35AM +0800, Ming Lei wrote:
> >> From: Dmitry Monakhov <dmonakhov@openvz.org>
> >>
> >> Buglink:
> >>       https://bugs.launchpad.net/bugs/1418284
> >>
> >> Upstream commit:
> >>       a41537e69b4aa43f0fe(ext4: prevent bugon on race between write/fcntl)
> >>
> >> Patch for 3.14:
> >>       http://git.kernel.org/cgit/linux/kernel/git/stable/stable-queue.git/tree/queue-3.14/ext4-prevent-bugon-on-race-between-write-fcntl.patch
> >>
> >> From: Dmitry Monakhov <dmonakhov@openvz.org>
> >>
> >> commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
> >>
> >> O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
> >> twice inside ext4_file_write_iter() and __generic_file_write() which
> >> result in BUG_ON inside ext4_direct_IO.
> >>
> >> Let's initialize iocb->private unconditionally.
> >>
> >> TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
> >>
> >> #TYPICAL STACK TRACE:
> >> kernel BUG at fs/ext4/inode.c:2960!
> >> invalid opcode: 0000 [#1] SMP
> >> Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
> >> CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
> >> Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
> >> task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
> >> RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
> >> RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
> >> RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
> >> RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
> >> RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
> >> R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
> >> R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
> >> FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
> >> Stack:
> >>  ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
> >>  0000000000000200 0000000000000200 0000000000000001 0000000000000200
> >>  ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
> >> Call Trace:
> >>  [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
> >>  [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
> >>  [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
> >>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
> >>  [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
> >>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
> >>  [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
> >>  [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
> >>  [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
> >>  [<ffffffff810990e5>] ? local_clock+0x25/0x30
> >>  [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
> >>  [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
> >>  [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
> >>  [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
> >>  [<ffffffff811be030>] SyS_io_submit+0x10/0x20
> >>  [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
> >> Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
> >> 24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
> >> RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
> >>  RSP <ffff88080f90bb58>
> >>
> >> Reported-by: Sasha Levin <sasha.levin@oracle.com>
> >> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> >> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
> >> [hujianyang: Backported to 3.10
> >>  - Move initialization of iocb->private to ext4_file_write() as we don't
> >>    have ext4_file_write_iter(), which is introduced by commit 9b884164.
> >>  - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
> >>    as ext4_file_dio_write() is not move into ext4_file_write()]
> >> Signed-off-by: hujianyang <hujianyang@huawei.com>
> >> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> >>
> >> ---
> >>  fs/ext4/file.c |    8 ++++----
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> --- a/fs/ext4/file.c
> >> +++ b/fs/ext4/file.c
> >> @@ -100,7 +100,7 @@ ext4_file_dio_write(struct kiocb *iocb,
> >>       struct blk_plug plug;
> >>       int unaligned_aio = 0;
> >>       ssize_t ret;
> >> -     int overwrite = 0;
> >> +     int *overwrite = iocb->private;
> >>       size_t length = iov_length(iov, nr_segs);
> >>
> >>       if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
> >> @@ -118,8 +118,6 @@ ext4_file_dio_write(struct kiocb *iocb,
> >>       mutex_lock(&inode->i_mutex);
> >>       blk_start_plug(&plug);
> >>
> >> -     iocb->private = &overwrite;
> >> -
> >>       /* check whether we do a DIO overwrite or not */
> >>       if (ext4_should_dioread_nolock(inode) && !unaligned_aio &&
> >>           !file->f_mapping->nrpages && pos + length <= i_size_read(inode)) {
> >> @@ -143,7 +141,7 @@ ext4_file_dio_write(struct kiocb *iocb,
> >>                * So we should check these two conditions.
> >>                */
> >>               if (err == len && (map.m_flags & EXT4_MAP_MAPPED))
> >> -                     overwrite = 1;
> >> +                     *overwrite = 1;
> >>       }
> >>
> >>       ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
> >> @@ -170,6 +168,7 @@ ext4_file_write(struct kiocb *iocb, cons
> >>  {
> >>       struct inode *inode = file_inode(iocb->ki_filp);
> >>       ssize_t ret;
> >> +     int overwrite = 0;
> >>
> >>       /*
> >>        * If we have encountered a bitmap-format file, the size limit
> >> @@ -190,6 +189,7 @@ ext4_file_write(struct kiocb *iocb, cons
> >>               }
> >>       }
> >>
> >> +     iocb->private = &overwrite;
> >>       if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
> >>               ret = ext4_file_dio_write(iocb, iov, nr_segs, pos);
> >>       else
> >>
> >> --
> >> kernel-team mailing list
> >> kernel-team@lists.ubuntu.com
> >> https://lists.ubuntu.com/mailman/listinfo/kernel-team
> >
> > This has been queued in the 3.16 stable tree. We'll pick this up that way.
> 
> If you mean the patch in the link:
> 
>          https://patchwork.ozlabs.org/patch/409372/
> 
> it is basically same with upstream commit, and it can't be applied
> on 3.13 at all because it is against ext3 write_iter changes and the
> "Clean up ext4_file_write" patchset.
> 
> 
> Thanks,
> Ming Lei

I was looking in the wrong upstream stable queue. Sorry about that. I've
applied this to Trusty master-next.

Brad

Patch
diff mbox

--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -100,7 +100,7 @@  ext4_file_dio_write(struct kiocb *iocb,
 	struct blk_plug plug;
 	int unaligned_aio = 0;
 	ssize_t ret;
-	int overwrite = 0;
+	int *overwrite = iocb->private;
 	size_t length = iov_length(iov, nr_segs);
 
 	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS) &&
@@ -118,8 +118,6 @@  ext4_file_dio_write(struct kiocb *iocb,
 	mutex_lock(&inode->i_mutex);
 	blk_start_plug(&plug);
 
-	iocb->private = &overwrite;
-
 	/* check whether we do a DIO overwrite or not */
 	if (ext4_should_dioread_nolock(inode) && !unaligned_aio &&
 	    !file->f_mapping->nrpages && pos + length <= i_size_read(inode)) {
@@ -143,7 +141,7 @@  ext4_file_dio_write(struct kiocb *iocb,
 		 * So we should check these two conditions.
 		 */
 		if (err == len && (map.m_flags & EXT4_MAP_MAPPED))
-			overwrite = 1;
+			*overwrite = 1;
 	}
 
 	ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
@@ -170,6 +168,7 @@  ext4_file_write(struct kiocb *iocb, cons
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
 	ssize_t ret;
+	int overwrite = 0;
 
 	/*
 	 * If we have encountered a bitmap-format file, the size limit
@@ -190,6 +189,7 @@  ext4_file_write(struct kiocb *iocb, cons
 		}
 	}
 
+	iocb->private = &overwrite;
 	if (unlikely(iocb->ki_filp->f_flags & O_DIRECT))
 		ret = ext4_file_dio_write(iocb, iov, nr_segs, pos);
 	else