ext4: Avoid crash when inline data creation follows DIO write

Message ID 20220727155753.13969-1-jack@suse.cz
State Awaiting Upstream
Series ext4: Avoid crash when inline data creation follows DIO write

Commit Message

Jan Kara July 27, 2022, 3:57 p.m. UTC
When an inode is created and written to using direct IO, nothing clears
the EXT4_STATE_MAY_INLINE_DATA flag. Thus when the inode is later
truncated to, say, 1 byte and written using a normal (buffered) write,
we try to store the data as inline data. This confuses later code
because the inode now has both a normal block and inline data allocated,
and the confusion manifests, for example, as:

kernel BUG at fs/ext4/inode.c:2721!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
RIP: 0010:ext4_writepages+0x363d/0x3660
RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293
RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180
RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000
RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b
R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128
R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001
FS:  00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0
Call Trace:
 <TASK>
 do_writepages+0x397/0x640
 filemap_fdatawrite_wbc+0x151/0x1b0
 file_write_and_wait_range+0x1c9/0x2b0
 ext4_sync_file+0x19e/0xa00
 vfs_fsync_range+0x17b/0x190
 ext4_buffered_write_iter+0x488/0x530
 ext4_file_write_iter+0x449/0x1b90
 vfs_write+0xbcd/0xf40
 ksys_write+0x198/0x2c0
 __x64_sys_write+0x7b/0x90
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 </TASK>

Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing
a direct IO write to a file.

Reported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/file.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Tadeusz Struk July 27, 2022, 5:17 p.m. UTC | #1
On 7/27/22 08:57, Jan Kara wrote:
> When inode is created and written to using direct IO, there is nothing
> to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
> truncated later to say 1 byte and written using normal write, we will
> try to store the data as inline data. This confuses the code later
> because the inode now has both normal block and inline data allocated
> and the confusion manifests for example as:
> 
> kernel BUG at fs/ext4/inode.c:2721!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
> RIP: 0010:ext4_writepages+0x363d/0x3660
> RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293
> RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180
> RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000
> RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b
> R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128
> R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001
> FS:  00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0
> Call Trace:
>   <TASK>
>   do_writepages+0x397/0x640
>   filemap_fdatawrite_wbc+0x151/0x1b0
>   file_write_and_wait_range+0x1c9/0x2b0
>   ext4_sync_file+0x19e/0xa00
>   vfs_fsync_range+0x17b/0x190
>   ext4_buffered_write_iter+0x488/0x530
>   ext4_file_write_iter+0x449/0x1b90
>   vfs_write+0xbcd/0xf40
>   ksys_write+0x198/0x2c0
>   __x64_sys_write+0x7b/0x90
>   do_syscall_64+0x3d/0x90
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>   </TASK>
> 
> Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing
> direct IO write to a file.
> 
> Reported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
> Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
> Signed-off-by: Jan Kara <jack@suse.cz>

That works fine for me. Thanks Honza.

Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>

It should also be applied to stable v5.15 and v5.10.
I will send a request once this lands in mainline.
Lukas Czerner July 28, 2022, 6:54 a.m. UTC | #2
On Wed, Jul 27, 2022 at 05:57:53PM +0200, Jan Kara wrote:
> When inode is created and written to using direct IO, there is nothing
> to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
> truncated later to say 1 byte and written using normal write, we will
> try to store the data as inline data. This confuses the code later
> because the inode now has both normal block and inline data allocated
> and the confusion manifests for example as:
> 
> kernel BUG at fs/ext4/inode.c:2721!
> invalid opcode: 0000 [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 359 Comm: repro Not tainted 5.19.0-rc8-00001-g31ba1e3b8305-dirty #15
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
> RIP: 0010:ext4_writepages+0x363d/0x3660
> RSP: 0018:ffffc90000ccf260 EFLAGS: 00010293
> RAX: ffffffff81e1abcd RBX: 0000008000000000 RCX: ffff88810842a180
> RDX: 0000000000000000 RSI: 0000008000000000 RDI: 0000000000000000
> RBP: ffffc90000ccf650 R08: ffffffff81e17d58 R09: ffffed10222c680b
> R10: dfffe910222c680c R11: 1ffff110222c680a R12: ffff888111634128
> R13: ffffc90000ccf880 R14: 0000008410000000 R15: 0000000000000001
> FS:  00007f72635d2640(0000) GS:ffff88811b000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000565243379180 CR3: 000000010aa74000 CR4: 0000000000150eb0
> Call Trace:
>  <TASK>
>  do_writepages+0x397/0x640
>  filemap_fdatawrite_wbc+0x151/0x1b0
>  file_write_and_wait_range+0x1c9/0x2b0
>  ext4_sync_file+0x19e/0xa00
>  vfs_fsync_range+0x17b/0x190
>  ext4_buffered_write_iter+0x488/0x530
>  ext4_file_write_iter+0x449/0x1b90
>  vfs_write+0xbcd/0xf40
>  ksys_write+0x198/0x2c0
>  __x64_sys_write+0x7b/0x90
>  do_syscall_64+0x3d/0x90
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>  </TASK>
> 
> Fix the problem by clearing EXT4_STATE_MAY_INLINE_DATA when we are doing
> direct IO write to a file.

Looks good, thanks.

Reviewed-by: Lukas Czerner <lczerner@redhat.com>

> 
> Reported-by: Tadeusz Struk <tadeusz.struk@linaro.org>
> Reported-by: syzbot+bd13648a53ed6933ca49@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=a1e89d09bbbcbd5c4cb45db230ee28c822953984
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/ext4/file.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index 109d07629f81..cab5dfed1cd6 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -528,6 +528,12 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		ret = -EAGAIN;
>  		goto out;
>  	}
> +	/*
> +	 * Make sure inline data cannot be created anymore since we are going
> + 	 * to allocate blocks for DIO. We know the inode does not have any
> +	 * inline data now because ext4_dio_supported() checked for that.
> +	 */
> +	ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
>  
>  	offset = iocb->ki_pos;
>  	count = ret;
> -- 
> 2.35.3
>
Theodore Ts'o Sept. 27, 2022, 9:53 p.m. UTC | #3
On Wed, 27 Jul 2022 17:57:53 +0200, Jan Kara wrote:
> When inode is created and written to using direct IO, there is nothing
> to clear the EXT4_STATE_MAY_INLINE_DATA flag. Thus when inode gets
> truncated later to say 1 byte and written using normal write, we will
> try to store the data as inline data. This confuses the code later
> because the inode now has both normal block and inline data allocated
> and the confusion manifests for example as:
> 
> [...]

Applied, thanks!

[1/1] ext4: Avoid crash when inline data creation follows DIO write
      commit: 4331037750fdd4c698facc8a03075f88f15ffbe6

Best regards,

Patch

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 109d07629f81..cab5dfed1cd6 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -528,6 +528,12 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		ret = -EAGAIN;
 		goto out;
 	}
+	/*
+	 * Make sure inline data cannot be created anymore since we are going
+	 * to allocate blocks for DIO. We know the inode does not have any
+	 * inline data now because ext4_dio_supported() checked for that.
+	 */
+	ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
 
 	offset = iocb->ki_pos;
 	count = ret;
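
For readers who want to follow the reasoning in the commit message, here is a minimal userspace sketch of the sequence it describes (DIO write, truncate to 1 byte, buffered write). It is a toy model, not kernel code. The `toy_*` types and helpers and the `TOY_INLINE_MAX` limit are hypothetical names invented for illustration. The single boolean `may_inline_data` stands in for EXT4_STATE_MAY_INLINE_DATA, and the `with_fix` parameter stands in for the `ext4_clear_inode_state()` call added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed inline capacity for this model only; not taken from the patch. */
#define TOY_INLINE_MAX 60

/* Hypothetical, heavily simplified stand-in for an ext4 inode. */
struct toy_inode {
	bool may_inline_data;   /* models EXT4_STATE_MAY_INLINE_DATA */
	bool has_blocks;        /* data stored in allocated blocks */
	bool has_inline_data;   /* data stored inside the inode body */
	size_t size;
};

/* A fresh inode may still get inline data. */
static struct toy_inode toy_create(void)
{
	struct toy_inode ino = { .may_inline_data = true };
	return ino;
}

/* A direct IO write allocates blocks; with the fix it also clears the flag. */
static void toy_dio_write(struct toy_inode *ino, size_t len, bool with_fix)
{
	if (with_fix)
		ino->may_inline_data = false;
	ino->has_blocks = true;
	if (len > ino->size)
		ino->size = len;
}

/* Shrinking to a non-zero size keeps the first block allocated. */
static void toy_truncate(struct toy_inode *ino, size_t new_size)
{
	ino->size = new_size;
	if (new_size == 0)
		ino->has_blocks = false;
}

/* A small buffered write goes inline while the flag is still set. */
static void toy_buffered_write(struct toy_inode *ino, size_t pos, size_t len)
{
	size_t end = pos + len;

	if (ino->may_inline_data && end <= TOY_INLINE_MAX) {
		ino->has_inline_data = true;
	} else {
		ino->may_inline_data = false;
		ino->has_blocks = true;
	}
	if (end > ino->size)
		ino->size = end;
}

/* The state the commit message describes: both blocks and inline data. */
static bool toy_inconsistent(const struct toy_inode *ino)
{
	return ino->has_blocks && ino->has_inline_data;
}

/* Replays the sequence; returns true if the inode ends up inconsistent. */
static bool toy_run_sequence(bool with_fix)
{
	struct toy_inode ino = toy_create();

	toy_dio_write(&ino, 4096, with_fix);
	assert(ino.has_blocks);
	toy_truncate(&ino, 1);
	toy_buffered_write(&ino, 0, 1);
	return toy_inconsistent(&ino);
}
```

In this model, `toy_run_sequence(false)` ends with both `has_blocks` and `has_inline_data` set, which corresponds to the confused state behind the BUG, while `toy_run_sequence(true)` keeps the data in blocks only. The real kernel paths involve far more state (journalling, extent trees, xattr space), so this only illustrates why clearing the flag at DIO time closes the window.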