diff mbox series

Can ext4_break_layouts() ever fail?

Message ID 20190516205615.GA2926@iweiny-DESK2.sc.intel.com
State Superseded
Headers show
Series Can ext4_break_layouts() ever fail? | expand

Commit Message

Ira Weiny May 16, 2019, 8:56 p.m. UTC
While testing truncate failure options for FS DAX with GUP pins; I discovered
that if ext4_break_layouts() returns an error it can result in orphan'ed inodes
being left on the orphan list resulting in the following error when the FS is
unmounted.

        EXT4-fs (pmem0): Inode 12 (00000000d274c438): orphan list check failed!
        00000000d274c438: 0001f30a 00000004 00000000 00000000 ................
        000000001fa30de6: 0000000a 00008600 00000000 00000000 ................
        000000003948cb2f: 00000000 00000000 00000000 00000000 ................

        [snip]

        000000009acf82ac: 00000003 00000003 00000000 00000000 ................
        00000000d0cb8f52: 00000000 00000000 00000000 00000000 ................
        000000001edc0c35: bf718fee 00000000 ..q.....
        CPU: 5 PID: 1806 Comm: umount Not tainted 5.1.0-rc2+ #56
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/4
        Call Trace:
         dump_stack+0x5c/0x80
         ext4_destroy_inode+0x86/0x90
         dispose_list+0x48/0x60
         evict_inodes+0x160/0x1b0
         generic_shutdown_super+0x3f/0x100
         kill_block_super+0x21/0x50
         deactivate_locked_super+0x34/0x70
         cleanup_mnt+0x3b/0x70
         task_work_run+0x8a/0xb0
         exit_to_usermode_loop+0xb9/0xc0
         do_syscall_64+0x153/0x180
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7fc5ed56f6bb
        Code: 27 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 008
        RSP: 002b:00007ffd524be128 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
        RAX: 0000000000000000 RBX: 000055867f9b2fb0 RCX: 00007fc5ed56f6bb
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055867f9b3190
        RBP: 0000000000000000 R08: 000055867f9b31b0 R09: 00007fc5ed5f1e80
        R10: 0000000000000000 R11: 0000000000000246 R12: 000055867f9b3190
        R13: 00007fc5ed7261a4 R14: 0000000000000000 R15: 00007ffd524be398
        EXT4-fs (pmem0): sb orphan head is 12
        sb_info orphan list:
          inode pmem0:12 at 00000000120c1727: mode 100644, nlink 1, next 0

Followed by this panic:

        ------------[ cut here ]------------
        kernel BUG at fs/ext4/super.c:1022!
        invalid opcode: 0000 [#1] SMP PTI
        CPU: 5 PID: 1806 Comm: umount Not tainted 5.1.0-rc2+ #56
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/4
        RIP: 0010:ext4_put_super+0x369/0x370
        Code: 24 d0 03 00 00 48 8b 40 68 83 60 60 fb 0f b7 83 a0 00 00 00 66 41 89 46 3a 41 f6 44 24 50 01 0f 85 71 fd ff ff e9 5f fd8
        RSP: 0018:ffffc900029cfe68 EFLAGS: 00010206
        RAX: ffff888000691dd0 RBX: ffff88800e78f800 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff88800fc96838 RDI: ffff88800fc96838
        RBP: ffff88800e78f9f8 R08: 0000000000000603 R09: 0000000000aaaaaa
        R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800e78e800
        R13: ffff88800e78f9f8 R14: ffffffff820b3a50 R15: ffff888016521f70
        FS:  00007fc5ed3b8080(0000) GS:ffff88800fc80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f55f82181a0 CR3: 0000000015e9a000 CR4: 00000000000006e0
        Call Trace:
         generic_shutdown_super+0x6c/0x100
         kill_block_super+0x21/0x50
         deactivate_locked_super+0x34/0x70
         cleanup_mnt+0x3b/0x70
         task_work_run+0x8a/0xb0
         exit_to_usermode_loop+0xb9/0xc0
         do_syscall_64+0x153/0x180
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7fc5ed56f6bb
        Code: 27 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 008
        RSP: 002b:00007ffd524be128 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
        RAX: 0000000000000000 RBX: 000055867f9b2fb0 RCX: 00007fc5ed56f6bb
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055867f9b3190
        RBP: 0000000000000000 R08: 000055867f9b31b0 R09: 00007fc5ed5f1e80
        R10: 0000000000000000 R11: 0000000000000246 R12: 000055867f9b3190
        R13: 00007fc5ed7261a4 R14: 0000000000000000 R15: 00007ffd524be398
        Modules linked in: xfs libcrc32c ib_isert iscsi_target_mod rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_c
        ---[ end trace c300122aad5fcd86 ]---
        RIP: 0010:ext4_put_super+0x369/0x370
        Code: 24 d0 03 00 00 48 8b 40 68 83 60 60 fb 0f b7 83 a0 00 00 00 66 41 89 46 3a 41 f6 44 24 50 01 0f 85 71 fd ff ff e9 5f fd8
        RSP: 0018:ffffc900029cfe68 EFLAGS: 00010206
        RAX: ffff888000691dd0 RBX: ffff88800e78f800 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff88800fc96838 RDI: ffff88800fc96838
        RBP: ffff88800e78f9f8 R08: 0000000000000603 R09: 0000000000aaaaaa
        R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800e78e800
        R13: ffff88800e78f9f8 R14: ffffffff820b3a50 R15: ffff888016521f70
        FS:  00007fc5ed3b8080(0000) GS:ffff88800fc80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f55f82181a0 CR3: 0000000015e9a000 CR4: 00000000000006e0
        Kernel panic - not syncing: Fatal exception
        Kernel Offset: disabled
        ---[ end Kernel panic - not syncing: Fatal exception ]---
        ------------[ cut here ]------------

I kind of worked around this by removing the orphan inode from the orphan list
if ext4_break_layouts() fails.[1]  But I don't think this unwinds everything
properly.

Failing the truncate for GUP'ed pages could be done outside of
ext4_break_layouts() so it is not absolutely necessary that it return an error.

But this begs the question can ext4_break_layouts() fail?

It looks to me like it is possible for ext4_break_layouts() to fail if
prepare_to_wait_event() sees a pending signal.  Therefore I think this is a bug
in ext4 regardless of how I may implement a truncate failure.

Is that true?
Ira



[1] as shown here.

---
 fs/ext4/inode.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jan Kara May 17, 2019, 9:02 a.m. UTC | #1
On Thu 16-05-19 13:56:15, Ira Weiny wrote:
> While testing truncate failure options for FS DAX with GUP pins; I discovered
> that if ext4_break_layouts() returns an error it can result in orphan'ed inodes
> being left on the orphan list resulting in the following error when the FS is
> unmounted.
> 
>         EXT4-fs (pmem0): Inode 12 (00000000d274c438): orphan list check failed!
>         00000000d274c438: 0001f30a 00000004 00000000 00000000 ................
>         000000001fa30de6: 0000000a 00008600 00000000 00000000 ................
>         000000003948cb2f: 00000000 00000000 00000000 00000000 ................
> 
>         [snip]
> 
>         000000009acf82ac: 00000003 00000003 00000000 00000000 ................
>         00000000d0cb8f52: 00000000 00000000 00000000 00000000 ................
>         000000001edc0c35: bf718fee 00000000 ..q.....
>         CPU: 5 PID: 1806 Comm: umount Not tainted 5.1.0-rc2+ #56
>         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/4
>         Call Trace:
>          dump_stack+0x5c/0x80
>          ext4_destroy_inode+0x86/0x90
>          dispose_list+0x48/0x60
>          evict_inodes+0x160/0x1b0
>          generic_shutdown_super+0x3f/0x100
>          kill_block_super+0x21/0x50
>          deactivate_locked_super+0x34/0x70
>          cleanup_mnt+0x3b/0x70
>          task_work_run+0x8a/0xb0
>          exit_to_usermode_loop+0xb9/0xc0
>          do_syscall_64+0x153/0x180
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>         RIP: 0033:0x7fc5ed56f6bb
>         Code: 27 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 008
>         RSP: 002b:00007ffd524be128 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>         RAX: 0000000000000000 RBX: 000055867f9b2fb0 RCX: 00007fc5ed56f6bb
>         RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055867f9b3190
>         RBP: 0000000000000000 R08: 000055867f9b31b0 R09: 00007fc5ed5f1e80
>         R10: 0000000000000000 R11: 0000000000000246 R12: 000055867f9b3190
>         R13: 00007fc5ed7261a4 R14: 0000000000000000 R15: 00007ffd524be398
>         EXT4-fs (pmem0): sb orphan head is 12
>         sb_info orphan list:
>           inode pmem0:12 at 00000000120c1727: mode 100644, nlink 1, next 0
> 
> Followed by this panic:
> 
>         ------------[ cut here ]------------
>         kernel BUG at fs/ext4/super.c:1022!
>         invalid opcode: 0000 [#1] SMP PTI
>         CPU: 5 PID: 1806 Comm: umount Not tainted 5.1.0-rc2+ #56
>         Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/4
>         RIP: 0010:ext4_put_super+0x369/0x370
>         Code: 24 d0 03 00 00 48 8b 40 68 83 60 60 fb 0f b7 83 a0 00 00 00 66 41 89 46 3a 41 f6 44 24 50 01 0f 85 71 fd ff ff e9 5f fd8
>         RSP: 0018:ffffc900029cfe68 EFLAGS: 00010206
>         RAX: ffff888000691dd0 RBX: ffff88800e78f800 RCX: 0000000000000000
>         RDX: 0000000000000000 RSI: ffff88800fc96838 RDI: ffff88800fc96838
>         RBP: ffff88800e78f9f8 R08: 0000000000000603 R09: 0000000000aaaaaa
>         R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800e78e800
>         R13: ffff88800e78f9f8 R14: ffffffff820b3a50 R15: ffff888016521f70
>         FS:  00007fc5ed3b8080(0000) GS:ffff88800fc80000(0000) knlGS:0000000000000000
>         CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>         CR2: 00007f55f82181a0 CR3: 0000000015e9a000 CR4: 00000000000006e0
>         Call Trace:
>          generic_shutdown_super+0x6c/0x100
>          kill_block_super+0x21/0x50
>          deactivate_locked_super+0x34/0x70
>          cleanup_mnt+0x3b/0x70
>          task_work_run+0x8a/0xb0
>          exit_to_usermode_loop+0xb9/0xc0
>          do_syscall_64+0x153/0x180
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>         RIP: 0033:0x7fc5ed56f6bb
>         Code: 27 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 008
>         RSP: 002b:00007ffd524be128 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>         RAX: 0000000000000000 RBX: 000055867f9b2fb0 RCX: 00007fc5ed56f6bb
>         RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055867f9b3190
>         RBP: 0000000000000000 R08: 000055867f9b31b0 R09: 00007fc5ed5f1e80
>         R10: 0000000000000000 R11: 0000000000000246 R12: 000055867f9b3190
>         R13: 00007fc5ed7261a4 R14: 0000000000000000 R15: 00007ffd524be398
>         Modules linked in: xfs libcrc32c ib_isert iscsi_target_mod rpcrdma ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_c
>         ---[ end trace c300122aad5fcd86 ]---
>         RIP: 0010:ext4_put_super+0x369/0x370
>         Code: 24 d0 03 00 00 48 8b 40 68 83 60 60 fb 0f b7 83 a0 00 00 00 66 41 89 46 3a 41 f6 44 24 50 01 0f 85 71 fd ff ff e9 5f fd8
>         RSP: 0018:ffffc900029cfe68 EFLAGS: 00010206
>         RAX: ffff888000691dd0 RBX: ffff88800e78f800 RCX: 0000000000000000
>         RDX: 0000000000000000 RSI: ffff88800fc96838 RDI: ffff88800fc96838
>         RBP: ffff88800e78f9f8 R08: 0000000000000603 R09: 0000000000aaaaaa
>         R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800e78e800
>         R13: ffff88800e78f9f8 R14: ffffffff820b3a50 R15: ffff888016521f70
>         FS:  00007fc5ed3b8080(0000) GS:ffff88800fc80000(0000) knlGS:0000000000000000
>         CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>         CR2: 00007f55f82181a0 CR3: 0000000015e9a000 CR4: 00000000000006e0
>         Kernel panic - not syncing: Fatal exception
>         Kernel Offset: disabled
>         ---[ end Kernel panic - not syncing: Fatal exception ]---
>         ------------[ cut here ]------------
> 
> I kind of worked around this by removing the orphan inode from the orphan list
> if ext4_break_layouts() fails.[1]  But I don't think this unwinds everything
> properly.
> 
> Failing the truncate for GUP'ed pages could be done outside of
> ext4_break_layouts() so it is not absolutely necessary that it return an error.
> 
> But this begs the question can ext4_break_layouts() fail?

Yes, it can fail e.g. due to a signal and yes, ext4_setattr() should handle
that failure.

> It looks to me like it is possible for ext4_break_layouts() to fail if
> prepare_to_wait_event() sees a pending signal.  Therefore I think this is a bug
> in ext4 regardless of how I may implement a truncate failure.

Yes, it's a bug in ext4.

> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5648,6 +5648,8 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>                 if (rc) {  
>                         up_write(&EXT4_I(inode)->i_mmap_sem);
>                         error = rc;
> +                       if (orphan)
> +                               ext4_orphan_del(NULL, inode);

This isn't quite correct. This would silence the warning but leave the
inode in on-disk orphan list. That is OK in case of fs-meltdown types of
failures like IO errors for metadata, aborted journal, or stuff like that.
But failing ext4_break_layouts() needs to be handled gracefully maintaining
fs consistency. So you rather need something like:

			if (orphan && inode->i_nlink > 0) {
				handle_t *handle;

				handle = ext4_journal_start(inode,
						EXT4_HT_INODE, 3);
				if (IS_ERR(handle)) {
					ext4_orphan_del(NULL, inode);
					goto err_out;
				}
				ext4_orphan_del(handle, inode);
				ext4_journal_stop(handle);
			}

								Honza
Ira Weiny May 17, 2019, 8:17 p.m. UTC | #2
On Fri, May 17, 2019 at 11:02:52AM +0200, Jan Kara wrote:
> On Thu 16-05-19 13:56:15, Ira Weiny wrote:
> 
> > It looks to me like it is possible for ext4_break_layouts() to fail if
> > prepare_to_wait_event() sees a pending signal.  Therefore I think this is a bug
> > in ext4 regardless of how I may implement a truncate failure.
> 
> Yes, it's a bug in ext4.
> 
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -5648,6 +5648,8 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
> >                 if (rc) {  
> >                         up_write(&EXT4_I(inode)->i_mmap_sem);
> >                         error = rc;
> > +                       if (orphan)
> > +                               ext4_orphan_del(NULL, inode);
> 
> This isn't quite correct. This would silence the warning but leave the
> inode in on-disk orphan list. That is OK in case of fs-meltdown types of
> failures like IO errors for metadata, aborted journal, or stuff like that.
> But failing ext4_break_layouts() needs to be handled gracefully maintaining
> fs consistency. So you rather need something like:
> 
> 			if (orphan && inode->i_nlink > 0) {
> 				handle_t *handle;
> 
> 				handle = ext4_journal_start(inode,
> 						EXT4_HT_INODE, 3);
> 				if (IS_ERR(handle)) {
> 					ext4_orphan_del(NULL, inode);
> 					goto err_out;
> 				}
> 				ext4_orphan_del(handle, inode);
> 				ext4_journal_stop(handle);
> 			}
>

Thanks!  Unfortunately, even with your suggestion something is still wrong with
my code.

For some reason this does not seem to be "canceling" the truncate completely.
With my test code for FS DAX which fails ext4_break_layout() the file is being
truncated and an application which is writing past that truncation is getting a
SIGBUS.

I don't understand why this is happening because failing here should be
skipping the call to truncate_pagecache() which AFAICT does user unmappings...
So...  I'm still investigating this.

But thanks for confirming this is a bug.  I will get a generic patch out soon.

Thanks,
Ira
Jan Kara May 20, 2019, 8:23 a.m. UTC | #3
On Fri 17-05-19 13:17:47, Ira Weiny wrote:
> On Fri, May 17, 2019 at 11:02:52AM +0200, Jan Kara wrote:
> > On Thu 16-05-19 13:56:15, Ira Weiny wrote:
> > 
> > > It looks to me like it is possible for ext4_break_layouts() to fail if
> > > prepare_to_wait_event() sees a pending signal.  Therefore I think this is a bug
> > > in ext4 regardless of how I may implement a truncate failure.
> > 
> > Yes, it's a bug in ext4.
> > 
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -5648,6 +5648,8 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
> > >                 if (rc) {  
> > >                         up_write(&EXT4_I(inode)->i_mmap_sem);
> > >                         error = rc;
> > > +                       if (orphan)
> > > +                               ext4_orphan_del(NULL, inode);
> > 
> > This isn't quite correct. This would silence the warning but leave the
> > inode in on-disk orphan list. That is OK in case of fs-meltdown types of
> > failures like IO errors for metadata, aborted journal, or stuff like that.
> > But failing ext4_break_layouts() needs to be handled gracefully maintaining
> > fs consistency. So you rather need something like:
> > 
> > 			if (orphan && inode->i_nlink > 0) {
> > 				handle_t *handle;
> > 
> > 				handle = ext4_journal_start(inode,
> > 						EXT4_HT_INODE, 3);
> > 				if (IS_ERR(handle)) {
> > 					ext4_orphan_del(NULL, inode);
> > 					goto err_out;
> > 				}
> > 				ext4_orphan_del(handle, inode);
> > 				ext4_journal_stop(handle);
> > 			}
> >
> 
> Thanks!  Unfortunately, even with your suggestion something is still
> wrong with my code.
> 
> For some reason this does not seem to be "canceling" the truncate
> completely.  With my test code for FS DAX which fails ext4_break_layout()
> the file is being truncated and an application which is writing past that
> truncation is getting a SIGBUS.

Looking at the code again, I'm not really surprised. The path bailing out
of truncate in case ext4_break_layouts() fails is really hosed. The problem
is that when we get to ext4_break_layouts(), we have already updated i_size
and i_disksize and we happily leave them at their new values when bailing
out. So we need to somewhat reorder the stuff we do in ext4_setattr(). I'll
send a patch for that since it needs some considerations for proper lock
ordering etc... Thanks for experimenting with this :)

								Honza
Ira Weiny May 20, 2019, 4:24 p.m. UTC | #4
> On Fri 17-05-19 13:17:47, Ira Weiny wrote:
> > On Fri, May 17, 2019 at 11:02:52AM +0200, Jan Kara wrote:
> > > On Thu 16-05-19 13:56:15, Ira Weiny wrote:
> > >
> > > > It looks to me like it is possible for ext4_break_layouts() to
> > > > fail if
> > > > prepare_to_wait_event() sees a pending signal.  Therefore I think
> > > > this is a bug in ext4 regardless of how I may implement a truncate
> failure.
> > >
> > > Yes, it's a bug in ext4.
> > >
> > > > --- a/fs/ext4/inode.c
> > > > +++ b/fs/ext4/inode.c
> > > > @@ -5648,6 +5648,8 @@ int ext4_setattr(struct dentry *dentry,
> struct iattr *attr)
> > > >                 if (rc) {
> > > >                         up_write(&EXT4_I(inode)->i_mmap_sem);
> > > >                         error = rc;
> > > > +                       if (orphan)
> > > > +                               ext4_orphan_del(NULL, inode);
> > >
> > > This isn't quite correct. This would silence the warning but leave
> > > the inode in on-disk orphan list. That is OK in case of fs-meltdown
> > > types of failures like IO errors for metadata, aborted journal, or stuff like
> that.
> > > But failing ext4_break_layouts() needs to be handled gracefully
> > > maintaining fs consistency. So you rather need something like:
> > >
> > > 			if (orphan && inode->i_nlink > 0) {
> > > 				handle_t *handle;
> > >
> > > 				handle = ext4_journal_start(inode,
> > > 						EXT4_HT_INODE, 3);
> > > 				if (IS_ERR(handle)) {
> > > 					ext4_orphan_del(NULL, inode);
> > > 					goto err_out;
> > > 				}
> > > 				ext4_orphan_del(handle, inode);
> > > 				ext4_journal_stop(handle);
> > > 			}
> > >
> >
> > Thanks!  Unfortunately, even with your suggestion something is still
> > wrong with my code.
> >
> > For some reason this does not seem to be "canceling" the truncate
> > completely.  With my test code for FS DAX which fails
> > ext4_break_layout() the file is being truncated and an application
> > which is writing past that truncation is getting a SIGBUS.
> 
> Looking at the code again, I'm not really surprised. The path bailing out of
> truncate in case ext4_break_layouts() fails is really hosed. The problem is
> that when we get to ext4_break_layouts(), we have already updated i_size
> and i_disksize and we happily leave them at their new values when bailing
> out. So we need to somewhat reorder the stuff we do in ext4_setattr(). I'll
> send a patch for that since it needs some considerations for proper lock
> ordering etc... Thanks for experimenting with this :)
> 

I should have sent something last night but yes I came to the same conclusion through some simple experiments.

I agree that the locking and other considerations would trip me up.  So I'm not opposed to you helping here.  I had more than 1 problem with either crashes or hangs while playing with the code.  :-/

Thanks,
Ira

> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
diff mbox series

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 41eb643d75ff..134f5eebee4a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5648,6 +5648,8 @@  int ext4_setattr(struct dentry *dentry, struct iattr *attr)
                if (rc) {  
                        up_write(&EXT4_I(inode)->i_mmap_sem);
                        error = rc;
+                       if (orphan)
+                               ext4_orphan_del(NULL, inode);
                        goto err_out;
                }