diff mbox series

ubifs: Fix a potential space leak problem while linking tmpfile

Message ID 20200701112643.726986-1-chengzhihao1@huawei.com
State Changes Requested
Headers show
Series ubifs: Fix a potential space leak problem while linking tmpfile | expand

Commit Message

Zhihao Cheng July 1, 2020, 11:26 a.m. UTC
There is a potential space leak problem while linking tmpfile, in which
case, inode node (with nlink=0) is valid in tnc (on flash), which leads
to space leak. Meanwhile, the corresponding data nodes won't be released
from tnc. For example, (A reproducer can be found in Link):

$ mount UBIFS
  [process A]            [process B]         [TNC]         [orphan area]

 ubifs_tmpfile                          inode_A (nlink=0)     inode_A
                          do_commit     inode_A (nlink=0)     inode_A
			       ↑
      (comment: It makes sure not replay inode_A in next mount)
 ubifs_link                             inode_A (nlink=0)     inode_A
   ubifs_delete_orphan                  inode_A (nlink=0)
                          do_commit     inode_A (nlink=0)
                           ---> POWERCUT <---
   (ubifs_jnl_update)

$ mount UBIFS
  inode_A will neither be replayed in ubifs_replay_journal() nor
  ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will
  always on tnc, it occupy space but is non-visable for users.

Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing
orphans.") handles problem in mistakenly deleting relinked tmpfile
while replaying orphan area. Since that, tmpfile inode should always
live in orphan area even it is linked. Fix it by reverting commit
32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()").

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: <stable@vger.kernel.org>  # v5.3+
Fixes: 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208405
---
 fs/ubifs/dir.c | 7 -------
 1 file changed, 7 deletions(-)

Comments

Richard Weinberger July 7, 2020, 11:26 a.m. UTC | #1
On Wed, Jul 1, 2020 at 1:28 PM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>
> There is a potential space leak problem while linking tmpfile, in which
> case, inode node (with nlink=0) is valid in tnc (on flash), which leads
> to space leak. Meanwhile, the corresponding data nodes won't be released
> from tnc. For example, (A reproducer can be found in Link):
>
> $ mount UBIFS
>   [process A]            [process B]         [TNC]         [orphan area]
>
>  ubifs_tmpfile                          inode_A (nlink=0)     inode_A
>                           do_commit     inode_A (nlink=0)     inode_A
>                                ↑
>       (comment: It makes sure not replay inode_A in next mount)
>  ubifs_link                             inode_A (nlink=0)     inode_A
>    ubifs_delete_orphan                  inode_A (nlink=0)
>                           do_commit     inode_A (nlink=0)
>                            ---> POWERCUT <---
>    (ubifs_jnl_update)
>
> $ mount UBIFS
>   inode_A will neither be replayed in ubifs_replay_journal() nor
>   ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will
>   always on tnc, it occupy space but is non-visable for users.
>
> Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing
> orphans.") handles problem in mistakenly deleting relinked tmpfile
> while replaying orphan area. Since that, tmpfile inode should always
> live in orphan area even it is linked. Fix it by reverting commit
> 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()").

Well, by reverting this commit you re-introduce the issue it fixes. ;-\
Zhihao Cheng July 7, 2020, 11:54 a.m. UTC | #2
在 2020/7/7 19:26, Richard Weinberger 写道:
> On Wed, Jul 1, 2020 at 1:28 PM Zhihao Cheng <chengzhihao1@huawei.com> wrote:
>> There is a potential space leak problem while linking tmpfile, in which
>> case, inode node (with nlink=0) is valid in tnc (on flash), which leads
>> to space leak. Meanwhile, the corresponding data nodes won't be released
>> from tnc. For example, (A reproducer can be found in Link):
>>
>> $ mount UBIFS
>>    [process A]            [process B]         [TNC]         [orphan area]
>>
>>   ubifs_tmpfile                          inode_A (nlink=0)     inode_A
>>                            do_commit     inode_A (nlink=0)     inode_A
>>                                 ↑
>>        (comment: It makes sure not replay inode_A in next mount)
>>   ubifs_link                             inode_A (nlink=0)     inode_A
>>     ubifs_delete_orphan                  inode_A (nlink=0)
>>                            do_commit     inode_A (nlink=0)
>>                             ---> POWERCUT <---
>>     (ubifs_jnl_update)
>>
>> $ mount UBIFS
>>    inode_A will neither be replayed in ubifs_replay_journal() nor
>>    ubifs_mount_orphans(). inode_A (nlink=0) with its data nodes will
>>    always on tnc, it occupy space but is non-visable for users.
>>
>> Commit ee1438ce5dc4d ("ubifs: Check link count of inodes when killing
>> orphans.") handles problem in mistakenly deleting relinked tmpfile
>> while replaying orphan area. Since that, tmpfile inode should always
>> live in orphan area even it is linked. Fix it by reverting commit
>> 32fe905c17f001 ("ubifs: Fix O_TMPFILE corner case in ubifs_link()").
> Well, by reverting this commit you re-introduce the issue it fixes. ;-\
>
Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix 
O_TMPFILE corner case in ubifs_link()") wanted to fix.
I think orphan area is used to remind filesystem don't forget to delete 
inodes (whose nlink is 0) in next unclean rebooting. Generally, the file 
system is not corrupted caused by replaying orphan nodes.
Ralph reported a filesystem corruption in combination with overlayfs. 
Can you tell me the details about that problem? Thanks.
Richard Weinberger July 7, 2020, 12:09 p.m. UTC | #3
----- Ursprüngliche Mail -----
> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
> O_TMPFILE corner case in ubifs_link()") wanted to fix.
> I think orphan area is used to remind filesystem don't forget to delete
> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file
> system is not corrupted caused by replaying orphan nodes.
> Ralph reported a filesystem corruption in combination with overlayfs.
> Can you tell me the details about that problem? Thanks.

On my test bed I didn't see a fs corruption, what I saw was a failing orphan
self test while playing with O_TMPFILE and linkat().

When you create a tmpfile it has a link count of 0 and an orphan is
installed. Such that the tmpfile is gone after a reboot but you can
still use it prior to that.
By using linkat() you can raise the link counter to 1 again.
Thus, the orphan needs to be removed.
This is pattern overlayfs uses a lot.

Since UBIFS never supported raising the link counter from 0 to 1
we have many corner cases and fixing all these turned out into a nightmare.
...as you can see from the amount broken patches from me :-(.

Thanks,
//richard
Zhihao Cheng July 7, 2020, 12:36 p.m. UTC | #4
在 2020/7/7 20:09, Richard Weinberger 写道:
> ----- Ursprüngliche Mail -----
>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
>> O_TMPFILE corner case in ubifs_link()") wanted to fix.
>> I think orphan area is used to remind filesystem don't forget to delete
>> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file
>> system is not corrupted caused by replaying orphan nodes.
>> Ralph reported a filesystem corruption in combination with overlayfs.
>> Can you tell me the details about that problem? Thanks.
> On my test bed I didn't see a fs corruption, what I saw was a failing orphan
> self test while playing with O_TMPFILE and linkat().
Do we have a reproducer, or can I get the fail testcase? Is it a xfstest 
case?
>
> When you create a tmpfile it has a link count of 0 and an orphan is
> installed. Such that the tmpfile is gone after a reboot but you can
> still use it prior to that.
> By using linkat() you can raise the link counter to 1 again.
> Thus, the orphan needs to be removed.
> This is pattern overlayfs uses a lot.
>
> Since UBIFS never supported raising the link counter from 0 to 1
> we have many corner cases and fixing all these turned out into a nightmare.
> ...as you can see from the amount broken patches from me :-(.
>
> Thanks,
> //richard
>
> .
Richard Weinberger July 7, 2020, 12:47 p.m. UTC | #5
----- Ursprüngliche Mail -----
>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
>>> O_TMPFILE corner case in ubifs_link()") wanted to fix.
>>> I think orphan area is used to remind filesystem don't forget to delete
>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file
>>> system is not corrupted caused by replaying orphan nodes.
>>> Ralph reported a filesystem corruption in combination with overlayfs.
>>> Can you tell me the details about that problem? Thanks.
>> On my test bed I didn't see a fs corruption, what I saw was a failing orphan
>> self test while playing with O_TMPFILE and linkat().
> Do we have a reproducer, or can I get the fail testcase? Is it a xfstest
> case?

I think xfstests triggered it, yes.
Later today I can check. :)

Thanks,
//richard
Zhihao Cheng July 11, 2020, 6:37 a.m. UTC | #6
在 2020/7/7 20:47, Richard Weinberger 写道:
> ----- Ursprüngliche Mail -----
>>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
>>>> O_TMPFILE corner case in ubifs_link()") wanted to fix.
>>>> I think orphan area is used to remind filesystem don't forget to delete
>>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, the file
>>>> system is not corrupted caused by replaying orphan nodes.
>>>> Ralph reported a filesystem corruption in combination with overlayfs.
>>>> Can you tell me the details about that problem? Thanks.
>>> On my test bed I didn't see a fs corruption, what I saw was a failing orphan
>>> self test while playing with O_TMPFILE and linkat().
>> Do we have a reproducer, or can I get the fail testcase? Is it a xfstest
>> case?
> I think xfstests triggered it, yes.
> Later today I can check. :)
>
> Thanks,
> //richard
>
> .

I think I have found the testcases, overlay/006 and overlay/041.

The 'mv' and 'rm' operations will put lowertestfile into orphan list 
twice, so we must reseve the orphan deletion operation in ubifs_link(), 
otherwise the testcase fails and we will see the following msg:

   overlay/006 2s ... - output mismatch (see 
/root/git/xfstests-dev/results//overlay/006.out.bad)
     --- tests/overlay/006.out    2020-07-07 21:42:57.737000000 +0800
     +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 
14:31:55.340000000 +0800
     @@ -1,2 +1,4 @@
      QA output created by 006
      Silence is golden
     +rm: cannot remove 
'/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument
     +lowertestfile
     ...

   [  382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: 
orphaned twice
   [  382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: 
orphan list not empty at unmount


So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in 
function ubifs_link(). Following modifications applied in linux-5.8 has 
been tested by overlay/041, overlay/006 and  other tmpfile cases 
(generic/531, generic/530, generic/509, generic/389, generic/004).

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index ef85ec167a84..fd4443a5e8c6 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, 
struct inode *dir,
                 goto out_fname;

         lock_2_inodes(dir, inode);
-
-       /* Handle O_TMPFILE corner case, it is allowed to link a 
O_TMPFILE. */
-       if (inode->i_nlink == 0)
-               ubifs_delete_orphan(c, inode->i_ino);
-
inc_nlink(inode);
ihold(inode);
         inode->i_ctime = current_time(inode);
@@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, 
struct inode *dir,
         err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
         if (err)
                 goto out_cancel;
+
+       /* Handle O_TMPFILE corner case, it is allowed to link a 
O_TMPFILE. */
+       if (inode->i_nlink == 1)
+               ubifs_delete_orphan(c, inode->i_ino);
+
         unlock_2_inodes(dir, inode);

         ubifs_release_budget(c, &req);
@@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, 
struct inode *dir,
         dir->i_size -= sz_change;
         dir_ui->ui_size = dir->i_size;
drop_nlink(inode);
-       if (inode->i_nlink == 0)
-               ubifs_add_orphan(c, inode->i_ino);
         unlock_2_inodes(dir, inode);
         ubifs_release_budget(c, &req);
iput(inode);
--
Zhihao Cheng July 11, 2020, 6:44 a.m. UTC | #7
在 2020/7/11 14:37, Zhihao Cheng 写道:
> 在 2020/7/7 20:47, Richard Weinberger 写道:
>> ----- Ursprüngliche Mail -----
>>>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
>>>>> O_TMPFILE corner case in ubifs_link()") wanted to fix.
>>>>> I think orphan area is used to remind filesystem don't forget to 
>>>>> delete
>>>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, 
>>>>> the file
>>>>> system is not corrupted caused by replaying orphan nodes.
>>>>> Ralph reported a filesystem corruption in combination with overlayfs.
>>>>> Can you tell me the details about that problem? Thanks.
>>>> On my test bed I didn't see a fs corruption, what I saw was a 
>>>> failing orphan
>>>> self test while playing with O_TMPFILE and linkat().
>>> Do we have a reproducer, or can I get the fail testcase? Is it a 
>>> xfstest
>>> case?
>> I think xfstests triggered it, yes.
>> Later today I can check. :)
>>
>> Thanks,
>> //richard
>>
>> .
>
> I think I have found the testcases, overlay/006 and overlay/041.
>
> The 'mv' and 'rm' operations will put lowertestfile into orphan list 
> twice, so we must reseve the orphan deletion operation in 
> ubifs_link(), otherwise the testcase fails and we will see the 
> following msg:
>
>   overlay/006 2s ... - output mismatch (see 
> /root/git/xfstests-dev/results//overlay/006.out.bad)
>     --- tests/overlay/006.out    2020-07-07 21:42:57.737000000 +0800
>     +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 
> 14:31:55.340000000 +0800
>     @@ -1,2 +1,4 @@
>      QA output created by 006
>      Silence is golden
>     +rm: cannot remove 
> '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument
>     +lowertestfile
>     ...
>
>   [  382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: 
> orphaned twice
>   [  382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: 
> orphan list not empty at unmount
>
>
> So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in 
> function ubifs_link(). Following modifications applied in linux-5.8 
> has been tested by overlay/041, overlay/006 and  other tmpfile cases 
> (generic/531, generic/530, generic/509, generic/389, generic/004).
>
Results for testcases generic/530, generic/509, generic/389 and 
generic/004 are still "not run".
> diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
> index ef85ec167a84..fd4443a5e8c6 100644
> --- a/fs/ubifs/dir.c
> +++ b/fs/ubifs/dir.c
> @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>                 goto out_fname;
>
>         lock_2_inodes(dir, inode);
> -
> -       /* Handle O_TMPFILE corner case, it is allowed to link a 
> O_TMPFILE. */
> -       if (inode->i_nlink == 0)
> -               ubifs_delete_orphan(c, inode->i_ino);
> -
> inc_nlink(inode);
> ihold(inode);
>         inode->i_ctime = current_time(inode);
> @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>         err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
>         if (err)
>                 goto out_cancel;
> +
> +       /* Handle O_TMPFILE corner case, it is allowed to link a 
> O_TMPFILE. */
> +       if (inode->i_nlink == 1)
> +               ubifs_delete_orphan(c, inode->i_ino);
> +
>         unlock_2_inodes(dir, inode);
>
>         ubifs_release_budget(c, &req);
> @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>         dir->i_size -= sz_change;
>         dir_ui->ui_size = dir->i_size;
> drop_nlink(inode);
> -       if (inode->i_nlink == 0)
> -               ubifs_add_orphan(c, inode->i_ino);
>         unlock_2_inodes(dir, inode);
>         ubifs_release_budget(c, &req);
> iput(inode);
> -- 
>
Zhihao Cheng July 13, 2020, 3:30 a.m. UTC | #8
在 2020/7/11 14:37, Zhihao Cheng 写道:
> 在 2020/7/7 20:47, Richard Weinberger 写道:
>> ----- Ursprüngliche Mail -----
>>>>> Perhaps I misunderstood what commit 32fe905c17f001 ("ubifs: Fix
>>>>> O_TMPFILE corner case in ubifs_link()") wanted to fix.
>>>>> I think orphan area is used to remind filesystem don't forget to 
>>>>> delete
>>>>> inodes (whose nlink is 0) in next unclean rebooting. Generally, 
>>>>> the file
>>>>> system is not corrupted caused by replaying orphan nodes.
>>>>> Ralph reported a filesystem corruption in combination with overlayfs.
>>>>> Can you tell me the details about that problem? Thanks.
>>>> On my test bed I didn't see a fs corruption, what I saw was a 
>>>> failing orphan
>>>> self test while playing with O_TMPFILE and linkat().
>>> Do we have a reproducer, or can I get the fail testcase? Is it a 
>>> xfstest
>>> case?
>> I think xfstests triggered it, yes.
>> Later today I can check. :)
>>
>> Thanks,
>> //richard
>>
>> .
>
> I think I have found the testcases, overlay/006 and overlay/041.
>
> The 'mv' and 'rm' operations will put lowertestfile into orphan list 
> twice, so we must reseve the orphan deletion operation in 
> ubifs_link(), otherwise the testcase fails and we will see the 
> following msg:
Sorry, not lowertestfile, it's tempfile which is generated by ovl 
copy-up (mv operation). The tempfile is linked after copy-up finished. 
The tempfile is then unlinked by 'rm' operation.
>
>   overlay/006 2s ... - output mismatch (see 
> /root/git/xfstests-dev/results//overlay/006.out.bad)
>     --- tests/overlay/006.out    2020-07-07 21:42:57.737000000 +0800
>     +++ /root/git/xfstests-dev/results//overlay/006.out.bad 2020-07-11 
> 14:31:55.340000000 +0800
>     @@ -1,2 +1,4 @@
>      QA output created by 006
>      Silence is golden
>     +rm: cannot remove 
> '/tmp/scratch/ovl-mnt/uppertestdir/lowertestfile': Invalid argument
>     +lowertestfile
>     ...
>
>   [  382.258210] UBIFS error (ubi0:1 pid 11896): orphan_add [ubifs]: 
> orphaned twice
>   [  382.352535] UBIFS error (ubi0:1 pid 11930): free_orphans [ubifs]: 
> orphan list not empty at unmount
>
>
> So, how about moving ubifs_delete_orphan() after ubifs_jnl_update() in 
> function ubifs_link(). Following modifications applied in linux-5.8 
> has been tested by overlay/041, overlay/006 and  other tmpfile cases 
> (generic/531, generic/530, generic/509, generic/389, generic/004).
>
> diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
> index ef85ec167a84..fd4443a5e8c6 100644
> --- a/fs/ubifs/dir.c
> +++ b/fs/ubifs/dir.c
> @@ -722,11 +722,6 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>                 goto out_fname;
>
>         lock_2_inodes(dir, inode);
> -
> -       /* Handle O_TMPFILE corner case, it is allowed to link a 
> O_TMPFILE. */
> -       if (inode->i_nlink == 0)
> -               ubifs_delete_orphan(c, inode->i_ino);
> -
> inc_nlink(inode);
> ihold(inode);
>         inode->i_ctime = current_time(inode);
> @@ -736,6 +731,11 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>         err = ubifs_jnl_update(c, dir, &nm, inode, 0, 0);
>         if (err)
>                 goto out_cancel;
> +
> +       /* Handle O_TMPFILE corner case, it is allowed to link a 
> O_TMPFILE. */
> +       if (inode->i_nlink == 1)
> +               ubifs_delete_orphan(c, inode->i_ino);
> +
>         unlock_2_inodes(dir, inode);
>
>         ubifs_release_budget(c, &req);
> @@ -747,8 +747,6 @@ static int ubifs_link(struct dentry *old_dentry, 
> struct inode *dir,
>         dir->i_size -= sz_change;
>         dir_ui->ui_size = dir->i_size;
> drop_nlink(inode);
> -       if (inode->i_nlink == 0)
> -               ubifs_add_orphan(c, inode->i_ino);
>         unlock_2_inodes(dir, inode);
>         ubifs_release_budget(c, &req);
> iput(inode);
> -- 
>
>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
diff mbox series

Patch

diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index ef85ec167a84..9534c4bb598f 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -722,11 +722,6 @@  static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
 		goto out_fname;
 
 	lock_2_inodes(dir, inode);
-
-	/* Handle O_TMPFILE corner case, it is allowed to link a O_TMPFILE. */
-	if (inode->i_nlink == 0)
-		ubifs_delete_orphan(c, inode->i_ino);
-
 	inc_nlink(inode);
 	ihold(inode);
 	inode->i_ctime = current_time(inode);
@@ -747,8 +742,6 @@  static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
 	dir->i_size -= sz_change;
 	dir_ui->ui_size = dir->i_size;
 	drop_nlink(inode);
-	if (inode->i_nlink == 0)
-		ubifs_add_orphan(c, inode->i_ino);
 	unlock_2_inodes(dir, inode);
 	ubifs_release_budget(c, &req);
 	iput(inode);