diff mbox

jffs2:bugfix for should not appeared directory hard link

Message ID 1423826881.5969.33.camel@infradead.org
State Superseded
Delegated to: David Woodhouse
Headers show

Commit Message

David Woodhouse Feb. 13, 2015, 11:28 a.m. UTC
On Fri, 2015-02-13 at 10:13 +0000, David Woodhouse wrote:
> 
> If <ino#71> is somehow ending up linked as /mnt/foo again when it should
> have been unlinked, *that* is the problem we need to be looking at. The
> zombie *children* of <ino#71> are just a symptom of its reincarnation.

Ah, I think I understand a little better.

At the time the problem happens, it's too *early* in the scan/build
process. We don't yet *know* that ino#71 is dead, because we haven't yet
scanned the whole flash looking for dirents that might point to it.

In the problematic case, the directory SW1<#2> is first discovered when
we see its dirent from the *dead* directory <#71>, and we fill in
ic<2>->pino_nlink = 71.

And then *later* we find the *real* dirent which links SW1<#2> from the
root directory, and we hit that 'appears to be a hard link' message in
jffs2_build_inode_pass1().

The problem really comes when we *later* kill ino#71 because we realise
it doesn't have any valid dirents keeping it alive, and we *also* kill
all its children. Some of which weren't *really* supposed to be its
children :)

If we happened to hit the real dirent for SW1<#2> *first*, and the
dirent linking it from <#71> *later*, then we'd be OK.

This is all because of the NFS export support, which means we have to
know the parent inode# for directories. If we were still storing nlink
for directories as we do for plain files, we could fix this fairly
easily.

Your approach is to fix the garbage collection, to ensure that the old
deletion dirents "unlinking" SW1<#2> from the dead <ino#71> aren't
actually allowed to expire until all previous *positive* dirents linking
it into that dead parent directory have also expired — much like we
already do for garbage collection of deletion dirents in a live
directory.

That makes sense, but I'm not sure I like it very much. It's complex,
and it really does violate my sense of decency a little — if <ino#71> is
dead when it should *stay* dead, and not return to bother us.

I'll see if I can come up with a simpler solution purely within the scan
code. Can you test this, please?

Comments

Liu Song March 12, 2015, 3:05 a.m. UTC | #1
Hi David,

Apologies for the delay of this reply that caused by taken holiday of the Spring Festival.
I have tested your codes and found it can work very well. I think this solution is very
subtle, simple and reasonable, I really like this solution, thanks a lot.

Best Regards

David Woodhouse <dwmw2@infradead.org> wrote on 2015-02-13 19:28:01:

> From: David Woodhouse <dwmw2@infradead.org>

> To: liu.song11@zte.com.cn,

> Cc: cui.yunfeng@zte.com.cn, deng.chao1@zte.com.cn, jiang.xuexin@zte.com.cn, linux-mtd@lists.infradead.org, wang.bo116@zte.com.cn

> Date: 2015-02-13 19:28

> Subject: Re: [PATCH]jffs2:bugfix for should not appeared directory hard link

>

> On Fri, 2015-02-13 at 10:13 +0000, David Woodhouse wrote:

> >

> > If <ino#71> is somehow ending up linked as /mnt/foo again when it should

> > have been unlinked, *that* is the problem we need to be looking at. The

> > zombie *children* of <ino#71> are just a symptom of its reincarnation.

>

> Ah, I think I understand a little better.

>

> At the time the problem happens, it's too *early* in the scan/build

> process. We don't yet *know* that ino#71 is dead, because we haven't yet

> scanned the whole flash looking for dirents that might point to it.

>

> In the problematic case, the directory SW1<#2> is first discovered when

> we see its dirent from the *dead* directory <#71>, and we fill in

> ic<2>->pino_nlink = 71.

>

> And then *later* we find the *real* dirent which links SW1<#2> from the

> root directory, and we hit that 'appears to be a hard link' message in

> jffs2_build_inode_pass1().

>

> The problem really comes when we *later* kill ino#71 because we realise

> it doesn't have any valid dirents keeping it alive, and we *also* kill

> all its children. Some of which weren't *really* supposed to be its

> children :)

>

> If we happened to hit the real dirent for SW1<#2> *first*, and the

> dirent linking it from <#71> *later*, then we'd be OK.

>

> This is all because of the NFS export support, which means we have to

> know the parent inode# for directories. If we were still storing nlink

> for directories as we do for plain files, we could fix this fairly

> easily.

>

> Your approach is to fix the garbage collection, to ensure that the old

> deletion dirents "unlinking" SW1<#2> from the dead <ino#71> aren't

> actually allowed to expire until all previous *positive* dirents linking

> it into that dead parent directory have also expired ― much like we

> already do for garbage collection of deletion dirents in a live

> directory.

>

> That makes sense, but I'm not sure I like it very much. It's complex,

> and it really does violate my sense of decency a little ― if <ino#71> is

> dead when it should *stay* dead, and not return to bother us.

>

> I'll see if I can come up with a simpler solution purely within the scan

> code. Can you test this, please?

>

> diff --git a/fs/jffs2/build.c b/fs/jffs2/build.c

> index a3750f9..f5a9b0e 100644

> --- a/fs/jffs2/build.c

> +++ b/fs/jffs2/build.c

> @@ -49,7 +49,8 @@ next_inode(int *i, struct jffs2_inode_cache *ic, struct jffs2_sb_info *c)

>

>

>  static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,

> -                struct jffs2_inode_cache *ic)

> +                struct jffs2_inode_cache *ic,

> +                int *dir_hardlinks)

>  {

>     struct jffs2_full_dirent *fd;

>

> @@ -71,16 +72,16 @@ static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,

>           continue;

>        }

>

> +      /* From this point, fd->raw is no longer used so we can set fd->ic */

> +      fd->ic = child_ic;

> +      child_ic->pino_nlink++;

> +      /* If we appear (at this stage) to have hard-linked directories,

> +       * set a flag to trigger a scan later */

>        if (fd->type == DT_DIR) {

> -         if (child_ic->pino_nlink) {

> -            JFFS2_ERROR("child dir \"%s\" (ino #%u) of dir ino #%u appears to be a hard link\n",

> -                   fd->name, fd->ino, ic->ino);

> -            /* TODO: What do we do about it? */

> -         } else {

> -            child_ic->pino_nlink = ic->ino;

> -         }

> -      } else

> -         child_ic->pino_nlink++;

> +         child_ic->flags |= INO_FLAGS_IS_DIR;

> +         if (child_ic->pino_nlink > 1)

> +            *dir_hardlinks = 1;

> +      }

>

>        dbg_fsbuild("increased nlink for child \"%s\" (ino #%u)\n", fd->name, fd->ino);

>        /* Can't free scan_dents so far. We might need them in pass 2 */

> @@ -94,8 +95,7 @@ static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,

>  */

>  static int jffs2_build_filesystem(struct jffs2_sb_info *c)

>  {

> -   int ret;

> -   int i;

> +   int ret, i, dir_hardlinks = 0;

>     struct jffs2_inode_cache *ic;

>     struct jffs2_full_dirent *fd;

>     struct jffs2_full_dirent *dead_fds = NULL;

> @@ -119,7 +119,7 @@ static int jffs2_build_filesystem(struct jffs2_sb_info *c)

>     /* Now scan the directory tree, increasing nlink according to every dirent found. */

>     for_each_inode(i, c, ic) {

>        if (ic->scan_dents) {

> -         jffs2_build_inode_pass1(c, ic);

> +         jffs2_build_inode_pass1(c, ic, &dir_hardlinks);

>           cond_resched();

>        }

>     }

> @@ -155,6 +155,20 @@ static int jffs2_build_filesystem(struct jffs2_sb_info *c)

>     }

>

>     dbg_fsbuild("pass 2a complete\n");

> +

> +   if (dir_hardlinks) {

> +      /* If we detected directory hardlinks earlier, *hopefully*

> +       * they are gone now because some of the links were from

> +       * dead directories which still had some old dirents lying

> +       * around and not yet garbage-collected, but which have

> +       * been discarded above. So clear the pino_nlink field

> +       * in each directory, so that the final scan below can

> +       * print appropriate warnings. */

> +      for_each_inode(i, c, ic) {

> +         if (ic->flags & INO_FLAGS_IS_DIR)

> +            ic->pino_nlink = 0;

> +      }

> +   }

>     dbg_fsbuild("freeing temporary data structures\n");

>

>     /* Finally, we can scan again and free the dirent structs */

> @@ -162,6 +176,18 @@ static int jffs2_build_filesystem(struct jffs2_sb_info *c)

>        while(ic->scan_dents) {

>           fd = ic->scan_dents;

>           ic->scan_dents = fd->next;

> +         /* We do use the pino_nlink field to count nlink of

> +          * directories during fs build, so set it to the

> +          * parent ino# now. Now that there's hopefully only

> +          * one. */

> +         if (fd->type == DT_DIR) {

> +            if (dir_hardlinks && fd->ic->pino_nlink) {

> +               JFFS2_ERROR("child dir \"%s\" (ino #%u) of dir ino #%u is also hard linked from dir ino #%u\n",

> +                      fd->name, fd->ino, ic->ino, fd->ic->pino_nlink);

> +               /* Should we unlink it from its previous parent? */

> +            }

> +            fd->ic->pino_nlink = ic->ino;

> +         }

>           jffs2_free_full_dirent(fd);

>        }

>        ic->scan_dents = NULL;

> @@ -240,11 +266,7 @@ static void jffs2_build_remove_unlinked_inode(struct jffs2_sb_info *c,

>

>           /* Reduce nlink of the child. If it's now zero, stick it on the

>              dead_fds list to be cleaned up later. Else just free the fd */

> -

> -         if (fd->type == DT_DIR)

> -            child_ic->pino_nlink = 0;

> -         else

> -            child_ic->pino_nlink--;

> +         child_ic->pino_nlink--;

>

>           if (!child_ic->pino_nlink) {

>              dbg_fsbuild("inode #%u (\"%s\") now has no links; adding to dead_fds list.\n",

> diff --git a/fs/jffs2/nodelist.h b/fs/jffs2/nodelist.h

> index fa35ff7..0637271 100644

> --- a/fs/jffs2/nodelist.h

> +++ b/fs/jffs2/nodelist.h

> @@ -194,6 +194,7 @@ struct jffs2_inode_cache {

>  #define INO_STATE_CLEARING   6   /* In clear_inode() */

>

>  #define INO_FLAGS_XATTR_CHECKED   0x01   /* has no duplicate xattr_ref */

> +#define INO_FLAGS_IS_DIR   0x02   /* is a directory */

>

>  #define RAWNODE_CLASS_INODE_CACHE   0

>  #define RAWNODE_CLASS_XATTR_DATUM   1

> @@ -249,7 +250,10 @@ struct jffs2_readinode_info

>

>  struct jffs2_full_dirent

>  {

> -   struct jffs2_raw_node_ref *raw;

> +   union {

> +      struct jffs2_raw_node_ref *raw;

> +      struct jffs2_inode_cache *ic; /* Just during part of build */

> +   };

>     struct jffs2_full_dirent *next;

>     uint32_t version;

>     uint32_t ino; /* == zero for unlink */

>

>

> --

> dwmw2

--------------------------------------------------------
ZTE Information Security Notice: The information contained in this mail (and any attachment transmitted herewith) is privileged and confidential and is intended for the exclusive use of the addressee(s).  If you are not an intended recipient, any disclosure, reproduction, distribution or other dissemination or use of the information contained is strictly prohibited.  If you have received this mail in error, please delete it and notify us immediately.
diff mbox

Patch

diff --git a/fs/jffs2/build.c b/fs/jffs2/build.c
index a3750f9..f5a9b0e 100644
--- a/fs/jffs2/build.c
+++ b/fs/jffs2/build.c
@@ -49,7 +49,8 @@  next_inode(int *i, struct jffs2_inode_cache *ic, struct jffs2_sb_info *c)
 
 
 static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,
-				    struct jffs2_inode_cache *ic)
+				    struct jffs2_inode_cache *ic,
+				    int *dir_hardlinks)
 {
 	struct jffs2_full_dirent *fd;
 
@@ -71,16 +72,16 @@  static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,
 			continue;
 		}
 
+		/* From this point, fd->raw is no longer used so we can set fd->ic */
+		fd->ic = child_ic;
+		child_ic->pino_nlink++;
+		/* If we appear (at this stage) to have hard-linked directories,
+		 * set a flag to trigger a scan later */
 		if (fd->type == DT_DIR) {
-			if (child_ic->pino_nlink) {
-				JFFS2_ERROR("child dir \"%s\" (ino #%u) of dir ino #%u appears to be a hard link\n",
-					    fd->name, fd->ino, ic->ino);
-				/* TODO: What do we do about it? */
-			} else {
-				child_ic->pino_nlink = ic->ino;
-			}
-		} else
-			child_ic->pino_nlink++;
+			child_ic->flags |= INO_FLAGS_IS_DIR;
+			if (child_ic->pino_nlink > 1)
+				*dir_hardlinks = 1;
+		}
 
 		dbg_fsbuild("increased nlink for child \"%s\" (ino #%u)\n", fd->name, fd->ino);
 		/* Can't free scan_dents so far. We might need them in pass 2 */
@@ -94,8 +95,7 @@  static void jffs2_build_inode_pass1(struct jffs2_sb_info *c,
 */
 static int jffs2_build_filesystem(struct jffs2_sb_info *c)
 {
-	int ret;
-	int i;
+	int ret, i, dir_hardlinks = 0;
 	struct jffs2_inode_cache *ic;
 	struct jffs2_full_dirent *fd;
 	struct jffs2_full_dirent *dead_fds = NULL;
@@ -119,7 +119,7 @@  static int jffs2_build_filesystem(struct jffs2_sb_info *c)
 	/* Now scan the directory tree, increasing nlink according to every dirent found. */
 	for_each_inode(i, c, ic) {
 		if (ic->scan_dents) {
-			jffs2_build_inode_pass1(c, ic);
+			jffs2_build_inode_pass1(c, ic, &dir_hardlinks);
 			cond_resched();
 		}
 	}
@@ -155,6 +155,20 @@  static int jffs2_build_filesystem(struct jffs2_sb_info *c)
 	}
 
 	dbg_fsbuild("pass 2a complete\n");
+
+	if (dir_hardlinks) {
+		/* If we detected directory hardlinks earlier, *hopefully*
+		 * they are gone now because some of the links were from
+		 * dead directories which still had some old dirents lying
+		 * around and not yet garbage-collected, but which have
+		 * been discarded above. So clear the pino_nlink field
+		 * in each directory, so that the final scan below can
+		 * print appropriate warnings. */
+		for_each_inode(i, c, ic) {
+			if (ic->flags & INO_FLAGS_IS_DIR)
+				ic->pino_nlink = 0;
+		}
+	}
 	dbg_fsbuild("freeing temporary data structures\n");
 
 	/* Finally, we can scan again and free the dirent structs */
@@ -162,6 +176,18 @@  static int jffs2_build_filesystem(struct jffs2_sb_info *c)
 		while(ic->scan_dents) {
 			fd = ic->scan_dents;
 			ic->scan_dents = fd->next;
+			/* We do use the pino_nlink field to count nlink of
+			 * directories during fs build, so set it to the
+			 * parent ino# now. Now that there's hopefully only
+			 * one. */
+			if (fd->type == DT_DIR) {
+				if (dir_hardlinks && fd->ic->pino_nlink) {
+					JFFS2_ERROR("child dir \"%s\" (ino #%u) of dir ino #%u is also hard linked from dir ino #%u\n",
+						    fd->name, fd->ino, ic->ino, fd->ic->pino_nlink);
+					/* Should we unlink it from its previous parent? */
+				}
+				fd->ic->pino_nlink = ic->ino;
+			}
 			jffs2_free_full_dirent(fd);
 		}
 		ic->scan_dents = NULL;
@@ -240,11 +266,7 @@  static void jffs2_build_remove_unlinked_inode(struct jffs2_sb_info *c,
 
 			/* Reduce nlink of the child. If it's now zero, stick it on the
 			   dead_fds list to be cleaned up later. Else just free the fd */
-
-			if (fd->type == DT_DIR)
-				child_ic->pino_nlink = 0;
-			else
-				child_ic->pino_nlink--;
+			child_ic->pino_nlink--;
 
 			if (!child_ic->pino_nlink) {
 				dbg_fsbuild("inode #%u (\"%s\") now has no links; adding to dead_fds list.\n",
diff --git a/fs/jffs2/nodelist.h b/fs/jffs2/nodelist.h
index fa35ff7..0637271 100644
--- a/fs/jffs2/nodelist.h
+++ b/fs/jffs2/nodelist.h
@@ -194,6 +194,7 @@  struct jffs2_inode_cache {
 #define INO_STATE_CLEARING	6	/* In clear_inode() */
 
 #define INO_FLAGS_XATTR_CHECKED	0x01	/* has no duplicate xattr_ref */
+#define INO_FLAGS_IS_DIR	0x02	/* is a directory */
 
 #define RAWNODE_CLASS_INODE_CACHE	0
 #define RAWNODE_CLASS_XATTR_DATUM	1
@@ -249,7 +250,10 @@  struct jffs2_readinode_info
 
 struct jffs2_full_dirent
 {
-	struct jffs2_raw_node_ref *raw;
+	union {
+		struct jffs2_raw_node_ref *raw;
+		struct jffs2_inode_cache *ic; /* Just during part of build */
+	};
 	struct jffs2_full_dirent *next;
 	uint32_t version;
 	uint32_t ino; /* == zero for unlink */