[v1,2/2] jffs2: make cleanmarker support option

Message ID 20231019073838.17586-3-mmkurbanov@salutedevices.com
State New

Commit Message

Martin Kurbanov Oct. 19, 2023, 7:38 a.m. UTC
This patch adds an option to disable cleanmarkers. This is useful on
some NAND devices whose entire OOB area is protected by ECC. The problem
occurs when the JFFS2 driver writes a cleanmarker to a page and later
writes to the same page: the write succeeds, but afterwards the page
becomes unreadable due to invalid ECC codes. This happens because the
second write requires an ECC update, which cannot be done correctly
without a block erase.

Signed-off-by: Martin Kurbanov <mmkurbanov@salutedevices.com>
---
 fs/jffs2/Kconfig    | 10 ++++++++++
 fs/jffs2/os-linux.h |  4 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)
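
The failure mode described in the commit message can be illustrated with
a toy model. This is only a sketch under stated assumptions: the XOR
"ECC", the three-byte covered region, and all toy_* names are
hypothetical and do not correspond to any real NAND controller or kernel
API. The one property it relies on is that programming NAND can only
clear bits, so a second write ANDs into what is already stored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COVERED 3 /* toy page: 2 data bytes + 1 OOB byte, all under ECC */

struct toy_page {
	uint8_t cells[COVERED]; /* data + OOB as stored in the flash cells */
	uint8_t ecc;            /* stored ECC byte */
};

/* Toy ECC: XOR of all covered bytes (real controllers use Hamming or BCH). */
static uint8_t toy_ecc(const uint8_t *buf, size_t len)
{
	uint8_t ecc = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++)
		ecc ^= buf[i];
	return ecc;
}

/* Erase sets every bit, including the ECC byte, to 1. */
static void toy_erase(struct toy_page *p)
{
	memset(p->cells, 0xff, sizeof(p->cells));
	p->ecc = 0xff;
}

/* Programming can only clear bits, so each write ANDs into the old contents. */
static void toy_program(struct toy_page *p, const uint8_t *want)
{
	for (size_t i = 0; i < COVERED; i++)
		p->cells[i] &= want[i];
	p->ecc &= toy_ecc(want, COVERED); /* ECC computed over the written buffer */
}

/* Non-zero if the stored ECC matches the cell contents. */
static int toy_ecc_ok(const struct toy_page *p)
{
	return toy_ecc(p->cells, COVERED) == p->ecc;
}
```

After toy_erase(), programming {0xff, 0xff, 0x85} (an arbitrary marker
byte in the OOB slot) leaves the page consistent. Programming
{0x12, 0x34, 0xff} afterwards leaves a stored ECC of 0x81 while the
cells now hash to 0xa3, so toy_ecc_ok() returns 0 until the next erase.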

Comments

Richard Weinberger Oct. 19, 2023, 8:12 a.m. UTC | #1
Martin,

----- Original Message -----
> From: "Martin Kurbanov" <mmkurbanov@salutedevices.com>
> This patch support for disable cleanmarker option. This is useful on
> some NAND devices which entire OOB area is protected by ECC. Problem
> fires when JFFS2 driver writes cleanmarker to some page and later it
> tries to write to this page - write will be done successfully, but after
> that such page becomes unreadable due to invalid ECC codes. This occurs
> because the second write necessitates an update to ECC, but it is
> impossible to do it correctly without block erase.

Hmm, I'm missing an explanation of why this change is correct and safe.
You explain why the OOB area can't be used, okay. But you need to
add more details on why your change is safe in terms of filesystem
consistency.

Besides that, I don't think this should be a kernel config option.
Why not a mount option?

Thanks,
//richard
Martin Kurbanov Oct. 23, 2023, 2:54 p.m. UTC | #2
Hello Richard,

On 19.10.2023 11:12, Richard Weinberger wrote:
>> This patch support for disable cleanmarker option. This is useful on
>> some NAND devices which entire OOB area is protected by ECC. Problem
>> fires when JFFS2 driver writes cleanmarker to some page and later it
>> tries to write to this page - write will be done successfully, but after
>> that such page becomes unreadable due to invalid ECC codes. This occurs
>> because the second write necessitates an update to ECC, but it is
>> impossible to do it correctly without block erase.
> Hmm, I miss an explanation why this change is correct and safe.
> You explain why the OOB area can't be used, okay. But you need to
> add more details on why you change is safe in terms of filesystem
> consistency.
 
If the cleanmarker is disabled, any clean block found during the scan
(filled with 0xff) will be erased again (see fs/jffs2/scan.c#L162).
In my opinion, it is better to erase the block again than not to
support such NAND flash at all.


> Beside of that, I don't think this should be kernel config option.
> Why not a mount option?

Agreed
Richard Weinberger Oct. 23, 2023, 5:44 p.m. UTC | #3
----- Original Message -----
> From: "Martin Kurbanov" <mmkurbanov@salutedevices.com>
> If you disable the cleanmarker, the found clean block (filled with 0xff)
> will be erased again (see fs/jffs2/scan.c#L162).
> In my opinion, it is better to perform the block erasure again than to
> not work with such a nand flash at all.

Doesn't this cause many re-erases at each mount time?

BTW: I tried your patch in nandsim, jffs2 was unhappy.
[   56.147361] jffs2: notice: (440) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   56.200438] nand: nand_do_write_ops: attempt to write non page aligned data
[   56.201090] jffs2: Write clean marker to block at 0x001f8000 failed: -22

Do you have an idea?

Thanks,
//richard
David Woodhouse Oct. 23, 2023, 5:55 p.m. UTC | #4
On Mon, 2023-10-23 at 17:54 +0300, Martin Kurbanov wrote:
> Hello Richard,
> 
> On 19.10.2023 11:12, Richard Weinberger wrote:
> > > This patch support for disable cleanmarker option. This is useful on
> > > some NAND devices which entire OOB area is protected by ECC. Problem
> > > fires when JFFS2 driver writes cleanmarker to some page and later it
> > > tries to write to this page - write will be done successfully, but after
> > > that such page becomes unreadable due to invalid ECC codes. This occurs
> > > because the second write necessitates an update to ECC, but it is
> > > impossible to do it correctly without block erase.
> > Hmm, I miss an explanation why this change is correct and safe.
> > You explain why the OOB area can't be used, okay. But you need to
> > add more details on why you change is safe in terms of filesystem
> > consistency.
>  
> If you disable the cleanmarker, the found clean block (filled with 0xff)
> will be erased again (see fs/jffs2/scan.c#L162).
> In my opinion, it is better to perform the block erasure again than to
> not work with such a nand flash at all.

Erasing all unused blocks over and over again on every reboot/remount
is going to destroy your flash quite quickly, surely?

I think you need to come up with a way to log the clean blocks (or
erase requests) in the JFFS2 log itself.

Perhaps a 'block erase log' node type, which just contains a version#
and a list of blocks which are currently being erased. You write it out
before doing any erase operation. And then at *some* point after the
erase completes (it doesn't need to be immediate) you write out a new
one (which may be empty, or may list new blocks which are about to be
erased).

On mount, we just need to re-erase any blocks which are indicated as
being erased in the latest erase log node.

> > Beside of that, I don't think this should be kernel config option.
> > Why not a mount option?
> 
> Agreed

Why even a mount option? Shouldn't it be automatic, depending on the
type of flash chip?
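
The erase log idea above can be sketched in a few lines. This is a
minimal illustration only: struct erase_log, ERASE_LOG_MAX_BLOCKS,
latest_erase_log() and needs_reerase() are hypothetical names, not
existing JFFS2 structures or functions, and the sketch ignores the
on-flash layout, CRCs, and version wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERASE_LOG_MAX_BLOCKS 8 /* arbitrary cap for this sketch */

/* Hypothetical in-memory form of an erase log node; not a JFFS2 type. */
struct erase_log {
	uint32_t version;                      /* higher means newer */
	uint32_t nr_blocks;                    /* valid entries in blocks[] */
	uint32_t blocks[ERASE_LOG_MAX_BLOCKS]; /* blocks with an erase in flight */
};

/* Return the log with the highest version, or NULL if none were found. */
static const struct erase_log *latest_erase_log(const struct erase_log *logs,
						size_t n)
{
	const struct erase_log *best = NULL;

	assert(logs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!best || logs[i].version > best->version)
			best = &logs[i];
	return best;
}

/* Non-zero if the latest log lists this block as being erased. */
static int needs_reerase(const struct erase_log *log, uint32_t block)
{
	if (!log)
		return 0;
	for (uint32_t i = 0; i < log->nr_blocks && i < ERASE_LOG_MAX_BLOCKS; i++)
		if (log->blocks[i] == block)
			return 1;
	return 0;
}
```

At mount, only blocks for which needs_reerase() is non-zero would be
erased again; a block listed solely in an older log is left alone, which
is what keeps the erase count bounded across reboots.
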
Martin Kurbanov Oct. 24, 2023, 1:29 p.m. UTC | #5
On 23.10.2023 20:44, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "Martin Kurbanov" <mmkurbanov@salutedevices.com>
>> If you disable the cleanmarker, the found clean block (filled with 0xff)
>> will be erased again (see fs/jffs2/scan.c#L162).
>> In my opinion, it is better to perform the block erasure again than to
>> not work with such a nand flash at all.
> 
> Doesn't this case many re-erases at each mount time?

You are right. David proposed a good solution.

> BTW: I tried your patch in nandsim, jffs2 was unhappy.
> [   56.147361] jffs2: notice: (440) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
> [   56.200438] nand: nand_do_write_ops: attempt to write non page aligned data
> [   56.201090] jffs2: Write clean marker to block at 0x001f8000 failed: -22
> 
> Do you have an idea?

According to this code from the function jffs2_mark_erased_block():

```
if (jffs2_cleanmarker_oob(c) || c->cleanmarker_size == 0) {

    if (jffs2_cleanmarker_oob(c)) {
        if (jffs2_write_nand_cleanmarker(c, jeb))
            goto filebad;
    }
} else {

    struct kvec vecs[1];
    struct jffs2_unknown_node marker = {
        .magic = cpu_to_je16(JFFS2_MAGIC_BITMASK),
        .nodetype = cpu_to_je16(JFFS2_NODETYPE_CLEANMARKER),
        .totlen = cpu_to_je32(c->cleanmarker_size)
    };
```

the "if" branch should be executed because "cleanmarker_size" is set to
0 for NAND flash:

```
int jffs2_nand_flash_setup(struct jffs2_sb_info *c)
{
    if (!c->mtd->oobsize)
        return 0;

    /* Cleanmarker is out-of-band, so inline size zero */
    c->cleanmarker_size = 0;
```

In your case, the "else" branch was executed. I assume that "oobsize" is
equal to 0. In this scenario, JFFS2 will not mount without
applying my patch.
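
The branch selection discussed above can be modelled in isolation. This
is a minimal sketch that mirrors only the if/else shape of the quoted
excerpt; enum marker_path and pick_marker_path() are hypothetical names
introduced for illustration, and the two parameters stand in for
jffs2_cleanmarker_oob(c) and c->cleanmarker_size.

```c
#include <stdbool.h>
#include <stdint.h>

enum marker_path {
	MARKER_OOB,    /* cleanmarker written to the OOB area */
	MARKER_NONE,   /* no cleanmarker written at all */
	MARKER_INLINE, /* cleanmarker node written at the start of the block */
};

/* Same decision structure as the excerpt from jffs2_mark_erased_block(). */
static enum marker_path pick_marker_path(bool cleanmarker_oob,
					 uint32_t cleanmarker_size)
{
	if (cleanmarker_oob || cleanmarker_size == 0) {
		if (cleanmarker_oob)
			return MARKER_OOB;
		return MARKER_NONE;
	}
	return MARKER_INLINE;
}
```

With the patch applied, the first argument is always false, so the
outcome depends entirely on cleanmarker_size: zero skips the marker,
while any non-zero size takes the inline path, which matches the
"Write clean marker ... failed" message in the nandsim log.
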
Martin Kurbanov Oct. 25, 2023, 2:53 p.m. UTC | #6
Hi David, thank you for your reply.

On 23.10.2023 20:55, David Woodhouse wrote:
> On Mon, 2023-10-23 at 17:54 +0300, Martin Kurbanov wrote:
>> Hello Richard,
>>
>> On 19.10.2023 11:12, Richard Weinberger wrote:
>>>> This patch support for disable cleanmarker option. This is useful on
>>>> some NAND devices which entire OOB area is protected by ECC. Problem
>>>> fires when JFFS2 driver writes cleanmarker to some page and later it
>>>> tries to write to this page - write will be done successfully, but after
>>>> that such page becomes unreadable due to invalid ECC codes. This occurs
>>>> because the second write necessitates an update to ECC, but it is
>>>> impossible to do it correctly without block erase.
>>> Hmm, I miss an explanation why this change is correct and safe.
>>> You explain why the OOB area can't be used, okay. But you need to
>>> add more details on why you change is safe in terms of filesystem
>>> consistency.
>>  
>> If you disable the cleanmarker, the found clean block (filled with 0xff)
>> will be erased again (see fs/jffs2/scan.c#L162).
>> In my opinion, it is better to perform the block erasure again than to
>> not work with such a nand flash at all.
> 
> Erasing all unused blocks over and over again on every reboot/remount
> is going to destroy your flash quite quickly, surely?
> 
> I think you need to come up with a way to log the clean blocks (or
> erase requests) in the JFFS2 log itself.
> 
> Perhaps a 'block erase log' node type, which just contains a version#
> and a list of blocks which are currently being erased. You write it out
> before doing any erase operation. And then at *some* point after the
> erase completes (it doesn't need to be immediate) you write out a new
> one (which may be empty, or may list new blocks which are about to be
> erased).
> 
> On mount, we just need to re-erase any blocks which are indicated as
> being erased in the latest erase log node.

What if we don't erase the free blocks during mounting, but instead
erase them when a clean block is needed (before writing)?
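
The erase-on-demand idea in the question above could look roughly like
this. It is a toy sketch, not JFFS2 code: struct toy_block, the BLK_*
states, and claim_block_for_write() are hypothetical, and the model
ignores wear levelling, bad blocks, and power loss during the erase.

```c
#include <assert.h>
#include <stddef.h>

enum blk_state {
	BLK_FREE_UNVERIFIED, /* looked empty at mount, no marker to prove it */
	BLK_ERASED,          /* erased during this session */
	BLK_IN_USE,          /* holds or is receiving data */
};

struct toy_block {
	enum blk_state state;
	unsigned int erase_count;
};

static void toy_erase_block(struct toy_block *b)
{
	b->erase_count++;
	b->state = BLK_ERASED;
}

/*
 * Claim the first free block for writing. A block is erased only at
 * this point, not at mount. Returns the block index, or -1 if no
 * free block is left.
 */
static int claim_block_for_write(struct toy_block *blocks, size_t n)
{
	assert(blocks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (blocks[i].state == BLK_IN_USE)
			continue;
		if (blocks[i].state == BLK_FREE_UNVERIFIED)
			toy_erase_block(&blocks[i]);
		blocks[i].state = BLK_IN_USE;
		return (int)i;
	}
	return -1;
}
```

In this model a free block costs one erase per use rather than one erase
per mount, so repeated remounts without writes do not add wear.
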
Patch

diff --git a/fs/jffs2/Kconfig b/fs/jffs2/Kconfig
index 7c96bc107218..8a66941d1e93 100644
--- a/fs/jffs2/Kconfig
+++ b/fs/jffs2/Kconfig
@@ -29,6 +29,16 @@  config JFFS2_FS_DEBUG
 	  If reporting bugs, please try to have available a full dump of the
 	  messages at debug level 1 while the misbehaviour was occurring.
 
+config JFFS2_FS_NOCLEANMARKER
+	bool "Disable cleanmarkers JFFS2 feature"
+	depends on JFFS2_FS_WRITEBUFFER
+	depends on MTD_NAND || MTD_SPI_NAND
+	default n
+	help
+	  Do not write 'CLEANMARKER' nodes to the beginning of each erase block.
+	  This option can be useful on NAND flash where there is no free
+	  space in the OOB area or the entire OOB area is protected by ECC.
+
 config JFFS2_FS_WRITEBUFFER
 	bool "JFFS2 write-buffering support"
 	depends on JFFS2_FS
diff --git a/fs/jffs2/os-linux.h b/fs/jffs2/os-linux.h
index c604f639a00f..ea42964d8118 100644
--- a/fs/jffs2/os-linux.h
+++ b/fs/jffs2/os-linux.h
@@ -109,7 +109,9 @@  static inline void jffs2_init_inode_info(struct jffs2_inode_info *f)
 #define jffs2_can_mark_obsolete(c) (c->mtd->flags & (MTD_BIT_WRITEABLE))
 #endif
 
-#define jffs2_cleanmarker_oob(c) (c->mtd->type == MTD_NANDFLASH)
+#define jffs2_cleanmarker_oob(c)			\
+	(!IS_ENABLED(CONFIG_JFFS2_FS_NOCLEANMARKER) &&	\
+	((c)->mtd->type == MTD_NANDFLASH))
 
 #define jffs2_wbuf_dirty(c) (!!(c)->wbuf_len)