[OpenWrt-Devel,RFC,2/2] amp821xx: use newly added pad-squashfs for Meraki MR24
diff mbox series

Message ID 875zmpqxbu.fsf@husum.klickitat.com
State Superseded, archived
Delegated to: Christian Lamparter
Headers show
Series
  • [OpenWrt-Devel,RFC,1/2] build: add squashfs padding infrastructure
Related show

Commit Message

Russell Senior Aug. 22, 2019, 11 a.m. UTC
Using pad-squashfs ensures that the root.squashfs is assigned sufficient
LEBs on UBI such that all reads of the rootfs succeed, in order to avoid
read failures and kernel panics.

This fixes one such kernel panic observed on Meraki MR24 where an
inopportune-sized unpadded root.squashfs occurred.

Note: ext4-sysupgrade firmware binaries will build with this patch, but
they are as nonsensical as before the patch. Finding a way to disable
ext4 builds for Meraki MR24 is left as a TODO.

Signed-off-by: Russell Senior <russell@personaltelco.net>
---
 target/linux/apm821xx/image/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Christian Lamparter Aug. 23, 2019, 9:01 p.m. UTC | #1
On Thursday, August 22, 2019 1:00:21 PM CEST Russell Senior wrote:
> 
> Using pad-squashfs ensures that the root.squashfs is assigned sufficient
> LEBs on UBI such that all reads of the rootfs succeed, in order to avoid
> read failures and kernel panics.
> 
> This fixes one such kernel panic observed on Meraki MR24 where an
> inopportune-sized unpadded root.squashfs occurred.
> 
> Note: ext4-sysupgrade firmware binaries will build with this patch, but
> they are as nonsensical as before the patch. Finding a way to disable
> ext4 builds for Meraki MR24 is left as a TODO.
> 
> Signed-off-by: Russell Senior <russell@personaltelco.net>
> ---
>  target/linux/apm821xx/image/Makefile | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/target/linux/apm821xx/image/Makefile b/target/linux/apm821xx/image/Makefile
> index acfd478755..53192bb448 100644
> --- a/target/linux/apm821xx/image/Makefile
> +++ b/target/linux/apm821xx/image/Makefile
> @@ -133,7 +133,8 @@ define Device/meraki_mr24
>    IMAGE_SIZE := 8191k
>    KERNEL := kernel-bin | lzma | uImage lzma | MerakiAdd-dtb | MerakiNAND
>    KERNEL_INITRAMFS := kernel-bin | lzma | dtb | MuImage-initramfs lzma
> -  IMAGE/sysupgrade.bin := sysupgrade-tar | append-metadata
> +  IMAGE/sysupgrade.bin/squashfs := pad-squashfs | sysupgrade-tar | append-metadata
> +  IMAGE/sysupgrade.bin/ext4 := sysupgrade-tar | append-metadata
>    UBINIZE_OPTS := -E 5
>    SUPPORTED_DEVICES += mr24
>  endef
> 

I've posted a similar message to the bugreport:
<https://bugs.openwrt.org/index.php?do=details&task_id=2460>

What's happening here is that mksquashfs4 is being told through the "-nopad" option
to "do not pad filesystem to a multiple of 4K" [0].

|define Image/mkfs/squashfs
|        $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call mkfs_target_dir,$(1)) $@ \
|                -nopad -noappend -root-owned \
|                -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \
|                -processors 1
|endef

My guess is that this affects more than just the MR24 (I'm looking at you too:
pad2jffs and various pad-to $value) . I've tried tracking down the change that
added the "-nopad" and found it in a commit from 2005 titled:
"add some changes from whiterussian to head" [1] [2]:

| $(KDIR)/root.squashfs:
|        @mkdir -p $(KDIR)/root/jffs
|-       $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend -root-owned -le
|+       $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -nopad -noappend -root-owned -le


So, this is really old... 

Question is, should we just drop the -nopad? Since as you established, in
the described corner-case (see above) squashfs needs this 4k padding and
the generated squashfs could be considered broken otherwise?
(Judging from your message, you went through the kernel code. Can you
please share the place where the lack of the padding is breaking the fs?)

Cheers,
Christian

[0] <https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=include/image.mk;h=5d54bc7947e692dacd7b4e4e2e845b0e824bfc30;hb=HEAD#l243>
[1] <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=0be45c47c09746d42936e61b27d8b80f63880dee>
[2] <https://git.openwrt.org/?p=openwrt/openwrt.git;a=blobdiff;f=openwrt/target/linux/image/squashfs.mk;h=70a85b99ee7445329e5f401c46b9f7707fda0e2c;hp=917a69ab8a1bbc5f652b83ce687b0b31db1b0f3c;hb=0be45c47c09746d42936e61b27d8b80f63880dee;hpb=6e6a04539395c2f22b4f8d43404b40aeefba739a>
Russell Senior Aug. 24, 2019, 12:18 a.m. UTC | #2
>>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:

> I've posted a similar message to the bugreport:
> <https://bugs.openwrt.org/index.php?do=details&task_id=2460>

I should have filed the bug first and linked it in my patch.

> What's happening here is that mksquashfs4 is being told
> through the "-nopad" option to "do not pad filesystem to a
> multiple of 4K" [0].

> |define Image/mkfs/squashfs |
> $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call
> mkfs_target_dir,$(1)) $@ \ | -nopad -noappend -root-owned \ |
> -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \ | -processors 1 |endef

> My guess is that this affects more than just the MR24 (I'm
> looking at you too: pad2jffs and various pad-to $value)
> . I've tried tracking down the change that added the "-nopad"
> and found it in a commit from 2005 titled: "add some changes
> from whiterussian to head" [1] [2]:

I agree that other devices where rootfs is squashfs and lives on a ubi
volume are probaby also vulnerable to this problem. Regrettably, I haven't
thought of any other of those devices that I have on hand to test. 

> | $(KDIR)/root.squashfs: | @mkdir -p $(KDIR)/root/jffs |-
> $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend
> -root-owned -le |+ $(STAGING_DIR)/bin/mksquashfs-lzma
> $(KDIR)/root $@ -nopad -noappend -root-owned -le


> So, this is really old...

> Question is, should we just drop the -nopad? Since as you
> established, in the described corner-case (see above)
> squashfs needs this 4k padding and the generated squashfs
> could be considered broken otherwise?

I'm under the impression that the -nopad makes sense for NOR flash where
the kernel and rootfs are glued together, padding the isolated rootfs
would cause alignment problems or wasted space in the combined blobs.

> (Judging from your
> message, you went through the kernel code. Can you please
> share the place where the lack of the padding is breaking the
> fs?)

It's mostly inferred from the fact that I saw the error and kernel
panic, and glancing at the kernel squashfs code. I am not pretending to
understand completely how the squashfs kernel code works, but the
position of the error relative to the size of the rootfs in my panic
case strongly suggests it was trying to read past the end of the ubi
volume.

The error comes in the kernel's fs/squashfs/block.c in the
squashfs_read_data() function. There are a bunch of conditions (at least
5) within the function (see "goto read_failure;") that will lead to that
message being printed.
Russell Senior Aug. 24, 2019, 12:59 a.m. UTC | #3
>>>>> "Russell" == Russell Senior <russell@personaltelco.net> writes:

>>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:

Russell> It's mostly inferred from the fact that I saw the error and
Russell> kernel panic, and glancing at the kernel squashfs code. I am
Russell> not pretending to understand completely how the squashfs kernel
Russell> code works, but the position of the error relative to the size
Russell> of the rootfs in my panic case strongly suggests it was trying
Russell> to read past the end of the ubi volume.

Oh, and I got important help finding this from Jonas Gorski. I was
remiss in not mentioning that.
Christian Lamparter Aug. 24, 2019, 4:09 p.m. UTC | #4
On Saturday, August 24, 2019 2:18:55 AM CEST Russell Senior wrote:
> >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> 
> > I've posted a similar message to the bugreport:
> > <https://bugs.openwrt.org/index.php?do=details&task_id=2460>
> 
> I should have filed the bug first and linked it in my patch.
I think it's fine. It depends on whenever there will be a
discussion and where it will take place... But yeah, nobody
can tell in advance how this will go.

On Saturday, August 24, 2019 2:59:31 AM CEST Russell Senior wrote:
> >>>>> "Russell" == Russell Senior <russell@personaltelco.net> writes:
> 
> >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> 
> Russell> It's mostly inferred from the fact that I saw the error and
> Russell> kernel panic, and glancing at the kernel squashfs code. I am
> Russell> not pretending to understand completely how the squashfs kernel
> Russell> code works, but the position of the error relative to the size
> Russell> of the rootfs in my panic case strongly suggests it was trying
> Russell> to read past the end of the ubi volume.
> 
> Oh, and I got important help finding this from Jonas Gorski. I was
> remiss in not mentioning that.
> 
Ok, Let's add him to the CC then. As well as some of the 
"ramips: Fix and tidy up IMAGE_SIZE #2226" and 
"[RFC] Use DTS firmware partition to obsolete IMAGE_SIZE #2310"

https://github.com/openwrt/openwrt/pull/2226
https://github.com/openwrt/openwrt/pull/2310

crowd. Because this will likely affect them as well... 
But they might not know it. In any case: "Welcome everyone! :-D".

> > What's happening here is that mksquashfs4 is being told
> > through the "-nopad" option to "do not pad filesystem to a
> > multiple of 4K" [0].
> 
> > |define Image/mkfs/squashfs |
> > $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call
> > mkfs_target_dir,$(1)) $@ \ | -nopad -noappend -root-owned \ |
> > -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \ | -processors 1 |endef
> 
> > My guess is that this affects more than just the MR24 (I'm
> > looking at you too: pad2jffs and various pad-to $value)
> > . I've tried tracking down the change that added the "-nopad"
> > and found it in a commit from 2005 titled: "add some changes
> > from whiterussian to head" [1] [2]:
> 
> I agree that other devices where rootfs is squashfs and lives on a ubi
> volume are probaby also vulnerable to this problem. Regrettably, I haven't
> thought of any other of those devices that I have on hand to test. 
> 
> > | $(KDIR)/root.squashfs: | @mkdir -p $(KDIR)/root/jffs |-
> > $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend
> > -root-owned -le |+ $(STAGING_DIR)/bin/mksquashfs-lzma
> > $(KDIR)/root $@ -nopad -noappend -root-owned -le
> 
> 
> > So, this is really old...
> 
> > Question is, should we just drop the -nopad? Since as you
> > established, in the described corner-case (see above)
> > squashfs needs this 4k padding and the generated squashfs
> > could be considered broken otherwise?
> 
> I'm under the impression that the -nopad makes sense for NOR flash where
> the kernel and rootfs are glued together, padding the isolated rootfs
> would cause alignment problems or wasted space in the combined blobs.

Yes, that's the nod to padjffs2. That said,
<https://sourceforge.net/p/squashfs/mailman/message/28307755/> makes
it sound like that apart from the BLOCKSIZE, we also need to PAGE_SIZE?

(I think the APM821XX is a special case, since it can do 64KiB Pages
as well as it's 32MiB SLC NAND that uses 16 KiB erase-blocks. So a
PAGE can span up to 4 pages.

> 
> > (Judging from your
> > message, you went through the kernel code. Can you please
> > share the place where the lack of the padding is breaking the
> > fs?)
> 
> It's mostly inferred from the fact that I saw the error and kernel
> panic, and glancing at the kernel squashfs code. I am not pretending to
> understand completely how the squashfs kernel code works, but the
> position of the error relative to the size of the rootfs in my panic
> case strongly suggests it was trying to read past the end of the ubi
> volume.
> 
> The error comes in the kernel's fs/squashfs/block.c in the
> squashfs_read_data() function. There are a bunch of conditions (at least
> 5) within the function (see "goto read_failure;") that will lead to that
> message being printed.
> 
Well, that's a pity this could have saved a lot of time.

I've cobbered together a patch that deals with some of the
padding issues at "ubimkvol" and "ubinize" time. The idea
is that ideally we want to do the padding when we know
PAGE_SIZE and the BLOCKSIZE/Erasesize (MR24 blocksize in
image/Makefile seems wrong as well...).

But for now, it's set to 64KiB. If this is the way forward
we add enable getconf and get the PAGESIZE at runtime. If not,
we need to come up with something else.
(It's also possible to do some changes in  ubi's block code or
squashfs kernel code to mitigate the padding, but I don't think
the maintainers will even look at it).


Regards,
Christian
---
From 803cab7d585ebb85362357d008067caf645a7f17 Mon Sep 17 00:00:00 2001
From: Christian Lamparter <chunkeey@gmail.com>
Date: Sat, 24 Aug 2019 12:55:40 +0200
Subject: [PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes

SquashFS's HOWTO states in the section "Using mksquashfs"
<https://elinux.org/Squash_FS_Howto#Using_mksquashfs>
that a padding is necessary "for the filesystem to be used
on block devices."

OpenWrt's mksquashfs for the rootfs (which is usually
squashfs) is instructed to skip the padding via the nopad
option because the rootfs will be padded by functions like
pad-rootfs to the eraseblocksize which is usually much
bigger at somewhere 64KiB.

But this is a problem for squashfs as rootfs in ubi
partitions. Currently no explicit padding is being
set and it currently works for the most time because
ubi volumes are always aligned to LEBs which could
be close enough for 4KiB paddings...

Digging down deeper revealed that squashfs excepts blocks
to be up to PAGE_SIZE. This is explained in this bug report
that states that the 4k padding for ARCHs with 64KiB pages
resulted in "attempt access beyond end of device" errors:
<https://sourceforge.net/p/squashfs/mailman/message/28307755/>

This patch changes sysupgrade to follow fstools with its
ROOTDEV_OVERLAY_ALIGN (=64KiB) and aligns squashfs rootfs
filesystem to the same amount, while also changing the
ubinize script to apply the alignment in the same manner.
(More additions would be welcome. Note: ubinize and
ubimkvol don't support alignment values that are bigger
than a LEB!)

Reported-by: Russell Senior <russell@personaltelco.net>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
---
 package/base-files/files/lib/upgrade/nand.sh | 12 +++++++++---
 scripts/ubinize-image.sh                     | 12 +++++++++++-
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/package/base-files/files/lib/upgrade/nand.sh b/package/base-files/files/lib/upgrade/nand.sh
index fbc2b5c19a..7eb9632a06 100644
--- a/package/base-files/files/lib/upgrade/nand.sh
+++ b/package/base-files/files/lib/upgrade/nand.sh
@@ -174,11 +174,17 @@ nand_upgrade_prepare_ubi() {
 
 	# update rootfs
 	local root_size_param
-	if [ "$rootfs_type" = "ubifs" ]; then
+	case "$rootfs_type" in
+	"squashfs")
+		root_size_param="-s $(( ($rootfs_length + 65535) & ~65535))"
+		;;
+	"ubifs")
 		root_size_param="-m"
-	else
+		;;
+	*)
 		root_size_param="-s $rootfs_length"
-	fi
+		;;
+	esac
 	if ! ubimkvol /dev/$ubidev -N $CI_ROOTPART $root_size_param; then
 		echo "cannot create rootfs volume"
 		return 1;
diff --git a/scripts/ubinize-image.sh b/scripts/ubinize-image.sh
index a18d6dc428..06f4a3b995 100755
--- a/scripts/ubinize-image.sh
+++ b/scripts/ubinize-image.sh
@@ -18,6 +18,12 @@ is_ubifs() {
 	fi
 }
 
+is_squashfs() {
+	if [ "$( get_magic_word "$1" )" = "6873" ]; then
+		echo "1"
+	fi
+}
+
 ubivol() {
 	volid=$1
 	name=$2
@@ -69,7 +75,11 @@ ubilayout() {
 		ubivol $vol_id kernel "$3"
 		vol_id=$(( $vol_id + 1 ))
 	fi
-	ubivol $vol_id rootfs "$2" $root_is_ubifs
+	size=""
+	if [ -n "$( is_squashfs "$2" )" ]; then
+		size=$(( ($(wc -c < "$2") + 65535) & ~65535))
+	fi
+	ubivol $vol_id rootfs "$2" "$root_is_ubifs" "$size"
 	vol_id=$(( $vol_id + 1 ))
 	[ "$root_is_ubifs" ] || ubivol $vol_id rootfs_data "" 1
 }
Russell Senior Aug. 25, 2019, 6:16 a.m. UTC | #5
Some comments inline below ...

> On Saturday, August 24, 2019 2:18:55 AM CEST Russell Senior wrote:
> > >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> > 
> > > I've posted a similar message to the bugreport:
> > > <https://bugs.openwrt.org/index.php?do=details&task_id=2460>
> > 
> > I should have filed the bug first and linked it in my patch.
> I think it's fine. It depends on whenever there will be a
> discussion and where it will take place... But yeah, nobody
> can tell in advance how this will go.
> 
> On Saturday, August 24, 2019 2:59:31 AM CEST Russell Senior wrote:
> > >>>>> "Russell" == Russell Senior <russell@personaltelco.net> writes:
> > 
> > >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> > 
> > Russell> It's mostly inferred from the fact that I saw the error and
> > Russell> kernel panic, and glancing at the kernel squashfs code. I am
> > Russell> not pretending to understand completely how the squashfs kernel
> > Russell> code works, but the position of the error relative to the size
> > Russell> of the rootfs in my panic case strongly suggests it was trying
> > Russell> to read past the end of the ubi volume.
> > 
> > Oh, and I got important help finding this from Jonas Gorski. I was
> > remiss in not mentioning that.
> > 
> Ok, Let's add him to the CC then. As well as some of the 
> "ramips: Fix and tidy up IMAGE_SIZE #2226" and 
> "[RFC] Use DTS firmware partition to obsolete IMAGE_SIZE #2310"
> 
> https://github.com/openwrt/openwrt/pull/2226
> https://github.com/openwrt/openwrt/pull/2310
> 
> crowd. Because this will likely affect them as well... 
> But they might not know it. In any case: "Welcome everyone! :-D".
> 
> > > What's happening here is that mksquashfs4 is being told
> > > through the "-nopad" option to "do not pad filesystem to a
> > > multiple of 4K" [0].
> > 
> > > |define Image/mkfs/squashfs |
> > > $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call
> > > mkfs_target_dir,$(1)) $@ \ | -nopad -noappend -root-owned \ |
> > > -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \ | -processors 1 |endef
> > 
> > > My guess is that this affects more than just the MR24 (I'm
> > > looking at you too: pad2jffs and various pad-to $value)
> > > . I've tried tracking down the change that added the "-nopad"
> > > and found it in a commit from 2005 titled: "add some changes
> > > from whiterussian to head" [1] [2]:
> > 
> > I agree that other devices where rootfs is squashfs and lives on a ubi
> > volume are probaby also vulnerable to this problem. Regrettably, I haven't
> > thought of any other of those devices that I have on hand to test. 
> > 
> > > | $(KDIR)/root.squashfs: | @mkdir -p $(KDIR)/root/jffs |-
> > > $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend
> > > -root-owned -le |+ $(STAGING_DIR)/bin/mksquashfs-lzma
> > > $(KDIR)/root $@ -nopad -noappend -root-owned -le
> > 
> > 
> > > So, this is really old...
> > 
> > > Question is, should we just drop the -nopad? Since as you
> > > established, in the described corner-case (see above)
> > > squashfs needs this 4k padding and the generated squashfs
> > > could be considered broken otherwise?
> > 
> > I'm under the impression that the -nopad makes sense for NOR flash where
> > the kernel and rootfs are glued together, padding the isolated rootfs
> > would cause alignment problems or wasted space in the combined blobs.
> 
> Yes, that's the nod to padjffs2. That said,
> <https://sourceforge.net/p/squashfs/mailman/message/28307755/> makes
> it sound like that apart from the BLOCKSIZE, we also need to PAGE_SIZE?
> 
> (I think the APM821XX is a special case, since it can do 64KiB Pages
> as well as it's 32MiB SLC NAND that uses 16 KiB erase-blocks. So a
> PAGE can span up to 4 pages.
> 
> > 
> > > (Judging from your
> > > message, you went through the kernel code. Can you please
> > > share the place where the lack of the padding is breaking the
> > > fs?)
> > 
> > It's mostly inferred from the fact that I saw the error and kernel
> > panic, and glancing at the kernel squashfs code. I am not pretending to
> > understand completely how the squashfs kernel code works, but the
> > position of the error relative to the size of the rootfs in my panic
> > case strongly suggests it was trying to read past the end of the ubi
> > volume.
> > 
> > The error comes in the kernel's fs/squashfs/block.c in the
> > squashfs_read_data() function. There are a bunch of conditions (at least
> > 5) within the function (see "goto read_failure;") that will lead to that
> > message being printed.
> > 
> Well, that's a pity this could have saved a lot of time.
> 
> I've cobbered together a patch that deals with some of the
> padding issues at "ubimkvol" and "ubinize" time. The idea
> is that ideally we want to do the padding when we know
> PAGE_SIZE and the BLOCKSIZE/Erasesize (MR24 blocksize in
> image/Makefile seems wrong as well...).
> 
> But for now, it's set to 64KiB. If this is the way forward
> we add enable getconf and get the PAGESIZE at runtime. If not,
> we need to come up with something else.
> (It's also possible to do some changes in  ubi's block code or
> squashfs kernel code to mitigate the padding, but I don't think
> the maintainers will even look at it).
> 
> 
> Regards,
> Christian
> ---
> From 803cab7d585ebb85362357d008067caf645a7f17 Mon Sep 17 00:00:00 2001
> From: Christian Lamparter <chunkeey@gmail.com>
> Date: Sat, 24 Aug 2019 12:55:40 +0200
> Subject: [PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes
> 
> SquashFS's HOWTO states in the section "Using mksquashfs"
> <https://elinux.org/Squash_FS_Howto#Using_mksquashfs>
> that a padding is necessary "for the filesystem to be used
> on block devices."
> 
> OpenWrt's mksquashfs for the rootfs (which is usually
> squashfs) is instructed to skip the padding via the nopad
> option because the rootfs will be padded by functions like
> pad-rootfs to the eraseblocksize which is usually much
> bigger at somewhere 64KiB.

Note also, e.g. tplink,tl-wdr3600-v1:

[    0.428860] m25p80 spi0.0: en25q64 (8192 Kbytes)
[    0.433638] 3 fixed-partitions partitions found on MTD device spi0.0
[    0.440112] Creating 3 MTD partitions on "spi0.0":
[    0.444991] 0x000000000000-0x000000020000 : "u-boot"
[    0.450914] 0x000000020000-0x0000007f0000 : "firmware"
[    0.459935] 2 tplink-fw partitions found on MTD device firmware
[    0.465951] Creating 2 MTD partitions on "firmware":
[    0.471047] 0x000000000000-0x0000001b6b1b : "kernel"
[    0.476924] 0x0000001b6b1c-0x0000007d0000 : "rootfs"

netgear,wndr3800:

[    0.671227] m25p80 spi0.0: mx25l12805d (16384 Kbytes)
[    0.676366] 4 fixed-partitions partitions found on MTD device spi0.0
[    0.682724] Creating 4 MTD partitions on "spi0.0":
[    0.687508] 0x000000000000-0x000000050000 : "u-boot"
[    0.693223] 0x000000050000-0x000000070000 : "u-boot-env"
[    0.699163] 0x000000070000-0x000000ff0000 : "firmware"
[    0.707493] 2 netgear-fw partitions found on MTD device firmware
[    0.713550] Creating 2 MTD partitions on "firmware":
[    0.718507] 0x000000000000-0x0000001bd440 : "kernel"
[    0.724195] 0x0000001bd440-0x000000f80000 : "rootfs"

and netgear wgt634u:

[    1.245465] 3 bcm47xxpart partitions found on MTD device
physmap-flash.0
[    1.252454] Creating 3 MTD partitions on "physmap-flash.0":
[    1.258364] 0x000000000000-0x0000000a0000 : "boot"
[    1.286600] 0x0000000a0000-0x0000007e0000 : "firmware"
[    1.298172] 3 trx partitions found on MTD device firmware
[    1.303639] Creating 3 MTD partitions on "firmware":
[    1.308944] 0x00000000001c-0x000000000948 : "loader"
[    1.331486] 0x000000000948-0x000000139800 : "linux"
[    1.348577] 0x000000139800-0x000000740000 : "rootfs"

as examples where the rootfs starts at unaligned addresses. If you
padded the rootfs individually, the combination of kernel+rootfs would
unnecessarily waste space.

> But this is a problem for squashfs as rootfs in ubi
> partitions. Currently no explicit padding is being
> set and it currently works for the most time because
> ubi volumes are always aligned to LEBs which could
> be close enough for 4KiB paddings...
> 
> Digging down deeper revealed that squashfs excepts blocks

trivial spelling fix, that should be "accepts", I think...

> to be up to PAGE_SIZE. This is explained in this bug report
> that states that the 4k padding for ARCHs with 64KiB pages
> resulted in "attempt access beyond end of device" errors:
> <https://sourceforge.net/p/squashfs/mailman/message/28307755/>

AFAICT, the PAGE_SIZE on Meraki MR24 is 4k. In the kernel's
include/asm-generic/page.h, we have:

  /* PAGE_SHIFT determines the page size */
  
  #define PAGE_SHIFT      12
  #ifdef __ASSEMBLY__
  #define PAGE_SIZE       (1 << PAGE_SHIFT)
  #else
  #define PAGE_SIZE       (1UL << PAGE_SHIFT)
  #endif

> 
> This patch changes sysupgrade to follow fstools with its
> ROOTDEV_OVERLAY_ALIGN (=64KiB) and aligns squashfs rootfs
> filesystem to the same amount, while also changing the
> ubinize script to apply the alignment in the same manner.
> (More additions would be welcome. Note: ubinize and
> ubimkvol don't support alignment values that are bigger
> than a LEB!)

When Jonas and I were discussing this, we noted that sysupgrade changes
won't be installed the first time you do this, so a lack of padding to
the root.squashfs can still result in boot looping.

Since Meraki MR24 (afaict) doesn't use the ubinize-image.sh script, it
won't be protected.

> 
> Reported-by: Russell Senior <russell@personaltelco.net>
> Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> ---
>  package/base-files/files/lib/upgrade/nand.sh | 12 +++++++++---
>  scripts/ubinize-image.sh                     | 12 +++++++++++-
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/package/base-files/files/lib/upgrade/nand.sh b/package/base-files/files/lib/upgrade/nand.sh
> index fbc2b5c19a..7eb9632a06 100644
> --- a/package/base-files/files/lib/upgrade/nand.sh
> +++ b/package/base-files/files/lib/upgrade/nand.sh
> @@ -174,11 +174,17 @@ nand_upgrade_prepare_ubi() {
>  
>  	# update rootfs
>  	local root_size_param
> -	if [ "$rootfs_type" = "ubifs" ]; then
> +	case "$rootfs_type" in
> +	"squashfs")
> +		root_size_param="-s $(( ($rootfs_length + 65535) & ~65535))"
> +		;;
> +	"ubifs")
>  		root_size_param="-m"
> -	else
> +		;;
> +	*)
>  		root_size_param="-s $rootfs_length"
> -	fi
> +		;;
> +	esac
>  	if ! ubimkvol /dev/$ubidev -N $CI_ROOTPART $root_size_param; then
>  		echo "cannot create rootfs volume"
>  		return 1;
> diff --git a/scripts/ubinize-image.sh b/scripts/ubinize-image.sh
> index a18d6dc428..06f4a3b995 100755
> --- a/scripts/ubinize-image.sh
> +++ b/scripts/ubinize-image.sh
> @@ -18,6 +18,12 @@ is_ubifs() {
>  	fi
>  }
>  
> +is_squashfs() {
> +	if [ "$( get_magic_word "$1" )" = "6873" ]; then
> +		echo "1"
> +	fi
> +}
> +
>  ubivol() {
>  	volid=$1
>  	name=$2
> @@ -69,7 +75,11 @@ ubilayout() {
>  		ubivol $vol_id kernel "$3"
>  		vol_id=$(( $vol_id + 1 ))
>  	fi
> -	ubivol $vol_id rootfs "$2" $root_is_ubifs
> +	size=""
> +	if [ -n "$( is_squashfs "$2" )" ]; then
> +		size=$(( ($(wc -c < "$2") + 65535) & ~65535))
> +	fi
> +	ubivol $vol_id rootfs "$2" "$root_is_ubifs" "$size"
>  	vol_id=$(( $vol_id + 1 ))
>  	[ "$root_is_ubifs" ] || ubivol $vol_id rootfs_data "" 1
>  }
> -- 
> 2.23.0
> 
> 
> 
> 
> ---
> 
> 
> 
> 
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Christian Lamparter Aug. 25, 2019, 12:21 p.m. UTC | #6
[Fixed Mathias Email]

On Sunday, August 25, 2019 8:16:48 AM CEST Russell Senior wrote:
> > On Saturday, August 24, 2019 2:18:55 AM CEST Russell Senior wrote:
> > > > What's happening here is that mksquashfs4 is being told
> > > > through the "-nopad" option to "do not pad filesystem to a
> > > > multiple of 4K" [0].
> > > 
> > > > |define Image/mkfs/squashfs |
> > > > $(STAGING_DIR_HOST)/bin/mksquashfs4 $(call
> > > > mkfs_target_dir,$(1)) $@ \ | -nopad -noappend -root-owned \ |
> > > > -comp $(SQUASHFSCOMP) $(SQUASHFSOPT) \ | -processors 1 |endef
> > > 
> > > > My guess is that this affects more than just the MR24 (I'm
> > > > looking at you too: pad2jffs and various pad-to $value)
> > > > . I've tried tracking down the change that added the "-nopad"
> > > > and found it in a commit from 2005 titled: "add some changes
> > > > from whiterussian to head" [1] [2]:
> > > 
> > > I agree that other devices where rootfs is squashfs and lives on a ubi
> > > volume are probaby also vulnerable to this problem. Regrettably, I haven't
> > > thought of any other of those devices that I have on hand to test. 
> > > 
> > > > | $(KDIR)/root.squashfs: | @mkdir -p $(KDIR)/root/jffs |-
> > > > $(STAGING_DIR)/bin/mksquashfs-lzma $(KDIR)/root $@ -noappend
> > > > -root-owned -le |+ $(STAGING_DIR)/bin/mksquashfs-lzma
> > > > $(KDIR)/root $@ -nopad -noappend -root-owned -le
> > > 
> > > 
> > > > So, this is really old...
> > > 
> > > > Question is, should we just drop the -nopad? Since as you
> > > > established, in the described corner-case (see above)
> > > > squashfs needs this 4k padding and the generated squashfs
> > > > could be considered broken otherwise?
> > > 
> > > I'm under the impression that the -nopad makes sense for NOR flash where
> > > the kernel and rootfs are glued together, padding the isolated rootfs
> > > would cause alignment problems or wasted space in the combined blobs.
> > 
> > Yes, that's the nod to padjffs2. That said,
> > <https://sourceforge.net/p/squashfs/mailman/message/28307755/> makes
> > it sound like that apart from the BLOCKSIZE, we also need to PAGE_SIZE?
> > 
> > (I think the APM821XX is a special case, since it can do 64KiB Pages
> > as well as it's 32MiB SLC NAND that uses 16 KiB erase-blocks. So a
> > PAGE can span up to 4 pages.
> > 
> > > 
> > > > (Judging from your
> > > > message, you went through the kernel code. Can you please
> > > > share the place where the lack of the padding is breaking the
> > > > fs?)
> > > 
> > > It's mostly inferred from the fact that I saw the error and kernel
> > > panic, and glancing at the kernel squashfs code. I am not pretending to
> > > understand completely how the squashfs kernel code works, but the
> > > position of the error relative to the size of the rootfs in my panic
> > > case strongly suggests it was trying to read past the end of the ubi
> > > volume.
> > > 
> > > The error comes in the kernel's fs/squashfs/block.c in the
> > > squashfs_read_data() function. There are a bunch of conditions (at least
> > > 5) within the function (see "goto read_failure;") that will lead to that
> > > message being printed.
> > > 
> > Well, that's a pity this could have saved a lot of time.
> > 
> > I've cobbered together a patch that deals with some of the
> > padding issues at "ubimkvol" and "ubinize" time. The idea
> > is that ideally we want to do the padding when we know
> > PAGE_SIZE and the BLOCKSIZE/Erasesize (MR24 blocksize in
> > image/Makefile seems wrong as well...).
> > 
> > But for now, it's set to 64KiB. If this is the way forward
> > we add enable getconf and get the PAGESIZE at runtime. If not,
> > we need to come up with something else.
> > (It's also possible to do some changes in  ubi's block code or
> > squashfs kernel code to mitigate the padding, but I don't think
> > the maintainers will even look at it).
> > 
> > 
> > Regards,
> > Christian
> > ---
> > From 803cab7d585ebb85362357d008067caf645a7f17 Mon Sep 17 00:00:00 2001
> > From: Christian Lamparter <chunkeey@gmail.com>
> > Date: Sat, 24 Aug 2019 12:55:40 +0200
> > Subject: [PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes
> > 
> > SquashFS's HOWTO states in the section "Using mksquashfs"
> > <https://elinux.org/Squash_FS_Howto#Using_mksquashfs>
> > that a padding is necessary "for the filesystem to be used
> > on block devices."
> > 
> > OpenWrt's mksquashfs for the rootfs (which is usually
> > squashfs) is instructed to skip the padding via the nopad
> > option because the rootfs will be padded by functions like
> > pad-rootfs to the eraseblocksize which is usually much
> > bigger at somewhere 64KiB.
> 
> Note also, e.g. tplink,tl-wdr3600-v1:
>
> [    0.428860] m25p80 spi0.0: en25q64 (8192 Kbytes)
> [    0.433638] 3 fixed-partitions partitions found on MTD device spi0.0
> [    0.440112] Creating 3 MTD partitions on "spi0.0":
> [    0.444991] 0x000000000000-0x000000020000 : "u-boot"
> [    0.450914] 0x000000020000-0x0000007f0000 : "firmware"
> [    0.459935] 2 tplink-fw partitions found on MTD device firmware
> [    0.465951] Creating 2 MTD partitions on "firmware":
> [    0.471047] 0x000000000000-0x0000001b6b1b : "kernel"
> [    0.476924] 0x0000001b6b1c-0x0000007d0000 : "rootfs"
> 
> netgear,wndr3800:
> 
> [    0.671227] m25p80 spi0.0: mx25l12805d (16384 Kbytes)
> [    0.676366] 4 fixed-partitions partitions found on MTD device spi0.0
> [    0.682724] Creating 4 MTD partitions on "spi0.0":
> [    0.687508] 0x000000000000-0x000000050000 : "u-boot"
> [    0.693223] 0x000000050000-0x000000070000 : "u-boot-env"
> [    0.699163] 0x000000070000-0x000000ff0000 : "firmware"
> [    0.707493] 2 netgear-fw partitions found on MTD device firmware
> [    0.713550] Creating 2 MTD partitions on "firmware":
> [    0.718507] 0x000000000000-0x0000001bd440 : "kernel"
> [    0.724195] 0x0000001bd440-0x000000f80000 : "rootfs"
> 
> and netgear wgt634u:
> 
> [    1.245465] 3 bcm47xxpart partitions found on MTD device
> physmap-flash.0
> [    1.252454] Creating 3 MTD partitions on "physmap-flash.0":
> [    1.258364] 0x000000000000-0x0000000a0000 : "boot"
> [    1.286600] 0x0000000a0000-0x0000007e0000 : "firmware"
> [    1.298172] 3 trx partitions found on MTD device firmware
> [    1.303639] Creating 3 MTD partitions on "firmware":
> [    1.308944] 0x00000000001c-0x000000000948 : "loader"
> [    1.331486] 0x000000000948-0x000000139800 : "linux"
> [    1.348577] 0x000000139800-0x000000740000 : "rootfs"
> 
> as examples where the rootfs starts at unaligned addresses. If you
> padded the rootfs individually, the combination of kernel+rootfs would
> unnecessarily waste space.
Aren't these all devices with spi-nor? They all just place the kernel +
rootfs (squashfs) in a "firmware" mtd partitions and the mtdsplit is doing
its magic. I think this is a bit out of the scope. This patch should not
interfere with them and the unaligned squashfs rootfs starts is not a
problem from what I can tell. That said removing the "-nopad" switch or
adding the padding with "pad-squashfs" you made should make no difference
in regards to the missing padding between the linux/kernel and rootfs
since this isn't the problem.

> > But this is a problem for squashfs as rootfs in ubi
> > partitions. Currently no explicit padding is being
> > set and it currently works for the most time because
> > ubi volumes are always aligned to LEBs which could
> > be close enough for 4KiB paddings...
> > 
> > Digging down deeper revealed that squashfs excepts blocks
> 
> trivial spelling fix, that should be "accepts", I think...
Not sure if it's trivial or not, but yes the "excepts" is wrong,
I think it should have been "expects". But... I've still hope
that "-nopad will be the way" so I'm prepared to just drop this
patch again :D.

> > to be up to PAGE_SIZE. This is explained in this bug report
> > that states that the 4k padding for ARCHs with 64KiB pages
> > resulted in "attempt access beyond end of device" errors:
> > <https://sourceforge.net/p/squashfs/mailman/message/28307755/>
> 
> AFAICT, the PAGE_SIZE on Meraki MR24 is 4k. In the kernel's
> include/asm-generic/page.h, we have:
The APM821xx SoC supports 4KiB, 16KiB and 64KiB page sizes.
Ie: <https://cateee.net/lkddb/web-lkddb/PPC_64K_PAGES.html>
OpenWrt currently defaults to 4KiB, but the 16KiB and 64KiB
do provide a throughput boost and they are easily enabled
by just editing config-default a bit.

>   /* PAGE_SHIFT determines the page size */
>   
>   #define PAGE_SHIFT      12
>   #ifdef __ASSEMBLY__
>   #define PAGE_SIZE       (1 << PAGE_SHIFT)
>   #else
>   #define PAGE_SIZE       (1UL << PAGE_SHIFT)
>   #endif
> 
> > 
> > This patch changes sysupgrade to follow fstools with its
> > ROOTDEV_OVERLAY_ALIGN (=64KiB) and aligns squashfs rootfs
> > filesystem to the same amount, while also changing the
> > ubinize script to apply the alignment in the same manner.
> > (More additions would be welcome. Note: ubinize and
> > ubimkvol don't support alignment values that are bigger
> > than a LEB!)
> 
> When Jonas and I were discussing this, we noted that sysupgrade changes
> won't be installed the first time you do this, so a lack of padding to
> the root.squashfs can still result in boot looping.
>
> Since Meraki MR24 (afaict) doesn't use the ubinize-image.sh script, it
> won't be protected.
Are you using an alternative flashing approach?

The wiki: <https://openwrt.org/toh/meraki/mr24> notes that for the initial
install, the MR24 should boot off an tftp-loaded initramfs and then the 
user has to use sysupgrade to install the real image.
Yes, existing initramfs + installs will have to be updated before this will
work. But then, this bug has sadly been present from the beginning and on
many other routers as well and nobody fixed it yet. It's definitely a bad
bug though.

> > Reported-by: Russell Senior <russell@personaltelco.net>
> > Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
> > ---
> >  package/base-files/files/lib/upgrade/nand.sh | 12 +++++++++---
> >  scripts/ubinize-image.sh                     | 12 +++++++++++-
> >  2 files changed, 20 insertions(+), 4 deletions(-)
> > 
> > diff --git a/package/base-files/files/lib/upgrade/nand.sh b/package/base-files/files/lib/upgrade/nand.sh
> > index fbc2b5c19a..7eb9632a06 100644
> > --- a/package/base-files/files/lib/upgrade/nand.sh
> > +++ b/package/base-files/files/lib/upgrade/nand.sh
> > @@ -174,11 +174,17 @@ nand_upgrade_prepare_ubi() {
> >  
> >  	# update rootfs
> >  	local root_size_param
> > -	if [ "$rootfs_type" = "ubifs" ]; then
> > +	case "$rootfs_type" in
> > +	"squashfs")
> > +		root_size_param="-s $(( ($rootfs_length + 65535) & ~65535))"
> > +		;;
> > +	"ubifs")
> >  		root_size_param="-m"
> > -	else
> > +		;;
> > +	*)
> >  		root_size_param="-s $rootfs_length"
> > -	fi
> > +		;;
> > +	esac
> >  	if ! ubimkvol /dev/$ubidev -N $CI_ROOTPART $root_size_param; then
> >  		echo "cannot create rootfs volume"
> >  		return 1;
> > diff --git a/scripts/ubinize-image.sh b/scripts/ubinize-image.sh
> > index a18d6dc428..06f4a3b995 100755
> > --- a/scripts/ubinize-image.sh
> > +++ b/scripts/ubinize-image.sh
> > @@ -18,6 +18,12 @@ is_ubifs() {
> >  	fi
> >  }
> >  
> > +is_squashfs() {
> > +	if [ "$( get_magic_word "$1" )" = "6873" ]; then
> > +		echo "1"
> > +	fi
> > +}
> > +
> >  ubivol() {
> >  	volid=$1
> >  	name=$2
> > @@ -69,7 +75,11 @@ ubilayout() {
> >  		ubivol $vol_id kernel "$3"
> >  		vol_id=$(( $vol_id + 1 ))
> >  	fi
> > -	ubivol $vol_id rootfs "$2" $root_is_ubifs
> > +	size=""
> > +	if [ -n "$( is_squashfs "$2" )" ]; then
> > +		size=$(( ($(wc -c < "$2") + 65535) & ~65535))
> > +	fi
> > +	ubivol $vol_id rootfs "$2" "$root_is_ubifs" "$size"
> >  	vol_id=$(( $vol_id + 1 ))
> >  	[ "$root_is_ubifs" ] || ubivol $vol_id rootfs_data "" 1
> >  }
Russell Senior Aug. 25, 2019, 1:23 p.m. UTC | #7
trimming a bit for easier reading ...

> [Fixed Mathias Email]

> > > [...]
> > > OpenWrt's mksquashfs for the rootfs (which is usually
> > > squashfs) is instructed to skip the padding via the nopad
> > > option because the rootfs will be padded by functions like
> > > pad-rootfs to the eraseblocksize which is usually much
> > > bigger at somewhere 64KiB.
> > 
> > Note also, e.g. tplink,tl-wdr3600-v1:
> >
> > [    0.428860] m25p80 spi0.0: en25q64 (8192 Kbytes)
> > [    0.433638] 3 fixed-partitions partitions found on MTD device spi0.0
> > [    0.440112] Creating 3 MTD partitions on "spi0.0":
> > [    0.444991] 0x000000000000-0x000000020000 : "u-boot"
> > [    0.450914] 0x000000020000-0x0000007f0000 : "firmware"
> > [    0.459935] 2 tplink-fw partitions found on MTD device firmware
> > [    0.465951] Creating 2 MTD partitions on "firmware":
> > [    0.471047] 0x000000000000-0x0000001b6b1b : "kernel"
> > [    0.476924] 0x0000001b6b1c-0x0000007d0000 : "rootfs"
> > 
> > netgear,wndr3800:
> > 
> > [    0.671227] m25p80 spi0.0: mx25l12805d (16384 Kbytes)
> > [    0.676366] 4 fixed-partitions partitions found on MTD device spi0.0
> > [    0.682724] Creating 4 MTD partitions on "spi0.0":
> > [    0.687508] 0x000000000000-0x000000050000 : "u-boot"
> > [    0.693223] 0x000000050000-0x000000070000 : "u-boot-env"
> > [    0.699163] 0x000000070000-0x000000ff0000 : "firmware"
> > [    0.707493] 2 netgear-fw partitions found on MTD device firmware
> > [    0.713550] Creating 2 MTD partitions on "firmware":
> > [    0.718507] 0x000000000000-0x0000001bd440 : "kernel"
> > [    0.724195] 0x0000001bd440-0x000000f80000 : "rootfs"
> > 
> > and netgear wgt634u:
> > 
> > [    1.245465] 3 bcm47xxpart partitions found on MTD device
> > physmap-flash.0
> > [    1.252454] Creating 3 MTD partitions on "physmap-flash.0":
> > [    1.258364] 0x000000000000-0x0000000a0000 : "boot"
> > [    1.286600] 0x0000000a0000-0x0000007e0000 : "firmware"
> > [    1.298172] 3 trx partitions found on MTD device firmware
> > [    1.303639] Creating 3 MTD partitions on "firmware":
> > [    1.308944] 0x00000000001c-0x000000000948 : "loader"
> > [    1.331486] 0x000000000948-0x000000139800 : "linux"
> > [    1.348577] 0x000000139800-0x000000740000 : "rootfs"
> > 
> > as examples where the rootfs starts at unaligned addresses. If you
> > padded the rootfs individually, the combination of kernel+rootfs would
> > unnecessarily waste space.
> Aren't these all devices with spi-nor? They all just place the kernel +
> rootfs (squashfs) in a "firmware" mtd partitions and the mtdsplit is doing
> its magic. [...]

Yes, agreed. These are examples of why we shouldn't just remove the
-nopad, and actually mostly just my own sanity check that I remembered
this on some devices (other spi-nor devices, such as the buffalo
wzr600dhp seem to align the rootfs). Oh, and (irrelevant detail) the
wgt634u is NOR, but not spi.

> I think this is a bit out of the scope. This patch should not
> interfere with them and the unaligned squashfs rootfs starts is not a
> problem from what I can tell. That said removing the "-nopad" switch or
> adding the padding with "pad-squashfs" you made should make no difference
> in regards to the missing padding between the linux/kernel and rootfs
> since this isn't the problem.

I was under the impression that removing the -nopad switch would pad out
the root.squashfs out to a 4k boundary before concatenation, leading to
a further padding of the concatenation since the padding will (in
general) hang over a 4k boundary.

> > > But this is a problem for squashfs as rootfs in ubi
> > > partitions. Currently no explicit padding is being
> > > set and it currently works for the most time because
> > > ubi volumes are always aligned to LEBs which could
> > > be close enough for 4KiB paddings...
> > > 
> > > Digging down deeper revealed that squashfs excepts blocks
> > 
> > trivial spelling fix, that should be "accepts", I think...
> Not sure if it's trivial or not, but yes the "excepts" is wrong,
> I think it should have been "expects". But... I've still hope
> that "-nopad will be the way" so I'm prepared to just drop this
> patch again :D.
> 
> > > to be up to PAGE_SIZE. This is explained in this bug report
> > > that states that the 4k padding for ARCHs with 64KiB pages
> > > resulted in "attempt access beyond end of device" errors:
> > > <https://sourceforge.net/p/squashfs/mailman/message/28307755/>
> > 
> > AFAICT, the PAGE_SIZE on Meraki MR24 is 4k. In the kernel's
> > include/asm-generic/page.h, we have:
> The APM821xx SoC supports 4KiB, 16KiB and 64KiB page sizes.
> Ie: <https://cateee.net/lkddb/web-lkddb/PPC_64K_PAGES.html>
> OpenWrt currently defaults to 4KiB, but the 16KiB and 64KiB
> do provide a throughput boost and they are easily enabled
> by just editing config-default a bit.

The MR24's NAND is only 32MB (okay, that's not tiny), but I'm all for
optimizing for size.

> > When Jonas and I were discussing this, we noted that sysupgrade changes
> > won't be installed the first time you do this, so a lack of padding to
> > the root.squashfs can still result in boot looping.
> >
> > Since Meraki MR24 (afaict) doesn't use the ubinize-image.sh script, it
> > won't be protected.
> Are you using an alternative flashing approach?
> 
> The wiki: <https://openwrt.org/toh/meraki/mr24> notes that for the initial
> install, the MR24 should boot off an tftp-loaded initramfs and then the 
> user has to use sysupgrade to install the real image.
> Yes, existing initramfs + installs will have to be updated before this will
> work. But then, this bug has sadly been present from the beginning and on
> many other routers as well and nobody fixed it yet. It's definitely a bad
> bug though.

Reguar sysupgrade.bin is a tarball. I normally use sysupgrade. The
initial install from stock meraki firmware is tftp booting an initramfs,
and that shouldn't be a problem. The problem I'm referring to is going
from an existing openwrt install to a new "fixed" one without tftp
booting.  If we are depending on the already-on-device sysupgrade
nand.sh process, it hasn't been fixed yet. Padding the root.squashfs as
delivered in the tarball will avoid the bug no matter the state of
nand.sh.
Russell Senior Aug. 29, 2019, 11:20 a.m. UTC | #8
Fwiw, I took a little closer look at the squashfs code. I still don't
quite understand it, but I sprinkled some printk()'s and got a better
idea of what is happening.

With a root.squashfs of 6428672 bytes, we get the error in a call:

  squashfs_read_data(sb=(ptrval),index=0,length=6427986,next_index=16777224,output=  (null))

it enters the loop at fs/squashfs/block.c:122 with b=0; bytes=-338; length=8; cur_index=0

    for (b = 0; bytes < length; b++, cur_index++) {
        bh[b] = sb_getblk(sb, cur_index);
        if (bh[b] == NULL)
            goto block_release;

sb_getblk() must return NULL, because it goes to block_release and falls
through to print the error message and exits with an error.
Russell Senior Aug. 29, 2019, 11:39 a.m. UTC | #9
> Fwiw, I took a little closer look at the squashfs code. I still don't
> quite understand it, but I sprinkled some printk()'s and got a better
> idea of what is happening.
> 
> With a root.squashfs of 6428672 bytes, we get the error in a call:

Actually, the 6428672 bytes was from a later trial. Sorry for the
confusion. I'm not sure what the real root.squashfs size was
anymore. Gah. I'll need to redo it.

> 
>   squashfs_read_data(sb=(ptrval),index=0,length=6427986,next_index=16777224,output=  (null))
> 
> it enters the loop at fs/squashfs/block.c:122 with b=0; bytes=-338; length=8; cur_index=0
> 
>     for (b = 0; bytes < length; b++, cur_index++) {
>         bh[b] = sb_getblk(sb, cur_index);
>         if (bh[b] == NULL)
>             goto block_release;
> 
> sb_getblk() must return NULL, because it goes to block_release and falls
> through to print the error message and exits with an error.
Russell Senior Aug. 29, 2019, 11:55 a.m. UTC | #10
>> Fwiw, I took a little closer look at the squashfs code. I still don't
>> quite understand it, but I sprinkled some printk()'s and got a better
>> idea of what is happening.
>> 
>> With a root.squashfs of 6428672 bytes, we get the error in a call:
>
> Actually, the 6428672 bytes was from a later trial. Sorry for the
> confusion. I'm not sure what the real root.squashfs size was
> anymore. Gah. I'll need to redo it.

Here are the corresponding numbers from the retry:

root.squasfs size (from the sysupgrade.bin tarball): 6427978 bytes

squashfs_read_data(sb=(ptrval),index=0,length=6427970,next_index=16777224,output=(null))
b=0; bytes=-322; length=8; cur_index=0

>> 
>>   squashfs_read_data(sb=(ptrval),index=0,length=6427986,next_index=16777224,output=  (null))
>> 
>> it enters the loop at fs/squashfs/block.c:122 with b=0; bytes=-338; length=8; cur_index=0
>> 
>>     for (b = 0; bytes < length; b++, cur_index++) {
>>         bh[b] = sb_getblk(sb, cur_index);
>>         if (bh[b] == NULL)
>>             goto block_release;
>> 
>> sb_getblk() must return NULL, because it goes to block_release and falls
>> through to print the error message and exits with an error.
Christian Lamparter Aug. 30, 2019, 6:51 p.m. UTC | #11
On Thursday, August 29, 2019 1:55:25 PM CEST Russell Senior wrote:
> 
> >> Fwiw, I took a little closer look at the squashfs code. I still don't
> >> quite understand it, but I sprinkled some printk()'s and got a better
> >> idea of what is happening.
> >> 
> >> With a root.squashfs of 6428672 bytes, we get the error in a call:
> >
> > Actually, the 6428672 bytes was from a later trial. Sorry for the
> > confusion. I'm not sure what the real root.squashfs size was
> > anymore. Gah. I'll need to redo it.
> 
> Here are the corresponding numbers from the retry:
> 
> root.squasfs size (from the sysupgrade.bin tarball): 6427978 bytes
> 
> squashfs_read_data(sb=(ptrval),index=0,length=6427970,next_index=16777224,output=(null))
> b=0; bytes=-322; length=8; cur_index=0
> 
> >> 
> >>   squashfs_read_data(sb=(ptrval),index=0,length=6427986,next_index=16777224,output=  (null))
> >> 
> >> it enters the loop at fs/squashfs/block.c:122 with b=0; bytes=-338; length=8; cur_index=0
> >> 
> >>     for (b = 0; bytes < length; b++, cur_index++) {
> >>         bh[b] = sb_getblk(sb, cur_index);
> >>         if (bh[b] == NULL)
> >>             goto block_release;
> >> 
> >> sb_getblk() must return NULL, because it goes to block_release and falls
> >> through to print the error message and exits with an error.
> 
Ok.

I did push a patch titled:
"build: remove harmful -nopad option from mksquashfs"
<https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875> 

so, let's see if this triggers more undefined behaviour 
and exposes more hidden broken code.

Regards,
Christian
Russell Senior Aug. 30, 2019, 9:10 p.m. UTC | #12
>>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:

Christian> Ok.

Christian> I did push a patch titled: "build: remove harmful -nopad
Christian> option from mksquashfs"
Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>

Christian> so, let's see if this triggers more undefined behaviour and
Christian> exposes more hidden broken code.

Just to re-iterate my earlier worry, if for example:

  kernel_size + rootfs_with_padding_size

crosses an erase block boundary that:

  kernel_size + rootfs_without_padding_size

does not, then we'll be wasting an erase block of flash space on NOR.


On a side note, I noticed that there were some checks[1] added to
kernels about a year ago (early august 2018) to squashfs code[1], to
protect against malformed squash filesystems that might have started
triggering the boot loop. This might explain why we haven't seen the
problem earlier (it might have been silently ignored).

[1] e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71755ee5350b63fb1f283de8561cdb61b47f4d1d
Christian Lamparter Aug. 30, 2019, 11:19 p.m. UTC | #13
On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> 
> Christian> Ok.
> 
> Christian> I did push a patch titled: "build: remove harmful -nopad
> Christian> option from mksquashfs"
> Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> 
> Christian> so, let's see if this triggers more undefined behaviour and
> Christian> exposes more hidden broken code.
> 
> Just to re-iterate my earlier worry, if for example:
> 
>   kernel_size + rootfs_with_padding_size
> 
> crosses an erase block boundary that:
> 
>   kernel_size + rootfs_without_padding_size
> 
> does not, then we'll be wasting an erase block of flash space on NOR.
Not to worry. I guess that part of the magic is also explained in the wiki:
<https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
<https://openwrt.org/docs/techref/flash.layout#explanations>
In case my attempt now gets too confusing :-/.

The "rootfs" partition also contains the rootfs_data inside. It should
be possible to verify this with any of the routers that use the "firmware"
parition label.

Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
to your WNDR3800). It looks like this

|# cat /proc/mtd
|dev:    size   erasesize  name
|mtd0: 00050000 00010000 "u-boot"
|mtd1: 00020000 00010000 "u-boot-env"
|mtd2: 00f80000 00010000 "firmware"
|mtd3: 0018c440 00010000 "kernel"
|mtd4: 00df3bc0 00010000 "rootfs"
|mtd5: 00620000 00010000 "rootfs_data"
|mtd6: 00010000 00010000 "art
 
The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
are talking about already has to exists between the squashfs portion and the
jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
partition (as it does with the kernel partition above) and your message from 
earlier with the three routers didn't include the "squashfs-split" and its
results. (I think that maybe this is the missing information that got lost?)

|[    0.655975] m25p80 spi0.0: mx25l12805d (16384 Kbytes)
|[    0.661058] 4 fixed-partitions partitions found on MTD device spi0.0
|[    0.667448] Creating 4 MTD partitions on "spi0.0":
|[    0.672234] 0x000000000000-0x000000050000 : "u-boot"
|[    0.677910] 0x000000050000-0x000000070000 : "u-boot-env"
|[    0.683834] 0x000000070000-0x000000ff0000 : "firmware"
|[    0.691804] 2 netgear-fw partitions found on MTD device firmware
|[    0.697856] Creating 2 MTD partitions on "firmware":
|[    0.702818] 0x000000000000-0x00000018c440 : "kernel"
|[    0.708443] 0x00000018c440-0x000000f80000 : "rootfs"
|[    0.713960] mtd: device 4 (rootfs) set to be root filesystem
|[    0.719688] 1 squashfs-split partitions found on MTD device rootfs
|[    0.725870] 0x000000960000-0x000000f80000 : "rootfs_data"
|[    0.731896] 0x000000ff0000-0x000001000000 : "art"


That is of course, you don't max-out the rootfs completely (to the last byte)
and leave no space for the overlay/jffs2... But in that case, mksquashfs 
should not need to add any padding since the partition is ending at a 4k block
already. (However, I don't think padjffs2 will be happy.)

> On a side note, I noticed that there were some checks[1] added to
> kernels about a year ago (early august 2018) to squashfs code[1], to
> protect against malformed squash filesystems that might have started
> triggering the boot loop. This might explain why we haven't seen the
> problem earlier (it might have been silently ignored).
> 
> [1] e.g. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71755ee5350b63fb1f283de8561cdb61b47f4d1d
It's more of an edge case for sure. I think I've must have hit it during
the development as well. I remember having a bad squashfs after a build 
that I just could not read even with the userspace unsquashfs utils either
on my PC. So I think we really want to provide "unsquashfs"/"binwalk" able
root.squashfs for debugging as well.

Cheers,
Christian
Jonas Gorski Aug. 31, 2019, 12:09 p.m. UTC | #14
Hi,

On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunkeey@gmail.com> wrote:
>
> On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> > >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> >
> > Christian> Ok.
> >
> > Christian> I did push a patch titled: "build: remove harmful -nopad
> > Christian> option from mksquashfs"
> > Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> >
> > Christian> so, let's see if this triggers more undefined behaviour and
> > Christian> exposes more hidden broken code.
> >
> > Just to re-iterate my earlier worry, if for example:
> >
> >   kernel_size + rootfs_with_padding_size
> >
> > crosses an erase block boundary that:
> >
> >   kernel_size + rootfs_without_padding_size
> >
> > does not, then we'll be wasting an erase block of flash space on NOR.
> Not to worry. I guess that part of the magic is also explained in the wiki:
> <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
> <https://openwrt.org/docs/techref/flash.layout#explanations>
> In case my attempt now gets too confusing :-/.
>
> The "rootfs" partition also contains the rootfs_data inside. It should
> be possible to verify this with any of the routers that use the "firmware"
> parition label.
>
> Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
> to your WNDR3800). It looks like this
>
> |# cat /proc/mtd
> |dev:    size   erasesize  name
> |mtd0: 00050000 00010000 "u-boot"
> |mtd1: 00020000 00010000 "u-boot-env"
> |mtd2: 00f80000 00010000 "firmware"
> |mtd3: 0018c440 00010000 "kernel"
> |mtd4: 00df3bc0 00010000 "rootfs"
> |mtd5: 00620000 00010000 "rootfs_data"
> |mtd6: 00010000 00010000 "art
>
> The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
> fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
> are talking about already has to exists between the squashfs portion and the
> jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
> block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
> partition (as it does with the kernel partition above) and your message from
> earlier with the three routers didn't include the "squashfs-split" and its
> results. (I think that maybe this is the missing information that got lost?)

After a bit more investigation, those devices that use padjffs2 (or
have the rootfs
start at an at least 4k aligned offset) will be fine.

The issue are those devices with an unaligned rootfs start, which put their own
EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link devices).

There it can happen that e.g. we get

0x18fff2  <squashfs end> 00 00 00 00
0x19000 00 00 00 00 ... <end of padding> FF FF
0x19010 FF FF FF ...
*
0x20000 DE AD C0 DE

The 0xff are only a guess, I haven't checked what the different
firmware utils use for padding the end of the rootfs - but mksquashfs
definitely uses 0x00.

which leads to:

1. squashfs-split will put the roofs_data partition start at 0x19000
because that's the first aligned erase block after the end of the
rootfs start + squashfs length
2. jffs2 will see that the first block is neither a valid jff2 node,
an EOF marker, nor all 0xff
3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1]

So removing the -nopad might actually break those devices.

We could fix this case by making sure that mksquashfs and all firmware
utils use 0xff's to pad (so the erase block will then be treated as
empty/all 0xff). But then there is the question how jffs2 reacts if
the first block is 0xff, and the second block is a valid jffs2 node,
which happens when we sysupgrade with keeping config on NOR devices.
The jffs2 code isn't the most readable code ... .


Regards
Jonas

[1]  https://elixir.bootlin.com/linux/latest/source/fs/jffs2/scan.c#L263

all dirty nodes will increase neither c->nr_free_blocks nor
empty_blocks nor badblocks, so the sum will be different from
c->nr_blocks, so jffs2 will refuse to mount
Christian Lamparter Aug. 31, 2019, 1:31 p.m. UTC | #15
Hello,

On Saturday, August 31, 2019 2:09:55 PM CEST Jonas Gorski wrote:
> On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunkeey@gmail.com> wrote:
> >
> > On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> > > >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> > >
> > > Christian> Ok.
> > >
> > > Christian> I did push a patch titled: "build: remove harmful -nopad
> > > Christian> option from mksquashfs"
> > > Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> > >
> > > Christian> so, let's see if this triggers more undefined behaviour and
> > > Christian> exposes more hidden broken code.
> > >
> > > Just to re-iterate my earlier worry, if for example:
> > >
> > >   kernel_size + rootfs_with_padding_size
> > >
> > > crosses an erase block boundary that:
> > >
> > >   kernel_size + rootfs_without_padding_size
> > >
> > > does not, then we'll be wasting an erase block of flash space on NOR.
> > Not to worry. I guess that part of the magic is also explained in the wiki:
> > <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
> > <https://openwrt.org/docs/techref/flash.layout#explanations>
> > In case my attempt now gets too confusing :-/.
> >
> > The "rootfs" partition also contains the rootfs_data inside. It should
> > be possible to verify this with any of the routers that use the "firmware"
> > parition label.
> >
> > Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
> > to your WNDR3800). It looks like this
> >
> > |# cat /proc/mtd
> > |dev:    size   erasesize  name
> > |mtd0: 00050000 00010000 "u-boot"
> > |mtd1: 00020000 00010000 "u-boot-env"
> > |mtd2: 00f80000 00010000 "firmware"
> > |mtd3: 0018c440 00010000 "kernel"
> > |mtd4: 00df3bc0 00010000 "rootfs"
> > |mtd5: 00620000 00010000 "rootfs_data"
> > |mtd6: 00010000 00010000 "art
> >
> > The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
> > fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
> > are talking about already has to exists between the squashfs portion and the
> > jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
> > block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
> > partition (as it does with the kernel partition above) and your message from
> > earlier with the three routers didn't include the "squashfs-split" and its
> > results. (I think that maybe this is the missing information that got lost?)
> 
> After a bit more investigation, those devices that use padjffs2 (or
> have the rootfs
> start at an at least 4k aligned offset) will be fine.
> 
> The issue are those devices with an unaligned rootfs start, which put their own
> EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link devices).
> 
> There it can happen that e.g. we get
> 
> 0x18fff2  <squashfs end> 00 00 00 00
> 0x19000 00 00 00 00 ... <end of padding> FF FF
> 0x19010 FF FF FF ...
> *
> 0x20000 DE AD C0 DE
> 
> The 0xff are only a guess, I haven't checked what the different
> firmware utils use for padding the end of the rootfs - but mksquashfs
> definitely uses 0x00.
> 
> which leads to:
> 
> 1. squashfs-split will put the roofs_data partition start at 0x19000
> because that's the first aligned erase block after the end of the
> rootfs start + squashfs length
> 2. jffs2 will see that the first block is neither a valid jff2 node,
> an EOF marker, nor all 0xff
> 3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1]
> 
> So removing the -nopad might actually break those devices.
> 
> We could fix this case by making sure that mksquashfs and all firmware
> utils use 0xff's to pad (so the erase block will then be treated as
> empty/all 0xff). But then there is the question how jffs2 reacts if
> the first block is 0xff, and the second block is a valid jffs2 node,
> which happens when we sysupgrade with keeping config on NOR devices.
> The jffs2 code isn't the most readable code ... .
>
No need to worry, see one of the previous mails in this thread:

http://lists.infradead.org/pipermail/openwrt-devel/2019-August/018638.html

It contains a patch at the end titled:
"[PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes"
This is another approach that just deals with the UBI+squashfs
issue but works with "-nopad". Soooooo.... do we all agree there?  

Cheers,
Christian
Jonas Gorski Sept. 1, 2019, 10:36 a.m. UTC | #16
On Sat, 31 Aug 2019 at 15:31, Christian Lamparter <chunkeey@gmail.com> wrote:
>
> Hello,
>
> On Saturday, August 31, 2019 2:09:55 PM CEST Jonas Gorski wrote:
> > On Sat, 31 Aug 2019 at 01:19, Christian Lamparter <chunkeey@gmail.com> wrote:
> > >
> > > On Friday, August 30, 2019 11:10:54 PM CEST Russell Senior wrote:
> > > > >>>>> "Christian" == Christian Lamparter <chunkeey@gmail.com> writes:
> > > >
> > > > Christian> Ok.
> > > >
> > > > Christian> I did push a patch titled: "build: remove harmful -nopad
> > > > Christian> option from mksquashfs"
> > > > Christian> <https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=1c0290c5cc6258c48b8ba46b4f9c85a21de4f875>
> > > >
> > > > Christian> so, let's see if this triggers more undefined behaviour and
> > > > Christian> exposes more hidden broken code.
> > > >
> > > > Just to re-iterate my earlier worry, if for example:
> > > >
> > > >   kernel_size + rootfs_with_padding_size
> > > >
> > > > crosses an erase block boundary that:
> > > >
> > > >   kernel_size + rootfs_without_padding_size
> > > >
> > > > does not, then we'll be wasting an erase block of flash space on NOR.
> > > Not to worry. I guess that part of the magic is also explained in the wiki:
> > > <https://openwrt.org/docs/techref/flash.layout#example_flash_partitioning>
> > > <https://openwrt.org/docs/techref/flash.layout#explanations>
> > > In case my attempt now gets too confusing :-/.
> > >
> > > The "rootfs" partition also contains the rootfs_data inside. It should
> > > be possible to verify this with any of the routers that use the "firmware"
> > > parition label.
> > >
> > > Please check /proc/mtd. For example on my WNDR3700v2 (which is very similar
> > > to your WNDR3800). It looks like this
> > >
> > > |# cat /proc/mtd
> > > |dev:    size   erasesize  name
> > > |mtd0: 00050000 00010000 "u-boot"
> > > |mtd1: 00020000 00010000 "u-boot-env"
> > > |mtd2: 00f80000 00010000 "firmware"
> > > |mtd3: 0018c440 00010000 "kernel"
> > > |mtd4: 00df3bc0 00010000 "rootfs"
> > > |mtd5: 00620000 00010000 "rootfs_data"
> > > |mtd6: 00010000 00010000 "art
> > >
> > > The rootfs is 14629824 Bytes = 13.95 MiB. (The kernel + uboot + env + art
> > > fills out the remaining ~2MiB to a total of 16 MiB). So the padding we both
> > > are talking about already has to exists between the squashfs portion and the
> > > jffs2/overlay portion inside the "rootfs" partition and it's a full erase-size
> > > block. Sadly, OpenWrt does not readily print the "end" of just the squashfs
> > > partition (as it does with the kernel partition above) and your message from
> > > earlier with the three routers didn't include the "squashfs-split" and its
> > > results. (I think that maybe this is the missing information that got lost?)
> >
> > After a bit more investigation, those devices that use padjffs2 (or
> > have the rootfs
> > start at an at least 4k aligned offset) will be fine.
> >
> > The issue are those devices with an unaligned rootfs start, which put their own
> > EOF marker in the image by the firmware util (e.g. brcm63xx or tp-link devices).
> >
> > There it can happen that e.g. we get
> >
> > 0x18fff2  <squashfs end> 00 00 00 00
> > 0x19000 00 00 00 00 ... <end of padding> FF FF
> > 0x19010 FF FF FF ...
> > *
> > 0x20000 DE AD C0 DE
> >
> > The 0xff are only a guess, I haven't checked what the different
> > firmware utils use for padding the end of the rootfs - but mksquashfs
> > definitely uses 0x00.
> >
> > which leads to:
> >
> > 1. squashfs-split will put the roofs_data partition start at 0x19000
> > because that's the first aligned erase block after the end of the
> > rootfs start + squashfs length
> > 2. jffs2 will see that the first block is neither a valid jff2 node,
> > an EOF marker, nor all 0xff
> > 3. jffs2 will refuse to erase/mount the filesystem (AFAIU the code) [1]
> >
> > So removing the -nopad might actually break those devices.
> >
> > We could fix this case by making sure that mksquashfs and all firmware
> > utils use 0xff's to pad (so the erase block will then be treated as
> > empty/all 0xff). But then there is the question how jffs2 reacts if
> > the first block is 0xff, and the second block is a valid jffs2 node,
> > which happens when we sysupgrade with keeping config on NOR devices.
> > The jffs2 code isn't the most readable code ... .
> >
> No need to worry, see one of the previous mails in this thread:
>
> http://lists.infradead.org/pipermail/openwrt-devel/2019-August/018638.html
>
> It contains a patch at the end titled:
> "[PATCH] base-files: pad root.squashfs to 64KiB in ubi volumes"
> This is another approach that just deals with the UBI+squashfs
> issue but works with "-nopad". Soooooo.... do we all agree there?

a) 64k is excessive, we only need 4k (actually 1k would be enough,
since we don't enable CONFIG_SQUASHFS_4K_DEVBLK_SIZE).

The referenced issue with 64k page size happens when loop-mounting a
squashfs, since loop defaults to PAGE_SIZE as its block size. But we
never do that in OpenWrt, and we don't support any targets with that
huge PAGE_SIZE - biggest is ARC with 8k.

b) it misses the squashfs's in generic sysupgrade images itself - we
need to pad their length as well, to avoid breaking devices with a
sysupgrade image hitting the corner case being flashed from an unfixed
firmware with the old nand.sh.

Also IMHO "1c0290c5cc6258c48b8ba46b4f9c85a21de4f875" should be
reverted, for the previously mentioned issues.

Regards



Jonas
Russell Senior Sept. 1, 2019, 11:52 a.m. UTC | #17
>>>>> "Jonas" == Jonas Gorski <jonas.gorski@gmail.com> writes:

>> It contains a patch at the end titled: "[PATCH] base-files: pad
>> root.squashfs to 64KiB in ubi volumes" This is another approach that
>> just deals with the UBI+squashfs issue but works with
>> "-nopad". Soooooo.... do we all agree there?

Jonas> a) 64k is excessive, we only need 4k (actually 1k would be
Jonas> enough, since we don't enable CONFIG_SQUASHFS_4K_DEVBLK_SIZE).

Jonas> The referenced issue with 64k page size happens when
Jonas> loop-mounting a squashfs, since loop defaults to PAGE_SIZE as its
Jonas> block size. But we never do that in OpenWrt, and we don't support
Jonas> any targets with that huge PAGE_SIZE - biggest is ARC with 8k.

Jonas> b) it misses the squashfs's in generic sysupgrade images itself -
Jonas> we need to pad their length as well, to avoid breaking devices
Jonas> with a sysupgrade image hitting the corner case being flashed
Jonas> from an unfixed firmware with the old nand.sh.

Jonas> Also IMHO "1c0290c5cc6258c48b8ba46b4f9c85a21de4f875" should be
Jonas> reverted, for the previously mentioned issues.

Afaict, only devices with LEB sizes of non-integer kilobytes (like the
MR24 with its 15.5k LEBs) need any intervention at all. Because
squashfs's are read in 1k blocks, there is a 1 in 62 chance of creating
a rootfs that is an inopportune size on 15.5k LEBs.  I have a PogoPlug
v3 with LEBs of 126k, and a MikroTik RouterBOARD 493G with LEBs of
124k. Neither of those is affected.

I still kind of like my solution where we explicitly ask for padding for
devices that need it.
Jonas Gorski Sept. 1, 2019, 3:16 p.m. UTC | #18
On Sun, 1 Sep 2019 at 13:52, Russell Senior <russell@personaltelco.net> wrote:
>
> >>>>> "Jonas" == Jonas Gorski <jonas.gorski@gmail.com> writes:
>
> >> It contains a patch at the end titled: "[PATCH] base-files: pad
> >> root.squashfs to 64KiB in ubi volumes" This is another approach that
> >> just deals with the UBI+squashfs issue but works with
> >> "-nopad". Soooooo.... do we all agree there?
>
> Jonas> a) 64k is excessive, we only need 4k (actually 1k would be
> Jonas> enough, since we don't enable CONFIG_SQUASHFS_4K_DEVBLK_SIZE).
>
> Jonas> The referenced issue with 64k page size happens when
> Jonas> loop-mounting a squashfs, since loop defaults to PAGE_SIZE as its
> Jonas> block size. But we never do that in OpenWrt, and we don't support
> Jonas> any targets with that huge PAGE_SIZE - biggest is ARC with 8k.
>
> Jonas> b) it misses the squashfs's in generic sysupgrade images itself -
> Jonas> we need to pad their length as well, to avoid breaking devices
> Jonas> with a sysupgrade image hitting the corner case being flashed
> Jonas> from an unfixed firmware with the old nand.sh.
>
> Jonas> Also IMHO "1c0290c5cc6258c48b8ba46b4f9c85a21de4f875" should be
> Jonas> reverted, for the previously mentioned issues.
>
> Afaict, only devices with LEB sizes of non-integer kilobytes (like the
> MR24 with its 15.5k LEBs) need any intervention at all. Because
> squashfs's are read in 1k blocks, there is a 1 in 62 chance of creating
> a rootfs that is an inopportune size on 15.5k LEBs.  I have a PogoPlug
> v3 with LEBs of 126k, and a MikroTik RouterBOARD 493G with LEBs of
> 124k. Neither of those is affected.
>
> I still kind of like my solution where we explicitly ask for padding for
> devices that need it.

The PogoPlug would also be affected if we enabled
SQUASHFS_4K_DEVBLK_SIZE - the 493G wouldn't be. In that case we would
need to pad to 4k.

But in the current situation, If we just pad to 1k, neither one would
see an increase in LEBs used. At worst will the padding make the
squashfs image fit exactly, but it will not cause it to overspill into
a new LEB.

So padding to 1k is harmless for devices with LEBs of a multiple of
1k, and fixes devices with LEBs that have an odd .5k size.


Jonas
Christian Lamparter Sept. 1, 2019, 6:46 p.m. UTC | #19
Hi,

On Sun, Sep 1, 2019 at 1:52 PM Russell Senior <russell@personaltelco.net> wrote:
> >>>>> "Jonas" == Jonas Gorski <jonas.gorski@gmail.com> writes:
> >> It contains a patch at the end titled: "[PATCH] base-files: pad
> >> root.squashfs to 64KiB in ubi volumes" This is another approach that
> >> just deals with the UBI+squashfs issue but works with
> >> "-nopad". Soooooo.... do we all agree there?
>
> Jonas> a) 64k is excessive, we only need 4k (actually 1k would be
> Jonas> enough, since we don't enable CONFIG_SQUASHFS_4K_DEVBLK_SIZE).
>
> Jonas> The referenced issue with 64k page size happens when
> Jonas> loop-mounting a squashfs, since loop defaults to PAGE_SIZE as its
> Jonas> block size. But we never do that in OpenWrt, and we don't support
> Jonas> any targets with that huge PAGE_SIZE - biggest is ARC with 8k.
>
> Jonas> b) it misses the squashfs's in generic sysupgrade images itself -
> Jonas> we need to pad their length as well, to avoid breaking devices
> Jonas> with a sysupgrade image hitting the corner case being flashed
> Jonas> from an unfixed firmware with the old nand.sh.
>
> Jonas> Also IMHO "1c0290c5cc6258c48b8ba46b4f9c85a21de4f875" should be
> Jonas> reverted, for the previously mentioned issues.
>
> Afaict, only devices with LEB sizes of non-integer kilobytes (like the
> MR24 with its 15.5k LEBs) need any intervention at all. Because
> squashfs's are read in 1k blocks, there is a 1 in 62 chance of creating
> a rootfs that is an inopportune size on 15.5k LEBs.  I have a PogoPlug
> v3 with LEBs of 126k, and a MikroTik RouterBOARD 493G with LEBs of
> 124k. Neither of those is affected.
>
> I still kind of like my solution where we explicitly ask for padding for
> devices that need it.

I see, fine.

Tell you what, I'll let the two of you do what you want or not.
My request to the both of you is, that instead of justreverting, please add  a
"why the -nopad has become essentially atechnical debt there". I'm not that
familiar with these devices and their problems, so I don#t think I have anything
more to add.

Goodbye!

Patch
diff mbox series

diff --git a/target/linux/apm821xx/image/Makefile b/target/linux/apm821xx/image/Makefile
index acfd478755..53192bb448 100644
--- a/target/linux/apm821xx/image/Makefile
+++ b/target/linux/apm821xx/image/Makefile
@@ -133,7 +133,8 @@  define Device/meraki_mr24
   IMAGE_SIZE := 8191k
   KERNEL := kernel-bin | lzma | uImage lzma | MerakiAdd-dtb | MerakiNAND
   KERNEL_INITRAMFS := kernel-bin | lzma | dtb | MuImage-initramfs lzma
-  IMAGE/sysupgrade.bin := sysupgrade-tar | append-metadata
+  IMAGE/sysupgrade.bin/squashfs := pad-squashfs | sysupgrade-tar | append-metadata
+  IMAGE/sysupgrade.bin/ext4 := sysupgrade-tar | append-metadata
   UBINIZE_OPTS := -E 5
   SUPPORTED_DEVICES += mr24
 endef