diff mbox

[LEDE-DEV] flash issue

Message ID 5296d02b-6b69-dd4c-ca4e-13150616fd6f@gmail.com
State RFC
Headers show

Commit Message

e9hack July 21, 2016, 2:08 p.m. UTC
Hi,

I did flash several different images to my WDR3600 router. In the partition size
detection, I don't see any major differences. An image from OpenWRT shows the same error
at the first boot, but for some reason it doesn't overwrite files at the second boot. One
major difference exist between my builds and a nightly build from Lede home page. In my
build, the line

[   15.428814] jffs2: notice: (457) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

occurs twice. In Lede build only once. I compare this with my Archer C7 router. The line
occurs twice too. But it uses the original flash of 16MB. In the log files, I see an entry

[   10.117176] jffs2_scan_eraseblock(): End of filesystem marker found at 0x30000

The hex value is different between the WDR3600 and the Archer C7 but doesn't match in any
way to the partition sizes.

I attached the diff to build the larger images for my WDR3600. I've two solutions. The
first replaces the 8Mb flash by the 16Mb type for the whole WDR4300/WDR3500/WDR3600
platform. The second generates a new device entry for the larger flash. This can be select
during build.

Any ideas to solve my problem?

Regards,
Hartmut


-- WDR3600 first boot after flashing of Lede my image --

[    0.595551] m25p80 spi0.0: found w25q128, expected m25p80
[    0.601099] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.607066] 5 tp-link partitions found on MTD device spi0.0
[    0.612723] Creating 5 MTD partitions on "spi0.0":
[    0.617624] 0x000000000000-0x000000020000 : "u-boot"
[    0.624126] 0x000000020000-0x000000156f48 : "kernel"
[    0.630851] 0x000000156f48-0x000000ff0000 : "rootfs"
[    0.637518] mtd: device 2 (rootfs) set to be root filesystem
[    0.643319] 1 squashfs-split partitions found on MTD device rootfs
[    0.649648] 0x000000bc0000-0x000000ff0000 : "rootfs_data"
[    0.656798] 0x000000ff0000-0x000001000000 : "art"
[    0.663203] 0x000000020000-0x000000ff0000 : "firmware"

[    9.470240] jffs2_scan_eraseblock(): End of filesystem marker found at 0x70000
[    9.477636] jffs2_build_filesystem(): unlocking the mtd device... done.
[    9.484352] jffs2_build_filesystem(): erasing all blocks after the end marker... done.
[   19.250004] jffs2: notice: (417) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   19.470188] jffs2: notice: (414) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

-- WDR3600 first boot after flashing of OpenWRT my image --

[    0.591974] m25p80 spi0.0: found w25q128, expected m25p80
[    0.597521] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.603265] 5 tp-link partitions found on MTD device spi0.0
[    0.608925] Creating 5 MTD partitions on "spi0.0":
[    0.613837] 0x000000000000-0x000000020000 : "u-boot"
[    0.619762] 0x000000020000-0x000000144bd0 : "kernel"
[    0.625664] 0x000000144bd0-0x000000ff0000 : "rootfs"
[    0.631500] mtd: device 2 (rootfs) set to be root filesystem
[    0.637321] 1 squashfs-split partitions found on MTD device rootfs
[    0.643612] 0x000000df0000-0x000000ff0000 : "rootfs_data"
[    0.649897] 0x000000ff0000-0x000001000000 : "art"
[    0.655520] 0x000000020000-0x000000ff0000 : "firmware"

[    8.975870] jffs2_scan_eraseblock(): End of filesystem marker found at 0x70000
[    8.983239] jffs2_build_filesystem(): unlocking the mtd device... done.
[    8.989967] jffs2_build_filesystem(): erasing all blocks after the end marker...
[    9.612628] Atheros AR8216/AR8236/AR8316 ag71xx-mdio.0:00: Port 2 is up
[   12.970906] done.
[   12.972940] jffs2: notice: (423) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   13.460438] jffs2: notice: (420) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

-- WDR3600 first boot after flashing of Lede nightly build image ---

[    0.599181] m25p80 spi0.0: found w25q128, expected m25p80
[    0.604670] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.610552] 5 tp-link partitions found on MTD device spi0.0
[    0.616210] Creating 5 MTD partitions on "spi0.0":
[    0.621114] 0x000000000000-0x000000020000 : "u-boot"
[    0.627564] 0x000000020000-0x00000016e1e0 : "kernel"
[    0.634125] 0x00000016e1e0-0x000000ff0000 : "rootfs"
[    0.640753] mtd: device 2 (rootfs) set to be root filesystem
[    0.646596] 1 squashfs-split partitions found on MTD device rootfs
[    0.652875] 0x000000350000-0x000000ff0000 : "rootfs_data"
[    0.659953] 0x000000ff0000-0x000001000000 : "art"
[    0.666282] 0x000000020000-0x000000ff0000 : "firmware"

[    7.511918] jffs2_scan_eraseblock(): End of filesystem marker found at 0x70000
[    7.519320] jffs2_build_filesystem(): unlocking the mtd device... done.
[    7.526037] jffs2_build_filesystem(): erasing all blocks after the end marker... done.
[   43.861142] jffs2: notice: (411) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

-- Archer C7 first boot after flashing of Lede my image --

[    0.824059] m25p80 spi0.0: found w25q128, expected m25p80
[    0.829594] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.835477] 5 tp-link partitions found on MTD device spi0.0
[    0.841169] Creating 5 MTD partitions on "spi0.0":
[    0.846041] 0x000000000000-0x000000020000 : "u-boot"
[    0.852223] 0x000000020000-0x0000001574c0 : "kernel"
[    0.858575] 0x0000001574c0-0x000000ff0000 : "rootfs"
[    0.864848] mtd: device 2 (rootfs) set to be root filesystem
[    0.870687] 1 squashfs-split partitions found on MTD device rootfs
[    0.876970] 0x000000e50000-0x000000ff0000 : "rootfs_data"
[    0.883693] 0x000000ff0000-0x000001000000 : "art"
[    0.889779] 0x000000020000-0x000000ff0000 : "firmware"

[   10.117176] jffs2_scan_eraseblock(): End of filesystem marker found at 0x30000
[   10.124519] jffs2_build_filesystem(): unlocking the mtd device... done.
[   10.131247] jffs2_build_filesystem(): erasing all blocks after the end marker... done.
[   15.428814] jffs2: notice: (457) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   15.622275] jffs2: notice: (454) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

-- modification of build scripts for the larger flash --
 define Device/tl-wdr6500-v2
 $(Device/tplink-8mlzma)


Am 21.07.2016 um 11:19 schrieb Conor O'Gorman:
> 0x1985 is a jffs2 magic marker value. Look up the jffs2 code. So it's not a bogus value
> that might come from a 'bad' flash. More likely is that you have some issue with flash
> partition sizes, possibly as you have a non-standard flash size anyway.
> Look carefully at the boot log for the flash partitioning.

Comments

Conor O'Gorman July 21, 2016, 2:33 p.m. UTC | #1
1) Your rootfs is quite large compared to standard build. You have 
presumably added a lot of packages. Or there is something messed up.

2) You should not see the jffs2 "complete build" message twice. It's 
showing 2 different process ids also. You need to check what you've 
added to make this code run twice. The later error would match the idea 
of two jffs2 processes.

On 21/07/16 15:08, e9hack wrote:
> [   19.250004] jffs2: notice: (417) jffs2_build_xattr_subsystem: complete building xattr
> subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
> [   19.470188] jffs2: notice: (414) jffs2_build_xattr_subsystem: complete building xattr
> subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
e9hack July 21, 2016, 7:53 p.m. UTC | #2
Am 21.07.2016 um 16:33 schrieb Conor O'Gorman:
> 1) Your rootfs is quite large compared to standard build. You have presumably added a lot
> of packages. Or there is something messed up.
This is ok. I add many things (hostapd with built-in radius, samba, openvpn, collectd,
luci, tor, php, mc).

> 2) You should not see the jffs2 "complete build" message twice. It's showing 2 different
> process ids also. You need to check what you've added to make this code run twice. The
> later error would match the idea of two jffs2 processes.
It seems it is triggered by a package which is include in my image and where I did
modified or add something to its configuration. I don't see two lines with the nightly
build from web or if I use my build and don't save and restore the current configuration.

Regards,
Hartmut
e9hack July 21, 2016, 9:57 p.m. UTC | #3
Am 21.07.2016 um 21:53 schrieb e9hack:
>> 2) You should not see the jffs2 "complete build" message twice. It's showing 2 different
>> process ids also. You need to check what you've added to make this code run twice. The
>> later error would match the idea of two jffs2 processes.
> It seems it is triggered by a package which is include in my image and where I did
> modified or add something to its configuration. I don't see two lines with the nightly
> build from web or if I use my build and don't save and restore the current configuration.

The problem is the package block-mount. If it exist in the update image and if
/etc/config/fstab does exist in the saved configuration, the error with jffs2 message
twice will occur. Removing block-mount package doesn't solve the error with 'Newly-erased
block contained ...'.

Regards,
Hartmut
e9hack July 23, 2016, 3:23 p.m. UTC | #4
Hi,

I did try to find the reason for my problem with jffs2. It seems, it is triggered by
deleting /sysupgrade.tgz by /etc/init.d/done. If I remove the rm -f /sysupgrade.tgz line
completely and delete the file later, I've no issue. I enabled some debug messages and add
a line to /etc/init.d/done to see the time, where rm -f /sysupgrade.tgz will be execute:

[    9.478098] jffs2_scan_eraseblock(): End of filesystem marker found at 0x70000
[    9.485455] jffs2_build_filesystem(): unlocking the mtd device... done.
[    9.492191] jffs2_build_filesystem(): erasing all blocks after the end marker...
jffs2_erase_block(): erase block 0x430000 (range 0x430000-0x440000)
[    9.666422] jffs2_erase_block(): erase block 0x420000 (range 0x420000-0x430000)
[    9.829087] jffs2_erase_block(): erase block 0x410000 (range 0x410000-0x420000)
...
[   19.736767] jffs2_erase_block(): erase block 0x070000 (range 0x070000-0x080000)
[   19.906864] done.
[   19.908852] jffs2: notice: (434) jffs2_build_xattr_subsystem: complete building xattr
subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

user.notice [BOOT: TEST] "------------------------------------------------------------"
[   65.389272] jffs2_erase_block(): erase block 0x000000 (range 0x000000-0x010000)
[   65.670129] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00000000
[   66.146424] jffs2_erase_block(): erase block 0x010000 (range 0x010000-0x020000)
[   67.526861] jffs2_erase_block(): erase block 0x020000 (range 0x020000-0x030000)
[   68.896556] jffs2_erase_block(): erase block 0x030000 (range 0x030000-0x040000)
[   70.172272] jffs2_erase_block(): erase block 0x040000 (range 0x040000-0x050000)
[   71.686421] jffs2_erase_block(): erase block 0x050000 (range 0x050000-0x060000)

At the beginning, all blocks behind the the configuration will be erased. Later
sysupgrade.tgz will be delete. Erasing and checking of block 0 does fail. If I remove the
deleting of sysupgrade.tgz and delete sysupgrade.tgz manually, jffs2_erase_block() message
will be not shown in the log at the time when the file is delete. It starts later or if I
trigger another read/write operation manually. The issue doesn't occur. I put a sleep 15
in front of rm -f /sysupgrade.tgz and let this run in a background thread. I don't get
this error. The log shows:

user.notice [BOOT: TEST] "------------------------------------------------------------"
[   80.312832] jffs2_erase_block(): erase block 0x000000 (range 0x000000-0x010000)
[   80.963137] jffs2_erase_block(): erase block 0x010000 (range 0x010000-0x020000)
[   81.596935] jffs2_erase_block(): erase block 0x020000 (range 0x020000-0x030000)

The erasing of the missing three blocks will be execute later automatically or if I
trigger a write/erase operation again manually. /etc/init.d/done executes afterwards
/etc/rc.locale. I did add some entries, which does force a write operation.

The error is only visible (missing or wrong configuration files), if the router does
reboot again, the delay till the next reboot is not so important. If there is a next flash
update without a reboot in between, the configuration isn't lost, but the same scenario
starts again. After the next reboot, some files are lost or overwritten.

Any idea what goes wrong? Is there a race condition somewhere in the jffs2 driver?

Regards,
Hartmut
diff mbox

Patch

diff --git a/target/linux/ar71xx/image/tp-link.mk b/target/linux/ar71xx/image/tp-link.mk
index ccb1f5b..adb2f4d 100644
--- a/target/linux/ar71xx/image/tp-link.mk
+++ b/target/linux/ar71xx/image/tp-link.mk
@@ -110,7 +110,8 @@  endef
 TARGET_DEVICES += cpe210-220 cpe510-520

 define Device/tl-wdr4300-v1
-$(Device/tplink-8mlzma)
+#$(Device/tplink-8mlzma)
+$(Device/tplink-16mlzma)
   DEVICE_TITLE := TP-LINK TL-WDR4300
   DEVICE_PACKAGES := kmod-usb-core kmod-usb2 kmod-ledtrig-usbdev
   BOARDNAME = TL-WDR4300
@@ -131,6 +132,15 @@  $(Device/tl-wdr4300-v1)
   TPLINK_HWID := 0x36000001
 endef

+define Device/tl-wdr3600-v1-2
+$(Device/tplink-16mlzma)
+  DEVICE_TITLE := TP-LINK TL-WDR3600 v1-2
+  DEVICE_PACKAGES := kmod-usb-core kmod-usb2 kmod-ledtrig-usbdev
+  BOARDNAME = TL-WDR4300
+  DEVICE_PROFILE = TLWDR4300
+  TPLINK_HWID := 0x36000001
+endef
+
 define Device/tl-wdr4300-v1-il
 $(Device/tl-wdr4300-v1)
   DEVICE_TITLE := TP-LINK TL-WDR4300 v1 IL
@@ -148,7 +158,7 @@  $(Device/tl-wdr4300-v1)
   DEVICE_TITLE := TP-LINK TL-WDR4530 v1
   TPLINK_HWID := 0x45300001
 endef
-TARGET_DEVICES += tl-wdr3500-v1 tl-wdr3600-v1 tl-wdr4300-v1 tl-wdr4300-v1-il
tl-wdr4310-v1 mw4530r-v1
+TARGET_DEVICES += tl-wdr3500-v1 tl-wdr3600-v1 tl-wdr3600-v1-2 tl-wdr4300-v1
tl-wdr4300-v1-il tl-wdr4310-v1 mw4530r-v1