Message ID | 20221103175447.2941098-1-bjorn@mork.no |
---|---|
State | Rejected |
Delegated to: | Petr Štetiar |
Headers | show |
Series | ramips: fix Unifi 6 Lite boot failure with v5.15 kernels | expand |
On Thu, Nov 03, 2022 at 06:54:47PM +0100, Bjørn Mork wrote: > The Unifi 6 Lite U-Boot does not relocate the Device Tree Blobs > found in the FIT image. It behaves as if fdt_high is set to > 0xffffffff whether or not this variable is defined. > > Kernel commit 79edff12060f ("scripts/dtc: Update to upstream > version v1.6.0-51-g183df9e9c2b9"), included in v5.12, imported > dtc commit 5e735860c478 ("libfdt: Check for 8-byte address > alignment in fdt_ro_probe_()") into the kernel > > The result is that we must ensure that the "data" property of > all the "flat_dt" type images is 8 byte aligned inside the FIT > > There is currently no provision in mkimage or dtc for aligning > properties with the blob. They are naturally 4 byte aligned. > Giving us 50% probability of a soft bricking image. > > Work around the missing aligment features in mkimage and dtc > by padding the kernel image and rerunning mkimage if the first > attempt failed. > > Example boot log when using a misaligned FIT image: > > reading kernel 0 from: 0x1d0000, size: 0x002d5000 > Using 'config@1' configuration > Verifying Hash Integrity ... OK > Trying 'kernel-1' kernel subimage > Description: MIPS OpenWrt Linux-5.15.76 > Type: Kernel Image > Compression: lzma compressed > Data Start: 0x860000e4 > Data Size: 2956005 Bytes = 2.8 MiB > Architecture: MIPS > OS: Linux > Load Address: 0x80001000 > Entry Point: 0x80001000 > Hash algo: crc32 > Hash value: e1bc9460 > Hash algo: sha1 > Hash value: 6510c4ada31aeea81f2e8e537f78cb367e1c7fab > Verifying Hash Integrity ... crc32+ sha1+ OK > Using 'config@1' configuration > Trying 'fdt-1' fdt subimage > Description: MIPS OpenWrt ubnt_unifi-6-lite device tree blob > Type: Flat Device Tree > Compression: uncompressed > Data Start: 0x862d1d0c > Data Size: 11387 Bytes = 11.1 KiB > Architecture: MIPS > Hash algo: crc32 > Hash value: 16bb5a14 > Hash algo: sha1 > Hash value: 40f26bf28e33bbe661fec716929f2003301f5e4d > Verifying Hash Integrity ... crc32+ sha1+ OK > Booting using the fdt blob at 0x862d1d0c > Uncompressing Kernel Image ... OK > Using Device Tree in place at 862d1d0c, end 862d7986 > [ 0.000000] Linux version 5.15.76 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21167-1673b7dca384) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed Nov 2 15:53:34 2022 > [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3 > [ 0.000000] printk: bootconsole [early0] enabled > [ 0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc) > [ 0.000000] Initrd not found or empty - disabling initrd > [ 0.000000] VPE topology {2,2} total 4 > [ 0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes. > [ 0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes > [ 0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes. > [ 0.000000] Zone ranges: > [ 0.000000] Normal [mem 0x0000000000000000-0x000000000fffffff] > [ 0.000000] HighMem empty > [ 0.000000] Movable zone start for each node > [ 0.000000] Early memory node ranges > [ 0.000000] node 0: [mem 0x0000000000000000-0x000000000fffffff] > [ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff] > [ 0.000000] OF: fdt: No valid device tree found, continuing without > [ 0.000000] percpu: Embedded 11 pages/cpu s15632 r8192 d21232 u45056 > [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 64960 > [ 0.000000] Kernel command line: rootfstype=squashfs,jffs2 > [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear) > [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) > [ 0.000000] Writing ErrCtl register=0004a000 > [ 0.000000] Readback ErrCtl register=0004a000 > [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off > [ 0.000000] Memory: 248504K/262144K available (7250K kernel code, 660K rwdata, 1536K rodata, 1228K init, 242K bss, 13640K reserved, 0K cma-reserved, 0K highmem) > [ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 > [ 0.000000] rcu: Hierarchical RCU implementation. > [ 0.000000] Tracing variant of Tasks RCU enabled. > [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies. > [ 0.000000] NR_IRQS: 256 > [ 0.000000] Kernel panic - not syncing: Failed to find mediatek,mt7621-sysc node > [ 0.000000] Rebooting in 1 seconds.. > [ 0.000000] Reboot failed -- System halted > > Signed-off-by: Bjørn Mork <bjorn@mork.no> > --- > The code below will probably look stupid to anyone with a clue. but I > hope it serves as an illustration of what we need to do. I'd appreciate > it if anyone can teach me > - how to do alignment inside a FIT using mkimage, or > - how to find the position of the embedded fdt blobs in the FIT, or > - how to check whether a value is aligned > > In any case, believe fixing this ASAP is critical. As it is now, any > image with a 5.15 kernel is just as likely to brick the device as is to > boot. If it's critical enough should we revert the affected commit just in case while we understand this problem? Consider that sooner or later we will have to fix it as 6.1 will have this problem. But if this is present on also other target we should operate on a more generic way than apply a workaround specific to a device. > > > target/linux/ramips/image/mt7621.mk | 19 ++++++++++++++++++- > 1 file changed, 18 insertions(+), 1 deletion(-) > > diff --git a/target/linux/ramips/image/mt7621.mk b/target/linux/ramips/image/mt7621.mk > index 3ef4cf4efb8f..e3b11abd14ce 100644 > --- a/target/linux/ramips/image/mt7621.mk > +++ b/target/linux/ramips/image/mt7621.mk > @@ -132,6 +132,23 @@ define Build/zyxel-nwa-fit > @mv $@.new $@ > endef > > +define Build/fit-dtalign > + $(TOPDIR)/scripts/mkits.sh \ > + -D $(DEVICE_NAME) -o $@.its -k $@ \ > + -C $(word 1,$(1)) \ > + -d $(word 2,$(1)) \ > + -a $(KERNEL_LOADADDR) -e $(if $(KERNEL_ENTRY),$(KERNEL_ENTRY),$(KERNEL_LOADADDR)) \ > + -c $(if $(DEVICE_DTS_CONFIG),$(DEVICE_DTS_CONFIG),"config-1") \ > + -A $(LINUX_KARCH) -v $(LINUX_VERSION) > + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new > + ( pos=$$(grep -aob $$(echo -ne "\xd0\x0d\xfe\xed") $@.new | cut -f1 -d: | tail -1); \ > + if [ "$$pos" -ne "$$((8*($$pos/8)))" ]; then \ > + dd if=/dev/zero bs=4 count=1 >> $@; \ > + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new; \ > + fi ) > + @mv $@.new $@ > +endef > + > define Device/dsa-migration > DEVICE_COMPAT_VERSION := 1.1 > DEVICE_COMPAT_MESSAGE := Config cannot be migrated from swconfig to DSA > @@ -1960,7 +1977,7 @@ define Device/ubnt_unifi-6-lite > DEVICE_MODEL := UniFi 6 Lite > DEVICE_DTS_CONFIG := config@1 > DEVICE_PACKAGES += kmod-mt7603 kmod-mt7915e > - KERNEL := kernel-bin | lzma | fit lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb > + KERNEL := kernel-bin | lzma | fit-dtalign lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb > IMAGE_SIZE := 15424k > endef > TARGET_DEVICES += ubnt_unifi-6-lite > -- > 2.30.2 > > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
On Thu, 3 Nov 2022 at 19:11, Christian Marangi <ansuelsmth@gmail.com> wrote: > > On Thu, Nov 03, 2022 at 06:54:47PM +0100, Bjørn Mork wrote: > > The Unifi 6 Lite U-Boot does not relocate the Device Tree Blobs > > found in the FIT image. It behaves as if fdt_high is set to > > 0xffffffff whether or not this variable is defined. > > > > Kernel commit 79edff12060f ("scripts/dtc: Update to upstream > > version v1.6.0-51-g183df9e9c2b9"), included in v5.12, imported > > dtc commit 5e735860c478 ("libfdt: Check for 8-byte address > > alignment in fdt_ro_probe_()") into the kernel > > > > The result is that we must ensure that the "data" property of > > all the "flat_dt" type images is 8 byte aligned inside the FIT > > > > There is currently no provision in mkimage or dtc for aligning > > properties with the blob. They are naturally 4 byte aligned. > > Giving us 50% probability of a soft bricking image. > > > > Work around the missing aligment features in mkimage and dtc > > by padding the kernel image and rerunning mkimage if the first > > attempt failed. > > > > Example boot log when using a misaligned FIT image: > > > > reading kernel 0 from: 0x1d0000, size: 0x002d5000 > > Using 'config@1' configuration > > Verifying Hash Integrity ... OK > > Trying 'kernel-1' kernel subimage > > Description: MIPS OpenWrt Linux-5.15.76 > > Type: Kernel Image > > Compression: lzma compressed > > Data Start: 0x860000e4 > > Data Size: 2956005 Bytes = 2.8 MiB > > Architecture: MIPS > > OS: Linux > > Load Address: 0x80001000 > > Entry Point: 0x80001000 > > Hash algo: crc32 > > Hash value: e1bc9460 > > Hash algo: sha1 > > Hash value: 6510c4ada31aeea81f2e8e537f78cb367e1c7fab > > Verifying Hash Integrity ... crc32+ sha1+ OK > > Using 'config@1' configuration > > Trying 'fdt-1' fdt subimage > > Description: MIPS OpenWrt ubnt_unifi-6-lite device tree blob > > Type: Flat Device Tree > > Compression: uncompressed > > Data Start: 0x862d1d0c > > Data Size: 11387 Bytes = 11.1 KiB > > Architecture: MIPS > > Hash algo: crc32 > > Hash value: 16bb5a14 > > Hash algo: sha1 > > Hash value: 40f26bf28e33bbe661fec716929f2003301f5e4d > > Verifying Hash Integrity ... crc32+ sha1+ OK > > Booting using the fdt blob at 0x862d1d0c > > Uncompressing Kernel Image ... OK > > Using Device Tree in place at 862d1d0c, end 862d7986 > > [ 0.000000] Linux version 5.15.76 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21167-1673b7dca384) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed Nov 2 15:53:34 2022 > > [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3 > > [ 0.000000] printk: bootconsole [early0] enabled > > [ 0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc) > > [ 0.000000] Initrd not found or empty - disabling initrd > > [ 0.000000] VPE topology {2,2} total 4 > > [ 0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes. > > [ 0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes > > [ 0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes. > > [ 0.000000] Zone ranges: > > [ 0.000000] Normal [mem 0x0000000000000000-0x000000000fffffff] > > [ 0.000000] HighMem empty > > [ 0.000000] Movable zone start for each node > > [ 0.000000] Early memory node ranges > > [ 0.000000] node 0: [mem 0x0000000000000000-0x000000000fffffff] > > [ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff] > > [ 0.000000] OF: fdt: No valid device tree found, continuing without > > [ 0.000000] percpu: Embedded 11 pages/cpu s15632 r8192 d21232 u45056 > > [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 64960 > > [ 0.000000] Kernel command line: rootfstype=squashfs,jffs2 > > [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear) > > [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) > > [ 0.000000] Writing ErrCtl register=0004a000 > > [ 0.000000] Readback ErrCtl register=0004a000 > > [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off > > [ 0.000000] Memory: 248504K/262144K available (7250K kernel code, 660K rwdata, 1536K rodata, 1228K init, 242K bss, 13640K reserved, 0K cma-reserved, 0K highmem) > > [ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 > > [ 0.000000] rcu: Hierarchical RCU implementation. > > [ 0.000000] Tracing variant of Tasks RCU enabled. > > [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies. > > [ 0.000000] NR_IRQS: 256 > > [ 0.000000] Kernel panic - not syncing: Failed to find mediatek,mt7621-sysc node > > [ 0.000000] Rebooting in 1 seconds.. > > [ 0.000000] Reboot failed -- System halted > > > > Signed-off-by: Bjørn Mork <bjorn@mork.no> > > --- > > The code below will probably look stupid to anyone with a clue. but I > > hope it serves as an illustration of what we need to do. I'd appreciate > > it if anyone can teach me > > - how to do alignment inside a FIT using mkimage, or > > - how to find the position of the embedded fdt blobs in the FIT, or > > - how to check whether a value is aligned > > > > In any case, believe fixing this ASAP is critical. As it is now, any > > image with a 5.15 kernel is just as likely to brick the device as is to > > boot. > > If it's critical enough should we revert the affected commit just in > case while we understand this problem? > Consider that sooner or later we will have to fix it as 6.1 will have > this problem. But if this is present on also other target we should > operate on a more generic way than apply a workaround specific to a > device. This is a broken vendor bootloader doing its thing, so we shouldn't revert upstream commits that is just enforcing the boot specification. This was introduced in 5.12 anyway, so most of the broken devices should have already surfaced. What should be done is to expand mkimage with an option to align to an 8-byte boundary. Regards, Robert > > > > > > > target/linux/ramips/image/mt7621.mk | 19 ++++++++++++++++++- > > 1 file changed, 18 insertions(+), 1 deletion(-) > > > > diff --git a/target/linux/ramips/image/mt7621.mk b/target/linux/ramips/image/mt7621.mk > > index 3ef4cf4efb8f..e3b11abd14ce 100644 > > --- a/target/linux/ramips/image/mt7621.mk > > +++ b/target/linux/ramips/image/mt7621.mk > > @@ -132,6 +132,23 @@ define Build/zyxel-nwa-fit > > @mv $@.new $@ > > endef > > > > +define Build/fit-dtalign > > + $(TOPDIR)/scripts/mkits.sh \ > > + -D $(DEVICE_NAME) -o $@.its -k $@ \ > > + -C $(word 1,$(1)) \ > > + -d $(word 2,$(1)) \ > > + -a $(KERNEL_LOADADDR) -e $(if $(KERNEL_ENTRY),$(KERNEL_ENTRY),$(KERNEL_LOADADDR)) \ > > + -c $(if $(DEVICE_DTS_CONFIG),$(DEVICE_DTS_CONFIG),"config-1") \ > > + -A $(LINUX_KARCH) -v $(LINUX_VERSION) > > + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new > > + ( pos=$$(grep -aob $$(echo -ne "\xd0\x0d\xfe\xed") $@.new | cut -f1 -d: | tail -1); \ > > + if [ "$$pos" -ne "$$((8*($$pos/8)))" ]; then \ > > + dd if=/dev/zero bs=4 count=1 >> $@; \ > > + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new; \ > > + fi ) > > + @mv $@.new $@ > > +endef > > + > > define Device/dsa-migration > > DEVICE_COMPAT_VERSION := 1.1 > > DEVICE_COMPAT_MESSAGE := Config cannot be migrated from swconfig to DSA > > @@ -1960,7 +1977,7 @@ define Device/ubnt_unifi-6-lite > > DEVICE_MODEL := UniFi 6 Lite > > DEVICE_DTS_CONFIG := config@1 > > DEVICE_PACKAGES += kmod-mt7603 kmod-mt7915e > > - KERNEL := kernel-bin | lzma | fit lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb > > + KERNEL := kernel-bin | lzma | fit-dtalign lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb > > IMAGE_SIZE := 15424k > > endef > > TARGET_DEVICES += ubnt_unifi-6-lite > > -- > > 2.30.2 > > > > > > _______________________________________________ > > openwrt-devel mailing list > > openwrt-devel@lists.openwrt.org > > https://lists.openwrt.org/mailman/listinfo/openwrt-devel > > -- > Ansuel > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Robert Marko <robimarko@gmail.com> writes: > What should be done is to expand mkimage with an option to align to an > 8-byte boundary. Definitely. However, this implies that mkimage must mangle the output of libfdt to align one specific property. AFAICS, there are no such possibilities in device tree. So the toolbox available to mkimage is the same as for the mkimage caller: Change, add or move properties around until the alignment of the "data" property is ok. But mkimage cannot take the shortcuts my quick-fix abuses, for several reasons. I believe mkimage must be able to do the alignment by local modifications of each fdt image node, and I'm not sure it can change any of the properties? Except maybe description? But a better method migh be changing the order. I guess this can be done. E.g move one of the hashes in front of data, provided the size of the property isn't 8-byte aligned. Haven't checked. But I don't think such a fix is within my capabilities after a quick look at the mkimage code. And I believe the Unifi 6 Lite case needs a fix right now. It's a robust device with working console-less recovery, but I don't think regular recovery sessions is a nice user experience... Personally, I got fed up after crawling around in the attic 3 times to push the reset button over the last few months. Which is why finally I ended up mounting a console and figuring out the root cause. Adn here we are. I believe the only reason this isn't a major complaint, is because there aren't many users testing v5.15 kernels on this device. Yet (Unrelated, but I can recommend using bluetooth consoles for stuff like this - my APs are still in the attic, but I can access the console from the living room below. And with PoE power I can force a hard reset if necessary. So no more need to crawl around in the attic. Hopefully) Bjørn
Robert Marko <robimarko@gmail.com> writes: > This is a broken vendor bootloader doing its thing, so we shouldn't > revert upstream commits > that is just enforcing the boot specification. > This was introduced in 5.12 anyway, so most of the broken devices > should have already surfaced. Forgot to ack this, but I agree 100%. If you look at the history in the kernel, you'll see that they wanted this change to expose already broken code. The 8-byte alignment requirement has always been in the spec. But it was often ignored for convenience. There were are lots of different fixes for the real bugs after this commit was added, so it is considered a success. We were for example hit by it on the realtek target, and got a fix into the mainline MIPS code as a result. A snowball in hell has better chances than a proposal to revert the "8-byte alignment is required" commit. You are right that the bootloader must be fixed. But the vendor isn't likely to do that as long as they run older kernels. I believe the OpenWrt policy in such cases is to try to work around the issue. Replacing the vendor bootloader is a last resort. Please correct me if I'm wrong. Bjørn
Hi Bjorn, On 11/4/22 09:29, Bjørn Mork wrote: > You are right that the bootloader must be fixed. But the vendor isn't > likely to do that as long as they run older kernels. I believe the > OpenWrt policy in such cases is to try to work around the issue. > Replacing the vendor bootloader is a last resort. Please correct me if > I'm wrong. I haven't tried this yet, but I think we should be able to boot a FIT image configuration which only contains a kernel image. We could then simply use the legacy append-DT like we do for legacy uImage ramips. Not the nicest way, but considering we need it for the target anyways, i consider this the simpler solution. Note however, that this has to be at least also applied to the ZyXEL NWAxxx and (i assume) all other FIT ramips boards. Best David > > > Bjørn > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
David Bauer <mail@david-bauer.net> writes: > On 11/4/22 09:29, Bjørn Mork wrote: >> You are right that the bootloader must be fixed. But the vendor isn't >> likely to do that as long as they run older kernels. I believe the >> OpenWrt policy in such cases is to try to work around the issue. >> Replacing the vendor bootloader is a last resort. Please correct me if >> I'm wrong. > > I haven't tried this yet, but I think we should be able to boot a FIT > image configuration which only contains a kernel image. We could then simply > use the legacy append-DT like we do for legacy uImage ramips. > > Not the nicest way, but considering we need it for the target anyways, i consider > this the simpler solution. Yes, that sounds acceptable if it is supported. Which I guess it should be. I can test it the next time I get a chance. But that might take a while before I'm close enough to these APs to use the console. Should have had a console I could access remotely there, but I don't. > Note however, that this has to be at least also applied to the ZyXEL NWAxxx and (i assume) > all other FIT ramips boards. Really? You mean they all depend on this U-Boot feature, despite what the U-Boot docs recommend? Bjørn
Bjørn Mork <bjorn@mork.no> writes: > Robert Marko <robimarko@gmail.com> writes: > >> What should be done is to expand mkimage with an option to align to an >> 8-byte boundary. > > Definitely. > > However, this implies that mkimage must mangle the output of libfdt to > align one specific property. AFAICS, there are no such possibilities in > device tree. Well, you made me take a closer look. And I found one possibility: libfdt supports overwriting properties with FDT_NOP. So we can add an empty property in front of the image data and overwrite it with nops. This result is 3 x 4 byte FDT_NOP values What do you think about something like the attached hack? Is it OK to make this unconditional? My reasoning is that it's pretty hard to know when the alignment is required, the failures are arbitrary and hard to debug, and it doesn't seem to harm. This adds 12 bytes to every unaligned fdt node and that's it. I'll try to run it by u-boot too if you think it's acceptable. I tested this briefly, with a kernel I knew would cause an unaligned fdt from previous failures. The result looks like this and it booted fine on my Unifi 6 Lite: $ fdtdump bin/r21167+1-1673b7dca384/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin | sed -e 's/\(data = \[.. .. .. .. .. .. .. .. .. .. .. .. .. \).*\(.....\].*\)/\1 ... \2/' |less **** fdtdump is a low-level debugging tool, not meant for general use. **** If you want to decompile a dtb, you probably want **** dtc -I dtb -O dts <filename> /dts-v1/; // magic: 0xd00dfeed // totalsize: 0x2d4e50 (2969168) // off_dt_struct: 0x38 // off_dt_strings: 0x2d4a70 // off_mem_rsvmap: 0x28 // version: 17 // last_comp_version: 16 // boot_cpuid_phys: 0x0 // size_dt_strings: 0x72 // size_dt_struct: 0x2d4a38 / { timestamp = <0x6366d7d8>; description = "MIPS OpenWrt FIT (Flattened Image Tree)"; #address-cells = <0x00000001>; images { kernel-1 { description = "MIPS OpenWrt Linux-5.15.76"; data = [6d 00 00 80 00 20 cd 96 00 00 00 00 00 ... a5 3e]; type = "kernel"; arch = "mips"; os = "linux"; compression = "lzma"; load = <0x80001000>; entry = <0x80001000>; hash@1 { value = <0xbd771d3e>; algo = "crc32"; }; hash@2 { value = <0x7c11c808 0x508aada9 0x2f14f1b1 0x6b6d7e89 0x652dd798>; algo = "sha1"; }; }; fdt-1 { // [NOP] // [NOP] // [NOP] description = "MIPS OpenWrt ubnt_unifi-6-lite device tree blob"; data = [d0 0d fe ed 00 00 2c 7b 00 00 00 38 00 ... 65 00]; type = "flat_dt"; arch = "mips"; compression = "none"; hash@1 { value = <0x16bb5a14>; algo = "crc32"; }; hash@2 { value = <0x40f26bf2 0x8e33bbe6 0x61fec716 0x929f2003 0x301f5e4d>; algo = "sha1"; }; }; }; configurations { default = "config@1"; config@1 { description = "OpenWrt ubnt_unifi-6-lite"; kernel = "kernel-1"; fdt = "fdt-1"; }; }; }; $ binwalk bin/r21167+1-1673b7dca384/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin DECIMAL HEXADECIMAL DESCRIPTION -------------------------------------------------------------------------------- 0 0x0 Flattened device tree, size: 2969168 bytes, version: 17 228 0xE4 LZMA compressed data, properties: 0x6D, dictionary size: 8388608 bytes, uncompressed size: 9882912 bytes 2956440 0x2D1C98 Flattened device tree, size: 11387 bytes, version: 17 2969168 0x2D4E50 Squashfs filesystem, little endian, version 4.0, compression:xz, size: 5224258 bytes, 1061 inodes, blocksize: 262144 bytes, created: 2022-11-05 21:38:32 Bjørn
On Sat, 5 Nov 2022 at 09:46, Bjørn Mork <bjorn@mork.no> wrote: > > David Bauer <mail@david-bauer.net> writes: > > On 11/4/22 09:29, Bjørn Mork wrote: > >> You are right that the bootloader must be fixed. But the vendor isn't > >> likely to do that as long as they run older kernels. I believe the > >> OpenWrt policy in such cases is to try to work around the issue. > >> Replacing the vendor bootloader is a last resort. Please correct me if > >> I'm wrong. > > > > I haven't tried this yet, but I think we should be able to boot a FIT > > image configuration which only contains a kernel image. We could then simply > > use the legacy append-DT like we do for legacy uImage ramips. > > > > Not the nicest way, but considering we need it for the target anyways, i consider > > this the simpler solution. > > Yes, that sounds acceptable if it is supported. Which I guess it should > be. > > I can test it the next time I get a chance. But that might take a while > before I'm close enough to these APs to use the console. Should have > had a console I could access remotely there, but I don't. An option is also to set a load address for the dtb in the FIT image, then U-Boot will relocate it before passing it to the kernel. This way U-Boot also retains access to the dtb, in case it wants to update/modify it (e.g. updating the /memory/ node with the actual memory present, setting the cmdline, setting mac addresses etc). Jonas
Jonas Gorski <jonas.gorski@gmail.com> writes: > An option is also to set a load address for the dtb in the FIT image, > then U-Boot will relocate it before passing it to the kernel. Yes, that is worth trying. I thought this would be part of the fdt relocation step, but I see that the "load" property is used before the final relocation. I'll test this when I have console again. Or earlier if someone can tell me how to calculate a semi-safe load address. Will the kernel $loadaddr + aligned(8, 32, 512. 4096?) kernelsize be ok? And how do I figure out where in memory the fdt ended up, without seeing the console output from u-boot? Can't see anything obvious in dmesg or in sysfs. Do I need to enable a memory debug device, or is this address directly available somewhere? Bjørn
On Sun, 6 Nov 2022 at 12:50, Bjørn Mork <bjorn@mork.no> wrote: > > Jonas Gorski <jonas.gorski@gmail.com> writes: > > > An option is also to set a load address for the dtb in the FIT image, > > then U-Boot will relocate it before passing it to the kernel. > > Yes, that is worth trying. I thought this would be part of the fdt > relocation step, but I see that the "load" property is used before the > final relocation. > > I'll test this when I have console again. Or earlier if someone can > tell me how to calculate a semi-safe load address. Will the kernel > $loadaddr + aligned(8, 32, 512. 4096?) kernelsize be ok? You'll probably then have the chance to land in the .bss section, which gets zeroed on boot. Safest might be a static address like RAMSTART + 16MiB. This is what e.g. DentOS does. Not sure if 16 MiB is enough for ramdisk kernels. > And how do I figure out where in memory the fdt ended up, without seeing > the console output from u-boot? Can't see anything obvious in dmesg or > in sysfs. Do I need to enable a memory debug device, or is this address > directly available somewhere? I don't think so, at least I don't see anything obvious. So I guess you'll need to sprinkle some debug printks there (e.g. fw_arg0 or argument passed to __dt_setup_arch()). Jonas
Jonas Gorski <jonas.gorski@gmail.com> writes: > On Sun, 6 Nov 2022 at 12:50, Bjørn Mork <bjorn@mork.no> wrote: >> Jonas Gorski <jonas.gorski@gmail.com> writes: >> >> > An option is also to set a load address for the dtb in the FIT image, >> > then U-Boot will relocate it before passing it to the kernel. >> >> Yes, that is worth trying. I thought this would be part of the fdt >> relocation step, but I see that the "load" property is used before the >> final relocation. >> >> I'll test this when I have console again. Or earlier if someone can >> tell me how to calculate a semi-safe load address. Will the kernel >> $loadaddr + aligned(8, 32, 512. 4096?) kernelsize be ok? > > You'll probably then have the chance to land in the .bss section, > which gets zeroed on boot. > > Safest might be a static address like RAMSTART + 16MiB. This is what > e.g. DentOS does. > > Not sure if 16 MiB is enough for ramdisk kernels. Thanks. Will try that, but considering the big question marks I think it will have to wait until the next time I can use the console. It also doesn't relly sound like a much more robust solution in the end, than simply forcing the fdt's to be aligned. Still working on that. Injecting some FDT_NOPs in mkimage looks promising. >> And how do I figure out where in memory the fdt ended up, without seeing >> the console output from u-boot? Can't see anything obvious in dmesg or >> in sysfs. Do I need to enable a memory debug device, or is this address >> directly available somewhere? > > I don't think so, at least I don't see anything obvious. > > So I guess you'll need to sprinkle some debug printks there (e.g. > fw_arg0 or argument passed to __dt_setup_arch()). Ok, I guess I can do that. Bjørn
Hi Bjorn, On 11/7/22 13:57, Bjørn Mork wrote: > Jonas Gorski <jonas.gorski@gmail.com> writes: >> On Sun, 6 Nov 2022 at 12:50, Bjørn Mork <bjorn@mork.no> wrote: >>> Jonas Gorski <jonas.gorski@gmail.com> writes: >>> >>>> An option is also to set a load address for the dtb in the FIT image, >>>> then U-Boot will relocate it before passing it to the kernel. >>> >>> Yes, that is worth trying. I thought this would be part of the fdt >>> relocation step, but I see that the "load" property is used before the >>> final relocation. >>> >>> I'll test this when I have console again. Or earlier if someone can >>> tell me how to calculate a semi-safe load address. Will the kernel >>> $loadaddr + aligned(8, 32, 512. 4096?) kernelsize be ok? >> >> You'll probably then have the chance to land in the .bss section, >> which gets zeroed on boot. >> >> Safest might be a static address like RAMSTART + 16MiB. This is what >> e.g. DentOS does. >> >> Not sure if 16 MiB is enough for ramdisk kernels. > > Thanks. Will try that, but considering the big question marks I think it > will have to wait until the next time I can use the console. I have a patch for that prepared in my staging tree, I will re-test it on my unit. My proposed "append DTB to kernel" worked, so we have a fallback in case the DTB load-addr does not solve the issue. David > > It also doesn't relly sound like a much more robust solution in the end, > than simply forcing the fdt's to be aligned. Still working on that. > Injecting some FDT_NOPs in mkimage looks promising. > >>> And how do I figure out where in memory the fdt ended up, without seeing >>> the console output from u-boot? Can't see anything obvious in dmesg or >>> in sysfs. Do I need to enable a memory debug device, or is this address >>> directly available somewhere? >> >> I don't think so, at least I don't see anything obvious. >> >> So I guess you'll need to sprinkle some debug printks there (e.g. >> fw_arg0 or argument passed to __dt_setup_arch()). > > Ok, I guess I can do that. > > > Bjørn > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
David Bauer <mail@david-bauer.net> writes: > I have a patch for that prepared in my staging tree, I will re-test it on > my unit. My proposed "append DTB to kernel" worked, so we have a fallback > in case the DTB load-addr does not solve the issue. Tried it now. Looks fine. With pr_info("fdt is at %px (mapped to %pK)\n", bph, bph); added to arch/mips/kernel/prom.c:__dt_setup_arch() I get root@u6-2:~# dmesg|grep fdt [ 0.000000] fdt is at 87000000 (mapped to (ptrval)) That "%pK" format isn't very useful at this stage of the boot :-) Bjørn
diff --git a/target/linux/ramips/image/mt7621.mk b/target/linux/ramips/image/mt7621.mk index 3ef4cf4efb8f..e3b11abd14ce 100644 --- a/target/linux/ramips/image/mt7621.mk +++ b/target/linux/ramips/image/mt7621.mk @@ -132,6 +132,23 @@ define Build/zyxel-nwa-fit @mv $@.new $@ endef +define Build/fit-dtalign + $(TOPDIR)/scripts/mkits.sh \ + -D $(DEVICE_NAME) -o $@.its -k $@ \ + -C $(word 1,$(1)) \ + -d $(word 2,$(1)) \ + -a $(KERNEL_LOADADDR) -e $(if $(KERNEL_ENTRY),$(KERNEL_ENTRY),$(KERNEL_LOADADDR)) \ + -c $(if $(DEVICE_DTS_CONFIG),$(DEVICE_DTS_CONFIG),"config-1") \ + -A $(LINUX_KARCH) -v $(LINUX_VERSION) + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new + ( pos=$$(grep -aob $$(echo -ne "\xd0\x0d\xfe\xed") $@.new | cut -f1 -d: | tail -1); \ + if [ "$$pos" -ne "$$((8*($$pos/8)))" ]; then \ + dd if=/dev/zero bs=4 count=1 >> $@; \ + PATH=$(LINUX_DIR)/scripts/dtc:$(PATH) mkimage -f $@.its $@.new; \ + fi ) + @mv $@.new $@ +endef + define Device/dsa-migration DEVICE_COMPAT_VERSION := 1.1 DEVICE_COMPAT_MESSAGE := Config cannot be migrated from swconfig to DSA @@ -1960,7 +1977,7 @@ define Device/ubnt_unifi-6-lite DEVICE_MODEL := UniFi 6 Lite DEVICE_DTS_CONFIG := config@1 DEVICE_PACKAGES += kmod-mt7603 kmod-mt7915e - KERNEL := kernel-bin | lzma | fit lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb + KERNEL := kernel-bin | lzma | fit-dtalign lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb IMAGE_SIZE := 15424k endef TARGET_DEVICES += ubnt_unifi-6-lite
The Unifi 6 Lite U-Boot does not relocate the Device Tree Blobs found in the FIT image. It behaves as if fdt_high is set to 0xffffffff whether or not this variable is defined. Kernel commit 79edff12060f ("scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9"), included in v5.12, imported dtc commit 5e735860c478 ("libfdt: Check for 8-byte address alignment in fdt_ro_probe_()") into the kernel The result is that we must ensure that the "data" property of all the "flat_dt" type images is 8 byte aligned inside the FIT There is currently no provision in mkimage or dtc for aligning properties with the blob. They are naturally 4 byte aligned. Giving us 50% probability of a soft bricking image. Work around the missing aligment features in mkimage and dtc by padding the kernel image and rerunning mkimage if the first attempt failed. Example boot log when using a misaligned FIT image: reading kernel 0 from: 0x1d0000, size: 0x002d5000 Using 'config@1' configuration Verifying Hash Integrity ... OK Trying 'kernel-1' kernel subimage Description: MIPS OpenWrt Linux-5.15.76 Type: Kernel Image Compression: lzma compressed Data Start: 0x860000e4 Data Size: 2956005 Bytes = 2.8 MiB Architecture: MIPS OS: Linux Load Address: 0x80001000 Entry Point: 0x80001000 Hash algo: crc32 Hash value: e1bc9460 Hash algo: sha1 Hash value: 6510c4ada31aeea81f2e8e537f78cb367e1c7fab Verifying Hash Integrity ... crc32+ sha1+ OK Using 'config@1' configuration Trying 'fdt-1' fdt subimage Description: MIPS OpenWrt ubnt_unifi-6-lite device tree blob Type: Flat Device Tree Compression: uncompressed Data Start: 0x862d1d0c Data Size: 11387 Bytes = 11.1 KiB Architecture: MIPS Hash algo: crc32 Hash value: 16bb5a14 Hash algo: sha1 Hash value: 40f26bf28e33bbe661fec716929f2003301f5e4d Verifying Hash Integrity ... crc32+ sha1+ OK Booting using the fdt blob at 0x862d1d0c Uncompressing Kernel Image ... OK Using Device Tree in place at 862d1d0c, end 862d7986 [ 0.000000] Linux version 5.15.76 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21167-1673b7dca384) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed Nov 2 15:53:34 2022 [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3 [ 0.000000] printk: bootconsole [early0] enabled [ 0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc) [ 0.000000] Initrd not found or empty - disabling initrd [ 0.000000] VPE topology {2,2} total 4 [ 0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes. [ 0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes [ 0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes. [ 0.000000] Zone ranges: [ 0.000000] Normal [mem 0x0000000000000000-0x000000000fffffff] [ 0.000000] HighMem empty [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 0x0000000000000000-0x000000000fffffff] [ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff] [ 0.000000] OF: fdt: No valid device tree found, continuing without [ 0.000000] percpu: Embedded 11 pages/cpu s15632 r8192 d21232 u45056 [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 64960 [ 0.000000] Kernel command line: rootfstype=squashfs,jffs2 [ 0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear) [ 0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) [ 0.000000] Writing ErrCtl register=0004a000 [ 0.000000] Readback ErrCtl register=0004a000 [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off [ 0.000000] Memory: 248504K/262144K available (7250K kernel code, 660K rwdata, 1536K rodata, 1228K init, 242K bss, 13640K reserved, 0K cma-reserved, 0K highmem) [ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 [ 0.000000] rcu: Hierarchical RCU implementation. [ 0.000000] Tracing variant of Tasks RCU enabled. [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies. [ 0.000000] NR_IRQS: 256 [ 0.000000] Kernel panic - not syncing: Failed to find mediatek,mt7621-sysc node [ 0.000000] Rebooting in 1 seconds.. [ 0.000000] Reboot failed -- System halted Signed-off-by: Bjørn Mork <bjorn@mork.no> --- The code below will probably look stupid to anyone with a clue. but I hope it serves as an illustration of what we need to do. I'd appreciate it if anyone can teach me - how to do alignment inside a FIT using mkimage, or - how to find the position of the embedded fdt blobs in the FIT, or - how to check whether a value is aligned In any case, believe fixing this ASAP is critical. As it is now, any image with a 5.15 kernel is just as likely to brick the device as is to boot. target/linux/ramips/image/mt7621.mk | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-)