Message ID | 20211109144607.7571-3-denis281089@gmail.com |
---|---|
State | Superseded |
Headers | show |
Series | ath79: add a support for reset key on MikroTik RB912 | expand |
On Tue, 9 Nov 2021 at 15:47, Denis Kalashnikov <denis281089@gmail.com> wrote: > > If kernel invokes routerboot partion parser module > (drivers/platform/mikrotik/routerboot.c) init function too early, when SPI flash > hasn't been found yet, its rb_hardconfig_init() failed with -ENODEV (since no > mtd device with name "hard_config" was found). In this case when kernel calls > ath9k probe function, it tries a sysfs fallback for loading calibration data, > but hotplug script fails to load it, since "/sys/firmware/mikrotik/hard_config" > is missing (whole "/sys/firmware/mikrotik" dir is empty, since > rb_hardconfig_init() has failed). As a result we see no phy0 and no wlan0, and > in logs we found: > > root@OpenWrt:/dev# logread | grep ath9k > kern.warn kernel: [ 13.818737] ath9k 18100000.wmac: Direct firmware load > for ath9k-eeprom-ahb-18100000.wmac.bin failed with error -2 > kern.warn kernel: [ 13.829463] ath9k 18100000.wmac: Falling back to sysfs > fallback for: ath9k-eeprom-ahb-18100000.wmac.bin > kern.err kernel: [ 14.838766] ath: phy0: Unable to load EEPROM file > ath9k-eeprom-ahb-18100000.wmac.bin > kern.err kernel: [ 14.846909] ath9k 18100000.wmac: failed to initialize device > kern.warn kernel: [ 14.852870] ath9k: probe of 18100000.wmac failed with error -22 > > We managed to fix this by changing module_init() to late_initcall() > in routerboot module: Hi, wouldn't it be better for the rb_hardconfig_init to return -EPROBE_DEFER so the the core can try probing later? Regards, Robert > > root@OpenWrt:~# logread | grep ath > kern.warn kernel: [ 13.661861] ath9k 18100000.wmac: Direct firmware load for > ath9k-eeprom-ahb-18100000.wmac.bin failed with error -2 > kern.warn kernel: [ 13.672583] ath9k 18100000.wmac: Falling back to sysfs for: > ath9k-eeprom-ahb-18100000.wmac.bin > kern.debug kernel: [ 14.569014] ath: EEPROM regdomain sanitized > kern.debug kernel: [ 14.569038] ath: EEPROM regdomain: 0x64 > kern.debug kernel: [ 14.569044] ath: EEPROM indicates we should expect a > direct regpair map > kern.debug kernel: [ 14.569073] ath: Country alpha2 being used: 00 > kern.debug kernel: [ 14.569079] ath: Regpair used: 0x64 > > root@OpenWrt:~# iw dev > phy#0 > Interface wlan0 > ifindex 3 > wdev 0x1 > addr xx:xx:xx:xx:xx:xx > type managed > txpower 20.00 dBm > > Signed-off-by: Denis Kalashnikov <denis281089@gmail.com> > --- > .../linux/generic/files/drivers/platform/mikrotik/routerboot.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c > index 4c8c0bfac5..e5745c88a9 100644 > --- a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c > +++ b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c > @@ -210,7 +210,7 @@ ssize_t routerboot_tag_show_u32s(const u8 *pld, u16 pld_len, char *buf) > return out - buf; > } > > -module_init(routerboot_init); > +late_initcall(routerboot_init); > module_exit(routerboot_exit); > > MODULE_LICENSE("GPL v2"); > -- > 2.31.1 > > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
> Hi, wouldn't it be better for the rb_hardconfig_init to return > -EPROBE_DEFER so the > the core can try probing later? Hello, Robert! I've tried what you suggested (return -EPROBE_DEFER from rb_hardconfig_init and routerboot_init). It doesn't work. do_one_initcall() from linux-5.4.158/init/main.c calls module init function and it doesn't check return code and recall it later.
On Wed, 10 Nov 2021 at 10:45, Denis K <denis281089@gmail.com> wrote: > > > Hi, wouldn't it be better for the rb_hardconfig_init to return > > -EPROBE_DEFER so the > > the core can try probing later? > Hello, Robert! I've tried what you suggested (return -EPROBE_DEFER > from rb_hardconfig_init and routerboot_init). It doesn't work. > do_one_initcall() from linux-5.4.158/init/main.c calls module init > function and it doesn't check return code and recall it later. Ahh, yeah since it's not a platform driver. Ideally, it should be converted to one so it has a probe method, and then deferring will work. Adding Thibaut as he is the original author, maybe there is a reason why the platform driver approach won't work. Regards, Robert
> Le 10 nov. 2021 à 10:48, Robert Marko <robimarko@gmail.com> a écrit : > > On Wed, 10 Nov 2021 at 10:45, Denis K <denis281089@gmail.com> wrote: >> >>> Hi, wouldn't it be better for the rb_hardconfig_init to return >>> -EPROBE_DEFER so the >>> the core can try probing later? >> Hello, Robert! I've tried what you suggested (return -EPROBE_DEFER >> from rb_hardconfig_init and routerboot_init). It doesn't work. >> do_one_initcall() from linux-5.4.158/init/main.c calls module init >> function and it doesn't check return code and recall it later. > > Ahh, yeah since it's not a platform driver. > Ideally, it should be converted to one so it has a probe method, and > then deferring will work. > > Adding Thibaut as he is the original author, Thanks, I don’t follow openwrt-devel. > maybe there is a reason > why the platform driver approach won't work. Quite simply because there is no platform device to probe() for. Can I get a full boot log (dmesg output) with the problem? It seems none was provided with the original patch here[1] I’m not sure I can make sense of this: because the driver depends on SPI routines, it should be linked and loaded after SPI. This was the case so far. Thanks, Thibaut [1] https://patchwork.ozlabs.org/project/openwrt/patch/20211109144607.7571-3-denis281089@gmail.com/
Hello, Thibaut! Here is a dmesg output as an attachment. I've added printk into rb_hardconfig_init: [ 0.806327] rb_hardconfig_init: get_mtd_device_nm returns error: -19
Hi, > Le 10 nov. 2021 à 13:06, Denis K <denis281089@gmail.com> a écrit : > > Hello, Thibaut! Here is a dmesg output as an attachment. I’m afraid this dmesg makes no sense to me. rb_hardconfig should never be loaded where it is: platform drivers (where it is currently hooked) are linked much later in kernel so something looks amiss here. Unfortunately I don’t have much time to investigate right now, but I’m not (yet) convinced there’s a problem with the driver’s code at this point. I see this is using kernel 5.10 but afaict master is still using 5.4 for ath79. Maybe this is related? HTH, Thibaut
Hello, Thibaut! > because the driver depends on SPI routines, it should be linked and loaded after SPI > rb_hardconfig should never be loaded where it is: platform drivers (where it is currently hooked) are linked much later in kernel Maybe the order of init functions of SPI and routerboot depends on a link order. But SPI probe function can be called in any time: before or after routerboot init. Adding one more driver -- a new rb91x-key somehow shifts init timings so SPI _probe_ (not init) now is called after routerboot init. I suppose that SPI probe is called later now due to several deferred probes (spi, gpio-latch and rb91x-nand), you can see this in the attached dmesg: [ 0.348321] gpio-latch gpio_latch: failed to get gpio 7: -517 [ 0.355504] gpio-rb91x-key gpio_key: probe ok ... [ 0.405835] rb91x-nand nand_gpio: failed to get gpios: -517 ... [ 0.806348] rb_hardconfig_init: get_mtd_device_nm returns error: -19 ... [ 0.909300] gpio-latch gpio_latch: probe ok [ 0.927785] spi-nor spi0.0: w25x05 (64 Kbytes) [ 0.932811] 1 routerbootpart partitions found on MTD device spi0.0 [ 0.939189] Creating 1 MTD partitions on "spi0.0": [ 0.944193] 0x000000000000-0x000000010000 : "partitions" [ 0.952617] 4 routerbootpart partitions found on MTD device partitions [ 0.959369] Creating 4 MTD partitions on "partitions": [ 0.964750] 0x000000000000-0x00000000c000 : "routerboot" [ 0.971481] 0x00000000c000-0x00000000d000 : "hard_config" [ 0.978215] 0x00000000d000-0x00000000e000 : "bios" [ 0.984340] 0x00000000e000-0x00000000f000 : "soft_config" [ 0.991016] spi-nor spi0.0: probe ok ... [ 1.118306] rb91x-nand nand_gpio: probe ok > I see this is using kernel 5.10 but afaict master is still using 5.4 for ath79. > Maybe this is related? The same thing is when kernel 5.4 is used (see the attachment). I see two ways to eliminate this race condition: 1) late_initcall of routerboot module, only a single line need to be changed, 2) Somehow add a probe function to routerboot module. May be it should be called from routerboot_partitions_parse function (from mtd/parsers/routerbootpart.c). Regards, Denis.
> Le 11 nov. 2021 à 16:45, Denis K <denis281089@gmail.com> a écrit : > > I see two ways to eliminate this race condition: > 1) late_initcall of routerboot module, only a single line need to be changed, OK, let’s do that as a workaround and hope that deferred probes won’t overlap past late_initcall() > 2) Somehow add a probe function to routerboot module. May be it should be > called from routerboot_partitions_parse function (from > mtd/parsers/routerbootpart.c). No. There’s no reason why one should depend on the other (hard/soft config could be defined statically without using the parser, and the driver should still work then). HTH T
> OK, let’s do that as a workaround and hope that deferred probes won’t overlap past late_initcall()
What about using register_mtd_user function: routerboot can call
rb_hardconfig_init and rb_softconfig_init when mtd subsystem notifies
it about a new mtd device? It works.
---
.../drivers/platform/mikrotik/rb_hardconfig.c | 12 ++++------
.../drivers/platform/mikrotik/rb_softconfig.c | 12 ++++------
.../drivers/platform/mikrotik/routerboot.c | 24 +++++++++++++------
.../drivers/platform/mikrotik/routerboot.h | 5 ++--
4 files changed, 30 insertions(+), 23 deletions(-)
diff --git a/target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig.c
b/target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig.c
index e6a6928896..724851474e 100644
--- a/target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig.c
+++ b/target/linux/generic/files/drivers/platform/mikrotik/rb_hardconfig.c
@@ -676,10 +676,9 @@ static ssize_t hc_wlan_data_bin_read(struct file
*filp, struct kobject *kobj,
return count;
}
-int __init rb_hardconfig_init(struct kobject *rb_kobj)
+int __init rb_hardconfig_init(struct kobject *rb_kobj, struct mtd_info *mtd)
{
struct kobject *hc_wlan_kobj;
- struct mtd_info *mtd;
size_t bytes_read, buflen, outlen;
const u8 *buf;
void *outbuf;
@@ -690,20 +689,19 @@ int __init rb_hardconfig_init(struct kobject *rb_kobj)
hc_kobj = NULL;
hc_wlan_kobj = NULL;
- // TODO allow override
- mtd = get_mtd_device_nm(RB_MTD_HARD_CONFIG);
- if (IS_ERR(mtd))
+ ret = __get_mtd_device(mtd);
+ if (IS_ERR(ret))
return -ENODEV;
hc_buflen = mtd->size;
hc_buf = kmalloc(hc_buflen, GFP_KERNEL);
if (!hc_buf) {
- put_mtd_device(mtd);
+ __put_mtd_device(mtd);
return -ENOMEM;
}
ret = mtd_read(mtd, 0, hc_buflen, &bytes_read, hc_buf);
- put_mtd_device(mtd);
+ __put_mtd_device(mtd);
if (ret)
goto fail;
diff --git a/target/linux/generic/files/drivers/platform/mikrotik/rb_softconfig.c
b/target/linux/generic/files/drivers/platform/mikrotik/rb_softconfig.c
index 070bd32d5a..59c42e4cf4 100644
--- a/target/linux/generic/files/drivers/platform/mikrotik/rb_softconfig.c
+++ b/target/linux/generic/files/drivers/platform/mikrotik/rb_softconfig.c
@@ -705,9 +705,8 @@ mtdfail:
static struct kobj_attribute sc_kattrcommit = __ATTR(commit,
RB_SC_RMODE|RB_SC_WMODE, sc_commit_show, sc_commit_store);
-int __init rb_softconfig_init(struct kobject *rb_kobj)
+int __init rb_softconfig_init(struct kobject *rb_kobj, struct mtd_info *mtd)
{
- struct mtd_info *mtd;
size_t bytes_read, buflen;
const u8 *buf;
int i, ret;
@@ -716,20 +715,19 @@ int __init rb_softconfig_init(struct kobject *rb_kobj)
sc_buf = NULL;
sc_kobj = NULL;
- // TODO allow override
- mtd = get_mtd_device_nm(RB_MTD_SOFT_CONFIG);
- if (IS_ERR(mtd))
+ ret = __get_mtd_device(mtd);
+ if (IS_ERR(ret))
return -ENODEV;
sc_buflen = mtd->size;
sc_buf = kmalloc(sc_buflen, GFP_KERNEL);
if (!sc_buf) {
- put_mtd_device(mtd);
+ __put_mtd_device(mtd);
return -ENOMEM;
}
ret = mtd_read(mtd, 0, sc_buflen, &bytes_read, sc_buf);
- put_mtd_device(mtd);
+ __put_mtd_device(mtd);
if (ret)
goto fail;
diff --git a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c
b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c
index 4c8c0bfac5..77f39709ce 100644
--- a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c
+++ b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c
@@ -160,25 +160,35 @@ fail:
return ret;
}
+static void routerboot_mtd_notifier_add(struct mtd_info *mtd)
+{
+ if (mtd->type != MTD_NORFLASH)
+ return;
+ if (!strcmp(mtd->name, RB_MTD_HARD_CONFIG)) {
+ rb_hardconfig_init(rb_kobj, mtd);
+ } else if (!strcmp(mtd->name, RB_MTD_SOFT_CONFIG)) {
+ rb_softconfig_init(rb_kobj, mtd);
+ }
+}
+
+static struct mtd_notifier routerboot_mtd_notifier = {
+ .add = routerboot_mtd_notifier_add,
+};
+
static int __init routerboot_init(void)
{
rb_kobj = kobject_create_and_add("mikrotik", firmware_kobj);
if (!rb_kobj)
return -ENOMEM;
- /*
- * We ignore the following return values and always register.
- * These init() routines are designed so that their failed state is
- * always manageable by the corresponding exit() calls.
- */
- rb_hardconfig_init(rb_kobj);
- rb_softconfig_init(rb_kobj);
+ register_mtd_user(&routerboot_mtd_notifier);
return 0;
}
static void __exit routerboot_exit(void)
{
+ unregister_mtd_user(&routerboot_mtd_notifier);
rb_softconfig_exit();
rb_hardconfig_exit();
kobject_put(rb_kobj); // recursive afaict
diff --git a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.h
b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.h
index 67d89808d5..5277a3cd94 100644
--- a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.h
+++ b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.h
@@ -10,6 +10,7 @@
#define _ROUTERBOOT_H_
#include <linux/types.h>
+#include <linux/mtd/mtd.h>
// these magic values are stored in cpu-endianness on flash
#define RB_MAGIC_HARD (('H') | ('a' << 8) | ('r' << 16) | ('d' << 24))
@@ -25,10 +26,10 @@
int routerboot_tag_find(const u8 *bufhead, const size_t buflen, const
u16 tag_id, u16 *pld_ofs, u16 *pld_len);
int routerboot_rle_decode(const u8 *in, size_t inlen, u8 *out, size_t *outlen);
-int __init rb_hardconfig_init(struct kobject *rb_kobj);
+int __init rb_hardconfig_init(struct kobject *rb_kobj, struct mtd_info *mtd);
void __exit rb_hardconfig_exit(void);
-int __init rb_softconfig_init(struct kobject *rb_kobj);
+int __init rb_softconfig_init(struct kobject *rb_kobj, struct mtd_info *mtd);
void __exit rb_softconfig_exit(void);
ssize_t routerboot_tag_show_string(const u8 *pld, u16 pld_len, char *buf);
--
Regards, Denis
> Le 12 nov. 2021 à 16:25, Denis K <denis281089@gmail.com> a écrit : > >> OK, let’s do that as a workaround and hope that deferred probes won’t overlap past late_initcall() > What about using register_mtd_user function: routerboot can call > rb_hardconfig_init and rb_softconfig_init when mtd subsystem notifies > it about a new mtd device? It works. Looks like a good idea, I should have thought about it ;) My only nit is that the _init() calls will be executed with the MTD table mutex held, but I suppose it’s acceptable. However please do not remove the comments I left in the code. Furthermore, for the sake of consistency the module should also react to the (arguably unlikely) removal of the target MTD partitions: a .remove() callback will deal with that. Finally the inclusion of mtd/mtd.h is not necessary in the routerboot.h header (we’re using pointers and that header has no dependency on mtd). I have amended your patch to reflect the above comments, and adjusted the commit message. Can you confirm this still works for you? I cannot test it myself now. Thanks, T
I've tested it (on 5.4 and 5.10). On reboot I've had this kernel panic: [ 1028.460043] Reserved instruction in kernel code[#1]: [ 1028.465191] CPU: 0 PID: 2406 Comm: procd Not tainted 5.4.158 #0 [ 1028.471300] $ 0 : 00000000 00000001 00000000 00000000 [ 1028.476702] $ 4 : 8342118c 806adf50 00000000 00000000 [ 1028.482095] $ 8 : 83812ab0 00000000 83812ad4 00000002 [ 1028.487490] $12 : fffffffd 00000402 80731904 00000040 [ 1028.492883] $16 : 8073503c 83419000 807318ec 80730000 [ 1028.498278] $20 : 80730000 808a0000 806a9efc 806a9eec [ 1028.503671] $24 : 00000000 00000000 [ 1028.509065] $28 : 83c42000 83c43b80 83cefc00 803c1fd0 [ 1028.514459] Hi : 00474bff [ 1028.517426] Lo : b47b3346 [ 1028.520400] epc : 80764000 0x80764000 [ 1028.524371] ra : 803c1fd0 del_mtd_device+0x68/0x100 [ 1028.529673] Status: 1100dc03 KERNEL EXL IE [ 1028.533994] Cause : 00800028 (ExcCode 0a) [ 1028.538135] PrId : 0001974c (MIPS 74Kc) [ 1028.542181] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath xt_state xt_nat xt_conntrack xt_REDIRECT xt_MASQUERADE xt_FLOWOFFLOAD pppox ppp_generic nf_nat nf_flow_table_hw nf_flow_table nf_conntrack mac80211 ipt_REJECT cfg80211 xt_time xt_tcpudp xt_multiport xt_mark xt_mac xt_limit xt_comment xt_TCPMSS xt_LOG slhc nf_reject_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_filter ip_tables crc_ccitt compat evdev input_core nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 sha256_generic libsha256 seqiv jitterentropy_rng drbg hmac ghash_generic gf128mul gcm ctr cmac ccm fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common aead cryptomgr crypto_null crypto_hash [ 1028.615764] Process procd (pid: 2406, threadinfo=2c3273bc, task=f31e3962, tls=77ef0dcc) [ 1028.624021] Stack : 00000000 802183b4 80680000 8066ecb0 80731768 83419000 80731978 00000000 [ 1028.632649] 80730000 803c4a5c 80730000 00000001 83c09400 802150e8 000b0054 83418400 [ 1028.641278] 83ff4000 80731978 00000000 803c4a1c 83ff1360 00000000 00000001 808a0000 [ 1028.649906] 83ff1360 80731768 83d99880 80731978 00000000 803c63a4 806a9efc 806a9eec [ 1028.658535] 83cefc00 83d01d40 83d99880 80733b18 00000000 80733b18 00000044 803c2248 [ 1028.667163] ... [ 1028.669683] Call Trace: [ 1028.669685] [ 1028.673757] [<802183b4>] sysfs_remove_files+0x38/0x5c [ 1028.679019] [<803c4a5c>] __mtd_del_partition+0xa8/0x100 [ 1028.684439] [<802150e8>] __kernfs_remove.part.0+0x1e4/0x318 [ 1028.690204] [<803c4a1c>] __mtd_del_partition+0x68/0x100 [ 1028.695604] [<803c63a4>] del_mtd_partitions+0x78/0xf0 [ 1028.700826] [<803c2248>] mtd_device_unregister+0x28/0x5c [ 1028.706314] [<803a1c80>] __device_release_driver+0x178/0x214 [ 1028.712159] [<8039fdf0>] klist_devices_put+0x0/0x8 [ 1028.717105] [<803a1d48>] device_release_driver+0x2c/0x44 [ 1028.722589] [<803a0f68>] bus_remove_device+0x154/0x168 [ 1028.727905] [<803b7944>] __unregister+0x0/0x20 [ 1028.732522] [<8039cd80>] device_del+0x15c/0x458 [ 1028.737228] [<803428e0>] spi_sync_transfer.constprop.0+0x60/0x6c [ 1028.743438] [<803b7348>] spi_complete+0x0/0x8 [ 1028.747936] [<803b7944>] __unregister+0x0/0x20 [ 1028.752524] [<803b7910>] spi_unregister_device+0x40/0x74 [ 1028.758007] [<803b7954>] __unregister+0x10/0x20 [ 1028.762685] [<803b7944>] __unregister+0x0/0x20 [ 1028.767277] [<8039bd78>] device_for_each_child+0x50/0xa4 [ 1028.772778] [<803b903c>] spi_unregister_controller+0x3c/0x170 [ 1028.778724] [<803bdb90>] ath79_spi_remove+0x1c/0x78 [ 1028.783766] [<8039f658>] device_shutdown+0x13c/0x1f4 [ 1028.788897] [<800a4814>] blocking_notifier_call_chain+0x74/0xe8 [ 1028.795019] [<800a5d6c>] kernel_restart+0x40/0xac [ 1028.799877] [<800a5ed8>] __do_sys_reboot+0x100/0x214 [ 1028.805037] [<8012fdc8>] filemap_map_pages+0x3a0/0x3d0 [ 1028.810352] [<801617e4>] handle_mm_fault+0x89c/0xcf0 [ 1028.815520] [<8006fdec>] do_page_fault+0xb4/0x4b8 [ 1028.820379] [<8006e18c>] syscall_common+0x34/0x58 [ 1028.825233] [ 1028.826769] Code: 00000000 00000000 00000000 <7f454c46> 01020100 00000000 00000000 00030008 00000001 [ 1028.836839] [ 1028.838458] ---[ end trace 37f524f87b727ac2 ]--- [ 1028.843240] Kernel panic - not syncing: Fatal exception [ 1028.848639] Rebooting in 3 seconds.. If I comment out calls of rb_hardconfig_exit and rb_softconfig_exit in routerboot_mtd_notifier_remove, there is no panic. I'm trying to determine why this is happening. Regards, Denis
I got it! _exit macro of rb_hardconfig_init seems to cause this kernel panic. Here is a patch. Regards, Denis.
Hi, > Le 16 nov. 2021 à 10:46, Denis K <denis281089@gmail.com> a écrit : > > I got it! _exit macro of rb_hardconfig_init seems to cause this kernel > panic. Here is a patch. Thanks; indeed that’s logical: functions marked as __exit are discarded when the module is built into the kernel: they’re no longer available when the mtd_notifier executes the remove() callback. I missed that. Since these are very small, I agree it’s ok to remove the __exit qualifier. I see the same rationale for removing the __init qualifier from the init functions since the mtd notifier could be called at any point during kernel execution, but that bothers me a bit more because these are somewhat large functions. Then again it’s a tradeoff between doing the right thing (using a notifier) vs using a workaround (late_initcall()). I think it’s acceptable. Anyway, I’ve refactored the patch with your changes and I added a note and edited the commit message about the point above; see attachment below. Hopefully this one is good to go :) You are the author of this patch and you did all the heavy lifting (thanks again for that), so if you have no further changes/comments I’d suggest you submit this as a v2. I’ll ACK it then (please CC-me - this should be automatic with my SoB-line added). Thanks, Thibaut
Thank you for your help, Thibaut. I think we've found the right way of initializing routerboot. I've sent a v2 of the patch. Denis вт, 16 нояб. 2021 г. в 13:20, Thibaut <hacks@slashdirt.org>: > > Hi, > > > Le 16 nov. 2021 à 10:46, Denis K <denis281089@gmail.com> a écrit : > > > > I got it! _exit macro of rb_hardconfig_init seems to cause this kernel > > panic. Here is a patch. > > Thanks; indeed that’s logical: functions marked as __exit are discarded when the module is built into the kernel: they’re no longer available when the mtd_notifier executes the remove() callback. I missed that. Since these are very small, I agree it’s ok to remove the __exit qualifier. > > I see the same rationale for removing the __init qualifier from the init functions since the mtd notifier could be called at any point during kernel execution, but that bothers me a bit more because these are somewhat large functions. Then again it’s a tradeoff between doing the right thing (using a notifier) vs using a workaround (late_initcall()). I think it’s acceptable. > > Anyway, I’ve refactored the patch with your changes and I added a note and edited the commit message about the point above; see attachment below. Hopefully this one is good to go :) > > You are the author of this patch and you did all the heavy lifting (thanks again for that), so if you have no further changes/comments I’d suggest you submit this as a v2. I’ll ACK it then (please CC-me - this should be automatic with my SoB-line added). > > Thanks, > Thibaut >
diff --git a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c index 4c8c0bfac5..e5745c88a9 100644 --- a/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c +++ b/target/linux/generic/files/drivers/platform/mikrotik/routerboot.c @@ -210,7 +210,7 @@ ssize_t routerboot_tag_show_u32s(const u8 *pld, u16 pld_len, char *buf) return out - buf; } -module_init(routerboot_init); +late_initcall(routerboot_init); module_exit(routerboot_exit); MODULE_LICENSE("GPL v2");
If kernel invokes routerboot partion parser module (drivers/platform/mikrotik/routerboot.c) init function too early, when SPI flash hasn't been found yet, its rb_hardconfig_init() failed with -ENODEV (since no mtd device with name "hard_config" was found). In this case when kernel calls ath9k probe function, it tries a sysfs fallback for loading calibration data, but hotplug script fails to load it, since "/sys/firmware/mikrotik/hard_config" is missing (whole "/sys/firmware/mikrotik" dir is empty, since rb_hardconfig_init() has failed). As a result we see no phy0 and no wlan0, and in logs we found: root@OpenWrt:/dev# logread | grep ath9k kern.warn kernel: [ 13.818737] ath9k 18100000.wmac: Direct firmware load for ath9k-eeprom-ahb-18100000.wmac.bin failed with error -2 kern.warn kernel: [ 13.829463] ath9k 18100000.wmac: Falling back to sysfs fallback for: ath9k-eeprom-ahb-18100000.wmac.bin kern.err kernel: [ 14.838766] ath: phy0: Unable to load EEPROM file ath9k-eeprom-ahb-18100000.wmac.bin kern.err kernel: [ 14.846909] ath9k 18100000.wmac: failed to initialize device kern.warn kernel: [ 14.852870] ath9k: probe of 18100000.wmac failed with error -22 We managed to fix this by changing module_init() to late_initcall() in routerboot module: root@OpenWrt:~# logread | grep ath kern.warn kernel: [ 13.661861] ath9k 18100000.wmac: Direct firmware load for ath9k-eeprom-ahb-18100000.wmac.bin failed with error -2 kern.warn kernel: [ 13.672583] ath9k 18100000.wmac: Falling back to sysfs for: ath9k-eeprom-ahb-18100000.wmac.bin kern.debug kernel: [ 14.569014] ath: EEPROM regdomain sanitized kern.debug kernel: [ 14.569038] ath: EEPROM regdomain: 0x64 kern.debug kernel: [ 14.569044] ath: EEPROM indicates we should expect a direct regpair map kern.debug kernel: [ 14.569073] ath: Country alpha2 being used: 00 kern.debug kernel: [ 14.569079] ath: Regpair used: 0x64 root@OpenWrt:~# iw dev phy#0 Interface wlan0 ifindex 3 wdev 0x1 addr xx:xx:xx:xx:xx:xx type managed txpower 20.00 dBm Signed-off-by: Denis Kalashnikov <denis281089@gmail.com> --- .../linux/generic/files/drivers/platform/mikrotik/routerboot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)