Message ID | 20200625115655.955736-1-mpe@ellerman.id.au |
---|---|
State | Accepted |
Headers | show |
Series | external/mambo: Disable MEMORY_OVERFLOW | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch master (68dc040a6540c218d20517764ff5d740a3626c55) |
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot | success | Test snowpatch/job/snowpatch-skiboot on branch master |
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot-dco | success | Signed-off-by present |
Hi Michael, On 6/25/20 8:56 AM, Michael Ellerman wrote: > Mambo has a strange feature called MEMORY_OVERFLOW, enabled by > default, which causes some accesses to non-existent memory addresses > to transparently "create" memory. > > This can be confusing when debugging, eg: > > systemsim % mysim cpu 0 display spr pc > 0xC0000000000246B8 > systemsim % mysim memory display 0xC0000000000246B8 8 > 0x0000000000000000 > > Appears to show that the memory at pc (NIP) is currently zeroes. > > The astute observer will note that "mysim memory display" takes > physical addresses, not effective addresses. So unless this machine > has > 12XB of RAM, this access should have failed as there is no > memory at that address. > > Turning MEMORY_OVERFLOW off gives us a much more sensible result: > > systemsim % mysim memory display 0xC0000000000246B8 8 > Illegal Address 0xC0000000000246B8 > > It doesn't appear to have any effect on accesses done from Linux, with > the setting enabled or disabled we still get a machine check for bad > accesses in real mode: With that change applied, on mambo P10 running on a POWER8 I'm getting the following mambo exception that forbids the kernel to continue booting: [...] 142233280: (536372251): [ 0.001554] printk: bootconsole [udbg0] disabled 142387801: (537126772): [ 0.001870] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl 142412629: (537151600): [ 0.001919] pid_max: default: 32768 minimum: 301 142549799: (537888770): [ 0.002187] Mount-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) 142570582: (537909553): [ 0.002228] Mountpoint-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) 143544723: (541883694): [ 0.004130] EEH: PowerNV platform initialized 143557364: (541896335): [ 0.004155] POWER9 performance monitor hardware support registered 143629574: (541968545): [ 0.004296] rcu: Hierarchical SRCU implementation. 143904253: (543143224): [ 0.004833] smp: Bringing up secondary CPUs ... WARNING: 145271326: (548660031): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 FATAL ERROR: 145271326: (548660031): Attempt to store non-existent address 0x00000E995A3AF7B0 INFO: 145271326: (548660032): ** Execution stopped: Mambo Error, ** 145271326: ** finished running 548660032 instructions ** systemsim % c 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3 145820419: (550809869): [ 0.008575] Using standard scheduler topology 159140153: (604064242): [ 0.034590] node 0 initialised, 118050 pages in 10ms 159167293: (604195562): [ 0.034643] pgdatinit0 (31) used greatest stack depth: 13408 bytes left 160000000: [0:0:0]: (PC:0xC0000000001C164C) : 7.1 Mega-Inst/Sec : 7.1 Mega-Cycles/Sec [17 Zaps 0 PA-Zaps] *ON* [0:0:0] pri=4 extra=0 160000000: [0:0:1]: (PC:0xC0000000003A162C) : 7.1 Mega-Inst/Sec *ON* [0:0:1] pri=4 extra=0 160000000: [0:1:0]: (PC:0xC00000000019AA58) : 7.1 Mega-Inst/Sec *ON* [0:1:0] pri=2 extra=0 160000000: [0:1:1]: (PC:0xC00000000019A5F4) : 7.1 Mega-Inst/Sec *ON* [0:1:1] pri=2 extra=0 160134492: (608065254): [ 0.036532] devtmpfs: initialized 163550958: (621787960): [ 0.043205] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns 163574637: (621811639): [ 0.043251] futex hash table entries: 1024 (order: 1, 131072 bytes, linear) 164292187: (624696519): [ 0.044653] NET: Registered protocol family 16 164770384: (626692640): [ 0.045587] audit: initializing netlink subsys (disabled) 165130649: (628049925): [ 0.046290] audit: type=2000 audit(1024007219.020:1): state=initialized audit_enabled=0 res=1 165359745: (629069307): [ 0.046738] cpuidle: using governor menu 165482259: (629491821): [ 0.046977] pstore: Registered nvram as persistent store backend 167333983: (636947373): [ 0.050594] PCI: Probing PCI hardware 168184808: (640240704): [ 0.052255] cpuidle-powernv: Default stop: psscr = 0x0000000000000300,mask=0x00000000003003ff 168204280: (640260428): [ 0.052293] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300322,mask=0x00000000003003ff 168223015: (640279240): [ 0.052330] cpuidle-powernv: First stop level that may lose SPRs = 0x10 168238094: (640294710): [ 0.052359] cpuidle-powernv: First stop level that may lose timebase = 0x10 175060241: (647321380): [ 0.065684] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages 175077028: (647338167): [ 0.065717] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages 177879965: (651314704): [ 0.071191] iommu: Default domain type: Translated 177936270: (651386119): [ 0.071302] vgaarb: loaded 178219574: (651669423): [ 0.071855] SCSI subsystem initialized 178331796: (651781645): [ 0.072074] usbcore: registered new interface driver usbfs 178364558: (651814407): [ 0.072138] usbcore: registered new interface driver hub 178397036: (651981802): [ 0.072201] usbcore: registered new device driver usb 179169296: (653522982): [ 0.073710] clocksource: Switched to clocksource timebase WARNING: 179516247: (653875633): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 FATAL ERROR: 179516247: (653875633): Attempt to store non-existent address 0x00000E995A3AF7B0 INFO: 179516247: (653875634): ** Execution stopped: Mambo Error, ** 179516247: ** finished running 653875634 instructions ** systemsim % I don't understand yet what's happening. Gleen does not hit it either (he is running mambo on a POWER9 tho). Which mambo / host combination have you tried it? Thanks. Cheers, Gustavo
Gustavo Romero <gromero@linux.vnet.ibm.com> writes: > Hi Michael, > > On 6/25/20 8:56 AM, Michael Ellerman wrote: >> Mambo has a strange feature called MEMORY_OVERFLOW, enabled by >> default, which causes some accesses to non-existent memory addresses >> to transparently "create" memory. >> >> This can be confusing when debugging, eg: >> >> systemsim % mysim cpu 0 display spr pc >> 0xC0000000000246B8 >> systemsim % mysim memory display 0xC0000000000246B8 8 >> 0x0000000000000000 >> >> Appears to show that the memory at pc (NIP) is currently zeroes. >> >> The astute observer will note that "mysim memory display" takes >> physical addresses, not effective addresses. So unless this machine >> has > 12XB of RAM, this access should have failed as there is no >> memory at that address. >> >> Turning MEMORY_OVERFLOW off gives us a much more sensible result: >> >> systemsim % mysim memory display 0xC0000000000246B8 8 >> Illegal Address 0xC0000000000246B8 >> >> It doesn't appear to have any effect on accesses done from Linux, with >> the setting enabled or disabled we still get a machine check for bad >> accesses in real mode: > > With that change applied, on mambo P10 running on a POWER8 I'm getting > the following mambo exception that forbids the kernel to continue booting: This looks like exactly the kind of thing we want to catch, so that's "good" :) > [...] > 142233280: (536372251): [ 0.001554] printk: bootconsole [udbg0] disabled > 142387801: (537126772): [ 0.001870] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl > 142412629: (537151600): [ 0.001919] pid_max: default: 32768 minimum: 301 > 142549799: (537888770): [ 0.002187] Mount-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) > 142570582: (537909553): [ 0.002228] Mountpoint-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) > 143544723: (541883694): [ 0.004130] EEH: PowerNV platform initialized > 143557364: (541896335): [ 0.004155] POWER9 performance monitor hardware support registered > 143629574: (541968545): [ 0.004296] rcu: Hierarchical SRCU implementation. > 143904253: (543143224): [ 0.004833] smp: Bringing up secondary CPUs ... > WARNING: 145271326: (548660031): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 > FATAL ERROR: 145271326: (548660031): Attempt to store non-existent address 0x00000E995A3AF7B0 > INFO: 145271326: (548660032): ** Execution stopped: Mambo Error, ** > 145271326: ** finished running 548660032 instructions ** Can you see where the bad store came from, "bt" should give you a backtrace. Then "p pc" should give you the PC and "di <that value>" should show us what instruction it was. > systemsim % c > 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs > 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1 > 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3 Does it still happen with a single CPU? > 145820419: (550809869): [ 0.008575] Using standard scheduler topology > 159140153: (604064242): [ 0.034590] node 0 initialised, 118050 pages in 10ms > 159167293: (604195562): [ 0.034643] pgdatinit0 (31) used greatest stack depth: 13408 bytes left > 160000000: [0:0:0]: (PC:0xC0000000001C164C) : 7.1 Mega-Inst/Sec : 7.1 Mega-Cycles/Sec [17 Zaps 0 PA-Zaps] *ON* [0:0:0] pri=4 extra=0 > 160000000: [0:0:1]: (PC:0xC0000000003A162C) : 7.1 Mega-Inst/Sec *ON* [0:0:1] pri=4 extra=0 > 160000000: [0:1:0]: (PC:0xC00000000019AA58) : 7.1 Mega-Inst/Sec *ON* [0:1:0] pri=2 extra=0 > 160000000: [0:1:1]: (PC:0xC00000000019A5F4) : 7.1 Mega-Inst/Sec *ON* [0:1:1] pri=2 extra=0 > 160134492: (608065254): [ 0.036532] devtmpfs: initialized > 163550958: (621787960): [ 0.043205] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns > 163574637: (621811639): [ 0.043251] futex hash table entries: 1024 (order: 1, 131072 bytes, linear) > 164292187: (624696519): [ 0.044653] NET: Registered protocol family 16 > 164770384: (626692640): [ 0.045587] audit: initializing netlink subsys (disabled) > 165130649: (628049925): [ 0.046290] audit: type=2000 audit(1024007219.020:1): state=initialized audit_enabled=0 res=1 > 165359745: (629069307): [ 0.046738] cpuidle: using governor menu > 165482259: (629491821): [ 0.046977] pstore: Registered nvram as persistent store backend > 167333983: (636947373): [ 0.050594] PCI: Probing PCI hardware > 168184808: (640240704): [ 0.052255] cpuidle-powernv: Default stop: psscr = 0x0000000000000300,mask=0x00000000003003ff > 168204280: (640260428): [ 0.052293] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300322,mask=0x00000000003003ff > 168223015: (640279240): [ 0.052330] cpuidle-powernv: First stop level that may lose SPRs = 0x10 > 168238094: (640294710): [ 0.052359] cpuidle-powernv: First stop level that may lose timebase = 0x10 > 175060241: (647321380): [ 0.065684] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages > 175077028: (647338167): [ 0.065717] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages > 177879965: (651314704): [ 0.071191] iommu: Default domain type: Translated > 177936270: (651386119): [ 0.071302] vgaarb: loaded > 178219574: (651669423): [ 0.071855] SCSI subsystem initialized > 178331796: (651781645): [ 0.072074] usbcore: registered new interface driver usbfs > 178364558: (651814407): [ 0.072138] usbcore: registered new interface driver hub > 178397036: (651981802): [ 0.072201] usbcore: registered new device driver usb > 179169296: (653522982): [ 0.073710] clocksource: Switched to clocksource timebase > WARNING: 179516247: (653875633): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 > FATAL ERROR: 179516247: (653875633): Attempt to store non-existent address 0x00000E995A3AF7B0 > INFO: 179516247: (653875634): ** Execution stopped: Mambo Error, ** > 179516247: ** finished running 653875634 instructions ** > systemsim % > > I don't understand yet what's happening. Gleen does not hit it either > (he is running mambo on a POWER9 tho). Which mambo / host combination > have you tried it? I tested Power8 / Power9 emulated on x86-64 and ppc64le. cheers
Hello Michael, On 6/28/20 10:27 PM, Michael Ellerman wrote: > Gustavo Romero <gromero@linux.vnet.ibm.com> writes: >> Hi Michael, >> >> On 6/25/20 8:56 AM, Michael Ellerman wrote: >>> Mambo has a strange feature called MEMORY_OVERFLOW, enabled by >>> default, which causes some accesses to non-existent memory addresses >>> to transparently "create" memory. >>> >>> This can be confusing when debugging, eg: >>> >>> systemsim % mysim cpu 0 display spr pc >>> 0xC0000000000246B8 >>> systemsim % mysim memory display 0xC0000000000246B8 8 >>> 0x0000000000000000 >>> >>> Appears to show that the memory at pc (NIP) is currently zeroes. >>> >>> The astute observer will note that "mysim memory display" takes >>> physical addresses, not effective addresses. So unless this machine >>> has > 12XB of RAM, this access should have failed as there is no >>> memory at that address. >>> >>> Turning MEMORY_OVERFLOW off gives us a much more sensible result: >>> >>> systemsim % mysim memory display 0xC0000000000246B8 8 >>> Illegal Address 0xC0000000000246B8 >>> >>> It doesn't appear to have any effect on accesses done from Linux, with >>> the setting enabled or disabled we still get a machine check for bad >>> accesses in real mode: >> >> With that change applied, on mambo P10 running on a POWER8 I'm getting >> the following mambo exception that forbids the kernel to continue booting: > > This looks like exactly the kind of thing we want to catch, so that's > "good" :) > >> [...] >> 142233280: (536372251): [ 0.001554] printk: bootconsole [udbg0] disabled >> 142387801: (537126772): [ 0.001870] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl >> 142412629: (537151600): [ 0.001919] pid_max: default: 32768 minimum: 301 >> 142549799: (537888770): [ 0.002187] Mount-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) >> 142570582: (537909553): [ 0.002228] Mountpoint-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) >> 143544723: (541883694): [ 0.004130] EEH: PowerNV platform initialized >> 143557364: (541896335): [ 0.004155] POWER9 performance monitor hardware support registered >> 143629574: (541968545): [ 0.004296] rcu: Hierarchical SRCU implementation. >> 143904253: (543143224): [ 0.004833] smp: Bringing up secondary CPUs ... >> WARNING: 145271326: (548660031): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 >> FATAL ERROR: 145271326: (548660031): Attempt to store non-existent address 0x00000E995A3AF7B0 >> INFO: 145271326: (548660032): ** Execution stopped: Mambo Error, ** >> 145271326: ** finished running 548660032 instructions ** > > Can you see where the bad store came from, "bt" should give you a backtrace. > > Then "p pc" should give you the PC and "di <that value>" should show us > what instruction it was. PC itself is parked, believe it or not, at a 'nop' instruction: 151630615: (157638813): [ 0.040390] pstore: Registered nvram as persistent store backend 152359208: (158398639): [ 0.041813] PCI: Probing PCI hardware 152666145: (158718713): [ 0.042412] audit: type=2000 audit(1024007219.010:1): state=initialized audit_enabled=0 res=1 153055303: (159124575): [ 0.043172] cpuidle-powernv: Default stop: psscr = 0x0000000000000300,mask=0x00000000003003ff 153074775: (159144887): [ 0.043210] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300322,mask=0x00000000003003ff 153093510: (159164435): [ 0.043247] cpuidle-powernv: First stop level that may lose SPRs = 0x10 153108589: (159180151): [ 0.043276] cpuidle-powernv: First stop level that may lose timebase = 0x10 WARNING: 156265663: (162472569): Write_Mapped_Memory_Reg: Unknown address: 0x00000BEE18C96D60, length=8 FATAL ERROR: 156265663: (162472569): Attempt to store non-existent address 0x00000BEE18C96D60 INFO: 156265663: (162472570): ** Execution stopped: Mambo Error, ** 156265663: ** finished running 162472570 instructions ** systemsim % p pc 0xC0000000003A2EC4 systemsim % di 0xC0000000003A2EC4 WARNING: 156265663: (162472570): Need to define a CPU WARNING: 156265663: (162472570): Need to define a CPU EADDR:0xC0000000003A2EC0 RADDR:0x003A2EC0 Enc:0x2C4A007C : dcbt r0,r9,0 WARNING: 156265663: (162472570): Need to define a CPU WARNING: 156265663: (162472570): Need to define a CPU EADDR:0xC0000000003A2EC4 RADDR:0x003A2EC4 Enc:0x00000060 : nop WARNING: 156265663: (162472570): Need to define a CPU WARNING: 156265663: (162472570): Need to define a CPU EADDR:0xC0000000003A2EC8 RADDR:0x003A2EC8 Enc:0x00000060 : nop WARNING: 156265663: (162472570): Need to define a CPU WARNING: 156265663: (162472570): Need to define a CPU But just before the 'nop' there is a dcbt. But address passed to the dcbt, in GPR 9, doesn't contain anything close to the address displayed by mambo: systemsim % mysim cpu 0:0:0 display gpr 9 0x0000000000000240 systemsim % mysim mcm 0 cpu 0 thread 0 dtranslate 0x240 data address translation for 0x0000000000000240 failed systemsim % The dcbt instruction is in mm/slub.c; more context: 1241956 prefetch(object + s->offset); 1241957 c0000000003a2eb4: 20 00 3f 81 lwz r9,32(r31) 1241958 if (unlikely(!x)) 1241959 c0000000003a2eb8: 15 4a 3a 7d add. r9,r26,r9 1241960 c0000000003a2ebc: 08 00 82 41 beq c0000000003a2ec4 <kmem_cache_alloc+0x114> 1241961 __asm__ __volatile__ ("dcbt 0,%0" : : "r" (x)); 1241962 c0000000003a2ec0: 2c 4a 00 7c dcbt 0,r9 1241963 c0000000003a2ec4: 00 00 00 60 nop 1241964 c0000000003a2ec8: 00 00 00 60 nop so probably from a prefetch_freepointer() in https://github.com/torvalds/linux/blob/master/mm/slub.c#L2815 >> systemsim % c >> 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs >> 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1 >> 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3 > > Does it still happen with a single CPU? It still happens with maxcpus=0 or =1. However if I disable radix passing disable_radix=1 to command line I'm able to boot. Today I've built upstream mambo and the same issue happens. I'm clueless yet what's happening... so if you have additional things to try let me know. It might be an issue with Mambo P10 running on P8. Thanks, Gustavo
On Thu, Jul 2, 2020 at 11:56 AM Gustavo Romero <gromero@linux.vnet.ibm.com> wrote: > > > But just before the 'nop' there is a dcbt. But address passed to the dcbt, > in GPR 9, doesn't contain anything close to the address displayed by mambo: > > systemsim % mysim cpu 0:0:0 display gpr 9 > 0x0000000000000240 What in the other GPRs? > systemsim % mysim mcm 0 cpu 0 thread 0 dtranslate 0x240 > data address translation for 0x0000000000000240 failed > systemsim % > > The dcbt instruction is in mm/slub.c; more context: > > 1241956 prefetch(object + s->offset); > 1241957 c0000000003a2eb4: 20 00 3f 81 lwz r9,32(r31) > 1241958 if (unlikely(!x)) > 1241959 c0000000003a2eb8: 15 4a 3a 7d add. r9,r26,r9 > 1241960 c0000000003a2ebc: 08 00 82 41 beq c0000000003a2ec4 <kmem_cache_alloc+0x114> > 1241961 __asm__ __volatile__ ("dcbt 0,%0" : : "r" (x)); > 1241962 c0000000003a2ec0: 2c 4a 00 7c dcbt 0,r9 > 1241963 c0000000003a2ec4: 00 00 00 60 nop > 1241964 c0000000003a2ec8: 00 00 00 60 nop > > so probably from a prefetch_freepointer() in > https://github.com/torvalds/linux/blob/master/mm/slub.c#L2815 > > > >> systemsim % c > >> 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs > >> 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1 > >> 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3 > > > > Does it still happen with a single CPU? > > It still happens with maxcpus=0 or =1. However if I disable radix passing > disable_radix=1 to command line I'm able to boot. What about with the number of threads configured in skiboot.tcl set to 1? > Today I've built upstream mambo and the same issue happens. I'm clueless > yet what's happening... so if you have additional things to try let > me know. It might be an issue with Mambo P10 running on P8. It's probably just a P10 mambo bug. Even if it's not they're more likely to have an answer for you about what's going on. Anyway, I'll hold off on merging this until people figure it out.
Gustavo Romero <gromero@linux.vnet.ibm.com> writes: > On 6/28/20 10:27 PM, Michael Ellerman wrote: >> Gustavo Romero <gromero@linux.vnet.ibm.com> writes: >>> On 6/25/20 8:56 AM, Michael Ellerman wrote: >>>> Mambo has a strange feature called MEMORY_OVERFLOW, enabled by >>>> default, which causes some accesses to non-existent memory addresses >>>> to transparently "create" memory. >>>> >>>> This can be confusing when debugging, eg: >>>> >>>> systemsim % mysim cpu 0 display spr pc >>>> 0xC0000000000246B8 >>>> systemsim % mysim memory display 0xC0000000000246B8 8 >>>> 0x0000000000000000 >>>> >>>> Appears to show that the memory at pc (NIP) is currently zeroes. >>>> >>>> The astute observer will note that "mysim memory display" takes >>>> physical addresses, not effective addresses. So unless this machine >>>> has > 12XB of RAM, this access should have failed as there is no >>>> memory at that address. >>>> >>>> Turning MEMORY_OVERFLOW off gives us a much more sensible result: >>>> >>>> systemsim % mysim memory display 0xC0000000000246B8 8 >>>> Illegal Address 0xC0000000000246B8 >>>> >>>> It doesn't appear to have any effect on accesses done from Linux, with >>>> the setting enabled or disabled we still get a machine check for bad >>>> accesses in real mode: >>> >>> With that change applied, on mambo P10 running on a POWER8 I'm getting >>> the following mambo exception that forbids the kernel to continue booting: >> >> This looks like exactly the kind of thing we want to catch, so that's >> "good" :) >> >>> [...] >>> 142233280: (536372251): [ 0.001554] printk: bootconsole [udbg0] disabled >>> 142387801: (537126772): [ 0.001870] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl >>> 142412629: (537151600): [ 0.001919] pid_max: default: 32768 minimum: 301 >>> 142549799: (537888770): [ 0.002187] Mount-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) >>> 142570582: (537909553): [ 0.002228] Mountpoint-cache hash table entries: 16384 (order: 1, 131072 bytes, linear) >>> 143544723: (541883694): [ 0.004130] EEH: PowerNV platform initialized >>> 143557364: (541896335): [ 0.004155] POWER9 performance monitor hardware support registered >>> 143629574: (541968545): [ 0.004296] rcu: Hierarchical SRCU implementation. >>> 143904253: (543143224): [ 0.004833] smp: Bringing up secondary CPUs ... >>> WARNING: 145271326: (548660031): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8 >>> FATAL ERROR: 145271326: (548660031): Attempt to store non-existent address 0x00000E995A3AF7B0 >>> INFO: 145271326: (548660032): ** Execution stopped: Mambo Error, ** >>> 145271326: ** finished running 548660032 instructions ** >> >> Can you see where the bad store came from, "bt" should give you a backtrace. >> >> Then "p pc" should give you the PC and "di <that value>" should show us >> what instruction it was. > > PC itself is parked, believe it or not, at a 'nop' instruction: Hmm, maybe we're looking at the wrong CPU/thread? > 151630615: (157638813): [ 0.040390] pstore: Registered nvram as persistent store backend > 152359208: (158398639): [ 0.041813] PCI: Probing PCI hardware > 152666145: (158718713): [ 0.042412] audit: type=2000 audit(1024007219.010:1): state=initialized audit_enabled=0 res=1 > 153055303: (159124575): [ 0.043172] cpuidle-powernv: Default stop: psscr = 0x0000000000000300,mask=0x00000000003003ff > 153074775: (159144887): [ 0.043210] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300322,mask=0x00000000003003ff > 153093510: (159164435): [ 0.043247] cpuidle-powernv: First stop level that may lose SPRs = 0x10 > 153108589: (159180151): [ 0.043276] cpuidle-powernv: First stop level that may lose timebase = 0x10 > WARNING: 156265663: (162472569): Write_Mapped_Memory_Reg: Unknown address: 0x00000BEE18C96D60, length=8 > FATAL ERROR: 156265663: (162472569): Attempt to store non-existent address 0x00000BEE18C96D60 > INFO: 156265663: (162472570): ** Execution stopped: Mambo Error, ** > 156265663: ** finished running 162472570 instructions ** I notice this doesn't tell us what CPU caused the bad access. > systemsim % p pc > 0xC0000000003A2EC4 "p" implicitly shows CPU 0 unless you've used the "target" command. > systemsim % di 0xC0000000003A2EC4 > WARNING: 156265663: (162472570): Need to define a CPU > WARNING: 156265663: (162472570): Need to define a CPU > EADDR:0xC0000000003A2EC0 RADDR:0x003A2EC0 Enc:0x2C4A007C : dcbt r0,r9,0 > WARNING: 156265663: (162472570): Need to define a CPU > WARNING: 156265663: (162472570): Need to define a CPU > EADDR:0xC0000000003A2EC4 RADDR:0x003A2EC4 Enc:0x00000060 : nop > WARNING: 156265663: (162472570): Need to define a CPU > WARNING: 156265663: (162472570): Need to define a CPU > EADDR:0xC0000000003A2EC8 RADDR:0x003A2EC8 Enc:0x00000060 : nop > WARNING: 156265663: (162472570): Need to define a CPU > WARNING: 156265663: (162472570): Need to define a CPU > > But just before the 'nop' there is a dcbt. But address passed to the dcbt, > in GPR 9, doesn't contain anything close to the address displayed by mambo: > > systemsim % mysim cpu 0:0:0 display gpr 9 > 0x0000000000000240 > systemsim % mysim mcm 0 cpu 0 thread 0 dtranslate 0x240 > data address translation for 0x0000000000000240 failed > systemsim % > > The dcbt instruction is in mm/slub.c; more context: > > 1241956 prefetch(object + s->offset); > 1241957 c0000000003a2eb4: 20 00 3f 81 lwz r9,32(r31) > 1241958 if (unlikely(!x)) > 1241959 c0000000003a2eb8: 15 4a 3a 7d add. r9,r26,r9 > 1241960 c0000000003a2ebc: 08 00 82 41 beq c0000000003a2ec4 <kmem_cache_alloc+0x114> > 1241961 __asm__ __volatile__ ("dcbt 0,%0" : : "r" (x)); > 1241962 c0000000003a2ec0: 2c 4a 00 7c dcbt 0,r9 > 1241963 c0000000003a2ec4: 00 00 00 60 nop > 1241964 c0000000003a2ec8: 00 00 00 60 nop > > so probably from a prefetch_freepointer() in > https://github.com/torvalds/linux/blob/master/mm/slub.c#L2815 But we shouldn't be prefetching 0x240, that's a userspace address. So I suspect something has gone wrong with the debug here. >>> systemsim % c >>> 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs >>> 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1 >>> 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3 >> >> Does it still happen with a single CPU? > > It still happens with maxcpus=0 or =1. However if I disable radix passing > disable_radix=1 to command line I'm able to boot. Hmm OK. > Today I've built upstream mambo and the same issue happens. I'm clueless > yet what's happening... so if you have additional things to try let > me know. It might be an issue with Mambo P10 running on P8. Just to be clear it doesn't happen with mambo simulating P9 right? cheers
On 7/2/20 4:03 AM, Michael Ellerman wrote: > Gustavo Romero <gromero@linux.vnet.ibm.com> writes: >> On 6/28/20 10:27 PM, Michael Ellerman wrote: >> Today I've built upstream mambo and the same issue happens. I'm clueless >> yet what's happening... so if you have additional things to try let >> me know. It might be an issue with Mambo P10 running on P8. > > Just to be clear it doesn't happen with mambo simulating P9 right? Yes, just on mambo P10. P9 is ok. Also, Gleen could not reproduce it w/ mambo P10 running a P9 host (I'm using mambo on a P8 host). I'm not sure if the host matter that much, but I've also tried with the same kernel image and mambo config that Gleen used and I still hit the issue on my P8 host. I'll dig more tomorrow. Thanks Michael and Oliver. Cheers, Gustavo
On Fri, Jul 3, 2020 at 8:53 AM Gustavo Romero <gromero@linux.vnet.ibm.com> wrote: > > On 7/2/20 4:03 AM, Michael Ellerman wrote: > > Gustavo Romero <gromero@linux.vnet.ibm.com> writes: > >> On 6/28/20 10:27 PM, Michael Ellerman wrote: > >> Today I've built upstream mambo and the same issue happens. I'm clueless > >> yet what's happening... so if you have additional things to try let > >> me know. It might be an issue with Mambo P10 running on P8. > > > > Just to be clear it doesn't happen with mambo simulating P9 right? > > Yes, just on mambo P10. P9 is ok. Also, Gleen could not reproduce it > w/ mambo P10 running a P9 host (I'm using mambo on a P8 host). I'm not > sure if the host matter that much, but I've also tried with the same > kernel image and mambo config that Gleen used and I still hit the > issue on my P8 host. > > I'll dig more tomorrow. Any updates on this?
Hi Oliver, On 8/4/20 4:01 AM, Oliver O'Halloran wrote: > On Fri, Jul 3, 2020 at 8:53 AM Gustavo Romero > <gromero@linux.vnet.ibm.com> wrote: >> >> On 7/2/20 4:03 AM, Michael Ellerman wrote: >>> Gustavo Romero <gromero@linux.vnet.ibm.com> writes: >>>> On 6/28/20 10:27 PM, Michael Ellerman wrote: >>>> Today I've built upstream mambo and the same issue happens. I'm clueless >>>> yet what's happening... so if you have additional things to try let >>>> me know. It might be an issue with Mambo P10 running on P8. >>> >>> Just to be clear it doesn't happen with mambo simulating P9 right? >> >> Yes, just on mambo P10. P9 is ok. Also, Gleen could not reproduce it >> w/ mambo P10 running a P9 host (I'm using mambo on a P8 host). I'm not >> sure if the host matter that much, but I've also tried with the same >> kernel image and mambo config that Gleen used and I still hit the >> issue on my P8 host. >> >> I'll dig more tomorrow. > > Any updates on this? So the only update I have is that address in GPR 9 used by dcbt is indeed correct, it happens that the MMU is already enabled, address 0x0000000000000240 is indeed correct, but will be translated... so unfortunately not much progress since the last time. I was only able to reproduce it on a given P8 machine we have in our lab. Another thing (which doesn't make sense at all) is that it only reproduces if that P8 machine is running the Ubuntu 18.04.4 4.15.0-110-generic stock kernel, so if I boot the machine with some upstream one (for instance, 5.5.0) the issue doesn't reproduce. Nonetheless, the kernel on the host should not affect Mambo whatsoever. I've spent days digging into that issue but could not find anything wrong with Mambo or with the kernel booted on Mambo VM. Additionally, I've noted that that particular P8 machine is crashing mysteriously from time to time due to what looks like a HW issue, so I don't trust it that much at this point. Glenn was also never able to reproduce it at his side. Thus, I won't hold the change from Michael anymore, I was caught once by that Mambo feature being silently enabled when working on a soft/hard lockup and it was quite misleading. Also, it's not how the real HW works. So please, go ahead :) Thanks, Gustavo
diff --git a/external/mambo/skiboot.tcl b/external/mambo/skiboot.tcl index 6003fcbef..264846c22 100644 --- a/external/mambo/skiboot.tcl +++ b/external/mambo/skiboot.tcl @@ -122,6 +122,7 @@ myconf config processor/timebase_frequency 1/1 myconf config enable_pseries_nvram false myconf config machine_option/NO_RAM TRUE myconf config machine_option/NO_ROM TRUE +myconf config machine_option/MEMORY_OVERFLOW FALSE if { $default_config == "PEGASUS" } { # We need to be DD2 or greater on p8 for the HILE HID bit.
Mambo has a strange feature called MEMORY_OVERFLOW, enabled by default, which causes some accesses to non-existent memory addresses to transparently "create" memory. This can be confusing when debugging, eg: systemsim % mysim cpu 0 display spr pc 0xC0000000000246B8 systemsim % mysim memory display 0xC0000000000246B8 8 0x0000000000000000 Appears to show that the memory at pc (NIP) is currently zeroes. The astute observer will note that "mysim memory display" takes physical addresses, not effective addresses. So unless this machine has > 12XB of RAM, this access should have failed as there is no memory at that address. Turning MEMORY_OVERFLOW off gives us a much more sensible result: systemsim % mysim memory display 0xC0000000000246B8 8 Illegal Address 0xC0000000000246B8 It doesn't appear to have any effect on accesses done from Linux, with the setting enabled or disabled we still get a machine check for bad accesses in real mode: 0:mon> p $__mmu_off return value is 0x9000000030001003 0:mon> d c008000000000000 c008000000000000 [15447.242793] Disabling lock debugging due to kernel taint [15447.242824] MCE: CPU0: machine check (Severe) Host Real address Load/Store (foreign) [Not recovered] [15447.242868] MCE: CPU0: NIP: [c000000000103090] mread+0x90/0x190 [15447.242897] MCE: CPU0: Initiator CPU [15447.242918] MCE: CPU0: Hardware error [15447.242939] opal: Hardware platform error: Unrecoverable Machine Check exception [15447.242978] CPU: 0 PID: 18 Comm: kopald Tainted: G M 5.7.0-gcc-8.2.0-13354-gd475a86526b5-dirty #127 [15447.243022] NIP: c000000000103090 LR: c000000000103034 CTR: c0000000000c2050 [15447.243059] REGS: c0000000fffbfd70 TRAP: 0200 Tainted: G M (5.7.0-gcc-8.2.0-13354-gd475a86526b5-dirty) [15447.243101] MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 28002224 XER: 20040000 [15447.243150] CFAR: c000000000029660 DAR: c008000000000000 DSISR: 00000008 IRQMASK: 3 [15447.243150] GPR00: 0000000028002224 c0000000f05074c0 c000000001775e00 0000000000000000 [15447.243150] GPR04: c0000000f0507768 0000000000000010 0000000000000010 c0000000f0507768 [15447.243150] GPR08: 0007ffff0faf8899 0000000000000000 c0000000f0507767 9000000030001003 [15447.243150] GPR12: 0000000031c10000 c000000001970000 c0000000f0507738 0000000000000000 [15447.243150] GPR16: 000000000000002e 0000000000000010 000000000000002e 0000000000000010 [15447.243150] GPR20: c000000000e563d0 c000000000ffcf88 c000000000fddf30 c008000000000000 [15447.243150] GPR24: c0000000f0507767 c000000000ffc678 0000000000000040 c0000000f0507878 [15447.243150] GPR28: c0000000f0507768 c0000000f0507768 0000000000000010 0000000000000000 [15447.243485] NIP [c000000000103090] mread+0x90/0x190 [15447.243514] LR [c000000000103034] mread+0x34/0x190 [15447.243538] Call Trace: [15447.243559] [c0000000f05074c0] [c0000000f0507510] 0xc0000000f0507510 (unreliable) [15447.243602] [c0000000f0507510] [c000000000107b48] xmon_core+0xd18/0x3a10 [15447.243640] [c0000000f0507850] [c00000000010a870] xmon+0x30/0x80 [15447.243677] [c0000000f05079f0] [c0000000001050d4] sysrq_handle_xmon+0xc4/0xd0 [15447.243720] [c0000000f0507a20] [c000000000882954] __handle_sysrq+0xd4/0x220 [15447.243758] [c0000000f0507ac0] [c00000000089db68] __hvc_poll+0x268/0x4b0 [15447.243796] [c0000000f0507b60] [c00000000089e824] hvc_handle_interrupt+0x24/0x70 [15447.243838] [c0000000f0507b80] [c0000000001c2474] __handle_irq_event_percpu+0xb4/0x350 [15447.243880] [c0000000f0507c50] [c0000000001c2754] handle_irq_event_percpu+0x44/0xc0 [15447.243923] [c0000000f0507c90] [c0000000001c2844] handle_irq_event+0x74/0xc0 [15447.243965] [c0000000f0507cc0] [c0000000001c9770] handle_level_irq+0xe0/0x1c0 [15447.244006] [c0000000f0507cf0] [c0000000001c0a44] generic_handle_irq+0x54/0x80 [15447.244046] [c0000000f0507d10] [c0000000000ca050] opal_handle_events+0x130/0x150 [15447.244088] [c0000000f0507d60] [c0000000000c1e7c] kopald+0x9c/0x140 [15447.244124] [c0000000f0507db0] [c00000000016ea9c] kthread+0x19c/0x1b0 [15447.244161] [c0000000f0507e20] [c00000000000d1a8] ret_from_kernel_thread+0x5c/0x74 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> --- external/mambo/skiboot.tcl | 1 + 1 file changed, 1 insertion(+)