Message ID | 20190606072951.32116-1-ravi.bangoria@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Series | Powerpc/Watchpoint: Restore nvgprs while returning from exception |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (a3bf9fbdad600b1e4335dd90979f8d6072e4f602) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 8 lines checked |
Ravi Bangoria wrote:
> Powerpc hw triggers watchpoint before executing the instruction.
> To make trigger-after-execute behavior, kernel emulates the
> instruction. If the instruction is 'load something into
> non-volatile register', exception handler should restore emulated
> register state while returning back, otherwise there will be
> register state corruption. Ex, Adding a watchpoint on a list
> can corrupt the list:
>
>   # cat /proc/kallsyms | grep kthread_create_list
>   c00000000121c8b8 d kthread_create_list
>
> Add watchpoint on kthread_create_list->next:
>
>   # perf record -e mem:0xc00000000121c8c0
>
> Run some workload such that new kthread gets invoked. Ex, I
> just logged out from console:
>
>   list_add corruption. next->prev should be prev (c000000001214e00), \
>         but was c00000000121c8b8. (next=c00000000121c8b8).
>   WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>   CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>   ...
>   NIP __list_add_valid+0xb4/0xc0
>   LR __list_add_valid+0xb0/0xc0
>   Call Trace:
>   __list_add_valid+0xb0/0xc0 (unreliable)
>   __kthread_create_on_node+0xe0/0x260
>   kthread_create_on_node+0x34/0x50
>   create_worker+0xe8/0x260
>   worker_thread+0x444/0x560
>   kthread+0x160/0x1a0
>   ret_from_kernel_thread+0x5c/0x70
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

Awesome catch - this one has had a glorious run...

Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors")
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>

- Naveen
On 6/6/19 12:59 PM, Ravi Bangoria wrote:
> Powerpc hw triggers watchpoint before executing the instruction.
> To make trigger-after-execute behavior, kernel emulates the
> instruction. If the instruction is 'load something into
> non-volatile register', exception handler should restore emulated
> register state while returning back, otherwise there will be
> register state corruption. Ex, Adding a watchpoint on a list
> can corrupt the list:
>
>   # cat /proc/kallsyms | grep kthread_create_list
>   c00000000121c8b8 d kthread_create_list
>
> Add watchpoint on kthread_create_list->next:

s/kthread_create_list->next/kthread_create_list->prev/
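The correction can be checked with a little arithmetic. This is a minimal sketch, assuming the usual 64-bit layout of `struct list_head` (two 8-byte pointers, `next` followed by `prev`); the variable names are illustrative and not taken from the thread.

```python
# Assumption: 64-bit kernel, struct list_head { next; prev; } with 8-byte
# pointers, so `next` sits at offset 0 and `prev` at offset 8.
POINTER_SIZE = 8
LIST_HEAD_OFFSETS = {"next": 0 * POINTER_SIZE, "prev": 1 * POINTER_SIZE}

kthread_create_list = 0xC00000000121C8B8  # symbol address from /proc/kallsyms
watch_addr = 0xC00000000121C8C0           # address passed to `perf record -e mem:`

# Distance of the watched address from the start of the list head.
offset = watch_addr - kthread_create_list

# Look up which member of the list head lives at that offset.
field = [name for name, off in LIST_HEAD_OFFSETS.items() if off == offset][0]

print(f"offset {offset} -> kthread_create_list.{field}")
```

Under that layout assumption the watched address is 8 bytes past the symbol, which is the second pointer of the list head.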
On Thu, 2019-06-06 at 12:59 +0530, Ravi Bangoria wrote:
> Powerpc hw triggers watchpoint before executing the instruction.
> To make trigger-after-execute behavior, kernel emulates the
> instruction. If the instruction is 'load something into
> non-volatile register', exception handler should restore emulated
> register state while returning back, otherwise there will be
> register state corruption. Ex, Adding a watchpoint on a list
> can corrupt the list:
>
>   # cat /proc/kallsyms | grep kthread_create_list
>   c00000000121c8b8 d kthread_create_list
>
> Add watchpoint on kthread_create_list->next:
>
>   # perf record -e mem:0xc00000000121c8c0
>
> Run some workload such that new kthread gets invoked. Ex, I
> just logged out from console:
>
>   list_add corruption. next->prev should be prev (c000000001214e00), \
>         but was c00000000121c8b8. (next=c00000000121c8b8).
>   WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>   CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>   ...
>   NIP __list_add_valid+0xb4/0xc0
>   LR __list_add_valid+0xb0/0xc0
>   Call Trace:
>   __list_add_valid+0xb0/0xc0 (unreliable)
>   __kthread_create_on_node+0xe0/0x260
>   kthread_create_on_node+0x34/0x50
>   create_worker+0xe8/0x260
>   worker_thread+0x444/0x560
>   kthread+0x160/0x1a0
>   ret_from_kernel_thread+0x5c/0x70
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

How long has this been around? Should we be CCing stable?

Mikey

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 9481a11..96de0d1 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1753,7 +1753,7 @@ handle_dabr_fault:
>          ld      r5,_DSISR(r1)
>          addi    r3,r1,STACK_FRAME_OVERHEAD
>          bl      do_break
> -12:     b       ret_from_except_lite
> +12:     b       ret_from_except
>
>
>  #ifdef CONFIG_PPC_BOOK3S_64
On 6/7/19 6:20 AM, Michael Neuling wrote:
> On Thu, 2019-06-06 at 12:59 +0530, Ravi Bangoria wrote:
>> Powerpc hw triggers watchpoint before executing the instruction.
>> To make trigger-after-execute behavior, kernel emulates the
>> instruction. If the instruction is 'load something into
>> non-volatile register', exception handler should restore emulated
>> register state while returning back, otherwise there will be
>> register state corruption. Ex, Adding a watchpoint on a list
>> can corrupt the list:
>>
>>   # cat /proc/kallsyms | grep kthread_create_list
>>   c00000000121c8b8 d kthread_create_list
>>
>> Add watchpoint on kthread_create_list->next:
>>
>>   # perf record -e mem:0xc00000000121c8c0
>>
>> Run some workload such that new kthread gets invoked. Ex, I
>> just logged out from console:
>>
>>   list_add corruption. next->prev should be prev (c000000001214e00), \
>>         but was c00000000121c8b8. (next=c00000000121c8b8).
>>   WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>>   CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>>   ...
>>   NIP __list_add_valid+0xb4/0xc0
>>   LR __list_add_valid+0xb0/0xc0
>>   Call Trace:
>>   __list_add_valid+0xb0/0xc0 (unreliable)
>>   __kthread_create_on_node+0xe0/0x260
>>   kthread_create_on_node+0x34/0x50
>>   create_worker+0xe8/0x260
>>   worker_thread+0x444/0x560
>>   kthread+0x160/0x1a0
>>   ret_from_kernel_thread+0x5c/0x70
>>
>> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
>
> How long has this been around? Should we be CCing stable?

"bl .save_nvgprs" was added in the commit 5aae8a5370802 ("powerpc,
hw_breakpoints: Implement hw_breakpoints for 64-bit server processors"),
which was merged in v2.6.36.
Ravi Bangoria <ravi.bangoria@linux.ibm.com> writes:
> Powerpc hw triggers watchpoint before executing the instruction.
> To make trigger-after-execute behavior, kernel emulates the
> instruction. If the instruction is 'load something into
> non-volatile register', exception handler should restore emulated
> register state while returning back, otherwise there will be
> register state corruption. Ex, Adding a watchpoint on a list
> can corrupt the list:
>
>   # cat /proc/kallsyms | grep kthread_create_list
>   c00000000121c8b8 d kthread_create_list
>
> Add watchpoint on kthread_create_list->next:
>
>   # perf record -e mem:0xc00000000121c8c0
>
> Run some workload such that new kthread gets invoked. Ex, I
> just logged out from console:
>
>   list_add corruption. next->prev should be prev (c000000001214e00), \
>         but was c00000000121c8b8. (next=c00000000121c8b8).
>   WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>   CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>   ...
>   NIP __list_add_valid+0xb4/0xc0
>   LR __list_add_valid+0xb0/0xc0
>   Call Trace:
>   __list_add_valid+0xb0/0xc0 (unreliable)
>   __kthread_create_on_node+0xe0/0x260
>   kthread_create_on_node+0x34/0x50
>   create_worker+0xe8/0x260
>   worker_thread+0x444/0x560
>   kthread+0x160/0x1a0
>   ret_from_kernel_thread+0x5c/0x70

This all depends on what code the compiler generates for the list
access. Can you include a disassembly of the relevant code in your
kernel so we have an example of the bad case.

> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 9481a11..96de0d1 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1753,7 +1753,7 @@ handle_dabr_fault:
>          ld      r5,_DSISR(r1)
>          addi    r3,r1,STACK_FRAME_OVERHEAD
>          bl      do_break
> -12:     b       ret_from_except_lite
> +12:     b       ret_from_except

This probably warrants a comment explaining why we can't use the (badly
named) "lite" version.

cheers
On 6/7/19 11:20 AM, Michael Ellerman wrote:
> Ravi Bangoria <ravi.bangoria@linux.ibm.com> writes:
>
>> Powerpc hw triggers watchpoint before executing the instruction.
>> To make trigger-after-execute behavior, kernel emulates the
>> instruction. If the instruction is 'load something into
>> non-volatile register', exception handler should restore emulated
>> register state while returning back, otherwise there will be
>> register state corruption. Ex, Adding a watchpoint on a list
>> can corrupt the list:
>>
>>   # cat /proc/kallsyms | grep kthread_create_list
>>   c00000000121c8b8 d kthread_create_list
>>
>> Add watchpoint on kthread_create_list->next:
>>
>>   # perf record -e mem:0xc00000000121c8c0
>>
>> Run some workload such that new kthread gets invoked. Ex, I
>> just logged out from console:
>>
>>   list_add corruption. next->prev should be prev (c000000001214e00), \
>>         but was c00000000121c8b8. (next=c00000000121c8b8).
>>   WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>>   CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>>   ...
>>   NIP __list_add_valid+0xb4/0xc0
>>   LR __list_add_valid+0xb0/0xc0
>>   Call Trace:
>>   __list_add_valid+0xb0/0xc0 (unreliable)
>>   __kthread_create_on_node+0xe0/0x260
>>   kthread_create_on_node+0x34/0x50
>>   create_worker+0xe8/0x260
>>   worker_thread+0x444/0x560
>>   kthread+0x160/0x1a0
>>   ret_from_kernel_thread+0x5c/0x70
>
> This all depends on what code the compiler generates for the list
> access.

True. List corruption is just an example. Any load instruction that
uses a non-volatile register and hits a watchpoint will result in
register state corruption.

> Can you include a disassembly of the relevant code in your
> kernel so we have an example of the bad case.
Register state from WARN_ON():

  GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075
  GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000
  GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa
  GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00
  GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370
  GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000
  GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628
  GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0

Snippet from __kthread_create_on_node:

  c000000000136be8:     ed ff a2 3f     addis   r29,r2,-19
  c000000000136bec:     c0 7a bd eb     ld      r29,31424(r29)
          if (!__list_add_valid(new, prev, next))
  c000000000136bf0:     78 f3 c3 7f     mr      r3,r30
  c000000000136bf4:     78 e3 85 7f     mr      r5,r28
  c000000000136bf8:     78 eb a4 7f     mr      r4,r29
  c000000000136bfc:     fd 36 46 48     bl      c00000000059a2f8 <__list_add_valid+0x8>

Watchpoint hit at 0xc000000000136bec.

  addis   r29,r2,-19
   => r29 = 0xc000000001344e00 + (-19 << 16)
   => r29 = 0xc000000001214e00

  ld      r29,31424(r29)
   => r29 = *(0xc000000001214e00 + 31424)
   => r29 = *(0xc00000000121c8c0)

0xc00000000121c8c0 is where we placed a watchpoint and thus this
instruction was emulated by emulate_step. But because handle_dabr_fault
did not restore emulated register state, r29 still contains stale
value in above register state.
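The address arithmetic above can be reproduced mechanically. The following is a small sketch, not kernel code: the `addis` helper is a hypothetical stand-in that models only the `RA != 0` form of the instruction with 64-bit wraparound, and the input values are copied from the register dump and disassembly in this message.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def addis(ra_value, simm16):
    """Model of `addis RT,RA,SI` for RA != 0: RT = RA + (SI << 16)."""
    return (ra_value + (simm16 << 16)) & MASK64


r2 = 0xC000000001344E00          # GPR02 from the dump
stale_r29 = 0xC000000001214E00   # GPR29 as reported by WARN_ON()
watch_addr = 0xC00000000121C8C0  # address given to `perf record -e mem:`

# addis r29,r2,-19
r29_after_addis = addis(r2, -19)

# Effective address of `ld r29,31424(r29)`
load_addr = (r29_after_addis + 31424) & MASK64

print(hex(r29_after_addis))
print(hex(load_addr))
print(r29_after_addis == stale_r29, load_addr == watch_addr)
```

The value reported for GPR29 matches the result of the `addis` alone, which is what one would expect if the result of the emulated `ld` never reached the live register.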
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 9481a11..96de0d1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1753,7 +1753,7 @@ handle_dabr_fault:
         ld      r5,_DSISR(r1)
         addi    r3,r1,STACK_FRAME_OVERHEAD
         bl      do_break
-12:     b       ret_from_except_lite
+12:     b       ret_from_except
 
 
 #ifdef CONFIG_PPC_BOOK3S_64
Powerpc hw triggers watchpoint before executing the instruction.
To make trigger-after-execute behavior, kernel emulates the
instruction. If the instruction is 'load something into
non-volatile register', exception handler should restore emulated
register state while returning back, otherwise there will be
register state corruption. Ex, Adding a watchpoint on a list
can corrupt the list:

  # cat /proc/kallsyms | grep kthread_create_list
  c00000000121c8b8 d kthread_create_list

Add watchpoint on kthread_create_list->next:

  # perf record -e mem:0xc00000000121c8c0

Run some workload such that new kthread gets invoked. Ex, I
just logged out from console:

  list_add corruption. next->prev should be prev (c000000001214e00), \
        but was c00000000121c8b8. (next=c00000000121c8b8).
  WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
  CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
  ...
  NIP __list_add_valid+0xb4/0xc0
  LR __list_add_valid+0xb0/0xc0
  Call Trace:
  __list_add_valid+0xb0/0xc0 (unreliable)
  __kthread_create_on_node+0xe0/0x260
  kthread_create_on_node+0x34/0x50
  create_worker+0xe8/0x260
  worker_thread+0x444/0x560
  kthread+0x160/0x1a0
  ret_from_kernel_thread+0x5c/0x70

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
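The failure mode described in this commit message can be illustrated with a toy model. This is a deliberately simplified sketch and not the kernel's implementation: `handle_watchpoint` and its parameters are hypothetical names, the emulation is reduced to "write the loaded value into the saved register frame", and the only difference modelled between the two return paths is whether the non-volatile GPRs (r14 to r31 in the 64-bit PowerPC ELF ABI) are copied back from that frame.

```python
NVGPRS = range(14, 32)  # r14..r31: non-volatile GPRs in the 64-bit PowerPC ELF ABI


def handle_watchpoint(live, target_reg, loaded_value, restore_nvgprs):
    """Toy model of a watchpoint hit on a load; returns the live GPRs afterwards.

    live           -- list of 32 register values at the time of the fault
    target_reg     -- destination register of the faulting load
    loaded_value   -- value the emulated load reads from memory
    restore_nvgprs -- True models the full return path, False the "lite" one
    """
    saved = list(live)                # exception entry: registers saved to a frame
    saved[target_reg] = loaded_value  # emulation updates the saved frame only

    after = list(live)
    for reg in range(32):
        if reg in NVGPRS and not restore_nvgprs:
            continue                  # lite return: live non-volatile GPR left as-is
        after[reg] = saved[reg]       # register restored from the saved frame
    return after


live = [0] * 32
live[29] = 0xC000000001214E00  # r29 after the addis, before the faulting ld
loaded = 0x1234                # hypothetical value stored at the watched address

lite = handle_watchpoint(live, 29, loaded, restore_nvgprs=False)
full = handle_watchpoint(live, 29, loaded, restore_nvgprs=True)

print(hex(lite[29]))  # still the pre-load value: the emulated result was lost
print(hex(full[29]))  # the emulated load result
```

In this model a load into a volatile register such as r3 survives either return path, which is consistent with the bug only showing up when the compiler happens to pick a non-volatile destination register.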