Message ID | 20090511200355.GA17988@in.ibm.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Headers | show |
> Hi PPC Dev folks, > Please find a patch below that implements the proposed Hardware > Breakpoint interfaces for PPC64 architecture. > > As a brief introduction, the proposed Hardware Breakpoint > infrastructure provides generic in-kernel interfaces to which users > from kernel- and user-space can request for breakpoint registers. An > ftrace plugin that can trace accesses to data variables is also part > of the generic HW Breakpoint interface patchset. The latest submission > for this patchset along with an x86 implementation can be found here: > http://lkml.org/lkml/2009/5/11/159. > > The following are the salient features of the PPC64 patch. > > - Arch-specific definitions for kernel and user-space requests are > defined in arch/powerpc/kernel/hw_breakpoint.c > - Ptrace is converted to use the HW Breakpoint interfaces Will we fall back to use the old method if more than one address is being watched? > - The ftrace plugin called ksym_tracer is tested to work fine. For > instance when tracing pid_max kernel variable, the following was > obtained as output Could you split the patch into a few pieces which implement these different parts. The smaller logical chucks will make it easier to review? > > # cat trace > # tracer: ksym_tracer > # > # TASK-PID CPU# Symbol Type Function > # | | | | | > bash 4502 3 pid_max RW .do_proc_dointvec_minmax_ conv+0x78/0x10c > bash 4502 3 pid_max RW .do_proc_dointvec_minmax_ conv+0xa0/0x10c > bash 4502 3 pid_max RW .alloc_pid+0x8c/0x4a4 > > There are however a few limitations/caveats of the patch as identified > below: > > - The patch is currently implemented only for PPC64 architecture. Other > architectures (especially Book-E implementations are expected to > happen in due course). > > - HW Breakpoints over data addresses through Xmon (using "bd" command) > and the proposed HW Breakpoint interfaces can now operate in a > mutually exclusive manner. Xmon's integration is pending and is > dependant on successful triggering of breakpoints through "bd<ops>". > (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW > Breakpoints when tested). > > Kindly let me know your comments. > > Thanks, > K.Prasad > > > Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com> > --- > arch/powerpc/Kconfig | 1 > arch/powerpc/include/asm/hw_breakpoint.h | 52 +++++ > arch/powerpc/include/asm/processor.h | 1 > arch/powerpc/include/asm/reg.h | 2 > arch/powerpc/include/asm/thread_info.h | 2 > arch/powerpc/kernel/Makefile | 2 > arch/powerpc/kernel/hw_breakpoint.c | 271 ++++++++++++++++++++++++++++ +++ > arch/powerpc/kernel/process.c | 18 ++ > arch/powerpc/kernel/ptrace.c | 48 +++++ > arch/powerpc/mm/fault.c | 14 - > samples/hw_breakpoint/data_breakpoint.c | 4 > 12 files changed, 423 insertions(+), 9 deletions(-) You've not touched prace32.c. Can we use this for 32 bit apps on a 64 bit kernel? > > Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig > +++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig > @@ -125,6 +125,7 @@ config PPC > select USE_GENERIC_SMP_HELPERS if SMP > select HAVE_OPROFILE > select HAVE_SYSCALL_WRAPPERS if PPC64 > + select HAVE_HW_BREAKPOINT if PPC64 > > config EARLY_PRINTK > bool > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile > @@ -33,7 +33,7 @@ obj-$(CONFIG_PPC64) += setup_64.o sys_p > signal_64.o ptrace32.o \ > paca.o cpu_setup_ppc970.o \ > cpu_setup_pa6t.o \ > - firmware.o nvram_64.o > + firmware.o nvram_64.o hw_breakpoint.o > obj64-$(CONFIG_RELOCATABLE) += reloc_64.o > obj-$(CONFIG_PPC64) += vdso64/ > obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c > =================================================================== > --- /dev/null > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c > @@ -0,0 +1,271 @@ > +/* > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA . > + * > + * Copyright (C) 2009 IBM Corporation > + */ > + > +/* > + * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility, > + * using the CPU's debug registers. > + */ > + > +#include <linux/notifier.h> > +#include <linux/kallsyms.h> > +#include <linux/kprobes.h> > +#include <linux/percpu.h> > +#include <linux/kernel.h> > +#include <linux/module.h> > +#include <linux/sched.h> > +#include <linux/init.h> > +#include <linux/smp.h> > + > +#include <asm/hw_breakpoint.h> > +#include <asm/processor.h> > +#include <asm/sstep.h> > + > +/* Store the kernel-space breakpoint address value */ > +static unsigned long kdabr; > + > +/* > + * Temporarily stores address for DABR before it is written by the > + * single-step handler routine > + */ > +static DEFINE_PER_CPU(unsigned long, dabr_data); > + > +void arch_update_kernel_hw_breakpoint(void *unused) > +{ > + struct hw_breakpoint *bp; > + > + /* Check if there is nothing to update */ > + if (hbp_kernel_pos == HB_NUM) > + return; > + bp = hbp_kernel[hbp_kernel_pos]; > + if (bp == NULL) > + kdabr = 0; > + else > + kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION; > + set_dabr(kdabr); > +} > + > +/* > + * Install the thread breakpoints in their debug registers. > + */ > +void arch_install_thread_hw_breakpoint(struct task_struct *tsk) > +{ > + set_dabr(tsk->thread.dabr); > +} > + > +/* > + * Install the debug register values for just the kernel, no thread. > + */ > +void arch_uninstall_thread_hw_breakpoint() > +{ > + set_dabr(0); > +} > + > +/* > + * Check for virtual address in user space. > + */ > +int arch_check_va_in_userspace(unsigned long va, u8 hbp_len) > +{ > + return (va <= TASK_SIZE - HW_BREAKPOINT_LEN); > +} You pass ing hbp_len, but then use HW_BREAKPOINT_LEN? Is that right? > + > +/* > + * Check for virtual address in kernel space. > + */ > +int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len) > +{ > + return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE) ; Can you use is_kernel_addr() here? > +} > + > +/* > + * Store a breakpoint's encoded address, length, and type. > + */ > +int arch_store_info(struct hw_breakpoint *bp) > +{ > + /* > + * User-space requests will always have the address field populated > + * For kernel-addresses, either the address or symbol name can be > + * specified. > + */ > + if (bp->info.name) > + bp->info.address = (unsigned long) > + kallsyms_lookup_name(bp->info.name); > + if (bp->info.address) > + return 0; > + return -EINVAL; > +} > + > +/* > + * Validate the arch-specific HW Breakpoint register settings > + */ > +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, > + struct task_struct *tsk) > +{ > + int ret = -EINVAL; > + > + if (!bp) > + return ret; > + > + switch (bp->info.type) { > + case DABR_DATA_READ: > + break; > + case DABR_DATA_WRITE: > + break; > + case DABR_DATA_RW: > + break; > + default: > + return ret; > + } > + > + if (bp->triggered) > + ret = arch_store_info(bp); > + > + /* Check for double word alignment - 8 bytes */ > + if (bp->info.address & HW_BREAKPOINT_ALIGN) > + return -EINVAL; > + return ret; > +} > + > +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk) > +{ > + struct thread_struct *thread = &(tsk->thread); > + struct hw_breakpoint *bp = thread->hbp[0]; > + > + if (bp) > + thread->dabr = bp->info.address | bp->info.type | > + DABR_TRANSLATION; > + else > + thread->dabr = 0; > +} > + > +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk) > +{ > + struct thread_struct *thread = &(tsk->thread); > + > + thread->dabr = 0; > +} > + > +/* > + * Handle debug exception notifications. > + */ > +int __kprobes hw_breakpoint_handler(struct die_args *args) > +{ > + int rc = NOTIFY_STOP; > + struct hw_breakpoint *bp; > + struct pt_regs *regs = args->regs; > + unsigned long dar; > + int cpu, stepped, is_kernel; > + > + /* Disable breakpoints during exception handling */ > + set_dabr(0); > + > + dar = regs->dar & (~HW_BREAKPOINT_ALIGN); > + is_kernel = (dar >= TASK_SIZE) ? 1 : 0; > + > + if (is_kernel) > + bp = hbp_kernel[0]; > + else { > + bp = current->thread.hbp[0]; > + /* Lazy debug register switching */ > + if (!bp) > + return rc; > + rc = NOTIFY_DONE; > + } > + > + (bp->triggered)(bp, regs); > + > + cpu = get_cpu(); > + if (is_kernel) > + per_cpu(dabr_data, cpu) = kdabr; > + else > + per_cpu(dabr_data, cpu) = current->thread.dabr; > + > + stepped = emulate_step(regs, regs->nip); > + /* > + * Single-step the causative instruction manually if > + * emulate_step() could not execute it > + */ > + if (stepped == 0) { > + regs->msr |= MSR_SE; > + goto out; > + } This is where we backout to single step mode? > + > + set_dabr(per_cpu(dabr_data, cpu)); > +out: > + /* Enable pre-emption only if single-stepping is finished */ > + if (stepped) > + put_cpu_no_resched(); > + return rc; > +} > + +/* > + * Handle single-step exceptions following a DABR hit. > + */ > +int __kprobes single_step_dabr_instruction(struct die_args *args) > +{ > + struct pt_regs *regs = args->regs; > + int cpu = get_cpu(); > + int ret = NOTIFY_DONE; > + siginfo_t info; > + unsigned long this_dabr_data = per_cpu(dabr_data, cpu); > + > + /* > + * Check if we are single-stepping as a result of a > + * previous HW Breakpoint exception > + */ > + if (this_dabr_data == 0) > + goto out; > + > + regs->msr &= ~MSR_SE; > + /* Deliver signal to user-space */ > + if (this_dabr_data < TASK_SIZE) { > + info.si_signo = SIGTRAP; > + info.si_errno = 0; > + info.si_code = TRAP_HWBKPT; > + info.si_addr = (void __user *)(per_cpu(dabr_data, cpu)); > + force_sig_info(SIGTRAP, &info, current); > + } > + > + set_dabr(this_dabr_data); > + per_cpu(dabr_data, cpu) = 0; > + ret = NOTIFY_STOP; > + put_cpu_no_resched(); > + > +out: > + put_cpu_no_resched(); This looks wrong. put_cpu_no_resched() twice? > + return ret; > +} > + > +/* > + * Handle debug exception notifications. > + */ > +int __kprobes hw_breakpoint_exceptions_notify( > + struct notifier_block *unused, unsigned long val, void *data) > +{ > + int ret = NOTIFY_DONE; > + > + switch (val) { > + case DIE_DABR_MATCH: > + ret = hw_breakpoint_handler(data); > + break; > + case DIE_SSTEP: > + ret = single_step_dabr_instruction(data); > + break; > + } > + > + return ret; > +} > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h > =================================================================== > --- /dev/null > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h > @@ -0,0 +1,52 @@ > +#ifndef _PPC64_HW_BREAKPOINT_H > +#define _PPC64_HW_BREAKPOINT_H > + > +#ifdef __KERNEL__ > +#define __ARCH_HW_BREAKPOINT_H > + > +struct arch_hw_breakpoint { > + char *name; /* Contains name of the symbol to set bkpt */ > + unsigned long address; > + u8 type; > +}; Can you reorder this to pack the struct better (ie. put the unsigned long first). > + > +#include <linux/kdebug.h> > +#include <asm/reg.h> > +#include <asm-generic/hw_breakpoint.h> > + > +#define HW_BREAKPOINT_READ DABR_DATA_READ > +#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE > +#define HW_BREAKPOINT_RW DABR_DATA_RW > + > +#define HW_BREAKPOINT_ALIGN 0x7 > +#define HW_BREAKPOINT_LEN 4 What is HW_BREAKPOINT_LEN? Obviouslt all instructions on ppc64 are 4 bytes. This seems to be defined in a few places, like here and MCOUNT_INSN_SIZE (ftrace.h). Can you create a #define for this generically and get all refers to use this instead of #define-ing 4 everywhere. > + > +extern struct hw_breakpoint *hbp_kernel[HB_NUM]; > +extern unsigned int hbp_user_refcount[HB_NUM]; > + > +/* > + * Ptrace support: breakpoint trigger routine. > + */ > +extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk, > + struct hw_breakpoint *bp); > + > +extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk); > +extern void arch_uninstall_thread_hw_breakpoint(void); > +extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len); > +extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len); > +extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, > + struct task_struct *tsk); > +extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk) ; > +extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk); > +extern void arch_update_kernel_hw_breakpoint(void *); > +extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused, > + unsigned long val, void *data); > + > +extern void flush_thread_hw_breakpoint(struct task_struct *tsk); > +extern int copy_thread_hw_breakpoint(struct task_struct *tsk, > + struct task_struct *child, unsigned long clone_flags); > +extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk); > + > +#endif /* __KERNEL__ */ > +#endif /* _PPC64_HW_BREAKPOINT_H */ > + > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h > @@ -177,6 +177,7 @@ struct thread_struct { > #ifdef CONFIG_PPC64 > unsigned long start_tb; /* Start purr when proc switched in */ > unsigned long accum_tb; /* Total accumilated purr for process * / > + struct hw_breakpoint *hbp[HB_NUM]; > #endif > unsigned long dabr; /* Data address breakpoint register */ > #ifdef CONFIG_ALTIVEC > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c > @@ -37,6 +37,9 @@ > #include <asm/page.h> > #include <asm/pgtable.h> > #include <asm/system.h> > +#ifdef CONFIG_PPC64 > +#include <asm/hw_breakpoint.h> > +#endif > > /* > * does not yet catch signals sent when the child dies. > @@ -735,9 +738,22 @@ void user_disable_single_step(struct tas > clear_tsk_thread_flag(task, TIF_SINGLESTEP); > } > > +static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs) > +{ > + /* > + * The SIGTRAP signal is generated automatically for us in do_dabr(). > + * We don't have to do anything here > + */ > +} > + > int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, > unsigned long data) > { > +#ifdef CONFIG_PPC64 > + struct thread_struct *thread = &(task->thread); > + struct hw_breakpoint *bp; > + int ret; > +#endif > /* For ppc64 we support one DABR and no IABR's at the moment (ppc64). > * For embedded processors we support one DAC and no IAC's at the > * moment. > @@ -767,6 +783,38 @@ int ptrace_set_debugreg(struct task_stru > if (data && !(data & DABR_TRANSLATION)) > return -EIO; > > +#ifdef CONFIG_PPC64 > + bp = thread->hbp[0]; > + if ((data & ~0x7UL) == 0) { Should use HW_BREAKPOINT_ALIGN here instead of 0x7? Is 0 some special command? Seems to be lots of special numbers here related to using data (later you mask is 0x3). What is happening here? What is contained in the data parameter? It overloads the type and the address? > + if (bp) { > + unregister_user_hw_breakpoint(task, bp); > + kfree(bp); > + thread->hbp[0] = NULL; > + } > + return 0; > + } > + > + if (bp) { > + bp->info.type = data & 0x3UL; > + task->thread.dabr = bp->info.address = > + (data & ~HW_BREAKPOINT_ALIGN); > + return __modify_user_hw_breakpoint(0, task, bp); > + } > + bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL); When a processes ends, will this be correctly freed? Should it be freed in arch_flush_thread_hw_breakpoint? > + if (!bp) > + return -ENOMEM; > + > + /* Store the type of breakpoint */ > + bp->info.type = data & 0x3UL; > + bp->triggered = ptrace_triggered; > + task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN); > + > + ret = register_user_hw_breakpoint(task, bp); > + if (ret) > + return ret; > + set_tsk_thread_flag(task, TIF_DEBUG); > +#endif /* CONFIG_PPC64 */ > + > /* Move contents to the DABR register */ > task->thread.dabr = data; > > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c > @@ -50,6 +50,7 @@ > #include <asm/syscalls.h> > #ifdef CONFIG_PPC64 > #include <asm/firmware.h> > +#include <asm/hw_breakpoint.h> > #endif > #include <linux/kprobes.h> > #include <linux/kdebug.h> > @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig > 11, SIGSEGV) == NOTIFY_STOP) > return; > > +#ifndef CONFIG_PPC64 > if (debugger_dabr_match(regs)) > return; > +#endif > > /* Clear the DAC and struct entries. One shot trigger */ > #if defined(CONFIG_BOOKE) > @@ -372,8 +375,13 @@ struct task_struct *__switch_to(struct t > > #endif /* CONFIG_SMP */ > > +#ifdef CONFIG_PPC64 > + if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG))) > + switch_to_thread_hw_breakpoint(new); > +#else > if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr)) > set_dabr(new->thread.dabr); > +#endif /* CONFIG_PPC64 */ > > #if defined(CONFIG_BOOKE) > /* If new thread DAC (HW breakpoint) is the same then leave it */ > @@ -550,6 +558,10 @@ void show_regs(struct pt_regs * regs) > void exit_thread(void) > { > discard_lazy_cpu_state(); > +#ifdef CONFIG_PPC64 > + if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG))) > + flush_thread_hw_breakpoint(current); > +#endif /* CONFIG_PPC64 */ > } > > void flush_thread(void) > @@ -605,6 +617,9 @@ int copy_thread(unsigned long clone_flag > struct pt_regs *childregs, *kregs; > extern void ret_from_fork(void); > unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE; > +#ifdef CONFIG_PPC64 > + struct task_struct *tsk = current; > +#endif > > CHECK_FULL_REGS(regs); > /* Copy registers */ > @@ -672,6 +687,9 @@ int copy_thread(unsigned long clone_flag > * function. > */ > kregs->nip = *((unsigned long *)ret_from_fork); > + > + if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG))) > + copy_thread_hw_breakpoint(tsk, p, clone_flags); > #else > kregs->nip = (unsigned long)ret_from_fork; > #endif > Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c > =================================================================== > --- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c > +++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c > @@ -54,6 +54,10 @@ static int __init hw_break_module_init(v > sample_hbp.info.type = HW_BREAKPOINT_WRITE; > sample_hbp.info.len = HW_BREAKPOINT_LEN_4; > #endif /* CONFIG_X86 */ > +#ifdef CONFIG_PPC64 > + sample_hbp.info.name = ksym_name; > + sample_hbp.info.type = DABR_DATA_WRITE; > +#endif /* CONFIG_PPC64 */ > > sample_hbp.triggered = (void *)sample_hbp_handler; > > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h > @@ -114,6 +114,7 @@ static inline struct thread_info *curren > #define TIF_FREEZE 14 /* Freezing for suspend */ > #define TIF_RUNLATCH 15 /* Is the runlatch enabled? */ > #define TIF_ABI_PENDING 16 /* 32/64 bit switch needed */ > +#define TIF_DEBUG 17 /* uses debug registers */ > > /* as above, but as bit values */ > #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) > @@ -132,6 +133,7 @@ static inline struct thread_info *curren > #define _TIF_FREEZE (1<<TIF_FREEZE) > #define _TIF_RUNLATCH (1<<TIF_RUNLATCH) > #define _TIF_ABI_PENDING (1<<TIF_ABI_PENDING) > +#define _TIF_DEBUG (1<<TIF_DEBUG) > #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SEC COMP) > > #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h > @@ -184,9 +184,11 @@ > #define CTRL_TE 0x00c00000 /* thread enable */ > #define CTRL_RUNLATCH 0x1 > #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ > +#define HB_NUM 1 /* Number of physical HW breakpoint registers * / You've shortedned "hardware breakpoint" to "hbp" elsewhere. Can you be consistent here so change this to HBP_NUM? > #define DABR_TRANSLATION (1UL << 2) > #define DABR_DATA_WRITE (1UL << 1) > #define DABR_DATA_READ (1UL << 0) > +#define DABR_DATA_RW (3UL << 0) > #define SPRN_DABR2 0x13D /* e300 */ > #define SPRN_DABRX 0x3F7 /* Data Address Breakpoint Register Extension * / > #define DABRX_USER (1UL << 0) > Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > =================================================================== > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c > +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re > error_code &= 0x48200000; > else > is_write = error_code & DSISR_ISSTORE; > + > + if (error_code & DSISR_DABRMATCH) { > + /* DABR match */ > + do_dabr(regs, address, error_code); > + return 0; > + } > #else > is_write = error_code & ESR_DST; > #endif /* CONFIG_4xx || CONFIG_BOOKE */ > @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re > if (!user_mode(regs) && (address >= TASK_SIZE)) > return SIGSEGV; > > -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) > - if (error_code & DSISR_DABRMATCH) { > - /* DABR match */ > - do_dabr(regs, address, error_code); > - return 0; > - } > -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/ > - > if (in_atomic() || mm == NULL) { > if (!user_mode(regs)) > return SIGSEGV; > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc-dev >
> > +struct arch_hw_breakpoint { > > + char *name; /* Contains name of the symbol to set bkpt */ > > + unsigned long address; > > + u8 type; > > +}; > > Can you reorder this to pack the struct better (ie. put the unsigned > long first). Oops, this is a char * not just a char. Order is fine. Mikey
On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: >Hi PPC Dev folks, > Please find a patch below that implements the proposed Hardware >Breakpoint interfaces for PPC64 architecture. > >As a brief introduction, the proposed Hardware Breakpoint >infrastructure provides generic in-kernel interfaces to which users >from kernel- and user-space can request for breakpoint registers. An >ftrace plugin that can trace accesses to data variables is also part >of the generic HW Breakpoint interface patchset. The latest submission >for this patchset along with an x86 implementation can be found here: >http://lkml.org/lkml/2009/5/11/159. > >The following are the salient features of the PPC64 patch. > >- Arch-specific definitions for kernel and user-space requests are > defined in arch/powerpc/kernel/hw_breakpoint.c >- Ptrace is converted to use the HW Breakpoint interfaces >- The ftrace plugin called ksym_tracer is tested to work fine. For > instance when tracing pid_max kernel variable, the following was > obtained as output > ># cat trace ># tracer: ksym_tracer ># ># TASK-PID CPU# Symbol Type Function ># | | | | | >bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0x78/0x10c >bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0xa0/0x10c >bash 4502 3 pid_max RW .alloc_pid+0x8c/0x4a4 > >There are however a few limitations/caveats of the patch as identified >below: > >- The patch is currently implemented only for PPC64 architecture. Other > architectures (especially Book-E implementations are expected to > happen in due course). Does this mean you will work on transitioning Book-E implementations, or that you expect the Book-E maintainers to? I'm just curious. The code as written relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately doesn't follow that at all. Book-E also allows for more than one HW breakpoint, which means you're growing the thread_struct by 32-bytes to support 4 of them. 64-bytes if this ever supports DAC events. Have you thought at all about a way to support this without carrying around the data in the thread struct? <snip> >Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c >=================================================================== >--- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c >+++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c >@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re > error_code &= 0x48200000; > else > is_write = error_code & DSISR_ISSTORE; >+ >+ if (error_code & DSISR_DABRMATCH) { >+ /* DABR match */ >+ do_dabr(regs, address, error_code); >+ return 0; >+ } > #else > is_write = error_code & ESR_DST; > #endif /* CONFIG_4xx || CONFIG_BOOKE */ >@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re > if (!user_mode(regs) && (address >= TASK_SIZE)) > return SIGSEGV; > >-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) >- if (error_code & DSISR_DABRMATCH) { >- /* DABR match */ >- do_dabr(regs, address, error_code); >- return 0; >- } >-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/ >- > if (in_atomic() || mm == NULL) { > if (!user_mode(regs)) > return SIGSEGV; I don't understand why this was changed, and the changelog doesn't highlight it. josh
On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: > >- The patch is currently implemented only for PPC64 architecture. Other > > architectures (especially Book-E implementations are expected to > > happen in due course). > > Does this mean you will work on transitioning Book-E implementations, or that > you expect the Book-E maintainers to? I'm just curious. The code as written > relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately > doesn't follow that at all. And since there will eventually be 64-bit book E chips, we need to use something more specific than CONFIG_PPC64 in the #ifdef (easier now than figuring out which ones are breakpoint-related later). -Scott
On Tue, May 12, 2009 at 10:56:04AM +1000, Michael Neuling wrote: > > Hi PPC Dev folks, > > Please find a patch below that implements the proposed Hardware > > Breakpoint interfaces for PPC64 architecture. > > > > As a brief introduction, the proposed Hardware Breakpoint > > infrastructure provides generic in-kernel interfaces to which users > > from kernel- and user-space can request for breakpoint registers. An > > ftrace plugin that can trace accesses to data variables is also part > > of the generic HW Breakpoint interface patchset. The latest submission > > for this patchset along with an x86 implementation can be found here: > > http://lkml.org/lkml/2009/5/11/159. > > > > The following are the salient features of the PPC64 patch. > > > > - Arch-specific definitions for kernel and user-space requests are > > defined in arch/powerpc/kernel/hw_breakpoint.c > > - Ptrace is converted to use the HW Breakpoint interfaces > > Will we fall back to use the old method if more than one address is > being watched? > Assuming that you mean HW Breakpoints being used by multiple processes when you say "more than one address is being watched" - the proposed infrastructure can take care of such requests. During context switching in __switch_to() the DABR values of incoming process is restored and is valid until another process using DABR is scheduled again. > > - The ftrace plugin called ksym_tracer is tested to work fine. For > > instance when tracing pid_max kernel variable, the following was > > obtained as output > > Could you split the patch into a few pieces which implement these > different parts. The smaller logical chucks will make it easier to > review? > Sure. I will slice the patches in the next iteration of posting. > > > > # cat trace > > # tracer: ksym_tracer > > # > > # TASK-PID CPU# Symbol Type Function > > # | | | | | > > bash 4502 3 pid_max RW .do_proc_dointvec_minmax_ > conv+0x78/0x10c > > bash 4502 3 pid_max RW .do_proc_dointvec_minmax_ > conv+0xa0/0x10c > > bash 4502 3 pid_max RW .alloc_pid+0x8c/0x4a4 > > > > There are however a few limitations/caveats of the patch as identified > > below: > > > > - The patch is currently implemented only for PPC64 architecture. Other > > architectures (especially Book-E implementations are expected to > > happen in due course). > > > > - HW Breakpoints over data addresses through Xmon (using "bd" command) > > and the proposed HW Breakpoint interfaces can now operate in a > > mutually exclusive manner. Xmon's integration is pending and is > > dependant on successful triggering of breakpoints through "bd<ops>". > > (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW > > Breakpoints when tested). > > > > Kindly let me know your comments. > > > > Thanks, > > K.Prasad > > > > > > Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com> > > --- > > arch/powerpc/Kconfig | 1 > > arch/powerpc/include/asm/hw_breakpoint.h | 52 +++++ > > arch/powerpc/include/asm/processor.h | 1 > > arch/powerpc/include/asm/reg.h | 2 > > arch/powerpc/include/asm/thread_info.h | 2 > > arch/powerpc/kernel/Makefile | 2 > > arch/powerpc/kernel/hw_breakpoint.c | 271 ++++++++++++++++++++++++++++ > +++ > > arch/powerpc/kernel/process.c | 18 ++ > > arch/powerpc/kernel/ptrace.c | 48 +++++ > > arch/powerpc/mm/fault.c | 14 - > > samples/hw_breakpoint/data_breakpoint.c | 4 > > 12 files changed, 423 insertions(+), 9 deletions(-) > > You've not touched prace32.c. Can we use this for 32 bit apps on a 64 > bit kernel? > Given that PTRACE_SET_DEBUGREG invokes arch_ptrace() in ptrace32.c, I don't see why it cannot work (although I haven't tested it yet). > > > > Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig > > +++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig > > @@ -125,6 +125,7 @@ config PPC > > select USE_GENERIC_SMP_HELPERS if SMP > > select HAVE_OPROFILE > > select HAVE_SYSCALL_WRAPPERS if PPC64 > > + select HAVE_HW_BREAKPOINT if PPC64 > > > > config EARLY_PRINTK > > bool > > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile > > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile > > @@ -33,7 +33,7 @@ obj-$(CONFIG_PPC64) += setup_64.o sys_p > > signal_64.o ptrace32.o \ > > paca.o cpu_setup_ppc970.o \ > > cpu_setup_pa6t.o \ > > - firmware.o nvram_64.o > > + firmware.o nvram_64.o hw_breakpoint.o > > obj64-$(CONFIG_RELOCATABLE) += reloc_64.o > > obj-$(CONFIG_PPC64) += vdso64/ > > obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o > > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c > > =================================================================== > > --- /dev/null > > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c > > @@ -0,0 +1,271 @@ > > +/* > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License as published by > > + * the Free Software Foundation; either version 2 of the License, or > > + * (at your option) any later version. > > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public License > > + * along with this program; if not, write to the Free Software > > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA > . > > + * > > + * Copyright (C) 2009 IBM Corporation > > + */ > > + > > +/* > > + * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility, > > + * using the CPU's debug registers. > > + */ > > + > > +#include <linux/notifier.h> > > +#include <linux/kallsyms.h> > > +#include <linux/kprobes.h> > > +#include <linux/percpu.h> > > +#include <linux/kernel.h> > > +#include <linux/module.h> > > +#include <linux/sched.h> > > +#include <linux/init.h> > > +#include <linux/smp.h> > > + > > +#include <asm/hw_breakpoint.h> > > +#include <asm/processor.h> > > +#include <asm/sstep.h> > > + > > +/* Store the kernel-space breakpoint address value */ > > +static unsigned long kdabr; > > + > > +/* > > + * Temporarily stores address for DABR before it is written by the > > + * single-step handler routine > > + */ > > +static DEFINE_PER_CPU(unsigned long, dabr_data); > > + > > +void arch_update_kernel_hw_breakpoint(void *unused) > > +{ > > + struct hw_breakpoint *bp; > > + > > + /* Check if there is nothing to update */ > > + if (hbp_kernel_pos == HB_NUM) > > + return; > > + bp = hbp_kernel[hbp_kernel_pos]; > > + if (bp == NULL) > > + kdabr = 0; > > + else > > + kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION; > > + set_dabr(kdabr); > > +} > > + > > +/* > > + * Install the thread breakpoints in their debug registers. > > + */ > > +void arch_install_thread_hw_breakpoint(struct task_struct *tsk) > > +{ > > + set_dabr(tsk->thread.dabr); > > +} > > + > > +/* > > + * Install the debug register values for just the kernel, no thread. > > + */ > > +void arch_uninstall_thread_hw_breakpoint() > > +{ > > + set_dabr(0); > > +} > > + > > +/* > > + * Check for virtual address in user space. > > + */ > > +int arch_check_va_in_userspace(unsigned long va, u8 hbp_len) > > +{ > > + return (va <= TASK_SIZE - HW_BREAKPOINT_LEN); > > +} > > You pass ing hbp_len, but then use HW_BREAKPOINT_LEN? Is that right? > The 'hbp_len' parameter is useful only when the host processor can watch for varying range of addresses, which is not the case for PPC64. I guess I should remove the parameter (although it may be required for other PowerPC implementations). > > + > > +/* > > + * Check for virtual address in kernel space. > > + */ > > +int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len) > > +{ > > + return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE) > ; > > Can you use is_kernel_addr() here? > The above routine does more checks than is_kernel_addr() i.e. ensures that the last address (basically DABR address + address range monitored) also lies in kernel-space. However I agree that it isn't significant in PPC64 which has an alignment requirement that is greater than the range of addresses monitored. I will use arch_check_va_in_kernelspace() as a wrapper around is_kernel_addr() for now. > > +} > > + > > +/* > > + * Store a breakpoint's encoded address, length, and type. > > + */ > > +int arch_store_info(struct hw_breakpoint *bp) > > +{ > > + /* > > + * User-space requests will always have the address field populated > > + * For kernel-addresses, either the address or symbol name can be > > + * specified. > > + */ > > + if (bp->info.name) > > + bp->info.address = (unsigned long) > > + kallsyms_lookup_name(bp->info.name); > > + if (bp->info.address) > > + return 0; > > + return -EINVAL; > > +} > > + > > +/* > > + * Validate the arch-specific HW Breakpoint register settings > > + */ > > +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, > > + struct task_struct *tsk) > > +{ > > + int ret = -EINVAL; > > + > > + if (!bp) > > + return ret; > > + > > + switch (bp->info.type) { > > + case DABR_DATA_READ: > > + break; > > + case DABR_DATA_WRITE: > > + break; > > + case DABR_DATA_RW: > > + break; > > + default: > > + return ret; > > + } > > + > > + if (bp->triggered) > > + ret = arch_store_info(bp); > > + > > + /* Check for double word alignment - 8 bytes */ > > + if (bp->info.address & HW_BREAKPOINT_ALIGN) > > + return -EINVAL; > > + return ret; > > +} > > + > > +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk) > > +{ > > + struct thread_struct *thread = &(tsk->thread); > > + struct hw_breakpoint *bp = thread->hbp[0]; > > + > > + if (bp) > > + thread->dabr = bp->info.address | bp->info.type | > > + DABR_TRANSLATION; > > + else > > + thread->dabr = 0; > > +} > > + > > +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk) > > +{ > > + struct thread_struct *thread = &(tsk->thread); > > + > > + thread->dabr = 0; > > +} > > + > > +/* > > + * Handle debug exception notifications. > > + */ > > +int __kprobes hw_breakpoint_handler(struct die_args *args) > > +{ > > + int rc = NOTIFY_STOP; > > + struct hw_breakpoint *bp; > > + struct pt_regs *regs = args->regs; > > + unsigned long dar; > > + int cpu, stepped, is_kernel; > > + > > + /* Disable breakpoints during exception handling */ > > + set_dabr(0); > > + > > + dar = regs->dar & (~HW_BREAKPOINT_ALIGN); > > + is_kernel = (dar >= TASK_SIZE) ? 1 : 0; > > + > > + if (is_kernel) > > + bp = hbp_kernel[0]; > > + else { > > + bp = current->thread.hbp[0]; > > + /* Lazy debug register switching */ > > + if (!bp) > > + return rc; > > + rc = NOTIFY_DONE; > > + } > > + > > + (bp->triggered)(bp, regs); > > + > > + cpu = get_cpu(); > > + if (is_kernel) > > + per_cpu(dabr_data, cpu) = kdabr; > > + else > > + per_cpu(dabr_data, cpu) = current->thread.dabr; > > + > > + stepped = emulate_step(regs, regs->nip); > > + /* > > + * Single-step the causative instruction manually if > > + * emulate_step() could not execute it > > + */ > > + if (stepped == 0) { > > + regs->msr |= MSR_SE; > > + goto out; > > + } > > This is where we backout to single step mode? > Yes, if emulate_step() has failed us we do manual single-stepping. > > + > > + set_dabr(per_cpu(dabr_data, cpu)); > > +out: > > + /* Enable pre-emption only if single-stepping is finished */ > > + if (stepped) > > + put_cpu_no_resched(); > > + return rc; > > +} > > + > +/* > > + * Handle single-step exceptions following a DABR hit. > > + */ > > +int __kprobes single_step_dabr_instruction(struct die_args *args) > > +{ > > + struct pt_regs *regs = args->regs; > > + int cpu = get_cpu(); > > + int ret = NOTIFY_DONE; > > + siginfo_t info; > > + unsigned long this_dabr_data = per_cpu(dabr_data, cpu); > > + > > + /* > > + * Check if we are single-stepping as a result of a > > + * previous HW Breakpoint exception > > + */ > > + if (this_dabr_data == 0) > > + goto out; > > + > > + regs->msr &= ~MSR_SE; > > + /* Deliver signal to user-space */ > > + if (this_dabr_data < TASK_SIZE) { > > + info.si_signo = SIGTRAP; > > + info.si_errno = 0; > > + info.si_code = TRAP_HWBKPT; > > + info.si_addr = (void __user *)(per_cpu(dabr_data, cpu)); > > + force_sig_info(SIGTRAP, &info, current); > > + } > > + > > + set_dabr(this_dabr_data); > > + per_cpu(dabr_data, cpu) = 0; > > + ret = NOTIFY_STOP; > > + put_cpu_no_resched(); > > + > > +out: > > + put_cpu_no_resched(); > > This looks wrong. put_cpu_no_resched() twice? > single_step_dabr_instruction() is interested in notifications from notify_die when DIE_SSTEP is the reason. This means it is invoked when single stepping occurs due to other requests (such as kprobes/xmon) and not just after hw_breakpoint_handler(). So, a pair of "int cpu = get_cpu();" and "put_cpu_no_resched()" are meant for the above case. Secondly, we want to atomically single-step the causative instruction after a HW Breakpoint exception and hence put_cpu_no_resched() is invoked only in single_step_dabr_instruction() and not in hw_breakpoint_handler() when single-stepping manually. Alternatively, smp_processor_id() can be used if preemption is already disabled and avoid a get_cpu() call but that just adds more code without any apparent benefit. Hence the put_cpu_no_resched() twice. Let me know if you can think of something better. > > + return ret; > > +} > > + > > +/* > > + * Handle debug exception notifications. > > + */ > > +int __kprobes hw_breakpoint_exceptions_notify( > > + struct notifier_block *unused, unsigned long val, void *data) > > +{ > > + int ret = NOTIFY_DONE; > > + > > + switch (val) { > > + case DIE_DABR_MATCH: > > + ret = hw_breakpoint_handler(data); > > + break; > > + case DIE_SSTEP: > > + ret = single_step_dabr_instruction(data); > > + break; > > + } > > + > > + return ret; > > +} > > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h > > =================================================================== > > --- /dev/null > > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h > > @@ -0,0 +1,52 @@ > > +#ifndef _PPC64_HW_BREAKPOINT_H > > +#define _PPC64_HW_BREAKPOINT_H > > + > > +#ifdef __KERNEL__ > > +#define __ARCH_HW_BREAKPOINT_H > > + > > +struct arch_hw_breakpoint { > > + char *name; /* Contains name of the symbol to set bkpt */ > > + unsigned long address; > > + u8 type; > > +}; > > Can you reorder this to pack the struct better (ie. put the unsigned > long first). > > > + > > +#include <linux/kdebug.h> > > +#include <asm/reg.h> > > +#include <asm-generic/hw_breakpoint.h> > > + > > +#define HW_BREAKPOINT_READ DABR_DATA_READ > > +#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE > > +#define HW_BREAKPOINT_RW DABR_DATA_RW > > + > > +#define HW_BREAKPOINT_ALIGN 0x7 > > +#define HW_BREAKPOINT_LEN 4 > > What is HW_BREAKPOINT_LEN? > > Obviouslt all instructions on ppc64 are 4 bytes. This seems to be > defined in a few places, like here and MCOUNT_INSN_SIZE (ftrace.h). Can > you create a #define for this generically and get all refers to use this > instead of #define-ing 4 everywhere. > Agreed. A macro definition should have ideally existed in "arch/powerpc/include/asm/reg.h" already but I found none. Through a separate patch, I will define INSTRUCTION_LEN in reg.h and define HW_BREAKPOINT_LEN to INSTRUCTION_LEN (because the former gels well with the surrounding HW Breakpoint related name-space). > > + > > +extern struct hw_breakpoint *hbp_kernel[HB_NUM]; > > +extern unsigned int hbp_user_refcount[HB_NUM]; > > + > > +/* > > + * Ptrace support: breakpoint trigger routine. > > + */ > > +extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk, > > + struct hw_breakpoint *bp); > > + > > +extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk); > > +extern void arch_uninstall_thread_hw_breakpoint(void); > > +extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len); > > +extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len); > > +extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, > > + struct task_struct *tsk); > > +extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk) > ; > > +extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk); > > +extern void arch_update_kernel_hw_breakpoint(void *); > > +extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused, > > + unsigned long val, void *data); > > + > > +extern void flush_thread_hw_breakpoint(struct task_struct *tsk); > > +extern int copy_thread_hw_breakpoint(struct task_struct *tsk, > > + struct task_struct *child, unsigned long clone_flags); > > +extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk); > > + > > +#endif /* __KERNEL__ */ > > +#endif /* _PPC64_HW_BREAKPOINT_H */ > > + > > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h > > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h > > @@ -177,6 +177,7 @@ struct thread_struct { > > #ifdef CONFIG_PPC64 > > unsigned long start_tb; /* Start purr when proc switched in */ > > unsigned long accum_tb; /* Total accumilated purr for process * > / > > + struct hw_breakpoint *hbp[HB_NUM]; > > #endif > > unsigned long dabr; /* Data address breakpoint register */ > > #ifdef CONFIG_ALTIVEC > > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c > > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c > > @@ -37,6 +37,9 @@ > > #include <asm/page.h> > > #include <asm/pgtable.h> > > #include <asm/system.h> > > +#ifdef CONFIG_PPC64 > > +#include <asm/hw_breakpoint.h> > > +#endif > > > > /* > > * does not yet catch signals sent when the child dies. > > @@ -735,9 +738,22 @@ void user_disable_single_step(struct tas > > clear_tsk_thread_flag(task, TIF_SINGLESTEP); > > } > > > > +static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs) > > +{ > > + /* > > + * The SIGTRAP signal is generated automatically for us in do_dabr(). > > + * We don't have to do anything here > > + */ > > +} > > + > > int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, > > unsigned long data) > > { > > +#ifdef CONFIG_PPC64 > > + struct thread_struct *thread = &(task->thread); > > + struct hw_breakpoint *bp; > > + int ret; > > +#endif > > /* For ppc64 we support one DABR and no IABR's at the moment (ppc64). > > * For embedded processors we support one DAC and no IAC's at the > > * moment. > > @@ -767,6 +783,38 @@ int ptrace_set_debugreg(struct task_stru > > if (data && !(data & DABR_TRANSLATION)) > > return -EIO; > > > > +#ifdef CONFIG_PPC64 > > + bp = thread->hbp[0]; > > + if ((data & ~0x7UL) == 0) { > > Should use HW_BREAKPOINT_ALIGN here instead of 0x7? Is 0 some special > command? Seems to be lots of special numbers here related to using data > (later you mask is 0x3). What is happening here? What is contained in > the data parameter? It overloads the type and the address? > I agree that HW_BREAKPOINT_ALIGN must have been used here and will eliminate 0x3 using DABR_DATA_RW. 'data' variable is a composite of address | translation_enabled | type. > > + if (bp) { > > + unregister_user_hw_breakpoint(task, bp); > > + kfree(bp); > > + thread->hbp[0] = NULL; > > + } > > + return 0; > > + } > > + > > + if (bp) { > > + bp->info.type = data & 0x3UL; > > + task->thread.dabr = bp->info.address = > > + (data & ~HW_BREAKPOINT_ALIGN); > > + return __modify_user_hw_breakpoint(0, task, bp); > > + } > > + bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL); > > When a processes ends, will this be correctly freed? Should it be freed > in arch_flush_thread_hw_breakpoint? > It is freed in flush_thread_hw_breakpoint() which is a part of the proposed kernel/hw_breakpoint.c (http://lkml.org/lkml/2009/5/11/159). > > + if (!bp) > > + return -ENOMEM; > > + > > + /* Store the type of breakpoint */ > > + bp->info.type = data & 0x3UL; > > + bp->triggered = ptrace_triggered; > > + task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN); > > + > > + ret = register_user_hw_breakpoint(task, bp); > > + if (ret) > > + return ret; > > + set_tsk_thread_flag(task, TIF_DEBUG); > > +#endif /* CONFIG_PPC64 */ > > + > > /* Move contents to the DABR register */ > > task->thread.dabr = data; > > > > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c > > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c > > @@ -50,6 +50,7 @@ > > #include <asm/syscalls.h> > > #ifdef CONFIG_PPC64 > > #include <asm/firmware.h> > > +#include <asm/hw_breakpoint.h> > > #endif > > #include <linux/kprobes.h> > > #include <linux/kdebug.h> > > @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig > > 11, SIGSEGV) == NOTIFY_STOP) > > return; > > > > +#ifndef CONFIG_PPC64 > > if (debugger_dabr_match(regs)) > > return; > > +#endif > > > > /* Clear the DAC and struct entries. One shot trigger */ > > #if defined(CONFIG_BOOKE) > > @@ -372,8 +375,13 @@ struct task_struct *__switch_to(struct t > > > > #endif /* CONFIG_SMP */ > > > > +#ifdef CONFIG_PPC64 > > + if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG))) > > + switch_to_thread_hw_breakpoint(new); > > +#else > > if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr)) > > set_dabr(new->thread.dabr); > > +#endif /* CONFIG_PPC64 */ > > > > #if defined(CONFIG_BOOKE) > > /* If new thread DAC (HW breakpoint) is the same then leave it */ > > @@ -550,6 +558,10 @@ void show_regs(struct pt_regs * regs) > > void exit_thread(void) > > { > > discard_lazy_cpu_state(); > > +#ifdef CONFIG_PPC64 > > + if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG))) > > + flush_thread_hw_breakpoint(current); > > +#endif /* CONFIG_PPC64 */ > > } > > > > void flush_thread(void) > > @@ -605,6 +617,9 @@ int copy_thread(unsigned long clone_flag > > struct pt_regs *childregs, *kregs; > > extern void ret_from_fork(void); > > unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE; > > +#ifdef CONFIG_PPC64 > > + struct task_struct *tsk = current; > > +#endif > > > > CHECK_FULL_REGS(regs); > > /* Copy registers */ > > @@ -672,6 +687,9 @@ int copy_thread(unsigned long clone_flag > > * function. > > */ > > kregs->nip = *((unsigned long *)ret_from_fork); > > + > > + if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG))) > > + copy_thread_hw_breakpoint(tsk, p, clone_flags); > > #else > > kregs->nip = (unsigned long)ret_from_fork; > > #endif > > Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c > > +++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c > > @@ -54,6 +54,10 @@ static int __init hw_break_module_init(v > > sample_hbp.info.type = HW_BREAKPOINT_WRITE; > > sample_hbp.info.len = HW_BREAKPOINT_LEN_4; > > #endif /* CONFIG_X86 */ > > +#ifdef CONFIG_PPC64 > > + sample_hbp.info.name = ksym_name; > > + sample_hbp.info.type = DABR_DATA_WRITE; > > +#endif /* CONFIG_PPC64 */ > > > > sample_hbp.triggered = (void *)sample_hbp_handler; > > > > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h > > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h > > @@ -114,6 +114,7 @@ static inline struct thread_info *curren > > #define TIF_FREEZE 14 /* Freezing for suspend */ > > #define TIF_RUNLATCH 15 /* Is the runlatch enabled? */ > > #define TIF_ABI_PENDING 16 /* 32/64 bit switch needed */ > > +#define TIF_DEBUG 17 /* uses debug registers */ > > > > /* as above, but as bit values */ > > #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) > > @@ -132,6 +133,7 @@ static inline struct thread_info *curren > > #define _TIF_FREEZE (1<<TIF_FREEZE) > > #define _TIF_RUNLATCH (1<<TIF_RUNLATCH) > > #define _TIF_ABI_PENDING (1<<TIF_ABI_PENDING) > > +#define _TIF_DEBUG (1<<TIF_DEBUG) > > #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SEC > COMP) > > > > #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ > > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h > > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h > > @@ -184,9 +184,11 @@ > > #define CTRL_TE 0x00c00000 /* thread enable */ > > #define CTRL_RUNLATCH 0x1 > > #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ > > +#define HB_NUM 1 /* Number of physical HW breakpoint registers * > / > > You've shortedned "hardware breakpoint" to "hbp" elsewhere. Can you be > consistent here so change this to HBP_NUM? > Thanks for pointing it out! I will change. > > #define DABR_TRANSLATION (1UL << 2) > > #define DABR_DATA_WRITE (1UL << 1) > > #define DABR_DATA_READ (1UL << 0) > > +#define DABR_DATA_RW (3UL << 0) > > #define SPRN_DABR2 0x13D /* e300 */ > > #define SPRN_DABRX 0x3F7 /* Data Address Breakpoint Register Extension * > / > > #define DABRX_USER (1UL << 0) > > Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > > =================================================================== > > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c > > +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > > @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re > > error_code &= 0x48200000; > > else > > is_write = error_code & DSISR_ISSTORE; > > + > > + if (error_code & DSISR_DABRMATCH) { > > + /* DABR match */ > > + do_dabr(regs, address, error_code); > > + return 0; > > + } > > #else > > is_write = error_code & ESR_DST; > > #endif /* CONFIG_4xx || CONFIG_BOOKE */ > > @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re > > if (!user_mode(regs) && (address >= TASK_SIZE)) > > return SIGSEGV; > > > > -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) > > - if (error_code & DSISR_DABRMATCH) { > > - /* DABR match */ > > - do_dabr(regs, address, error_code); > > - return 0; > > - } > > -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/ > > - > > if (in_atomic() || mm == NULL) { > > if (!user_mode(regs)) > > return SIGSEGV; > > _______________________________________________ > > Linuxppc-dev mailing list > > Linuxppc-dev@ozlabs.org > > https://ozlabs.org/mailman/listinfo/linuxppc-dev > > Thanks for the code-review. I will send out the revised patchset with the changes agreed above. -- K.Prasad
On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: > >Hi PPC Dev folks, > > Please find a patch below that implements the proposed Hardware > >Breakpoint interfaces for PPC64 architecture. > > > >As a brief introduction, the proposed Hardware Breakpoint > >infrastructure provides generic in-kernel interfaces to which users > >from kernel- and user-space can request for breakpoint registers. An > >ftrace plugin that can trace accesses to data variables is also part > >of the generic HW Breakpoint interface patchset. The latest submission > >for this patchset along with an x86 implementation can be found here: > >http://lkml.org/lkml/2009/5/11/159. > > > >The following are the salient features of the PPC64 patch. > > > >- Arch-specific definitions for kernel and user-space requests are > > defined in arch/powerpc/kernel/hw_breakpoint.c > >- Ptrace is converted to use the HW Breakpoint interfaces > >- The ftrace plugin called ksym_tracer is tested to work fine. For > > instance when tracing pid_max kernel variable, the following was > > obtained as output > > > ># cat trace > ># tracer: ksym_tracer > ># > ># TASK-PID CPU# Symbol Type Function > ># | | | | | > >bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0x78/0x10c > >bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0xa0/0x10c > >bash 4502 3 pid_max RW .alloc_pid+0x8c/0x4a4 > > > >There are however a few limitations/caveats of the patch as identified > >below: > > > >- The patch is currently implemented only for PPC64 architecture. Other > > architectures (especially Book-E implementations are expected to > > happen in due course). > > Does this mean you will work on transitioning Book-E implementations, or that > you expect the Book-E maintainers to? I'm just curious. The code as written > relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately > doesn't follow that at all. > > Book-E also allows for more than one HW breakpoint, which means you're growing > the thread_struct by 32-bytes to support 4 of them. 64-bytes if this ever > supports DAC events. Have you thought at all about a way to support this > without carrying around the data in the thread struct? > > <snip> > The idea behind embedding the physical debug register values in thread_struct is to use its synchronisation mechanisms for HW Breakpoint related fields too. These values were originally maintained in a separate structure whose pointer lay in thread_struct but was modified based on a suggestion from Ingo Molnar (here: http://lkml.org/lkml/2009/3/10/210). I do see that Book-E processors will have severe memory footprint constraints (in embedded environment) and if the maintainers carry a different perspective (than the one cited above), the relevant fields can be migrated to a new structure whose pointer will be embedded in task_struct. The generic code may have to carry some #ifdefs though. > >Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > >=================================================================== > >--- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c > >+++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c > >@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re > > error_code &= 0x48200000; > > else > > is_write = error_code & DSISR_ISSTORE; > >+ > >+ if (error_code & DSISR_DABRMATCH) { > >+ /* DABR match */ > >+ do_dabr(regs, address, error_code); > >+ return 0; > >+ } > > #else > > is_write = error_code & ESR_DST; > > #endif /* CONFIG_4xx || CONFIG_BOOKE */ > >@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re > > if (!user_mode(regs) && (address >= TASK_SIZE)) > > return SIGSEGV; > > > >-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) > >- if (error_code & DSISR_DABRMATCH) { > >- /* DABR match */ > >- do_dabr(regs, address, error_code); > >- return 0; > >- } > >-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/ > >- > > if (in_atomic() || mm == NULL) { > > if (!user_mode(regs)) > > return SIGSEGV; > > > I don't understand why this was changed, and the changelog doesn't highlight it. > > josh The intention is to capture the exception much before kprobes and xmon do. The HW Breakpoint exception handler will return NOTIFY_DONE if the exception doesn't belong to it and doesn't harm the rest. kprobes has been tested to work fine alongwith HW Breakpoints. I will add a description about this change in the next iteration of the patch. Thanks, K.Prasad
On Tue, May 12, 2009 at 11:47:38AM -0500, Scott Wood wrote: > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: > > >- The patch is currently implemented only for PPC64 architecture. Other > > > architectures (especially Book-E implementations are expected to > > > happen in due course). > > > > Does this mean you will work on transitioning Book-E implementations, or that > > you expect the Book-E maintainers to? I'm just curious. The code as written > > relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately > > doesn't follow that at all. > > And since there will eventually be 64-bit book E chips, we need to use > something more specific than CONFIG_PPC64 in the #ifdef (easier now than > figuring out which ones are breakpoint-related later). > > -Scott Sure. I will be glad to receive & incorporate suggestions in this regard from PowerPC experts (or will 64-bit Book-E implementations be called by a different name altogether?) Thanks, K.Prasad
On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote: > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: [snip] > I do see that Book-E processors will have severe memory footprint > constraints (in embedded environment) and if the maintainers carry a > different perspective (than the one cited above), the relevant fields > can be migrated to a new structure whose pointer will be embedded in > task_struct. The generic code may have to carry some #ifdefs though. I think moving the debug register info into a separate structure makes a fair bit of sense. As well as reducing the memory footprint for systems with lots of debug regs, checking it the pointer is NULL provides a simple and generic way of determining if the process has touched the debug regs. It seems to me that a kind of minimal requirement for a sensible generic debug interface is that if no processes actually ask to use the debug regs, then we should never touch them in the hardware. This means that debugging hacks in the kernel can just use the debug regs directly and don't have to go through the interface to avoid having their stuff clobbered on context switch.
On Wed, May 13, 2009 at 12:57:17PM +1000, David Gibson wrote: > On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote: > > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: > [snip] > > I do see that Book-E processors will have severe memory footprint > > constraints (in embedded environment) and if the maintainers carry a > > different perspective (than the one cited above), the relevant fields > > can be migrated to a new structure whose pointer will be embedded in > > task_struct. The generic code may have to carry some #ifdefs though. > > I think moving the debug register info into a separate structure makes > a fair bit of sense. As well as reducing the memory footprint for > systems with lots of debug regs, checking it the pointer is NULL > provides a simple and generic way of determining if the process has > touched the debug regs. > > It seems to me that a kind of minimal requirement for a sensible > generic debug interface is that if no processes actually ask to use > the debug regs, then we should never touch them in the hardware. This > means that debugging hacks in the kernel can just use the debug regs > directly and don't have to go through the interface to avoid having > their stuff clobbered on context switch. Uh, sorry, didn't fully read the earlier mail. Ingo makes a good point that detaching the structure complicates locking / lifetime issues. Leaving it in the thread_struct probably makes sense for now. Although it raises other issues for when different CPUs in the same architecture have different sets of debug registers (e.g. PPC server's DABR/IABR set versus 44x's IAC/DAC/DBCR regs).
On Wed, May 13, 2009 at 12:57:17PM +1000, David Gibson wrote: > On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote: > > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote: > > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote: > [snip] > > It seems to me that a kind of minimal requirement for a sensible > generic debug interface is that if no processes actually ask to use > the debug regs, then we should never touch them in the hardware. This > means that debugging hacks in the kernel can just use the debug regs > directly and don't have to go through the interface to avoid having > their stuff clobbered on context switch. > All that we do additionally for kernel users (when no process is currently using the debug regs) is to account for the usage just so that any new requests (kernel or user-space) are denied. > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson Thanks, K.Prasad
Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig +++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig @@ -125,6 +125,7 @@ config PPC select USE_GENERIC_SMP_HELPERS if SMP select HAVE_OPROFILE select HAVE_SYSCALL_WRAPPERS if PPC64 + select HAVE_HW_BREAKPOINT if PPC64 config EARLY_PRINTK bool Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile @@ -33,7 +33,7 @@ obj-$(CONFIG_PPC64) += setup_64.o sys_p signal_64.o ptrace32.o \ paca.o cpu_setup_ppc970.o \ cpu_setup_pa6t.o \ - firmware.o nvram_64.o + firmware.o nvram_64.o hw_breakpoint.o obj64-$(CONFIG_RELOCATABLE) += reloc_64.o obj-$(CONFIG_PPC64) += vdso64/ obj-$(CONFIG_ALTIVEC) += vecemu.o vector.o Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c =================================================================== --- /dev/null +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c @@ -0,0 +1,271 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. + * + * Copyright (C) 2009 IBM Corporation + */ + +/* + * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility, + * using the CPU's debug registers. + */ + +#include <linux/notifier.h> +#include <linux/kallsyms.h> +#include <linux/kprobes.h> +#include <linux/percpu.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/sched.h> +#include <linux/init.h> +#include <linux/smp.h> + +#include <asm/hw_breakpoint.h> +#include <asm/processor.h> +#include <asm/sstep.h> + +/* Store the kernel-space breakpoint address value */ +static unsigned long kdabr; + +/* + * Temporarily stores address for DABR before it is written by the + * single-step handler routine + */ +static DEFINE_PER_CPU(unsigned long, dabr_data); + +void arch_update_kernel_hw_breakpoint(void *unused) +{ + struct hw_breakpoint *bp; + + /* Check if there is nothing to update */ + if (hbp_kernel_pos == HB_NUM) + return; + bp = hbp_kernel[hbp_kernel_pos]; + if (bp == NULL) + kdabr = 0; + else + kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION; + set_dabr(kdabr); +} + +/* + * Install the thread breakpoints in their debug registers. + */ +void arch_install_thread_hw_breakpoint(struct task_struct *tsk) +{ + set_dabr(tsk->thread.dabr); +} + +/* + * Install the debug register values for just the kernel, no thread. + */ +void arch_uninstall_thread_hw_breakpoint() +{ + set_dabr(0); +} + +/* + * Check for virtual address in user space. + */ +int arch_check_va_in_userspace(unsigned long va, u8 hbp_len) +{ + return (va <= TASK_SIZE - HW_BREAKPOINT_LEN); +} + +/* + * Check for virtual address in kernel space. + */ +int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len) +{ + return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE); +} + +/* + * Store a breakpoint's encoded address, length, and type. + */ +int arch_store_info(struct hw_breakpoint *bp) +{ + /* + * User-space requests will always have the address field populated + * For kernel-addresses, either the address or symbol name can be + * specified. + */ + if (bp->info.name) + bp->info.address = (unsigned long) + kallsyms_lookup_name(bp->info.name); + if (bp->info.address) + return 0; + return -EINVAL; +} + +/* + * Validate the arch-specific HW Breakpoint register settings + */ +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, + struct task_struct *tsk) +{ + int ret = -EINVAL; + + if (!bp) + return ret; + + switch (bp->info.type) { + case DABR_DATA_READ: + break; + case DABR_DATA_WRITE: + break; + case DABR_DATA_RW: + break; + default: + return ret; + } + + if (bp->triggered) + ret = arch_store_info(bp); + + /* Check for double word alignment - 8 bytes */ + if (bp->info.address & HW_BREAKPOINT_ALIGN) + return -EINVAL; + return ret; +} + +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk) +{ + struct thread_struct *thread = &(tsk->thread); + struct hw_breakpoint *bp = thread->hbp[0]; + + if (bp) + thread->dabr = bp->info.address | bp->info.type | + DABR_TRANSLATION; + else + thread->dabr = 0; +} + +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk) +{ + struct thread_struct *thread = &(tsk->thread); + + thread->dabr = 0; +} + +/* + * Handle debug exception notifications. + */ +int __kprobes hw_breakpoint_handler(struct die_args *args) +{ + int rc = NOTIFY_STOP; + struct hw_breakpoint *bp; + struct pt_regs *regs = args->regs; + unsigned long dar; + int cpu, stepped, is_kernel; + + /* Disable breakpoints during exception handling */ + set_dabr(0); + + dar = regs->dar & (~HW_BREAKPOINT_ALIGN); + is_kernel = (dar >= TASK_SIZE) ? 1 : 0; + + if (is_kernel) + bp = hbp_kernel[0]; + else { + bp = current->thread.hbp[0]; + /* Lazy debug register switching */ + if (!bp) + return rc; + rc = NOTIFY_DONE; + } + + (bp->triggered)(bp, regs); + + cpu = get_cpu(); + if (is_kernel) + per_cpu(dabr_data, cpu) = kdabr; + else + per_cpu(dabr_data, cpu) = current->thread.dabr; + + stepped = emulate_step(regs, regs->nip); + /* + * Single-step the causative instruction manually if + * emulate_step() could not execute it + */ + if (stepped == 0) { + regs->msr |= MSR_SE; + goto out; + } + + set_dabr(per_cpu(dabr_data, cpu)); +out: + /* Enable pre-emption only if single-stepping is finished */ + if (stepped) + put_cpu_no_resched(); + return rc; +} + +/* + * Handle single-step exceptions following a DABR hit. + */ +int __kprobes single_step_dabr_instruction(struct die_args *args) +{ + struct pt_regs *regs = args->regs; + int cpu = get_cpu(); + int ret = NOTIFY_DONE; + siginfo_t info; + unsigned long this_dabr_data = per_cpu(dabr_data, cpu); + + /* + * Check if we are single-stepping as a result of a + * previous HW Breakpoint exception + */ + if (this_dabr_data == 0) + goto out; + + regs->msr &= ~MSR_SE; + /* Deliver signal to user-space */ + if (this_dabr_data < TASK_SIZE) { + info.si_signo = SIGTRAP; + info.si_errno = 0; + info.si_code = TRAP_HWBKPT; + info.si_addr = (void __user *)(per_cpu(dabr_data, cpu)); + force_sig_info(SIGTRAP, &info, current); + } + + set_dabr(this_dabr_data); + per_cpu(dabr_data, cpu) = 0; + ret = NOTIFY_STOP; + put_cpu_no_resched(); + +out: + put_cpu_no_resched(); + return ret; +} + +/* + * Handle debug exception notifications. + */ +int __kprobes hw_breakpoint_exceptions_notify( + struct notifier_block *unused, unsigned long val, void *data) +{ + int ret = NOTIFY_DONE; + + switch (val) { + case DIE_DABR_MATCH: + ret = hw_breakpoint_handler(data); + break; + case DIE_SSTEP: + ret = single_step_dabr_instruction(data); + break; + } + + return ret; +} Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h =================================================================== --- /dev/null +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h @@ -0,0 +1,52 @@ +#ifndef _PPC64_HW_BREAKPOINT_H +#define _PPC64_HW_BREAKPOINT_H + +#ifdef __KERNEL__ +#define __ARCH_HW_BREAKPOINT_H + +struct arch_hw_breakpoint { + char *name; /* Contains name of the symbol to set bkpt */ + unsigned long address; + u8 type; +}; + +#include <linux/kdebug.h> +#include <asm/reg.h> +#include <asm-generic/hw_breakpoint.h> + +#define HW_BREAKPOINT_READ DABR_DATA_READ +#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE +#define HW_BREAKPOINT_RW DABR_DATA_RW + +#define HW_BREAKPOINT_ALIGN 0x7 +#define HW_BREAKPOINT_LEN 4 + +extern struct hw_breakpoint *hbp_kernel[HB_NUM]; +extern unsigned int hbp_user_refcount[HB_NUM]; + +/* + * Ptrace support: breakpoint trigger routine. + */ +extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk, + struct hw_breakpoint *bp); + +extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk); +extern void arch_uninstall_thread_hw_breakpoint(void); +extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len); +extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len); +extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp, + struct task_struct *tsk); +extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk); +extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk); +extern void arch_update_kernel_hw_breakpoint(void *); +extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused, + unsigned long val, void *data); + +extern void flush_thread_hw_breakpoint(struct task_struct *tsk); +extern int copy_thread_hw_breakpoint(struct task_struct *tsk, + struct task_struct *child, unsigned long clone_flags); +extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk); + +#endif /* __KERNEL__ */ +#endif /* _PPC64_HW_BREAKPOINT_H */ + Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h @@ -177,6 +177,7 @@ struct thread_struct { #ifdef CONFIG_PPC64 unsigned long start_tb; /* Start purr when proc switched in */ unsigned long accum_tb; /* Total accumilated purr for process */ + struct hw_breakpoint *hbp[HB_NUM]; #endif unsigned long dabr; /* Data address breakpoint register */ #ifdef CONFIG_ALTIVEC Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c @@ -37,6 +37,9 @@ #include <asm/page.h> #include <asm/pgtable.h> #include <asm/system.h> +#ifdef CONFIG_PPC64 +#include <asm/hw_breakpoint.h> +#endif /* * does not yet catch signals sent when the child dies. @@ -735,9 +738,22 @@ void user_disable_single_step(struct tas clear_tsk_thread_flag(task, TIF_SINGLESTEP); } +static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs) +{ + /* + * The SIGTRAP signal is generated automatically for us in do_dabr(). + * We don't have to do anything here + */ +} + int ptrace_set_debugreg(struct task_struct *task, unsigned long addr, unsigned long data) { +#ifdef CONFIG_PPC64 + struct thread_struct *thread = &(task->thread); + struct hw_breakpoint *bp; + int ret; +#endif /* For ppc64 we support one DABR and no IABR's at the moment (ppc64). * For embedded processors we support one DAC and no IAC's at the * moment. @@ -767,6 +783,38 @@ int ptrace_set_debugreg(struct task_stru if (data && !(data & DABR_TRANSLATION)) return -EIO; +#ifdef CONFIG_PPC64 + bp = thread->hbp[0]; + if ((data & ~0x7UL) == 0) { + if (bp) { + unregister_user_hw_breakpoint(task, bp); + kfree(bp); + thread->hbp[0] = NULL; + } + return 0; + } + + if (bp) { + bp->info.type = data & 0x3UL; + task->thread.dabr = bp->info.address = + (data & ~HW_BREAKPOINT_ALIGN); + return __modify_user_hw_breakpoint(0, task, bp); + } + bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL); + if (!bp) + return -ENOMEM; + + /* Store the type of breakpoint */ + bp->info.type = data & 0x3UL; + bp->triggered = ptrace_triggered; + task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN); + + ret = register_user_hw_breakpoint(task, bp); + if (ret) + return ret; + set_tsk_thread_flag(task, TIF_DEBUG); +#endif /* CONFIG_PPC64 */ + /* Move contents to the DABR register */ task->thread.dabr = data; Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c @@ -50,6 +50,7 @@ #include <asm/syscalls.h> #ifdef CONFIG_PPC64 #include <asm/firmware.h> +#include <asm/hw_breakpoint.h> #endif #include <linux/kprobes.h> #include <linux/kdebug.h> @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig 11, SIGSEGV) == NOTIFY_STOP) return; +#ifndef CONFIG_PPC64 if (debugger_dabr_match(regs)) return; +#endif /* Clear the DAC and struct entries. One shot trigger */ #if defined(CONFIG_BOOKE) @@ -372,8 +375,13 @@ struct task_struct *__switch_to(struct t #endif /* CONFIG_SMP */ +#ifdef CONFIG_PPC64 + if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG))) + switch_to_thread_hw_breakpoint(new); +#else if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr)) set_dabr(new->thread.dabr); +#endif /* CONFIG_PPC64 */ #if defined(CONFIG_BOOKE) /* If new thread DAC (HW breakpoint) is the same then leave it */ @@ -550,6 +558,10 @@ void show_regs(struct pt_regs * regs) void exit_thread(void) { discard_lazy_cpu_state(); +#ifdef CONFIG_PPC64 + if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG))) + flush_thread_hw_breakpoint(current); +#endif /* CONFIG_PPC64 */ } void flush_thread(void) @@ -605,6 +617,9 @@ int copy_thread(unsigned long clone_flag struct pt_regs *childregs, *kregs; extern void ret_from_fork(void); unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE; +#ifdef CONFIG_PPC64 + struct task_struct *tsk = current; +#endif CHECK_FULL_REGS(regs); /* Copy registers */ @@ -672,6 +687,9 @@ int copy_thread(unsigned long clone_flag * function. */ kregs->nip = *((unsigned long *)ret_from_fork); + + if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG))) + copy_thread_hw_breakpoint(tsk, p, clone_flags); #else kregs->nip = (unsigned long)ret_from_fork; #endif Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c =================================================================== --- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c +++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c @@ -54,6 +54,10 @@ static int __init hw_break_module_init(v sample_hbp.info.type = HW_BREAKPOINT_WRITE; sample_hbp.info.len = HW_BREAKPOINT_LEN_4; #endif /* CONFIG_X86 */ +#ifdef CONFIG_PPC64 + sample_hbp.info.name = ksym_name; + sample_hbp.info.type = DABR_DATA_WRITE; +#endif /* CONFIG_PPC64 */ sample_hbp.triggered = (void *)sample_hbp_handler; Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h @@ -114,6 +114,7 @@ static inline struct thread_info *curren #define TIF_FREEZE 14 /* Freezing for suspend */ #define TIF_RUNLATCH 15 /* Is the runlatch enabled? */ #define TIF_ABI_PENDING 16 /* 32/64 bit switch needed */ +#define TIF_DEBUG 17 /* uses debug registers */ /* as above, but as bit values */ #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE) @@ -132,6 +133,7 @@ static inline struct thread_info *curren #define _TIF_FREEZE (1<<TIF_FREEZE) #define _TIF_RUNLATCH (1<<TIF_RUNLATCH) #define _TIF_ABI_PENDING (1<<TIF_ABI_PENDING) +#define _TIF_DEBUG (1<<TIF_DEBUG) #define _TIF_SYSCALL_T_OR_A (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP) #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \ Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h @@ -184,9 +184,11 @@ #define CTRL_TE 0x00c00000 /* thread enable */ #define CTRL_RUNLATCH 0x1 #define SPRN_DABR 0x3F5 /* Data Address Breakpoint Register */ +#define HB_NUM 1 /* Number of physical HW breakpoint registers */ #define DABR_TRANSLATION (1UL << 2) #define DABR_DATA_WRITE (1UL << 1) #define DABR_DATA_READ (1UL << 0) +#define DABR_DATA_RW (3UL << 0) #define SPRN_DABR2 0x13D /* e300 */ #define SPRN_DABRX 0x3F7 /* Data Address Breakpoint Register Extension */ #define DABRX_USER (1UL << 0) Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c =================================================================== --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re error_code &= 0x48200000; else is_write = error_code & DSISR_ISSTORE; + + if (error_code & DSISR_DABRMATCH) { + /* DABR match */ + do_dabr(regs, address, error_code); + return 0; + } #else is_write = error_code & ESR_DST; #endif /* CONFIG_4xx || CONFIG_BOOKE */ @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re if (!user_mode(regs) && (address >= TASK_SIZE)) return SIGSEGV; -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE)) - if (error_code & DSISR_DABRMATCH) { - /* DABR match */ - do_dabr(regs, address, error_code); - return 0; - } -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/ - if (in_atomic() || mm == NULL) { if (!user_mode(regs)) return SIGSEGV;
Hi PPC Dev folks, Please find a patch below that implements the proposed Hardware Breakpoint interfaces for PPC64 architecture. As a brief introduction, the proposed Hardware Breakpoint infrastructure provides generic in-kernel interfaces to which users from kernel- and user-space can request for breakpoint registers. An ftrace plugin that can trace accesses to data variables is also part of the generic HW Breakpoint interface patchset. The latest submission for this patchset along with an x86 implementation can be found here: http://lkml.org/lkml/2009/5/11/159. The following are the salient features of the PPC64 patch. - Arch-specific definitions for kernel and user-space requests are defined in arch/powerpc/kernel/hw_breakpoint.c - Ptrace is converted to use the HW Breakpoint interfaces - The ftrace plugin called ksym_tracer is tested to work fine. For instance when tracing pid_max kernel variable, the following was obtained as output # cat trace # tracer: ksym_tracer # # TASK-PID CPU# Symbol Type Function # | | | | | bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0x78/0x10c bash 4502 3 pid_max RW .do_proc_dointvec_minmax_conv+0xa0/0x10c bash 4502 3 pid_max RW .alloc_pid+0x8c/0x4a4 There are however a few limitations/caveats of the patch as identified below: - The patch is currently implemented only for PPC64 architecture. Other architectures (especially Book-E implementations are expected to happen in due course). - HW Breakpoints over data addresses through Xmon (using "bd" command) and the proposed HW Breakpoint interfaces can now operate in a mutually exclusive manner. Xmon's integration is pending and is dependant on successful triggering of breakpoints through "bd<ops>". (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW Breakpoints when tested). Kindly let me know your comments. Thanks, K.Prasad Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com> --- arch/powerpc/Kconfig | 1 arch/powerpc/include/asm/hw_breakpoint.h | 52 +++++ arch/powerpc/include/asm/processor.h | 1 arch/powerpc/include/asm/reg.h | 2 arch/powerpc/include/asm/thread_info.h | 2 arch/powerpc/kernel/Makefile | 2 arch/powerpc/kernel/hw_breakpoint.c | 271 +++++++++++++++++++++++++++++++ arch/powerpc/kernel/process.c | 18 ++ arch/powerpc/kernel/ptrace.c | 48 +++++ arch/powerpc/mm/fault.c | 14 - samples/hw_breakpoint/data_breakpoint.c | 4 12 files changed, 423 insertions(+), 9 deletions(-)