diff mbox

[RFC] Hardware Breakpoint interfaces implementation for PPC64

Message ID 20090511200355.GA17988@in.ibm.com (mailing list archive)
State Changes Requested, archived
Headers show

Commit Message

K.Prasad May 11, 2009, 8:03 p.m. UTC
Hi PPC Dev folks,
	Please find a patch below that implements the proposed Hardware
Breakpoint interfaces for PPC64 architecture.

As a brief introduction, the proposed Hardware Breakpoint
infrastructure provides generic in-kernel interfaces to which users
from kernel- and user-space can request for breakpoint registers. An
ftrace plugin that can trace accesses to data variables is also part
of the generic HW Breakpoint interface patchset. The latest submission
for this patchset along with an x86 implementation can be found here:
http://lkml.org/lkml/2009/5/11/159.

The following are the salient features of the PPC64 patch.

- Arch-specific definitions for kernel and user-space requests are
  defined in arch/powerpc/kernel/hw_breakpoint.c
- Ptrace is converted to use the HW Breakpoint interfaces
- The ftrace plugin called ksym_tracer is tested to work fine. For
  instance when tracing pid_max kernel variable, the following was
  obtained as output

# cat trace
# tracer: ksym_tracer
#
#       TASK-PID      CPU#      Symbol         Type    Function         
#          |           |          |              |         |            
bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0x78/0x10c
bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0xa0/0x10c
bash            4502  3   pid_max               RW  .alloc_pid+0x8c/0x4a4

There are however a few limitations/caveats of the patch as identified
below:

- The patch is currently implemented only for PPC64 architecture. Other
  architectures (especially Book-E implementations are expected to
  happen in due course).

- HW Breakpoints over data addresses through Xmon (using "bd" command)
  and the proposed HW Breakpoint interfaces can now operate in a
  mutually exclusive manner. Xmon's integration is pending and is
  dependant on successful triggering of breakpoints through "bd<ops>".
  (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW
  Breakpoints when tested).

Kindly let me know your comments.

Thanks,
K.Prasad


Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   52 +++++
 arch/powerpc/include/asm/processor.h     |    1 
 arch/powerpc/include/asm/reg.h           |    2 
 arch/powerpc/include/asm/thread_info.h   |    2 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  271 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |   18 ++
 arch/powerpc/kernel/ptrace.c             |   48 +++++
 arch/powerpc/mm/fault.c                  |   14 -
 samples/hw_breakpoint/data_breakpoint.c  |    4 
 12 files changed, 423 insertions(+), 9 deletions(-)

Comments

Michael Neuling May 12, 2009, 12:56 a.m. UTC | #1
> Hi PPC Dev folks,
> 	Please find a patch below that implements the proposed Hardware
> Breakpoint interfaces for PPC64 architecture.
> 
> As a brief introduction, the proposed Hardware Breakpoint
> infrastructure provides generic in-kernel interfaces to which users
> from kernel- and user-space can request for breakpoint registers. An
> ftrace plugin that can trace accesses to data variables is also part
> of the generic HW Breakpoint interface patchset. The latest submission
> for this patchset along with an x86 implementation can be found here:
> http://lkml.org/lkml/2009/5/11/159.
> 
> The following are the salient features of the PPC64 patch.
> 
> - Arch-specific definitions for kernel and user-space requests are
>   defined in arch/powerpc/kernel/hw_breakpoint.c
> - Ptrace is converted to use the HW Breakpoint interfaces

Will we fall back to use the old method if more than one address is
being watched?

> - The ftrace plugin called ksym_tracer is tested to work fine. For
>   instance when tracing pid_max kernel variable, the following was
>   obtained as output

Could you split the patch into a few pieces which implement these
different parts.  The smaller logical chucks will make it easier to
review?

> 
> # cat trace
> # tracer: ksym_tracer
> #
> #       TASK-PID      CPU#      Symbol         Type    Function         
> #          |           |          |              |         |            
> bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_
conv+0x78/0x10c
> bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_
conv+0xa0/0x10c
> bash            4502  3   pid_max               RW  .alloc_pid+0x8c/0x4a4
> 
> There are however a few limitations/caveats of the patch as identified
> below:
> 
> - The patch is currently implemented only for PPC64 architecture. Other
>   architectures (especially Book-E implementations are expected to
>   happen in due course).
> 
> - HW Breakpoints over data addresses through Xmon (using "bd" command)
>   and the proposed HW Breakpoint interfaces can now operate in a
>   mutually exclusive manner. Xmon's integration is pending and is
>   dependant on successful triggering of breakpoints through "bd<ops>".
>   (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW
>   Breakpoints when tested).
> 
> Kindly let me know your comments.
> 
> Thanks,
> K.Prasad
> 
> 
> Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
> ---
>  arch/powerpc/Kconfig                     |    1 
>  arch/powerpc/include/asm/hw_breakpoint.h |   52 +++++
>  arch/powerpc/include/asm/processor.h     |    1 
>  arch/powerpc/include/asm/reg.h           |    2 
>  arch/powerpc/include/asm/thread_info.h   |    2 
>  arch/powerpc/kernel/Makefile             |    2 
>  arch/powerpc/kernel/hw_breakpoint.c      |  271 ++++++++++++++++++++++++++++
+++
>  arch/powerpc/kernel/process.c            |   18 ++
>  arch/powerpc/kernel/ptrace.c             |   48 +++++
>  arch/powerpc/mm/fault.c                  |   14 -
>  samples/hw_breakpoint/data_breakpoint.c  |    4 
>  12 files changed, 423 insertions(+), 9 deletions(-)

You've not touched prace32.c.  Can we use this for 32 bit apps on a 64
bit kernel?

> 
> Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig
> +++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
> @@ -125,6 +125,7 @@ config PPC
>  	select USE_GENERIC_SMP_HELPERS if SMP
>  	select HAVE_OPROFILE
>  	select HAVE_SYSCALL_WRAPPERS if PPC64
> +	select HAVE_HW_BREAKPOINT if PPC64
>  
>  config EARLY_PRINTK
>  	bool
> Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile
> +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
> @@ -33,7 +33,7 @@ obj-$(CONFIG_PPC64)		+= setup_64.o sys_p
>  				   signal_64.o ptrace32.o \
>  				   paca.o cpu_setup_ppc970.o \
>  				   cpu_setup_pa6t.o \
> -				   firmware.o nvram_64.o
> +				   firmware.o nvram_64.o hw_breakpoint.o
>  obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
>  obj-$(CONFIG_PPC64)		+= vdso64/
>  obj-$(CONFIG_ALTIVEC)		+= vecemu.o vector.o
> Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
> @@ -0,0 +1,271 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
.
> + *
> + * Copyright (C) 2009 IBM Corporation
> + */
> +
> +/*
> + * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
> + * using the CPU's debug registers.
> + */
> +
> +#include <linux/notifier.h>
> +#include <linux/kallsyms.h>
> +#include <linux/kprobes.h>
> +#include <linux/percpu.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/sched.h>
> +#include <linux/init.h>
> +#include <linux/smp.h>
> +
> +#include <asm/hw_breakpoint.h>
> +#include <asm/processor.h>
> +#include <asm/sstep.h>
> +
> +/* Store the kernel-space breakpoint address value */
> +static unsigned long kdabr;
> +
> +/*
> + * Temporarily stores address for DABR before it is written by the
> + * single-step handler routine
> + */
> +static DEFINE_PER_CPU(unsigned long, dabr_data);
> +
> +void arch_update_kernel_hw_breakpoint(void *unused)
> +{
> +	struct hw_breakpoint *bp;
> +
> +	/* Check if there is nothing to update */
> +	if (hbp_kernel_pos == HB_NUM)
> +		return;
> +	bp = hbp_kernel[hbp_kernel_pos];
> +	if (bp == NULL)
> +		kdabr = 0;
> +	else
> +		kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION;
> +	set_dabr(kdabr);
> +}
> +
> +/*
> + * Install the thread breakpoints in their debug registers.
> + */
> +void arch_install_thread_hw_breakpoint(struct task_struct *tsk)
> +{
> +	set_dabr(tsk->thread.dabr);
> +}
> +
> +/*
> + * Install the debug register values for just the kernel, no thread.
> + */
> +void arch_uninstall_thread_hw_breakpoint()
> +{
> +	set_dabr(0);
> +}
> +
> +/*
> + * Check for virtual address in user space.
> + */
> +int arch_check_va_in_userspace(unsigned long va, u8 hbp_len)
> +{
> +	return (va <= TASK_SIZE - HW_BREAKPOINT_LEN);
> +}

You pass ing hbp_len, but then use HW_BREAKPOINT_LEN?  Is that right?

> +
> +/*
> + * Check for virtual address in kernel space.
> + */
> +int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len)
> +{
> +	return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE)
;

Can you use is_kernel_addr() here?

> +}
> +
> +/*
> + * Store a breakpoint's encoded address, length, and type.
> + */
> +int arch_store_info(struct hw_breakpoint *bp)
> +{
> +	/*
> +	 * User-space requests will always have the address field populated
> +	 * For kernel-addresses, either the address or symbol name can be
> +	 * specified.
> +	 */
> +	if (bp->info.name)
> +		bp->info.address = (unsigned long)
> +					kallsyms_lookup_name(bp->info.name);
> +	if (bp->info.address)
> +		return 0;
> +	return -EINVAL;
> +}
> +
> +/*
> + * Validate the arch-specific HW Breakpoint register settings
> + */
> +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
> +						struct task_struct *tsk)
> +{
> +	int ret = -EINVAL;
> +
> +	if (!bp)
> +		return ret;
> +
> +	switch (bp->info.type) {
> +	case DABR_DATA_READ:
> +		break;
> +	case DABR_DATA_WRITE:
> +		break;
> +	case DABR_DATA_RW:
> +		break;
> +	default:
> +		return ret;
> +	}
> +
> +	if (bp->triggered)
> +		ret = arch_store_info(bp);
> +
> +	/* Check for double word alignment - 8 bytes */
> +	if (bp->info.address & HW_BREAKPOINT_ALIGN)
> +		return -EINVAL;
> +	return ret;
> +}
> +
> +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
> +{
> +	struct thread_struct *thread = &(tsk->thread);
> +	struct hw_breakpoint *bp = thread->hbp[0];
> +
> +	if (bp)
> +		thread->dabr = bp->info.address	| bp->info.type |
> +				DABR_TRANSLATION;
> +	else
> +		thread->dabr = 0;
> +}
> +
> +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk)
> +{
> +	struct thread_struct *thread = &(tsk->thread);
> +
> +	thread->dabr = 0;
> +}
> +
> +/*
> + * Handle debug exception notifications.
> + */
> +int __kprobes hw_breakpoint_handler(struct die_args *args)
> +{
> +	int rc = NOTIFY_STOP;
> +	struct hw_breakpoint *bp;
> +	struct pt_regs *regs = args->regs;
> +	unsigned long dar;
> +	int cpu, stepped, is_kernel;
> +
> +	/* Disable breakpoints during exception handling */
> +	set_dabr(0);
> +
> +	dar = regs->dar & (~HW_BREAKPOINT_ALIGN);
> +	is_kernel = (dar >= TASK_SIZE) ? 1 : 0;
> +
> +	if (is_kernel)
> +		bp = hbp_kernel[0];
> +	else {
> +		bp = current->thread.hbp[0];
> +		/* Lazy debug register switching */
> +		if (!bp)
> +			return rc;
> +		rc = NOTIFY_DONE;
> +	}
> +
> +	(bp->triggered)(bp, regs);
> +
> +	cpu = get_cpu();
> +	if (is_kernel)
> +		per_cpu(dabr_data, cpu) = kdabr;
> +	else
> +		per_cpu(dabr_data, cpu) = current->thread.dabr;
> +
> +	stepped = emulate_step(regs, regs->nip);
> +	/*
> +	 * Single-step the causative instruction manually if
> +	 * emulate_step() could not execute it
> +	 */
> +	if (stepped == 0) {
> +		regs->msr |= MSR_SE;
> +		goto out;
> +	}

This is where we backout to single step mode?

> +
> +	set_dabr(per_cpu(dabr_data, cpu));
> +out:
> +	/* Enable pre-emption only if single-stepping is finished */
> +	if (stepped)
> +		put_cpu_no_resched();
> +	return rc;
> +}
> +
+/*
> + * Handle single-step exceptions following a DABR hit.
> + */
> +int __kprobes single_step_dabr_instruction(struct die_args *args)
> +{
> +	struct pt_regs *regs = args->regs;
> +	int cpu = get_cpu();
> +	int ret = NOTIFY_DONE;
> +	siginfo_t info;
> +	unsigned long this_dabr_data = per_cpu(dabr_data, cpu);
> +
> +	/*
> +	 * Check if we are single-stepping as a result of a
> +	 * previous HW Breakpoint exception
> +	 */
> +	if (this_dabr_data == 0)
> +		goto out;
> +
> +	regs->msr &= ~MSR_SE;
> +	/* Deliver signal to user-space */
> +	if (this_dabr_data < TASK_SIZE) {
> +		info.si_signo = SIGTRAP;
> +		info.si_errno = 0;
> +		info.si_code = TRAP_HWBKPT;
> +		info.si_addr = (void __user *)(per_cpu(dabr_data, cpu));
> +		force_sig_info(SIGTRAP, &info, current);
> +	}
> +
> +	set_dabr(this_dabr_data);
> +	per_cpu(dabr_data, cpu) = 0;
> +	ret = NOTIFY_STOP;
> +	put_cpu_no_resched();
> +
> +out:
> +	put_cpu_no_resched();

This looks wrong.  put_cpu_no_resched() twice?

> +	return ret;
> +}
> +
> +/*
> + * Handle debug exception notifications.
> + */
> +int __kprobes hw_breakpoint_exceptions_notify(
> +		struct notifier_block *unused, unsigned long val, void *data)
> +{
> +	int ret = NOTIFY_DONE;
> +
> +	switch (val) {
> +	case DIE_DABR_MATCH:
> +		ret = hw_breakpoint_handler(data);
> +		break;
> +	case DIE_SSTEP:
> +		ret = single_step_dabr_instruction(data);
> +		break;
> +	}
> +
> +	return ret;
> +}
> Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -0,0 +1,52 @@
> +#ifndef	_PPC64_HW_BREAKPOINT_H
> +#define	_PPC64_HW_BREAKPOINT_H
> +
> +#ifdef	__KERNEL__
> +#define	__ARCH_HW_BREAKPOINT_H
> +
> +struct arch_hw_breakpoint {
> +	char		*name; /* Contains name of the symbol to set bkpt */
> +	unsigned long	address;
> +	u8		type;
> +};

Can you reorder this to pack the struct better (ie. put the unsigned
long first).

> +
> +#include <linux/kdebug.h>
> +#include <asm/reg.h>
> +#include <asm-generic/hw_breakpoint.h>
> +
> +#define HW_BREAKPOINT_READ DABR_DATA_READ
> +#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE
> +#define HW_BREAKPOINT_RW DABR_DATA_RW
> +
> +#define HW_BREAKPOINT_ALIGN 0x7
> +#define HW_BREAKPOINT_LEN 4

What is HW_BREAKPOINT_LEN?  

Obviouslt all instructions on ppc64 are 4 bytes.  This seems to be
defined in a few places, like here and MCOUNT_INSN_SIZE (ftrace.h).  Can
you create a #define for this generically and get all refers to use this
instead of #define-ing 4 everywhere.

> +
> +extern struct hw_breakpoint *hbp_kernel[HB_NUM];
> +extern unsigned int hbp_user_refcount[HB_NUM];
> +
> +/*
> + * Ptrace support: breakpoint trigger routine.
> + */
> +extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk,
> +			struct hw_breakpoint *bp);
> +
> +extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk);
> +extern void arch_uninstall_thread_hw_breakpoint(void);
> +extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len);
> +extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len);
> +extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
> +						struct task_struct *tsk);
> +extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
;
> +extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk);
> +extern void arch_update_kernel_hw_breakpoint(void *);
> +extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
> +				     unsigned long val, void *data);
> +
> +extern void flush_thread_hw_breakpoint(struct task_struct *tsk);
> +extern int copy_thread_hw_breakpoint(struct task_struct *tsk,
> +		struct task_struct *child, unsigned long clone_flags);
> +extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk);
> +
> +#endif	/* __KERNEL__ */
> +#endif	/* _PPC64_HW_BREAKPOINT_H */
> +
> Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h
> +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
> @@ -177,6 +177,7 @@ struct thread_struct {
>  #ifdef CONFIG_PPC64
>  	unsigned long	start_tb;	/* Start purr when proc switched in */
>  	unsigned long	accum_tb;	/* Total accumilated purr for process *
/
> +	struct hw_breakpoint *hbp[HB_NUM];
>  #endif
>  	unsigned long	dabr;		/* Data address breakpoint register */
>  #ifdef CONFIG_ALTIVEC
> Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c
> +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
> @@ -37,6 +37,9 @@
>  #include <asm/page.h>
>  #include <asm/pgtable.h>
>  #include <asm/system.h>
> +#ifdef CONFIG_PPC64
> +#include <asm/hw_breakpoint.h>
> +#endif
>  
>  /*
>   * does not yet catch signals sent when the child dies.
> @@ -735,9 +738,22 @@ void user_disable_single_step(struct tas
>  	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>  }
>  
> +static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs)
> +{
> +	/*
> +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> +	 * We don't have to do anything here
> +	 */
> +}
> +
>  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
>  			       unsigned long data)
>  {
> +#ifdef CONFIG_PPC64
> +	struct thread_struct *thread = &(task->thread);
> +	struct hw_breakpoint *bp;
> +	int ret;
> +#endif
>  	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
>  	 *  For embedded processors we support one DAC and no IAC's at the
>  	 *  moment.
> @@ -767,6 +783,38 @@ int ptrace_set_debugreg(struct task_stru
>  	if (data && !(data & DABR_TRANSLATION))
>  		return -EIO;
>  
> +#ifdef CONFIG_PPC64
> +	bp = thread->hbp[0];
> +	if ((data & ~0x7UL) == 0) {

Should use HW_BREAKPOINT_ALIGN here instead of 0x7?  Is 0 some special
command?  Seems to be lots of special numbers here related to using data
(later you mask is 0x3).  What is happening here?  What is contained in
the data parameter?  It overloads the type and the address?

> +		if (bp) {
> +			unregister_user_hw_breakpoint(task, bp);
> +			kfree(bp);
> +			thread->hbp[0] = NULL;
> +		}
> +		return 0;
> +	}
> +
> +	if (bp) {
> +		bp->info.type = data & 0x3UL;
> +		task->thread.dabr = bp->info.address =
> +						(data & ~HW_BREAKPOINT_ALIGN);
> +		return __modify_user_hw_breakpoint(0, task, bp);
> +	}
> +	bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL);

When a processes ends, will this be correctly freed?  Should it be freed
in arch_flush_thread_hw_breakpoint?

> +	if (!bp)
> +		return -ENOMEM;
> +
> +	/* Store the type of breakpoint */
> +	bp->info.type = data & 0x3UL;
> +	bp->triggered = ptrace_triggered;
> +	task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN);
> +
> +	ret = register_user_hw_breakpoint(task, bp);
> +	if (ret)
> +		return ret;
> +	set_tsk_thread_flag(task, TIF_DEBUG);
> +#endif /* CONFIG_PPC64 */
> +
>  	/* Move contents to the DABR register */
>  	task->thread.dabr = data;
>  
> Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c
> +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
> @@ -50,6 +50,7 @@
>  #include <asm/syscalls.h>
>  #ifdef CONFIG_PPC64
>  #include <asm/firmware.h>
> +#include <asm/hw_breakpoint.h>
>  #endif
>  #include <linux/kprobes.h>
>  #include <linux/kdebug.h>
> @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig
>  			11, SIGSEGV) == NOTIFY_STOP)
>  		return;
>  
> +#ifndef CONFIG_PPC64
>  	if (debugger_dabr_match(regs))
>  		return;
> +#endif
>  
>  	/* Clear the DAC and struct entries.  One shot trigger */
>  #if defined(CONFIG_BOOKE)
> @@ -372,8 +375,13 @@ struct task_struct *__switch_to(struct t
>  
>  #endif /* CONFIG_SMP */
>  
> +#ifdef CONFIG_PPC64
> +		if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG)))
> +			switch_to_thread_hw_breakpoint(new);
> +#else
>  	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
>  		set_dabr(new->thread.dabr);
> +#endif /* CONFIG_PPC64 */
>  
>  #if defined(CONFIG_BOOKE)
>  	/* If new thread DAC (HW breakpoint) is the same then leave it */
> @@ -550,6 +558,10 @@ void show_regs(struct pt_regs * regs)
>  void exit_thread(void)
>  {
>  	discard_lazy_cpu_state();
> +#ifdef CONFIG_PPC64
> +	if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG)))
> +		flush_thread_hw_breakpoint(current);
> +#endif /* CONFIG_PPC64 */
>  }
>  
>  void flush_thread(void)
> @@ -605,6 +617,9 @@ int copy_thread(unsigned long clone_flag
>  	struct pt_regs *childregs, *kregs;
>  	extern void ret_from_fork(void);
>  	unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE;
> +#ifdef CONFIG_PPC64
> +	struct task_struct *tsk = current;
> +#endif
>  
>  	CHECK_FULL_REGS(regs);
>  	/* Copy registers */
> @@ -672,6 +687,9 @@ int copy_thread(unsigned long clone_flag
>  	 * function.
>   	 */
>  	kregs->nip = *((unsigned long *)ret_from_fork);
> +
> +	if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG)))
> +		copy_thread_hw_breakpoint(tsk, p, clone_flags);
>  #else
>  	kregs->nip = (unsigned long)ret_from_fork;
>  #endif
> Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c
> +++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
> @@ -54,6 +54,10 @@ static int __init hw_break_module_init(v
>  	sample_hbp.info.type = HW_BREAKPOINT_WRITE;
>  	sample_hbp.info.len = HW_BREAKPOINT_LEN_4;
>  #endif /* CONFIG_X86 */
> +#ifdef CONFIG_PPC64
> +	sample_hbp.info.name = ksym_name;
> +	sample_hbp.info.type = DABR_DATA_WRITE;
> +#endif /* CONFIG_PPC64 */
>  
>  	sample_hbp.triggered = (void *)sample_hbp_handler;
>  
> Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h
> +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
> @@ -114,6 +114,7 @@ static inline struct thread_info *curren
>  #define TIF_FREEZE		14	/* Freezing for suspend */
>  #define TIF_RUNLATCH		15	/* Is the runlatch enabled? */
>  #define TIF_ABI_PENDING		16	/* 32/64 bit switch needed */
> +#define TIF_DEBUG		17	/* uses debug registers */
>  
>  /* as above, but as bit values */
>  #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
> @@ -132,6 +133,7 @@ static inline struct thread_info *curren
>  #define _TIF_FREEZE		(1<<TIF_FREEZE)
>  #define _TIF_RUNLATCH		(1<<TIF_RUNLATCH)
>  #define _TIF_ABI_PENDING	(1<<TIF_ABI_PENDING)
> +#define _TIF_DEBUG		(1<<TIF_DEBUG)
>  #define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SEC
COMP)
>  
>  #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h
> +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
> @@ -184,9 +184,11 @@
>  #define   CTRL_TE	0x00c00000	/* thread enable */
>  #define   CTRL_RUNLATCH	0x1
>  #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
> +#define   HB_NUM	1	/* Number of physical HW breakpoint registers *
/

You've shortedned "hardware breakpoint" to "hbp" elsewhere.  Can you be
consistent here so change this to HBP_NUM?

>  #define   DABR_TRANSLATION	(1UL << 2)
>  #define   DABR_DATA_WRITE	(1UL << 1)
>  #define   DABR_DATA_READ	(1UL << 0)
> +#define   DABR_DATA_RW		(3UL << 0)
>  #define SPRN_DABR2	0x13D	/* e300 */
>  #define SPRN_DABRX	0x3F7	/* Data Address Breakpoint Register Extension *
/
>  #define   DABRX_USER	(1UL << 0)
> Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> ===================================================================
> --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
> +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
>  		error_code &= 0x48200000;
>  	else
>  		is_write = error_code & DSISR_ISSTORE;
> +
> +	if (error_code & DSISR_DABRMATCH) {
> +		/* DABR match */
> +		do_dabr(regs, address, error_code);
> +		return 0;
> +	}
>  #else
>  	is_write = error_code & ESR_DST;
>  #endif /* CONFIG_4xx || CONFIG_BOOKE */
> @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
>  	if (!user_mode(regs) && (address >= TASK_SIZE))
>  		return SIGSEGV;
>  
> -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
> -  	if (error_code & DSISR_DABRMATCH) {
> -		/* DABR match */
> -		do_dabr(regs, address, error_code);
> -		return 0;
> -	}
> -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
> -
>  	if (in_atomic() || mm == NULL) {
>  		if (!user_mode(regs))
>  			return SIGSEGV;
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-dev
>
Michael Neuling May 12, 2009, 2:48 a.m. UTC | #2
> > +struct arch_hw_breakpoint {
> > +	char		*name; /* Contains name of the symbol to set bkpt */
> > +	unsigned long	address;
> > +	u8		type;
> > +};
> 
> Can you reorder this to pack the struct better (ie. put the unsigned
> long first).

Oops, this is a char * not just a char.  Order is fine.

Mikey
Josh Boyer May 12, 2009, 11:51 a.m. UTC | #3
On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
>Hi PPC Dev folks,
>	Please find a patch below that implements the proposed Hardware
>Breakpoint interfaces for PPC64 architecture.
>
>As a brief introduction, the proposed Hardware Breakpoint
>infrastructure provides generic in-kernel interfaces to which users
>from kernel- and user-space can request for breakpoint registers. An
>ftrace plugin that can trace accesses to data variables is also part
>of the generic HW Breakpoint interface patchset. The latest submission
>for this patchset along with an x86 implementation can be found here:
>http://lkml.org/lkml/2009/5/11/159.
>
>The following are the salient features of the PPC64 patch.
>
>- Arch-specific definitions for kernel and user-space requests are
>  defined in arch/powerpc/kernel/hw_breakpoint.c
>- Ptrace is converted to use the HW Breakpoint interfaces
>- The ftrace plugin called ksym_tracer is tested to work fine. For
>  instance when tracing pid_max kernel variable, the following was
>  obtained as output
>
># cat trace
># tracer: ksym_tracer
>#
>#       TASK-PID      CPU#      Symbol         Type    Function         
>#          |           |          |              |         |            
>bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0x78/0x10c
>bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0xa0/0x10c
>bash            4502  3   pid_max               RW  .alloc_pid+0x8c/0x4a4
>
>There are however a few limitations/caveats of the patch as identified
>below:
>
>- The patch is currently implemented only for PPC64 architecture. Other
>  architectures (especially Book-E implementations are expected to
>  happen in due course).

Does this mean you will work on transitioning Book-E implementations, or that
you expect the Book-E maintainers to?  I'm just curious.  The code as written
relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately
doesn't follow that at all.

Book-E also allows for more than one HW breakpoint, which means you're growing
the thread_struct by 32-bytes to support 4 of them.  64-bytes if this ever
supports DAC events.  Have you thought at all about a way to support this
without carrying around the data in the thread struct?

<snip>

>Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
>===================================================================
>--- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
>+++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
>@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
> 		error_code &= 0x48200000;
> 	else
> 		is_write = error_code & DSISR_ISSTORE;
>+
>+	if (error_code & DSISR_DABRMATCH) {
>+		/* DABR match */
>+		do_dabr(regs, address, error_code);
>+		return 0;
>+	}
> #else
> 	is_write = error_code & ESR_DST;
> #endif /* CONFIG_4xx || CONFIG_BOOKE */
>@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
> 	if (!user_mode(regs) && (address >= TASK_SIZE))
> 		return SIGSEGV;
> 
>-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
>-  	if (error_code & DSISR_DABRMATCH) {
>-		/* DABR match */
>-		do_dabr(regs, address, error_code);
>-		return 0;
>-	}
>-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
>-
> 	if (in_atomic() || mm == NULL) {
> 		if (!user_mode(regs))
> 			return SIGSEGV;


I don't understand why this was changed, and the changelog doesn't highlight it.

josh
Scott Wood May 12, 2009, 4:47 p.m. UTC | #4
On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
> >- The patch is currently implemented only for PPC64 architecture. Other
> >  architectures (especially Book-E implementations are expected to
> >  happen in due course).
> 
> Does this mean you will work on transitioning Book-E implementations, or that
> you expect the Book-E maintainers to?  I'm just curious.  The code as written
> relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately
> doesn't follow that at all.

And since there will eventually be 64-bit book E chips, we need to use
something more specific than CONFIG_PPC64 in the #ifdef (easier now than
figuring out which ones are breakpoint-related later).

-Scott
K.Prasad May 12, 2009, 8:01 p.m. UTC | #5
On Tue, May 12, 2009 at 10:56:04AM +1000, Michael Neuling wrote:
> > Hi PPC Dev folks,
> > 	Please find a patch below that implements the proposed Hardware
> > Breakpoint interfaces for PPC64 architecture.
> > 
> > As a brief introduction, the proposed Hardware Breakpoint
> > infrastructure provides generic in-kernel interfaces to which users
> > from kernel- and user-space can request for breakpoint registers. An
> > ftrace plugin that can trace accesses to data variables is also part
> > of the generic HW Breakpoint interface patchset. The latest submission
> > for this patchset along with an x86 implementation can be found here:
> > http://lkml.org/lkml/2009/5/11/159.
> > 
> > The following are the salient features of the PPC64 patch.
> > 
> > - Arch-specific definitions for kernel and user-space requests are
> >   defined in arch/powerpc/kernel/hw_breakpoint.c
> > - Ptrace is converted to use the HW Breakpoint interfaces
> 
> Will we fall back to use the old method if more than one address is
> being watched?
>

Assuming that you mean HW Breakpoints being used by multiple processes
when you say "more than one address is being watched" - the proposed
infrastructure can take care of such requests. During context switching
in __switch_to() the DABR values of incoming process is restored and is
valid until another process using DABR is scheduled again.
 
> > - The ftrace plugin called ksym_tracer is tested to work fine. For
> >   instance when tracing pid_max kernel variable, the following was
> >   obtained as output
> 
> Could you split the patch into a few pieces which implement these
> different parts.  The smaller logical chucks will make it easier to
> review?
> 

Sure. I will slice the patches in the next iteration of posting.

> > 
> > # cat trace
> > # tracer: ksym_tracer
> > #
> > #       TASK-PID      CPU#      Symbol         Type    Function         
> > #          |           |          |              |         |            
> > bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_
> conv+0x78/0x10c
> > bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_
> conv+0xa0/0x10c
> > bash            4502  3   pid_max               RW  .alloc_pid+0x8c/0x4a4
> > 
> > There are however a few limitations/caveats of the patch as identified
> > below:
> > 
> > - The patch is currently implemented only for PPC64 architecture. Other
> >   architectures (especially Book-E implementations are expected to
> >   happen in due course).
> > 
> > - HW Breakpoints over data addresses through Xmon (using "bd" command)
> >   and the proposed HW Breakpoint interfaces can now operate in a
> >   mutually exclusive manner. Xmon's integration is pending and is
> >   dependant on successful triggering of breakpoints through "bd<ops>".
> >   (Note: On a Power5 machine running 2.6.29, Xmon could not trigger HW
> >   Breakpoints when tested).
> > 
> > Kindly let me know your comments.
> > 
> > Thanks,
> > K.Prasad
> > 
> > 
> > Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/Kconfig                     |    1 
> >  arch/powerpc/include/asm/hw_breakpoint.h |   52 +++++
> >  arch/powerpc/include/asm/processor.h     |    1 
> >  arch/powerpc/include/asm/reg.h           |    2 
> >  arch/powerpc/include/asm/thread_info.h   |    2 
> >  arch/powerpc/kernel/Makefile             |    2 
> >  arch/powerpc/kernel/hw_breakpoint.c      |  271 ++++++++++++++++++++++++++++
> +++
> >  arch/powerpc/kernel/process.c            |   18 ++
> >  arch/powerpc/kernel/ptrace.c             |   48 +++++
> >  arch/powerpc/mm/fault.c                  |   14 -
> >  samples/hw_breakpoint/data_breakpoint.c  |    4 
> >  12 files changed, 423 insertions(+), 9 deletions(-)
> 
> You've not touched prace32.c.  Can we use this for 32 bit apps on a 64
> bit kernel?
> 

Given that PTRACE_SET_DEBUGREG invokes arch_ptrace() in ptrace32.c, I
don't see why it cannot work (although I haven't tested it yet).

> > 
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
> > @@ -125,6 +125,7 @@ config PPC
> >  	select USE_GENERIC_SMP_HELPERS if SMP
> >  	select HAVE_OPROFILE
> >  	select HAVE_SYSCALL_WRAPPERS if PPC64
> > +	select HAVE_HW_BREAKPOINT if PPC64
> >  
> >  config EARLY_PRINTK
> >  	bool
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
> > @@ -33,7 +33,7 @@ obj-$(CONFIG_PPC64)		+= setup_64.o sys_p
> >  				   signal_64.o ptrace32.o \
> >  				   paca.o cpu_setup_ppc970.o \
> >  				   cpu_setup_pa6t.o \
> > -				   firmware.o nvram_64.o
> > +				   firmware.o nvram_64.o hw_breakpoint.o
> >  obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
> >  obj-$(CONFIG_PPC64)		+= vdso64/
> >  obj-$(CONFIG_ALTIVEC)		+= vecemu.o vector.o
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
> > @@ -0,0 +1,271 @@
> > +/*
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program; if not, write to the Free Software
> > + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
> .
> > + *
> > + * Copyright (C) 2009 IBM Corporation
> > + */
> > +
> > +/*
> > + * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
> > + * using the CPU's debug registers.
> > + */
> > +
> > +#include <linux/notifier.h>
> > +#include <linux/kallsyms.h>
> > +#include <linux/kprobes.h>
> > +#include <linux/percpu.h>
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +#include <linux/sched.h>
> > +#include <linux/init.h>
> > +#include <linux/smp.h>
> > +
> > +#include <asm/hw_breakpoint.h>
> > +#include <asm/processor.h>
> > +#include <asm/sstep.h>
> > +
> > +/* Store the kernel-space breakpoint address value */
> > +static unsigned long kdabr;
> > +
> > +/*
> > + * Temporarily stores address for DABR before it is written by the
> > + * single-step handler routine
> > + */
> > +static DEFINE_PER_CPU(unsigned long, dabr_data);
> > +
> > +void arch_update_kernel_hw_breakpoint(void *unused)
> > +{
> > +	struct hw_breakpoint *bp;
> > +
> > +	/* Check if there is nothing to update */
> > +	if (hbp_kernel_pos == HB_NUM)
> > +		return;
> > +	bp = hbp_kernel[hbp_kernel_pos];
> > +	if (bp == NULL)
> > +		kdabr = 0;
> > +	else
> > +		kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION;
> > +	set_dabr(kdabr);
> > +}
> > +
> > +/*
> > + * Install the thread breakpoints in their debug registers.
> > + */
> > +void arch_install_thread_hw_breakpoint(struct task_struct *tsk)
> > +{
> > +	set_dabr(tsk->thread.dabr);
> > +}
> > +
> > +/*
> > + * Install the debug register values for just the kernel, no thread.
> > + */
> > +void arch_uninstall_thread_hw_breakpoint()
> > +{
> > +	set_dabr(0);
> > +}
> > +
> > +/*
> > + * Check for virtual address in user space.
> > + */
> > +int arch_check_va_in_userspace(unsigned long va, u8 hbp_len)
> > +{
> > +	return (va <= TASK_SIZE - HW_BREAKPOINT_LEN);
> > +}
> 
> You pass ing hbp_len, but then use HW_BREAKPOINT_LEN?  Is that right?
> 

The 'hbp_len' parameter is useful only when the host processor can watch
for varying range of addresses, which is not the case for PPC64. I guess
I should remove the parameter (although it may be required for other
PowerPC implementations).

> > +
> > +/*
> > + * Check for virtual address in kernel space.
> > + */
> > +int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len)
> > +{
> > +	return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE)
> ;
> 
> Can you use is_kernel_addr() here?
> 

The above routine does more checks than is_kernel_addr() i.e. ensures
that the last address (basically DABR address + address range monitored)
also lies in kernel-space.

However I agree that it isn't significant in PPC64 which has an
alignment requirement that is greater than the range of addresses
monitored. I will use arch_check_va_in_kernelspace() as a wrapper around
is_kernel_addr() for now.

> > +}
> > +
> > +/*
> > + * Store a breakpoint's encoded address, length, and type.
> > + */
> > +int arch_store_info(struct hw_breakpoint *bp)
> > +{
> > +	/*
> > +	 * User-space requests will always have the address field populated
> > +	 * For kernel-addresses, either the address or symbol name can be
> > +	 * specified.
> > +	 */
> > +	if (bp->info.name)
> > +		bp->info.address = (unsigned long)
> > +					kallsyms_lookup_name(bp->info.name);
> > +	if (bp->info.address)
> > +		return 0;
> > +	return -EINVAL;
> > +}
> > +
> > +/*
> > + * Validate the arch-specific HW Breakpoint register settings
> > + */
> > +int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
> > +						struct task_struct *tsk)
> > +{
> > +	int ret = -EINVAL;
> > +
> > +	if (!bp)
> > +		return ret;
> > +
> > +	switch (bp->info.type) {
> > +	case DABR_DATA_READ:
> > +		break;
> > +	case DABR_DATA_WRITE:
> > +		break;
> > +	case DABR_DATA_RW:
> > +		break;
> > +	default:
> > +		return ret;
> > +	}
> > +
> > +	if (bp->triggered)
> > +		ret = arch_store_info(bp);
> > +
> > +	/* Check for double word alignment - 8 bytes */
> > +	if (bp->info.address & HW_BREAKPOINT_ALIGN)
> > +		return -EINVAL;
> > +	return ret;
> > +}
> > +
> > +void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
> > +{
> > +	struct thread_struct *thread = &(tsk->thread);
> > +	struct hw_breakpoint *bp = thread->hbp[0];
> > +
> > +	if (bp)
> > +		thread->dabr = bp->info.address	| bp->info.type |
> > +				DABR_TRANSLATION;
> > +	else
> > +		thread->dabr = 0;
> > +}
> > +
> > +void arch_flush_thread_hw_breakpoint(struct task_struct *tsk)
> > +{
> > +	struct thread_struct *thread = &(tsk->thread);
> > +
> > +	thread->dabr = 0;
> > +}
> > +
> > +/*
> > + * Handle debug exception notifications.
> > + */
> > +int __kprobes hw_breakpoint_handler(struct die_args *args)
> > +{
> > +	int rc = NOTIFY_STOP;
> > +	struct hw_breakpoint *bp;
> > +	struct pt_regs *regs = args->regs;
> > +	unsigned long dar;
> > +	int cpu, stepped, is_kernel;
> > +
> > +	/* Disable breakpoints during exception handling */
> > +	set_dabr(0);
> > +
> > +	dar = regs->dar & (~HW_BREAKPOINT_ALIGN);
> > +	is_kernel = (dar >= TASK_SIZE) ? 1 : 0;
> > +
> > +	if (is_kernel)
> > +		bp = hbp_kernel[0];
> > +	else {
> > +		bp = current->thread.hbp[0];
> > +		/* Lazy debug register switching */
> > +		if (!bp)
> > +			return rc;
> > +		rc = NOTIFY_DONE;
> > +	}
> > +
> > +	(bp->triggered)(bp, regs);
> > +
> > +	cpu = get_cpu();
> > +	if (is_kernel)
> > +		per_cpu(dabr_data, cpu) = kdabr;
> > +	else
> > +		per_cpu(dabr_data, cpu) = current->thread.dabr;
> > +
> > +	stepped = emulate_step(regs, regs->nip);
> > +	/*
> > +	 * Single-step the causative instruction manually if
> > +	 * emulate_step() could not execute it
> > +	 */
> > +	if (stepped == 0) {
> > +		regs->msr |= MSR_SE;
> > +		goto out;
> > +	}
> 
> This is where we backout to single step mode?
> 

Yes, if emulate_step() has failed us we do manual single-stepping.

> > +
> > +	set_dabr(per_cpu(dabr_data, cpu));
> > +out:
> > +	/* Enable pre-emption only if single-stepping is finished */
> > +	if (stepped)
> > +		put_cpu_no_resched();
> > +	return rc;
> > +}
> > +
> +/*
> > + * Handle single-step exceptions following a DABR hit.
> > + */
> > +int __kprobes single_step_dabr_instruction(struct die_args *args)
> > +{
> > +	struct pt_regs *regs = args->regs;
> > +	int cpu = get_cpu();
> > +	int ret = NOTIFY_DONE;
> > +	siginfo_t info;
> > +	unsigned long this_dabr_data = per_cpu(dabr_data, cpu);
> > +
> > +	/*
> > +	 * Check if we are single-stepping as a result of a
> > +	 * previous HW Breakpoint exception
> > +	 */
> > +	if (this_dabr_data == 0)
> > +		goto out;
> > +
> > +	regs->msr &= ~MSR_SE;
> > +	/* Deliver signal to user-space */
> > +	if (this_dabr_data < TASK_SIZE) {
> > +		info.si_signo = SIGTRAP;
> > +		info.si_errno = 0;
> > +		info.si_code = TRAP_HWBKPT;
> > +		info.si_addr = (void __user *)(per_cpu(dabr_data, cpu));
> > +		force_sig_info(SIGTRAP, &info, current);
> > +	}
> > +
> > +	set_dabr(this_dabr_data);
> > +	per_cpu(dabr_data, cpu) = 0;
> > +	ret = NOTIFY_STOP;
> > +	put_cpu_no_resched();
> > +
> > +out:
> > +	put_cpu_no_resched();
> 
> This looks wrong.  put_cpu_no_resched() twice?
> 

single_step_dabr_instruction() is interested in notifications from
notify_die when DIE_SSTEP is the reason. This means it is invoked when
single stepping occurs due to other requests (such as kprobes/xmon) and
not just after hw_breakpoint_handler(). So, a pair of

"int cpu = get_cpu();" and "put_cpu_no_resched()" are meant for the
above case.

Secondly, we want to atomically single-step the causative instruction after
a HW Breakpoint exception and hence put_cpu_no_resched() is invoked only in
single_step_dabr_instruction() and not in hw_breakpoint_handler() when
single-stepping manually.

Alternatively, smp_processor_id() can be used if preemption is already
disabled and avoid a get_cpu() call but that just adds more code without
any apparent benefit. Hence the put_cpu_no_resched() twice.

Let me know if you can think of something better.

> > +	return ret;
> > +}
> > +
> > +/*
> > + * Handle debug exception notifications.
> > + */
> > +int __kprobes hw_breakpoint_exceptions_notify(
> > +		struct notifier_block *unused, unsigned long val, void *data)
> > +{
> > +	int ret = NOTIFY_DONE;
> > +
> > +	switch (val) {
> > +	case DIE_DABR_MATCH:
> > +		ret = hw_breakpoint_handler(data);
> > +		break;
> > +	case DIE_SSTEP:
> > +		ret = single_step_dabr_instruction(data);
> > +		break;
> > +	}
> > +
> > +	return ret;
> > +}
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
> > @@ -0,0 +1,52 @@
> > +#ifndef	_PPC64_HW_BREAKPOINT_H
> > +#define	_PPC64_HW_BREAKPOINT_H
> > +
> > +#ifdef	__KERNEL__
> > +#define	__ARCH_HW_BREAKPOINT_H
> > +
> > +struct arch_hw_breakpoint {
> > +	char		*name; /* Contains name of the symbol to set bkpt */
> > +	unsigned long	address;
> > +	u8		type;
> > +};
> 
> Can you reorder this to pack the struct better (ie. put the unsigned
> long first).
> 
> > +
> > +#include <linux/kdebug.h>
> > +#include <asm/reg.h>
> > +#include <asm-generic/hw_breakpoint.h>
> > +
> > +#define HW_BREAKPOINT_READ DABR_DATA_READ
> > +#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE
> > +#define HW_BREAKPOINT_RW DABR_DATA_RW
> > +
> > +#define HW_BREAKPOINT_ALIGN 0x7
> > +#define HW_BREAKPOINT_LEN 4
> 
> What is HW_BREAKPOINT_LEN?  
> 
> Obviouslt all instructions on ppc64 are 4 bytes.  This seems to be
> defined in a few places, like here and MCOUNT_INSN_SIZE (ftrace.h).  Can
> you create a #define for this generically and get all refers to use this
> instead of #define-ing 4 everywhere.
> 

Agreed. A macro definition should have ideally existed in
"arch/powerpc/include/asm/reg.h" already but I found none.

Through a separate patch, I will define INSTRUCTION_LEN in reg.h and
define HW_BREAKPOINT_LEN to INSTRUCTION_LEN (because the former gels
well with the surrounding HW Breakpoint related name-space).

> > +
> > +extern struct hw_breakpoint *hbp_kernel[HB_NUM];
> > +extern unsigned int hbp_user_refcount[HB_NUM];
> > +
> > +/*
> > + * Ptrace support: breakpoint trigger routine.
> > + */
> > +extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk,
> > +			struct hw_breakpoint *bp);
> > +
> > +extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk);
> > +extern void arch_uninstall_thread_hw_breakpoint(void);
> > +extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len);
> > +extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len);
> > +extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
> > +						struct task_struct *tsk);
> > +extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
> ;
> > +extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk);
> > +extern void arch_update_kernel_hw_breakpoint(void *);
> > +extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
> > +				     unsigned long val, void *data);
> > +
> > +extern void flush_thread_hw_breakpoint(struct task_struct *tsk);
> > +extern int copy_thread_hw_breakpoint(struct task_struct *tsk,
> > +		struct task_struct *child, unsigned long clone_flags);
> > +extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk);
> > +
> > +#endif	/* __KERNEL__ */
> > +#endif	/* _PPC64_HW_BREAKPOINT_H */
> > +
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
> > @@ -177,6 +177,7 @@ struct thread_struct {
> >  #ifdef CONFIG_PPC64
> >  	unsigned long	start_tb;	/* Start purr when proc switched in */
> >  	unsigned long	accum_tb;	/* Total accumilated purr for process *
> /
> > +	struct hw_breakpoint *hbp[HB_NUM];
> >  #endif
> >  	unsigned long	dabr;		/* Data address breakpoint register */
> >  #ifdef CONFIG_ALTIVEC
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
> > @@ -37,6 +37,9 @@
> >  #include <asm/page.h>
> >  #include <asm/pgtable.h>
> >  #include <asm/system.h>
> > +#ifdef CONFIG_PPC64
> > +#include <asm/hw_breakpoint.h>
> > +#endif
> >  
> >  /*
> >   * does not yet catch signals sent when the child dies.
> > @@ -735,9 +738,22 @@ void user_disable_single_step(struct tas
> >  	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
> >  }
> >  
> > +static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs)
> > +{
> > +	/*
> > +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> > +	 * We don't have to do anything here
> > +	 */
> > +}
> > +
> >  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
> >  			       unsigned long data)
> >  {
> > +#ifdef CONFIG_PPC64
> > +	struct thread_struct *thread = &(task->thread);
> > +	struct hw_breakpoint *bp;
> > +	int ret;
> > +#endif
> >  	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
> >  	 *  For embedded processors we support one DAC and no IAC's at the
> >  	 *  moment.
> > @@ -767,6 +783,38 @@ int ptrace_set_debugreg(struct task_stru
> >  	if (data && !(data & DABR_TRANSLATION))
> >  		return -EIO;
> >  
> > +#ifdef CONFIG_PPC64
> > +	bp = thread->hbp[0];
> > +	if ((data & ~0x7UL) == 0) {
> 
> Should use HW_BREAKPOINT_ALIGN here instead of 0x7?  Is 0 some special
> command?  Seems to be lots of special numbers here related to using data
> (later you mask is 0x3).  What is happening here?  What is contained in
> the data parameter?  It overloads the type and the address?
> 

I agree that HW_BREAKPOINT_ALIGN must have been used here and will
eliminate 0x3 using DABR_DATA_RW.

'data' variable is a composite of address | translation_enabled | type.

> > +		if (bp) {
> > +			unregister_user_hw_breakpoint(task, bp);
> > +			kfree(bp);
> > +			thread->hbp[0] = NULL;
> > +		}
> > +		return 0;
> > +	}
> > +
> > +	if (bp) {
> > +		bp->info.type = data & 0x3UL;
> > +		task->thread.dabr = bp->info.address =
> > +						(data & ~HW_BREAKPOINT_ALIGN);
> > +		return __modify_user_hw_breakpoint(0, task, bp);
> > +	}
> > +	bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL);
> 
> When a processes ends, will this be correctly freed?  Should it be freed
> in arch_flush_thread_hw_breakpoint?
> 

It is freed in flush_thread_hw_breakpoint() which is a part of the
proposed kernel/hw_breakpoint.c (http://lkml.org/lkml/2009/5/11/159).

> > +	if (!bp)
> > +		return -ENOMEM;
> > +
> > +	/* Store the type of breakpoint */
> > +	bp->info.type = data & 0x3UL;
> > +	bp->triggered = ptrace_triggered;
> > +	task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN);
> > +
> > +	ret = register_user_hw_breakpoint(task, bp);
> > +	if (ret)
> > +		return ret;
> > +	set_tsk_thread_flag(task, TIF_DEBUG);
> > +#endif /* CONFIG_PPC64 */
> > +
> >  	/* Move contents to the DABR register */
> >  	task->thread.dabr = data;
> >  
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
> > @@ -50,6 +50,7 @@
> >  #include <asm/syscalls.h>
> >  #ifdef CONFIG_PPC64
> >  #include <asm/firmware.h>
> > +#include <asm/hw_breakpoint.h>
> >  #endif
> >  #include <linux/kprobes.h>
> >  #include <linux/kdebug.h>
> > @@ -254,8 +255,10 @@ void do_dabr(struct pt_regs *regs, unsig
> >  			11, SIGSEGV) == NOTIFY_STOP)
> >  		return;
> >  
> > +#ifndef CONFIG_PPC64
> >  	if (debugger_dabr_match(regs))
> >  		return;
> > +#endif
> >  
> >  	/* Clear the DAC and struct entries.  One shot trigger */
> >  #if defined(CONFIG_BOOKE)
> > @@ -372,8 +375,13 @@ struct task_struct *__switch_to(struct t
> >  
> >  #endif /* CONFIG_SMP */
> >  
> > +#ifdef CONFIG_PPC64
> > +		if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG)))
> > +			switch_to_thread_hw_breakpoint(new);
> > +#else
> >  	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
> >  		set_dabr(new->thread.dabr);
> > +#endif /* CONFIG_PPC64 */
> >  
> >  #if defined(CONFIG_BOOKE)
> >  	/* If new thread DAC (HW breakpoint) is the same then leave it */
> > @@ -550,6 +558,10 @@ void show_regs(struct pt_regs * regs)
> >  void exit_thread(void)
> >  {
> >  	discard_lazy_cpu_state();
> > +#ifdef CONFIG_PPC64
> > +	if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG)))
> > +		flush_thread_hw_breakpoint(current);
> > +#endif /* CONFIG_PPC64 */
> >  }
> >  
> >  void flush_thread(void)
> > @@ -605,6 +617,9 @@ int copy_thread(unsigned long clone_flag
> >  	struct pt_regs *childregs, *kregs;
> >  	extern void ret_from_fork(void);
> >  	unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE;
> > +#ifdef CONFIG_PPC64
> > +	struct task_struct *tsk = current;
> > +#endif
> >  
> >  	CHECK_FULL_REGS(regs);
> >  	/* Copy registers */
> > @@ -672,6 +687,9 @@ int copy_thread(unsigned long clone_flag
> >  	 * function.
> >   	 */
> >  	kregs->nip = *((unsigned long *)ret_from_fork);
> > +
> > +	if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG)))
> > +		copy_thread_hw_breakpoint(tsk, p, clone_flags);
> >  #else
> >  	kregs->nip = (unsigned long)ret_from_fork;
> >  #endif
> > Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c
> > +++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
> > @@ -54,6 +54,10 @@ static int __init hw_break_module_init(v
> >  	sample_hbp.info.type = HW_BREAKPOINT_WRITE;
> >  	sample_hbp.info.len = HW_BREAKPOINT_LEN_4;
> >  #endif /* CONFIG_X86 */
> > +#ifdef CONFIG_PPC64
> > +	sample_hbp.info.name = ksym_name;
> > +	sample_hbp.info.type = DABR_DATA_WRITE;
> > +#endif /* CONFIG_PPC64 */
> >  
> >  	sample_hbp.triggered = (void *)sample_hbp_handler;
> >  
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
> > @@ -114,6 +114,7 @@ static inline struct thread_info *curren
> >  #define TIF_FREEZE		14	/* Freezing for suspend */
> >  #define TIF_RUNLATCH		15	/* Is the runlatch enabled? */
> >  #define TIF_ABI_PENDING		16	/* 32/64 bit switch needed */
> > +#define TIF_DEBUG		17	/* uses debug registers */
> >  
> >  /* as above, but as bit values */
> >  #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
> > @@ -132,6 +133,7 @@ static inline struct thread_info *curren
> >  #define _TIF_FREEZE		(1<<TIF_FREEZE)
> >  #define _TIF_RUNLATCH		(1<<TIF_RUNLATCH)
> >  #define _TIF_ABI_PENDING	(1<<TIF_ABI_PENDING)
> > +#define _TIF_DEBUG		(1<<TIF_DEBUG)
> >  #define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SEC
> COMP)
> >  
> >  #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
> > @@ -184,9 +184,11 @@
> >  #define   CTRL_TE	0x00c00000	/* thread enable */
> >  #define   CTRL_RUNLATCH	0x1
> >  #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
> > +#define   HB_NUM	1	/* Number of physical HW breakpoint registers *
> /
> 
> You've shortedned "hardware breakpoint" to "hbp" elsewhere.  Can you be
> consistent here so change this to HBP_NUM?
> 

Thanks for pointing it out! I will change.

> >  #define   DABR_TRANSLATION	(1UL << 2)
> >  #define   DABR_DATA_WRITE	(1UL << 1)
> >  #define   DABR_DATA_READ	(1UL << 0)
> > +#define   DABR_DATA_RW		(3UL << 0)
> >  #define SPRN_DABR2	0x13D	/* e300 */
> >  #define SPRN_DABRX	0x3F7	/* Data Address Breakpoint Register Extension *
> /
> >  #define   DABRX_USER	(1UL << 0)
> > Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> > ===================================================================
> > --- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
> > +++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> > @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
> >  		error_code &= 0x48200000;
> >  	else
> >  		is_write = error_code & DSISR_ISSTORE;
> > +
> > +	if (error_code & DSISR_DABRMATCH) {
> > +		/* DABR match */
> > +		do_dabr(regs, address, error_code);
> > +		return 0;
> > +	}
> >  #else
> >  	is_write = error_code & ESR_DST;
> >  #endif /* CONFIG_4xx || CONFIG_BOOKE */
> > @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
> >  	if (!user_mode(regs) && (address >= TASK_SIZE))
> >  		return SIGSEGV;
> >  
> > -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
> > -  	if (error_code & DSISR_DABRMATCH) {
> > -		/* DABR match */
> > -		do_dabr(regs, address, error_code);
> > -		return 0;
> > -	}
> > -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
> > -
> >  	if (in_atomic() || mm == NULL) {
> >  		if (!user_mode(regs))
> >  			return SIGSEGV;
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@ozlabs.org
> > https://ozlabs.org/mailman/listinfo/linuxppc-dev
> > 

Thanks for the code-review. I will send out the revised patchset with
the changes agreed above.

-- K.Prasad
K.Prasad May 12, 2009, 8:25 p.m. UTC | #6
On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
> >Hi PPC Dev folks,
> >	Please find a patch below that implements the proposed Hardware
> >Breakpoint interfaces for PPC64 architecture.
> >
> >As a brief introduction, the proposed Hardware Breakpoint
> >infrastructure provides generic in-kernel interfaces to which users
> >from kernel- and user-space can request for breakpoint registers. An
> >ftrace plugin that can trace accesses to data variables is also part
> >of the generic HW Breakpoint interface patchset. The latest submission
> >for this patchset along with an x86 implementation can be found here:
> >http://lkml.org/lkml/2009/5/11/159.
> >
> >The following are the salient features of the PPC64 patch.
> >
> >- Arch-specific definitions for kernel and user-space requests are
> >  defined in arch/powerpc/kernel/hw_breakpoint.c
> >- Ptrace is converted to use the HW Breakpoint interfaces
> >- The ftrace plugin called ksym_tracer is tested to work fine. For
> >  instance when tracing pid_max kernel variable, the following was
> >  obtained as output
> >
> ># cat trace
> ># tracer: ksym_tracer
> >#
> >#       TASK-PID      CPU#      Symbol         Type    Function         
> >#          |           |          |              |         |            
> >bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0x78/0x10c
> >bash            4502  3   pid_max               RW  .do_proc_dointvec_minmax_conv+0xa0/0x10c
> >bash            4502  3   pid_max               RW  .alloc_pid+0x8c/0x4a4
> >
> >There are however a few limitations/caveats of the patch as identified
> >below:
> >
> >- The patch is currently implemented only for PPC64 architecture. Other
> >  architectures (especially Book-E implementations are expected to
> >  happen in due course).
> 
> Does this mean you will work on transitioning Book-E implementations, or that
> you expect the Book-E maintainers to?  I'm just curious.  The code as written
> relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately
> doesn't follow that at all.
> 
> Book-E also allows for more than one HW breakpoint, which means you're growing
> the thread_struct by 32-bytes to support 4 of them.  64-bytes if this ever
> supports DAC events.  Have you thought at all about a way to support this
> without carrying around the data in the thread struct?
> 
> <snip>
>

The idea behind embedding the physical debug register values in
thread_struct is to use its synchronisation mechanisms for HW Breakpoint
related fields too. These values were originally maintained in a
separate structure whose pointer lay in thread_struct but was modified
based on a suggestion from Ingo Molnar (here:
http://lkml.org/lkml/2009/3/10/210).

I do see that Book-E processors will have severe memory footprint
constraints (in embedded environment) and if the maintainers carry a
different perspective (than the one cited above), the relevant fields
can be migrated to a new structure whose pointer will be embedded in
task_struct. The generic code may have to carry some #ifdefs though.

 
> >Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> >===================================================================
> >--- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
> >+++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
> >@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
> > 		error_code &= 0x48200000;
> > 	else
> > 		is_write = error_code & DSISR_ISSTORE;
> >+
> >+	if (error_code & DSISR_DABRMATCH) {
> >+		/* DABR match */
> >+		do_dabr(regs, address, error_code);
> >+		return 0;
> >+	}
> > #else
> > 	is_write = error_code & ESR_DST;
> > #endif /* CONFIG_4xx || CONFIG_BOOKE */
> >@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
> > 	if (!user_mode(regs) && (address >= TASK_SIZE))
> > 		return SIGSEGV;
> > 
> >-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
> >-  	if (error_code & DSISR_DABRMATCH) {
> >-		/* DABR match */
> >-		do_dabr(regs, address, error_code);
> >-		return 0;
> >-	}
> >-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
> >-
> > 	if (in_atomic() || mm == NULL) {
> > 		if (!user_mode(regs))
> > 			return SIGSEGV;
> 
> 
> I don't understand why this was changed, and the changelog doesn't highlight it.
> 
> josh

The intention is to capture the exception much before kprobes and xmon
do. The HW Breakpoint exception handler will return NOTIFY_DONE if the
exception doesn't belong to it and doesn't harm the rest. kprobes has
been tested to work fine alongwith HW Breakpoints.

I will add a description about this change in the next iteration of the
patch.

Thanks,
K.Prasad
K.Prasad May 12, 2009, 8:28 p.m. UTC | #7
On Tue, May 12, 2009 at 11:47:38AM -0500, Scott Wood wrote:
> On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
> > >- The patch is currently implemented only for PPC64 architecture. Other
> > >  architectures (especially Book-E implementations are expected to
> > >  happen in due course).
> > 
> > Does this mean you will work on transitioning Book-E implementations, or that
> > you expect the Book-E maintainers to?  I'm just curious.  The code as written
> > relies heavily on the DABR/MSR setup that ppc64 has and Book-E unfortunately
> > doesn't follow that at all.
> 
> And since there will eventually be 64-bit book E chips, we need to use
> something more specific than CONFIG_PPC64 in the #ifdef (easier now than
> figuring out which ones are breakpoint-related later).
> 
> -Scott

Sure. I will be glad to receive & incorporate suggestions in this regard
from PowerPC experts (or will 64-bit Book-E implementations be called by
a different name altogether?)

Thanks,
K.Prasad
David Gibson May 13, 2009, 2:57 a.m. UTC | #8
On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote:
> On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
[snip]
> I do see that Book-E processors will have severe memory footprint
> constraints (in embedded environment) and if the maintainers carry a
> different perspective (than the one cited above), the relevant fields
> can be migrated to a new structure whose pointer will be embedded in
> task_struct. The generic code may have to carry some #ifdefs though.

I think moving the debug register info into a separate structure makes
a fair bit of sense.  As well as reducing the memory footprint for
systems with lots of debug regs, checking it the pointer is NULL
provides a simple and generic way of determining if the process has
touched the debug regs.

It seems to me that a kind of minimal requirement for a sensible
generic debug interface is that if no processes actually ask to use
the debug regs, then we should never touch them in the hardware.  This
means that debugging hacks in the kernel can just use the debug regs
directly and don't have to go through the interface to avoid having
their stuff clobbered on context switch.
David Gibson May 13, 2009, 3 a.m. UTC | #9
On Wed, May 13, 2009 at 12:57:17PM +1000, David Gibson wrote:
> On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote:
> > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
> [snip]
> > I do see that Book-E processors will have severe memory footprint
> > constraints (in embedded environment) and if the maintainers carry a
> > different perspective (than the one cited above), the relevant fields
> > can be migrated to a new structure whose pointer will be embedded in
> > task_struct. The generic code may have to carry some #ifdefs though.
> 
> I think moving the debug register info into a separate structure makes
> a fair bit of sense.  As well as reducing the memory footprint for
> systems with lots of debug regs, checking it the pointer is NULL
> provides a simple and generic way of determining if the process has
> touched the debug regs.
> 
> It seems to me that a kind of minimal requirement for a sensible
> generic debug interface is that if no processes actually ask to use
> the debug regs, then we should never touch them in the hardware.  This
> means that debugging hacks in the kernel can just use the debug regs
> directly and don't have to go through the interface to avoid having
> their stuff clobbered on context switch.

Uh, sorry, didn't fully read the earlier mail.  Ingo makes a good
point that detaching the structure complicates locking / lifetime
issues.  Leaving it in the thread_struct probably makes sense for now.

Although it raises other issues for when different CPUs in the same
architecture have different sets of debug registers (e.g. PPC server's
DABR/IABR set versus 44x's IAC/DAC/DBCR regs).
K.Prasad May 14, 2009, 6:52 p.m. UTC | #10
On Wed, May 13, 2009 at 12:57:17PM +1000, David Gibson wrote:
> On Wed, May 13, 2009 at 01:55:45AM +0530, K.Prasad wrote:
> > On Tue, May 12, 2009 at 07:51:49AM -0400, Josh Boyer wrote:
> > > On Tue, May 12, 2009 at 01:33:55AM +0530, K.Prasad wrote:
> [snip]
> 
> It seems to me that a kind of minimal requirement for a sensible
> generic debug interface is that if no processes actually ask to use
> the debug regs, then we should never touch them in the hardware.  This
> means that debugging hacks in the kernel can just use the debug regs
> directly and don't have to go through the interface to avoid having
> their stuff clobbered on context switch.
> 

All that we do additionally for kernel users (when no process is
currently using the debug regs) is to account for the usage just so that
any new requests (kernel or user-space) are denied.

> -- 
> David Gibson			| I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
> 				| _way_ _around_!
> http://www.ozlabs.org/~dgibson

Thanks,
K.Prasad
diff mbox

Patch

Index: linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/Kconfig
+++ linux-2.6-tip.hbkpt/arch/powerpc/Kconfig
@@ -125,6 +125,7 @@  config PPC
 	select USE_GENERIC_SMP_HELPERS if SMP
 	select HAVE_OPROFILE
 	select HAVE_SYSCALL_WRAPPERS if PPC64
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/Makefile
+++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/Makefile
@@ -33,7 +33,7 @@  obj-$(CONFIG_PPC64)		+= setup_64.o sys_p
 				   signal_64.o ptrace32.o \
 				   paca.o cpu_setup_ppc970.o \
 				   cpu_setup_pa6t.o \
-				   firmware.o nvram_64.o
+				   firmware.o nvram_64.o hw_breakpoint.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC64)		+= vdso64/
 obj-$(CONFIG_ALTIVEC)		+= vecemu.o vector.o
Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/hw_breakpoint.c
@@ -0,0 +1,271 @@ 
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2009 IBM Corporation
+ */
+
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers.
+ */
+
+#include <linux/notifier.h>
+#include <linux/kallsyms.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/* Store the kernel-space breakpoint address value */
+static unsigned long kdabr;
+
+/*
+ * Temporarily stores address for DABR before it is written by the
+ * single-step handler routine
+ */
+static DEFINE_PER_CPU(unsigned long, dabr_data);
+
+void arch_update_kernel_hw_breakpoint(void *unused)
+{
+	struct hw_breakpoint *bp;
+
+	/* Check if there is nothing to update */
+	if (hbp_kernel_pos == HB_NUM)
+		return;
+	bp = hbp_kernel[hbp_kernel_pos];
+	if (bp == NULL)
+		kdabr = 0;
+	else
+		kdabr = bp->info.address | bp->info.type | DABR_TRANSLATION;
+	set_dabr(kdabr);
+}
+
+/*
+ * Install the thread breakpoints in their debug registers.
+ */
+void arch_install_thread_hw_breakpoint(struct task_struct *tsk)
+{
+	set_dabr(tsk->thread.dabr);
+}
+
+/*
+ * Install the debug register values for just the kernel, no thread.
+ */
+void arch_uninstall_thread_hw_breakpoint()
+{
+	set_dabr(0);
+}
+
+/*
+ * Check for virtual address in user space.
+ */
+int arch_check_va_in_userspace(unsigned long va, u8 hbp_len)
+{
+	return (va <= TASK_SIZE - HW_BREAKPOINT_LEN);
+}
+
+/*
+ * Check for virtual address in kernel space.
+ */
+int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len)
+{
+	return (va >= TASK_SIZE) && ((va + HW_BREAKPOINT_LEN - 1) >= TASK_SIZE);
+}
+
+/*
+ * Store a breakpoint's encoded address, length, and type.
+ */
+int arch_store_info(struct hw_breakpoint *bp)
+{
+	/*
+	 * User-space requests will always have the address field populated
+	 * For kernel-addresses, either the address or symbol name can be
+	 * specified.
+	 */
+	if (bp->info.name)
+		bp->info.address = (unsigned long)
+					kallsyms_lookup_name(bp->info.name);
+	if (bp->info.address)
+		return 0;
+	return -EINVAL;
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
+						struct task_struct *tsk)
+{
+	int ret = -EINVAL;
+
+	if (!bp)
+		return ret;
+
+	switch (bp->info.type) {
+	case DABR_DATA_READ:
+		break;
+	case DABR_DATA_WRITE:
+		break;
+	case DABR_DATA_RW:
+		break;
+	default:
+		return ret;
+	}
+
+	if (bp->triggered)
+		ret = arch_store_info(bp);
+
+	/* Check for double word alignment - 8 bytes */
+	if (bp->info.address & HW_BREAKPOINT_ALIGN)
+		return -EINVAL;
+	return ret;
+}
+
+void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk)
+{
+	struct thread_struct *thread = &(tsk->thread);
+	struct hw_breakpoint *bp = thread->hbp[0];
+
+	if (bp)
+		thread->dabr = bp->info.address	| bp->info.type |
+				DABR_TRANSLATION;
+	else
+		thread->dabr = 0;
+}
+
+void arch_flush_thread_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *thread = &(tsk->thread);
+
+	thread->dabr = 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct hw_breakpoint *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar;
+	int cpu, stepped, is_kernel;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+
+	dar = regs->dar & (~HW_BREAKPOINT_ALIGN);
+	is_kernel = (dar >= TASK_SIZE) ? 1 : 0;
+
+	if (is_kernel)
+		bp = hbp_kernel[0];
+	else {
+		bp = current->thread.hbp[0];
+		/* Lazy debug register switching */
+		if (!bp)
+			return rc;
+		rc = NOTIFY_DONE;
+	}
+
+	(bp->triggered)(bp, regs);
+
+	cpu = get_cpu();
+	if (is_kernel)
+		per_cpu(dabr_data, cpu) = kdabr;
+	else
+		per_cpu(dabr_data, cpu) = current->thread.dabr;
+
+	stepped = emulate_step(regs, regs->nip);
+	/*
+	 * Single-step the causative instruction manually if
+	 * emulate_step() could not execute it
+	 */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		goto out;
+	}
+
+	set_dabr(per_cpu(dabr_data, cpu));
+out:
+	/* Enable pre-emption only if single-stepping is finished */
+	if (stepped)
+		put_cpu_no_resched();
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	unsigned long this_dabr_data = per_cpu(dabr_data, cpu);
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (this_dabr_data == 0)
+		goto out;
+
+	regs->msr &= ~MSR_SE;
+	/* Deliver signal to user-space */
+	if (this_dabr_data < TASK_SIZE) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)(per_cpu(dabr_data, cpu));
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(this_dabr_data);
+	per_cpu(dabr_data, cpu) = 0;
+	ret = NOTIFY_STOP;
+	put_cpu_no_resched();
+
+out:
+	put_cpu_no_resched();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/hw_breakpoint.h
@@ -0,0 +1,52 @@ 
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+
+struct arch_hw_breakpoint {
+	char		*name; /* Contains name of the symbol to set bkpt */
+	unsigned long	address;
+	u8		type;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm-generic/hw_breakpoint.h>
+
+#define HW_BREAKPOINT_READ DABR_DATA_READ
+#define HW_BREAKPOINT_WRITE DABR_DATA_WRITE
+#define HW_BREAKPOINT_RW DABR_DATA_RW
+
+#define HW_BREAKPOINT_ALIGN 0x7
+#define HW_BREAKPOINT_LEN 4
+
+extern struct hw_breakpoint *hbp_kernel[HB_NUM];
+extern unsigned int hbp_user_refcount[HB_NUM];
+
+/*
+ * Ptrace support: breakpoint trigger routine.
+ */
+extern int __modify_user_hw_breakpoint(int pos, struct task_struct *tsk,
+			struct hw_breakpoint *bp);
+
+extern void arch_install_thread_hw_breakpoint(struct task_struct *tsk);
+extern void arch_uninstall_thread_hw_breakpoint(void);
+extern int arch_check_va_in_userspace(unsigned long va, u8 hbp_len);
+extern int arch_check_va_in_kernelspace(unsigned long va, u8 hbp_len);
+extern int arch_validate_hwbkpt_settings(struct hw_breakpoint *bp,
+						struct task_struct *tsk);
+extern void arch_update_user_hw_breakpoint(int pos, struct task_struct *tsk);
+extern void arch_flush_thread_hw_breakpoint(struct task_struct *tsk);
+extern void arch_update_kernel_hw_breakpoint(void *);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+				     unsigned long val, void *data);
+
+extern void flush_thread_hw_breakpoint(struct task_struct *tsk);
+extern int copy_thread_hw_breakpoint(struct task_struct *tsk,
+		struct task_struct *child, unsigned long clone_flags);
+extern void switch_to_thread_hw_breakpoint(struct task_struct *tsk);
+
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
+
Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/processor.h
@@ -177,6 +177,7 @@  struct thread_struct {
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct hw_breakpoint *hbp[HB_NUM];
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/ptrace.c
+++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/ptrace.c
@@ -37,6 +37,9 @@ 
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/system.h>
+#ifdef CONFIG_PPC64
+#include <asm/hw_breakpoint.h>
+#endif
 
 /*
  * does not yet catch signals sent when the child dies.
@@ -735,9 +738,22 @@  void user_disable_single_step(struct tas
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+static void ptrace_triggered(struct hw_breakpoint *bp, struct pt_regs *regs)
+{
+	/*
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything here
+	 */
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	struct thread_struct *thread = &(task->thread);
+	struct hw_breakpoint *bp;
+	int ret;
+#endif
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -767,6 +783,38 @@  int ptrace_set_debugreg(struct task_stru
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
 
+#ifdef CONFIG_PPC64
+	bp = thread->hbp[0];
+	if ((data & ~0x7UL) == 0) {
+		if (bp) {
+			unregister_user_hw_breakpoint(task, bp);
+			kfree(bp);
+			thread->hbp[0] = NULL;
+		}
+		return 0;
+	}
+
+	if (bp) {
+		bp->info.type = data & 0x3UL;
+		task->thread.dabr = bp->info.address =
+						(data & ~HW_BREAKPOINT_ALIGN);
+		return __modify_user_hw_breakpoint(0, task, bp);
+	}
+	bp = kzalloc(sizeof(struct hw_breakpoint), GFP_KERNEL);
+	if (!bp)
+		return -ENOMEM;
+
+	/* Store the type of breakpoint */
+	bp->info.type = data & 0x3UL;
+	bp->triggered = ptrace_triggered;
+	task->thread.dabr = bp->info.address = (data & ~HW_BREAKPOINT_ALIGN);
+
+	ret = register_user_hw_breakpoint(task, bp);
+	if (ret)
+		return ret;
+	set_tsk_thread_flag(task, TIF_DEBUG);
+#endif /* CONFIG_PPC64 */
+
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
 
Index: linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/kernel/process.c
+++ linux-2.6-tip.hbkpt/arch/powerpc/kernel/process.c
@@ -50,6 +50,7 @@ 
 #include <asm/syscalls.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
+#include <asm/hw_breakpoint.h>
 #endif
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
@@ -254,8 +255,10 @@  void do_dabr(struct pt_regs *regs, unsig
 			11, SIGSEGV) == NOTIFY_STOP)
 		return;
 
+#ifndef CONFIG_PPC64
 	if (debugger_dabr_match(regs))
 		return;
+#endif
 
 	/* Clear the DAC and struct entries.  One shot trigger */
 #if defined(CONFIG_BOOKE)
@@ -372,8 +375,13 @@  struct task_struct *__switch_to(struct t
 
 #endif /* CONFIG_SMP */
 
+#ifdef CONFIG_PPC64
+		if (unlikely(test_tsk_thread_flag(new, TIF_DEBUG)))
+			switch_to_thread_hw_breakpoint(new);
+#else
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 
 #if defined(CONFIG_BOOKE)
 	/* If new thread DAC (HW breakpoint) is the same then leave it */
@@ -550,6 +558,10 @@  void show_regs(struct pt_regs * regs)
 void exit_thread(void)
 {
 	discard_lazy_cpu_state();
+#ifdef CONFIG_PPC64
+	if (unlikely(test_tsk_thread_flag(current, TIF_DEBUG)))
+		flush_thread_hw_breakpoint(current);
+#endif /* CONFIG_PPC64 */
 }
 
 void flush_thread(void)
@@ -605,6 +617,9 @@  int copy_thread(unsigned long clone_flag
 	struct pt_regs *childregs, *kregs;
 	extern void ret_from_fork(void);
 	unsigned long sp = (unsigned long)task_stack_page(p) + THREAD_SIZE;
+#ifdef CONFIG_PPC64
+	struct task_struct *tsk = current;
+#endif
 
 	CHECK_FULL_REGS(regs);
 	/* Copy registers */
@@ -672,6 +687,9 @@  int copy_thread(unsigned long clone_flag
 	 * function.
  	 */
 	kregs->nip = *((unsigned long *)ret_from_fork);
+
+	if (unlikely(test_tsk_thread_flag(tsk, TIF_DEBUG)))
+		copy_thread_hw_breakpoint(tsk, p, clone_flags);
 #else
 	kregs->nip = (unsigned long)ret_from_fork;
 #endif
Index: linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
===================================================================
--- linux-2.6-tip.hbkpt.orig/samples/hw_breakpoint/data_breakpoint.c
+++ linux-2.6-tip.hbkpt/samples/hw_breakpoint/data_breakpoint.c
@@ -54,6 +54,10 @@  static int __init hw_break_module_init(v
 	sample_hbp.info.type = HW_BREAKPOINT_WRITE;
 	sample_hbp.info.len = HW_BREAKPOINT_LEN_4;
 #endif /* CONFIG_X86 */
+#ifdef CONFIG_PPC64
+	sample_hbp.info.name = ksym_name;
+	sample_hbp.info.type = DABR_DATA_WRITE;
+#endif /* CONFIG_PPC64 */
 
 	sample_hbp.triggered = (void *)sample_hbp_handler;
 
Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/thread_info.h
+++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/thread_info.h
@@ -114,6 +114,7 @@  static inline struct thread_info *curren
 #define TIF_FREEZE		14	/* Freezing for suspend */
 #define TIF_RUNLATCH		15	/* Is the runlatch enabled? */
 #define TIF_ABI_PENDING		16	/* 32/64 bit switch needed */
+#define TIF_DEBUG		17	/* uses debug registers */
 
 /* as above, but as bit values */
 #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
@@ -132,6 +133,7 @@  static inline struct thread_info *curren
 #define _TIF_FREEZE		(1<<TIF_FREEZE)
 #define _TIF_RUNLATCH		(1<<TIF_RUNLATCH)
 #define _TIF_ABI_PENDING	(1<<TIF_ABI_PENDING)
+#define _TIF_DEBUG		(1<<TIF_DEBUG)
 #define _TIF_SYSCALL_T_OR_A	(_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SECCOMP)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
Index: linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/include/asm/reg.h
+++ linux-2.6-tip.hbkpt/arch/powerpc/include/asm/reg.h
@@ -184,9 +184,11 @@ 
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HB_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
+#define   DABR_DATA_RW		(3UL << 0)
 #define SPRN_DABR2	0x13D	/* e300 */
 #define SPRN_DABRX	0x3F7	/* Data Address Breakpoint Register Extension */
 #define   DABRX_USER	(1UL << 0)
Index: linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6-tip.hbkpt.orig/arch/powerpc/mm/fault.c
+++ linux-2.6-tip.hbkpt/arch/powerpc/mm/fault.c
@@ -137,6 +137,12 @@  int __kprobes do_page_fault(struct pt_re
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@  int __kprobes do_page_fault(struct pt_re
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;