[7/7] powerpc/eeh: Add eeh_force_recover to debugfs

Message ID 20190208030802.10805-7-oohall@gmail.com
State Changes Requested
Series
  • [1/7] powerpc/eeh: Use debugfs_create_u32 for eeh_max_freezes

Checks

Context                          Check    Description
snowpatch_ozlabs/checkpatch      warning  total: 0 errors, 0 warnings, 3 checks, 120 lines checked
snowpatch_ozlabs/build-pmac32    success  build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64e    success  build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64be   success  build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/build-ppc64le   success  build succeeded & removed 0 sparse warning(s)
snowpatch_ozlabs/apply_patch     success  next/apply_patch Successfully applied

Commit Message

Oliver Feb. 8, 2019, 3:08 a.m.
This patch adds a debugfs interface to force scheduling a recovery event.
This can be used to recover a specific PE or schedule a "special" recovery
event that checks for errors at the PHB level.
To force a recovery of a normal PE, use:

 echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover

To force a scan for broken PHBs:

 echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/include/asm/eeh_event.h |  1 +
 arch/powerpc/kernel/eeh.c            | 60 ++++++++++++++++++++++++++++
 arch/powerpc/kernel/eeh_event.c      | 25 +++++++-----
 3 files changed, 76 insertions(+), 10 deletions(-)

Comments

Michael Ellerman Feb. 8, 2019, 12:31 p.m. | #1
Oliver O'Halloran <oohall@gmail.com> writes:

> This patch adds a debugfs interface to force scheduling a recovery event.
> This can be used to recover a specific PE or schedule a "special" recovery
> event that checks for errors at the PHB level.
> To force a recovery of a normal PE, use:
>
>  echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover
>
> To force a scan for broken PHBs:
>
>  echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover

Why 'null', that seems like an odd choice. Why not "all" or "scan" or
something?

Also it oopsed on me:

[   76.323164] sending failure event
[   76.323421] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[   76.323655] Faulting instruction address: 0x00000000
[   76.323856] Oops: Kernel access of bad area, sig: 11 [#1]
[   76.323946] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[   76.324295] Modules linked in: vmx_crypto kvm binfmt_misc ip_tables x_tables autofs4 crc32c_vpmsum
[   76.324669] CPU: 2 PID: 97 Comm: eehd Not tainted 5.0.0-rc2-gcc-8.2.0-00080-gfacc0d1d9517 #435
[   76.325054] NIP:  0000000000000000 LR: c0000000000451f8 CTR: 0000000000000000
[   76.325402] REGS: c0000000fec779c0 TRAP: 0400   Not tainted  (5.0.0-rc2-gcc-8.2.0-00080-gfacc0d1d9517)
[   76.325768] MSR:  800000014280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24000482  XER: 20000000
[   76.326243] CFAR: c000000000002528 IRQMASK: 0 
[   76.326243] GPR00: c000000000045edc c0000000fec77c50 c000000001574000 c0000000fec77cb0 
[   76.326243] GPR04: 0000000000000000 00177d76e3e321bc 00177d76e4293a1f 5deadbeef0000100 
[   76.326243] GPR08: 5deadbeef0000200 0000000000000000 0000000000000000 00177d76e3e3216b 
[   76.326243] GPR12: 0000000000000000 c00000003fffdf00 c0000000001438a8 c0000000fe211700 
[   76.326243] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[   76.326243] GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000e814e8 
[   76.326243] GPR24: c000000000e814c0 5deadbeef0000100 c000000001622480 0000000100000000 
[   76.326243] GPR28: c000000001413310 c0000000016244e0 c0000000014132f0 c0000001f84246a0 
[   76.329073] NIP [0000000000000000]           (null)
[   76.329285] LR [c0000000000451f8] eeh_handle_special_event+0x78/0x348
[   76.329602] Call Trace:
[   76.329762] [c0000000fec77c50] [c0000000fec77ce0] 0xc0000000fec77ce0 (unreliable)
[   76.330113] [c0000000fec77d00] [c000000000045edc] eeh_event_handler+0x10c/0x1c0
[   76.330464] [c0000000fec77db0] [c000000000143a4c] kthread+0x1ac/0x1c0
[   76.330681] [c0000000fec77e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78
[   76.331026] Instruction dump:
[   76.331197] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[   76.331550] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[   76.331803] ---[ end trace dc73d37df5bb9ecd ]---


cheers
Oliver Feb. 8, 2019, 12:50 p.m. | #2
On Fri, Feb 8, 2019 at 11:32 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Oliver O'Halloran <oohall@gmail.com> writes:
>
> > This patch adds a debugfs interface to force scheduling a recovery event.
> > This can be used to recover a specific PE or schedule a "special" recovery
> > event that checks for errors at the PHB level.
> > To force a recovery of a normal PE, use:
> >
> >  echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover
> >
> > To force a scan for broken PHBs:
> >
> >  echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover
>
> Why 'null', that seems like an odd choice. Why not "all" or "scan" or
> something?

When an EEH event occurs the bit that is sent to the event handler is
just a pointer to the struct eeh_pe. If the pointer is null it's then
treated as a special event which indicates a PHB failure. I agree it's
a bit dumb, but I don't really expect anyone except me or samb to use
this interface so I went with what would make sense to someone
familiar with the internals.
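
The convention described above, where a null PE pointer marks a "special"
event, can be modelled in a few lines of userspace C. This is only a sketch,
not the kernel code: `struct pe`, `struct event`, `enum event_kind` and
`classify_event()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_pe; only a PE number is modelled. */
struct pe {
	unsigned int addr;
};

/* Hypothetical stand-in for the event: it carries nothing but a PE pointer. */
struct event {
	struct pe *pe;
};

enum event_kind { EVENT_NORMAL, EVENT_SPECIAL };

/*
 * A NULL PE pointer means "scan for failed PHBs"; any other value names
 * the single PE that should be recovered.
 */
enum event_kind classify_event(const struct event *ev)
{
	assert(ev != NULL);
	return ev->pe ? EVENT_NORMAL : EVENT_SPECIAL;
}
```

Because the event has no separate "kind" field, the string written to debugfs
ends up mirroring the pointer value, which is where 'null' comes from.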

>
> Also it oopsed on me:
>
> [   76.323164] sending failure event
> [   76.323421] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [   76.323655] Faulting instruction address: 0x00000000
> [   76.323856] Oops: Kernel access of bad area, sig: 11 [#1]
> [   76.323946] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [   76.324295] Modules linked in: vmx_crypto kvm binfmt_misc ip_tables x_tables autofs4 crc32c_vpmsum
> [   76.324669] CPU: 2 PID: 97 Comm: eehd Not tainted 5.0.0-rc2-gcc-8.2.0-00080-gfacc0d1d9517 #435
> [   76.325054] NIP:  0000000000000000 LR: c0000000000451f8 CTR: 0000000000000000
> [   76.325402] REGS: c0000000fec779c0 TRAP: 0400   Not tainted  (5.0.0-rc2-gcc-8.2.0-00080-gfacc0d1d9517)
> [   76.325768] MSR:  800000014280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 24000482  XER: 20000000
> [   76.326243] CFAR: c000000000002528 IRQMASK: 0
> [   76.326243] GPR00: c000000000045edc c0000000fec77c50 c000000001574000 c0000000fec77cb0
> [   76.326243] GPR04: 0000000000000000 00177d76e3e321bc 00177d76e4293a1f 5deadbeef0000100
> [   76.326243] GPR08: 5deadbeef0000200 0000000000000000 0000000000000000 00177d76e3e3216b
> [   76.326243] GPR12: 0000000000000000 c00000003fffdf00 c0000000001438a8 c0000000fe211700
> [   76.326243] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [   76.326243] GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000e814e8
> [   76.326243] GPR24: c000000000e814c0 5deadbeef0000100 c000000001622480 0000000100000000
> [   76.326243] GPR28: c000000001413310 c0000000016244e0 c0000000014132f0 c0000001f84246a0
> [   76.329073] NIP [0000000000000000]           (null)
> [   76.329285] LR [c0000000000451f8] eeh_handle_special_event+0x78/0x348
> [   76.329602] Call Trace:
> [   76.329762] [c0000000fec77c50] [c0000000fec77ce0] 0xc0000000fec77ce0 (unreliable)
> [   76.330113] [c0000000fec77d00] [c000000000045edc] eeh_event_handler+0x10c/0x1c0
> [   76.330464] [c0000000fec77db0] [c000000000143a4c] kthread+0x1ac/0x1c0
> [   76.330681] [c0000000fec77e20] [c00000000000bdc4] ret_from_kernel_thread+0x5c/0x78
> [   76.331026] Instruction dump:
> [   76.331197] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [   76.331550] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [   76.331803] ---[ end trace dc73d37df5bb9ecd ]---
>
>
> cheers

This is probably a side effect of special events being a PowerNV
specific concept. For a pseries guest there should never be any PHB
PEs since (hardware) PHBs are a concept that is hidden from a guest.
It's like EEH is poorly thought out and full of layering violations or
something...
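
In the trace above NIP is 0 and LR points into eeh_handle_special_event(),
which is the usual signature of a call through a NULL function pointer. The
sketch below shows one way such a call could be guarded. It assumes, without
confirmation from this thread, that the special-event path calls a
per-platform callback that one platform does not provide; `struct
platform_ops`, `next_error`, `no_errors()` and `handle_special()` are
hypothetical names in a userspace model, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-platform operations table with an optional callback. */
struct platform_ops {
	const char *name;
	int (*next_error)(void);	/* may be NULL on some platforms */
};

/* Example callback for a platform that does implement the hook. */
int no_errors(void)
{
	return 0;
}

/*
 * Refuse the request instead of branching to address 0 when the
 * platform does not supply the callback.
 */
int handle_special(const struct platform_ops *ops)
{
	assert(ops != NULL);

	if (!ops->next_error)
		return -EOPNOTSUPP;

	return ops->next_error();
}
```

With a check of this kind a special event on a platform without the callback
would be rejected with an error rather than taking down the handler thread.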
Michael Ellerman Feb. 11, 2019, 2:24 a.m. | #3
Oliver <oohall@gmail.com> writes:
> On Fri, Feb 8, 2019 at 11:32 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Oliver O'Halloran <oohall@gmail.com> writes:
>>
>> > This patch adds a debugfs interface to force scheduling a recovery event.
>> > This can be used to recover a specific PE or schedule a "special" recovery
>> > event that checks for errors at the PHB level.
>> > To force a recovery of a normal PE, use:
>> >
>> >  echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover
>> >
>> > To force a scan for broken PHBs:
>> >
>> >  echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover
>>
>> Why 'null', that seems like an odd choice. Why not "all" or "scan" or
>> something?
>
> When an EEH event occurs the bit that is sent to the event handler is
> just a pointer to the struct eeh_pe. If the pointer is null it's then
> treated as a special event which indicates a PHB failure. I agree it's
> a bit dumb, but I don't really expect anyone except me or samb to use
> this interface so I went with what would make sense to someone
> familiar with the internals.

Yeah, nah. Let's use something that's at least vaguely self documenting
so people like me can have some clue what it's doing.

cheers
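
A self-documenting keyword could be parsed along the following lines. This is
a userspace sketch only: the keyword "scan" is one of the names floated
earlier in the thread, not a decided interface, and `struct request`,
`enum request_kind` and `parse_request()` are hypothetical. It assumes the
buffer has already been NUL-terminated, and it keeps the PHB-then-PE order
that the "%x:%x" format in the patch implies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum request_kind { REQ_INVALID, REQ_SCAN_PHBS, REQ_RECOVER_PE };

struct request {
	enum request_kind kind;
	unsigned int phb;	/* PHB (domain) number, parsed as hex */
	unsigned int pe;	/* PE number, parsed as hex */
};

/* Parse a NUL-terminated command; a trailing newline from echo is allowed. */
struct request parse_request(const char *buf)
{
	struct request req = { REQ_INVALID, 0, 0 };
	char trailing;

	assert(buf != NULL);

	/* "scan" is a hypothetical keyword standing in for 'null'. */
	if (strcmp(buf, "scan") == 0 || strcmp(buf, "scan\n") == 0) {
		req.kind = REQ_SCAN_PHBS;
		return req;
	}

	/*
	 * Exactly two conversions must succeed. The final " %c" only
	 * matches if non-whitespace junk follows the PE number, in which
	 * case sscanf() returns 3 and the input is rejected.
	 */
	if (sscanf(buf, "%x:%x %c", &req.phb, &req.pe, &trailing) == 2)
		req.kind = REQ_RECOVER_PE;

	return req;
}
```

Callers should look at phb and pe only when kind is REQ_RECOVER_PE, since a
partially matched input can leave one of them set.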
Sam Bobroff Feb. 13, 2019, 4:37 a.m. | #4
On Fri, Feb 08, 2019 at 02:08:02PM +1100, Oliver O'Halloran wrote:
> This patch adds a debugfs interface to force scheduling a recovery event.
> This can be used to recover a specific PE or schedule a "special" recovery
> event that checks for errors at the PHB level.
> To force a recovery of a normal PE, use:
>
>  echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover

How about placing these in the per-PHB debugfs directory?
echo '<#pe>' > /sys/kernel/debug/powerpc/PCI0000/eeh_force_recover

> To force a scan for broken PHBs:
>
>  echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover

And keep this one where it is, and just trigger with any write (or a '1'
or whatever)?

Sam.
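
If the file lived in a per-PHB directory as suggested, the write handler would
only need a PE number, since the PHB would be implied by the directory. A
userspace sketch of that parsing step follows; `parse_pe_number()` is a
hypothetical helper, and treating the number as hexadecimal is an assumption
carried over from the "%x" conversions in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a hexadecimal PE number from a NUL-terminated buffer.
 * Returns 0 and stores the value on success, or a negative errno value.
 */
int parse_pe_number(const char *buf, unsigned int *pe_no)
{
	char *end;
	unsigned long val;

	assert(buf != NULL && pe_no != NULL);

	errno = 0;
	val = strtoul(buf, &end, 16);
	if (end == buf)
		return -EINVAL;		/* no digits at all */
	if (errno == ERANGE || val > UINT_MAX)
		return -ERANGE;		/* does not fit in a PE number */

	/* Tolerate the newline that echo appends, reject anything else. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*pe_no = (unsigned int)val;
	return 0;
}
```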

Oliver Feb. 13, 2019, 5:18 a.m. | #5
On Wed, Feb 13, 2019 at 3:38 PM Sam Bobroff <sbobroff@linux.ibm.com> wrote:
>
> On Fri, Feb 08, 2019 at 02:08:02PM +1100, Oliver O'Halloran wrote:
> > This patch adds a debugfs interface to force scheduling a recovery event.
> > This can be used to recover a specific PE or schedule a "special" recovery
> > event that checks for errors at the PHB level.
> > To force a recovery of a normal PE, use:
> >
> >  echo '<#phb>:<#pe>' > /sys/kernel/debug/powerpc/eeh_force_recover
>
> How about placing these in the per-PHB debugfs directory?
> echo '<#pe>' > /sys/kernel/debug/powerpc/PCI0000/eeh_force_recover
>
> > To force a scan for broken PHBs:
> >
> >  echo 'null' > /sys/kernel/debug/powerpc/eeh_force_recover
>
> And keep this one where it is, and just trigger with any write (or a '1'
> or whatever)?

The per-PHB directories only exist on PowerNV. I'd rather this was
merged as-is since it handles both platforms. If we want to add the
per-PHB debugfs stuff to pseries we can do it later.


Patch

diff --git a/arch/powerpc/include/asm/eeh_event.h b/arch/powerpc/include/asm/eeh_event.h
index 9884e872686f..6d0412b846ac 100644
--- a/arch/powerpc/include/asm/eeh_event.h
+++ b/arch/powerpc/include/asm/eeh_event.h
@@ -33,6 +33,7 @@  struct eeh_event {
 
 int eeh_event_init(void);
 int eeh_send_failure_event(struct eeh_pe *pe);
+int __eeh_send_failure_event(struct eeh_pe *pe);
 void eeh_remove_event(struct eeh_pe *pe, bool force);
 void eeh_handle_normal_event(struct eeh_pe *pe);
 void eeh_handle_special_event(void);
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 92809b137e39..63b91a4918c9 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -1805,6 +1805,63 @@  static int eeh_enable_dbgfs_get(void *data, u64 *val)
 
 DEFINE_DEBUGFS_ATTRIBUTE(eeh_enable_dbgfs_ops, eeh_enable_dbgfs_get,
 			 eeh_enable_dbgfs_set, "0x%llx\n");
+
+static ssize_t eeh_force_recover_write(struct file *filp,
+				const char __user *user_buf,
+				size_t count, loff_t *ppos)
+{
+	struct pci_controller *hose;
+	uint32_t phbid, pe_no;
+	struct eeh_pe *pe;
+	char buf[20];
+	int ret;
+
+	ret = simple_write_to_buffer(buf, sizeof(buf), ppos, user_buf, count);
+	if (!ret)
+		return -EFAULT;
+
+	/*
+	 * When PE is NULL the event is a "special" event. Rather than
+	 * recovering a specific PE it forces the EEH core to scan for failed
+	 * PHBs and recovers each. This needs to be done before any device
+	 * recoveries can occur.
+	 */
+	if (!strncmp(buf, "null", 4)) {
+		pr_err("sending failure event\n");
+		__eeh_send_failure_event(NULL);
+		return count;
+	}
+
+	ret = sscanf(buf, "%x:%x", &phbid, &pe_no);
+	if (ret != 2)
+		return -EINVAL;
+
+	hose = pci_find_hose_for_domain(phbid);
+	if (!hose)
+		return -ENODEV;
+
+	/* Retrieve PE */
+	pe = eeh_pe_get(hose, pe_no, 0);
+	if (!pe)
+		return -ENODEV;
+
+	/*
+	 * We don't do any state checking here since the detection
+	 * process is async to the recovery process. The recovery
+	 * thread *should* not break even if we schedule a recovery
+	 * from an odd state (e.g. PE removed, or recovery of a
+	 * non-isolated PE)
+	 */
+	__eeh_send_failure_event(pe);
+
+	return ret < 0 ? ret : count;
+}
+
+static const struct file_operations eeh_force_recover_fops = {
+	.open	= simple_open,
+	.llseek	= no_llseek,
+	.write	= eeh_force_recover_write,
+};
 #endif
 
 static int __init eeh_init_proc(void)
@@ -1820,6 +1877,9 @@  static int __init eeh_init_proc(void)
 		debugfs_create_bool("eeh_disable_recovery", 0600,
 				powerpc_debugfs_root,
 				&eeh_debugfs_no_recover);
+		debugfs_create_file_unsafe("eeh_force_recover", 0600,
+				powerpc_debugfs_root, NULL,
+				&eeh_force_recover_fops);
 		eeh_cache_debugfs_init();
 #endif
 	}
diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 19837798bb1d..539aca055d70 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -121,20 +121,11 @@  int eeh_event_init(void)
  * the actual event will be delivered in a normal context
  * (from a workqueue).
  */
-int eeh_send_failure_event(struct eeh_pe *pe)
+int __eeh_send_failure_event(struct eeh_pe *pe)
 {
 	unsigned long flags;
 	struct eeh_event *event;
 
-	/*
-	 * If we've manually supressed recovery events via debugfs
-	 * then just drop it on the floor.
-	 */
-	if (eeh_debugfs_no_recover) {
-		pr_err("EEH: Event dropped due to no_recover setting\n");
-		return 0;
-	}
-
 	event = kzalloc(sizeof(*event), GFP_ATOMIC);
 	if (!event) {
 		pr_err("EEH: out of memory, event not handled\n");
@@ -153,6 +144,20 @@  int eeh_send_failure_event(struct eeh_pe *pe)
 	return 0;
 }
 
+int eeh_send_failure_event(struct eeh_pe *pe)
+{
+	/*
+	 * If we've manually supressed recovery events via debugfs
+	 * then just drop it on the floor.
+	 */
+	if (eeh_debugfs_no_recover) {
+		pr_err("EEH: Event dropped due to no_recover setting\n");
+		return 0;
+	}
+
+	return __eeh_send_failure_event(pe);
+}
+
 /**
  * eeh_remove_event - Remove EEH event from the queue
  * @pe: Event binding to the PE