Patchwork PSeries: Cancel RTAS event scan before firmware flash

Submitter Ravi K. Nittala
Date Sept. 21, 2011, 10:29 a.m.
Message ID <20110921102825.16444.75131.stgit@localhost6.localdomain6>
Permalink /patch/115749/
State Changes Requested

Comments

Ravi K. Nittala - Sept. 21, 2011, 10:29 a.m.
The RTAS firmware flash update is conducted using an RTAS call that is
serialized by lock_rtas() which uses spin_lock. While the flash is in
progress, rtasd keeps scanning for any RTAS events that are generated by
the system. This scan is performed via the workqueue mechanism. The
rtas_event_scan()
also uses an RTAS call to scan the events, eventually trying to acquire
the spin_lock before issuing the request.

The flash update takes a while to complete and during this time, any other
RTAS call has to wait. In this case, rtas_event_scan() waits for a long time
on the spin_lock resulting in a soft lockup.

Fix: Just before the flash update is performed, the queued rtas_event_scan()
work item is cancelled from the work queue so that there is no other RTAS
call issued while the flash is in progress. After the flash completes, the
system reboots and the rtas_event_scan() is rescheduled.

Signed-off-by: Suzuki Poulose <suzuki@in.ibm.com>
Signed-off-by: Ravi Nittala <ravi.nittala@in.ibm.com>

---

 arch/powerpc/include/asm/rtas.h  |    4 ++++
 arch/powerpc/kernel/rtas_flash.c |    8 ++++++++
 arch/powerpc/kernel/rtasd.c      |    6 ++++++
 3 files changed, 18 insertions(+), 0 deletions(-)

Benjamin Herrenschmidt - Sept. 23, 2011, 12:38 a.m.
On Wed, 2011-09-21 at 15:59 +0530, Ravi K Nittala wrote:
> The RTAS firmware flash update is conducted using an RTAS call that is
> serialized by lock_rtas() which uses spin_lock. While the flash is in
> progress, rtasd keeps scanning for any RTAS events that are generated by
> the system. This scan is performed via the workqueue mechanism. The
> rtas_event_scan()
> also uses an RTAS call to scan the events, eventually trying to acquire
> the spin_lock before issuing the request.

Better. However:

> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index 58625d1..b5cbd9f 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -245,6 +245,10 @@ extern int early_init_dt_scan_rtas(unsigned long node,
>  
>  extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
>  
> +#ifdef CONFIG_PPC_RTAS_DAEMON
> +extern bool rtas_cancel_event_scan(void);
> +#endif

The extern as such doesn't need an ifdef... however, you could avoid
this one:

 .../...
>  
> +#ifdef CONFIG_PPC_RTAS_DAEMON
> +	/*
> +	 * Just before starting the firmware flash, cancel the event scan work
> +	 * to avoid any soft lockup issues.
> +	 */
> +	rtas_cancel_event_scan();
> +#endif
> +

Here, by having the header contain instead:

#ifdef CONFIG_PPC_RTAS_DAEMON
extern void rtas_cancel_event_scan(void);
#else
static inline void rtas_cancel_event_scan(void) { }
#endif
 
Also note that I removed the bool; it's not useful since you don't
test the return value anyway.
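
The same stub pattern can be tried outside the kernel. What follows is
only a minimal userspace sketch, not kernel code: DEMO_RTAS_DAEMON is a
made-up stand-in for CONFIG_PPC_RTAS_DAEMON, and the demo_* names are
hypothetical. It shows that the caller needs no #ifdef once the header
side supplies an empty static inline for the disabled case.

```c
#include <stdio.h>

/*
 * DEMO_RTAS_DAEMON is a hypothetical stand-in for CONFIG_PPC_RTAS_DAEMON.
 * Build with -DDEMO_RTAS_DAEMON to select the real implementation.
 */
static int demo_cancel_calls;	/* times the real implementation ran */

#ifdef DEMO_RTAS_DAEMON
static void demo_cancel_event_scan(void)
{
	demo_cancel_calls++;
	puts("event scan cancelled");
}
#else
/* Feature compiled out: the call collapses to nothing. */
static inline void demo_cancel_event_scan(void) { }
#endif

/* The caller needs no #ifdef of its own. */
static void demo_flash_firmware(void)
{
	demo_cancel_event_scan();
	puts("flashing firmware");
}
```

Calling demo_flash_firmware() from a main() compiles either way; only
the build with -DDEMO_RTAS_DAEMON increments demo_cancel_calls.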

>  	 * NOTE: the "first" block must be under 4GB, so we create
>  	 * an entry with no data blocks in the reserved buffer in
> diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
> index 481ef06..e8f03fa 100644
> --- a/arch/powerpc/kernel/rtasd.c
> +++ b/arch/powerpc/kernel/rtasd.c
> @@ -472,6 +472,12 @@ static void start_event_scan(void)
>  				 &event_scan_work, event_scan_delay);
>  }
>  
> +/* Cancel the rtas event scan work */
> +bool rtas_cancel_event_scan(void)
> +{
> +	return cancel_delayed_work_sync(&event_scan_work);
> +}

Finally, the above is missing an EXPORT_SYMBOL_GPL(), which is needed
since rtas_flash can be built as a module.
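
Taken together, the comments above point at a helper roughly like the
following. This is an untested sketch of what a revised version might
look like, not the patch that was posted or merged; it assumes
event_scan_work is still the delayed work item already defined in
rtasd.c, and it builds only inside the kernel tree.

```c
/* arch/powerpc/kernel/rtasd.c (sketch, untested) */

/* Cancel the rtas event scan work, waiting for a running scan to finish */
void rtas_cancel_event_scan(void)
{
	cancel_delayed_work_sync(&event_scan_work);
}
EXPORT_SYMBOL_GPL(rtas_cancel_event_scan);
```

The return value of cancel_delayed_work_sync() is dropped here because
the caller in rtas_flash.c does not test it.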

Cheers,
Ben.

>  static int __init rtas_init(void)
>  {
>  	struct proc_dir_entry *entry;

Patch

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 58625d1..b5cbd9f 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -245,6 +245,10 @@  extern int early_init_dt_scan_rtas(unsigned long node,
 
 extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
 
+#ifdef CONFIG_PPC_RTAS_DAEMON
+extern bool rtas_cancel_event_scan(void);
+#endif
+
 /* Error types logged.  */
 #define ERR_FLAG_ALREADY_LOGGED	0x0
 #define ERR_FLAG_BOOT		0x1 	/* log was pulled from NVRAM on boot */
diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index e037c74..a9cceff 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -567,6 +567,14 @@  static void rtas_flash_firmware(int reboot_type)
 		return;
 	}
 
+#ifdef CONFIG_PPC_RTAS_DAEMON
+	/*
+	 * Just before starting the firmware flash, cancel the event scan work
+	 * to avoid any soft lockup issues.
+	 */
+	rtas_cancel_event_scan();
+#endif
+
 	/*
 	 * NOTE: the "first" block must be under 4GB, so we create
 	 * an entry with no data blocks in the reserved buffer in
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 481ef06..e8f03fa 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -472,6 +472,12 @@  static void start_event_scan(void)
 				 &event_scan_work, event_scan_delay);
 }
 
+/* Cancel the rtas event scan work */
+bool rtas_cancel_event_scan(void)
+{
+	return cancel_delayed_work_sync(&event_scan_work);
+}
+
 static int __init rtas_init(void)
 {
 	struct proc_dir_entry *entry;