diff mbox series

[1/1] powerpc/rtas: Implement reentrant rtas call

Message ID 20200408223901.760733-1-leonardo@linux.ibm.com (mailing list archive)
State Superseded, archived
Headers show
Series [1/1] powerpc/rtas: Implement reentrant rtas call | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (2c0ce4ff35994a7b12cc9879ced52c9e7c2e6667)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 warning Upstream build failed, couldn't test patch
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 104 lines checked
snowpatch_ozlabs/needsstable success Patch has no Fixes tags

Commit Message

Leonardo Bras April 8, 2020, 10:39 p.m. UTC
Implement rtas_call_reentrant() for reentrant rtas-calls:
"ibm,int-on", "ibm,int-off",ibm,get-xive" and  "ibm,set-xive".

On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
items 2 and 3 say:

2 - For the PowerPC External Interrupt option: The * call must be
reentrant to the number of processors on the platform.
3 - For the PowerPC External Interrupt option: The * argument call
buffer for each simultaneous call must be physically unique.

So, these rtas-calls can be called in a lockless way, if using
a different buffer for each call.

This can be useful to avoid deadlocks in crashing, where rtas-calls are
needed, but some other thread crashed holding the rtas.lock.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 arch/powerpc/include/asm/rtas.h     |  1 +
 arch/powerpc/kernel/rtas.c          | 21 +++++++++++++++++++++
 arch/powerpc/sysdev/xics/ics-rtas.c | 22 +++++++++++-----------
 3 files changed, 33 insertions(+), 11 deletions(-)

Comments

Nathan Lynch April 10, 2020, 7:28 p.m. UTC | #1
Leonardo Bras <leonardo@linux.ibm.com> writes:
> Implement rtas_call_reentrant() for reentrant rtas-calls:
> "ibm,int-on", "ibm,int-off",ibm,get-xive" and  "ibm,set-xive".
>
> On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
> items 2 and 3 say:
>
> 2 - For the PowerPC External Interrupt option: The * call must be
> reentrant to the number of processors on the platform.
> 3 - For the PowerPC External Interrupt option: The * argument call
> buffer for each simultaneous call must be physically unique.
>
> So, these rtas-calls can be called in a lockless way, if using
> a different buffer for each call.

From the language in the spec it's clear that these calls are intended
to be reentrant with respect to themselves, but it's less clear to me
that they are safe to call simultaneously with respect to each other or
arbitrary other RTAS methods.


> This can be useful to avoid deadlocks in crashing, where rtas-calls are
> needed, but some other thread crashed holding the rtas.lock.

Are these calls commonly used in the crash-handling path? Is this
addressing a real issue you've seen?


> +/*
> + * Used for reentrant rtas calls.
> + * According to LoPAR documentation, only "ibm,int-on", "ibm,int-off",
> + * "ibm,get-xive" and "ibm,set-xive" are currently reentrant.
> + * Reentrant calls need their own rtas_args buffer, so not using rtas.args.
> + */

Please use kernel-doc format in new code.


> +int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
> +{
> +	va_list list;
> +	struct rtas_args rtas_args;
> +
> +	if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
> +		return -1;
> +
> +	va_start(list, outputs);
> +	va_rtas_call_unlocked(&rtas_args, token, nargs, nret, list);
> +	va_end(list);

No, I don't think you can place the RTAS argument buffer on the stack:

  7.2.7, Software Implementation Note:
  | The OS must be aware that the effective address range for RTAS is 4
  | GB when instantiated in 32-bit mode and the OS should not pass RTAS
  | addresses or blocks of data which might fall outside of this range.
Leonardo Bras May 13, 2020, 4:31 a.m. UTC | #2
Hello Nathan, thanks for the feedback!

On Fri, 2020-04-10 at 14:28 -0500, Nathan Lynch wrote:
> Leonardo Bras <leonardo@linux.ibm.com> writes:
> > Implement rtas_call_reentrant() for reentrant rtas-calls:
> > "ibm,int-on", "ibm,int-off",ibm,get-xive" and  "ibm,set-xive".
> > 
> > On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
> > items 2 and 3 say:
> > 
> > 2 - For the PowerPC External Interrupt option: The * call must be
> > reentrant to the number of processors on the platform.
> > 3 - For the PowerPC External Interrupt option: The * argument call
> > buffer for each simultaneous call must be physically unique.
> > 
> > So, these rtas-calls can be called in a lockless way, if using
> > a different buffer for each call.
> > 

> From the language in the spec it's clear that these calls are intended
> to be reentrant with respect to themselves, but it's less clear to me
> that they are safe to call simultaneously with respect to each other or
> arbitrary other RTAS methods.

In my viewpoint, being reentrant to themselves, without being reentrant
to others would be very difficult to do, considering the way the
rtas_call is crafted to work.

I mean, I have no experience in rtas code, it's my viewpoint. In my
thoughts there is something like this:

common_path -> selects function by token -> reentrant function
					|-> non-reentrant function

If there is one function that is reentrant, it means the common_path
and function selection by token would need to be reentrant too.

> > This can be useful to avoid deadlocks in crashing, where rtas-calls are
> > needed, but some other thread crashed holding the rtas.lock.
> 
> Are these calls commonly used in the crash-handling path? Is this
> addressing a real issue you've seen?
> 

Yes, I noticed deadlocks during crashes, like this one:
#0 arch_spin_lock
#1  lock_rtas () 
#2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3  ics_rtas_mask_real_irq (hw_irq=4100) 
#4  machine_kexec_mask_interrupts
#5  default_machine_crash_shutdown
#6  machine_crash_shutdown 
#7  __crash_kexec
#8  crash_kexec
#9  oops_end

On ics_rtas_mask_real_irq() we have both ibm_int_off and ibm_set_xive,
so it makes sense to also add ibm_int_on and ibm_get_xive as reentrant
too.

Full discussion available on this thread:
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200401000020.590447-1-leonardo@linux.ibm.com/

> 
> > +/*
> > + * Used for reentrant rtas calls.
> > + * According to LoPAR documentation, only "ibm,int-on", "ibm,int-off",
> > + * "ibm,get-xive" and "ibm,set-xive" are currently reentrant.
> > + * Reentrant calls need their own rtas_args buffer, so not using rtas.args.
> > + */
> 
> Please use kernel-doc format in new code.

Sure, v2 is going to be fixed.

> 
> 
> > +int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
> > +{
> > +	va_list list;
> > +	struct rtas_args rtas_args;
> > +
> > +	if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
> > +		return -1;
> > +
> > +	va_start(list, outputs);
> > +	va_rtas_call_unlocked(&rtas_args, token, nargs, nret, list);
> > +	va_end(list);
> 
> No, I don't think you can place the RTAS argument buffer on the stack:
> 
>   7.2.7, Software Implementation Note:
>   | The OS must be aware that the effective address range for RTAS is 4
>   | GB when instantiated in 32-bit mode and the OS should not pass RTAS
>   | addresses or blocks of data which might fall outside of this range.

Agree, moved to PACA.

I will send a v2 soon, it will be a 2-patch patchset.

Best regards,
Leonardo Bras
Leonardo Brás May 13, 2020, 5:23 a.m. UTC | #3
v2: 
http://patchwork.ozlabs.org/project/linuxppc-dev/patch/20200513044025.105379-2-leobras.c@gmail.com/

(Series:
http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=176534)
Nathan Lynch May 14, 2020, 7:04 p.m. UTC | #4
Hi Leonardo,

Leonardo Bras <leonardo@linux.ibm.com> writes:
> Hello Nathan, thanks for the feedback!
>
> On Fri, 2020-04-10 at 14:28 -0500, Nathan Lynch wrote:
>> Leonardo Bras <leonardo@linux.ibm.com> writes:
>> > Implement rtas_call_reentrant() for reentrant rtas-calls:
>> > "ibm,int-on", "ibm,int-off",ibm,get-xive" and  "ibm,set-xive".
>> > 
>> > On LoPAPR Version 1.1 (March 24, 2016), from 7.3.10.1 to 7.3.10.4,
>> > items 2 and 3 say:
>> > 
>> > 2 - For the PowerPC External Interrupt option: The * call must be
>> > reentrant to the number of processors on the platform.
>> > 3 - For the PowerPC External Interrupt option: The * argument call
>> > buffer for each simultaneous call must be physically unique.
>> > 
>> > So, these rtas-calls can be called in a lockless way, if using
>> > a different buffer for each call.
>> > 
>
>> From the language in the spec it's clear that these calls are intended
>> to be reentrant with respect to themselves, but it's less clear to me
>> that they are safe to call simultaneously with respect to each other or
>> arbitrary other RTAS methods.
>
> In my viewpoint, being reentrant to themselves, without being reentrant
> to others would be very difficult to do, considering the way the
> rtas_call is crafted to work.
>
> I mean, I have no experience in rtas code, it's my viewpoint. In my
> thoughts there is something like this:
>
> common_path -> selects function by token -> reentrant function
> 					|-> non-reentrant function
>
> If there is one function that is reentrant, it means the common_path
> and function selection by token would need to be reentrant too.

I checked with partition firmware development and these calls can be
used concurrently with arbitrary other RTAS calls, which confirms your
interpretation. Thanks for bearing with me.
Leonardo Brás May 14, 2020, 8:22 p.m. UTC | #5
On Thu, 2020-05-14 at 14:04 -0500, Nathan Lynch wrote:
> I checked with partition firmware development and these calls can be
> used concurrently with arbitrary other RTAS calls, which confirms your
> interpretation. Thanks for bearing with me.

I was not aware of how I could get this information. 
It was of great help!

Thank you for your contribution!
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 3c1887351c71..1ad1c85dab5e 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -352,6 +352,7 @@  extern struct rtas_t rtas;
 extern int rtas_token(const char *service);
 extern int rtas_service_present(const char *service);
 extern int rtas_call(int token, int, int, int *, ...);
+int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...);
 void rtas_call_unlocked(struct rtas_args *args, int token, int nargs,
 			int nret, ...);
 extern void __noreturn rtas_restart(char *cmd);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c5fa251b8950..85e7511afe25 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -483,6 +483,27 @@  int rtas_call(int token, int nargs, int nret, int *outputs, ...)
 }
 EXPORT_SYMBOL(rtas_call);
 
+/*
+ * Used for reentrant rtas calls.
+ * According to LoPAR documentation, only "ibm,int-on", "ibm,int-off",
+ * "ibm,get-xive" and "ibm,set-xive" are currently reentrant.
+ * Reentrant calls need their own rtas_args buffer, so not using rtas.args.
+ */
+int rtas_call_reentrant(int token, int nargs, int nret, int *outputs, ...)
+{
+	va_list list;
+	struct rtas_args rtas_args;
+
+	if (!rtas.entry || token == RTAS_UNKNOWN_SERVICE)
+		return -1;
+
+	va_start(list, outputs);
+	va_rtas_call_unlocked(&rtas_args, token, nargs, nret, list);
+	va_end(list);
+
+	return be32_to_cpu(rtas_args.rets[0]);
+}
+
 /* For RTAS_BUSY (-2), delay for 1 millisecond.  For an extended busy status
  * code of 990n, perform the hinted delay of 10^n (last digit) milliseconds.
  */
diff --git a/arch/powerpc/sysdev/xics/ics-rtas.c b/arch/powerpc/sysdev/xics/ics-rtas.c
index 6aabc74688a6..4cf18000f07c 100644
--- a/arch/powerpc/sysdev/xics/ics-rtas.c
+++ b/arch/powerpc/sysdev/xics/ics-rtas.c
@@ -50,8 +50,8 @@  static void ics_rtas_unmask_irq(struct irq_data *d)
 
 	server = xics_get_irq_server(d->irq, irq_data_get_affinity_mask(d), 0);
 
-	call_status = rtas_call(ibm_set_xive, 3, 1, NULL, hw_irq, server,
-				DEFAULT_PRIORITY);
+	call_status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL, hw_irq,
+					  server, DEFAULT_PRIORITY);
 	if (call_status != 0) {
 		printk(KERN_ERR
 			"%s: ibm_set_xive irq %u server %x returned %d\n",
@@ -60,7 +60,7 @@  static void ics_rtas_unmask_irq(struct irq_data *d)
 	}
 
 	/* Now unmask the interrupt (often a no-op) */
-	call_status = rtas_call(ibm_int_on, 1, 1, NULL, hw_irq);
+	call_status = rtas_call_reentrant(ibm_int_on, 1, 1, NULL, hw_irq);
 	if (call_status != 0) {
 		printk(KERN_ERR "%s: ibm_int_on irq=%u returned %d\n",
 			__func__, hw_irq, call_status);
@@ -91,7 +91,7 @@  static void ics_rtas_mask_real_irq(unsigned int hw_irq)
 	if (hw_irq == XICS_IPI)
 		return;
 
-	call_status = rtas_call(ibm_int_off, 1, 1, NULL, hw_irq);
+	call_status = rtas_call_reentrant(ibm_int_off, 1, 1, NULL, hw_irq);
 	if (call_status != 0) {
 		printk(KERN_ERR "%s: ibm_int_off irq=%u returned %d\n",
 			__func__, hw_irq, call_status);
@@ -99,8 +99,8 @@  static void ics_rtas_mask_real_irq(unsigned int hw_irq)
 	}
 
 	/* Have to set XIVE to 0xff to be able to remove a slot */
-	call_status = rtas_call(ibm_set_xive, 3, 1, NULL, hw_irq,
-				xics_default_server, 0xff);
+	call_status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL, hw_irq,
+					  xics_default_server, 0xff);
 	if (call_status != 0) {
 		printk(KERN_ERR "%s: ibm_set_xive(0xff) irq=%u returned %d\n",
 			__func__, hw_irq, call_status);
@@ -131,7 +131,7 @@  static int ics_rtas_set_affinity(struct irq_data *d,
 	if (hw_irq == XICS_IPI || hw_irq == XICS_IRQ_SPURIOUS)
 		return -1;
 
-	status = rtas_call(ibm_get_xive, 1, 3, xics_status, hw_irq);
+	status = rtas_call_reentrant(ibm_get_xive, 1, 3, xics_status, hw_irq);
 
 	if (status) {
 		printk(KERN_ERR "%s: ibm,get-xive irq=%u returns %d\n",
@@ -146,8 +146,8 @@  static int ics_rtas_set_affinity(struct irq_data *d,
 		return -1;
 	}
 
-	status = rtas_call(ibm_set_xive, 3, 1, NULL,
-			   hw_irq, irq_server, xics_status[1]);
+	status = rtas_call_reentrant(ibm_set_xive, 3, 1, NULL,
+				     hw_irq, irq_server, xics_status[1]);
 
 	if (status) {
 		printk(KERN_ERR "%s: ibm,set-xive irq=%u returns %d\n",
@@ -179,7 +179,7 @@  static int ics_rtas_map(struct ics *ics, unsigned int virq)
 		return -EINVAL;
 
 	/* Check if RTAS knows about this interrupt */
-	rc = rtas_call(ibm_get_xive, 1, 3, status, hw_irq);
+	rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, hw_irq);
 	if (rc)
 		return -ENXIO;
 
@@ -198,7 +198,7 @@  static long ics_rtas_get_server(struct ics *ics, unsigned long vec)
 {
 	int rc, status[2];
 
-	rc = rtas_call(ibm_get_xive, 1, 3, status, vec);
+	rc = rtas_call_reentrant(ibm_get_xive, 1, 3, status, vec);
 	if (rc)
 		return -1;
 	return status[0];