diff mbox

powerpc/pseries: fix endian problems with LE migration

Message ID 1421807520-12030-1-git-send-email-cyrilbur@gmail.com (mailing list archive)
State Accepted
Commit 3df76a9dcc74d5f012b94ea01ed6e7aaf8362c5a
Delegated to: Michael Ellerman
Headers show

Commit Message

Cyril Bur Jan. 21, 2015, 2:32 a.m. UTC
The need to handle ibm,suspend_me specially from within ppc_rtas has left an
endian bug exposed as rtas_ibm_suspend_me actually performs HCALLs and should
have its params in CPU endian.

Have ppc_rtas send the params correctly and also interpret the result
correctly.

Removed the convoluted use of the rtas_args struct to pass params to
rtas_ibm_suspend_me in favour of passing what it needs directly.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
---
This patch has been tested with KVM both LE and BE and on PowerVM both LE and
BE. Under QEMU/KVM the migration happens without touching the these code
pathes.
For PowerVM there is no obvious regression on BE and the LE code path now
provides the correct parameters to the hypervisor

---
 arch/powerpc/include/asm/rtas.h           |  2 +-
 arch/powerpc/kernel/rtas.c                | 22 +++++++++++++++-------
 arch/powerpc/platforms/pseries/mobility.c | 22 ++++++----------------
 3 files changed, 22 insertions(+), 24 deletions(-)

Comments

Michael Ellerman Jan. 21, 2015, 3:33 a.m. UTC | #1
On Wed, 2015-01-21 at 13:32 +1100, Cyril Bur wrote:
> The need to handle ibm,suspend_me specially from within ppc_rtas has left an
> endian bug exposed as rtas_ibm_suspend_me actually performs HCALLs and should
> have its params in CPU endian.

That needs a much better explanation.

Key points:
 - ppc_rtas() is a syscall, which takes arguments in BE
 - ibm,suspend-me is not a real RTAS call and is handled specially in there
 - ibm,suspend-me is actually implemented by an hcall
 - there is currently a bug on LE, because rtas_ibm_suspend_me() takes the
   ppc_rtas() args and feeds them directly to the hcall

> Have ppc_rtas send the params correctly and also interpret the result
> correctly.

That's a second bug which you should also mention above.

> Removed the convoluted use of the rtas_args struct to pass params to
> rtas_ibm_suspend_me in favour of passing what it needs directly.
> 
> Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
> ---
> This patch has been tested with KVM both LE and BE and on PowerVM both LE and
> BE. Under QEMU/KVM the migration happens without touching the these code
> pathes.
> For PowerVM there is no obvious regression on BE and the LE code path now
> provides the correct parameters to the hypervisor

Fold that into the changelog, it's worth remembering.

cheers
Cyril Bur Jan. 22, 2015, 5:37 a.m. UTC | #2
On Wed, 2015-01-21 at 14:33 +1100, Michael Ellerman wrote:
> On Wed, 2015-01-21 at 13:32 +1100, Cyril Bur wrote:
> > The need to handle ibm,suspend_me specially from within ppc_rtas has left an
> > endian bug exposed as rtas_ibm_suspend_me actually performs HCALLs and should
> > have its params in CPU endian.
> 
> That needs a much better explanation.
> 
Agreed

> Key points:
>  - ppc_rtas() is a syscall, which takes arguments in BE
>  - ibm,suspend-me is not a real RTAS call and is handled specially in there
>  - ibm,suspend-me is actually implemented by an hcall
>  - there is currently a bug on LE, because rtas_ibm_suspend_me() takes the
>    ppc_rtas() args and feeds them directly to the hcall
> 
I've tried to write that out neatly and orderly. Here's how that went:


RTAS events require arguments be passed in big endian while hypercalls
have their arguments passed in registers and the values should therefore
be in CPU endian.

The ibm,suspend_me 'RTAS' call makes a sequence of hypercalls to setup
one true RTAS call. This means that ibm,suspend_me is handled specially
in the ppc_rtas syscall.

The ppc_rtas syscall has its arguments in big endian and can therefore
pass these arguments directly to the rtas call. ibm,suspend_me is
handled specially from within ppc_rtas (by calling rtas_ibm_suspend_me)
which has left an endian bug on little endian systems due to the
requirement of hypercalls. The return value from rtas_ibm_suspend me
gets returned in cpu endian, and is left unconverted, also a bug on
little endian systems.

rtas_ibm_suspend_me does not actually make use of the rtas_args that it
is passed. This patch removes the convoluted use of the rtas_args struct
to pass params to rtas_ibm_suspend_me in favour of passing what it needs
as actual arguments. This patch also ensures the two callers of
rtas_ibm_suspend_me pass function parameters in cpu endian and in the
case of ppc_rtas, converts the return value.

migrate_store (the other caller of rtas_ibm_suspend_me) is from a sysfs
file which deals with everything in cpu endian so this function only
underwent cleanup.

> > Have ppc_rtas send the params correctly and also interpret the result
> > correctly.
> 
> That's a second bug which you should also mention above.
> 
> > Removed the convoluted use of the rtas_args struct to pass params to
> > rtas_ibm_suspend_me in favour of passing what it needs directly.
> > 
> > Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
> > ---
> > This patch has been tested with KVM both LE and BE and on PowerVM both LE and
> > BE. Under QEMU/KVM the migration happens without touching the these code
> > pathes.
> > For PowerVM there is no obvious regression on BE and the LE code path now
> > provides the correct parameters to the hypervisor
> 
> Fold that into the changelog, it's worth remembering.
> 
> cheers
> 
>
Michael Ellerman Jan. 23, 2015, 2:47 a.m. UTC | #3
On Thu, 2015-01-22 at 16:37 +1100, Cyril Bur wrote:
> On Wed, 2015-01-21 at 14:33 +1100, Michael Ellerman wrote:
> > On Wed, 2015-01-21 at 13:32 +1100, Cyril Bur wrote:
> > > The need to handle ibm,suspend_me specially from within ppc_rtas has left an
> > > endian bug exposed as rtas_ibm_suspend_me actually performs HCALLs and should
> > > have its params in CPU endian.
> > 
> > That needs a much better explanation.

> I've tried to write that out neatly and orderly. Here's how that went:

Yep that's much better.

Minor points of style that I like are to use "()" on functions to make it
clear, eg. ppc_rtas(), and to put strings in quotes, eg. "ibm,suspend_me".

I've fixed it up and incorporated your testing notes, end result is:

Author: cyrilbur@gmail.com <cyrilbur@gmail.com>
Date:   Wed Jan 21 13:32:00 2015 +1100

    powerpc/pseries: Fix endian problems with LE migration
    
    RTAS events require arguments be passed in big endian while hypercalls
    have their arguments passed in registers and the values should therefore
    be in CPU endian.
    
    The "ibm,suspend_me" 'RTAS' call makes a sequence of hypercalls to setup
    one true RTAS call. This means that "ibm,suspend_me" is handled
    specially in the ppc_rtas() syscall.
    
    The ppc_rtas() syscall has its arguments in big endian and can therefore
    pass these arguments directly to the RTAS call. "ibm,suspend_me" is
    handled specially from within ppc_rtas() (by calling rtas_ibm_suspend_me())
    which has left an endian bug on little endian systems due to the
    requirement of hypercalls. The return value from rtas_ibm_suspend_me()
    gets returned in cpu endian, and is left unconverted, also a bug on
    little endian systems.
    
    rtas_ibm_suspend_me() does not actually make use of the rtas_args that
    it is passed. This patch removes the convoluted use of the rtas_args
    struct to pass params to rtas_ibm_suspend_me() in favour of passing what
    it needs as actual arguments. This patch also ensures the two callers of
    rtas_ibm_suspend_me() pass function parameters in cpu endian and in the
    case of ppc_rtas(), converts the return value.
    
    migrate_store() (the other caller of rtas_ibm_suspend_me()) is from a
    sysfs file which deals with everything in cpu endian so this function
    only underwent cleanup.
    
    This patch has been tested with KVM both LE and BE and on PowerVM both
    LE and BE. Under QEMU/KVM the migration happens without touching these
    code pathes.
    
    For PowerVM there is no obvious regression on BE and the LE code path
    now provides the correct parameters to the hypervisor.
    
    Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>


cheers
diff mbox

Patch

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index b390f55..2e23e92 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -327,7 +327,7 @@  extern int rtas_suspend_cpu(struct rtas_suspend_me_data *data);
 extern int rtas_suspend_last_cpu(struct rtas_suspend_me_data *data);
 extern int rtas_online_cpus_mask(cpumask_var_t cpus);
 extern int rtas_offline_cpus_mask(cpumask_var_t cpus);
-extern int rtas_ibm_suspend_me(struct rtas_args *);
+extern int rtas_ibm_suspend_me(u64 handle, int *vasi_return);
 
 struct rtc_time;
 extern unsigned long rtas_get_boot_time(void);
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 4af905e..21c45a2 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -897,7 +897,7 @@  int rtas_offline_cpus_mask(cpumask_var_t cpus)
 }
 EXPORT_SYMBOL(rtas_offline_cpus_mask);
 
-int rtas_ibm_suspend_me(struct rtas_args *args)
+int rtas_ibm_suspend_me(u64 handle, int *vasi_return)
 {
 	long state;
 	long rc;
@@ -911,8 +911,7 @@  int rtas_ibm_suspend_me(struct rtas_args *args)
 		return -ENOSYS;
 
 	/* Make sure the state is valid */
-	rc = plpar_hcall(H_VASI_STATE, retbuf,
-			 ((u64)args->args[0] << 32) | args->args[1]);
+	rc = plpar_hcall(H_VASI_STATE, retbuf, handle);
 
 	state = retbuf[0];
 
@@ -920,12 +919,12 @@  int rtas_ibm_suspend_me(struct rtas_args *args)
 		printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned %ld\n",rc);
 		return rc;
 	} else if (state == H_VASI_ENABLED) {
-		args->args[args->nargs] = RTAS_NOT_SUSPENDABLE;
+		*vasi_return = RTAS_NOT_SUSPENDABLE;
 		return 0;
 	} else if (state != H_VASI_SUSPENDING) {
 		printk(KERN_ERR "rtas_ibm_suspend_me: vasi_state returned state %ld\n",
 		       state);
-		args->args[args->nargs] = -1;
+		*vasi_return = -1;
 		return 0;
 	}
 
@@ -973,7 +972,7 @@  out:
 	return atomic_read(&data.error);
 }
 #else /* CONFIG_PPC_PSERIES */
-int rtas_ibm_suspend_me(struct rtas_args *args)
+int rtas_ibm_suspend_me(u64 handle, int *vasi_return)
 {
 	return -ENOSYS;
 }
@@ -1053,7 +1052,16 @@  asmlinkage int ppc_rtas(struct rtas_args __user *uargs)
 
 	/* Need to handle ibm,suspend_me call specially */
 	if (token == ibm_suspend_me_token) {
-		rc = rtas_ibm_suspend_me(&args);
+
+		/*
+		 * rtas_ibm_suspend_me assumes args are in cpu endian, or at least the
+		 * hcall within it requires it.
+		 */
+		int vasi_rc = 0;
+		u64 handle = ((u64)be32_to_cpu(args.args[0]) << 32)
+		              | be32_to_cpu(args.args[1]);
+		rc = rtas_ibm_suspend_me(handle, &vasi_rc);
+		args.rets[0] = cpu_to_be32(vasi_rc);
 		if (rc)
 			return rc;
 		goto copy_return;
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index e7cb6d4..90cf3dc 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -316,34 +316,24 @@  void post_mobility_fixup(void)
 static ssize_t migrate_store(struct class *class, struct class_attribute *attr,
 			     const char *buf, size_t count)
 {
-	struct rtas_args args;
 	u64 streamid;
 	int rc;
+	int vasi_rc = 0;
 
 	rc = kstrtou64(buf, 0, &streamid);
 	if (rc)
 		return rc;
 
-	memset(&args, 0, sizeof(args));
-	args.token = rtas_token("ibm,suspend-me");
-	args.nargs = 2;
-	args.nret = 1;
-
-	args.args[0] = streamid >> 32 ;
-	args.args[1] = streamid & 0xffffffff;
-	args.rets = &args.args[args.nargs];
-
 	do {
-		args.rets[0] = 0;
-		rc = rtas_ibm_suspend_me(&args);
-		if (!rc && args.rets[0] == RTAS_NOT_SUSPENDABLE)
+		rc = rtas_ibm_suspend_me(streamid, &vasi_rc);
+		if (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE)
 			ssleep(1);
-	} while (!rc && args.rets[0] == RTAS_NOT_SUSPENDABLE);
+	} while (!rc && vasi_rc == RTAS_NOT_SUSPENDABLE);
 
 	if (rc)
 		return rc;
-	else if (args.rets[0])
-		return args.rets[0];
+	if (vasi_rc)
+		return vasi_rc;
 
 	post_mobility_fixup();
 	return count;