Patchwork [v2] spapr-rtas: use softmmu for accessing rtas call parameters

login
register
mail settings
Submitter Alexey Kardashevskiy
Date Sept. 6, 2013, 8:10 a.m.
Message ID <1378455053-18219-1-git-send-email-aik@ozlabs.ru>
Download mbox | patch
Permalink /patch/273118/
State New
Headers show

Comments

Alexey Kardashevskiy - Sept. 6, 2013, 8:10 a.m.
On the real hardware, RTAS is called in real mode and therefore
ignores top 4 bits of the address passed in the call.

This fixes QEMU to use softmmu which can chop top 4 bits
if MSR DR is not set.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* masking from replaced with the use of cpu_ldl_data which can handle
realmode case properly
---
 hw/ppc/spapr_hcall.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
Alexander Graf - Sept. 6, 2013, 8:45 a.m.
On 06.09.2013, at 10:10, Alexey Kardashevskiy wrote:

> On the real hardware, RTAS is called in real mode and therefore
> ignores top 4 bits of the address passed in the call.
> 
> This fixes QEMU to use softmmu which can chop top 4 bits
> if MSR DR is not set.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v2:
> * masking from replaced with the use of cpu_ldl_data which can handle
> realmode case properly
> ---
> hw/ppc/spapr_hcall.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> index 063bd36..30f90bf 100644
> --- a/hw/ppc/spapr_hcall.c
> +++ b/hw/ppc/spapr_hcall.c
> @@ -4,6 +4,7 @@
> #include "hw/ppc/spapr.h"
> #include "mmu-hash64.h"
> #include "cpu-models.h"
> +#include "exec/softmmu_exec.h"
> 
> #include <libfdt.h>
> 
> @@ -523,10 +524,11 @@ static target_ulong h_cede(PowerPCCPU *cpu, sPAPREnvironment *spapr,
> static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>                            target_ulong opcode, target_ulong *args)
> {
> +    CPUPPCState *env = &cpu->env;
>     target_ulong rtas_r3 = args[0];
> -    uint32_t token = ldl_be_phys(rtas_r3);
> -    uint32_t nargs = ldl_be_phys(rtas_r3 + 4);
> -    uint32_t nret = ldl_be_phys(rtas_r3 + 8);
> +    uint32_t token = cpu_ldl_data(env, rtas_r3);
> +    uint32_t nargs = cpu_ldl_data(env, rtas_r3 + 4);
> +    uint32_t nret = cpu_ldl_data(env, rtas_r3 + 8);

Wow - this really aren't that many memory accesses.

So looking through rtas calls that come after this dispatch, from what I see they mostly use rtas_ld and rtas_st. They are defined as

static inline uint32_t rtas_ld(target_ulong phys, int n)
{
    return ldl_be_phys(phys + 4*n);
}

static inline void rtas_st(target_ulong phys, int n, uint32_t val)
{
    stl_be_phys(phys + 4*n, val);
}

which means we need to change them as well to make rtas subcall memory accesses resolve properly.

Which brings me to my next question: Why don't we use rtas_ld in the 3 calls above? It looks like a perfect fit really.

Also, just changing the call to cpu_ldl_data will potentially break I think. Imagine you do cpu_synchronize_state with DR=1 in a previous call. Now comes an RTAS call. Because you don't do a cpu_synchronize_state() call here, you will still have the DR=1 in MSR, accessing virtual memory at a potentially very low address that may not be mapped in.

So before you do the cpu_ldl_data you need to either do cpu_synchronize_state() - which can be slow - or you need to manually change MSR.DR to 0 so that we end up accessing the correct memory location always. If you encapsulate that in rtas_ld and rtas_st it might end up looking quite simple:

static inline uint32_t rtas_ld(CPUPPCState *env, target_ulong phys, int n)
{
    uint32_t r;
    target_ulong msr = env->msr;

    /* Make sure we access RTAS data in guest real mode */
    env->msr ~= MSR_DR;
    r = ldl_be_phys(phys + 4*n);

    env->msr = msr;
    return r;
}

Or if you think it's too complicated, you can simply make it

static inline uint64_t ppc64_phys_to_real(uint64_t addr)
{
    return addr & ~0xF000000000000000ULL;
}

static inline uint32_t rtas_ld(target_ulong phys, int n)
{
    return ldl_be_phys(ppc64_phys_to_real(phys) + 4*n);
}

All things considered, I might even prefer the latter solution as it makes the code flow more obvious. But I'll leave the final call to you.


Alex
Alexey Kardashevskiy - Sept. 6, 2013, 9:42 a.m.
On 09/06/2013 06:45 PM, Alexander Graf wrote:
> 
> On 06.09.2013, at 10:10, Alexey Kardashevskiy wrote:
> 
>> On the real hardware, RTAS is called in real mode and therefore
>> ignores top 4 bits of the address passed in the call.
>>
>> This fixes QEMU to use softmmu which can chop top 4 bits
>> if MSR DR is not set.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> Changes:
>> v2:
>> * masking from replaced with the use of cpu_ldl_data which can handle
>> realmode case properly
>> ---
>> hw/ppc/spapr_hcall.c | 8 +++++---
>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
>> index 063bd36..30f90bf 100644
>> --- a/hw/ppc/spapr_hcall.c
>> +++ b/hw/ppc/spapr_hcall.c
>> @@ -4,6 +4,7 @@
>> #include "hw/ppc/spapr.h"
>> #include "mmu-hash64.h"
>> #include "cpu-models.h"
>> +#include "exec/softmmu_exec.h"
>>
>> #include <libfdt.h>
>>
>> @@ -523,10 +524,11 @@ static target_ulong h_cede(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>> static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr,
>>                            target_ulong opcode, target_ulong *args)
>> {
>> +    CPUPPCState *env = &cpu->env;
>>     target_ulong rtas_r3 = args[0];
>> -    uint32_t token = ldl_be_phys(rtas_r3);
>> -    uint32_t nargs = ldl_be_phys(rtas_r3 + 4);
>> -    uint32_t nret = ldl_be_phys(rtas_r3 + 8);
>> +    uint32_t token = cpu_ldl_data(env, rtas_r3);
>> +    uint32_t nargs = cpu_ldl_data(env, rtas_r3 + 4);
>> +    uint32_t nret = cpu_ldl_data(env, rtas_r3 + 8);
> 
> Wow - this really aren't that many memory accesses.

> So looking through rtas calls that come after this dispatch, from what I see they mostly use rtas_ld and rtas_st. They are defined as
> 
> static inline uint32_t rtas_ld(target_ulong phys, int n)
> {
>     return ldl_be_phys(phys + 4*n);
> }
> 
> static inline void rtas_st(target_ulong phys, int n, uint32_t val)
> {
>     stl_be_phys(phys + 4*n, val);
> }
> 
> which means we need to change them as well to make rtas subcall memory accesses resolve properly.
>
> Which brings me to my next question: Why don't we use rtas_ld in the 3 calls above? It looks like a perfect fit really.


They do not chop top bits too.


> Also, just changing the call to cpu_ldl_data will potentially break I
> think. Imagine you do cpu_synchronize_state with DR=1 in a previous call.
> Now comes an RTAS call. Because you don't do a cpu_synchronize_state() call
> here, you will still have the DR=1 in MSR, accessing virtual memory at a
> potentially very low address that may not be mapped in.
> 



> So before you do the cpu_ldl_data you need to either do
> cpu_synchronize_state() - which can be slow - or you need to manually
> change MSR.DR to 0 so that we end up accessing the correct memory
> location always. If you encapsulate that in rtas_ld and rtas_st it might
> end up looking quite simple:

> 
> static inline uint32_t rtas_ld(CPUPPCState *env, target_ulong phys, int n)
> {
>     uint32_t r;
>     target_ulong msr = env->msr;
> 
>     /* Make sure we access RTAS data in guest real mode */
>     env->msr ~= MSR_DR;
>     r = ldl_be_phys(phys + 4*n);
> 
>     env->msr = msr;
>     return r;
> }
> 
> Or if you think it's too complicated, you can simply make it


You added *env - this will result in almost 1000 lines patch - I did it
earlier today, I just did not post that pointless patch.


> static inline uint64_t ppc64_phys_to_real(uint64_t addr)
> {
>     return addr & ~0xF000000000000000ULL;
> }
> 
> static inline uint32_t rtas_ld(target_ulong phys, int n)
> {
>     return ldl_be_phys(ppc64_phys_to_real(phys) + 4*n);
> }
> 

> All things considered, I might even prefer the latter solution as it
> makes the code flow more obvious. But I'll leave the final call to you.


The latter will do the job, yes, I'll go for it as you are fine with it.
Thanks.



ps. please review in-kernel xics thing :)

Patch

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 063bd36..30f90bf 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -4,6 +4,7 @@ 
 #include "hw/ppc/spapr.h"
 #include "mmu-hash64.h"
 #include "cpu-models.h"
+#include "exec/softmmu_exec.h"
 
 #include <libfdt.h>
 
@@ -523,10 +524,11 @@  static target_ulong h_cede(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 static target_ulong h_rtas(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                            target_ulong opcode, target_ulong *args)
 {
+    CPUPPCState *env = &cpu->env;
     target_ulong rtas_r3 = args[0];
-    uint32_t token = ldl_be_phys(rtas_r3);
-    uint32_t nargs = ldl_be_phys(rtas_r3 + 4);
-    uint32_t nret = ldl_be_phys(rtas_r3 + 8);
+    uint32_t token = cpu_ldl_data(env, rtas_r3);
+    uint32_t nargs = cpu_ldl_data(env, rtas_r3 + 4);
+    uint32_t nret = cpu_ldl_data(env, rtas_r3 + 8);
 
     return spapr_rtas_call(cpu, spapr, token, nargs, rtas_r3 + 12,
                            nret, rtas_r3 + 12 + 4*nargs);