Patchwork [v2] spapr: Add "memop" hypercall

login
register
mail settings
Submitter Benjamin Herrenschmidt
Date May 25, 2012, 3:53 a.m.
Message ID <1337918007.16119.13.camel@pasglop>
Download mbox | patch
Permalink /patch/161241/
State New
Headers show

Comments

Benjamin Herrenschmidt - May 25, 2012, 3:53 a.m.
This adds a qemu-specific hypervisor call to the pseries machine
which allows to do what amounts to memmove, memcpy and xor over
regions of physical memory such as the framebuffer.

This is the simplest way to get usable framebuffer speed from
SLOF since the framebuffer isn't mapped in the VRMA and so would
otherwise require an hcall per 8 bytes access.

The performance is still not great but usable, and can be improved
with a more complex implementation of the hcall itself if needed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

v2: - Added documentation for our private hcalls
    - Fixed coding style issues

 docs/specs/ppc-spapr-hcalls.txt |   78 +++++++++++++++++++++++++++++++++++++++
 hw/spapr.h                      |    3 +-
 hw/spapr_hcall.c                |   52 ++++++++++++++++++++++++++
 3 files changed, 132 insertions(+), 1 deletions(-)
 create mode 100644 docs/specs/ppc-spapr-hcalls.txt
Alexander Graf - May 25, 2012, 8:30 a.m.
On 25.05.2012, at 05:53, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> This adds a qemu-specific hypervisor call to the pseries machine
> which allows to do what amounts to memmove, memcpy and xor over
> regions of physical memory such as the framebuffer.
> 
> This is the simplest way to get usable framebuffer speed from
> SLOF since the framebuffer isn't mapped in the VRMA and so would
> otherwise require an hcall per 8 bytes access.
> 
> The performance is still not great but usable, and can be improved
> with a more complex implementation of the hcall itself if needed.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> 
> v2: - Added documentation for our private hcalls

Very cool :). This really does belong to QEMU's documentation, as these hypercall are basically private SLOF <-> QEMU backdoor interface.

>    - Fixed coding style issues
> 
> docs/specs/ppc-spapr-hcalls.txt |   78 +++++++++++++++++++++++++++++++++++++++
> hw/spapr.h                      |    3 +-
> hw/spapr_hcall.c                |   52 ++++++++++++++++++++++++++
> 3 files changed, 132 insertions(+), 1 deletions(-)
> create mode 100644 docs/specs/ppc-spapr-hcalls.txt
> 
> diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.txt
> new file mode 100644
> index 0000000..4b3fa9a
> --- /dev/null
> +++ b/docs/specs/ppc-spapr-hcalls.txt
> @@ -0,0 +1,78 @@
> +When used with the "pseries" machine type, qemu-system-ppc64 implement

s

> +a set of hypervisor calls using a subset of the server "PAPR" specification
> +(IBM internal at this point), which is also what IBM

's

> proprietary hypervisor
> +adheres too.
> +
> +The subset is selected based on the requirements of Linux as a guest.
> +
> +In addition to those calls, we have added our own private hypervisor
> +calls which are mostly used as a private interface between the firmware
> +running in the guest and qemu.

QEMU

> +
> +All those hypercalls start at hcall number 0xf000 which correspond
> +to a implementation specific range in PAPR.
> +
> +- H_RTAS (0xf000)
> +
> +RTAS is a set of runtime services generally provided by the firmware
> +inside the guest to the operating system. It predates the existence
> +of hypervisors (it was originally an extension to Open Firmware) and
> +is still used by PAPR to provide various services that aren't performance
> +sensitive.
> +
> +We currently implement the RTAS services in qemu

QEMU (won't comment on it below, please just search&replace it ;))

> itself. The actual RTAS
> +"firmware" blob in the guest is a small stub of a few instructions which
> +calls our private H_RTAS hypervisor call to pass the RTAS calls to qemu.
> +
> +Arguments:
> +
> +  r3 : H_RTAS (0xf000)
> +  r4 : Guest physical address of RTAS parameter block
> +
> +Returns:
> +
> +  H_SUCCESS   : Successully called the RTAS function (RTAS result
> +                will have been stored in the parameter block)
> +  H_PARAMETER : Unknown token
> +
> +- H_LOGICAL_MEMOP (0xf001)
> +
> +When the guest runs in "real mode" (in powerpc lingua this means
> +with MMU disabled, ie guest effective == guest physical), it only
> +has access to a subset of memory and no IOs.
> +
> +PAPR provides a set of hypervisor calls to perform cachable or
> +non-cachable accesses to any guest physical addresses that the
> +guest can use in order to access IO devices while in real mode.
> +
> +This is typically used by the firmware running in the guest.
> +
> +However, doing a hypercall for each access is extremely inefficient
> +(even more so when running KVM) when accessing the frame buffer. In
> +that case, things like scrolling become unusably slow.
> +
> +This hypercall allows the guest to request a "memory op" to be applied
> +to memory. The supported memory ops at this point are to copy a range
> +of memory (supports overlap of source and destination) and XOR which
> +is used by our SLOF firmware to invert the screen.
> +
> +Arguments:
> +
> +  r3: H_LOGICAL_MEMOP (0xf001)
> +  r4: Guest physical address of destination
> +  r5: Guest physical address of source
> +  r6: Individual element size
> +        0 = 1 byte
> +        1 = 2 bytes
> +        2 = 4 bytes
> +        3 = 8 bytes
> +  r7: Number of elements
> +  r8: Operation
> +        0 = copy
> +        1 = xor
> +
> +Returns:
> +
> +  H_SUCCESS   : Success
> +  H_PARAMETER : Invalid argument
> +
> diff --git a/hw/spapr.h b/hw/spapr.h
> index 7c497aa..0343f33 100644
> --- a/hw/spapr.h
> +++ b/hw/spapr.h
> @@ -264,7 +264,8 @@ typedef struct sPAPREnvironment {
>  */
> #define KVMPPC_HCALL_BASE       0xf000
> #define KVMPPC_H_RTAS           (KVMPPC_HCALL_BASE + 0x0)
> -#define KVMPPC_HCALL_MAX        KVMPPC_H_RTAS
> +#define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
> +#define KVMPPC_HCALL_MAX        KVMPPC_H_LOGICAL_MEMOP
> 
> extern sPAPREnvironment *spapr;
> 
> diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
> index 94bb504..5211364 100644
> --- a/hw/spapr_hcall.c
> +++ b/hw/spapr_hcall.c
> @@ -608,6 +608,57 @@ static target_ulong h_logical_store(CPUPPCState *env, sPAPREnvironment *spapr,
>     return H_PARAMETER;
> }
> 
> +static target_ulong h_logical_memop(CPUPPCState *env, sPAPREnvironment *spapr,
> +                                    target_ulong opcode, target_ulong *args)
> +{
> +    target_ulong dst   = args[0]; /* Destination address */
> +    target_ulong src   = args[1]; /* Source address */
> +    target_ulong esize = args[2]; /* Element size (0=1,1=2,2=4,3=8) */
> +    target_ulong count = args[3]; /* Element count */
> +    target_ulong op    = args[4]; /* 0 = copy, 1 = invert */
> +    uint64_t tmp;
> +    unsigned int mask = (1 << esize) - 1;
> +    int step = 1 << esize;
> +
> +    if (count > 0x80000000) {
> +    return H_PARAMETER;

Indentation?

> +    }
> +
> +    if ((dst & mask) || (src & mask)) {
> +        return H_PARAMETER;
> +    }

If (op > 1) return H_PARAMETER;

> +
> +    if (dst >= src && dst < (src + (count << esize))) {
> +            dst = dst + ((count - 1) << esize);
> +            src = src + ((count - 1) << esize);
> +            step = -step;
> +    }
> +
> +    while (count--) {
> +        switch (esize) {
> +        case 0: tmp = ldub_phys(src);

I'm surprised checkpatch didn't complain here. Please do

case x:
    foo();
    break();

> break;
> +        case 1: tmp = lduw_phys(src); break;
> +        case 2: tmp = ldl_phys(src);  break;
> +        case 3: tmp = ldq_phys(src);  break;
> +        default:
> +        return H_PARAMETER;

Indentation?

> +        }
> +        if (op) {

op == 1

> +            tmp = ~tmp;
> +        }
> +        switch (esize) {
> +        case 0: stb_phys(dst, tmp); break;
> +        case 1: stw_phys(dst, tmp); break;
> +        case 2: stl_phys(dst, tmp); break;
> +        case 3: stq_phys(dst, tmp);

Same as above

Otherwise a good idea :).


Alex

> break;
> +        }
> +        dst = dst + step;
> +        src = src + step;
> +    }
> +
> +    return H_SUCCESS;
> +}
> +
> static target_ulong h_logical_icbi(CPUPPCState *env, sPAPREnvironment *spapr,
>                                    target_ulong opcode, target_ulong *args)
> {
> @@ -700,6 +751,7 @@ static void hypercall_register_types(void)
>     spapr_register_hypercall(H_LOGICAL_CACHE_STORE, h_logical_store);
>     spapr_register_hypercall(H_LOGICAL_ICBI, h_logical_icbi);
>     spapr_register_hypercall(H_LOGICAL_DCBF, h_logical_dcbf);
> +    spapr_register_hypercall(KVMPPC_H_LOGICAL_MEMOP, h_logical_memop);
> 
>     /* qemu/KVM-PPC specific hcalls */
>     spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Herrenschmidt - May 25, 2012, 8:36 a.m.
On Fri, 2012-05-25 at 10:30 +0200, Alexander Graf wrote:

> > +    while (count--) {
> > +        switch (esize) {
> > +        case 0: tmp = ldub_phys(src);
> 
> I'm surprised checkpatch didn't complain here. Please do
> 
> case x:
>     foo();
>     break();
> 
> > break;
> > +        case 1: tmp = lduw_phys(src); break;
> > +        case 2: tmp = ldl_phys(src);  break;
> > +        case 3: tmp = ldq_phys(src);  break;
> > +        default:
> > +        return H_PARAMETER;

Checkpatch absolutely complained and I decided to ignore it, seriously,
you really want to replace a nice & readable piece of code with
something that takes 3 pages and is generally gross & ugly ?

Some times, you have to ignore check patch and let sanity prevail.

Ben.

> Indentation?

Not sure what's up with identation, I had it all fixed up to please
checkpatch, maybe I screwed up the sending of the patch itself. Oh
well, I'm off to hospital on monday so that will have to wait til I'm
back (I regret you didn't make those comments on the previous iteration
of the patch though).

Ben.
Alexander Graf - May 25, 2012, 8:54 a.m.
On 25.05.2012, at 10:36, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Fri, 2012-05-25 at 10:30 +0200, Alexander Graf wrote:
> 
>>> +    while (count--) {
>>> +        switch (esize) {
>>> +        case 0: tmp = ldub_phys(src);
>> 
>> I'm surprised checkpatch didn't complain here. Please do
>> 
>> case x:
>>    foo();
>>    break();
>> 
>>> break;
>>> +        case 1: tmp = lduw_phys(src); break;
>>> +        case 2: tmp = ldl_phys(src);  break;
>>> +        case 3: tmp = ldq_phys(src);  break;
>>> +        default:
>>> +        return H_PARAMETER;
> 
> Checkpatch absolutely complained and I decided to ignore it, seriously,
> you really want to replace a nice & readable piece of code with
> something that takes 3 pages and is generally gross & ugly ?
> 
> Some times, you have to ignore check patch and let sanity prevail.

I'm not all that keen on coding style rules. But check out arch/powerpc/kvm/emulate.c and tell me that it's a good idea to go with this "clean" approach. If you want it really clean, put the whole chunk above into a geberic helper that allows for everyone to say "read n bytes of data with native endianness into a u64". In that code, the more verbose coding style checkpatch suggests doesn't hurt and your function becomes even easier to read :)

> 
> Ben.
> 
>> Indentation?
> 
> Not sure what's up with identation, I had it all fixed up to please
> checkpatch, maybe I screwed up the sending of the patch itself.

It could be my mailer too, no idea :). Just stumbled over it.

> Oh
> well, I'm off to hospital on monday so that will have to wait til I'm
> back (I regret you didn't make those comments on the previous iteration
> of the patch though).

Yeah, it's a shame I didn't read through it more thoroughly earlier - at least it didn't take weeks in this round ;).

No worries though, if you can't make it until Monday, I'll fix it up myself afterwards :). There's no black magic involved here, so I should be ok to respin myself.


Alex
Benjamin Herrenschmidt - May 25, 2012, 9:24 a.m.
On Fri, 2012-05-25 at 10:54 +0200, Alexander Graf wrote:

> >> case x:
> >>    foo();
> >>    break();
> >> 
> >>> break;
> >>> +        case 1: tmp = lduw_phys(src); break;
> >>> +        case 2: tmp = ldl_phys(src);  break;
> >>> +        case 3: tmp = ldq_phys(src);  break;
> >>> +        default:
> >>> +        return H_PARAMETER;
> > 
> > Checkpatch absolutely complained and I decided to ignore it, seriously,
> > you really want to replace a nice & readable piece of code with
> > something that takes 3 pages and is generally gross & ugly ?
> > 
> > Some times, you have to ignore check patch and let sanity prevail.
> 
> I'm not all that keen on coding style rules. But check out
> arch/powerpc/kvm/emulate.c and tell me that it's a good idea to go
> with this "clean" approach. If you want it really clean, put the whole
> chunk above into a geberic helper that allows for everyone to say
> "read n bytes of data with native endianness into a u64". In that
> code, the more verbose coding style checkpatch suggests doesn't hurt
> and your function becomes even easier to read :)

I find your lack of taste disturbing Luke :-)

> Yeah, it's a shame I didn't read through it more thoroughly earlier - at least it didn't take weeks in this round ;).
> 
> No worries though, if you can't make it until Monday, I'll fix it up myself afterwards :). There's no black magic involved here,
> so I should be ok to respin myself.

Sure. No worries. To test it properly you really need a newer SLOF and
the patch to add -vga tho, I'll sort that out when I'm back.

Cheers,
Ben.

> Alex
Alexander Graf - May 25, 2012, 10:29 a.m.
On 25.05.2012, at 11:24, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:

> On Fri, 2012-05-25 at 10:54 +0200, Alexander Graf wrote:
> 
>>>> case x:
>>>>   foo();
>>>>   break();
>>>> 
>>>>> break;
>>>>> +        case 1: tmp = lduw_phys(src); break;
>>>>> +        case 2: tmp = ldl_phys(src);  break;
>>>>> +        case 3: tmp = ldq_phys(src);  break;
>>>>> +        default:
>>>>> +        return H_PARAMETER;
>>> 
>>> Checkpatch absolutely complained and I decided to ignore it, seriously,
>>> you really want to replace a nice & readable piece of code with
>>> something that takes 3 pages and is generally gross & ugly ?
>>> 
>>> Some times, you have to ignore check patch and let sanity prevail.
>> 
>> I'm not all that keen on coding style rules. But check out
>> arch/powerpc/kvm/emulate.c and tell me that it's a good idea to go
>> with this "clean" approach. If you want it really clean, put the whole
>> chunk above into a geberic helper that allows for everyone to say
>> "read n bytes of data with native endianness into a u64". In that
>> code, the more verbose coding style checkpatch suggests doesn't hurt
>> and your function becomes even easier to read :)
> 
> I find your lack of taste disturbing Luke :-)

Heh :). I merely try to not collide with laws I can't influence easily and don't bother to change enough ;).

> 
>> Yeah, it's a shame I didn't read through it more thoroughly earlier - at least it didn't take weeks in this round ;).
>> 
>> No worries though, if you can't make it until Monday, I'll fix it up myself afterwards :). There's no black magic involved here,
>> so I should be ok to respin myself.
> 
> Sure. No worries. To test it properly you really need a newer SLOF and
> the patch to add -vga tho, I'll sort that out when I'm back.

Hrm. Send them over in a private mail and push the WIP SLOF to a branch somewhere then please. There are some community folks interested in VGA with -M pseries, so I'd like to at least have it prototype style available for folks to play with :)


Alex
Benjamin Herrenschmidt - May 25, 2012, 12:41 p.m.
> > I find your lack of taste disturbing Luke :-)
> 
> Heh :). I merely try to not collide with laws I can't influence easily and don't bother to change enough ;).

Compliance with established authority has never been my strong point :-)

> >> Yeah, it's a shame I didn't read through it more thoroughly earlier - at least it didn't take weeks in this round ;).
> >> 
> >> No worries though, if you can't make it until Monday, I'll fix it up myself afterwards :). There's no black magic involved here,
> >> so I should be ok to respin myself.
> > 
> > Sure. No worries. To test it properly you really need a newer SLOF and
> > the patch to add -vga tho, I'll sort that out when I'm back.
> 
> Hrm. Send them over in a private mail and push the WIP SLOF to a
> branch somewhere then please. There are some community folks interested
> in VGA with -M pseries, so I'd like to at least have it prototype style
> available for folks to play with :)

Will try to shoot stuff somewhere you can catch it before I get
on the operating table.

Cheers,
Ben.

Patch

diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.txt
new file mode 100644
index 0000000..4b3fa9a
--- /dev/null
+++ b/docs/specs/ppc-spapr-hcalls.txt
@@ -0,0 +1,78 @@ 
+When used with the "pseries" machine type, qemu-system-ppc64 implement
+a set of hypervisor calls using a subset of the server "PAPR" specification
+(IBM internal at this point), which is also what IBM proprietary hypervisor
+adheres too.
+
+The subset is selected based on the requirements of Linux as a guest.
+
+In addition to those calls, we have added our own private hypervisor
+calls which are mostly used as a private interface between the firmware
+running in the guest and qemu.
+
+All those hypercalls start at hcall number 0xf000 which correspond
+to a implementation specific range in PAPR.
+
+- H_RTAS (0xf000)
+
+RTAS is a set of runtime services generally provided by the firmware
+inside the guest to the operating system. It predates the existence
+of hypervisors (it was originally an extension to Open Firmware) and
+is still used by PAPR to provide various services that aren't performance
+sensitive.
+
+We currently implement the RTAS services in qemu itself. The actual RTAS
+"firmware" blob in the guest is a small stub of a few instructions which
+calls our private H_RTAS hypervisor call to pass the RTAS calls to qemu.
+
+Arguments:
+
+  r3 : H_RTAS (0xf000)
+  r4 : Guest physical address of RTAS parameter block
+
+Returns:
+
+  H_SUCCESS   : Successully called the RTAS function (RTAS result
+                will have been stored in the parameter block)
+  H_PARAMETER : Unknown token
+
+- H_LOGICAL_MEMOP (0xf001)
+
+When the guest runs in "real mode" (in powerpc lingua this means
+with MMU disabled, ie guest effective == guest physical), it only
+has access to a subset of memory and no IOs.
+
+PAPR provides a set of hypervisor calls to perform cachable or
+non-cachable accesses to any guest physical addresses that the
+guest can use in order to access IO devices while in real mode.
+
+This is typically used by the firmware running in the guest.
+
+However, doing a hypercall for each access is extremely inefficient
+(even more so when running KVM) when accessing the frame buffer. In
+that case, things like scrolling become unusably slow.
+
+This hypercall allows the guest to request a "memory op" to be applied
+to memory. The supported memory ops at this point are to copy a range
+of memory (supports overlap of source and destination) and XOR which
+is used by our SLOF firmware to invert the screen.
+
+Arguments:
+
+  r3: H_LOGICAL_MEMOP (0xf001)
+  r4: Guest physical address of destination
+  r5: Guest physical address of source
+  r6: Individual element size
+        0 = 1 byte
+        1 = 2 bytes
+        2 = 4 bytes
+        3 = 8 bytes
+  r7: Number of elements
+  r8: Operation
+        0 = copy
+        1 = xor
+
+Returns:
+
+  H_SUCCESS   : Success
+  H_PARAMETER : Invalid argument
+
diff --git a/hw/spapr.h b/hw/spapr.h
index 7c497aa..0343f33 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -264,7 +264,8 @@  typedef struct sPAPREnvironment {
  */
 #define KVMPPC_HCALL_BASE       0xf000
 #define KVMPPC_H_RTAS           (KVMPPC_HCALL_BASE + 0x0)
-#define KVMPPC_HCALL_MAX        KVMPPC_H_RTAS
+#define KVMPPC_H_LOGICAL_MEMOP  (KVMPPC_HCALL_BASE + 0x1)
+#define KVMPPC_HCALL_MAX        KVMPPC_H_LOGICAL_MEMOP
 
 extern sPAPREnvironment *spapr;
 
diff --git a/hw/spapr_hcall.c b/hw/spapr_hcall.c
index 94bb504..5211364 100644
--- a/hw/spapr_hcall.c
+++ b/hw/spapr_hcall.c
@@ -608,6 +608,57 @@  static target_ulong h_logical_store(CPUPPCState *env, sPAPREnvironment *spapr,
     return H_PARAMETER;
 }
 
+static target_ulong h_logical_memop(CPUPPCState *env, sPAPREnvironment *spapr,
+                                    target_ulong opcode, target_ulong *args)
+{
+    target_ulong dst   = args[0]; /* Destination address */
+    target_ulong src   = args[1]; /* Source address */
+    target_ulong esize = args[2]; /* Element size (0=1,1=2,2=4,3=8) */
+    target_ulong count = args[3]; /* Element count */
+    target_ulong op    = args[4]; /* 0 = copy, 1 = invert */
+    uint64_t tmp;
+    unsigned int mask = (1 << esize) - 1;
+    int step = 1 << esize;
+
+    if (count > 0x80000000) {
+	return H_PARAMETER;
+    }
+
+    if ((dst & mask) || (src & mask)) {
+        return H_PARAMETER;
+    }
+
+    if (dst >= src && dst < (src + (count << esize))) {
+            dst = dst + ((count - 1) << esize);
+            src = src + ((count - 1) << esize);
+            step = -step;
+    }
+
+    while (count--) {
+        switch (esize) {
+        case 0: tmp = ldub_phys(src); break;
+        case 1: tmp = lduw_phys(src); break;
+        case 2: tmp = ldl_phys(src);  break;
+        case 3: tmp = ldq_phys(src);  break;
+        default:
+        return H_PARAMETER;
+        }
+        if (op) {
+            tmp = ~tmp;
+        }
+        switch (esize) {
+        case 0: stb_phys(dst, tmp); break;
+        case 1: stw_phys(dst, tmp); break;
+        case 2: stl_phys(dst, tmp); break;
+        case 3: stq_phys(dst, tmp); break;
+        }
+        dst = dst + step;
+        src = src + step;
+    }
+
+    return H_SUCCESS;
+}
+
 static target_ulong h_logical_icbi(CPUPPCState *env, sPAPREnvironment *spapr,
                                    target_ulong opcode, target_ulong *args)
 {
@@ -700,6 +751,7 @@  static void hypercall_register_types(void)
     spapr_register_hypercall(H_LOGICAL_CACHE_STORE, h_logical_store);
     spapr_register_hypercall(H_LOGICAL_ICBI, h_logical_icbi);
     spapr_register_hypercall(H_LOGICAL_DCBF, h_logical_dcbf);
+    spapr_register_hypercall(KVMPPC_H_LOGICAL_MEMOP, h_logical_memop);
 
     /* qemu/KVM-PPC specific hcalls */
     spapr_register_hypercall(KVMPPC_H_RTAS, h_rtas);