Patchwork [V2,1/2] kvm tools: Add initial SPAPR PPC64 architecture support

Submitter Matt Evans
Date Dec. 13, 2011, 7 a.m.
Message ID <1323759627-12752-2-git-send-email-matt@ozlabs.org>
Permalink /patch/131023/
State New

Comments

Matt Evans - Dec. 13, 2011, 7 a.m.
This patch adds a new arch directory, powerpc, basic file structure, register
setup and where necessary stubs out arch-specific functions (e.g. interrupts,
runloop exits) that later patches will provide.  The target is an
SPAPR-compliant PPC64 machine (i.e. pSeries); there is no support for PPC32 or
'bare metal' PPC64 guests as yet.  Subsequent patches implement the hcalls and
RTAS required to boot SPAPR pSeries kernels.

Memory is mapped from hugetlbfs (as that is currently required by upstream PPC64
HV-mode KVM).  The mapping of a VRMA region is yet to be implemented; this is
only necessary on processors that don't support VRMA, e.g. <= P6.  Work is
therefore needed to get this going on pre-P7 CPUs.
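
For orientation, the guest-physical layout that the patch places on top of this mapping can be sketched roughly as below. The constants are copied from kvm-arch.h in this patch; the struct and function names are hypothetical, and the minimum-size check is an extra assumption that the patch itself does not make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants as defined in kvm-arch.h in this patch. */
#define PPC_MMIO_START 0x3F0000000000ULL /* 63 TiB */
#define FDT_MAX_SIZE   0x10000ULL
#define RTAS_MAX_SIZE  0x10000ULL

/* Hypothetical helper type, not part of the patch. */
struct guest_layout {
    uint64_t fdt_gra;  /* guest real address of the device tree */
    uint64_t rtas_gra; /* guest real address of the RTAS blob */
};

/*
 * Place the FDT at the top of RAM and RTAS just below it.
 * Returns false if RAM would reach the MMIO window, or (an added
 * assumption) if RAM is too small to hold both regions.
 */
static bool guest_layout_compute(uint64_t ram_size, struct guest_layout *out)
{
    assert(out != NULL);

    if (ram_size >= PPC_MMIO_START)
        return false;
    if (ram_size < FDT_MAX_SIZE + RTAS_MAX_SIZE)
        return false;

    out->fdt_gra  = ram_size - FDT_MAX_SIZE;
    out->rtas_gra = out->fdt_gra - RTAS_MAX_SIZE;
    return true;
}
```

With 1 GiB of guest RAM this puts the FDT at 0x3FFF0000 and RTAS at 0x3FFE0000.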

Processor state is set up as a guest kernel would expect (both primary and
secondaries), and SMP is fully supported.
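
As a rough illustration of that processor state, the sketch below mirrors what kvm_cpu__setup_regs() in this patch does for the primary and secondary vcpus. struct boot_regs and boot_regs_init() are hypothetical stand-ins for struct kvm_regs and the real function; the KVM_SET_REGS ioctl is left out so that the logic can be read on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSR bits and entry points as defined in this patch. */
#define MSR_SF                      (1ULL << 63) /* 64-bit mode */
#define MSR_ME                      (1ULL << 12) /* machine check enable */
#define KERNEL_START_ADDR           0x0000000000000000ULL
#define KERNEL_SECONDARY_START_ADDR 0x0000000000000060ULL

/* Hypothetical stand-in for the struct kvm_regs fields used here. */
struct boot_regs {
    uint64_t pc;
    uint64_t msr;
    uint64_t gpr[32];
};

static void boot_regs_init(struct boot_regs *r, unsigned long cpu_id,
                           uint64_t fdt_gra)
{
    assert(r != NULL);
    *r = (struct boot_regs){0};

    if (cpu_id == 0) {
        /* Primary: enter at the kernel start with the FDT address in r3. */
        r->pc     = KERNEL_START_ADDR;
        r->gpr[3] = fdt_gra;
        r->gpr[5] = 0;
    } else {
        /* Secondary: enter at the secondary start point with the CPU id in r3. */
        r->pc     = KERNEL_SECONDARY_START_ADDR;
        r->gpr[3] = cpu_id;
    }
    /* 64-bit, non-HV, ME: the same value as the literal in the patch. */
    r->msr = MSR_SF | MSR_ME;
}
```

MSR_SF | MSR_ME evaluates to 0x8000000000001000, which is the literal the patch writes into r->msr.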

Finally, support is added for simply loading flat binary kernels (plus initrd).
(bzImages are not used on PPC, and this series does not add zImage support or an
ELF loader.)  The intention is to later support loading firmware such as SLOF.
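
Loading a flat binary amounts to reading the file straight into guest RAM at a fixed address. The sketch below is one possible bounded variant and not the loop used in this patch: load_image_bounded() is a hypothetical name, and it limits each read() to the space remaining instead of checking the pointer after the copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the whole of fd into dst without ever writing more than cap bytes.
 * Returns the number of bytes loaded, or -1 on a read error or if the
 * image does not fit in cap bytes.
 */
static ssize_t load_image_bounded(int fd, void *dst, size_t cap)
{
    unsigned char *p = dst;
    unsigned char probe;
    size_t total = 0;
    ssize_t nr;

    assert(dst != NULL);

    while (total < cap) {
        nr = read(fd, p + total, cap - total);
        if (nr < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (nr == 0)
            return (ssize_t)total; /* clean EOF: the whole image fits */
        total += (size_t)nr;
    }

    /* The region is full; the image fits only if the file ends here. */
    do {
        nr = read(fd, &probe, 1);
    } while (nr < 0 && errno == EINTR);

    return nr == 0 ? (ssize_t)total : -1;
}
```

In a loader like the one in this series, dst would be the host address for the load address and cap the space left before the next region (for the kernel, the gap up to the initrd load address).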

Signed-off-by: Matt Evans <matt@ozlabs.org>
---
 tools/kvm/Makefile                           |   10 +
 tools/kvm/kvm.c                              |    3 +
 tools/kvm/powerpc/include/kvm/barrier.h      |    6 +
 tools/kvm/powerpc/include/kvm/kvm-arch.h     |   72 ++++++++
 tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h |   66 ++++++++
 tools/kvm/powerpc/ioport.c                   |   18 ++
 tools/kvm/powerpc/irq.c                      |   40 +++++
 tools/kvm/powerpc/kvm-cpu.c                  |  233 ++++++++++++++++++++++++++
 tools/kvm/powerpc/kvm.c                      |  187 +++++++++++++++++++++
 9 files changed, 635 insertions(+), 0 deletions(-)
 create mode 100644 tools/kvm/powerpc/include/kvm/barrier.h
 create mode 100644 tools/kvm/powerpc/include/kvm/kvm-arch.h
 create mode 100644 tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
 create mode 100644 tools/kvm/powerpc/ioport.c
 create mode 100644 tools/kvm/powerpc/irq.c
 create mode 100644 tools/kvm/powerpc/kvm-cpu.c
 create mode 100644 tools/kvm/powerpc/kvm.c
Pekka Enberg - Dec. 13, 2011, 7:44 a.m.
On Tue, Dec 13, 2011 at 9:00 AM, Matt Evans <matt@ozlabs.org> wrote:
> This patch adds a new arch directory, powerpc, basic file structure, register
> setup and where necessary stubs out arch-specific functions (e.g. interrupts,
> runloop exits) that later patches will provide.  The target is an
> SPAPR-compliant PPC64 machine (i.e. pSeries); there is no support for PPC32 or
> 'bare metal' PPC64 guests as yet.  Subsequent patches implement the hcalls and
> RTAS required to boot SPAPR pSeries kernels.
>
> Memory is mapped from hugetlbfs (as that is currently required by upstream PPC64
> HV-mode KVM).  The mapping of a VRMA region is yet to be implemented; this is
> only necessary on processors that don't support VRMA, e.g. <= P6.  Work is
> therefore needed to get this going on pre-P7 CPUs.
>
> Processor state is set up as a guest kernel would expect (both primary and
> secondaries), and SMP is fully supported.
>
> Finally, support is added for simply loading flat binary kernels (plus initrd).
> (bzImages are not used on PPC, and this series does not add zImage support or an
> ELF loader.)  The intention is to later support loading firmware such as SLOF.
>
> Signed-off-by: Matt Evans <matt@ozlabs.org>

This looks nice and clean - but I don't really know PPC! Would you be
able to point me to some relevant documentation? Are there other PPC
folks out there that would be interested in taking a look at these
patches?

                        Pekka
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Graf - Dec. 13, 2011, 8:23 a.m.
On 13.12.2011, at 08:00, Matt Evans <matt@ozlabs.org> wrote:

> This patch adds a new arch directory, powerpc, basic file structure, register
> setup and where necessary stubs out arch-specific functions (e.g. interrupts,
> runloop exits) that later patches will provide.  The target is an
> SPAPR-compliant PPC64 machine (i.e. pSeries); there is no support for PPC32 or
> 'bare metal' PPC64 guests as yet.  Subsequent patches implement the hcalls and
> RTAS required to boot SPAPR pSeries kernels.
> 
> Memory is mapped from hugetlbfs (as that is currently required by upstream PPC64
> HV-mode KVM).  The mapping of a VRMA region is yet to be implemented; this is
> only necessary on processors that don't support VRMA, e.g. <= P6.  Work is
> therefore needed to get this going on pre-P7 CPUs.
> 
> Processor state is set up as a guest kernel would expect (both primary and
> secondaries), and SMP is fully supported.
> 
> Finally, support is added for simply loading flat binary kernels (plus initrd).
> (bzImages are not used on PPC, and this series does not add zImage support or an
> ELF loader.)  The intention is to later support loading firmware such as SLOF.
> 
> Signed-off-by: Matt Evans <matt@ozlabs.org>
> ---
> tools/kvm/Makefile                           |   10 +
> tools/kvm/kvm.c                              |    3 +
> tools/kvm/powerpc/include/kvm/barrier.h      |    6 +
> tools/kvm/powerpc/include/kvm/kvm-arch.h     |   72 ++++++++
> tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h |   66 ++++++++
> tools/kvm/powerpc/ioport.c                   |   18 ++
> tools/kvm/powerpc/irq.c                      |   40 +++++
> tools/kvm/powerpc/kvm-cpu.c                  |  233 ++++++++++++++++++++++++++
> tools/kvm/powerpc/kvm.c                      |  187 +++++++++++++++++++++
> 9 files changed, 635 insertions(+), 0 deletions(-)
> create mode 100644 tools/kvm/powerpc/include/kvm/barrier.h
> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-arch.h
> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
> create mode 100644 tools/kvm/powerpc/ioport.c
> create mode 100644 tools/kvm/powerpc/irq.c
> create mode 100644 tools/kvm/powerpc/kvm-cpu.c
> create mode 100644 tools/kvm/powerpc/kvm.c
> 
> diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
> index 2bf70c9..3f1e84a 100644
> --- a/tools/kvm/Makefile
> +++ b/tools/kvm/Makefile
> @@ -124,6 +124,16 @@ ifeq ($(ARCH),x86)
>    OTHEROBJS    += x86/bios/bios-rom.o
>    ARCH_INCLUDE := x86/include
> endif
> +# POWER/ppc:  Actually only support ppc64 currently.

Why? I usually run ppc32 user land. Doesn't that expose 'ppc' here?

> +ifeq ($(uname_M), ppc64)
> +    DEFINES += -DCONFIG_PPC
> +    OBJS    += powerpc/ioport.o
> +    OBJS    += powerpc/irq.o
> +    OBJS    += powerpc/kvm.o
> +    OBJS    += powerpc/kvm-cpu.o
> +    ARCH_INCLUDE := powerpc/include
> +    CFLAGS += -m64
> +endif
> 
> ###
> 
> diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
> index 35ca2c5..3fb46f6 100644
> --- a/tools/kvm/kvm.c
> +++ b/tools/kvm/kvm.c
> @@ -49,6 +49,9 @@ const char *kvm_exit_reasons[] = {
>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_DCR),
>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_NMI),
>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_INTERNAL_ERROR),
> +#ifdef CONFIG_PPC64
> +    DEFINE_KVM_EXIT_REASON(KVM_EXIT_PAPR_HCALL),
> +#endif
> };
> 
> extern struct kvm *kvm;
> diff --git a/tools/kvm/powerpc/include/kvm/barrier.h b/tools/kvm/powerpc/include/kvm/barrier.h
> new file mode 100644
> index 0000000..bc7d179
> --- /dev/null
> +++ b/tools/kvm/powerpc/include/kvm/barrier.h
> @@ -0,0 +1,6 @@
> +#ifndef _KVM_BARRIER_H_
> +#define _KVM_BARRIER_H_
> +
> +#include <asm/system.h>
> +
> +#endif /* _KVM_BARRIER_H_ */
> diff --git a/tools/kvm/powerpc/include/kvm/kvm-arch.h b/tools/kvm/powerpc/include/kvm/kvm-arch.h
> new file mode 100644
> index 0000000..da61774
> --- /dev/null
> +++ b/tools/kvm/powerpc/include/kvm/kvm-arch.h
> @@ -0,0 +1,72 @@
> +/*
> + * PPC64 architecture-specific definitions
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#ifndef KVM__KVM_ARCH_H
> +#define KVM__KVM_ARCH_H
> +
> +#include <stdbool.h>
> +#include <linux/types.h>
> +#include <time.h>
> +
> +#define KVM_NR_CPUS            (255)

Why?

> +
> +/*
> + * MMIO lives after RAM, but it'd be nice if it didn't constantly move.
> + * Choose a suitably high address, e.g. 63T...  This limits RAM size.
> + */
> +#define PPC_MMIO_START            0x3F0000000000UL
> +#define PPC_MMIO_SIZE            0x010000000000UL
> +
> +#define KERNEL_LOAD_ADDR            0x0000000000000000
> +#define KERNEL_START_ADDR           0x0000000000000000
> +#define KERNEL_SECONDARY_START_ADDR     0x0000000000000060
> +#define INITRD_LOAD_ADDR            0x0000000002800000
> +
> +#define FDT_MAX_SIZE                0x10000
> +#define RTAS_MAX_SIZE               0x10000
> +
> +#define TIMEBASE_FREQ               512000000ULL
> +
> +#define KVM_MMIO_START            PPC_MMIO_START
> +
> +/*
> + * This is the address that pci_get_io_space_block() starts allocating
> + * from.  Note that this is a PCI bus address.
> + */
> +#define KVM_PCI_MMIO_AREA        0x1000000
> +
> +struct kvm {
> +    int            sys_fd;        /* For system ioctls(), i.e. /dev/kvm */
> +    int            vm_fd;        /* For VM ioctls() */
> +    timer_t            timerid;    /* Posix timer for interrupts */
> +
> +    int            nrcpus;        /* Number of cpus to run */
> +
> +    u32            mem_slots;    /* for KVM_SET_USER_MEMORY_REGION */
> +
> +    u64            ram_size;
> +    void            *ram_start;
> +
> +    bool            nmi_disabled;
> +
> +    bool            single_step;
> +
> +    const char        *vmlinux;
> +    struct disk_image       **disks;
> +    int                     nr_disks;
> +    unsigned long        rtas_gra;
> +    unsigned long        rtas_size;
> +    unsigned long        fdt_gra;
> +    unsigned long        initrd_gra;
> +    unsigned long        initrd_size;
> +    const char        *name;
> +};
> +
> +#endif /* KVM__KVM_ARCH_H */
> diff --git a/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
> new file mode 100644
> index 0000000..64e4510
> --- /dev/null
> +++ b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
> @@ -0,0 +1,66 @@
> +/*
> + * PPC64 cpu-specific definitions
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#ifndef KVM__KVM_CPU_ARCH_H
> +#define KVM__KVM_CPU_ARCH_H
> +
> +/* Architecture-specific kvm_cpu definitions. */
> +
> +#include <linux/kvm.h>    /* for struct kvm_regs */
> +
> +#include <pthread.h>
> +
> +#define MSR_SF        (1UL<<63)
> +#define MSR_HV        (1UL<<60)
> +#define MSR_VEC        (1UL<<25)
> +#define MSR_VSX        (1UL<<23)
> +#define MSR_POW        (1UL<<18)
> +#define MSR_EE        (1UL<<15)
> +#define MSR_PR        (1UL<<14)
> +#define MSR_FP        (1UL<<13)
> +#define MSR_ME        (1UL<<12)
> +#define MSR_FE0        (1UL<<11)
> +#define MSR_SE        (1UL<<10)
> +#define MSR_BE        (1UL<<9)
> +#define MSR_FE1        (1UL<<8)
> +#define MSR_IR        (1UL<<5)
> +#define MSR_DR        (1UL<<4)
> +#define MSR_PMM        (1UL<<2)
> +#define MSR_RI        (1UL<<1)
> +#define MSR_LE        (1UL<<0)
> +
> +struct kvm;
> +
> +struct kvm_cpu {
> +    pthread_t        thread;        /* VCPU thread */
> +
> +    unsigned long        cpu_id;
> +
> +    struct kvm        *kvm;        /* parent KVM */
> +    int            vcpu_fd;    /* For VCPU ioctls() */
> +    struct kvm_run        *kvm_run;
> +
> +    struct kvm_regs        regs;
> +    struct kvm_sregs    sregs;
> +    struct kvm_fpu        fpu;
> +
> +    u8            is_running;
> +    u8            paused;
> +    u8            needs_nmi;
> +    /*
> +     * Although PPC KVM doesn't yet support coalesced MMIO, generic code
> +     * needs this in our kvm_cpu:
> +     */
> +    struct kvm_coalesced_mmio_ring  *ring;
> +};
> +
> +void kvm_cpu__irq(struct kvm_cpu *vcpu, int pin, int level);
> +
> +#endif /* KVM__KVM_CPU_ARCH_H */
> diff --git a/tools/kvm/powerpc/ioport.c b/tools/kvm/powerpc/ioport.c
> new file mode 100644
> index 0000000..a8e4dc3
> --- /dev/null
> +++ b/tools/kvm/powerpc/ioport.c
> @@ -0,0 +1,18 @@
> +/*
> + * PPC64 ioport platform setup.  There isn't any! :-)
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#include "kvm/ioport.h"
> +
> +#include <stdlib.h>
> +
> +void ioport__setup_arch(void)
> +{
> +    /* PPC has no legacy ioports to set up */
> +}
> diff --git a/tools/kvm/powerpc/irq.c b/tools/kvm/powerpc/irq.c
> new file mode 100644
> index 0000000..46aa64f
> --- /dev/null
> +++ b/tools/kvm/powerpc/irq.c
> @@ -0,0 +1,40 @@
> +/*
> + * PPC64 IRQ routines
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#include "kvm/irq.h"
> +#include "kvm/kvm.h"
> +#include "kvm/util.h"
> +
> +#include <linux/types.h>
> +#include <linux/rbtree.h>
> +#include <linux/list.h>
> +#include <linux/kvm.h>
> +#include <sys/ioctl.h>
> +
> +#include <stddef.h>
> +#include <stdlib.h>
> +
> +int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
> +{
> +    fprintf(stderr, "irq__register_device(%d, [%d], [%d], [%d]\n",
> +        dev, *num, *pin, *line);
> +    return 0;
> +}
> +
> +void irq__init(struct kvm *kvm)
> +{
> +    fprintf(stderr, __func__);
> +}
> +
> +int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
> +{
> +    die(__FUNCTION__);
> +    return 0;
> +}
> diff --git a/tools/kvm/powerpc/kvm-cpu.c b/tools/kvm/powerpc/kvm-cpu.c
> new file mode 100644
> index 0000000..ea99666
> --- /dev/null
> +++ b/tools/kvm/powerpc/kvm-cpu.c
> @@ -0,0 +1,233 @@
> +/*
> + * PPC64 processor support
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#include "kvm/kvm-cpu.h"
> +
> +#include "kvm/symbol.h"
> +#include "kvm/util.h"
> +#include "kvm/kvm.h"
> +
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <signal.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <stdio.h>
> +
> +static int debug_fd;
> +
> +void kvm_cpu__set_debug_fd(int fd)
> +{
> +    debug_fd = fd;
> +}
> +
> +int kvm_cpu__get_debug_fd(void)
> +{
> +    return debug_fd;
> +}
> +
> +static struct kvm_cpu *kvm_cpu__new(struct kvm *kvm)
> +{
> +    struct kvm_cpu *vcpu;
> +
> +    vcpu        = calloc(1, sizeof *vcpu);
> +    if (!vcpu)
> +        return NULL;
> +
> +    vcpu->kvm    = kvm;
> +
> +    return vcpu;
> +}
> +
> +void kvm_cpu__delete(struct kvm_cpu *vcpu)
> +{
> +    free(vcpu);
> +}
> +
> +struct kvm_cpu *kvm_cpu__init(struct kvm *kvm, unsigned long cpu_id)
> +{
> +    struct kvm_cpu *vcpu;
> +    int mmap_size;
> +    struct kvm_enable_cap papr_cap = { .cap = KVM_CAP_PPC_PAPR };
> +
> +    vcpu        = kvm_cpu__new(kvm);
> +    if (!vcpu)
> +        return NULL;
> +
> +    vcpu->cpu_id    = cpu_id;
> +
> +    vcpu->vcpu_fd = ioctl(vcpu->kvm->vm_fd, KVM_CREATE_VCPU, cpu_id);
> +    if (vcpu->vcpu_fd < 0)
> +        die_perror("KVM_CREATE_VCPU ioctl");
> +
> +    mmap_size = ioctl(vcpu->kvm->sys_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
> +    if (mmap_size < 0)
> +        die_perror("KVM_GET_VCPU_MMAP_SIZE ioctl");
> +
> +    vcpu->kvm_run = mmap(NULL, mmap_size, PROT_RW, MAP_SHARED, vcpu->vcpu_fd, 0);
> +    if (vcpu->kvm_run == MAP_FAILED)
> +        die("unable to mmap vcpu fd");
> +
> +    ioctl(vcpu->vcpu_fd, KVM_ENABLE_CAP, &papr_cap);

Have you tried running this on PR KVM? That should also need HIOR synchronization.

Alex

> +
> +    /*
> +     * We start all CPUs, directing non-primary threads into the kernel's
> +     * secondary start point.  When we come to support SLOF, we will start
> +     * only one and SLOF will RTAS call us to ask for others to be
> +     * started.  (FIXME: make more generic & interface with whichever
> +     * firmware a platform may be using.)
> +     */
> +    vcpu->is_running = true;
> +
> +    return vcpu;
> +}
> +
> +static void kvm_cpu__setup_fpu(struct kvm_cpu *vcpu)
> +{
> +    /* Don't have to do anything, there's no expected FPU state. */
> +}
> +
> +static void kvm_cpu__setup_regs(struct kvm_cpu *vcpu)
> +{
> +    /*
> +     * FIXME: This assumes PPC64 and Linux guest.  It doesn't use the
> +     * OpenFirmware entry method, but instead the "embedded" entry which
> +     * passes the FDT address directly.
> +     */
> +    struct kvm_regs *r = &vcpu->regs;
> +
> +    if (vcpu->cpu_id == 0) {
> +        r->pc = KERNEL_START_ADDR;
> +        r->gpr[3] = vcpu->kvm->fdt_gra;
> +        r->gpr[5] = 0;
> +    } else {
> +        r->pc = KERNEL_SECONDARY_START_ADDR;
> +        r->gpr[3] = vcpu->cpu_id;
> +    }
> +    r->msr = 0x8000000000001000UL; /* 64bit, non-HV, ME */
> +
> +    if (ioctl(vcpu->vcpu_fd, KVM_SET_REGS, &vcpu->regs) < 0)
> +        die_perror("KVM_SET_REGS failed");
> +}
> +
> +static void kvm_cpu__setup_sregs(struct kvm_cpu *vcpu)
> +{
> +    /*
> +     * No sregs setup is required on PPC64/SPAPR (but there may be setup
> +     * required for non-paravirtualised platforms, e.g. TLB/SLB setup).
> +     */
> +}
> +
> +/**
> + * kvm_cpu__reset_vcpu - reset virtual CPU to a known state
> + */
> +void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu)
> +{
> +    kvm_cpu__setup_regs(vcpu);
> +    kvm_cpu__setup_sregs(vcpu);
> +    kvm_cpu__setup_fpu(vcpu);
> +}
> +
> +/* kvm_cpu__irq - set KVM's IRQ flag on this vcpu */
> +void kvm_cpu__irq(struct kvm_cpu *vcpu, int pin, int level)
> +{
> +}
> +
> +void kvm_cpu__arch_nmi(struct kvm_cpu *cpu)
> +{
> +}
> +
> +bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
> +{
> +    bool ret = true;
> +    struct kvm_run *run = vcpu->kvm_run;
> +    switch(run->exit_reason) {
> +    default:
> +        ret = false;
> +    }
> +    return ret;
> +}
> +
> +#define CONDSTR_BIT(m, b) (((m) & MSR_##b) ? #b" " : "")
> +
> +void kvm_cpu__show_registers(struct kvm_cpu *vcpu)
> +{
> +    struct kvm_regs regs;
> +    struct kvm_sregs sregs;
> +    int r;
> +
> +    if (ioctl(vcpu->vcpu_fd, KVM_GET_REGS, &regs) < 0)
> +        die("KVM_GET_REGS failed");
> +        if (ioctl(vcpu->vcpu_fd, KVM_GET_SREGS, &sregs) < 0)
> +        die("KVM_GET_SREGS failed");
> +
> +    dprintf(debug_fd, "\n Registers:\n");
> +    dprintf(debug_fd, " NIP:   %016llx  MSR:   %016llx "
> +        "( %s%s%s%s%s%s%s%s%s%s%s%s)\n",
> +        regs.pc, regs.msr,
> +        CONDSTR_BIT(regs.msr, SF),
> +        CONDSTR_BIT(regs.msr, HV), /* ! */
> +        CONDSTR_BIT(regs.msr, VEC),
> +        CONDSTR_BIT(regs.msr, VSX),
> +        CONDSTR_BIT(regs.msr, EE),
> +        CONDSTR_BIT(regs.msr, PR),
> +        CONDSTR_BIT(regs.msr, FP),
> +        CONDSTR_BIT(regs.msr, ME),
> +        CONDSTR_BIT(regs.msr, IR),
> +        CONDSTR_BIT(regs.msr, DR),
> +        CONDSTR_BIT(regs.msr, RI),
> +        CONDSTR_BIT(regs.msr, LE));
> +    dprintf(debug_fd, " CTR:   %016llx  LR:    %016llx  CR:   %08llx\n",
> +        regs.ctr, regs.lr, regs.cr);
> +    dprintf(debug_fd, " SRR0:  %016llx  SRR1:  %016llx  XER:  %016llx\n",
> +        regs.srr0, regs.srr1, regs.xer);
> +    dprintf(debug_fd, " SPRG0: %016llx  SPRG1: %016llx\n",
> +        regs.sprg0, regs.sprg1);
> +    dprintf(debug_fd, " SPRG2: %016llx  SPRG3: %016llx\n",
> +        regs.sprg2, regs.sprg3);
> +    dprintf(debug_fd, " SPRG4: %016llx  SPRG5: %016llx\n",
> +        regs.sprg4, regs.sprg5);
> +    dprintf(debug_fd, " SPRG6: %016llx  SPRG7: %016llx\n",
> +        regs.sprg6, regs.sprg7);
> +    dprintf(debug_fd, " GPRs:\n ");
> +    for (r = 0; r < 32; r++) {
> +        dprintf(debug_fd, "%016llx  ", regs.gpr[r]);
> +        if ((r & 3) == 3)
> +            dprintf(debug_fd, "\n ");
> +    }
> +    dprintf(debug_fd, "\n");
> +
> +    /* FIXME: Assumes SLB-based (book3s) guest */
> +    for (r = 0; r < 32; r++) {
> +        dprintf(debug_fd, " SLB%02d  %016llx %016llx\n", r,
> +            sregs.u.s.ppc64.slb[r].slbe,
> +            sregs.u.s.ppc64.slb[r].slbv);
> +    }
> +    dprintf(debug_fd, "----------\n");
> +}
> +
> +void kvm_cpu__show_code(struct kvm_cpu *vcpu)
> +{
> +    if (ioctl(vcpu->vcpu_fd, KVM_GET_REGS, &vcpu->regs) < 0)
> +        die("KVM_GET_REGS failed");
> +
> +    /* FIXME: Dump/disassemble some code...! */
> +
> +    dprintf(debug_fd, "\n Stack:\n");
> +    dprintf(debug_fd,   " ------\n");
> +    /* Only works in real mode: */
> +    kvm__dump_mem(vcpu->kvm, vcpu->regs.gpr[1], 32);
> +}
> +
> +void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
> +{
> +    /* Does nothing yet */
> +}
> diff --git a/tools/kvm/powerpc/kvm.c b/tools/kvm/powerpc/kvm.c
> new file mode 100644
> index 0000000..f838a8f
> --- /dev/null
> +++ b/tools/kvm/powerpc/kvm.c
> @@ -0,0 +1,187 @@
> +/*
> + * PPC64 (SPAPR) platform support
> + *
> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + */
> +
> +#include "kvm/kvm.h"
> +#include "kvm/util.h"
> +
> +#include <linux/kvm.h>
> +
> +#include <sys/types.h>
> +#include <sys/ioctl.h>
> +#include <sys/mman.h>
> +#include <stdbool.h>
> +#include <assert.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <stdio.h>
> +#include <fcntl.h>
> +#include <asm/unistd.h>
> +#include <errno.h>
> +
> +#include <linux/byteorder.h>
> +#include <libfdt.h>
> +
> +#define HUGETLBFS_PATH "/var/lib/hugetlbfs/global/pagesize-16MB/"
> +
> +static char kern_cmdline[2048];
> +
> +struct kvm_ext kvm_req_ext[] = {
> +    { 0, 0 }
> +};
> +
> +bool kvm__arch_cpu_supports_vm(void)
> +{
> +    return true;
> +}
> +
> +void kvm__init_ram(struct kvm *kvm)
> +{
> +    u64    phys_start, phys_size;
> +    void    *host_mem;
> +
> +    phys_start = 0;
> +    phys_size  = kvm->ram_size;
> +    host_mem   = kvm->ram_start;
> +
> +    /*
> +     * We put MMIO at PPC_MMIO_START, high up.  Make sure that this doesn't
> +     * crash into the end of RAM -- on PPC64 at least, this is so high
> +     * (63TB!) that this is unlikely.
> +     */
> +    if (phys_size >= PPC_MMIO_START)
> +        die("Too much memory (%lld, what a nice problem): "
> +            "overlaps MMIO!\n",
> +            phys_size);
> +
> +    kvm__register_mem(kvm, phys_start, phys_size, host_mem);
> +}
> +
> +void kvm__arch_set_cmdline(char *cmdline, bool video)
> +{
> +    /* We don't need anything unusual in here. */
> +}
> +
> +/* Architecture-specific KVM init */
> +void kvm__arch_init(struct kvm *kvm, const char *kvm_dev, const char *hugetlbfs_path, u64 ram_size, const char *name)
> +{
> +    int cap_ppc_rma;
> +
> +    kvm->ram_size        = ram_size;
> +
> +    /*
> +     * Currently, we must map from hugetlbfs; if --hugetlbfs not specified,
> +     * try a default path:
> +     */
> +    if (!hugetlbfs_path) {
> +        hugetlbfs_path = HUGETLBFS_PATH;
> +        pr_info("Using default %s for memory", hugetlbfs_path);
> +    }
> +
> +    kvm->ram_start = mmap_hugetlbfs(hugetlbfs_path, kvm->ram_size);
> +    if (kvm->ram_start == MAP_FAILED)
> +        die("Couldn't map %lld bytes for RAM (%d)\n",
> +            kvm->ram_size, errno);
> +
> +    /* FDT goes at top of memory, RTAS just below */
> +    kvm->fdt_gra = kvm->ram_size - FDT_MAX_SIZE;
> +    /* FIXME: Not all PPC systems have RTAS */
> +    kvm->rtas_gra = kvm->fdt_gra - RTAS_MAX_SIZE;
> +    madvise(kvm->ram_start, kvm->ram_size, MADV_MERGEABLE);
> +
> +    /* FIXME: This is book3s-specific */
> +    cap_ppc_rma = ioctl(kvm->sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_RMA);
> +    if (cap_ppc_rma == 2)
> +        die("Need contiguous RMA allocation on this hardware, "
> +            "which is not yet supported.");
> +}
> +
> +void kvm__irq_line(struct kvm *kvm, int irq, int level)
> +{
> +    fprintf(stderr, "irq_line(%d, %d)\n", irq, level);
> +}
> +
> +void kvm__irq_trigger(struct kvm *kvm, int irq)
> +{
> +    kvm__irq_line(kvm, irq, 1);
> +    kvm__irq_line(kvm, irq, 0);
> +}
> +
> +int load_flat_binary(struct kvm *kvm, int fd_kernel, int fd_initrd, const char *kernel_cmdline)
> +{
> +    void *p;
> +    void *k_start;
> +    void *i_start;
> +    int nr;
> +
> +    if (lseek(fd_kernel, 0, SEEK_SET) < 0)
> +        die_perror("lseek");
> +
> +    p = k_start = guest_flat_to_host(kvm, KERNEL_LOAD_ADDR);
> +
> +    while ((nr = read(fd_kernel, p, 65536)) > 0)
> +        p += nr;
> +
> +    pr_info("Loaded kernel to 0x%x (%ld bytes)", KERNEL_LOAD_ADDR, p-k_start);
> +
> +    if (fd_initrd != -1) {
> +        if (lseek(fd_initrd, 0, SEEK_SET) < 0)
> +            die_perror("lseek");
> +
> +        if (p-k_start > INITRD_LOAD_ADDR)
> +            die("Kernel overlaps initrd!");
> +
> +        /* Round up kernel size to 8byte alignment, and load initrd right after. */
> +        i_start = p = guest_flat_to_host(kvm, INITRD_LOAD_ADDR);
> +
> +        while (((nr = read(fd_initrd, p, 65536)) > 0) &&
> +               p < (kvm->ram_start + kvm->ram_size))
> +            p += nr;
> +
> +        if (p >= (kvm->ram_start + kvm->ram_size))
> +            die("initrd too big to contain in guest RAM.\n");
> +
> +        pr_info("Loaded initrd to 0x%x (%ld bytes)",
> +            INITRD_LOAD_ADDR, p-i_start);
> +        kvm->initrd_gra = INITRD_LOAD_ADDR;
> +        kvm->initrd_size = p-i_start;
> +    } else {
> +        kvm->initrd_size = 0;
> +    }
> +    strncpy(kern_cmdline, kernel_cmdline, 2048);
> +    kern_cmdline[2047] = '\0';
> +
> +    return true;
> +}
> +
> +bool load_bzimage(struct kvm *kvm, int fd_kernel,
> +          int fd_initrd, const char *kernel_cmdline, u16 vidmode)
> +{
> +    /* We don't support bzImages. */
> +    return false;
> +}
> +
> +static void setup_fdt(struct kvm *kvm)
> +{
> +
> +}
> +
> +/**
> + * kvm__arch_setup_firmware
> + */
> +void kvm__arch_setup_firmware(struct kvm *kvm)
> +{
> +    /* Load RTAS */
> +
> +    /* Load SLOF */
> +
> +    /* Init FDT */
> +    setup_fdt(kvm);
> +}
> -- 
> 1.7.0.4
> 
Pekka Enberg - Dec. 13, 2011, 5:43 p.m.
On Tue, Dec 13, 2011 at 9:00 AM, Matt Evans <matt@ozlabs.org> wrote:
> +int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
> +{
> +       fprintf(stderr, "irq__register_device(%d, [%d], [%d], [%d]\n",
> +               dev, *num, *pin, *line);
> +       return 0;
> +}
> +
> +void irq__init(struct kvm *kvm)
> +{
> +       fprintf(stderr, __func__);
> +}
> +
> +int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
> +{
> +       die(__FUNCTION__);
> +       return 0;
> +}
>
> +void kvm__irq_line(struct kvm *kvm, int irq, int level)
> +{
> +       fprintf(stderr, "irq_line(%d, %d)\n", irq, level);
> +}

What's the plan with these functions? Will you need these on PPC? If
yes, why not implement them properly now? If not, we need to drop them
from PPC arch.
Matt Evans - Dec. 13, 2011, 9:41 p.m.
On 14 Dec 2011, at 04:43, Pekka Enberg <penberg@kernel.org> wrote:

> On Tue, Dec 13, 2011 at 9:00 AM, Matt Evans <matt@ozlabs.org> wrote:
>> +int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
>> +{
>> +       fprintf(stderr, "irq__register_device(%d, [%d], [%d], [%d]\n",
>> +               dev, *num, *pin, *line);
>> +       return 0;
>> +}
>> +
>> +void irq__init(struct kvm *kvm)
>> +{
>> +       fprintf(stderr, __func__);
>> +}
>> +
>> +int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
>> +{
>> +       die(__FUNCTION__);
>> +       return 0;
>> +}
>> 
>> +void kvm__irq_line(struct kvm *kvm, int irq, int level)
>> +{
>> +       fprintf(stderr, "irq_line(%d, %d)\n", irq, level);
>> +}
> 
> What's the plan with these functions? Will you need these on PPC? If
> yes, why not implement them properly now? If not, we need to drop them
> from PPC arch.

Yes, they're filled in in the later XICS patch.

Cheers,

Matt
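
Until the real implementations arrive, placeholder hooks like the ones quoted above could report themselves through a small helper. This is only a sketch under assumptions: stub_report() and ARCH_STUB() are hypothetical names that do not appear in the series. The function name is passed as an argument rather than as the format string, which avoids format-security warnings from the compiler.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: note that an arch hook is not implemented yet. */
static int stub_report(FILE *out, const char *func)
{
    assert(out != NULL && func != NULL);
    /* The name is data for "%s", never the format string itself. */
    return fprintf(out, "%s: not implemented on this arch yet\n", func);
}

/* Convenience wrapper that picks up the name of the calling function. */
#define ARCH_STUB(out) stub_report((out), __func__)
```

A stub body would then be a single line such as ARCH_STUB(stderr);.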
Benjamin Herrenschmidt - Dec. 14, 2011, 9:38 a.m.
> This looks nice and clean - but I don't really know PPC! Would you be
> able to point me to some relevant documentation? Are there other PPC
> folks out there that would be interested in taking a look at these
> patches?

I can try to spend time having a look but I'd rather just trust Matt :-)
After all he sits right next to me at work...

More seriously, for the ppc specific bits, I think we don't need to
worry too much, if it works (which it does) it's good enough, we can fix
things later on if we have to.

The main focus is to make sure we don't totally fubar the arch
abstraction.

Cheers,
Ben.


Pekka Enberg - Dec. 14, 2011, 10:30 a.m.
On Wed, Dec 14, 2011 at 11:38 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
>> This looks nice and clean - but I don't really know PPC! Would you be
>> able to point me to some relevant documentation? Are there other PPC
>> folks out there that would be interested in taking a look at these
>> patches?
>
> I can try to spend time having a look but I'd rather just trust Matt :-)
> After all he sits right next to me at work...
>
> More seriously, for the ppc specific bits, I think we don't need to
> worry too much, if it works (which it does) it's good enough, we can fix
> things later on if we have to.

Sure, I actually already applied this patch earlier today. I'd still
be interested in pointers to relevant documentation because I'd love
to have some understanding of the PPC architecture code.

                        Pekka
Benjamin Herrenschmidt - Dec. 14, 2011, 10:41 a.m.
On Wed, 2011-12-14 at 12:30 +0200, Pekka Enberg wrote:
> Sure, I actually already applied this patch earlier today. I'd still
> be interested in pointers to relevant documentation because I'd love
> to have some understanding of the PPC architecture code. 

The architecture documents are on power.org, though they can be a bit
indigestible. Let me know if you can't get to them or have any
questions. You really don't want to dig into our MMU unless you really
have to :-)

Cheers,
Ben.


Pekka Enberg - Dec. 14, 2011, 11:31 a.m.
On Wed, 14 Dec 2011, Alexander Graf wrote:
> The MMU isn't that hard to grasp. I would've said take a look at my presentation from 2010:
> 
>   http://www.linux-kvm.org/page/KVM_Forum_2010
> 
> but the video seems to have been removed since :(

Damn. Who could we ping to get them back up?

 			Pekka
Asias He - Dec. 14, 2011, 1:36 p.m.
On 12/14/2011 06:50 PM, Alexander Graf wrote:
> 
> On 14.12.2011, at 11:41, Benjamin Herrenschmidt wrote:
> 
>> On Wed, 2011-12-14 at 12:30 +0200, Pekka Enberg wrote:
>>> Sure, I actually already applied this patch earlier today. I'd still
>>> be interested in pointers to relevant documentation because I'd love
>>> to have some understanding of the PPC architecture code.
>>
>> The architecture documents are on power.org, though they can be a bit
>> indigestible. Let me know if you can't get to them or have any
>> questions. You really don't want to dig into our MMU unless you really
>> have to :-)
> 
> The MMU isn't that hard to grasp. I would've said take a look at my
> presentation from 2010:
> 
>   http://www.linux-kvm.org/page/KVM_Forum_2010
> 
> but the video seems to have been removed since :(

I liked Alex's presentation about PPC on KVM ;-)
Matt Evans - Dec. 15, 2011, 1:27 a.m.
Heya Alex,

On 13/12/11 19:23, Alexander Graf wrote:
> 
> On 13.12.2011, at 08:00, Matt Evans <matt@ozlabs.org> wrote:
> 
>> This patch adds a new arch directory, powerpc, basic file structure, register
>> setup and where necessary stubs out arch-specific functions (e.g. interrupts,
>> runloop exits) that later patches will provide.  The target is an
>> SPAPR-compliant PPC64 machine (i.e. pSeries); there is no support for PPC32 or
>> 'bare metal' PPC64 guests as yet.  Subsequent patches implement the hcalls and
>> RTAS required to boot SPAPR pSeries kernels.
>>
>> Memory is mapped from hugetlbfs (as that is currently required by upstream PPC64
>> HV-mode KVM).  The mapping of a VRMA region is yet to be implemented; this is
>> only necessary on processors that don't support VRMA, e.g. <= P6.  Work is
>> therefore needed to get this going on pre-P7 CPUs.
>>
>> Processor state is set up as a guest kernel would expect (both primary and
>> secondaries), and SMP is fully supported.
>>
>> Finally, support is added for simply loading flat binary kernels (plus initrd).
>> (bzImages are not used on PPC, and this series does not add zImage support or an
>> ELF loader.)  The intention is to later support loading firmware such as SLOF.
>>
>> Signed-off-by: Matt Evans <matt@ozlabs.org>
>> ---
>> tools/kvm/Makefile                           |   10 +
>> tools/kvm/kvm.c                              |    3 +
>> tools/kvm/powerpc/include/kvm/barrier.h      |    6 +
>> tools/kvm/powerpc/include/kvm/kvm-arch.h     |   72 ++++++++
>> tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h |   66 ++++++++
>> tools/kvm/powerpc/ioport.c                   |   18 ++
>> tools/kvm/powerpc/irq.c                      |   40 +++++
>> tools/kvm/powerpc/kvm-cpu.c                  |  233 ++++++++++++++++++++++++++
>> tools/kvm/powerpc/kvm.c                      |  187 +++++++++++++++++++++
>> 9 files changed, 635 insertions(+), 0 deletions(-)
>> create mode 100644 tools/kvm/powerpc/include/kvm/barrier.h
>> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-arch.h
>> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
>> create mode 100644 tools/kvm/powerpc/ioport.c
>> create mode 100644 tools/kvm/powerpc/irq.c
>> create mode 100644 tools/kvm/powerpc/kvm-cpu.c
>> create mode 100644 tools/kvm/powerpc/kvm.c
>>
>> diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
>> index 2bf70c9..3f1e84a 100644
>> --- a/tools/kvm/Makefile
>> +++ b/tools/kvm/Makefile
>> @@ -124,6 +124,16 @@ ifeq ($(ARCH),x86)
>>    OTHEROBJS    += x86/bios/bios-rom.o
>>    ARCH_INCLUDE := x86/include
>> endif
>> +# POWER/ppc:  Actually only support ppc64 currently.
> 
> Why? I usually run ppc32 user land. Doesn't that expose 'ppc' here?

Not quite sure what you mean here; do you mean 32bit distro?  (Will still get 'ppc64' from a 64-bit kernel.)
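
A minimal sketch of that point, assuming the Makefile's uname_M is derived
from `uname -m` (i.e. from the running kernel, not from the userland ABI).
host_machine_is() is a hypothetical helper for illustration only, not
something in kvmtool:

```c
#include <string.h>
#include <sys/utsname.h>

/*
 * Compare the machine name reported by the running kernel against
 * `name`.  Returns 1 on match, 0 on mismatch, -1 if uname() fails.
 */
int host_machine_is(const char *name)
{
	struct utsname u;

	if (uname(&u) < 0)
		return -1;

	return strcmp(u.machine, name) == 0;
}
```

host_machine_is("ppc64") should then be true on a 64-bit PPC kernel even
with a 32-bit distro on top, unless the process has been put under a
32-bit personality (e.g. via linux32), which this sketch ignores.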

There is clearly some work required here to determine what to build for when we
eventually support PPC32 guests/hosts though I'm not sure how that will look
yet.  This is designed to break if you build on a 32bit kernel, as if it DID
build, it wouldn't run anyway.  (It's building -m64 too...

>> +ifeq ($(uname_M), ppc64)
>> +    DEFINES += -DCONFIG_PPC
>> +    OBJS    += powerpc/ioport.o
>> +    OBJS    += powerpc/irq.o
>> +    OBJS    += powerpc/kvm.o
>> +    OBJS    += powerpc/kvm-cpu.o
>> +    ARCH_INCLUDE := powerpc/include
>> +    CFLAGS += -m64

...here.)

>> +endif
>>
>> ###
>>
>> diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
>> index 35ca2c5..3fb46f6 100644
>> --- a/tools/kvm/kvm.c
>> +++ b/tools/kvm/kvm.c
>> @@ -49,6 +49,9 @@ const char *kvm_exit_reasons[] = {
>>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_DCR),
>>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_NMI),
>>    DEFINE_KVM_EXIT_REASON(KVM_EXIT_INTERNAL_ERROR),
>> +#ifdef CONFIG_PPC64
>> +    DEFINE_KVM_EXIT_REASON(KVM_EXIT_PAPR_HCALL),
>> +#endif
>> };
>>
>> extern struct kvm *kvm;
>> diff --git a/tools/kvm/powerpc/include/kvm/barrier.h b/tools/kvm/powerpc/include/kvm/barrier.h
>> new file mode 100644
>> index 0000000..bc7d179
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/include/kvm/barrier.h
>> @@ -0,0 +1,6 @@
>> +#ifndef _KVM_BARRIER_H_
>> +#define _KVM_BARRIER_H_
>> +
>> +#include <asm/system.h>
>> +
>> +#endif /* _KVM_BARRIER_H_ */
>> diff --git a/tools/kvm/powerpc/include/kvm/kvm-arch.h b/tools/kvm/powerpc/include/kvm/kvm-arch.h
>> new file mode 100644
>> index 0000000..da61774
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/include/kvm/kvm-arch.h
>> @@ -0,0 +1,72 @@
>> +/*
>> + * PPC64 architecture-specific definitions
>> + *
>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation.
>> + */
>> +
>> +#ifndef KVM__KVM_ARCH_H
>> +#define KVM__KVM_ARCH_H
>> +
>> +#include <stdbool.h>
>> +#include <linux/types.h>
>> +#include <time.h>
>> +
>> +#define KVM_NR_CPUS            (255)
> 
> Why?

Good question; that's arbitrary & cut-paste I missed. :-)

I'll make this 1024, to match the max sensible NR_CPUS in the PPC64 kernel
(which in turn limits KVM_MAX_VCPUS).

>> +
>> +/*
>> + * MMIO lives after RAM, but it'd be nice if it didn't constantly move.
>> + * Choose a suitably high address, e.g. 63T...  This limits RAM size.
>> + */
>> +#define PPC_MMIO_START            0x3F0000000000UL
>> +#define PPC_MMIO_SIZE            0x010000000000UL
>> +
>> +#define KERNEL_LOAD_ADDR            0x0000000000000000
>> +#define KERNEL_START_ADDR           0x0000000000000000
>> +#define KERNEL_SECONDARY_START_ADDR     0x0000000000000060
>> +#define INITRD_LOAD_ADDR            0x0000000002800000
>> +
>> +#define FDT_MAX_SIZE                0x10000
>> +#define RTAS_MAX_SIZE               0x10000
>> +
>> +#define TIMEBASE_FREQ               512000000ULL
>> +
>> +#define KVM_MMIO_START            PPC_MMIO_START
>> +
>> +/*
>> + * This is the address that pci_get_io_space_block() starts allocating
>> + * from.  Note that this is a PCI bus address.
>> + */
>> +#define KVM_PCI_MMIO_AREA        0x1000000
>> +
>> +struct kvm {
>> +    int            sys_fd;        /* For system ioctls(), i.e. /dev/kvm */
>> +    int            vm_fd;        /* For VM ioctls() */
>> +    timer_t            timerid;    /* Posix timer for interrupts */
>> +
>> +    int            nrcpus;        /* Number of cpus to run */
>> +
>> +    u32            mem_slots;    /* for KVM_SET_USER_MEMORY_REGION */
>> +
>> +    u64            ram_size;
>> +    void            *ram_start;
>> +
>> +    bool            nmi_disabled;
>> +
>> +    bool            single_step;
>> +
>> +    const char        *vmlinux;
>> +    struct disk_image       **disks;
>> +    int                     nr_disks;
>> +    unsigned long        rtas_gra;
>> +    unsigned long        rtas_size;
>> +    unsigned long        fdt_gra;
>> +    unsigned long        initrd_gra;
>> +    unsigned long        initrd_size;
>> +    const char        *name;
>> +};
>> +
>> +#endif /* KVM__KVM_ARCH_H */
>> diff --git a/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
>> new file mode 100644
>> index 0000000..64e4510
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
>> @@ -0,0 +1,66 @@
>> +/*
>> + * PPC64 cpu-specific definitions
>> + *
>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation.
>> + */
>> +
>> +#ifndef KVM__KVM_CPU_ARCH_H
>> +#define KVM__KVM_CPU_ARCH_H
>> +
>> +/* Architecture-specific kvm_cpu definitions. */
>> +
>> +#include <linux/kvm.h>    /* for struct kvm_regs */
>> +
>> +#include <pthread.h>
>> +
>> +#define MSR_SF        (1UL<<63)
>> +#define MSR_HV        (1UL<<60)
>> +#define MSR_VEC        (1UL<<25)
>> +#define MSR_VSX        (1UL<<23)
>> +#define MSR_POW        (1UL<<18)
>> +#define MSR_EE        (1UL<<15)
>> +#define MSR_PR        (1UL<<14)
>> +#define MSR_FP        (1UL<<13)
>> +#define MSR_ME        (1UL<<12)
>> +#define MSR_FE0        (1UL<<11)
>> +#define MSR_SE        (1UL<<10)
>> +#define MSR_BE        (1UL<<9)
>> +#define MSR_FE1        (1UL<<8)
>> +#define MSR_IR        (1UL<<5)
>> +#define MSR_DR        (1UL<<4)
>> +#define MSR_PMM        (1UL<<2)
>> +#define MSR_RI        (1UL<<1)
>> +#define MSR_LE        (1UL<<0)
>> +
>> +struct kvm;
>> +
>> +struct kvm_cpu {
>> +    pthread_t        thread;        /* VCPU thread */
>> +
>> +    unsigned long        cpu_id;
>> +
>> +    struct kvm        *kvm;        /* parent KVM */
>> +    int            vcpu_fd;    /* For VCPU ioctls() */
>> +    struct kvm_run        *kvm_run;
>> +
>> +    struct kvm_regs        regs;
>> +    struct kvm_sregs    sregs;
>> +    struct kvm_fpu        fpu;
>> +
>> +    u8            is_running;
>> +    u8            paused;
>> +    u8            needs_nmi;
>> +    /*
>> +     * Although PPC KVM doesn't yet support coalesced MMIO, generic code
>> +     * needs this in our kvm_cpu:
>> +     */
>> +    struct kvm_coalesced_mmio_ring  *ring;
>> +};
>> +
>> +void kvm_cpu__irq(struct kvm_cpu *vcpu, int pin, int level);
>> +
>> +#endif /* KVM__KVM_CPU_ARCH_H */
>> diff --git a/tools/kvm/powerpc/ioport.c b/tools/kvm/powerpc/ioport.c
>> new file mode 100644
>> index 0000000..a8e4dc3
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/ioport.c
>> @@ -0,0 +1,18 @@
>> +/*
>> + * PPC64 ioport platform setup.  There isn't any! :-)
>> + *
>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation.
>> + */
>> +
>> +#include "kvm/ioport.h"
>> +
>> +#include <stdlib.h>
>> +
>> +void ioport__setup_arch(void)
>> +{
>> +    /* PPC has no legacy ioports to set up */
>> +}
>> diff --git a/tools/kvm/powerpc/irq.c b/tools/kvm/powerpc/irq.c
>> new file mode 100644
>> index 0000000..46aa64f
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/irq.c
>> @@ -0,0 +1,40 @@
>> +/*
>> + * PPC64 IRQ routines
>> + *
>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation.
>> + */
>> +
>> +#include "kvm/irq.h"
>> +#include "kvm/kvm.h"
>> +#include "kvm/util.h"
>> +
>> +#include <linux/types.h>
>> +#include <linux/rbtree.h>
>> +#include <linux/list.h>
>> +#include <linux/kvm.h>
>> +#include <sys/ioctl.h>
>> +
>> +#include <stddef.h>
>> +#include <stdlib.h>
>> +
>> +int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
>> +{
>> +    fprintf(stderr, "irq__register_device(%d, [%d], [%d], [%d]\n",
>> +        dev, *num, *pin, *line);
>> +    return 0;
>> +}
>> +
>> +void irq__init(struct kvm *kvm)
>> +{
>> +    fprintf(stderr, __func__);
>> +}
>> +
>> +int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
>> +{
>> +    die(__FUNCTION__);
>> +    return 0;
>> +}
>> diff --git a/tools/kvm/powerpc/kvm-cpu.c b/tools/kvm/powerpc/kvm-cpu.c
>> new file mode 100644
>> index 0000000..ea99666
>> --- /dev/null
>> +++ b/tools/kvm/powerpc/kvm-cpu.c
>> @@ -0,0 +1,233 @@
>> +/*
>> + * PPC64 processor support
>> + *
>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation.
>> + */
>> +
>> +#include "kvm/kvm-cpu.h"
>> +
>> +#include "kvm/symbol.h"
>> +#include "kvm/util.h"
>> +#include "kvm/kvm.h"
>> +
>> +#include <sys/ioctl.h>
>> +#include <sys/mman.h>
>> +#include <signal.h>
>> +#include <stdlib.h>
>> +#include <string.h>
>> +#include <errno.h>
>> +#include <stdio.h>
>> +
>> +static int debug_fd;
>> +
>> +void kvm_cpu__set_debug_fd(int fd)
>> +{
>> +    debug_fd = fd;
>> +}
>> +
>> +int kvm_cpu__get_debug_fd(void)
>> +{
>> +    return debug_fd;
>> +}
>> +
>> +static struct kvm_cpu *kvm_cpu__new(struct kvm *kvm)
>> +{
>> +    struct kvm_cpu *vcpu;
>> +
>> +    vcpu        = calloc(1, sizeof *vcpu);
>> +    if (!vcpu)
>> +        return NULL;
>> +
>> +    vcpu->kvm    = kvm;
>> +
>> +    return vcpu;
>> +}
>> +
>> +void kvm_cpu__delete(struct kvm_cpu *vcpu)
>> +{
>> +    free(vcpu);
>> +}
>> +
>> +struct kvm_cpu *kvm_cpu__init(struct kvm *kvm, unsigned long cpu_id)
>> +{
>> +    struct kvm_cpu *vcpu;
>> +    int mmap_size;
>> +    struct kvm_enable_cap papr_cap = { .cap = KVM_CAP_PPC_PAPR };
>> +
>> +    vcpu        = kvm_cpu__new(kvm);
>> +    if (!vcpu)
>> +        return NULL;
>> +
>> +    vcpu->cpu_id    = cpu_id;
>> +
>> +    vcpu->vcpu_fd = ioctl(vcpu->kvm->vm_fd, KVM_CREATE_VCPU, cpu_id);
>> +    if (vcpu->vcpu_fd < 0)
>> +        die_perror("KVM_CREATE_VCPU ioctl");
>> +
>> +    mmap_size = ioctl(vcpu->kvm->sys_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
>> +    if (mmap_size < 0)
>> +        die_perror("KVM_GET_VCPU_MMAP_SIZE ioctl");
>> +
>> +    vcpu->kvm_run = mmap(NULL, mmap_size, PROT_RW, MAP_SHARED, vcpu->vcpu_fd, 0);
>> +    if (vcpu->kvm_run == MAP_FAILED)
>> +        die("unable to mmap vcpu fd");
>> +
>> +    ioctl(vcpu->vcpu_fd, KVM_ENABLE_CAP, &papr_cap);
> 
> Have you tried running this on PR KVM? That should also need HIOR synchronization.

I have, but only briefly and I admit I built it from a random tree I had lying
around, certainly stale, and immediately hit some "can't emulate MMIO" errors on
some stdu instructions.  I will give it another go with your tree and see if I
can get it working with kvmtool, it would be very cool for that to work.
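
A minimal sketch of checking the KVM_ENABLE_CAP call quoted above rather
than ignoring its result, so that a KVM flavour without PAPR support fails
loudly at VCPU setup.  enable_papr() is a hypothetical helper, not kvmtool
code, and it does not attempt the HIOR synchronisation Alex mentions:

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Enable PAPR mode on a VCPU.  sys_fd is the /dev/kvm descriptor and
 * vcpu_fd the descriptor returned by KVM_CREATE_VCPU.
 * Returns 0 on success, -1 if the capability is missing or the
 * enable call fails.
 */
int enable_papr(int sys_fd, int vcpu_fd)
{
	struct kvm_enable_cap papr_cap;

	/* Ask first whether this KVM advertises the capability at all. */
	if (ioctl(sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_PAPR) <= 0)
		return -1;

	memset(&papr_cap, 0, sizeof(papr_cap));
	papr_cap.cap = KVM_CAP_PPC_PAPR;

	if (ioctl(vcpu_fd, KVM_ENABLE_CAP, &papr_cap) < 0)
		return -1;

	return 0;
}
```

A caller in kvm_cpu__init() could then die_perror() on a -1 return, in the
same way the other ioctls there are handled.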

Thanks for reviewing,


Matt


> [snip]
Alexander Graf - Dec. 15, 2011, 1:37 a.m.
On 15.12.2011, at 02:27, Matt Evans wrote:

> Heya Alex,
> 
> On 13/12/11 19:23, Alexander Graf wrote:
>> 
>> On 13.12.2011, at 08:00, Matt Evans <matt@ozlabs.org> wrote:
>> 
>>> [snip]
>>> 
>>> diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
>>> index 2bf70c9..3f1e84a 100644
>>> --- a/tools/kvm/Makefile
>>> +++ b/tools/kvm/Makefile
>>> @@ -124,6 +124,16 @@ ifeq ($(ARCH),x86)
>>>   OTHEROBJS    += x86/bios/bios-rom.o
>>>   ARCH_INCLUDE := x86/include
>>> endif
>>> +# POWER/ppc:  Actually only support ppc64 currently.
>> 
>> Why? I usually run ppc32 user land. Doesn't that expose 'ppc' here?
> 
> Not quite sure what you mean here; do you mean 32bit distro?  (Will still get 'ppc64' from a 64-bit kernel.)

Eh. Yes. Sorry, my bad.

> There is clearly some work required here to determine what to build for when we
> eventually support PPC32 guests/hosts though I'm not sure how that will look
> yet.  This is designed to break if you build on a 32bit kernel, as if it DID
> build, it wouldn't run anyway.  (It's building -m64 too...

Yeah, running -M pseries on PPC32 hosts doesn't make sense really.

> 
>>> +ifeq ($(uname_M), ppc64)
>>> +    DEFINES += -DCONFIG_PPC
>>> +    OBJS    += powerpc/ioport.o
>>> +    OBJS    += powerpc/irq.o
>>> +    OBJS    += powerpc/kvm.o
>>> +    OBJS    += powerpc/kvm-cpu.o
>>> +    ARCH_INCLUDE := powerpc/include
>>> +    CFLAGS += -m64
> 
> ...here.)
> 
>>> +endif
>>> [snip]
>>> +#define KVM_NR_CPUS            (255)
>> 
>> Why?
> 
> Good question; that's arbitrary & cut-paste I missed. :-)
> 
> I'll make this 1024, to match the max sensible NR_CPUS in the PPC64 kernel
> (which in turn limits KVM_MAX_VCPUS).

I thought Sasha converted this to a queryable interface?
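
For reference, a minimal sketch of what such a queryable limit could look
like, assuming the interface meant here is KVM_CHECK_EXTENSION with
KVM_CAP_MAX_VCPUS / KVM_CAP_NR_VCPUS from <linux/kvm.h>.
query_max_vcpus() and its fallback argument are hypothetical names, not
kvmtool functions:

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Ask KVM how many VCPUs a VM may have.  sys_fd is the /dev/kvm
 * descriptor.  Prefers the hard maximum, then the recommended count,
 * and returns `fallback` if neither query gives a usable answer.
 */
int query_max_vcpus(int sys_fd, int fallback)
{
	int ret;

	ret = ioctl(sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);
	if (ret <= 0)
		ret = ioctl(sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
	if (ret <= 0)
		return fallback;

	return ret;
}
```

With something like this, a compile-time KVM_NR_CPUS would only be needed
as the fallback value rather than as the actual limit.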

> 
>>> [snip]
>>> +    ioctl(vcpu->vcpu_fd, KVM_ENABLE_CAP, &papr_cap);
>> 
>> Have you tried running this on PR KVM? That should also need HIOR synchronization.
> 
> I have, but only briefly and I admit I built it from a random tree I had lying
> around, certainly stale, and immediately hit some "can't emulate MMIO" errors on
> some stdu instructions.  I will give it another go with your tree and see if I
> can get it working with kvmtool, it would be very cool for that to work.

Yup, it would also make your work executable to people outside of IBM :)


Alex

Matt Evans - Dec. 15, 2011, 2:20 a.m.
On 15/12/11 12:37, Alexander Graf wrote:
> 
> On 15.12.2011, at 02:27, Matt Evans wrote:
> 
>> Heya Alex,
>>
>> On 13/12/11 19:23, Alexander Graf wrote:
>>>
>>> On 13.12.2011, at 08:00, Matt Evans <matt@ozlabs.org> wrote:
>>>
>>>> This patch adds a new arch directory, powerpc, basic file structure, register
>>>> setup and where necessary stubs out arch-specific functions (e.g. interrupts,
>>>> runloop exits) that later patches will provide.  The target is an
>>>> SPAPR-compliant PPC64 machine (i.e. pSeries); there is no support for PPC32 or
>>>> 'bare metal' PPC64 guests as yet.  Subsequent patches implement the hcalls and
>>>> RTAS required to boot SPAPR pSeries kernels.
>>>>
>>>> Memory is mapped from hugetlbfs (as that is currently required by upstream PPC64
>>>> HV-mode KVM).  The mapping of a VRMA region is yet to be implemented; this is
>>>> only necessary on processors that don't support VRMA, e.g. <= P6.  Work is
>>>> therefore needed to get this going on pre-P7 CPUs.
>>>>
>>>> Processor state is set up as a guest kernel would expect (both primary and
>>>> secondaries), and SMP is fully supported.
>>>>
>>>> Finally, support is added for simply loading flat binary kernels (plus initrd).
>>>> (bzImages are not used on PPC, and this series does not add zImage support or an
>>>> ELF loader.)  The intention is to later support loading firmware such as SLOF.
>>>>
>>>> Signed-off-by: Matt Evans <matt@ozlabs.org>
>>>> ---
>>>> tools/kvm/Makefile                           |   10 +
>>>> tools/kvm/kvm.c                              |    3 +
>>>> tools/kvm/powerpc/include/kvm/barrier.h      |    6 +
>>>> tools/kvm/powerpc/include/kvm/kvm-arch.h     |   72 ++++++++
>>>> tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h |   66 ++++++++
>>>> tools/kvm/powerpc/ioport.c                   |   18 ++
>>>> tools/kvm/powerpc/irq.c                      |   40 +++++
>>>> tools/kvm/powerpc/kvm-cpu.c                  |  233 ++++++++++++++++++++++++++
>>>> tools/kvm/powerpc/kvm.c                      |  187 +++++++++++++++++++++
>>>> 9 files changed, 635 insertions(+), 0 deletions(-)
>>>> create mode 100644 tools/kvm/powerpc/include/kvm/barrier.h
>>>> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-arch.h
>>>> create mode 100644 tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
>>>> create mode 100644 tools/kvm/powerpc/ioport.c
>>>> create mode 100644 tools/kvm/powerpc/irq.c
>>>> create mode 100644 tools/kvm/powerpc/kvm-cpu.c
>>>> create mode 100644 tools/kvm/powerpc/kvm.c
>>>>
>>>> diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
>>>> index 2bf70c9..3f1e84a 100644
>>>> --- a/tools/kvm/Makefile
>>>> +++ b/tools/kvm/Makefile
>>>> @@ -124,6 +124,16 @@ ifeq ($(ARCH),x86)
>>>>   OTHEROBJS    += x86/bios/bios-rom.o
>>>>   ARCH_INCLUDE := x86/include
>>>> endif
>>>> +# POWER/ppc:  Actually only support ppc64 currently.
>>>
>>> Why? I usually run ppc32 user land. Doesn't that expose 'ppc' here?
>>
>> Not quite sure what you mean here; do you mean 32bit distro?  (Will still get 'ppc64' from a 64-bit kernel.)
> 
> Eh. Yes. Sorry, my bad.
> 
>> There is clearly some work required here to determine what to build for when we
>> eventually support PPC32 guests/hosts though I'm not sure how that will look
>> yet.  This is designed to break if you build on a 32bit kernel, as if it DID
>> build, it wouldn't run anyway.  (It's building -m64 too...
> 
> Yeah, running -M pseries on PPC32 hosts doesn't make sense really.
> 
>>
>>>> +ifeq ($(uname_M), ppc64)
>>>> +    DEFINES += -DCONFIG_PPC
>>>> +    OBJS    += powerpc/ioport.o
>>>> +    OBJS    += powerpc/irq.o
>>>> +    OBJS    += powerpc/kvm.o
>>>> +    OBJS    += powerpc/kvm-cpu.o
>>>> +    ARCH_INCLUDE := powerpc/include
>>>> +    CFLAGS += -m64
>>
>> ...here.)
>>
>>>> +endif
>>>>
>>>> ###
>>>>
>>>> diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
>>>> index 35ca2c5..3fb46f6 100644
>>>> --- a/tools/kvm/kvm.c
>>>> +++ b/tools/kvm/kvm.c
>>>> @@ -49,6 +49,9 @@ const char *kvm_exit_reasons[] = {
>>>>   DEFINE_KVM_EXIT_REASON(KVM_EXIT_DCR),
>>>>   DEFINE_KVM_EXIT_REASON(KVM_EXIT_NMI),
>>>>   DEFINE_KVM_EXIT_REASON(KVM_EXIT_INTERNAL_ERROR),
>>>> +#ifdef CONFIG_PPC64
>>>> +    DEFINE_KVM_EXIT_REASON(KVM_EXIT_PAPR_HCALL),
>>>> +#endif
>>>> };
>>>>
>>>> extern struct kvm *kvm;
>>>> diff --git a/tools/kvm/powerpc/include/kvm/barrier.h b/tools/kvm/powerpc/include/kvm/barrier.h
>>>> new file mode 100644
>>>> index 0000000..bc7d179
>>>> --- /dev/null
>>>> +++ b/tools/kvm/powerpc/include/kvm/barrier.h
>>>> @@ -0,0 +1,6 @@
>>>> +#ifndef _KVM_BARRIER_H_
>>>> +#define _KVM_BARRIER_H_
>>>> +
>>>> +#include <asm/system.h>
>>>> +
>>>> +#endif /* _KVM_BARRIER_H_ */
>>>> diff --git a/tools/kvm/powerpc/include/kvm/kvm-arch.h b/tools/kvm/powerpc/include/kvm/kvm-arch.h
>>>> new file mode 100644
>>>> index 0000000..da61774
>>>> --- /dev/null
>>>> +++ b/tools/kvm/powerpc/include/kvm/kvm-arch.h
>>>> @@ -0,0 +1,72 @@
>>>> +/*
>>>> + * PPC64 architecture-specific definitions
>>>> + *
>>>> + * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify it
>>>> + * under the terms of the GNU General Public License version 2 as published
>>>> + * by the Free Software Foundation.
>>>> + */
>>>> +
>>>> +#ifndef KVM__KVM_ARCH_H
>>>> +#define KVM__KVM_ARCH_H
>>>> +
>>>> +#include <stdbool.h>
>>>> +#include <linux/types.h>
>>>> +#include <time.h>
>>>> +
>>>> +#define KVM_NR_CPUS            (255)
>>>
>>> Why?
>>
>> Good question; that's arbitrary & cut-paste I missed. :-)
>>
>> I'll make this 1024, to match the max sensible NR_CPUS in the PPC64 kernel
>> (which in turn limits KVM_MAX_VCPUS).
> 
> I thought Sasha converted this to a queryable interface?

Yeah, nah; that's (urgh) something different.  The actual number of VCPUs
created is limited by the KVM_CAP_NR_VCPUS/KVM_CAP_MAX_VCPUS stuff.  This
#define is actually only used to size a static array, so on second thoughts I
think I'll just malloc() it instead, i.e. remove this #define.
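
Roughly along these lines, as an untested sketch rather than what the next
version will necessarily contain.  The helper names are made up for
illustration, and the caller is assumed to have already clamped the count
against what KVM_CAP_NR_VCPUS/KVM_CAP_MAX_VCPUS report:

```c
#include <assert.h>
#include <stdlib.h>

struct kvm_cpu;	/* opaque here; the real definition is in kvm-cpu-arch.h */

/*
 * Hypothetical helper: allocate a zeroed table of nr_cpus vcpu pointers,
 * replacing a static array sized by a compile-time KVM_NR_CPUS.
 */
static struct kvm_cpu **vcpu_table__alloc(int nr_cpus)
{
	if (nr_cpus <= 0)
		return NULL;

	/* calloc() zeroes the table and checks nr * size for overflow. */
	return calloc((size_t)nr_cpus, sizeof(struct kvm_cpu *));
}

/* Hypothetical counterpart: release the table (not the vcpus it points to). */
static void vcpu_table__free(struct kvm_cpu **table)
{
	free(table);
}
```

The table is then sized by the number of VCPUs actually requested, so no
arch-wide maximum needs to be baked into the header.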

>>> [snip]
>>>
>>> Have you tried running this on PR KVM? That should also need HIOR synchronization.
>>
>> I have, but only briefly and I admit I built it from a random tree I had lying
>> around, certainly stale, and immediately hit some "can't emulate MMIO" errors on
>> some stdu instructions.  I will give it another go with your tree and see if I
>> can get it working with kvmtool, it would be very cool for that to work.
> 
> Yup, it would also make your work executable to people outside of IBM :)

That's totally not lost on me ;-)  (and :( )


Cheers,


Matt

Patch

diff --git a/tools/kvm/Makefile b/tools/kvm/Makefile
index 2bf70c9..3f1e84a 100644
--- a/tools/kvm/Makefile
+++ b/tools/kvm/Makefile
@@ -124,6 +124,16 @@  ifeq ($(ARCH),x86)
 	OTHEROBJS	+= x86/bios/bios-rom.o
 	ARCH_INCLUDE := x86/include
 endif
+# POWER/ppc:  Actually only support ppc64 currently.
+ifeq ($(uname_M), ppc64)
+	DEFINES += -DCONFIG_PPC
+	OBJS	+= powerpc/ioport.o
+	OBJS	+= powerpc/irq.o
+	OBJS	+= powerpc/kvm.o
+	OBJS	+= powerpc/kvm-cpu.o
+	ARCH_INCLUDE := powerpc/include
+	CFLAGS += -m64
+endif
 
 ###
 
diff --git a/tools/kvm/kvm.c b/tools/kvm/kvm.c
index 35ca2c5..3fb46f6 100644
--- a/tools/kvm/kvm.c
+++ b/tools/kvm/kvm.c
@@ -49,6 +49,9 @@  const char *kvm_exit_reasons[] = {
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_DCR),
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_NMI),
 	DEFINE_KVM_EXIT_REASON(KVM_EXIT_INTERNAL_ERROR),
+#ifdef CONFIG_PPC
+	DEFINE_KVM_EXIT_REASON(KVM_EXIT_PAPR_HCALL),
+#endif
 };
 
 extern struct kvm *kvm;
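
Note: the exit-reason table in this hunk is indexed by exit code, so an entry
that is guarded by a preprocessor symbol simply does not exist unless that
symbol is defined on the compiler command line. A minimal sketch of the
pattern follows. The `DEMO_*` names and exit codes are invented for
illustration (the real codes come from <linux/kvm.h>), and the lookup helper
is an assumption, not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up exit codes, for illustration only. */
enum { DEMO_EXIT_DCR = 0, DEMO_EXIT_NMI = 1, DEMO_EXIT_PAPR_HCALL = 2 };

#define DEFINE_EXIT_REASON(reason)	[reason] = #reason
#define ARRAY_SIZE(a)			(sizeof(a) / sizeof((a)[0]))

static const char *demo_exit_reasons[] = {
	DEFINE_EXIT_REASON(DEMO_EXIT_DCR),
	DEFINE_EXIT_REASON(DEMO_EXIT_NMI),
#ifdef DEMO_CONFIG_PPC	/* only present when built with -DDEMO_CONFIG_PPC */
	DEFINE_EXIT_REASON(DEMO_EXIT_PAPR_HCALL),
#endif
};

/* Bounds-checked lookup: a missing or out-of-range entry reads as "UNKNOWN". */
static const char *demo_exit_reason_name(unsigned int reason)
{
	if (reason >= ARRAY_SIZE(demo_exit_reasons) || !demo_exit_reasons[reason])
		return "UNKNOWN";
	return demo_exit_reasons[reason];
}
```

Built without the guard symbol, code 2 falls outside the table, which is why
the lookup checks the index before dereferencing.
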
diff --git a/tools/kvm/powerpc/include/kvm/barrier.h b/tools/kvm/powerpc/include/kvm/barrier.h
new file mode 100644
index 0000000..bc7d179
--- /dev/null
+++ b/tools/kvm/powerpc/include/kvm/barrier.h
@@ -0,0 +1,6 @@ 
+#ifndef _KVM_BARRIER_H_
+#define _KVM_BARRIER_H_
+
+#include <asm/system.h>
+
+#endif /* _KVM_BARRIER_H_ */
diff --git a/tools/kvm/powerpc/include/kvm/kvm-arch.h b/tools/kvm/powerpc/include/kvm/kvm-arch.h
new file mode 100644
index 0000000..da61774
--- /dev/null
+++ b/tools/kvm/powerpc/include/kvm/kvm-arch.h
@@ -0,0 +1,72 @@ 
+/*
+ * PPC64 architecture-specific definitions
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#ifndef KVM__KVM_ARCH_H
+#define KVM__KVM_ARCH_H
+
+#include <stdbool.h>
+#include <linux/types.h>
+#include <time.h>
+
+#define KVM_NR_CPUS			(255)
+
+/*
+ * MMIO lives after RAM, but it'd be nice if it didn't constantly move.
+ * Choose a suitably high address, e.g. 63T...  This limits RAM size.
+ */
+#define PPC_MMIO_START			0x3F0000000000UL
+#define PPC_MMIO_SIZE			0x010000000000UL
+
+#define KERNEL_LOAD_ADDR        	0x0000000000000000
+#define KERNEL_START_ADDR       	0x0000000000000000
+#define KERNEL_SECONDARY_START_ADDR     0x0000000000000060
+#define INITRD_LOAD_ADDR        	0x0000000002800000
+
+#define FDT_MAX_SIZE            	0x10000
+#define RTAS_MAX_SIZE           	0x10000
+
+#define TIMEBASE_FREQ           	512000000ULL
+
+#define KVM_MMIO_START			PPC_MMIO_START
+
+/*
+ * This is the address that pci_get_io_space_block() starts allocating
+ * from.  Note that this is a PCI bus address.
+ */
+#define KVM_PCI_MMIO_AREA		0x1000000
+
+struct kvm {
+	int			sys_fd;		/* For system ioctls(), i.e. /dev/kvm */
+	int			vm_fd;		/* For VM ioctls() */
+	timer_t			timerid;	/* Posix timer for interrupts */
+
+	int			nrcpus;		/* Number of cpus to run */
+
+	u32			mem_slots;	/* for KVM_SET_USER_MEMORY_REGION */
+
+	u64			ram_size;
+	void			*ram_start;
+
+	bool			nmi_disabled;
+
+	bool			single_step;
+
+	const char		*vmlinux;
+	struct disk_image       **disks;
+	int                     nr_disks;
+	unsigned long		rtas_gra;
+	unsigned long		rtas_size;
+	unsigned long		fdt_gra;
+	unsigned long		initrd_gra;
+	unsigned long		initrd_size;
+	const char		*name;
+};
+
+#endif /* KVM__KVM_ARCH_H */
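
Note: the constants in this header place the MMIO window at 63T with a size of
1T, and the series puts the FDT at the top of guest RAM with RTAS just below
it. The arithmetic can be checked with a small sketch. `struct guest_layout`
and `guest_layout__compute()` are hypothetical names, and the minimum-RAM
check is an added assumption that the patch itself does not make:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PPC_MMIO_START	0x3F0000000000ULL
#define PPC_MMIO_SIZE	0x010000000000ULL
#define FDT_MAX_SIZE	0x10000ULL
#define RTAS_MAX_SIZE	0x10000ULL

#define TIB		(1ULL << 40)

struct guest_layout {	/* hypothetical, for illustration only */
	uint64_t fdt_gra;	/* guest real address of the FDT */
	uint64_t rtas_gra;	/* guest real address of RTAS */
};

/*
 * Returns false if RAM would reach the MMIO window, or (assumption) if RAM
 * is too small to hold both the FDT and RTAS regions.
 */
static bool guest_layout__compute(uint64_t ram_size, struct guest_layout *l)
{
	if (ram_size >= PPC_MMIO_START)
		return false;
	if (ram_size < FDT_MAX_SIZE + RTAS_MAX_SIZE)
		return false;

	l->fdt_gra  = ram_size - FDT_MAX_SIZE;	/* FDT at top of RAM */
	l->rtas_gra = l->fdt_gra - RTAS_MAX_SIZE;	/* RTAS just below */
	return true;
}
```

For a 1G guest this gives an FDT at 0x3FFF0000 and RTAS at 0x3FFE0000.
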
diff --git a/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
new file mode 100644
index 0000000..64e4510
--- /dev/null
+++ b/tools/kvm/powerpc/include/kvm/kvm-cpu-arch.h
@@ -0,0 +1,66 @@ 
+/*
+ * PPC64 cpu-specific definitions
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#ifndef KVM__KVM_CPU_ARCH_H
+#define KVM__KVM_CPU_ARCH_H
+
+/* Architecture-specific kvm_cpu definitions. */
+
+#include <linux/kvm.h>	/* for struct kvm_regs */
+
+#include <pthread.h>
+
+#define MSR_SF		(1UL<<63)
+#define MSR_HV		(1UL<<60)
+#define MSR_VEC		(1UL<<25)
+#define MSR_VSX		(1UL<<23)
+#define MSR_POW		(1UL<<18)
+#define MSR_EE		(1UL<<15)
+#define MSR_PR		(1UL<<14)
+#define MSR_FP		(1UL<<13)
+#define MSR_ME		(1UL<<12)
+#define MSR_FE0		(1UL<<11)
+#define MSR_SE		(1UL<<10)
+#define MSR_BE		(1UL<<9)
+#define MSR_FE1		(1UL<<8)
+#define MSR_IR		(1UL<<5)
+#define MSR_DR		(1UL<<4)
+#define MSR_PMM		(1UL<<2)
+#define MSR_RI		(1UL<<1)
+#define MSR_LE		(1UL<<0)
+
+struct kvm;
+
+struct kvm_cpu {
+	pthread_t		thread;		/* VCPU thread */
+
+	unsigned long		cpu_id;
+
+	struct kvm		*kvm;		/* parent KVM */
+	int			vcpu_fd;	/* For VCPU ioctls() */
+	struct kvm_run		*kvm_run;
+
+	struct kvm_regs		regs;
+	struct kvm_sregs	sregs;
+	struct kvm_fpu		fpu;
+
+	u8			is_running;
+	u8			paused;
+	u8			needs_nmi;
+	/*
+	 * Although PPC KVM doesn't yet support coalesced MMIO, generic code
+	 * needs this in our kvm_cpu:
+	 */
+	struct kvm_coalesced_mmio_ring  *ring;
+};
+
+void kvm_cpu__irq(struct kvm_cpu *vcpu, int pin, int level);
+
+#endif /* KVM__KVM_CPU_ARCH_H */
diff --git a/tools/kvm/powerpc/ioport.c b/tools/kvm/powerpc/ioport.c
new file mode 100644
index 0000000..a8e4dc3
--- /dev/null
+++ b/tools/kvm/powerpc/ioport.c
@@ -0,0 +1,18 @@ 
+/*
+ * PPC64 ioport platform setup.  There isn't any! :-)
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "kvm/ioport.h"
+
+#include <stdlib.h>
+
+void ioport__setup_arch(void)
+{
+	/* PPC has no legacy ioports to set up */
+}
diff --git a/tools/kvm/powerpc/irq.c b/tools/kvm/powerpc/irq.c
new file mode 100644
index 0000000..46aa64f
--- /dev/null
+++ b/tools/kvm/powerpc/irq.c
@@ -0,0 +1,40 @@ 
+/*
+ * PPC64 IRQ routines
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "kvm/irq.h"
+#include "kvm/kvm.h"
+#include "kvm/util.h"
+
+#include <linux/types.h>
+#include <linux/rbtree.h>
+#include <linux/list.h>
+#include <linux/kvm.h>
+#include <sys/ioctl.h>
+
+#include <stddef.h>
+#include <stdlib.h>
+
+int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
+{
+	fprintf(stderr, "irq__register_device(%u, [%d], [%d], [%d])\n",
+		dev, *num, *pin, *line);
+	return 0;
+}
+
+void irq__init(struct kvm *kvm)
+{
+	fprintf(stderr, "%s\n", __func__);
+}
+
+int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
+{
+	die("%s", __func__);
+	return 0;
+}
diff --git a/tools/kvm/powerpc/kvm-cpu.c b/tools/kvm/powerpc/kvm-cpu.c
new file mode 100644
index 0000000..ea99666
--- /dev/null
+++ b/tools/kvm/powerpc/kvm-cpu.c
@@ -0,0 +1,233 @@ 
+/*
+ * PPC64 processor support
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "kvm/kvm-cpu.h"
+
+#include "kvm/symbol.h"
+#include "kvm/util.h"
+#include "kvm/kvm.h"
+
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <stdio.h>
+
+static int debug_fd;
+
+void kvm_cpu__set_debug_fd(int fd)
+{
+	debug_fd = fd;
+}
+
+int kvm_cpu__get_debug_fd(void)
+{
+	return debug_fd;
+}
+
+static struct kvm_cpu *kvm_cpu__new(struct kvm *kvm)
+{
+	struct kvm_cpu *vcpu;
+
+	vcpu		= calloc(1, sizeof *vcpu);
+	if (!vcpu)
+		return NULL;
+
+	vcpu->kvm	= kvm;
+
+	return vcpu;
+}
+
+void kvm_cpu__delete(struct kvm_cpu *vcpu)
+{
+	free(vcpu);
+}
+
+struct kvm_cpu *kvm_cpu__init(struct kvm *kvm, unsigned long cpu_id)
+{
+	struct kvm_cpu *vcpu;
+	int mmap_size;
+	struct kvm_enable_cap papr_cap = { .cap = KVM_CAP_PPC_PAPR };
+
+	vcpu		= kvm_cpu__new(kvm);
+	if (!vcpu)
+		return NULL;
+
+	vcpu->cpu_id	= cpu_id;
+
+	vcpu->vcpu_fd = ioctl(vcpu->kvm->vm_fd, KVM_CREATE_VCPU, cpu_id);
+	if (vcpu->vcpu_fd < 0)
+		die_perror("KVM_CREATE_VCPU ioctl");
+
+	mmap_size = ioctl(vcpu->kvm->sys_fd, KVM_GET_VCPU_MMAP_SIZE, 0);
+	if (mmap_size < 0)
+		die_perror("KVM_GET_VCPU_MMAP_SIZE ioctl");
+
+	vcpu->kvm_run = mmap(NULL, mmap_size, PROT_RW, MAP_SHARED, vcpu->vcpu_fd, 0);
+	if (vcpu->kvm_run == MAP_FAILED)
+		die("unable to mmap vcpu fd");
+
+	ioctl(vcpu->vcpu_fd, KVM_ENABLE_CAP, &papr_cap);
+
+	/*
+	 * We start all CPUs, directing non-primary threads into the kernel's
+	 * secondary start point.  When we come to support SLOF, we will start
+	 * only one and SLOF will RTAS call us to ask for others to be
+	 * started.  (FIXME: make more generic & interface with whichever
+	 * firmware a platform may be using.)
+	 */
+	vcpu->is_running = true;
+
+	return vcpu;
+}
+
+static void kvm_cpu__setup_fpu(struct kvm_cpu *vcpu)
+{
+	/* Don't have to do anything, there's no expected FPU state. */
+}
+
+static void kvm_cpu__setup_regs(struct kvm_cpu *vcpu)
+{
+	/*
+	 * FIXME: This assumes PPC64 and Linux guest.  It doesn't use the
+	 * OpenFirmware entry method, but instead the "embedded" entry which
+	 * passes the FDT address directly.
+	 */
+	struct kvm_regs *r = &vcpu->regs;
+
+	if (vcpu->cpu_id == 0) {
+		r->pc = KERNEL_START_ADDR;
+		r->gpr[3] = vcpu->kvm->fdt_gra;
+		r->gpr[5] = 0;
+	} else {
+		r->pc = KERNEL_SECONDARY_START_ADDR;
+		r->gpr[3] = vcpu->cpu_id;
+	}
+	r->msr = MSR_SF | MSR_ME; /* 64bit, non-HV, ME */
+
+	if (ioctl(vcpu->vcpu_fd, KVM_SET_REGS, &vcpu->regs) < 0)
+		die_perror("KVM_SET_REGS failed");
+}
+
+static void kvm_cpu__setup_sregs(struct kvm_cpu *vcpu)
+{
+	/*
+	 * No sregs setup is required on PPC64/SPAPR (but there may be setup
+	 * required for non-paravirtualised platforms, e.g. TLB/SLB setup).
+	 */
+}
+
+/**
+ * kvm_cpu__reset_vcpu - reset virtual CPU to a known state
+ */
+void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu)
+{
+	kvm_cpu__setup_regs(vcpu);
+	kvm_cpu__setup_sregs(vcpu);
+	kvm_cpu__setup_fpu(vcpu);
+}
+
+/* kvm_cpu__irq - set KVM's IRQ flag on this vcpu */
+void kvm_cpu__irq(struct kvm_cpu *vcpu, int pin, int level)
+{
+}
+
+void kvm_cpu__arch_nmi(struct kvm_cpu *cpu)
+{
+}
+
+bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
+{
+	bool ret = true;
+	struct kvm_run *run = vcpu->kvm_run;
+	switch(run->exit_reason) {
+	default:
+		ret = false;
+	}
+	return ret;
+}
+
+#define CONDSTR_BIT(m, b) (((m) & MSR_##b) ? #b" " : "")
+
+void kvm_cpu__show_registers(struct kvm_cpu *vcpu)
+{
+	struct kvm_regs regs;
+	struct kvm_sregs sregs;
+	int r;
+
+	if (ioctl(vcpu->vcpu_fd, KVM_GET_REGS, &regs) < 0)
+		die("KVM_GET_REGS failed");
+	if (ioctl(vcpu->vcpu_fd, KVM_GET_SREGS, &sregs) < 0)
+		die("KVM_GET_SREGS failed");
+
+	dprintf(debug_fd, "\n Registers:\n");
+	dprintf(debug_fd, " NIP:   %016llx  MSR:   %016llx "
+		"( %s%s%s%s%s%s%s%s%s%s%s%s)\n",
+		regs.pc, regs.msr,
+		CONDSTR_BIT(regs.msr, SF),
+		CONDSTR_BIT(regs.msr, HV), /* ! */
+		CONDSTR_BIT(regs.msr, VEC),
+		CONDSTR_BIT(regs.msr, VSX),
+		CONDSTR_BIT(regs.msr, EE),
+		CONDSTR_BIT(regs.msr, PR),
+		CONDSTR_BIT(regs.msr, FP),
+		CONDSTR_BIT(regs.msr, ME),
+		CONDSTR_BIT(regs.msr, IR),
+		CONDSTR_BIT(regs.msr, DR),
+		CONDSTR_BIT(regs.msr, RI),
+		CONDSTR_BIT(regs.msr, LE));
+	dprintf(debug_fd, " CTR:   %016llx  LR:    %016llx  CR:   %08llx\n",
+		regs.ctr, regs.lr, regs.cr);
+	dprintf(debug_fd, " SRR0:  %016llx  SRR1:  %016llx  XER:  %016llx\n",
+		regs.srr0, regs.srr1, regs.xer);
+	dprintf(debug_fd, " SPRG0: %016llx  SPRG1: %016llx\n",
+		regs.sprg0, regs.sprg1);
+	dprintf(debug_fd, " SPRG2: %016llx  SPRG3: %016llx\n",
+		regs.sprg2, regs.sprg3);
+	dprintf(debug_fd, " SPRG4: %016llx  SPRG5: %016llx\n",
+		regs.sprg4, regs.sprg5);
+	dprintf(debug_fd, " SPRG6: %016llx  SPRG7: %016llx\n",
+		regs.sprg6, regs.sprg7);
+	dprintf(debug_fd, " GPRs:\n ");
+	for (r = 0; r < 32; r++) {
+		dprintf(debug_fd, "%016llx  ", regs.gpr[r]);
+		if ((r & 3) == 3)
+			dprintf(debug_fd, "\n ");
+	}
+	dprintf(debug_fd, "\n");
+
+	/* FIXME: Assumes SLB-based (book3s) guest */
+	for (r = 0; r < 32; r++) {
+		dprintf(debug_fd, " SLB%02d  %016llx %016llx\n", r,
+			sregs.u.s.ppc64.slb[r].slbe,
+			sregs.u.s.ppc64.slb[r].slbv);
+	}
+	dprintf(debug_fd, "----------\n");
+}
+
+void kvm_cpu__show_code(struct kvm_cpu *vcpu)
+{
+	if (ioctl(vcpu->vcpu_fd, KVM_GET_REGS, &vcpu->regs) < 0)
+		die("KVM_GET_REGS failed");
+
+	/* FIXME: Dump/disassemble some code...! */
+
+	dprintf(debug_fd, "\n Stack:\n");
+	dprintf(debug_fd,   " ------\n");
+	/* Only works in real mode: */
+	kvm__dump_mem(vcpu->kvm, vcpu->regs.gpr[1], 32);
+}
+
+void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
+{
+	/* Does nothing yet */
+}
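
Note: the initial MSR programmed for each VCPU (64bit, non-HV, ME) is just two
of the MSR_* bits from kvm-cpu-arch.h ORed together, and the register dump
decodes the same bits back into names. A table-driven sketch of that decoding
follows. `msr_to_string()` is a hypothetical helper covering only a subset of
the bits, shown as an alternative to the CONDSTR_BIT macro rather than as the
patch's approach:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MSR_SF	(1ULL << 63)
#define MSR_HV	(1ULL << 60)
#define MSR_EE	(1ULL << 15)
#define MSR_PR	(1ULL << 14)
#define MSR_ME	(1ULL << 12)
#define MSR_IR	(1ULL << 5)
#define MSR_DR	(1ULL << 4)

/* Hypothetical helper: write the names of the set bits, each followed by a space. */
static void msr_to_string(uint64_t msr, char *buf, size_t len)
{
	static const struct { uint64_t bit; const char *name; } bits[] = {
		{ MSR_SF, "SF" }, { MSR_HV, "HV" }, { MSR_EE, "EE" },
		{ MSR_PR, "PR" }, { MSR_ME, "ME" }, { MSR_IR, "IR" },
		{ MSR_DR, "DR" },
	};
	size_t i, used = 0;
	int n;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		if (!(msr & bits[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s ", bits[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of room: output is truncated */
		used += (size_t)n;
	}
}
```

Decoding 0x8000000000001000 this way yields "SF ME ", matching the comment on
the initial MSR value.
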
diff --git a/tools/kvm/powerpc/kvm.c b/tools/kvm/powerpc/kvm.c
new file mode 100644
index 0000000..f838a8f
--- /dev/null
+++ b/tools/kvm/powerpc/kvm.c
@@ -0,0 +1,187 @@ 
+/*
+ * PPC64 (SPAPR) platform support
+ *
+ * Copyright 2011 Matt Evans <matt@ozlabs.org>, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include "kvm/kvm.h"
+#include "kvm/util.h"
+
+#include <linux/kvm.h>
+
+#include <sys/types.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <stdbool.h>
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <asm/unistd.h>
+#include <errno.h>
+
+#include <linux/byteorder.h>
+#include <libfdt.h>
+
+#define HUGETLBFS_PATH "/var/lib/hugetlbfs/global/pagesize-16MB/"
+
+static char kern_cmdline[2048];
+
+struct kvm_ext kvm_req_ext[] = {
+	{ 0, 0 }
+};
+
+bool kvm__arch_cpu_supports_vm(void)
+{
+	return true;
+}
+
+void kvm__init_ram(struct kvm *kvm)
+{
+	u64	phys_start, phys_size;
+	void	*host_mem;
+
+	phys_start = 0;
+	phys_size  = kvm->ram_size;
+	host_mem   = kvm->ram_start;
+
+	/*
+	 * We put MMIO at PPC_MMIO_START, high up.  Make sure that this doesn't
+	 * crash into the end of RAM -- on PPC64 at least, this is so high
+	 * (63TB!) that this is unlikely.
+	 */
+	if (phys_size >= PPC_MMIO_START)
+		die("Too much memory (%lld, what a nice problem): "
+		    "overlaps MMIO!\n",
+		    phys_size);
+
+	kvm__register_mem(kvm, phys_start, phys_size, host_mem);
+}
+
+void kvm__arch_set_cmdline(char *cmdline, bool video)
+{
+	/* We don't need anything unusual in here. */
+}
+
+/* Architecture-specific KVM init */
+void kvm__arch_init(struct kvm *kvm, const char *kvm_dev, const char *hugetlbfs_path, u64 ram_size, const char *name)
+{
+	int cap_ppc_rma;
+
+	kvm->ram_size		= ram_size;
+
+	/*
+	 * Currently, we must map from hugetlbfs; if --hugetlbfs not specified,
+	 * try a default path:
+	 */
+	if (!hugetlbfs_path) {
+		hugetlbfs_path = HUGETLBFS_PATH;
+		pr_info("Using default %s for memory", hugetlbfs_path);
+	}
+
+	kvm->ram_start = mmap_hugetlbfs(hugetlbfs_path, kvm->ram_size);
+	if (kvm->ram_start == MAP_FAILED)
+		die("Couldn't map %lld bytes for RAM (%d)\n",
+		    kvm->ram_size, errno);
+
+	/* FDT goes at top of memory, RTAS just below */
+	kvm->fdt_gra = kvm->ram_size - FDT_MAX_SIZE;
+	/* FIXME: Not all PPC systems have RTAS */
+	kvm->rtas_gra = kvm->fdt_gra - RTAS_MAX_SIZE;
+	madvise(kvm->ram_start, kvm->ram_size, MADV_MERGEABLE);
+
+	/* FIXME: This is book3s-specific */
+	cap_ppc_rma = ioctl(kvm->sys_fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_RMA);
+	if (cap_ppc_rma == 2)
+		die("Need contiguous RMA allocation on this hardware, "
+		    "which is not yet supported.");
+}
+
+void kvm__irq_line(struct kvm *kvm, int irq, int level)
+{
+	fprintf(stderr, "irq_line(%d, %d)\n", irq, level);
+}
+
+void kvm__irq_trigger(struct kvm *kvm, int irq)
+{
+	kvm__irq_line(kvm, irq, 1);
+	kvm__irq_line(kvm, irq, 0);
+}
+
+int load_flat_binary(struct kvm *kvm, int fd_kernel, int fd_initrd, const char *kernel_cmdline)
+{
+	void *p;
+	void *k_start;
+	void *i_start;
+	int nr;
+
+	if (lseek(fd_kernel, 0, SEEK_SET) < 0)
+		die_perror("lseek");
+
+	p = k_start = guest_flat_to_host(kvm, KERNEL_LOAD_ADDR);
+
+	while ((nr = read(fd_kernel, p, 65536)) > 0)
+		p += nr;
+
+	pr_info("Loaded kernel to 0x%x (%ld bytes)", KERNEL_LOAD_ADDR, p-k_start);
+
+	if (fd_initrd != -1) {
+		if (lseek(fd_initrd, 0, SEEK_SET) < 0)
+			die_perror("lseek");
+
+		if (p-k_start > INITRD_LOAD_ADDR)
+			die("Kernel overlaps initrd!");
+
+		/* Load the initrd at its fixed guest address, INITRD_LOAD_ADDR. */
+		i_start = p = guest_flat_to_host(kvm, INITRD_LOAD_ADDR);
+
+		while (((nr = read(fd_initrd, p, 65536)) > 0) &&
+		       p < (kvm->ram_start + kvm->ram_size))
+			p += nr;
+
+		if (p >= (kvm->ram_start + kvm->ram_size))
+			die("initrd too big to contain in guest RAM.\n");
+
+		pr_info("Loaded initrd to 0x%x (%ld bytes)",
+			INITRD_LOAD_ADDR, p-i_start);
+		kvm->initrd_gra = INITRD_LOAD_ADDR;
+		kvm->initrd_size = p-i_start;
+	} else {
+		kvm->initrd_size = 0;
+	}
+	strncpy(kern_cmdline, kernel_cmdline, 2048);
+	kern_cmdline[2047] = '\0';
+
+	return true;
+}
+
+bool load_bzimage(struct kvm *kvm, int fd_kernel,
+		  int fd_initrd, const char *kernel_cmdline, u16 vidmode)
+{
+	/* We don't support bzImages. */
+	return false;
+}
+
+static void setup_fdt(struct kvm *kvm)
+{
+
+}
+
+/**
+ * kvm__arch_setup_firmware
+ */
+void kvm__arch_setup_firmware(struct kvm *kvm)
+{
+	/* Load RTAS */
+
+	/* Load SLOF */
+
+	/* Init FDT */
+	setup_fdt(kvm);
+}
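
Note: load_flat_binary() copies the kernel and initrd into guest RAM in 64K
chunks. A bounded variant of such a read loop, which never asks read() for
more bytes than remain in the destination, might look like the sketch below.
`load_image_bounded()` is a hypothetical helper, not part of this patch, and
treating an image that does not fit as an error is an assumption about the
desired behaviour:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read fd until EOF into [dst, dst + max).  Returns
 * the number of bytes loaded, or -1 on a read error or if the image does
 * not fit in max bytes.
 */
static ssize_t load_image_bounded(int fd, void *dst, size_t max)
{
	char *p = dst;
	size_t left = max;
	ssize_t nr;
	char probe;

	for (;;) {
		if (left == 0) {
			/* Buffer full: the image fits only if we are at EOF. */
			nr = read(fd, &probe, 1);
			if (nr < 0 && errno == EINTR)
				continue;
			return nr == 0 ? (ssize_t)max : -1;
		}

		/* Never request more than the space that is left. */
		nr = read(fd, p, left < 65536 ? left : 65536);
		if (nr < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (nr == 0)
			return (ssize_t)(p - (char *)dst);

		p += nr;
		left -= (size_t)nr;
	}
}
```

A caller would pass the distance from the load address to the end of guest
RAM (or to the next region, such as the initrd) as `max`.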