diff mbox

[RFC,V2,4/4] Add virtv2 machine that uses GIC-500

Message ID 1430921082-16779-5-git-send-email-shlomopongratz@gmail.com
State New
Headers show

Commit Message

Shlomo Pongratz May 6, 2015, 2:04 p.m. UTC
From: Shlomo Pongratz <shlomo.pongratz@huawei.com>

There is a need to support flexible clusters size. The GIC-500 can support
up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
So for example, if one wishes to have 16 cores, the options are:
2 clusters of 8 cores each, 4 clusters with 4 cores each
Currently only the first option is supported.
There is an issue of passing clock affinity to via the dtb. In the dtb

interrupt section there are only 24 bit left to affinity since the
variable is a 32 bit entity and 8 bits are reserved for flags.
See Documentation/devicetree/bindings/arm/arch_timer.txt.
Note that this issue is not seems to be critical as when checking
/proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
---
 hw/arm/Makefile.objs |   2 +-
 hw/arm/virtv2.c      | 774 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 775 insertions(+), 1 deletion(-)
 create mode 100644 hw/arm/virtv2.c

Comments

Pavel Fedin May 7, 2015, 6:40 a.m. UTC | #1
Hello!

 Why do we need 'virt2' ? I am currently working on testing your
implementation, and i try to use different approach. I see this as '-machine
virt,gicv=N' option, where N is 2 or 3. Isn't it better than duplicating the
whole virt.c code just for single different function ?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Shlomo Pongratz May 7, 2015, 7:37 a.m. UTC | #2
On Thursday, May 7, 2015, Pavel Fedin <p.fedin@samsung.com> wrote:

>  Hello!
>
>  Why do we need 'virt2' ? I am currently working on testing your
> implementation, and i try to use different approach. I see this as
> '-machine
> virt,gicv=N' option, where N is 2 or 3. Isn't it better than duplicating
> the
> whole virt.c code just for single different function ?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
>
Hi Pavel,

Thank you for your comment, I might do it after solving the stability
issues.

Best regards,

S.P.
Pavel Fedin May 7, 2015, 8:11 a.m. UTC | #3
Hello!

➢ Thank you for your comment, I might do it after solving the stability issues.

 What kind of stability issues do you have ?
 You know, i was testing 64-bit qemu in TCG mode, and i have also seen weird behavior. The kernel just stops and waits until you hit any key on the console, then it continues. This happens if i try to use more than one CPU. This has to do with guest kernel version, in v4.1rc2 this is gone, while still present in v4.0rc1. Originally i discovered this issue on v3.19. So, perhaps it's some guest kernel flaw, having to do with interrupts.
 The first place where it is stuck is after initializing usbcore. After hitting enter it says "switching clocksource to..." (i don't remember) and goes on. After this it can stop like this again, in some random moment.
 Which kernel version do you use for guest ?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Shlomo Pongratz May 7, 2015, 8:56 a.m. UTC | #4
On Thursday, May 7, 2015, Pavel Fedin <p.fedin@samsung.com> wrote:

>  Hello!
>
> ➢ Thank you for your comment, I might do it after solving the stability
> issues.
>
>  What kind of stability issues do you have ?
>  You know, i was testing 64-bit qemu in TCG mode, and i have also seen
> weird behavior. The kernel just stops and waits until you hit any key on
> the console, then it continues. This happens if i try to use more than one
> CPU. This has to do with guest kernel version, in v4.1rc2 this is gone,
> while still present in v4.0rc1. Originally i discovered this issue on
> v3.19. So, perhaps it's some guest kernel flaw, having to do with
> interrupts.
>  The first place where it is stuck is after initializing usbcore. After
> hitting enter it says "switching clocksource to..." (i don't remember) and
> goes on. After this it can stop like this again, in some random moment.
>  Which kernel version do you use for guest ?
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
> Hi Pavel,

The stability issue is described in patch 0000.
The issue is similar to what you have noticed.
The larger the number of smp cores it is more likely that the boot will get
stuck.
I've commuted this patch series as an RFC because of this issue.

Best regards,

S.P.
Pavel Fedin May 7, 2015, 12:46 p.m. UTC | #5
Hello!

> The issue is similar to what you have noticed.

 In this case it's probably not your fault.

> The larger the number of smp cores it is more likely that the boot will get stuck.
> I've commuted this patch series as an RFC because of this issue.

 I have tried to give more testing to it. Up to 4 CPUs worked okay (if we ignore "press any key" problem which i described). Starting from 5 i get a hard freeze right after " SMP: Total of N processors activated" message. However, i get this deadlock also with GICv2, so it's perhaps not your fault. The second problem is that i did not manage to bring up more than 8 CPUs. The following happens:
--- cut ---
Initializing cgroup subsys memory
Initializing cgroup subsys hugetlb
hw perfevents: no hardware support available
EFI services will not be available.
CPU1: Booted secondary processor
Detected PIPT I-cache on CPU1
CPU1: found redistributor 1 region 0:0x0000000008060000
CPU2: Booted secondary processor
Detected PIPT I-cache on CPU2
CPU2: found redistributor 2 region 0:0x0000000008080000
CPU3: Booted secondary processor
Detected PIPT I-cache on CPU3
CPU3: found redistributor 3 region 0:0x00000000080a0000
CPU4: Booted secondary processor
Detected PIPT I-cache on CPU4
CPU4: found redistributor 4 region 0:0x00000000080c0000
CPU5: Booted secondary processor
Detected PIPT I-cache on CPU5
CPU5: found redistributor 5 region 0:0x00000000080e0000
CPU6: Booted secondary processor
Detected PIPT I-cache on CPU6
CPU6: found redistributor 6 region 0:0x0000000008100000
CPU7: Booted secondary processor
Detected PIPT I-cache on CPU7
CPU7: found redistributor 7 region 0:0x0000000008120000
psci: failed to boot CPU8 (-22)
CPU8: failed to boot: -22
psci: failed to boot CPU9 (-22)
CPU9: failed to boot: -22
psci: failed to boot CPU10 (-22)
CPU10: failed to boot: -22
psci: failed to boot CPU11 (-22)
CPU11: failed to boot: -22
psci: failed to boot CPU12 (-22)
CPU12: failed to boot: -22
psci: failed to boot CPU13 (-22)
CPU13: failed to boot: -22
psci: failed to boot CPU14 (-22)
CPU14: failed to boot: -22
psci: failed to boot CPU15 (-22)
CPU15: failed to boot: -22
Brought up 8 CPUs
SMP: Total of 8 processors activated.
--- cut ---
 But, perhaps it's the fault of my code (i replaced your patch No 4 with my own implementation, merging the code). I will post this alternative version soon if you don't mind (need to rebase it on master).

 My current guest kernel is v3.19.5. I will test on 4.1rc2 soon.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin May 7, 2015, 1:12 p.m. UTC | #6
Hello!

>  My current guest kernel is v3.19.5. I will test on 4.1rc2 soon.

 Retested with 4.1.0rc2 (linux-stable.git master). Works perfectly fine with up to 8 CPUs. Here is boot log from 16 CPUs, listing potential problems:
--- cut ---
Initializing cgroup subsys memory
Initializing cgroup subsys hugetlb
hw perfevents: no hardware support available
EFI services will not be available.
CPU1: Booted secondary processor
Detected PIPT I-cache on CPU1
CPU1: found redistributor 1 region 0:0x0000000008060000
CPU2: Booted secondary processor
Detected PIPT I-cache on CPU2
CPU2: found redistributor 2 region 0:0x0000000008080000
CPU3: Booted secondary processor
Detected PIPT I-cache on CPU3
CPU3: found redistributor 3 region 0:0x00000000080a0000
CPU4: Booted secondary processor
Detected PIPT I-cache on CPU4
CPU4: found redistributor 4 region 0:0x00000000080c0000
CPU5: Booted secondary processor
Detected PIPT I-cache on CPU5
CPU5: found redistributor 5 region 0:0x00000000080e0000
CPU6: Booted secondary processor
Detected PIPT I-cache on CPU6
CPU6: found redistributor 6 region 0:0x0000000008100000
CPU7: Booted secondary processor
Detected PIPT I-cache on CPU7
CPU7: found redistributor 7 region 0:0x0000000008120000
psci: failed to boot CPU8 (-22)
CPU8: failed to boot: -22
psci: failed to boot CPU9 (-22)
CPU9: failed to boot: -22
psci: failed to boot CPU10 (-22)
CPU10: failed to boot: -22
psci: failed to boot CPU11 (-22)
CPU11: failed to boot: -22
psci: failed to boot CPU12 (-22)
CPU12: failed to boot: -22
psci: failed to boot CPU13 (-22)
CPU13: failed to boot: -22
psci: failed to boot CPU14 (-22)
CPU14: failed to boot: -22
psci: failed to boot CPU15 (-22)
CPU15: failed to boot: -22
Brought up 8 CPUs
SMP: Total of 8 processors activated.
CPU: All CPU(s) started at EL1
alternatives: patching kernel code
devtmpfs: initialized
DMI not present or invalid.
clocksource jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
NET: Registered protocol family 16
cpuidle: using governor ladder
cpuidle: using governor menu
vdso: 2 pages (1 code @ ffffffc000819000, 1 data @ ffffffc000818000)
hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
DMA: preallocated 256 KiB pool for atomic allocations
Serial: AMBA PL011 UART driver
genirq: Setting trigger mode 1 for irq 9 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 10 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 11 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 12 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 13 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 14 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 15 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 16 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 17 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 18 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 19 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 20 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 25 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 26 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 27 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 28 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 29 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 30 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 31 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 32 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 33 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 34 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 35 failed (gic_set_type+0x0/0x78)
genirq: Setting trigger mode 1 for irq 36 failed (gic_set_type+0x0/0x78)
9000000.pl011: ttyAMA0 at MMIO 0x9000000 (irq = 38, base_baud = 0) is a PL011 rev1
console [ttyAMA0] enabled
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Switched to clocksource arch_sys_counter
NET: Registered protocol family 2
--- сut ---

 So far, i think the GICv3 functionality is OK. Here is my line.

Tested-by: Pavel Fedin <p.fedin@samsung.com>

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Shlomo Pongratz May 7, 2015, 1:44 p.m. UTC | #7
On Thursday, May 7, 2015, Pavel Fedin <p.fedin@samsung.com> wrote:

> Hello!
>
> > The issue is similar to what you have noticed.
>
>  In this case it's probably not your fault.
>
> > The larger the number of smp cores it is more likely that the boot will
> get stuck.
> > I've commuted this patch series as an RFC because of this issue.
>
>  I have tried to give more testing to it. Up to 4 CPUs worked okay (if we
> ignore "press any key" problem which i described). Starting from 5 i get a
> hard freeze right after " SMP: Total of N processors activated" message.
> However, i get this deadlock also with GICv2, so it's perhaps not your
> fault. The second problem is that i did not manage to bring up more than 8
> CPUs. The following happens:
> --- cut ---
> Initializing cgroup subsys memory
> Initializing cgroup subsys hugetlb
> hw perfevents: no hardware support available
> EFI services will not be available.
> CPU1: Booted secondary processor
> Detected PIPT I-cache on CPU1
> CPU1: found redistributor 1 region 0:0x0000000008060000
> CPU2: Booted secondary processor
> Detected PIPT I-cache on CPU2
> CPU2: found redistributor 2 region 0:0x0000000008080000
> CPU3: Booted secondary processor
> Detected PIPT I-cache on CPU3
> CPU3: found redistributor 3 region 0:0x00000000080a0000
> CPU4: Booted secondary processor
> Detected PIPT I-cache on CPU4
> CPU4: found redistributor 4 region 0:0x00000000080c0000
> CPU5: Booted secondary processor
> Detected PIPT I-cache on CPU5
> CPU5: found redistributor 5 region 0:0x00000000080e0000
> CPU6: Booted secondary processor
> Detected PIPT I-cache on CPU6
> CPU6: found redistributor 6 region 0:0x0000000008100000
> CPU7: Booted secondary processor
> Detected PIPT I-cache on CPU7
> CPU7: found redistributor 7 region 0:0x0000000008120000
> psci: failed to boot CPU8 (-22)
> CPU8: failed to boot: -22
> psci: failed to boot CPU9 (-22)
> CPU9: failed to boot: -22
> psci: failed to boot CPU10 (-22)
> CPU10: failed to boot: -22
> psci: failed to boot CPU11 (-22)
> CPU11: failed to boot: -22
> psci: failed to boot CPU12 (-22)
> CPU12: failed to boot: -22
> psci: failed to boot CPU13 (-22)
> CPU13: failed to boot: -22
> psci: failed to boot CPU14 (-22)
> CPU14: failed to boot: -22
> psci: failed to boot CPU15 (-22)
> CPU15: failed to boot: -22
> Brought up 8 CPUs
> SMP: Total of 8 processors activated.
> --- cut ---
>  But, perhaps it's the fault of my code (i replaced your patch No 4 with
> my own implementation, merging the code). I will post this alternative
> version soon if you don't mind (need to rebase it on master).
>
>  My current guest kernel is v3.19.5. I will test on 4.1rc2 soon.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
> Hi Pavel,

Check your code and see if in fdt_add_cpu_nodes and see if CPUs are added
with correct MPIDR (Affinity).

Best regards,

S.P.
Eric Auger May 19, 2015, 2:39 p.m. UTC | #8
Hi Shlomo,
On 05/06/2015 04:04 PM, shlomopongratz@gmail.com wrote:
> From: Shlomo Pongratz <shlomo.pongratz@huawei.com>
> 
> There is a need to support flexible clusters size. The GIC-500 can support
> up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
> So for example, if one wishes to have 16 cores, the options are:
> 2 clusters of 8 cores each, 4 clusters with 4 cores each
> Currently only the first option is supported.
> There is an issue of passing clock affinity to via the dtb. In the dtb
> 
> interrupt section there are only 24 bit left to affinity since the
> variable is a 32 bit entity and 8 bits are reserved for flags.
> See Documentation/devicetree/bindings/arm/arch_timer.txt.
> Note that this issue is not seems to be critical as when checking
> /proc/irq/3/smp_affinity with 32 cores all 32 bits are one.

As a general comment your series needs a good rebase in virt.c. This
would definitively ease the review.

I have the impression that the number of modifications related to the
GICv3 addition do not require to create a new virt.c file. My personal
feeling is using Ashok's approach based on machine properties may be
worth trying. Obviously this will urge you to fill the memory map holes
but that's should be feasible.

> 
> Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
> ---
>  hw/arm/Makefile.objs |   2 +-
>  hw/arm/virtv2.c      | 774 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 775 insertions(+), 1 deletion(-)
>  create mode 100644 hw/arm/virtv2.c
> 
> diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
> index 2577f68..e94a929 100644
> --- a/hw/arm/Makefile.objs
> +++ b/hw/arm/Makefile.objs
> @@ -2,7 +2,7 @@ obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o
>  obj-$(CONFIG_DIGIC) += digic_boards.o
>  obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
>  obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
> -obj-y += tosa.o versatilepb.o vexpress.o virt.o xilinx_zynq.o z2.o
> +obj-y += tosa.o versatilepb.o vexpress.o virt.o virtv2.o xilinx_zynq.o z2.o
>  obj-y += netduino2.o
>  
>  obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
> diff --git a/hw/arm/virtv2.c b/hw/arm/virtv2.c
> new file mode 100644
> index 0000000..08e3059
> --- /dev/null
> +++ b/hw/arm/virtv2.c
> @@ -0,0 +1,774 @@
> +/*
> + * ARM mach-virt emulation
> + *
> + * Copyright (c) 2013 Linaro Limited
> + * Copyright (c) 2015 Huawei.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + *
> + * Emulate a virtual board which works by passing Linux all the information
> + * it needs about what devices are present via the device tree.
> + * There are some restrictions about what we can do here:
> + *  + we can only present devices whose Linux drivers will work based
> + *    purely on the device tree with no platform data at all
> + *  + we want to present a very stripped-down minimalist platform,
> + *    both because this reduces the security attack surface from the guest
> + *    and also because it reduces our exposure to being broken when
> + *    the kernel updates its device tree bindings and requires further
> + *    information in a device binding that we aren't providing.
> + * This is essentially the same approach kvmtool uses.
> + */
> +
> +#include "hw/sysbus.h"
> +#include "hw/arm/arm.h"
> +#include "hw/arm/primecell.h"
> +#include "hw/devices.h"
> +#include "net/net.h"
> +#include "sysemu/block-backend.h"
> +#include "sysemu/device_tree.h"
> +#include "sysemu/sysemu.h"
> +#include "sysemu/kvm.h"
> +#include "hw/boards.h"
> +#include "hw/loader.h"
> +#include "exec/address-spaces.h"
> +#include "qemu/bitops.h"
> +#include "qemu/error-report.h"
> +
> +#undef DEBUG_VIRT2
> +
> +#ifdef DEBUG_VIRT2
> +#define DPRINTF(fmt, ...) \
> +do { fprintf(stderr, "virt2: " fmt , ## __VA_ARGS__); } while (0)
> +#else
> +#define DPRINTF(fmt, ...) do {} while(0)
> +#endif
> +
> +
> +#define NUM_VIRTIO_TRANSPORTS 32
> +
> +/* Number of external interrupt lines to configure the GIC with */
> +#define NUM_IRQS 128
> +
> +#define GIC_FDT_IRQ_TYPE_SPI 0
> +#define GIC_FDT_IRQ_TYPE_PPI 1
> +
> +#define GIC_FDT_IRQ_FLAGS_EDGE_LO_HI 1
> +#define GIC_FDT_IRQ_FLAGS_EDGE_HI_LO 2
> +#define GIC_FDT_IRQ_FLAGS_LEVEL_HI 4
> +#define GIC_FDT_IRQ_FLAGS_LEVEL_LO 8
> +
> +#define GIC_FDT_IRQ_PPI_CPU_START 8
> +#define GIC_FDT_IRQ_PPI_CPU_WIDTH 24
> +
> +enum {
> +    VIRT_FLASH,
> +    VIRT_MEM,
> +    VIRT_CPUPERIPHS,
> +    VIRT_GIC_DIST,
> +    VIRT_GIC_DIST_SPI,
> +    VIRT_ITS_CONTROL,
> +    VIRT_ITS_TRANSLATION,
> +    VIRT_LPI,
> +    VIRT_UART,
> +    VIRT_MMIO,
> +    VIRT_RTC,
> +    VIRT_FW_CFG,
> +    VIRT_GIC_CPU,
> +};
> +
> +typedef struct MemMapEntry {
> +    hwaddr base;
> +    hwaddr size;
> +} MemMapEntry;
> +
> +typedef struct VirtBoardInfo {
> +    struct arm_boot_info bootinfo;
> +    const char *cpu_model;
> +    const MemMapEntry *memmap;
> +    const int *irqmap;
> +    int smp_cpus;
> +    void *fdt;
> +    int fdt_size;
> +    uint32_t clock_phandle;
> +} VirtBoardInfo;
> +
> +typedef struct {
> +    MachineClass parent;
> +    VirtBoardInfo *daughterboard;
> +} VirtMachineClass;
> +
> +typedef struct {
> +    MachineState parent;
> +    bool secure;
> +} VirtMachineState;
> +
> +#define TYPE_VIRT_MACHINE   "virt2"
> +#define VIRT_MACHINE(obj) \
> +    OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
> +#define VIRT_MACHINE_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(VirtMachineClass, obj, TYPE_VIRT_MACHINE)
> +#define VIRT_MACHINE_CLASS(klass) \
> +    OBJECT_CLASS_CHECK(VirtMachineClass, klass, TYPE_VIRT_MACHINE)
> +
> +/* Addresses and sizes of our components.
> + * 0..128MB is space for a flash device so we can run bootrom code such as UEFI.
> + * 128MB..256MB is used for miscellaneous device I/O.
> + * 256MB..1GB is reserved for possible future PCI support (ie where the
> + * PCI memory window will go if we add a PCI host controller).
> + * 1GB and up is RAM (which may happily spill over into the
> + * high memory region beyond 4GB).
> + * This represents a compromise between how much RAM can be given to
> + * a 32 bit VM and leaving space for expansion and in particular for PCI.
> + * Note that devices should generally be placed at multiples of 0x10000,
> + * to accommodate guests using 64K pages.
> + */
> +static const MemMapEntry a57memmap[] = {
> +    /* Space up to 0x8000000 is reserved for a boot ROM */
> +    [VIRT_FLASH] =           {          0, 0x08000000 },
> +    [VIRT_CPUPERIPHS] =      { 0x08000000, 0x00840000 },
> +    [VIRT_GIC_DIST] =        { 0x08000000, 0x00010000 },
> +    [VIRT_GIC_DIST_SPI]=     { 0x08010000, 0x00010000 },
> +    [VIRT_ITS_CONTROL] =     { 0x08020000, 0x00010000 },
> +    [VIRT_ITS_TRANSLATION] = { 0x08030000, 0x00010000 },
> +    [VIRT_LPI] =             { 0x08040000, 0x00800000 },
> +    [VIRT_UART] =            { 0x09000000, 0x00001000 },
> +    [VIRT_RTC] =             { 0x09010000, 0x00001000 },
> +    [VIRT_FW_CFG] =          { 0x09020000, 0x0000000a },
> +    [VIRT_MMIO] =            { 0x0a000000, 0x00000200 },
> +    /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
> +    /* 0x10000000 .. 0x40000000 reserved for PCI */
> +    [VIRT_MEM] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
> +    /* Need to support both memmory mapped and system registers according to
> +     * section D7.6.22 4 SRE_EL1 of the ARM arch ref manual */
> +    [VIRT_GIC_CPU] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
aren't VIRT_MEM and VIRT_GIC_CPU the same??

You miss all the PCI host controller stuff here.
> +};
> +
> +static const int a57irqmap[] = {
> +    [VIRT_UART] = 1,
> +    [VIRT_RTC] = 2,
> +    [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
> +};
> +
> +static VirtBoardInfo machines[] = {
> +    {
> +        .cpu_model = "cortex-a53",
> +        .memmap = a57memmap,
> +        .irqmap = a57irqmap,
> +    },
> +    {
> +        .cpu_model = "cortex-a57",
> +        .memmap = a57memmap,
> +        .irqmap = a57irqmap,
> +    },
> +    {
> +        .cpu_model = "host",
> +        .memmap = a57memmap,
> +        .irqmap = a57irqmap,
> +    },
> +};
> +
> +static VirtBoardInfo *find_machine_info(const char *cpu)
> +{
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(machines); i++) {
> +        if (strcmp(cpu, machines[i].cpu_model) == 0) {
> +            return &machines[i];
> +        }
> +    }
> +    return NULL;
> +}
> +
> +static void create_fdt(VirtBoardInfo *vbi)
> +{
> +    void *fdt = create_device_tree(&vbi->fdt_size);
> +
> +    if (!fdt) {
> +        error_report("create_device_tree() failed");
> +        exit(1);
> +    }
> +
> +    vbi->fdt = fdt;
> +
> +    /* Header */
> +    qemu_fdt_setprop_string(fdt, "/", "compatible", "linux,dummy-virt");
> +    qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x2);
> +    qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
> +
> +    /*
> +     * /chosen and /memory nodes must exist for load_dtb
> +     * to fill in necessary properties later
> +     */
> +    qemu_fdt_add_subnode(fdt, "/chosen");
> +    qemu_fdt_add_subnode(fdt, "/memory");
> +    qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");
> +
> +    /* Clock node, for the benefit of the UART. The kernel device tree
> +     * binding documentation claims the PL011 node clock properties are
> +     * optional but in practice if you omit them the kernel refuses to
> +     * probe for the device.
> +     */
> +    vbi->clock_phandle = qemu_fdt_alloc_phandle(fdt);
> +    qemu_fdt_add_subnode(fdt, "/apb-pclk");
> +    qemu_fdt_setprop_string(fdt, "/apb-pclk", "compatible", "fixed-clock");
> +    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "#clock-cells", 0x0);
> +    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "clock-frequency", 24000000);
> +    qemu_fdt_setprop_string(fdt, "/apb-pclk", "clock-output-names",
> +                                "clk24mhz");
> +    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "phandle", vbi->clock_phandle);
> +
> +}
> +
> +static void fdt_add_psci_node(const VirtBoardInfo *vbi)
> +{
> +    uint32_t cpu_suspend_fn;
> +    uint32_t cpu_off_fn;
> +    uint32_t cpu_on_fn;
> +    uint32_t migrate_fn;
> +    void *fdt = vbi->fdt;
> +    ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
> +
> +    qemu_fdt_add_subnode(fdt, "/psci");
> +    if (armcpu->psci_version == 2) {
> +       const char comp[] = "arm,psci-0.2\0arm,psci";
> +       qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
> +       cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
> +       if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
> +            cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
> +            cpu_on_fn = QEMU_PSCI_0_2_FN64_CPU_ON;
> +            migrate_fn = QEMU_PSCI_0_2_FN64_MIGRATE;
> +       } else {
> +            cpu_suspend_fn = QEMU_PSCI_0_2_FN_CPU_SUSPEND;
> +            cpu_on_fn = QEMU_PSCI_0_2_FN_CPU_ON;
> +            migrate_fn = QEMU_PSCI_0_2_FN_MIGRATE;
> +       }
> +    } else {
> +        qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
> +        cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
> +        cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
> +        cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
> +        migrate_fn = QEMU_PSCI_0_1_FN_MIGRATE;
> +    }
> +
> +    /* We adopt the PSCI spec's nomenclature, and use 'conduit' to refer
> +     * to the instruction that should be used to invoke PSCI functions.
> +     * However, the device tree binding uses 'method' instead, so that is
> +     * what we should use here.
> +     */
> +    qemu_fdt_setprop_string(fdt, "/psci", "method", "hvc");
> +
> +    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_suspend", cpu_suspend_fn);
> +    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_off", cpu_off_fn);
> +    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_on", cpu_on_fn);
> +    qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
> +}
> +
> +//PPI is moved to different location
> +
> +static void fdt_add_timer_nodes(const VirtBoardInfo *vbi)
> +{
> +    /* Note that on A57 h/w these interrupts are level-triggered,
> +     * but for the GIC implementation provided by both QEMU and KVM
> +     * they are edge-triggered.
> +     */
> +    // ARMCPU *armcpu;
> +    uint32_t max;
> +    uint32_t irqflags = GIC_FDT_IRQ_FLAGS_EDGE_LO_HI;
> +    /* Argument is 32 bit but 8 bits are reserved for flags */
> +    max = (vbi->smp_cpus >= 24) ? 24 : vbi->smp_cpus;
> +    irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
> +        GIC_FDT_IRQ_PPI_CPU_WIDTH, (1 << max) - 1);
> +
> +    qemu_fdt_add_subnode(vbi->fdt, "/timer");
> +    qemu_fdt_setprop_string(vbi->fdt, "/timer",
> +                                "compatible", "arm,armv8-timer\0arm,armv7-timer");
> +    qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
> +                               GIC_FDT_IRQ_TYPE_PPI, 13, irqflags,
> +                               GIC_FDT_IRQ_TYPE_PPI, 14, irqflags,
> +                               GIC_FDT_IRQ_TYPE_PPI, 11, irqflags,
> +                               GIC_FDT_IRQ_TYPE_PPI, 10, irqflags);
> +}
> +
> +static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
> +{
> +    int cpu;
> +
> +    qemu_fdt_add_subnode(vbi->fdt, "/cpus");
> +    /* From Documentation/devicetree/bindings/arm/cpus.txt
> +     *  On ARM v8 64-bit systems value should be set to 2,
> +     *  that corresponds to the MPIDR_EL1 register size.
> +     *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
> +     *  in the system, #address-cells can be set to 1, since
> +     *  MPIDR_EL1[63:32] bits are not used for CPUs
> +     *  identification.
> +     *
> +     *  Now GIC500 doesn't support affinities 2 & 3 so currently
> +     *  #address-cells can stay 1 until future GIC
> +     */
> +    qemu_fdt_setprop_cell(vbi->fdt, "/cpus", "#address-cells", 0x1);
> +    qemu_fdt_setprop_cell(vbi->fdt, "/cpus", "#size-cells", 0x0);
> +
> +    for (cpu = vbi->smp_cpus - 1; cpu >= 0; cpu--) {
> +        int Aff1, Aff0;
> +        char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
> +        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
> +
> +        qemu_fdt_add_subnode(vbi->fdt, nodename);
> +        qemu_fdt_setprop_string(vbi->fdt, nodename,"device_type","cpu");
> +        qemu_fdt_setprop_string(vbi->fdt, nodename, "compatible",
> +                                    armcpu->dtb_compatible);
> +
> +        if (vbi->smp_cpus > 1) {
> +            qemu_fdt_setprop_string(vbi->fdt, nodename,
> +                                        "enable-method", "psci");
> +        }
> +
> +        /* If cpus node's #address-cells property is set to 1
> +         * The reg cell bits [23:0] must be set to bits [23:0] of MPIDR_EL1.
> +         * Currently we support simple affinity scheme e.g. we can have
> +         * 1 cluster with 8 cores but we can't have 8 clusters wit 1 core each.
> +         */
> +        Aff1 = cpu / 8;
> +        Aff0 = cpu % 8;
To me you should structure your virt series into several patch files,
GIC addition, CPU related addition (mpidr, max cpus, ...).

Best Regards

Eric
> +        qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", (Aff1 << 8) | Aff0);
> +        g_free(nodename);
> +    }
> +}
> +
> +static void fdt_add_gic_node(const VirtBoardInfo *vbi)
> +{
> +    uint32_t gic_phandle;
> +
> +    gic_phandle = qemu_fdt_alloc_phandle(vbi->fdt);
> +    qemu_fdt_setprop_cell(vbi->fdt, "/", "interrupt-parent", gic_phandle);
> +
> +    qemu_fdt_add_subnode(vbi->fdt, "/intc");
> +    /* 'cortex-a57-gic' means 'GIC v3' */
> +    qemu_fdt_setprop_string(vbi->fdt, "/intc", "compatible",
> +                            "arm,gic-v3");
> +    qemu_fdt_setprop_cell(vbi->fdt, "/intc", "#interrupt-cells", 3);
> +    qemu_fdt_setprop(vbi->fdt, "/intc", "interrupt-controller", NULL, 0);
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, "/intc", "reg",
> +                             2, vbi->memmap[VIRT_GIC_DIST].base,
> +                             2, vbi->memmap[VIRT_GIC_DIST].size,
> +#if 0
> +                             /* Currently no need for SPI & ITS */
> +                             2, vbi->memmap[VIRT_GIC_DIST_SPI].base,
> +                             2, vbi->memmap[VIRT_GIC_DIST_SPI].size,
> +                             2, vbi->memmap[VIRT_ITS_CONTROL].base,
> +                             2, vbi->memmap[VIRT_ITS_CONTROL].size,
> +                             2, vbi->memmap[VIRT_ITS_TRANSLATION].base,
> +                             2, vbi->memmap[VIRT_ITS_TRANSLATION].size,
> +#endif
> +                             2, vbi->memmap[VIRT_LPI].base,
> +                             2, vbi->memmap[VIRT_LPI].size);
> +    qemu_fdt_setprop_cell(vbi->fdt, "/intc", "phandle", gic_phandle);
> +}
> +
> +static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
> +{
> +    /* We create a standalone GIC v3 */
> +    DeviceState *gicdev;
> +    SysBusDevice *gicbusdev;
> +    const char *gictype = "arm_gicv3";
> +    int i;
> +
> +    if (kvm_irqchip_in_kernel()) {
> +        gictype = "kvm-arm-gic";
> +    }
> +
> +    gicdev = qdev_create(NULL, gictype);
> +
> +    for (i = 0; i < vbi->smp_cpus; i++) {
> +        CPUState *cpu = qemu_get_cpu(i);
> +        CPUARMState *env = cpu->env_ptr;
> +        env->nvic = gicdev;
> +    }
> +
> +    qdev_prop_set_uint32(gicdev, "revision", 3);
> +    qdev_prop_set_uint32(gicdev, "num-cpu", smp_cpus);
> +    /* Note that the num-irq property counts both internal and external
> +     * interrupts; there are always 32 of the former (mandated by GIC spec).
> +     */
> +    qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
> +    qdev_init_nofail(gicdev);
> +    gicbusdev = SYS_BUS_DEVICE(gicdev);
> +
> +    sysbus_mmio_map(gicbusdev, 0, vbi->memmap[VIRT_GIC_DIST].base);
> +    sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_DIST_SPI].base);
> +    sysbus_mmio_map(gicbusdev, 2, vbi->memmap[VIRT_ITS_CONTROL].base);
> +    sysbus_mmio_map(gicbusdev, 3, vbi->memmap[VIRT_ITS_TRANSLATION].base);
> +    sysbus_mmio_map(gicbusdev, 4, vbi->memmap[VIRT_LPI].base);
> +    /* Wire the outputs from each CPU's generic timer to the
> +     * appropriate GIC PPI inputs, and the GIC's IRQ output to
> +     * the CPU's IRQ input.
> +     */
> +    for (i = 0; i < smp_cpus ; i++) {
> +        DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
> +        int ppibase = NUM_IRQS + i * 32;
> +        /* physical timer; we wire it up to the non-secure timer's ID,
> +         * since a real A57 always has TrustZone but QEMU doesn't.
> +         */
> +        qdev_connect_gpio_out(cpudev, 0,
> +                              qdev_get_gpio_in(gicdev, ppibase + 30));
> +        /* virtual timer */
> +        qdev_connect_gpio_out(cpudev, 1,
> +                              qdev_get_gpio_in(gicdev, ppibase + 27));
> +
> +        sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
> +    }
> +
> +    for (i = 0; i < NUM_IRQS; i++) {
> +        pic[i] = qdev_get_gpio_in(gicdev, i);
> +    }
> +
> +    fdt_add_gic_node(vbi);
> +}
> +
> +static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
> +{
> +    char *nodename;
> +    hwaddr base = vbi->memmap[VIRT_UART].base;
> +    hwaddr size = vbi->memmap[VIRT_UART].size;
> +    int irq = vbi->irqmap[VIRT_UART];
> +    const char compat[] = "arm,pl011\0arm,primecell";
> +    const char clocknames[] = "uartclk\0apb_pclk";
> +
> +    sysbus_create_simple("pl011", base, pic[irq]);
> +
> +    nodename = g_strdup_printf("/pl011@%" PRIx64, base);
> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
> +    /* Note that we can't use setprop_string because of the embedded NUL */
> +    qemu_fdt_setprop(vbi->fdt, nodename, "compatible",
> +                         compat, sizeof(compat));
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                     2, base, 2, size);
> +    qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
> +                               GIC_FDT_IRQ_TYPE_SPI, irq,
> +                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
> +    qemu_fdt_setprop_cells(vbi->fdt, nodename, "clocks",
> +                               vbi->clock_phandle, vbi->clock_phandle);
> +    qemu_fdt_setprop(vbi->fdt, nodename, "clock-names",
> +                         clocknames, sizeof(clocknames));
> +
> +    qemu_fdt_setprop_string(vbi->fdt, "/chosen", "stdout-path", nodename);
> +    g_free(nodename);
> +}
> +
> +static void create_rtc(const VirtBoardInfo *vbi, qemu_irq *pic)
> +{
> +    char *nodename;
> +    hwaddr base = vbi->memmap[VIRT_RTC].base;
> +    hwaddr size = vbi->memmap[VIRT_RTC].size;
> +    int irq = vbi->irqmap[VIRT_RTC];
> +    const char compat[] = "arm,pl031\0arm,primecell";
> +
> +    sysbus_create_simple("pl031", base, pic[irq]);
> +
> +    nodename = g_strdup_printf("/pl031@%" PRIx64, base);
> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
> +    qemu_fdt_setprop(vbi->fdt, nodename, "compatible", compat, sizeof(compat));
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                 2, base, 2, size);
> +    qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
> +                           GIC_FDT_IRQ_TYPE_SPI, irq,
> +                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
> +    qemu_fdt_setprop_cell(vbi->fdt, nodename, "clocks", vbi->clock_phandle);
> +    qemu_fdt_setprop_string(vbi->fdt, nodename, "clock-names", "apb_pclk");
> +    g_free(nodename);
> +}
> +
> +static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
> +{
> +    int i;
> +    hwaddr size = vbi->memmap[VIRT_MMIO].size;
> +
> +    /* Note that we have to create the transports in forwards order
> +     * so that command line devices are inserted lowest address first,
> +     * and then add dtb nodes in reverse order so that they appear in
> +     * the finished device tree lowest address first.
> +     */
> +    for (i = 0; i < NUM_VIRTIO_TRANSPORTS; i++) {
> +        int irq = vbi->irqmap[VIRT_MMIO] + i;
> +        hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
> +
> +        sysbus_create_simple("virtio-mmio", base, pic[irq]);
> +    }
> +
> +    for (i = NUM_VIRTIO_TRANSPORTS - 1; i >= 0; i--) {
> +        char *nodename;
> +        int irq = vbi->irqmap[VIRT_MMIO] + i;
> +        hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
> +
> +        nodename = g_strdup_printf("/virtio_mmio@%" PRIx64, base);
> +        qemu_fdt_add_subnode(vbi->fdt, nodename);
> +        qemu_fdt_setprop_string(vbi->fdt, nodename,
> +                                "compatible", "virtio,mmio");
> +        qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                     2, base, 2, size);
> +        qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
> +                               GIC_FDT_IRQ_TYPE_SPI, irq,
> +                               GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
> +        g_free(nodename);
> +    }
> +}
> +
> +static void create_one_flash(const char *name, hwaddr flashbase,
> +                             hwaddr flashsize)
> +{
> +    /* Create and map a single flash device. We use the same
> +     * parameters as the flash devices on the Versatile Express board.
> +     */
> +    DriveInfo *dinfo = drive_get_next(IF_PFLASH);
> +    DeviceState *dev = qdev_create(NULL, "cfi.pflash01");
> +    const uint64_t sectorlength = 256 * 1024;
> +
> +    if (dinfo && qdev_prop_set_drive(dev, "drive",
> +                                     blk_by_legacy_dinfo(dinfo))) {
> +        abort();
> +    }
> +
> +    qdev_prop_set_uint32(dev, "num-blocks", flashsize / sectorlength);
> +    qdev_prop_set_uint64(dev, "sector-length", sectorlength);
> +    qdev_prop_set_uint8(dev, "width", 4);
> +    qdev_prop_set_uint8(dev, "device-width", 2);
> +    qdev_prop_set_uint8(dev, "big-endian", 0);
> +    qdev_prop_set_uint16(dev, "id0", 0x89);
> +    qdev_prop_set_uint16(dev, "id1", 0x18);
> +    qdev_prop_set_uint16(dev, "id2", 0x00);
> +    qdev_prop_set_uint16(dev, "id3", 0x00);
> +    qdev_prop_set_string(dev, "name", name);
> +    qdev_init_nofail(dev);
> +
> +    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, flashbase);
> +}
> +
> +static void create_flash(const VirtBoardInfo *vbi)
> +{
> +    /* Create two flash devices to fill the VIRT_FLASH space in the memmap.
> +     * Any file passed via -bios goes in the first of these.
> +     */
> +    hwaddr flashsize = vbi->memmap[VIRT_FLASH].size / 2;
> +    hwaddr flashbase = vbi->memmap[VIRT_FLASH].base;
> +    char *nodename;
> +
> +    if (bios_name) {
> +        const char *fn;
> +
> +        if (drive_get(IF_PFLASH, 0, 0)) {
> +            error_report("The contents of the first flash device may be "
> +                         "specified with -bios or with -drive if=pflash... "
> +                         "but you cannot use both options at once");
> +            exit(1);
> +        }
> +        fn = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> +        if (!fn || load_image_targphys(fn, flashbase, flashsize) < 0) {
> +            error_report("Could not load ROM image '%s'", bios_name);
> +            exit(1);
> +        }
> +    }
> +
> +    create_one_flash("virt2.flash0", flashbase, flashsize);
> +    create_one_flash("virt2.flash1", flashbase + flashsize, flashsize);
> +
> +    nodename = g_strdup_printf("/flash@%" PRIx64, flashbase);
> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
> +    qemu_fdt_setprop_string(vbi->fdt, nodename, "compatible", "cfi-flash");
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                 2, flashbase, 2, flashsize,
> +                                 2, flashbase + flashsize, 2, flashsize);
> +    qemu_fdt_setprop_cell(vbi->fdt, nodename, "bank-width", 4);
> +    g_free(nodename);
> +}
> +
> +static void create_fw_cfg(const VirtBoardInfo *vbi)
> +{
> +    hwaddr base = vbi->memmap[VIRT_FW_CFG].base;
> +    hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
> +    char *nodename;
> +
> +    fw_cfg_init_mem_wide(base + 8, base, 8);
> +
> +    nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
> +    qemu_fdt_add_subnode(vbi->fdt, nodename);
> +    qemu_fdt_setprop_string(vbi->fdt, nodename,
> +                            "compatible", "qemu,fw-cfg-mmio");
> +    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
> +                                 2, base, 2, size);
> +    g_free(nodename);
> +}
> +
> +static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
> +{
> +    const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
> +
> +    *fdt_size = board->fdt_size;
> +    return board->fdt;
> +}
> +
> +static void machvirt2_init(MachineState *machine)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(machine);
> +    qemu_irq pic[NUM_IRQS];
> +    MemoryRegion *sysmem = get_system_memory();
> +    int n;
> +    MemoryRegion *ram = g_new(MemoryRegion, 1);
> +    const char *cpu_model = machine->cpu_model;
> +    VirtBoardInfo *vbi;
> +
> +    if (!cpu_model) {
> +        cpu_model = "cortex-a57";
> +    }
> +
> +    vbi = find_machine_info(cpu_model);
> +    if (!vbi) {
> +        error_report("mach-virt: CPU %s not supported", cpu_model);
> +        exit(1);
> +    }
> +
> +    /* With gic3 full implementation (with bitops) rase the lmit to 128 */
> +    if (smp_cpus > 64) {
> +        error_report("mach-virt: cannot model more than 64 cores");
> +        exit(1);
> +    }
> +
> +    vbi->smp_cpus = smp_cpus;
> +
> +    if (machine->ram_size > vbi->memmap[VIRT_MEM].size) {
> +        error_report("mach-virt: cannot model more than 30GB RAM");
> +        exit(1);
> +    }
> +
> +    create_fdt(vbi);
> +
> +    for (n = 0; n < smp_cpus; n++) {
> +        ObjectClass *oc = cpu_class_by_name(TYPE_AARCH64_CPU, cpu_model);
> +        Object *cpuobj;
> +
> +        if (!oc) {
> +            fprintf(stderr, "Unable to find CPU definition\n");
> +            exit(1);
> +        }
> +        cpuobj = object_new(object_class_get_name(oc));
> +
> +        if (!vms->secure) {
> +            object_property_set_bool(cpuobj, false, "has_el3", NULL);
> +        }
> +
> +        object_property_set_int(cpuobj, QEMU_PSCI_CONDUIT_HVC, "psci-conduit",
> +                                NULL);
> +
> +        /* Secondary CPUs start in PSCI powered-down state */
> +        if (n > 0) {
> +            object_property_set_bool(cpuobj, true, "start-powered-off", NULL);
> +        }
> +
> +        if (object_property_find(cpuobj, "reset-cbar", NULL)) {
> +            object_property_set_int(cpuobj, vbi->memmap[VIRT_CPUPERIPHS].base,
> +                                    "reset-cbar", &error_abort);
> +        }
> +
> +        object_property_set_bool(cpuobj, true, "realized", NULL);
> +    }
> +    fdt_add_timer_nodes(vbi);
> +    fdt_add_cpu_nodes(vbi);
> +    fdt_add_psci_node(vbi);
> +
> +    memory_region_init_ram(ram, NULL, "mach-virt.ram", machine->ram_size,
> +                           &error_abort);
> +    vmstate_register_ram_global(ram);
> +    memory_region_add_subregion(sysmem, vbi->memmap[VIRT_MEM].base, ram);
> +
> +    create_flash(vbi);
> +    create_gic(vbi, pic);
> +    create_uart(vbi, pic);
> +    create_rtc(vbi, pic);
> +
> +    /* Create mmio transports, so the user can create virtio backends
> +     * (which will be automatically plugged in to the transports). If
> +     * no backend is created the transport will just sit harmlessly idle.
> +     */
> +    create_virtio_devices(vbi, pic);
> +
> +    create_fw_cfg(vbi);
> +
> +    vbi->bootinfo.ram_size = machine->ram_size;
> +    vbi->bootinfo.kernel_filename = machine->kernel_filename;
> +    vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
> +    vbi->bootinfo.initrd_filename = machine->initrd_filename;
> +    vbi->bootinfo.nb_cpus = smp_cpus;
> +    vbi->bootinfo.board_id = -1;
> +    vbi->bootinfo.loader_start = vbi->memmap[VIRT_MEM].base;
> +    vbi->bootinfo.get_dtb = machvirt_dtb;
> +    vbi->bootinfo.firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
> +    DPRINTF(" ---- kernel load ----- \n");
> +    arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
> +    DPRINTF(" ---- kernel load finish ----- \n");
> +}
> +
> +
> +static bool virt_get_secure(Object *obj, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    return vms->secure;
> +}
> +
> +static void virt_set_secure(Object *obj, bool value, Error **errp)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    vms->secure = value;
> +}
> +
> +static void virt_instance_init(Object *obj)
> +{
> +    VirtMachineState *vms = VIRT_MACHINE(obj);
> +
> +    /* EL3 is enabled by default on virt */
> +    vms->secure = true;
> +    object_property_add_bool(obj, "secure", virt_get_secure,
> +                             virt_set_secure, NULL);
> +    object_property_set_description(obj, "secure",
> +                                    "Set on/off to enable/disable the ARM "
> +                                    "Security Extensions (TrustZone)",
> +                                    NULL);
> +}
> +
> +static void virt_class_init(ObjectClass *oc, void *data)
> +{
> +    MachineClass *mc = MACHINE_CLASS(oc);
> +
> +    mc->name = TYPE_VIRT_MACHINE;
> +    mc->desc = "ARM Virtual Machine",
> +    mc->init = machvirt2_init;
> +    /* With gic3 full implementation (with bitops) rase the lmit to 128 */
> +    mc->max_cpus = 64;
> +}
> +
> +
> +static const TypeInfo machvirt2_info = {
> +    .name = TYPE_VIRT_MACHINE,
> +    .parent = TYPE_MACHINE,
> +    .instance_size = sizeof(VirtMachineState),
> +    .instance_init = virt_instance_init,
> +    .class_size = sizeof(VirtMachineClass),
> +    .class_init = virt_class_init,
> +};
> +
> +static void machvirt2_machine_init(void)
> +{
> +    type_register_static(&machvirt2_info);
> +}
> +
> +machine_init(machvirt2_machine_init);
>
Shlomo Pongratz May 19, 2015, 3:07 p.m. UTC | #9
> -----Original Message-----
> From: Eric Auger [mailto:eric.auger@linaro.org]
> Sent: Tuesday, 19 May, 2015 5:39 PM
> To: shlomopongratz@gmail.com; qemu-devel@nongnu.org
> Cc: peter.maydell@linaro.org; Shlomo Pongratz
> Subject: Re: [Qemu-devel] [PATCH RFC V2 4/4] Add virtv2 machine that uses
> GIC-500
> 
> Hi Shlomo,
> On 05/06/2015 04:04 PM, shlomopongratz@gmail.com wrote:
> > From: Shlomo Pongratz <shlomo.pongratz@huawei.com>
> >
> > There is a need to support flexible clusters size. The GIC-500 can
> > support up to 128 cores, up to 32 clusters and up to 8 cores is a cluster.
> > So for example, if one wishes to have 16 cores, the options are:
> > 2 clusters of 8 cores each, 4 clusters with 4 cores each Currently
> > only the first option is supported.
> > There is an issue of passing clock affinity to via the dtb. In the dtb
> >
> > interrupt section there are only 24 bit left to affinity since the
> > variable is a 32 bit entity and 8 bits are reserved for flags.
> > See Documentation/devicetree/bindings/arm/arch_timer.txt.
> > Note that this issue is not seems to be critical as when checking
> > /proc/irq/3/smp_affinity with 32 cores all 32 bits are one.
> 
> As a general comment your series needs a good rebase in virt.c. This would
> definitively ease the review.
> 
> I have the impression that the number of modifications related to the
> GICv3 addition do not require to create a new virt.c file. My personal feeling
> is using Ashok's approach based on machine properties may be worth trying.
> Obviously this will urge you to fill the memory map holes but that's should be
> feasible.
> 

Hi Eric,

Thank you.

I have just pulled from git and I'm currently working on rebasing my 1-3 patches to commit 62bf3d , mainly adding interrupt groups and fiq.
I accept that virtv2 is not needed and I'm currently using Pavel's patch https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02930.html with small modifications.
However If there is a consensus I'll move to Ashok's virt instead of Pavel's one.

I'm looking forward for a decision regarding which virt is to be used.

Best regards,

S.P.
Pavel Fedin May 25, 2015, 7:42 a.m. UTC | #10
Hello!

> I accept that virtv2 is not needed and I'm currently using Pavel's patch
> https://lists.gnu.org/archive/html/qemu-devel/2015-05/msg02930.html with small
> modifications.
> However If there is a consensus I'll move to Ashok's virt instead of Pavel's one.
> 
> I'm looking forward for a decision regarding which virt is to be used.

 I have just tried Ashok's approach and unfortunately it does not work well. The problem
is mc->max_cpus. Ashok's patch does not do anything with it because he seemed to be
satisfied with the maximum of 8. And it looks like CPU number limitation is evaluated
before machine is instantiated and properties are evaluated. I tried to change this value
inside virt_set_gic_version() (see Ashok's code), but it simply does not work, and i still
get " Maximum CPUs greater than specified machine type limit" from qemu.
 I believe there can be some fix, but it seems to require much more changes than simply
subclassing a machine. Does it worth that? And, by the way, libvirt has now added
recognition of "virt-" prefix too.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin May 25, 2015, 11:07 a.m. UTC | #11
Hi everybody!

 I started to play with ITS implementation and got another question.

> +    [VIRT_ITS_CONTROL] =     { 0x08020000, 0x00010000 },
> +    [VIRT_ITS_TRANSLATION] = { 0x08030000, 0x00010000 },

 Why do you describe these as two separate regions? Actually they always follow one
another, and CONTROL area has peripherial IDs at the end, so we can perfectly merge them.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
Pavel Fedin May 25, 2015, 11:47 a.m. UTC | #12
>  Why do you describe these as two separate regions? Actually they always follow one
> another, and CONTROL area has peripherial IDs at the end, so we can perfectly merge
them.

 And one more note. Could you omit ITS region reservations from the next version at all? I
think ITS implementation will look best in a separate file, as a separate device, similar
to GICv2m one. Actually, in device tree ITS is also a separate entity.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
diff mbox

Patch

diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs
index 2577f68..e94a929 100644
--- a/hw/arm/Makefile.objs
+++ b/hw/arm/Makefile.objs
@@ -2,7 +2,7 @@  obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o
 obj-$(CONFIG_DIGIC) += digic_boards.o
 obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o
 obj-y += omap_sx1.o palm.o realview.o spitz.o stellaris.o
-obj-y += tosa.o versatilepb.o vexpress.o virt.o xilinx_zynq.o z2.o
+obj-y += tosa.o versatilepb.o vexpress.o virt.o virtv2.o xilinx_zynq.o z2.o
 obj-y += netduino2.o
 
 obj-y += armv7m.o exynos4210.o pxa2xx.o pxa2xx_gpio.o pxa2xx_pic.o
diff --git a/hw/arm/virtv2.c b/hw/arm/virtv2.c
new file mode 100644
index 0000000..08e3059
--- /dev/null
+++ b/hw/arm/virtv2.c
@@ -0,0 +1,774 @@ 
+/*
+ * ARM mach-virt emulation
+ *
+ * Copyright (c) 2013 Linaro Limited
+ * Copyright (c) 2015 Huawei.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Emulate a virtual board which works by passing Linux all the information
+ * it needs about what devices are present via the device tree.
+ * There are some restrictions about what we can do here:
+ *  + we can only present devices whose Linux drivers will work based
+ *    purely on the device tree with no platform data at all
+ *  + we want to present a very stripped-down minimalist platform,
+ *    both because this reduces the security attack surface from the guest
+ *    and also because it reduces our exposure to being broken when
+ *    the kernel updates its device tree bindings and requires further
+ *    information in a device binding that we aren't providing.
+ * This is essentially the same approach kvmtool uses.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/arm/arm.h"
+#include "hw/arm/primecell.h"
+#include "hw/devices.h"
+#include "net/net.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/device_tree.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/kvm.h"
+#include "hw/boards.h"
+#include "hw/loader.h"
+#include "exec/address-spaces.h"
+#include "qemu/bitops.h"
+#include "qemu/error-report.h"
+
+#undef DEBUG_VIRT2
+
+#ifdef DEBUG_VIRT2
+#define DPRINTF(fmt, ...) \
+do { fprintf(stderr, "virt2: " fmt , ## __VA_ARGS__); } while (0)
+#else
+#define DPRINTF(fmt, ...) do {} while(0)
+#endif
+
+
+#define NUM_VIRTIO_TRANSPORTS 32
+
+/* Number of external interrupt lines to configure the GIC with */
+#define NUM_IRQS 128
+
+#define GIC_FDT_IRQ_TYPE_SPI 0
+#define GIC_FDT_IRQ_TYPE_PPI 1
+
+#define GIC_FDT_IRQ_FLAGS_EDGE_LO_HI 1
+#define GIC_FDT_IRQ_FLAGS_EDGE_HI_LO 2
+#define GIC_FDT_IRQ_FLAGS_LEVEL_HI 4
+#define GIC_FDT_IRQ_FLAGS_LEVEL_LO 8
+
+#define GIC_FDT_IRQ_PPI_CPU_START 8
+#define GIC_FDT_IRQ_PPI_CPU_WIDTH 24
+
+enum {
+    VIRT_FLASH,
+    VIRT_MEM,
+    VIRT_CPUPERIPHS,
+    VIRT_GIC_DIST,
+    VIRT_GIC_DIST_SPI,
+    VIRT_ITS_CONTROL,
+    VIRT_ITS_TRANSLATION,
+    VIRT_LPI,
+    VIRT_UART,
+    VIRT_MMIO,
+    VIRT_RTC,
+    VIRT_FW_CFG,
+    VIRT_GIC_CPU,
+};
+
+typedef struct MemMapEntry {
+    hwaddr base;
+    hwaddr size;
+} MemMapEntry;
+
+typedef struct VirtBoardInfo {
+    struct arm_boot_info bootinfo;
+    const char *cpu_model;
+    const MemMapEntry *memmap;
+    const int *irqmap;
+    int smp_cpus;
+    void *fdt;
+    int fdt_size;
+    uint32_t clock_phandle;
+} VirtBoardInfo;
+
+typedef struct {
+    MachineClass parent;
+    VirtBoardInfo *daughterboard;
+} VirtMachineClass;
+
+typedef struct {
+    MachineState parent;
+    bool secure;
+} VirtMachineState;
+
+#define TYPE_VIRT_MACHINE   "virt2"
+#define VIRT_MACHINE(obj) \
+    OBJECT_CHECK(VirtMachineState, (obj), TYPE_VIRT_MACHINE)
+#define VIRT_MACHINE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(VirtMachineClass, obj, TYPE_VIRT_MACHINE)
+#define VIRT_MACHINE_CLASS(klass) \
+    OBJECT_CLASS_CHECK(VirtMachineClass, klass, TYPE_VIRT_MACHINE)
+
+/* Addresses and sizes of our components.
+ * 0..128MB is space for a flash device so we can run bootrom code such as UEFI.
+ * 128MB..256MB is used for miscellaneous device I/O.
+ * 256MB..1GB is reserved for possible future PCI support (ie where the
+ * PCI memory window will go if we add a PCI host controller).
+ * 1GB and up is RAM (which may happily spill over into the
+ * high memory region beyond 4GB).
+ * This represents a compromise between how much RAM can be given to
+ * a 32 bit VM and leaving space for expansion and in particular for PCI.
+ * Note that devices should generally be placed at multiples of 0x10000,
+ * to accommodate guests using 64K pages.
+ */
+static const MemMapEntry a57memmap[] = {
+    /* Space up to 0x8000000 is reserved for a boot ROM */
+    [VIRT_FLASH] =           {          0, 0x08000000 },
+    [VIRT_CPUPERIPHS] =      { 0x08000000, 0x00840000 },
+    [VIRT_GIC_DIST] =        { 0x08000000, 0x00010000 },
+    [VIRT_GIC_DIST_SPI]=     { 0x08010000, 0x00010000 },
+    [VIRT_ITS_CONTROL] =     { 0x08020000, 0x00010000 },
+    [VIRT_ITS_TRANSLATION] = { 0x08030000, 0x00010000 },
+    [VIRT_LPI] =             { 0x08040000, 0x00800000 },
+    [VIRT_UART] =            { 0x09000000, 0x00001000 },
+    [VIRT_RTC] =             { 0x09010000, 0x00001000 },
+    [VIRT_FW_CFG] =          { 0x09020000, 0x0000000a },
+    [VIRT_MMIO] =            { 0x0a000000, 0x00000200 },
+    /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
+    /* 0x10000000 .. 0x40000000 reserved for PCI */
+    [VIRT_MEM] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
+    /* Need to support both memmory mapped and system registers according to
+     * section D7.6.22 4 SRE_EL1 of the ARM arch ref manual */
+    [VIRT_GIC_CPU] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
+};
+
+static const int a57irqmap[] = {
+    [VIRT_UART] = 1,
+    [VIRT_RTC] = 2,
+    [VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+};
+
+static VirtBoardInfo machines[] = {
+    {
+        .cpu_model = "cortex-a53",
+        .memmap = a57memmap,
+        .irqmap = a57irqmap,
+    },
+    {
+        .cpu_model = "cortex-a57",
+        .memmap = a57memmap,
+        .irqmap = a57irqmap,
+    },
+    {
+        .cpu_model = "host",
+        .memmap = a57memmap,
+        .irqmap = a57irqmap,
+    },
+};
+
+static VirtBoardInfo *find_machine_info(const char *cpu)
+{
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(machines); i++) {
+        if (strcmp(cpu, machines[i].cpu_model) == 0) {
+            return &machines[i];
+        }
+    }
+    return NULL;
+}
+
+static void create_fdt(VirtBoardInfo *vbi)
+{
+    void *fdt = create_device_tree(&vbi->fdt_size);
+
+    if (!fdt) {
+        error_report("create_device_tree() failed");
+        exit(1);
+    }
+
+    vbi->fdt = fdt;
+
+    /* Header */
+    qemu_fdt_setprop_string(fdt, "/", "compatible", "linux,dummy-virt");
+    qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x2);
+    qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
+
+    /*
+     * /chosen and /memory nodes must exist for load_dtb
+     * to fill in necessary properties later
+     */
+    qemu_fdt_add_subnode(fdt, "/chosen");
+    qemu_fdt_add_subnode(fdt, "/memory");
+    qemu_fdt_setprop_string(fdt, "/memory", "device_type", "memory");
+
+    /* Clock node, for the benefit of the UART. The kernel device tree
+     * binding documentation claims the PL011 node clock properties are
+     * optional but in practice if you omit them the kernel refuses to
+     * probe for the device.
+     */
+    vbi->clock_phandle = qemu_fdt_alloc_phandle(fdt);
+    qemu_fdt_add_subnode(fdt, "/apb-pclk");
+    qemu_fdt_setprop_string(fdt, "/apb-pclk", "compatible", "fixed-clock");
+    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "#clock-cells", 0x0);
+    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "clock-frequency", 24000000);
+    qemu_fdt_setprop_string(fdt, "/apb-pclk", "clock-output-names",
+                                "clk24mhz");
+    qemu_fdt_setprop_cell(fdt, "/apb-pclk", "phandle", vbi->clock_phandle);
+
+}
+
+static void fdt_add_psci_node(const VirtBoardInfo *vbi)
+{
+    uint32_t cpu_suspend_fn;
+    uint32_t cpu_off_fn;
+    uint32_t cpu_on_fn;
+    uint32_t migrate_fn;
+    void *fdt = vbi->fdt;
+    ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(0));
+
+    qemu_fdt_add_subnode(fdt, "/psci");
+    if (armcpu->psci_version == 2) {
+       const char comp[] = "arm,psci-0.2\0arm,psci";
+       qemu_fdt_setprop(fdt, "/psci", "compatible", comp, sizeof(comp));
+       cpu_off_fn = QEMU_PSCI_0_2_FN_CPU_OFF;
+       if (arm_feature(&armcpu->env, ARM_FEATURE_AARCH64)) {
+            cpu_suspend_fn = QEMU_PSCI_0_2_FN64_CPU_SUSPEND;
+            cpu_on_fn = QEMU_PSCI_0_2_FN64_CPU_ON;
+            migrate_fn = QEMU_PSCI_0_2_FN64_MIGRATE;
+       } else {
+            cpu_suspend_fn = QEMU_PSCI_0_2_FN_CPU_SUSPEND;
+            cpu_on_fn = QEMU_PSCI_0_2_FN_CPU_ON;
+            migrate_fn = QEMU_PSCI_0_2_FN_MIGRATE;
+       }
+    } else {
+        qemu_fdt_setprop_string(fdt, "/psci", "compatible", "arm,psci");
+        cpu_suspend_fn = QEMU_PSCI_0_1_FN_CPU_SUSPEND;
+        cpu_off_fn = QEMU_PSCI_0_1_FN_CPU_OFF;
+        cpu_on_fn = QEMU_PSCI_0_1_FN_CPU_ON;
+        migrate_fn = QEMU_PSCI_0_1_FN_MIGRATE;
+    }
+
+    /* We adopt the PSCI spec's nomenclature, and use 'conduit' to refer
+     * to the instruction that should be used to invoke PSCI functions.
+     * However, the device tree binding uses 'method' instead, so that is
+     * what we should use here.
+     */
+    qemu_fdt_setprop_string(fdt, "/psci", "method", "hvc");
+
+    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_suspend", cpu_suspend_fn);
+    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_off", cpu_off_fn);
+    qemu_fdt_setprop_cell(fdt, "/psci", "cpu_on", cpu_on_fn);
+    qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
+}
+
+//PPI is moved to different location
+
+static void fdt_add_timer_nodes(const VirtBoardInfo *vbi)
+{
+    /* Note that on A57 h/w these interrupts are level-triggered,
+     * but for the GIC implementation provided by both QEMU and KVM
+     * they are edge-triggered.
+     */
+    // ARMCPU *armcpu;
+    uint32_t max;
+    uint32_t irqflags = GIC_FDT_IRQ_FLAGS_EDGE_LO_HI;
+    /* Argument is 32 bit but 8 bits are reserved for flags */
+    max = (vbi->smp_cpus >= 24) ? 24 : vbi->smp_cpus;
+    irqflags = deposit32(irqflags, GIC_FDT_IRQ_PPI_CPU_START,
+        GIC_FDT_IRQ_PPI_CPU_WIDTH, (1 << max) - 1);
+
+    qemu_fdt_add_subnode(vbi->fdt, "/timer");
+    qemu_fdt_setprop_string(vbi->fdt, "/timer",
+                                "compatible", "arm,armv8-timer\0arm,armv7-timer");
+    qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
+                               GIC_FDT_IRQ_TYPE_PPI, 13, irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI, 14, irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI, 11, irqflags,
+                               GIC_FDT_IRQ_TYPE_PPI, 10, irqflags);
+}
+
+static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
+{
+    int cpu;
+
+    qemu_fdt_add_subnode(vbi->fdt, "/cpus");
+    /* From Documentation/devicetree/bindings/arm/cpus.txt
+     *  On ARM v8 64-bit systems value should be set to 2,
+     *  that corresponds to the MPIDR_EL1 register size.
+     *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
+     *  in the system, #address-cells can be set to 1, since
+     *  MPIDR_EL1[63:32] bits are not used for CPUs
+     *  identification.
+     *
+     *  Now GIC500 doesn't support affinities 2 & 3 so currently
+     *  #address-cells can stay 1 until future GIC
+     */
+    qemu_fdt_setprop_cell(vbi->fdt, "/cpus", "#address-cells", 0x1);
+    qemu_fdt_setprop_cell(vbi->fdt, "/cpus", "#size-cells", 0x0);
+
+    for (cpu = vbi->smp_cpus - 1; cpu >= 0; cpu--) {
+        int Aff1, Aff0;
+        char *nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
+        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(cpu));
+
+        qemu_fdt_add_subnode(vbi->fdt, nodename);
+        qemu_fdt_setprop_string(vbi->fdt, nodename,"device_type","cpu");
+        qemu_fdt_setprop_string(vbi->fdt, nodename, "compatible",
+                                    armcpu->dtb_compatible);
+
+        if (vbi->smp_cpus > 1) {
+            qemu_fdt_setprop_string(vbi->fdt, nodename,
+                                        "enable-method", "psci");
+        }
+
+        /* If cpus node's #address-cells property is set to 1
+         * The reg cell bits [23:0] must be set to bits [23:0] of MPIDR_EL1.
+         * Currently we support simple affinity scheme e.g. we can have
+         * 1 cluster with 8 cores but we can't have 8 clusters wit 1 core each.
+         */
+        Aff1 = cpu / 8;
+        Aff0 = cpu % 8;
+        qemu_fdt_setprop_cell(vbi->fdt, nodename, "reg", (Aff1 << 8) | Aff0);
+        g_free(nodename);
+    }
+}
+
+static void fdt_add_gic_node(const VirtBoardInfo *vbi)
+{
+    uint32_t gic_phandle;
+
+    gic_phandle = qemu_fdt_alloc_phandle(vbi->fdt);
+    qemu_fdt_setprop_cell(vbi->fdt, "/", "interrupt-parent", gic_phandle);
+
+    qemu_fdt_add_subnode(vbi->fdt, "/intc");
+    /* 'cortex-a57-gic' means 'GIC v3' */
+    qemu_fdt_setprop_string(vbi->fdt, "/intc", "compatible",
+                            "arm,gic-v3");
+    qemu_fdt_setprop_cell(vbi->fdt, "/intc", "#interrupt-cells", 3);
+    qemu_fdt_setprop(vbi->fdt, "/intc", "interrupt-controller", NULL, 0);
+    qemu_fdt_setprop_sized_cells(vbi->fdt, "/intc", "reg",
+                             2, vbi->memmap[VIRT_GIC_DIST].base,
+                             2, vbi->memmap[VIRT_GIC_DIST].size,
+#if 0
+                             /* Currently no need for SPI & ITS */
+                             2, vbi->memmap[VIRT_GIC_DIST_SPI].base,
+                             2, vbi->memmap[VIRT_GIC_DIST_SPI].size,
+                             2, vbi->memmap[VIRT_ITS_CONTROL].base,
+                             2, vbi->memmap[VIRT_ITS_CONTROL].size,
+                             2, vbi->memmap[VIRT_ITS_TRANSLATION].base,
+                             2, vbi->memmap[VIRT_ITS_TRANSLATION].size,
+#endif
+                             2, vbi->memmap[VIRT_LPI].base,
+                             2, vbi->memmap[VIRT_LPI].size);
+    qemu_fdt_setprop_cell(vbi->fdt, "/intc", "phandle", gic_phandle);
+}
+
+static void create_gic(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+    /* We create a standalone GIC v3 */
+    DeviceState *gicdev;
+    SysBusDevice *gicbusdev;
+    const char *gictype = "arm_gicv3";
+    int i;
+
+    if (kvm_irqchip_in_kernel()) {
+        gictype = "kvm-arm-gic";
+    }
+
+    gicdev = qdev_create(NULL, gictype);
+
+    for (i = 0; i < vbi->smp_cpus; i++) {
+        CPUState *cpu = qemu_get_cpu(i);
+        CPUARMState *env = cpu->env_ptr;
+        env->nvic = gicdev;
+    }
+
+    qdev_prop_set_uint32(gicdev, "revision", 3);
+    qdev_prop_set_uint32(gicdev, "num-cpu", smp_cpus);
+    /* Note that the num-irq property counts both internal and external
+     * interrupts; there are always 32 of the former (mandated by GIC spec).
+     */
+    qdev_prop_set_uint32(gicdev, "num-irq", NUM_IRQS + 32);
+    qdev_init_nofail(gicdev);
+    gicbusdev = SYS_BUS_DEVICE(gicdev);
+
+    sysbus_mmio_map(gicbusdev, 0, vbi->memmap[VIRT_GIC_DIST].base);
+    sysbus_mmio_map(gicbusdev, 1, vbi->memmap[VIRT_GIC_DIST_SPI].base);
+    sysbus_mmio_map(gicbusdev, 2, vbi->memmap[VIRT_ITS_CONTROL].base);
+    sysbus_mmio_map(gicbusdev, 3, vbi->memmap[VIRT_ITS_TRANSLATION].base);
+    sysbus_mmio_map(gicbusdev, 4, vbi->memmap[VIRT_LPI].base);
+    /* Wire the outputs from each CPU's generic timer to the
+     * appropriate GIC PPI inputs, and the GIC's IRQ output to
+     * the CPU's IRQ input.
+     */
+    for (i = 0; i < smp_cpus ; i++) {
+        DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
+        int ppibase = NUM_IRQS + i * 32;
+        /* physical timer; we wire it up to the non-secure timer's ID,
+         * since a real A57 always has TrustZone but QEMU doesn't.
+         */
+        qdev_connect_gpio_out(cpudev, 0,
+                              qdev_get_gpio_in(gicdev, ppibase + 30));
+        /* virtual timer */
+        qdev_connect_gpio_out(cpudev, 1,
+                              qdev_get_gpio_in(gicdev, ppibase + 27));
+
+        sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
+    }
+
+    for (i = 0; i < NUM_IRQS; i++) {
+        pic[i] = qdev_get_gpio_in(gicdev, i);
+    }
+
+    fdt_add_gic_node(vbi);
+}
+
+static void create_uart(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+    char *nodename;
+    hwaddr base = vbi->memmap[VIRT_UART].base;
+    hwaddr size = vbi->memmap[VIRT_UART].size;
+    int irq = vbi->irqmap[VIRT_UART];
+    const char compat[] = "arm,pl011\0arm,primecell";
+    const char clocknames[] = "uartclk\0apb_pclk";
+
+    sysbus_create_simple("pl011", base, pic[irq]);
+
+    nodename = g_strdup_printf("/pl011@%" PRIx64, base);
+    qemu_fdt_add_subnode(vbi->fdt, nodename);
+    /* Note that we can't use setprop_string because of the embedded NUL */
+    qemu_fdt_setprop(vbi->fdt, nodename, "compatible",
+                         compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
+                                     2, base, 2, size);
+    qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
+                               GIC_FDT_IRQ_TYPE_SPI, irq,
+                               GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+    qemu_fdt_setprop_cells(vbi->fdt, nodename, "clocks",
+                               vbi->clock_phandle, vbi->clock_phandle);
+    qemu_fdt_setprop(vbi->fdt, nodename, "clock-names",
+                         clocknames, sizeof(clocknames));
+
+    qemu_fdt_setprop_string(vbi->fdt, "/chosen", "stdout-path", nodename);
+    g_free(nodename);
+}
+
+static void create_rtc(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+    char *nodename;
+    hwaddr base = vbi->memmap[VIRT_RTC].base;
+    hwaddr size = vbi->memmap[VIRT_RTC].size;
+    int irq = vbi->irqmap[VIRT_RTC];
+    const char compat[] = "arm,pl031\0arm,primecell";
+
+    sysbus_create_simple("pl031", base, pic[irq]);
+
+    nodename = g_strdup_printf("/pl031@%" PRIx64, base);
+    qemu_fdt_add_subnode(vbi->fdt, nodename);
+    qemu_fdt_setprop(vbi->fdt, nodename, "compatible", compat, sizeof(compat));
+    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
+                                 2, base, 2, size);
+    qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
+                           GIC_FDT_IRQ_TYPE_SPI, irq,
+                           GIC_FDT_IRQ_FLAGS_LEVEL_HI);
+    qemu_fdt_setprop_cell(vbi->fdt, nodename, "clocks", vbi->clock_phandle);
+    qemu_fdt_setprop_string(vbi->fdt, nodename, "clock-names", "apb_pclk");
+    g_free(nodename);
+}
+
+static void create_virtio_devices(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+    int i;
+    hwaddr size = vbi->memmap[VIRT_MMIO].size;
+
+    /* Note that we have to create the transports in forwards order
+     * so that command line devices are inserted lowest address first,
+     * and then add dtb nodes in reverse order so that they appear in
+     * the finished device tree lowest address first.
+     */
+    for (i = 0; i < NUM_VIRTIO_TRANSPORTS; i++) {
+        int irq = vbi->irqmap[VIRT_MMIO] + i;
+        hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
+
+        sysbus_create_simple("virtio-mmio", base, pic[irq]);
+    }
+
+    for (i = NUM_VIRTIO_TRANSPORTS - 1; i >= 0; i--) {
+        char *nodename;
+        int irq = vbi->irqmap[VIRT_MMIO] + i;
+        hwaddr base = vbi->memmap[VIRT_MMIO].base + i * size;
+
+        nodename = g_strdup_printf("/virtio_mmio@%" PRIx64, base);
+        qemu_fdt_add_subnode(vbi->fdt, nodename);
+        qemu_fdt_setprop_string(vbi->fdt, nodename,
+                                "compatible", "virtio,mmio");
+        qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
+                                     2, base, 2, size);
+        qemu_fdt_setprop_cells(vbi->fdt, nodename, "interrupts",
+                               GIC_FDT_IRQ_TYPE_SPI, irq,
+                               GIC_FDT_IRQ_FLAGS_EDGE_LO_HI);
+        g_free(nodename);
+    }
+}
+
+static void create_one_flash(const char *name, hwaddr flashbase,
+                             hwaddr flashsize)
+{
+    /* Create and map a single flash device. We use the same
+     * parameters as the flash devices on the Versatile Express board.
+     */
+    DriveInfo *dinfo = drive_get_next(IF_PFLASH);
+    DeviceState *dev = qdev_create(NULL, "cfi.pflash01");
+    const uint64_t sectorlength = 256 * 1024;
+
+    if (dinfo && qdev_prop_set_drive(dev, "drive",
+                                     blk_by_legacy_dinfo(dinfo))) {
+        abort();
+    }
+
+    qdev_prop_set_uint32(dev, "num-blocks", flashsize / sectorlength);
+    qdev_prop_set_uint64(dev, "sector-length", sectorlength);
+    qdev_prop_set_uint8(dev, "width", 4);
+    qdev_prop_set_uint8(dev, "device-width", 2);
+    qdev_prop_set_uint8(dev, "big-endian", 0);
+    qdev_prop_set_uint16(dev, "id0", 0x89);
+    qdev_prop_set_uint16(dev, "id1", 0x18);
+    qdev_prop_set_uint16(dev, "id2", 0x00);
+    qdev_prop_set_uint16(dev, "id3", 0x00);
+    qdev_prop_set_string(dev, "name", name);
+    qdev_init_nofail(dev);
+
+    sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, flashbase);
+}
+
+static void create_flash(const VirtBoardInfo *vbi)
+{
+    /* Create two flash devices to fill the VIRT_FLASH space in the memmap.
+     * Any file passed via -bios goes in the first of these.
+     */
+    hwaddr flashsize = vbi->memmap[VIRT_FLASH].size / 2;
+    hwaddr flashbase = vbi->memmap[VIRT_FLASH].base;
+    char *nodename;
+
+    if (bios_name) {
+        const char *fn;
+
+        if (drive_get(IF_PFLASH, 0, 0)) {
+            error_report("The contents of the first flash device may be "
+                         "specified with -bios or with -drive if=pflash... "
+                         "but you cannot use both options at once");
+            exit(1);
+        }
+        fn = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+        if (!fn || load_image_targphys(fn, flashbase, flashsize) < 0) {
+            error_report("Could not load ROM image '%s'", bios_name);
+            exit(1);
+        }
+    }
+
+    create_one_flash("virt2.flash0", flashbase, flashsize);
+    create_one_flash("virt2.flash1", flashbase + flashsize, flashsize);
+
+    nodename = g_strdup_printf("/flash@%" PRIx64, flashbase);
+    qemu_fdt_add_subnode(vbi->fdt, nodename);
+    qemu_fdt_setprop_string(vbi->fdt, nodename, "compatible", "cfi-flash");
+    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
+                                 2, flashbase, 2, flashsize,
+                                 2, flashbase + flashsize, 2, flashsize);
+    qemu_fdt_setprop_cell(vbi->fdt, nodename, "bank-width", 4);
+    g_free(nodename);
+}
+
+static void create_fw_cfg(const VirtBoardInfo *vbi)
+{
+    hwaddr base = vbi->memmap[VIRT_FW_CFG].base;
+    hwaddr size = vbi->memmap[VIRT_FW_CFG].size;
+    char *nodename;
+
+    fw_cfg_init_mem_wide(base + 8, base, 8);
+
+    nodename = g_strdup_printf("/fw-cfg@%" PRIx64, base);
+    qemu_fdt_add_subnode(vbi->fdt, nodename);
+    qemu_fdt_setprop_string(vbi->fdt, nodename,
+                            "compatible", "qemu,fw-cfg-mmio");
+    qemu_fdt_setprop_sized_cells(vbi->fdt, nodename, "reg",
+                                 2, base, 2, size);
+    g_free(nodename);
+}
+
+static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
+{
+    const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
+
+    *fdt_size = board->fdt_size;
+    return board->fdt;
+}
+
+static void machvirt2_init(MachineState *machine)
+{
+    VirtMachineState *vms = VIRT_MACHINE(machine);
+    qemu_irq pic[NUM_IRQS];
+    MemoryRegion *sysmem = get_system_memory();
+    int n;
+    MemoryRegion *ram = g_new(MemoryRegion, 1);
+    const char *cpu_model = machine->cpu_model;
+    VirtBoardInfo *vbi;
+
+    if (!cpu_model) {
+        cpu_model = "cortex-a57";
+    }
+
+    vbi = find_machine_info(cpu_model);
+    if (!vbi) {
+        error_report("mach-virt: CPU %s not supported", cpu_model);
+        exit(1);
+    }
+
+    /* With gic3 full implementation (with bitops) rase the lmit to 128 */
+    if (smp_cpus > 64) {
+        error_report("mach-virt: cannot model more than 64 cores");
+        exit(1);
+    }
+
+    vbi->smp_cpus = smp_cpus;
+
+    if (machine->ram_size > vbi->memmap[VIRT_MEM].size) {
+        error_report("mach-virt: cannot model more than 30GB RAM");
+        exit(1);
+    }
+
+    create_fdt(vbi);
+
+    for (n = 0; n < smp_cpus; n++) {
+        ObjectClass *oc = cpu_class_by_name(TYPE_AARCH64_CPU, cpu_model);
+        Object *cpuobj;
+
+        if (!oc) {
+            fprintf(stderr, "Unable to find CPU definition\n");
+            exit(1);
+        }
+        cpuobj = object_new(object_class_get_name(oc));
+
+        if (!vms->secure) {
+            object_property_set_bool(cpuobj, false, "has_el3", NULL);
+        }
+
+        object_property_set_int(cpuobj, QEMU_PSCI_CONDUIT_HVC, "psci-conduit",
+                                NULL);
+
+        /* Secondary CPUs start in PSCI powered-down state */
+        if (n > 0) {
+            object_property_set_bool(cpuobj, true, "start-powered-off", NULL);
+        }
+
+        if (object_property_find(cpuobj, "reset-cbar", NULL)) {
+            object_property_set_int(cpuobj, vbi->memmap[VIRT_CPUPERIPHS].base,
+                                    "reset-cbar", &error_abort);
+        }
+
+        object_property_set_bool(cpuobj, true, "realized", NULL);
+    }
+    fdt_add_timer_nodes(vbi);
+    fdt_add_cpu_nodes(vbi);
+    fdt_add_psci_node(vbi);
+
+    memory_region_init_ram(ram, NULL, "mach-virt.ram", machine->ram_size,
+                           &error_abort);
+    vmstate_register_ram_global(ram);
+    memory_region_add_subregion(sysmem, vbi->memmap[VIRT_MEM].base, ram);
+
+    create_flash(vbi);
+    create_gic(vbi, pic);
+    create_uart(vbi, pic);
+    create_rtc(vbi, pic);
+
+    /* Create mmio transports, so the user can create virtio backends
+     * (which will be automatically plugged in to the transports). If
+     * no backend is created the transport will just sit harmlessly idle.
+     */
+    create_virtio_devices(vbi, pic);
+
+    create_fw_cfg(vbi);
+
+    vbi->bootinfo.ram_size = machine->ram_size;
+    vbi->bootinfo.kernel_filename = machine->kernel_filename;
+    vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
+    vbi->bootinfo.initrd_filename = machine->initrd_filename;
+    vbi->bootinfo.nb_cpus = smp_cpus;
+    vbi->bootinfo.board_id = -1;
+    vbi->bootinfo.loader_start = vbi->memmap[VIRT_MEM].base;
+    vbi->bootinfo.get_dtb = machvirt_dtb;
+    vbi->bootinfo.firmware_loaded = bios_name || drive_get(IF_PFLASH, 0, 0);
+    DPRINTF(" ---- kernel load ----- \n");
+    arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
+    DPRINTF(" ---- kernel load finish ----- \n");
+}
+
+
+static bool virt_get_secure(Object *obj, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    return vms->secure;
+}
+
+static void virt_set_secure(Object *obj, bool value, Error **errp)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    vms->secure = value;
+}
+
+static void virt_instance_init(Object *obj)
+{
+    VirtMachineState *vms = VIRT_MACHINE(obj);
+
+    /* EL3 is enabled by default on virt */
+    vms->secure = true;
+    object_property_add_bool(obj, "secure", virt_get_secure,
+                             virt_set_secure, NULL);
+    object_property_set_description(obj, "secure",
+                                    "Set on/off to enable/disable the ARM "
+                                    "Security Extensions (TrustZone)",
+                                    NULL);
+}
+
+static void virt_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->name = TYPE_VIRT_MACHINE;
+    mc->desc = "ARM Virtual Machine",
+    mc->init = machvirt2_init;
+    /* With gic3 full implementation (with bitops) rase the lmit to 128 */
+    mc->max_cpus = 64;
+}
+
+
+static const TypeInfo machvirt2_info = {
+    .name = TYPE_VIRT_MACHINE,
+    .parent = TYPE_MACHINE,
+    .instance_size = sizeof(VirtMachineState),
+    .instance_init = virt_instance_init,
+    .class_size = sizeof(VirtMachineClass),
+    .class_init = virt_class_init,
+};
+
+static void machvirt2_machine_init(void)
+{
+    type_register_static(&machvirt2_info);
+}
+
+machine_init(machvirt2_machine_init);