[{"id":1761470,"web_url":"http://patchwork.ozlabs.org/comment/1761470/","msgid":"<1504243799.4974.69.camel@kernel.crashing.org>","date":"2017-09-01T05:29:59","subject":"Re: [PATCH v3 2/8] powerpc/xive: guest exploitation of the XIVE\n\tinterrupt controller","submitter":{"id":38,"url":"http://patchwork.ozlabs.org/api/people/38/","name":"Benjamin Herrenschmidt","email":"benh@kernel.crashing.org"},"content":"On Wed, 2017-08-30 at 21:46 +0200, Cédric Le Goater wrote:\n> This is the framework for using XIVE in a PowerVM guest. The support\n> is very similar to the native one in a much simpler form.\n> \n> Each source is associated with an Event State Buffer (ESB). This is a\n> two bit state machine which is used to trigger events. The bits are\n> named \"P\" (pending) and \"Q\" (queued) and can be controlled by MMIO.\n> The Guest OS registers event (or notifications) queues on which the HW\n> will post event data for a target to notify.\n> \n> Instead of OPAL calls, a set of Hypervisors call are used to configure\n> the interrupt sources and the event/notification queues of the guest:\n> \n>  - H_INT_GET_SOURCE_INFO\n> \n>    used to obtain the address of the MMIO page of the Event State\n>    Buffer (PQ bits) entry associated with the source.\n> \n>  - H_INT_SET_SOURCE_CONFIG\n> \n>    assigns a source to a \"target\".\n> \n>  - H_INT_GET_SOURCE_CONFIG\n> \n>    determines to which \"target\" and \"priority\" is assigned to a source\n> \n>  - H_INT_GET_QUEUE_INFO\n> \n>    returns the address of the notification management page associated\n>    with the specified \"target\" and \"priority\".\n> \n>  - H_INT_SET_QUEUE_CONFIG\n> \n>    sets or resets the event queue for a given \"target\" and \"priority\".\n>    It is also used to set the notification config associated with the\n>    queue, only unconditional notification for the moment.  
Reset is\n>    performed with a queue size of 0 and queueing is disabled in that\n>    case.\n> \n>  - H_INT_GET_QUEUE_CONFIG\n> \n>    returns the queue settings for a given \"target\" and \"priority\".\n> \n>  - H_INT_RESET\n> \n>    resets all of the partition's interrupt exploitation structures to\n>    their initial state, losing all configuration set via the hcalls\n>    H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.\n> \n>  - H_INT_SYNC\n> \n>    issues a synchronisation on a source to make sure all\n>    notifications have reached their queue.\n> \n> As for XICS, the XIVE interface for the guest is described in the\n> device tree under the \"interrupt-controller\" node. A couple of new\n> properties are specific to XIVE :\n> \n>  - \"reg\"\n> \n>    contains the base address and size of the thread interrupt\n>    management areas (TIMA), also called rings, for the User level and\n>    for the Guest OS level. Only the Guest OS level is taken into\n>    account today.\n> \n>  - \"ibm,xive-eq-sizes\"\n> \n>    the size of the event queues. One cell per size supported, contains\n>    log2 of size, in ascending order.\n> \n>  - \"ibm,xive-lisn-ranges\"\n> \n>    the interrupt number ranges assigned to the guest. 
These are\n>    allocated using a simple bitmap.\n> \n> and also :\n> \n>  - \"/ibm,plat-res-int-priorities\"\n> \n>    contains a list of priorities that the hypervisor has reserved for\n>    its own use.\n> \n> Tested with a QEMU XIVE model for pseries and with the Power hypervisor.\n> \n> Signed-off-by: Cédric Le Goater <clg@kaod.org>\n\nAcked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>\n\n> ---\n> \n>  Changes since v2 :\n> \n>  - changed H_INT_SYNC hcall to reflect new api\n>  \n>  Changes since v1 :\n> \n>  - added a xive_teardown_cpu() routine\n>  - removed P9 doorbell support when xive is enabled.\n>  - merged in patch for \"ibm,plat-res-int-priorities\" support\n>  - added some comments on the usage of raw I/O accessors.\n>  \n>  Changes since RFC :\n> \n>  - renamed backend to spapr\n>  - fixed hotplug support\n>  - fixed kexec support\n>  - fixed src_chip value (XIVE_INVALID_CHIP_ID)\n>  - added doorbell support \n>  - added some hcall debug logs\n> \n>  arch/powerpc/include/asm/hvcall.h            |  13 +-\n>  arch/powerpc/include/asm/xive.h              |   3 +\n>  arch/powerpc/platforms/pseries/Kconfig       |   1 +\n>  arch/powerpc/platforms/pseries/hotplug-cpu.c |  11 +-\n>  arch/powerpc/platforms/pseries/kexec.c       |   6 +-\n>  arch/powerpc/platforms/pseries/setup.c       |   8 +-\n>  arch/powerpc/platforms/pseries/smp.c         |  27 +-\n>  arch/powerpc/sysdev/xive/Kconfig             |   5 +\n>  arch/powerpc/sysdev/xive/Makefile            |   1 +\n>  arch/powerpc/sysdev/xive/common.c            |  13 +\n>  arch/powerpc/sysdev/xive/spapr.c             | 618 +++++++++++++++++++++++++++\n>  11 files changed, 698 insertions(+), 8 deletions(-)\n>  create mode 100644 arch/powerpc/sysdev/xive/spapr.c\n> \n> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h\n> index 57d38b504ff7..3d34dc0869f6 100644\n> --- a/arch/powerpc/include/asm/hvcall.h\n> +++ b/arch/powerpc/include/asm/hvcall.h\n> @@ -280,7 +280,18 
@@\n>  #define H_RESIZE_HPT_COMMIT\t0x370\n>  #define H_REGISTER_PROC_TBL\t0x37C\n>  #define H_SIGNAL_SYS_RESET\t0x380\n> -#define MAX_HCALL_OPCODE\tH_SIGNAL_SYS_RESET\n> +#define H_INT_GET_SOURCE_INFO   0x3A8\n> +#define H_INT_SET_SOURCE_CONFIG 0x3AC\n> +#define H_INT_GET_SOURCE_CONFIG 0x3B0\n> +#define H_INT_GET_QUEUE_INFO    0x3B4\n> +#define H_INT_SET_QUEUE_CONFIG  0x3B8\n> +#define H_INT_GET_QUEUE_CONFIG  0x3BC\n> +#define H_INT_SET_OS_REPORTING_LINE 0x3C0\n> +#define H_INT_GET_OS_REPORTING_LINE 0x3C4\n> +#define H_INT_ESB               0x3C8\n> +#define H_INT_SYNC              0x3CC\n> +#define H_INT_RESET             0x3D0\n> +#define MAX_HCALL_OPCODE\tH_INT_RESET\n>  \n>  /* H_VIOCTL functions */\n>  #define H_GET_VIOA_DUMP_SIZE\t0x01\n> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h\n> index c23ff4389ca2..473f133a8555 100644\n> --- a/arch/powerpc/include/asm/xive.h\n> +++ b/arch/powerpc/include/asm/xive.h\n> @@ -110,11 +110,13 @@ extern bool __xive_enabled;\n>  \n>  static inline bool xive_enabled(void) { return __xive_enabled; }\n>  \n> +extern bool xive_spapr_init(void);\n>  extern bool xive_native_init(void);\n>  extern void xive_smp_probe(void);\n>  extern int  xive_smp_prepare_cpu(unsigned int cpu);\n>  extern void xive_smp_setup_cpu(void);\n>  extern void xive_smp_disable_cpu(void);\n> +extern void xive_teardown_cpu(void);\n>  extern void xive_kexec_teardown_cpu(int secondary);\n>  extern void xive_shutdown(void);\n>  extern void xive_flush_interrupt(void);\n> @@ -147,6 +149,7 @@ extern int xive_native_get_vp_info(u32 vp_id, u32 *out_cam_id, u32 *out_chip_id)\n>  \n>  static inline bool xive_enabled(void) { return false; }\n>  \n> +static inline bool xive_spapr_init(void) { return false; }\n>  static inline bool xive_native_init(void) { return false; }\n>  static inline void xive_smp_probe(void) { }\n>  extern inline int  xive_smp_prepare_cpu(unsigned int cpu) { return -EINVAL; }\n> diff --git 
a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig\n> index 3a6dfd14f64b..71dd69d9ec64 100644\n> --- a/arch/powerpc/platforms/pseries/Kconfig\n> +++ b/arch/powerpc/platforms/pseries/Kconfig\n> @@ -7,6 +7,7 @@ config PPC_PSERIES\n>  \tselect PCI\n>  \tselect PCI_MSI\n>  \tselect PPC_XICS\n> +\tselect PPC_XIVE_SPAPR\n>  \tselect PPC_ICP_NATIVE\n>  \tselect PPC_ICP_HV\n>  \tselect PPC_ICS_RTAS\n> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c\n> index 6afd1efd3633..175230e80766 100644\n> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c\n> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c\n> @@ -34,6 +34,7 @@\n>  #include <asm/machdep.h>\n>  #include <asm/vdso_datapage.h>\n>  #include <asm/xics.h>\n> +#include <asm/xive.h>\n>  #include <asm/plpar_wrappers.h>\n>  \n>  #include \"pseries.h\"\n> @@ -109,7 +110,10 @@ static void pseries_mach_cpu_die(void)\n>  \n>  \tlocal_irq_disable();\n>  \tidle_task_exit();\n> -\txics_teardown_cpu();\n> +\tif (xive_enabled())\n> +\t\txive_teardown_cpu();\n> +\telse\n> +\t\txics_teardown_cpu();\n>  \n>  \tif (get_preferred_offline_state(cpu) == CPU_STATE_INACTIVE) {\n>  \t\tset_cpu_current_state(cpu, CPU_STATE_INACTIVE);\n> @@ -174,7 +178,10 @@ static int pseries_cpu_disable(void)\n>  \t\tboot_cpuid = cpumask_any(cpu_online_mask);\n>  \n>  \t/* FIXME: abstract this to not be platform specific later on */\n> -\txics_migrate_irqs_away();\n> +\tif (xive_enabled())\n> +\t\txive_smp_disable_cpu();\n> +\telse\n> +\t\txics_migrate_irqs_away();\n>  \treturn 0;\n>  }\n>  \n> diff --git a/arch/powerpc/platforms/pseries/kexec.c b/arch/powerpc/platforms/pseries/kexec.c\n> index 6681ac97fb18..eeb13429d685 100644\n> --- a/arch/powerpc/platforms/pseries/kexec.c\n> +++ b/arch/powerpc/platforms/pseries/kexec.c\n> @@ -15,6 +15,7 @@\n>  #include <asm/firmware.h>\n>  #include <asm/kexec.h>\n>  #include <asm/xics.h>\n> +#include <asm/xive.h>\n>  #include 
<asm/smp.h>\n>  #include <asm/plpar_wrappers.h>\n>  \n> @@ -51,5 +52,8 @@ void pseries_kexec_cpu_down(int crash_shutdown, int secondary)\n>  \t\t}\n>  \t}\n>  \n> -\txics_kexec_teardown_cpu(secondary);\n> +\tif (xive_enabled())\n> +\t\txive_kexec_teardown_cpu(secondary);\n> +\telse\n> +\t\txics_kexec_teardown_cpu(secondary);\n>  }\n> diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c\n> index b5d86426e97b..a8531e012658 100644\n> --- a/arch/powerpc/platforms/pseries/setup.c\n> +++ b/arch/powerpc/platforms/pseries/setup.c\n> @@ -57,6 +57,7 @@\n>  #include <asm/nvram.h>\n>  #include <asm/pmc.h>\n>  #include <asm/xics.h>\n> +#include <asm/xive.h>\n>  #include <asm/ppc-pci.h>\n>  #include <asm/i8259.h>\n>  #include <asm/udbg.h>\n> @@ -176,8 +177,11 @@ static void __init pseries_setup_i8259_cascade(void)\n>  \n>  static void __init pseries_init_irq(void)\n>  {\n> -\txics_init();\n> -\tpseries_setup_i8259_cascade();\n> +\t/* Try using a XIVE if available, otherwise use a XICS */\n> +\tif (!xive_spapr_init()) {\n> +\t\txics_init();\n> +\t\tpseries_setup_i8259_cascade();\n> +\t}\n>  }\n>  \n>  static void pseries_lpar_enable_pmcs(void)\n> diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c\n> index 24785f63fb40..2e184829e5d4 100644\n> --- a/arch/powerpc/platforms/pseries/smp.c\n> +++ b/arch/powerpc/platforms/pseries/smp.c\n> @@ -41,6 +41,7 @@\n>  #include <asm/vdso_datapage.h>\n>  #include <asm/cputhreads.h>\n>  #include <asm/xics.h>\n> +#include <asm/xive.h>\n>  #include <asm/dbell.h>\n>  #include <asm/plpar_wrappers.h>\n>  #include <asm/code-patching.h>\n> @@ -136,7 +137,9 @@ static inline int smp_startup_cpu(unsigned int lcpu)\n>  \n>  static void smp_setup_cpu(int cpu)\n>  {\n> -\tif (cpu != boot_cpuid)\n> +\tif (xive_enabled())\n> +\t\txive_smp_setup_cpu();\n> +\telse if (cpu != boot_cpuid)\n>  \t\txics_setup_cpu();\n>  \n>  \tif (firmware_has_feature(FW_FEATURE_SPLPAR))\n> @@ -181,6 +184,13 
@@ static int smp_pSeries_kick_cpu(int nr)\n>  \treturn 0;\n>  }\n>  \n> +static int pseries_smp_prepare_cpu(int cpu)\n> +{\n> +\tif (xive_enabled())\n> +\t\treturn xive_smp_prepare_cpu(cpu);\n> +\treturn 0;\n> +}\n> +\n>  static void smp_pseries_cause_ipi(int cpu)\n>  {\n>  \t/* POWER9 should not use this handler */\n> @@ -211,7 +221,7 @@ static int pseries_cause_nmi_ipi(int cpu)\n>  \treturn 0;\n>  }\n>  \n> -static __init void pSeries_smp_probe(void)\n> +static __init void pSeries_smp_probe_xics(void)\n>  {\n>  \txics_smp_probe();\n>  \n> @@ -221,11 +231,24 @@ static __init void pSeries_smp_probe(void)\n>  \t\tsmp_ops->cause_ipi = icp_ops->cause_ipi;\n>  }\n>  \n> +static __init void pSeries_smp_probe(void)\n> +{\n> +\tif (xive_enabled())\n> +\t\t/*\n> +\t\t * Don't use P9 doorbells when XIVE is enabled. IPIs\n> +\t\t * using MMIOs should be faster\n> +\t\t */\n> +\t\txive_smp_probe();\n> +\telse\n> +\t\tpSeries_smp_probe_xics();\n> +}\n> +\n>  static struct smp_ops_t pseries_smp_ops = {\n>  \t.message_pass\t= NULL,\t/* Use smp_muxed_ipi_message_pass */\n>  \t.cause_ipi\t= NULL,\t/* Filled at runtime by pSeries_smp_probe() */\n>  \t.cause_nmi_ipi\t= pseries_cause_nmi_ipi,\n>  \t.probe\t\t= pSeries_smp_probe,\n> +\t.prepare_cpu\t= pseries_smp_prepare_cpu,\n>  \t.kick_cpu\t= smp_pSeries_kick_cpu,\n>  \t.setup_cpu\t= smp_setup_cpu,\n>  \t.cpu_bootable\t= smp_generic_cpu_bootable,\n> diff --git a/arch/powerpc/sysdev/xive/Kconfig b/arch/powerpc/sysdev/xive/Kconfig\n> index 12ccd7373d2f..3e3e25b5e30d 100644\n> --- a/arch/powerpc/sysdev/xive/Kconfig\n> +++ b/arch/powerpc/sysdev/xive/Kconfig\n> @@ -9,3 +9,8 @@ config PPC_XIVE_NATIVE\n>  \tdefault n\n>  \tselect PPC_XIVE\n>  \tdepends on PPC_POWERNV\n> +\n> +config PPC_XIVE_SPAPR\n> +\tbool\n> +\tdefault n\n> +\tselect PPC_XIVE\n> diff --git a/arch/powerpc/sysdev/xive/Makefile b/arch/powerpc/sysdev/xive/Makefile\n> index 3fab303fc169..536d6e5706e3 100644\n> --- a/arch/powerpc/sysdev/xive/Makefile\n> +++ 
b/arch/powerpc/sysdev/xive/Makefile\n> @@ -2,3 +2,4 @@ subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror\n>  \n>  obj-y\t\t\t\t+= common.o\n>  obj-$(CONFIG_PPC_XIVE_NATIVE)\t+= native.o\n> +obj-$(CONFIG_PPC_XIVE_SPAPR)\t+= spapr.o\n> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c\n> index 26999ceae20e..8774af7a4105 100644\n> --- a/arch/powerpc/sysdev/xive/common.c\n> +++ b/arch/powerpc/sysdev/xive/common.c\n> @@ -1368,6 +1368,19 @@ void xive_flush_interrupt(void)\n>  \n>  #endif /* CONFIG_SMP */\n>  \n> +void xive_teardown_cpu(void)\n> +{\n> +\tstruct xive_cpu *xc = __this_cpu_read(xive_cpu);\n> +\tunsigned int cpu = smp_processor_id();\n> +\n> +\t/* Set CPPR to 0 to disable flow of interrupts */\n> +\txc->cppr = 0;\n> +\tout_8(xive_tima + xive_tima_offset + TM_CPPR, 0);\n> +\n> +\tif (xive_ops->teardown_cpu)\n> +\t\txive_ops->teardown_cpu(cpu, xc);\n> +}\n> +\n>  void xive_kexec_teardown_cpu(int secondary)\n>  {\n>  \tstruct xive_cpu *xc = __this_cpu_read(xive_cpu);\n> diff --git a/arch/powerpc/sysdev/xive/spapr.c b/arch/powerpc/sysdev/xive/spapr.c\n> new file mode 100644\n> index 000000000000..797bb0636ab7\n> --- /dev/null\n> +++ b/arch/powerpc/sysdev/xive/spapr.c\n> @@ -0,0 +1,618 @@\n> +/*\n> + * Copyright 2016,2017 IBM Corporation.\n> + *\n> + * This program is free software; you can redistribute it and/or\n> + * modify it under the terms of the GNU General Public License\n> + * as published by the Free Software Foundation; either version\n> + * 2 of the License, or (at your option) any later version.\n> + */\n> +\n> +#define pr_fmt(fmt) \"xive: \" fmt\n> +\n> +#include <linux/types.h>\n> +#include <linux/irq.h>\n> +#include <linux/smp.h>\n> +#include <linux/interrupt.h>\n> +#include <linux/init.h>\n> +#include <linux/of.h>\n> +#include <linux/slab.h>\n> +#include <linux/spinlock.h>\n> +#include <linux/cpumask.h>\n> +#include <linux/mm.h>\n> +\n> +#include <asm/prom.h>\n> +#include <asm/io.h>\n> +#include <asm/smp.h>\n> 
+#include <asm/irq.h>\n> +#include <asm/errno.h>\n> +#include <asm/xive.h>\n> +#include <asm/xive-regs.h>\n> +#include <asm/hvcall.h>\n> +\n> +#include \"xive-internal.h\"\n> +\n> +static u32 xive_queue_shift;\n> +\n> +struct xive_irq_bitmap {\n> +\tunsigned long\t\t*bitmap;\n> +\tunsigned int\t\tbase;\n> +\tunsigned int\t\tcount;\n> +\tspinlock_t\t\tlock;\n> +\tstruct list_head\tlist;\n> +};\n> +\n> +static LIST_HEAD(xive_irq_bitmaps);\n> +\n> +static int xive_irq_bitmap_add(int base, int count)\n> +{\n> +\tstruct xive_irq_bitmap *xibm;\n> +\n> +\txibm = kzalloc(sizeof(*xibm), GFP_ATOMIC);\n> +\tif (!xibm)\n> +\t\treturn -ENOMEM;\n> +\n> +\tspin_lock_init(&xibm->lock);\n> +\txibm->base = base;\n> +\txibm->count = count;\n> +\txibm->bitmap = kzalloc(xibm->count, GFP_KERNEL);\n> +\tlist_add(&xibm->list, &xive_irq_bitmaps);\n> +\n> +\tpr_info(\"Using IRQ range [%x-%x]\", xibm->base,\n> +\t\txibm->base + xibm->count - 1);\n> +\treturn 0;\n> +}\n> +\n> +static int __xive_irq_bitmap_alloc(struct xive_irq_bitmap *xibm)\n> +{\n> +\tint irq;\n> +\n> +\tirq = find_first_zero_bit(xibm->bitmap, xibm->count);\n> +\tif (irq != xibm->count) {\n> +\t\tset_bit(irq, xibm->bitmap);\n> +\t\tirq += xibm->base;\n> +\t} else {\n> +\t\tirq = -ENOMEM;\n> +\t}\n> +\n> +\treturn irq;\n> +}\n> +\n> +static int xive_irq_bitmap_alloc(void)\n> +{\n> +\tstruct xive_irq_bitmap *xibm;\n> +\tunsigned long flags;\n> +\tint irq = -ENOENT;\n> +\n> +\tlist_for_each_entry(xibm, &xive_irq_bitmaps, list) {\n> +\t\tspin_lock_irqsave(&xibm->lock, flags);\n> +\t\tirq = __xive_irq_bitmap_alloc(xibm);\n> +\t\tspin_unlock_irqrestore(&xibm->lock, flags);\n> +\t\tif (irq >= 0)\n> +\t\t\tbreak;\n> +\t}\n> +\treturn irq;\n> +}\n> +\n> +static void xive_irq_bitmap_free(int irq)\n> +{\n> +\tunsigned long flags;\n> +\tstruct xive_irq_bitmap *xibm;\n> +\n> +\tlist_for_each_entry(xibm, &xive_irq_bitmaps, list) {\n> +\t\tif ((irq >= xibm->base) && (irq < xibm->base + xibm->count)) {\n> 
+\t\t\tspin_lock_irqsave(&xibm->lock, flags);\n> +\t\t\tclear_bit(irq - xibm->base, xibm->bitmap);\n> +\t\t\tspin_unlock_irqrestore(&xibm->lock, flags);\n> +\t\t\tbreak;\n> +\t\t}\n> +\t}\n> +}\n> +\n> +static long plpar_int_get_source_info(unsigned long flags,\n> +\t\t\t\t      unsigned long lisn,\n> +\t\t\t\t      unsigned long *src_flags,\n> +\t\t\t\t      unsigned long *eoi_page,\n> +\t\t\t\t      unsigned long *trig_page,\n> +\t\t\t\t      unsigned long *esb_shift)\n> +{\n> +\tunsigned long retbuf[PLPAR_HCALL_BUFSIZE];\n> +\tlong rc;\n> +\n> +\trc = plpar_hcall(H_INT_GET_SOURCE_INFO, retbuf, flags, lisn);\n> +\tif (rc) {\n> +\t\tpr_err(\"H_INT_GET_SOURCE_INFO lisn=%ld failed %ld\\n\", lisn, rc);\n> +\t\treturn rc;\n> +\t}\n> +\n> +\t*src_flags = retbuf[0];\n> +\t*eoi_page  = retbuf[1];\n> +\t*trig_page = retbuf[2];\n> +\t*esb_shift = retbuf[3];\n> +\n> +\tpr_devel(\"H_INT_GET_SOURCE_INFO flags=%lx eoi=%lx trig=%lx shift=%lx\\n\",\n> +\t\tretbuf[0], retbuf[1], retbuf[2], retbuf[3]);\n> +\n> +\treturn 0;\n> +}\n> +\n> +#define XIVE_SRC_SET_EISN (1ull << (63 - 62))\n> +#define XIVE_SRC_MASK     (1ull << (63 - 63)) /* unused */\n> +\n> +static long plpar_int_set_source_config(unsigned long flags,\n> +\t\t\t\t\tunsigned long lisn,\n> +\t\t\t\t\tunsigned long target,\n> +\t\t\t\t\tunsigned long prio,\n> +\t\t\t\t\tunsigned long sw_irq)\n> +{\n> +\tlong rc;\n> +\n> +\n> +\tpr_devel(\"H_INT_SET_SOURCE_CONFIG flags=%lx lisn=%lx target=%lx prio=%lx sw_irq=%lx\\n\",\n> +\t\tflags, lisn, target, prio, sw_irq);\n> +\n> +\n> +\trc = plpar_hcall_norets(H_INT_SET_SOURCE_CONFIG, flags, lisn,\n> +\t\t\t\ttarget, prio, sw_irq);\n> +\tif (rc) {\n> +\t\tpr_err(\"H_INT_SET_SOURCE_CONFIG lisn=%ld target=%lx prio=%lx failed %ld\\n\",\n> +\t\t       lisn, target, prio, rc);\n> +\t\treturn rc;\n> +\t}\n> +\n> +\treturn 0;\n> +}\n> +\n> +static long plpar_int_get_queue_info(unsigned long flags,\n> +\t\t\t\t     unsigned long target,\n> +\t\t\t\t     unsigned long priority,\n> +\t\t\t\t  
   unsigned long *esn_page,\n> +\t\t\t\t     unsigned long *esn_size)\n> +{\n> +\tunsigned long retbuf[PLPAR_HCALL_BUFSIZE];\n> +\tlong rc;\n> +\n> +\trc = plpar_hcall(H_INT_GET_QUEUE_INFO, retbuf, flags, target, priority);\n> +\tif (rc) {\n> +\t\tpr_err(\"H_INT_GET_QUEUE_INFO cpu=%ld prio=%ld failed %ld\\n\",\n> +\t\t       target, priority, rc);\n> +\t\treturn rc;\n> +\t}\n> +\n> +\t*esn_page = retbuf[0];\n> +\t*esn_size = retbuf[1];\n> +\n> +\tpr_devel(\"H_INT_GET_QUEUE_INFO page=%lx size=%lx\\n\",\n> +\t\tretbuf[0], retbuf[1]);\n> +\n> +\treturn 0;\n> +}\n> +\n> +#define XIVE_EQ_ALWAYS_NOTIFY (1ull << (63 - 63))\n> +\n> +static long plpar_int_set_queue_config(unsigned long flags,\n> +\t\t\t\t       unsigned long target,\n> +\t\t\t\t       unsigned long priority,\n> +\t\t\t\t       unsigned long qpage,\n> +\t\t\t\t       unsigned long qsize)\n> +{\n> +\tlong rc;\n> +\n> +\tpr_devel(\"H_INT_SET_QUEUE_CONFIG flags=%lx target=%lx priority=%lx qpage=%lx qsize=%lx\\n\",\n> +\t\tflags,  target, priority, qpage, qsize);\n> +\n> +\trc = plpar_hcall_norets(H_INT_SET_QUEUE_CONFIG, flags, target,\n> +\t\t\t\tpriority, qpage, qsize);\n> +\tif (rc) {\n> +\t\tpr_err(\"H_INT_SET_QUEUE_CONFIG cpu=%ld prio=%ld qpage=%lx returned %ld\\n\",\n> +\t\t       target, priority, qpage, rc);\n> +\t\treturn  rc;\n> +\t}\n> +\n> +\treturn 0;\n> +}\n> +\n> +static long plpar_int_sync(unsigned long flags, unsigned long lisn)\n> +{\n> +\tlong rc;\n> +\n> +\trc = plpar_hcall_norets(H_INT_SYNC, flags, lisn);\n> +\tif (rc) {\n> +\t\tpr_err(\"H_INT_SYNC lisn=%ld returned %ld\\n\", lisn, rc);\n> +\t\treturn  rc;\n> +\t}\n> +\n> +\treturn 0;\n> +}\n> +\n> +#define XIVE_SRC_H_INT_ESB     (1ull << (63 - 60)) /* TODO */\n> +#define XIVE_SRC_LSI           (1ull << (63 - 61))\n> +#define XIVE_SRC_TRIGGER       (1ull << (63 - 62))\n> +#define XIVE_SRC_STORE_EOI     (1ull << (63 - 63))\n> +\n> +static int xive_spapr_populate_irq_data(u32 hw_irq, struct xive_irq_data *data)\n> +{\n> +\tlong rc;\n> 
+\tunsigned long flags;\n> +\tunsigned long eoi_page;\n> +\tunsigned long trig_page;\n> +\tunsigned long esb_shift;\n> +\n> +\tmemset(data, 0, sizeof(*data));\n> +\n> +\trc = plpar_int_get_source_info(0, hw_irq, &flags, &eoi_page, &trig_page,\n> +\t\t\t\t       &esb_shift);\n> +\tif (rc)\n> +\t\treturn  -EINVAL;\n> +\n> +\tif (flags & XIVE_SRC_STORE_EOI)\n> +\t\tdata->flags  |= XIVE_IRQ_FLAG_STORE_EOI;\n> +\tif (flags & XIVE_SRC_LSI)\n> +\t\tdata->flags  |= XIVE_IRQ_FLAG_LSI;\n> +\tdata->eoi_page  = eoi_page;\n> +\tdata->esb_shift = esb_shift;\n> +\tdata->trig_page = trig_page;\n> +\n> +\t/*\n> +\t * No chip-id for the sPAPR backend. This has an impact how we\n> +\t * pick a target. See xive_pick_irq_target().\n> +\t */\n> +\tdata->src_chip = XIVE_INVALID_CHIP_ID;\n> +\n> +\tdata->eoi_mmio = ioremap(data->eoi_page, 1u << data->esb_shift);\n> +\tif (!data->eoi_mmio) {\n> +\t\tpr_err(\"Failed to map EOI page for irq 0x%x\\n\", hw_irq);\n> +\t\treturn -ENOMEM;\n> +\t}\n> +\n> +\t/* Full function page supports trigger */\n> +\tif (flags & XIVE_SRC_TRIGGER) {\n> +\t\tdata->trig_mmio = data->eoi_mmio;\n> +\t\treturn 0;\n> +\t}\n> +\n> +\tdata->trig_mmio = ioremap(data->trig_page, 1u << data->esb_shift);\n> +\tif (!data->trig_mmio) {\n> +\t\tpr_err(\"Failed to map trigger page for irq 0x%x\\n\", hw_irq);\n> +\t\treturn -ENOMEM;\n> +\t}\n> +\treturn 0;\n> +}\n> +\n> +static int xive_spapr_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq)\n> +{\n> +\tlong rc;\n> +\n> +\trc = plpar_int_set_source_config(XIVE_SRC_SET_EISN, hw_irq, target,\n> +\t\t\t\t\t prio, sw_irq);\n> +\n> +\treturn rc == 0 ? 
0 : -ENXIO;\n> +}\n> +\n> +/* This can be called multiple time to change a queue configuration */\n> +static int xive_spapr_configure_queue(u32 target, struct xive_q *q, u8 prio,\n> +\t\t\t\t   __be32 *qpage, u32 order)\n> +{\n> +\ts64 rc = 0;\n> +\tunsigned long esn_page;\n> +\tunsigned long esn_size;\n> +\tu64 flags, qpage_phys;\n> +\n> +\t/* If there's an actual queue page, clean it */\n> +\tif (order) {\n> +\t\tif (WARN_ON(!qpage))\n> +\t\t\treturn -EINVAL;\n> +\t\tqpage_phys = __pa(qpage);\n> +\t} else {\n> +\t\tqpage_phys = 0;\n> +\t}\n> +\n> +\t/* Initialize the rest of the fields */\n> +\tq->msk = order ? ((1u << (order - 2)) - 1) : 0;\n> +\tq->idx = 0;\n> +\tq->toggle = 0;\n> +\n> +\trc = plpar_int_get_queue_info(0, target, prio, &esn_page, &esn_size);\n> +\tif (rc) {\n> +\t\tpr_err(\"Error %lld getting queue info prio %d\\n\", rc, prio);\n> +\t\trc = -EIO;\n> +\t\tgoto fail;\n> +\t}\n> +\n> +\t/* TODO: add support for the notification page */\n> +\tq->eoi_phys = esn_page;\n> +\n> +\t/* Default is to always notify */\n> +\tflags = XIVE_EQ_ALWAYS_NOTIFY;\n> +\n> +\t/* Configure and enable the queue in HW */\n> +\trc = plpar_int_set_queue_config(flags, target, prio, qpage_phys, order);\n> +\tif (rc) {\n> +\t\tpr_err(\"Error %lld setting queue for prio %d\\n\", rc, prio);\n> +\t\trc = -EIO;\n> +\t} else {\n> +\t\tq->qpage = qpage;\n> +\t}\n> +fail:\n> +\treturn rc;\n> +}\n> +\n> +static int xive_spapr_setup_queue(unsigned int cpu, struct xive_cpu *xc,\n> +\t\t\t\t  u8 prio)\n> +{\n> +\tstruct xive_q *q = &xc->queue[prio];\n> +\t__be32 *qpage;\n> +\n> +\tqpage = xive_queue_page_alloc(cpu, xive_queue_shift);\n> +\tif (IS_ERR(qpage))\n> +\t\treturn PTR_ERR(qpage);\n> +\n> +\treturn xive_spapr_configure_queue(cpu, q, prio, qpage,\n> +\t\t\t\t\t  xive_queue_shift);\n> +}\n> +\n> +static void xive_spapr_cleanup_queue(unsigned int cpu, struct xive_cpu *xc,\n> +\t\t\t\t  u8 prio)\n> +{\n> +\tstruct xive_q *q = &xc->queue[prio];\n> +\tunsigned int alloc_order;\n> 
+\tlong rc;\n> +\n> +\trc = plpar_int_set_queue_config(0, cpu, prio, 0, 0);\n> +\tif (rc)\n> +\t\tpr_err(\"Error %ld setting queue for prio %d\\n\", rc, prio);\n> +\n> +\talloc_order = xive_alloc_order(xive_queue_shift);\n> +\tfree_pages((unsigned long)q->qpage, alloc_order);\n> +\tq->qpage = NULL;\n> +}\n> +\n> +static bool xive_spapr_match(struct device_node *node)\n> +{\n> +\t/* Ignore cascaded controllers for the moment */\n> +\treturn 1;\n> +}\n> +\n> +#ifdef CONFIG_SMP\n> +static int xive_spapr_get_ipi(unsigned int cpu, struct xive_cpu *xc)\n> +{\n> +\tint irq = xive_irq_bitmap_alloc();\n> +\n> +\tif (irq < 0) {\n> +\t\tpr_err(\"Failed to allocate IPI on CPU %d\\n\", cpu);\n> +\t\treturn -ENXIO;\n> +\t}\n> +\n> +\txc->hw_ipi = irq;\n> +\treturn 0;\n> +}\n> +\n> +static void xive_spapr_put_ipi(unsigned int cpu, struct xive_cpu *xc)\n> +{\n> +\txive_irq_bitmap_free(xc->hw_ipi);\n> +}\n> +#endif /* CONFIG_SMP */\n> +\n> +static void xive_spapr_shutdown(void)\n> +{\n> +\tlong rc;\n> +\n> +\trc = plpar_hcall_norets(H_INT_RESET, 0);\n> +\tif (rc)\n> +\t\tpr_err(\"H_INT_RESET failed %ld\\n\", rc);\n> +}\n> +\n> +/*\n> + * Perform an \"ack\" cycle on the current thread. 
Grab the pending\n> + * active priorities and update the CPPR to the most favored one.\n> + */\n> +static void xive_spapr_update_pending(struct xive_cpu *xc)\n> +{\n> +\tu8 nsr, cppr;\n> +\tu16 ack;\n> +\n> +\t/*\n> +\t * Perform the \"Acknowledge O/S to Register\" cycle.\n> +\t *\n> +\t * Let's speedup the access to the TIMA using the raw I/O\n> +\t * accessor as we don't need the synchronisation routine of\n> +\t * the higher level ones\n> +\t */\n> +\tack = be16_to_cpu(__raw_readw(xive_tima + TM_SPC_ACK_OS_REG));\n> +\n> +\t/* Synchronize subsequent queue accesses */\n> +\tmb();\n> +\n> +\t/*\n> +\t * Grab the CPPR and the \"NSR\" field which indicates the source\n> +\t * of the interrupt (if any)\n> +\t */\n> +\tcppr = ack & 0xff;\n> +\tnsr = ack >> 8;\n> +\n> +\tif (nsr & TM_QW1_NSR_EO) {\n> +\t\tif (cppr == 0xff)\n> +\t\t\treturn;\n> +\t\t/* Mark the priority pending */\n> +\t\txc->pending_prio |= 1 << cppr;\n> +\n> +\t\t/*\n> +\t\t * A new interrupt should never have a CPPR less favored\n> +\t\t * than our current one.\n> +\t\t */\n> +\t\tif (cppr >= xc->cppr)\n> +\t\t\tpr_err(\"CPU %d odd ack CPPR, got %d at %d\\n\",\n> +\t\t\t       smp_processor_id(), cppr, xc->cppr);\n> +\n> +\t\t/* Update our idea of what the CPPR is */\n> +\t\txc->cppr = cppr;\n> +\t}\n> +}\n> +\n> +static void xive_spapr_eoi(u32 hw_irq)\n> +{\n> +\t/* Not used */;\n> +}\n> +\n> +static void xive_spapr_setup_cpu(unsigned int cpu, struct xive_cpu *xc)\n> +{\n> +\t/* Only some debug on the TIMA settings */\n> +\tpr_debug(\"(HW value: %08x %08x %08x)\\n\",\n> +\t\t in_be32(xive_tima + TM_QW1_OS + TM_WORD0),\n> +\t\t in_be32(xive_tima + TM_QW1_OS + TM_WORD1),\n> +\t\t in_be32(xive_tima + TM_QW1_OS + TM_WORD2));\n> +}\n> +\n> +static void xive_spapr_teardown_cpu(unsigned int cpu, struct xive_cpu *xc)\n> +{\n> +\t/* Nothing to do */;\n> +}\n> +\n> +static void xive_spapr_sync_source(u32 hw_irq)\n> +{\n> +\t/* Specs are unclear on what this is doing */\n> +\tplpar_int_sync(0, hw_irq);\n> 
+}\n> +\n> +static const struct xive_ops xive_spapr_ops = {\n> +\t.populate_irq_data\t= xive_spapr_populate_irq_data,\n> +\t.configure_irq\t\t= xive_spapr_configure_irq,\n> +\t.setup_queue\t\t= xive_spapr_setup_queue,\n> +\t.cleanup_queue\t\t= xive_spapr_cleanup_queue,\n> +\t.match\t\t\t= xive_spapr_match,\n> +\t.shutdown\t\t= xive_spapr_shutdown,\n> +\t.update_pending\t\t= xive_spapr_update_pending,\n> +\t.eoi\t\t\t= xive_spapr_eoi,\n> +\t.setup_cpu\t\t= xive_spapr_setup_cpu,\n> +\t.teardown_cpu\t\t= xive_spapr_teardown_cpu,\n> +\t.sync_source\t\t= xive_spapr_sync_source,\n> +#ifdef CONFIG_SMP\n> +\t.get_ipi\t\t= xive_spapr_get_ipi,\n> +\t.put_ipi\t\t= xive_spapr_put_ipi,\n> +#endif /* CONFIG_SMP */\n> +\t.name\t\t\t= \"spapr\",\n> +};\n> +\n> +/*\n> + * get max priority from \"/ibm,plat-res-int-priorities\"\n> + */\n> +static bool xive_get_max_prio(u8 *max_prio)\n> +{\n> +\tstruct device_node *rootdn;\n> +\tconst __be32 *reg;\n> +\tu32 len;\n> +\tint prio, found;\n> +\n> +\trootdn = of_find_node_by_path(\"/\");\n> +\tif (!rootdn) {\n> +\t\tpr_err(\"not root node found !\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\treg = of_get_property(rootdn, \"ibm,plat-res-int-priorities\", &len);\n> +\tif (!reg) {\n> +\t\tpr_err(\"Failed to read 'ibm,plat-res-int-priorities' property\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\tif (len % (2 * sizeof(u32)) != 0) {\n> +\t\tpr_err(\"invalid 'ibm,plat-res-int-priorities' property\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\t/* HW supports priorities in the range [0-7] and 0xFF is a\n> +\t * wildcard priority used to mask. 
We scan the ranges reserved\n> +\t * by the hypervisor to find the lowest priority we can use.\n> +\t */\n> +\tfound = 0xFF;\n> +\tfor (prio = 0; prio < 8; prio++) {\n> +\t\tint reserved = 0;\n> +\t\tint i;\n> +\n> +\t\tfor (i = 0; i < len / (2 * sizeof(u32)); i++) {\n> +\t\t\tint base  = be32_to_cpu(reg[2 * i]);\n> +\t\t\tint range = be32_to_cpu(reg[2 * i + 1]);\n> +\n> +\t\t\tif (prio >= base && prio < base + range)\n> +\t\t\t\treserved++;\n> +\t\t}\n> +\n> +\t\tif (!reserved)\n> +\t\t\tfound = prio;\n> +\t}\n> +\n> +\tif (found == 0xFF) {\n> +\t\tpr_err(\"no valid priority found in 'ibm,plat-res-int-priorities'\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\t*max_prio = found;\n> +\treturn true;\n> +}\n> +\n> +bool xive_spapr_init(void)\n> +{\n> +\tstruct device_node *np;\n> +\tstruct resource r;\n> +\tvoid __iomem *tima;\n> +\tstruct property *prop;\n> +\tu8 max_prio;\n> +\tu32 val;\n> +\tu32 len;\n> +\tconst __be32 *reg;\n> +\tint i;\n> +\n> +\tif (xive_cmdline_disabled)\n> +\t\treturn false;\n> +\n> +\tpr_devel(\"%s()\\n\", __func__);\n> +\tnp = of_find_compatible_node(NULL, NULL, \"ibm,power-ivpe\");\n> +\tif (!np) {\n> +\t\tpr_devel(\"not found !\\n\");\n> +\t\treturn false;\n> +\t}\n> +\tpr_devel(\"Found %s\\n\", np->full_name);\n> +\n> +\t/* Resource 1 is the OS ring TIMA */\n> +\tif (of_address_to_resource(np, 1, &r)) {\n> +\t\tpr_err(\"Failed to get thread mgmnt area resource\\n\");\n> +\t\treturn false;\n> +\t}\n> +\ttima = ioremap(r.start, resource_size(&r));\n> +\tif (!tima) {\n> +\t\tpr_err(\"Failed to map thread mgmnt area\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\tif (!xive_get_max_prio(&max_prio))\n> +\t\treturn false;\n> +\n> +\t/* Feed the IRQ number allocator with the ranges given in the DT */\n> +\treg = of_get_property(np, \"ibm,xive-lisn-ranges\", &len);\n> +\tif (!reg) {\n> +\t\tpr_err(\"Failed to read 'ibm,xive-lisn-ranges' property\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\tif (len % (2 * sizeof(u32)) != 0) {\n> 
+\t\tpr_err(\"invalid 'ibm,xive-lisn-ranges' property\\n\");\n> +\t\treturn false;\n> +\t}\n> +\n> +\tfor (i = 0; i < len / (2 * sizeof(u32)); i++, reg += 2)\n> +\t\txive_irq_bitmap_add(be32_to_cpu(reg[0]),\n> +\t\t\t\t    be32_to_cpu(reg[1]));\n> +\n> +\t/* Iterate the EQ sizes and pick one */\n> +\tof_property_for_each_u32(np, \"ibm,xive-eq-sizes\", prop, reg, val) {\n> +\t\txive_queue_shift = val;\n> +\t\tif (val == PAGE_SHIFT)\n> +\t\t\tbreak;\n> +\t}\n> +\n> +\t/* Initialize XIVE core with our backend */\n> +\tif (!xive_core_init(&xive_spapr_ops, tima, TM_QW1_OS, max_prio))\n> +\t\treturn false;\n> +\n> +\tpr_info(\"Using %dkB queues\\n\", 1 << (xive_queue_shift - 10));\n> +\treturn true;\n> +}","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xk7CQ1QZzz9s78\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 15:31:46 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3xk7CQ0F19zDqnn\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  1 Sep 2017 15:31:46 +1000 (AEST)","from gate.crashing.org (gate.crashing.org [63.228.1.57])\n\t(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3xk7B60z05zDqZB\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri,  1 Sep 2017 15:30:37 +1000 (AEST)","from localhost (localhost.localdomain [127.0.0.1])\n\tby gate.crashing.org (8.14.1/8.13.8) with ESMTP id v815TxWm013145;\n\tFri, 1 Sep 2017 00:30:01 
-0500"],"Message-ID":"<1504243799.4974.69.camel@kernel.crashing.org>","Subject":"Re: [PATCH v3 2/8] powerpc/xive: guest exploitation of the XIVE\n\tinterrupt controller","From":"Benjamin Herrenschmidt <benh@kernel.crashing.org>","To":"=?iso-8859-1?q?C=E9dric?= Le Goater <clg@kaod.org>,\n\tlinuxppc-dev@lists.ozlabs.org","Date":"Fri, 01 Sep 2017 15:29:59 +1000","In-Reply-To":"<20170830194617.26621-3-clg@kaod.org>","References":"<20170830194617.26621-1-clg@kaod.org>\n\t<20170830194617.26621-3-clg@kaod.org>","Content-Type":"text/plain; charset=\"UTF-8\"","X-Mailer":"Evolution 3.24.5 (3.24.5-1.fc26) ","Mime-Version":"1.0","Content-Transfer-Encoding":"8bit","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"Paul Mackerras <paulus@samba.org>,\n\tDavid Gibson <david@gibson.dropbear.id.au>","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}}]