Message ID: cover.1421449714.git.geoff@infradead.org
State: New
Hi Geoff,

On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> This series adds the core support for kexec re-boots on arm64. This v7 of the
> series is mainly just a rebase to the latest arm64 for-next/core branch
> (v3.19-rc4), and a few very minor changes requested for v6.

I haven't looked at the series in detail before, so some of my comments
may have already been discussed.

> Several things are known to have problems on kexec re-boot:
>
> spin-table

I think that's not too bad, for complete kexec support (SMP->SMP) we can
require some CPU unplug mechanism and PSCI is one of them.

> FIX: Upgrade system firmware to provide PSCI enable method support or add
> missing spin-table support to the kernel.

What's the missing spin-table support?

> ACPI
> ----
>
> PROBLEM: The kernel for ACPI based systems does not export a device tree to the
> standard user space location of '/proc/device-tree'. Current applications
> expect to access device tree information from this standard location.
>
> WORK-AROUND: Disable ACPI in firmware, OR pass 'acpi=off' on the first stage
> kernel command line, OR pass a user specified DTB using the kexec --dtb option.
>
> FIX: An interface to expose a binary device tree to user space has been
> proposed. User kexec utilities will need to be updated to add support for this
> new interface.

So the fix here is to boot the second stage kernel with a dtb, which means
that we mandate the existence of a DT file for any ACPI system. Are there
plans to make the kexec'ed kernel reuse the ACPI tables?
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> To allow the assembler macros defined in proc-macros.S to be used outside
> the mm code move the proc-macros.S file from arch/arm64/mm/ to
> arch/arm64/include/asm/ and fix up any preprocessor includes to use the new
> file location.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/include/asm/proc-macros.S | 54 ++++++++++++++++++++++++++++++++++++
>  arch/arm64/mm/cache.S                |  3 +-
>  arch/arm64/mm/proc-macros.S          | 54 ------------------------------------
>  arch/arm64/mm/proc.S                 |  3 +-

Actually, I would just merge proc-macros.S into assembler.h. Not worth
keeping the former just for a few macros.
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> To allow for additional hcalls to be defined and to make the arm64 hcall API
> more consistent across exception vector routines, change the hcall implementations
> to use the ISS field of the ESR_EL2 register to specify the hcall type.
>
> The existing arm64 hcall implementations are limited in that they only allow
> for two distinct hcalls; with the x0 register either zero, or not zero. Also,
> the API of the hyp-stub exception vector routines and the KVM exception vector
> routines differ; hyp-stub uses a non-zero value in x0 to implement
> __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp.
>
> Define three new preprocessor macros HVC_GET_VECTORS, HVC_SET_VECTORS and
> HVC_CALL_HYP to be used as hcall type specifiers and convert the
> existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines
> to use these new macros when executing an HVC call. Also change the
> corresponding hyp-stub and KVM el1_sync exception vector routines to use these
> new macros.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>

Using the #imm value for HVC to separate what gets called looks fine to me.
However, I'd like to see a review from Marc/Christoffer on this patch. Some
comments below:

> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 7a5df52..99c319c 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -21,6 +21,26 @@
>  #define BOOT_CPU_MODE_EL1 (0xe11)
>  #define BOOT_CPU_MODE_EL2 (0xe12)
>
> +/*
> + * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
> + */
> +
> +#define HVC_GET_VECTORS 1
> +
> +/*
> + * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
> + *
> + * @x0: Physical address of the new vector table.
> + */
> +
> +#define HVC_SET_VECTORS 2
> +
> +/*
> + * HVC_CALL_HYP - Execute a hyp routine.
> + */
> +
> +#define HVC_CALL_HYP 3

I think you can ignore this case (make it the default), just define it as 0 as
that's the normal use-case after initialisation and avoid checking it
explicitly.

> /*
> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> index a272f33..e3db3fd 100644
> --- a/arch/arm64/kernel/hyp-stub.S
> +++ b/arch/arm64/kernel/hyp-stub.S
> @@ -22,6 +22,7 @@
>  #include <linux/irqchip/arm-gic-v3.h>
>
>  #include <asm/assembler.h>
> +#include <asm/kvm_arm.h>
>  #include <asm/ptrace.h>
>  #include <asm/virt.h>
>
> @@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
>  .align 11
>
> el1_sync:
> -	mrs	x1, esr_el2
> -	lsr	x1, x1, #26
> -	cmp	x1, #0x16
> -	b.ne	2f			// Not an HVC trap
> -	cbz	x0, 1f
> -	msr	vbar_el2, x0		// Set vbar_el2
> +	mrs	x18, esr_el2
> +	lsr	x17, x18, #ESR_ELx_EC_SHIFT
> +	and	x18, x18, #ESR_ELx_ISS_MASK
> +
> +	cmp	x17, #ESR_ELx_EC_HVC64
> +	b.ne	2f			// Not an HVC trap
> +
> +	cmp	x18, #HVC_GET_VECTORS
> +	b.ne	1f
> +	mrs	x0, vbar_el2
> 	b	2f
> -1:	mrs	x0, vbar_el2		// Return vbar_el2
> +
> +1:	cmp	x18, #HVC_SET_VECTORS
> +	b.ne	2f
> +	msr	vbar_el2, x0
> +
> 2:	eret
> ENDPROC(el1_sync)

You seem to be using x17 and x18 here freely. Do you have any guarantees that
the caller saved/restored those registers? I guess you assume they are
temporary registers and the caller first branches to a function (like
__kvm_hyp_call) and expects them to be corrupted. But I'm not sure that's
always the case. Take for example the __invoke_psci_fn_hvc where the function
is in C (we should change this for other reasons).

> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index c0d8202..1916c89 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -27,6 +27,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/memory.h>
> +#include <asm/virt.h>
>
>  #define CPU_GP_REG_OFFSET(x)	(CPU_GP_REGS + x)
>  #define CPU_XREG_OFFSET(x)	CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
> @@ -1106,12 +1107,9 @@ __hyp_panic_str:
>   * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c). Return values are
>   * passed in r0 and r1.
>   *
> - * A function pointer with a value of 0 has a special meaning, and is
> - * used to implement __hyp_get_vectors in the same way as in
> - * arch/arm64/kernel/hyp_stub.S.
>  */
> ENTRY(kvm_call_hyp)
> -	hvc	#0
> +	hvc	#HVC_CALL_HYP
> 	ret
> ENDPROC(kvm_call_hyp)
>
> @@ -1142,6 +1140,7 @@ el1_sync:		// Guest trapped into EL2
>
> 	mrs	x1, esr_el2
> 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
> +	and	x0, x1, #ESR_ELx_ISS_MASK
>
> 	cmp	x2, #ESR_ELx_EC_HVC64
> 	b.ne	el1_trap
>
> 	cbnz	x3, el1_trap		// called HVC
>
> 	/* Here, we're pretty sure the host called HVC. */
> +	mov	x18, x0

Same comment here about corrupting x18. If it is safe, maybe add some comments
in the calling place.

> 	pop	x2, x3
> 	pop	x0, x1
>
> -	/* Check for __hyp_get_vectors */
> -	cbnz	x0, 1f
> +	cmp	x18, #HVC_GET_VECTORS
> +	b.ne	1f
> 	mrs	x0, vbar_el2
> 	b	2f
>
> -1:	push	lr, xzr
> +1:	cmp	x18, #HVC_CALL_HYP
> +	b.ne	2f
> +
> +	push	lr, xzr

At this point, we expect either HVC_GET_VECTORS or HVC_CALL_HYP. I think you
can simply assume HVC_CALL_HYP as default and ignore the additional cmp.
On Mon, Jan 26, 2015 at 5:44 PM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> Hi Geoff,
>
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>> This series adds the core support for kexec re-boots on arm64. This v7 of the
>> series is mainly just a rebase to the latest arm64 for-next/core branch
>> (v3.19-rc4), and a few very minor changes requested for v6.
>
> I haven't looked at the series in detail before, so some of my comments
> may have already been discussed.
>
>> Several things are known to have problems on kexec re-boot:
>>
>> spin-table
>
> I think that's not too bad, for complete kexec support (SMP->SMP) we can
> require some CPU unplug mechanism and PSCI is one of them.
>
>> FIX: Upgrade system firmware to provide PSCI enable method support or add
>> missing spin-table support to the kernel.
>
> What's the missing spin-table support?
>
>> ACPI
>> ----
>>
>> PROBLEM: The kernel for ACPI based systems does not export a device tree to the
>> standard user space location of '/proc/device-tree'. Current applications
>> expect to access device tree information from this standard location.
>>
>> WORK-AROUND: Disable ACPI in firmware, OR pass 'acpi=off' on the first stage
>> kernel command line, OR pass a user specified DTB using the kexec --dtb option.
>>
>> FIX: An interface to expose a binary device tree to user space has been
>> proposed. User kexec utilities will need to be updated to add support for this
>> new interface.

The new interface is merged into mainline: /sys/firmware/fdt

> So the fix here is to boot the second stage kernel with a dtb, which means
> that we mandate the existence of a DT file for any ACPI system. Are
> there plans to make the kexec'ed kernel reuse the ACPI tables?

Yes, the kexec'ed kernel will reuse the ACPI tables, and any other data
passed by UEFI. The DT we're talking about here is the DT generated by the
kernel's UEFI stub, and the kexec tools want access to it so they can find
the UEFI system table pointer.

g.
On Mon, Jan 26, 2015 at 05:44:14PM +0000, Catalin Marinas wrote:
> Hi Geoff,
>
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > This series adds the core support for kexec re-boots on arm64. This v7 of the
> > series is mainly just a rebase to the latest arm64 for-next/core branch
> > (v3.19-rc4), and a few very minor changes requested for v6.
>
> I haven't looked at the series in detail before, so some of my comments
> may have already been discussed.
>
> > Several things are known to have problems on kexec re-boot:
> >
> > spin-table
>
> I think that's not too bad, for complete kexec support (SMP->SMP) we can
> require some CPU unplug mechanism and PSCI is one of them.
>
> > FIX: Upgrade system firmware to provide PSCI enable method support or add
> > missing spin-table support to the kernel.
>
> What's the missing spin-table support?

As you mention above, a mechanism for returning the CPUs to FW. There is no
spec for doing this with spin-table, which would have to be written and
vetted.

Mark.
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> When a CPU is reset it needs to be put into the exception level it had when it
> entered the kernel. Update cpu_reset() to accept an argument el2_switch which
> signals cpu_reset() to enter the soft reset address at EL2. If el2_switch is
> not set the soft reset address will be entered at EL1.
>
> Update cpu_soft_restart() and soft_restart() to pass the return of
> is_hyp_mode_available() as the el2_switch value to cpu_reset(). Also update the
> comments of cpu_reset(), cpu_soft_restart() and soft_restart() to reflect this
> change.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/include/asm/proc-fns.h |  4 ++--
>  arch/arm64/kernel/process.c       | 10 ++++++++-
>  arch/arm64/mm/proc.S              | 47 +++++++++++++++++++++++++++++----------
>  3 files changed, 46 insertions(+), 15 deletions(-)
>
> diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
> index 9a8fd84..339394d 100644
> --- a/arch/arm64/include/asm/proc-fns.h
> +++ b/arch/arm64/include/asm/proc-fns.h
> @@ -32,8 +32,8 @@ extern void cpu_cache_off(void);
>  extern void cpu_do_idle(void);
>  extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
>  extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
> -void cpu_soft_restart(phys_addr_t cpu_reset,
> -		unsigned long addr) __attribute__((noreturn));
> +void cpu_soft_restart(phys_addr_t cpu_reset, unsigned long el2_switch,
> +		unsigned long addr) __attribute__((noreturn));
>  extern void cpu_do_suspend(struct cpu_suspend_ctx *ptr);
>  extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index fde9923..371bbf1 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -50,6 +50,7 @@
>  #include <asm/mmu_context.h>
>  #include <asm/processor.h>
>  #include <asm/stacktrace.h>
> +#include <asm/virt.h>
>
>  #ifdef CONFIG_CC_STACKPROTECTOR
>  #include <linux/stackprotector.h>
> @@ -60,7 +61,14 @@ EXPORT_SYMBOL(__stack_chk_guard);
>  void soft_restart(unsigned long addr)
>  {
>  	setup_mm_for_reboot();
> -	cpu_soft_restart(virt_to_phys(cpu_reset), addr);
> +
> +	/* TODO: Remove this conditional when KVM can support CPU restart. */
> +	if (IS_ENABLED(CONFIG_KVM))
> +		cpu_soft_restart(virt_to_phys(cpu_reset), 0, addr);

If we haven't torn down KVM, doesn't that mean that KVM is active at EL2
(with MMU and caches on) at this point?

If that's the case then we cannot possibly try to call kexec(), because we
cannot touch the memory used by the page tables for those EL2 mappings.
Things will explode if we do.

Mark.
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to the
> arm64 architecture that add support for the kexec re-boot mechanism
> (CONFIG_KEXEC) on arm64 platforms.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/Kconfig                  |   9 ++
>  arch/arm64/include/asm/kexec.h      |  47 +++++++++++
>  arch/arm64/kernel/Makefile          |   1 +
>  arch/arm64/kernel/machine_kexec.c   | 155 ++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/relocate_kernel.S | 160 ++++++++++++++++++++++++++++++++++++
>  include/uapi/linux/kexec.h          |   1 +
>  6 files changed, 373 insertions(+)
>  create mode 100644 arch/arm64/include/asm/kexec.h
>  create mode 100644 arch/arm64/kernel/machine_kexec.c
>  create mode 100644 arch/arm64/kernel/relocate_kernel.S
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b1f9a20..d9eb9cd 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -474,6 +474,15 @@ config SECCOMP
>  	  and the task is only allowed to execute a few safe syscalls
>  	  defined by each seccomp mode.
>
> +config KEXEC
> +	depends on (!SMP || PM_SLEEP_SMP)
> +	bool "kexec system call"
> +	---help---
> +	  kexec is a system call that implements the ability to shutdown your
> +	  current kernel, and to start another kernel. It is like a reboot
> +	  but it is independent of the system firmware. And like a reboot
> +	  you can start any kernel with it, not just Linux.
> +

[...]

> +/**
> + * kexec_is_dtb - Helper routine to check the device tree header signature.
> + */
> +static bool kexec_is_dtb(const void *dtb)
> +{
> +	__be32 magic;
> +
> +	return get_user(magic, (__be32 *)dtb) ? false :
> +		(be32_to_cpu(magic) == OF_DT_HEADER);
> +}
> +
> +/**
> + * kexec_find_dtb_seg - Helper routine to find the dtb segment.
> + */
> +static const struct kexec_segment *kexec_find_dtb_seg(
> +	const struct kimage *image)
> +{
> +	int i;
> +
> +	for (i = 0; i < image->nr_segments; i++) {
> +		if (kexec_is_dtb(image->segment[i].buf))
> +			return &image->segment[i];
> +	}
> +
> +	return NULL;
> +}

As mentioned before, _please_ move the dtb handling to the userspace-provided
purgatory. It would be far better to get userspace to handle setting up the
dtb pointer explicitly. That avoids fragility w.r.t. policy here as userspace
will get exactly what it asked for, nothing more, nothing less. The fact that
this is done on 32-bit arm does not mean that we must do it here.

[...]

> +	/* Start new image. */
> +	ldr	x4, arm64_kexec_kimage_start
> +	ldr	x0, arm64_kexec_dtb_addr
> +	mov	x1, xzr
> +	mov	x2, xzr
> +	mov	x3, xzr
> +	br	x4

Likewise, this should be part of the userspace-provided purgatory code. If
we're staying true to "like a reboot you can start any kernel with it, not
just Linux", we shouldn't embed the Linux boot protocol here.

Mark.
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> Add runtime checks that fail the arm64 kexec syscall for situations that would
> result in system instability due to problems in the KVM kernel support.
> These checks should be removed when the KVM problems are resolved.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> ---
>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> index 3d84759..a36459d 100644
> --- a/arch/arm64/kernel/machine_kexec.c
> +++ b/arch/arm64/kernel/machine_kexec.c
> @@ -16,6 +16,9 @@
>  #include <asm/cacheflush.h>
>  #include <asm/system_misc.h>
>
> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> +#include <asm/virt.h>
> +
>  /* Global variables for the relocate_kernel routine. */
>  extern const unsigned char relocate_new_kernel[];
>  extern const unsigned long relocate_new_kernel_size;
> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>
>  	kexec_image_info(image);
>
> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> +			__func__);
> +		return -ENOSYS;
> +	}

If you really don't want to implement KVM teardown, surely this should be at
the start of the series, so we don't have a point in the middle where things
may explode in this case?

Mark.
On Mon, Jan 26, 2015 at 8:19 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
>> Add runtime checks that fail the arm64 kexec syscall for situations that would
>> result in system instability due to problems in the KVM kernel support.
>> These checks should be removed when the KVM problems are resolved.
>>
>> Signed-off-by: Geoff Levand <geoff@infradead.org>
>> ---
>>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
>> index 3d84759..a36459d 100644
>> --- a/arch/arm64/kernel/machine_kexec.c
>> +++ b/arch/arm64/kernel/machine_kexec.c
>> @@ -16,6 +16,9 @@
>>  #include <asm/cacheflush.h>
>>  #include <asm/system_misc.h>
>>
>> +/* TODO: Remove this include when KVM can support a kexec reboot. */
>> +#include <asm/virt.h>
>> +
>>  /* Global variables for the relocate_kernel routine. */
>>  extern const unsigned char relocate_new_kernel[];
>>  extern const unsigned long relocate_new_kernel_size;
>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
>>
>>  	kexec_image_info(image);
>>
>> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
>> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
>> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
>> +			__func__);
>> +		return -ENOSYS;
>> +	}
>
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?

So this caters to systems that don't support KVM (don't boot in EL2) but are
configured with both KVM and KEXEC?

Why not just make the kexec config dependent on !KVM?

-Christoffer
Hi Catalin,

On Mon, 2015-01-26 at 17:44 +0000, Catalin Marinas wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > This series adds the core support for kexec re-boots on arm64. This v7 of the
> > series is mainly just a rebase to the latest arm64 for-next/core branch
> > (v3.19-rc4), and a few very minor changes requested for v6.
>
> I haven't looked at the series in detail before, so some of my comments
> may have already been discussed.
>
> > Several things are known to have problems on kexec re-boot:
> >
> > spin-table
>
> I think that's not too bad, for complete kexec support (SMP->SMP) we can
> require some CPU unplug mechanism and PSCI is one of them.
>
> > FIX: Upgrade system firmware to provide PSCI enable method support or add
> > missing spin-table support to the kernel.
>
> What's the missing spin-table support?

I had a working spin-table implementation, but I have not kept that up to
date. It is somewhat complicated, and it requires a mechanism to return the
secondary CPUs to the spin table code. As Mark mentioned, that mechanism
would need to be decided on. I don't plan to put any more work into
spin-table support.

Just for anyone interested, my old spin-table support patches are in this
branch:

  https://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git/log/?h=spin-table

> > ACPI
> > ----
> >
> > PROBLEM: The kernel for ACPI based systems does not export a device tree to the
> > standard user space location of '/proc/device-tree'. Current applications
> > expect to access device tree information from this standard location.
> >
> > WORK-AROUND: Disable ACPI in firmware, OR pass 'acpi=off' on the first stage
> > kernel command line, OR pass a user specified DTB using the kexec --dtb option.
> >
> > FIX: An interface to expose a binary device tree to user space has been
> > proposed. User kexec utilities will need to be updated to add support for this
> > new interface.
>
> So the fix here is to boot the second stage kernel with a dtb, which means
> that we mandate the existence of a DT file for any ACPI system. Are
> there plans to make the kexec'ed kernel reuse the ACPI tables?

As Grant mentioned, the dtb the UEFI stub creates and passes to the first
stage kernel is what is passed to the second stage kernel. The second stage
kernel uses that to setup its ACPI support.

-Geoff
On Mon, 2015-01-26 at 21:39 +0100, Christoffer Dall wrote:
> On Mon, Jan 26, 2015 at 8:19 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> >> Add runtime checks that fail the arm64 kexec syscall for situations that would
> >> result in system instability due to problems in the KVM kernel support.
> >> These checks should be removed when the KVM problems are resolved.
> >>
> >> Signed-off-by: Geoff Levand <geoff@infradead.org>
> >> ---
> >>  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> >> index 3d84759..a36459d 100644
> >> --- a/arch/arm64/kernel/machine_kexec.c
> >> +++ b/arch/arm64/kernel/machine_kexec.c
> >> @@ -16,6 +16,9 @@
> >>  #include <asm/cacheflush.h>
> >>  #include <asm/system_misc.h>
> >>
> >> +/* TODO: Remove this include when KVM can support a kexec reboot. */
> >> +#include <asm/virt.h>
> >> +
> >>  /* Global variables for the relocate_kernel routine. */
> >>  extern const unsigned char relocate_new_kernel[];
> >>  extern const unsigned long relocate_new_kernel_size;
> >> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >>
> >>  	kexec_image_info(image);
> >>
> >> +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> >> +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> >> +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> >> +			__func__);
> >> +		return -ENOSYS;
> >> +	}
> >
> > If you really don't want to implement KVM teardown, surely this should
> > be at the start of the series, so we don't have a point in the middle
> > where things may explode in this case?
>
> So this caters to systems that don't support KVM (don't boot in EL2)
> but are configured with both KVM and KEXEC?
>
> Why not just make the kexec config dependent on !KVM?

Sure, that would work. I put it this way so we can get build testing of kexec
since the arm64 defconfig has KVM set.

-Geoff
Hi Mark,

On Mon, 2015-01-26 at 19:19 +0000, Mark Rutland wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > Add runtime checks that fail the arm64 kexec syscall for situations that would
> > result in system instability due to problems in the KVM kernel support.
> > These checks should be removed when the KVM problems are resolved.
> >
> > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > ---
> >  arch/arm64/kernel/machine_kexec.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
> > index 3d84759..a36459d 100644
> > --- a/arch/arm64/kernel/machine_kexec.c
> > +++ b/arch/arm64/kernel/machine_kexec.c
> > @@ -16,6 +16,9 @@
> >  #include <asm/cacheflush.h>
> >  #include <asm/system_misc.h>
> >
> > +/* TODO: Remove this include when KVM can support a kexec reboot. */
> > +#include <asm/virt.h>
> > +
> >  /* Global variables for the relocate_kernel routine. */
> >  extern const unsigned char relocate_new_kernel[];
> >  extern const unsigned long relocate_new_kernel_size;
> > @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image)
> >
> >  	kexec_image_info(image);
> >
> > +	/* TODO: Remove this message when KVM can support a kexec reboot. */
> > +	if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) {
> > +		pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n",
> > +			__func__);
> > +		return -ENOSYS;
> > +	}
>
> If you really don't want to implement KVM teardown, surely this should
> be at the start of the series, so we don't have a point in the middle
> where things may explode in this case?

Yes, you're right. I'm hoping we can get the KVM fix done soon so this is not
needed.

-Geoff
Hi Mark,

On Mon, 2015-01-26 at 19:02 +0000, Mark Rutland wrote:
> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > When a CPU is reset it needs to be put into the exception level it had when it
> > entered the kernel. Update cpu_reset() to accept an argument el2_switch which
> > signals cpu_reset() to enter the soft reset address at EL2. If el2_switch is
> > not set the soft reset address will be entered at EL1.
> >
> > Update cpu_soft_restart() and soft_restart() to pass the return of
> > is_hyp_mode_available() as the el2_switch value to cpu_reset(). Also update the
> > comments of cpu_reset(), cpu_soft_restart() and soft_restart() to reflect this
> > change.
> >
> > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > ---
> >  arch/arm64/include/asm/proc-fns.h |  4 ++--
> >  arch/arm64/kernel/process.c       | 10 ++++++++-
> >  arch/arm64/mm/proc.S              | 47 +++++++++++++++++++++++++++++----------
> >  3 files changed, 46 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
> > index 9a8fd84..339394d 100644
> > --- a/arch/arm64/include/asm/proc-fns.h
> > +++ b/arch/arm64/include/asm/proc-fns.h
> > @@ -32,8 +32,8 @@ extern void cpu_cache_off(void);
> >  extern void cpu_do_idle(void);
> >  extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
> >  extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
> > -void cpu_soft_restart(phys_addr_t cpu_reset,
> > -		unsigned long addr) __attribute__((noreturn));
> > +void cpu_soft_restart(phys_addr_t cpu_reset, unsigned long el2_switch,
> > +		unsigned long addr) __attribute__((noreturn));
> >  extern void cpu_do_suspend(struct cpu_suspend_ctx *ptr);
> >  extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
> >
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index fde9923..371bbf1 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -50,6 +50,7 @@
> >  #include <asm/mmu_context.h>
> >  #include <asm/processor.h>
> >  #include <asm/stacktrace.h>
> > +#include <asm/virt.h>
> >
> >  #ifdef CONFIG_CC_STACKPROTECTOR
> >  #include <linux/stackprotector.h>
> > @@ -60,7 +61,14 @@ EXPORT_SYMBOL(__stack_chk_guard);
> >  void soft_restart(unsigned long addr)
> >  {
> >  	setup_mm_for_reboot();
> > -	cpu_soft_restart(virt_to_phys(cpu_reset), addr);
> > +
> > +	/* TODO: Remove this conditional when KVM can support CPU restart. */
> > +	if (IS_ENABLED(CONFIG_KVM))
> > +		cpu_soft_restart(virt_to_phys(cpu_reset), 0, addr);
>
> If we haven't torn down KVM, doesn't that mean that KVM is active at EL2
> (with MMU and caches on) at this point?
>
> If that's the case then we cannot possibly try to call kexec(), because
> we cannot touch the memory used by the page tables for those EL2
> mappings. Things will explode if we do.

This conditional is just if KVM, do things the old way (don't try to switch
exception levels). It is to handle the system shutdown case.

Another patch in this series '[PATCH 7/8] arm64/kexec: Add checks for KVM'
assures kexec cannot happen when KVM is configured.

-Geoff
On Mon, Jan 26, 2015 at 09:48:48PM +0000, Geoff Levand wrote:
> Hi Mark,
>
> On Mon, 2015-01-26 at 19:02 +0000, Mark Rutland wrote:
> > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote:
> > > When a CPU is reset it needs to be put into the exception level it had when it
> > > entered the kernel. Update cpu_reset() to accept an argument el2_switch which
> > > signals cpu_reset() to enter the soft reset address at EL2. If el2_switch is
> > > not set the soft reset address will be entered at EL1.
> > >
> > > Update cpu_soft_restart() and soft_restart() to pass the return of
> > > is_hyp_mode_available() as the el2_switch value to cpu_reset(). Also update the
> > > comments of cpu_reset(), cpu_soft_restart() and soft_restart() to reflect this
> > > change.
> > >
> > > Signed-off-by: Geoff Levand <geoff@infradead.org>
> > > ---
> > >  arch/arm64/include/asm/proc-fns.h |  4 ++--
> > >  arch/arm64/kernel/process.c       | 10 ++++++++-
> > >  arch/arm64/mm/proc.S              | 47 +++++++++++++++++++++++++++++----------
> > >  3 files changed, 46 insertions(+), 15 deletions(-)
> > >
> > > diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
> > > index 9a8fd84..339394d 100644
> > > --- a/arch/arm64/include/asm/proc-fns.h
> > > +++ b/arch/arm64/include/asm/proc-fns.h
> > > @@ -32,8 +32,8 @@ extern void cpu_cache_off(void);
> > >  extern void cpu_do_idle(void);
> > >  extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
> > >  extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
> > > -void cpu_soft_restart(phys_addr_t cpu_reset,
> > > -		unsigned long addr) __attribute__((noreturn));
> > > +void cpu_soft_restart(phys_addr_t cpu_reset, unsigned long el2_switch,
> > > +		unsigned long addr) __attribute__((noreturn));
> > >  extern void cpu_do_suspend(struct cpu_suspend_ctx *ptr);
> > >  extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
> > >
> > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > index fde9923..371bbf1 100644
> > > --- a/arch/arm64/kernel/process.c
> > > +++ b/arch/arm64/kernel/process.c
> > > @@ -50,6 +50,7 @@
> > >  #include <asm/mmu_context.h>
> > >  #include <asm/processor.h>
> > >  #include <asm/stacktrace.h>
> > > +#include <asm/virt.h>
> > >
> > >  #ifdef CONFIG_CC_STACKPROTECTOR
> > >  #include <linux/stackprotector.h>
> > > @@ -60,7 +61,14 @@ EXPORT_SYMBOL(__stack_chk_guard);
> > >  void soft_restart(unsigned long addr)
> > >  {
> > >  	setup_mm_for_reboot();
> > > -	cpu_soft_restart(virt_to_phys(cpu_reset), addr);
> > > +
> > > +	/* TODO: Remove this conditional when KVM can support CPU restart. */
> > > +	if (IS_ENABLED(CONFIG_KVM))
> > > +		cpu_soft_restart(virt_to_phys(cpu_reset), 0, addr);
> >
> > If we haven't torn down KVM, doesn't that mean that KVM is active at EL2
> > (with MMU and caches on) at this point?
> >
> > If that's the case then we cannot possibly try to call kexec(), because
> > we cannot touch the memory used by the page tables for those EL2
> > mappings. Things will explode if we do.
>
> This conditional is just if KVM, do things the old way (don't try to
> switch exception levels). It is to handle the system shutdown case.

Having grepped treewide for soft_restart, other than kexec there are no users
for arm64. So surely kexec is the only case to cater for at the moment?

> Another patch in this series '[PATCH 7/8] arm64/kexec: Add checks for
> KVM' assures kexec cannot happen when KVM is configured.

It would be better to just move this earlier (or even better, implement kvm
teardown).

Mark.
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > index 99c319c..4f23a48 100644 > --- a/arch/arm64/include/asm/virt.h > +++ b/arch/arm64/include/asm/virt.h > @@ -41,6 +41,19 @@ > > #define HVC_CALL_HYP 3 > > +/* > + * HVC_CALL_FUNC - Execute a function at EL2. > + * > + * @x0: Physical address of the function to be executed. > + * @x1: Passed as the first argument to the function. > + * @x2: Passed as the second argument to the function. > + * @x3: Passed as the third argument to the function. > + * > + * The called function must preserve the contents of register x18. Can you pick a register that's normally callee saved? > + */ > + > +#define HVC_CALL_FUNC 4 > + > #ifndef __ASSEMBLY__ > > /* > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > index e3db3fd..b5d36e7 100644 > --- a/arch/arm64/kernel/hyp-stub.S > +++ b/arch/arm64/kernel/hyp-stub.S > @@ -66,9 +66,20 @@ el1_sync: > mrs x0, vbar_el2 > b 2f > > -1: cmp x18, #HVC_SET_VECTORS > - b.ne 2f > - msr vbar_el2, x0 > +1: cmp x18, #HVC_SET_VECTORS This line doesn't seem to have any change, apart from some whitespace. Or did you want to drop the label? > + b.ne 1f > + msr vbar_el2, x0 > + b 2f > + > +1: cmp x18, #HVC_CALL_FUNC > + b.ne 2f > + mov x18, lr > + mov lr, x0 > + mov x0, x1 > + mov x1, x2 > + mov x2, x3 > + blr lr > + mov lr, x18 > > 2: eret > ENDPROC(el1_sync) What is the calling convention for this HVC? You mentioned x18 above but what about other registers that the called function may corrupt (x18 is a temporary register, so it's not expected to be callee saved).
On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > ENTRY(cpu_reset) > - mrs x1, sctlr_el1 > - bic x1, x1, #1 > - msr sctlr_el1, x1 // disable the MMU > + mrs x2, sctlr_el1 > + bic x2, x2, #1 > + msr sctlr_el1, x2 // disable the MMU > isb > - ret x0 > + > + cbz x0, 1f // el2_switch? > + mov x0, x1 > + mov x1, xzr > + mov x2, xzr > + mov x3, xzr > + hvc #HVC_CALL_FUNC // no return If that's the only user of HVC_CALL_FUNC, why do we bother with arguments, calling convention?
On Tue, Jan 27, 2015 at 05:39:47PM +0000, Catalin Marinas wrote: > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > index 99c319c..4f23a48 100644 > > --- a/arch/arm64/include/asm/virt.h > > +++ b/arch/arm64/include/asm/virt.h > > @@ -41,6 +41,19 @@ > > > > #define HVC_CALL_HYP 3 > > > > +/* > > + * HVC_CALL_FUNC - Execute a function at EL2. > > + * > > + * @x0: Physical address of the function to be executed. > > + * @x1: Passed as the first argument to the function. > > + * @x2: Passed as the second argument to the function. > > + * @x3: Passed as the third argument to the function. > > + * > > + * The called function must preserve the contents of register x18. > > Can you pick a register that's normally callee saved? We're in the hyp-stub, so we don't have a stack in EL2. Therefore we can't stack any of the existing callee-saved register values in order to be able to use them. One way to avoid that would be to have an asm block which issues the HVC at EL1 stack/unstack the LR around the HVC. Then we're free to corrupt the LR at EL2 in order to call the provided function. [...] > > +1: cmp x18, #HVC_CALL_FUNC > > + b.ne 2f > > + mov x18, lr > > + mov lr, x0 > > + mov x0, x1 > > + mov x1, x2 > > + mov x2, x3 > > + blr lr > > + mov lr, x18 > > > > 2: eret > > ENDPROC(el1_sync) > > What is the calling convention for this HVC? You mentioned x18 above but > what about other registers that the called function may corrupt (x18 is > a temporary register, so it's not expected to be callee saved). Other than x18, the usual PCS rules apply here. We don't have a stack, so the function we call can't make a nested call to anything else. Mark.
Hi Mark, On Tue, 2015-01-27 at 16:46 +0000, Mark Rutland wrote: > On Mon, Jan 26, 2015 at 09:48:48PM +0000, Geoff Levand wrote: > > This conditional is just if KVM, do things the old way (don't try to > > switch exception levels). It is to handle the system shutdown case. > > Having grepped treewide for soft_restart, other than kexec there are no > users for arm64. So surely kexec is the only case to cater for at the > moment? Yes, I think you're right, and so it seems we can drop this patch and just have the 'Add checks for KVM' patch. > > Another patch in this series '[PATCH 7/8] arm64/kexec: Add checks for > > KVM' assures kexec cannot happen when KVM is configured. > > It would be better to just move this earlier (or even better, implement > kvm teardown). Yes, I hope we don't really need to have any KVM work-arounds. -Geoff
Hello, On 01/27/2015 04:19 AM, Mark Rutland wrote: > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: >> Add runtime checks that fail the arm64 kexec syscall for situations that would >> result in system instability do to problems in the KVM kernel support. >> These checks should be removed when the KVM problems are resolved fixed. >> >> Signed-off-by: Geoff Levand <geoff@infradead.org> >> --- >> arch/arm64/kernel/machine_kexec.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c >> index 3d84759..a36459d 100644 >> --- a/arch/arm64/kernel/machine_kexec.c >> +++ b/arch/arm64/kernel/machine_kexec.c >> @@ -16,6 +16,9 @@ >> #include <asm/cacheflush.h> >> #include <asm/system_misc.h> >> >> +/* TODO: Remove this include when KVM can support a kexec reboot. */ >> +#include <asm/virt.h> >> + >> /* Global variables for the relocate_kernel routine. */ >> extern const unsigned char relocate_new_kernel[]; >> extern const unsigned long relocate_new_kernel_size; >> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image) >> >> kexec_image_info(image); >> >> + /* TODO: Remove this message when KVM can support a kexec reboot. */ >> + if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) { >> + pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n", >> + __func__); >> + return -ENOSYS; >> + } > > If you really don't want to implement KVM teardown, surely this should > be at the start of the series, so we don't have a point in the middle > where things may explode in this case? I'm going to fix this KVM issue (teardown) in cooperation with Geoff. 
Looking into kvm init code, kvm_arch_init() in particular, I guess that teardown function (kvm_arch_exit()) should do

(reverting kvm_timer_hyp_init() per cpu)
- stop arch timer

(reverting cpu_init_hyp_mode() per cpu)
- flush TLB
- jump into identical mapping (using boot_hyp_pgd?)
- disable MMU?
- restore vbar_el2 to __hyp_stub_vectors (or NULL?)

(reverting kvm_mmu_init())
- Do we need to free page tables and a bounce(trampoline) page?

Is this good enough for safely shutting down kvm? Do I miss anything essential, or can I skip anything?

I really appreciate your comments.

-Takahiro AKASHI

> Mark.
On 29/01/15 09:57, AKASHI Takahiro wrote: > Hello, > > On 01/27/2015 04:19 AM, Mark Rutland wrote: >> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: >>> Add runtime checks that fail the arm64 kexec syscall for situations that would >>> result in system instability do to problems in the KVM kernel support. >>> These checks should be removed when the KVM problems are resolved fixed. >>> >>> Signed-off-by: Geoff Levand <geoff@infradead.org> >>> --- >>> arch/arm64/kernel/machine_kexec.c | 10 ++++++++++ >>> 1 file changed, 10 insertions(+) >>> >>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c >>> index 3d84759..a36459d 100644 >>> --- a/arch/arm64/kernel/machine_kexec.c >>> +++ b/arch/arm64/kernel/machine_kexec.c >>> @@ -16,6 +16,9 @@ >>> #include <asm/cacheflush.h> >>> #include <asm/system_misc.h> >>> >>> +/* TODO: Remove this include when KVM can support a kexec reboot. */ >>> +#include <asm/virt.h> >>> + >>> /* Global variables for the relocate_kernel routine. */ >>> extern const unsigned char relocate_new_kernel[]; >>> extern const unsigned long relocate_new_kernel_size; >>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image) >>> >>> kexec_image_info(image); >>> >>> + /* TODO: Remove this message when KVM can support a kexec reboot. */ >>> + if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) { >>> + pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n", >>> + __func__); >>> + return -ENOSYS; >>> + } >> >> If you really don't want to implement KVM teardown, surely this should >> be at the start of the series, so we don't have a point in the middle >> where things may explode in this case? > > I'm going to fix this KVM issue (teardown) in cooperation with Geoff. 
> Looking into kvm init code, kvm_arch_init() in particular,
> I guess that teardown function (kvm_arch_exit()) should do
> (reverting kvm_timer_hyp_init() per cpu)
> - stop arch timer

No need, there shouldn't be any guest running at that point.

> (reverting cpu_init_hyp_mode() per cpu)
> - flush TLB
> - jump into identical mapping (using boot_hyp_pgd?)
> - disable MMU?

Yes, and for that, you need to go back to an idmap.

> - restore vbar_el2 to __hyp_stub_vectors (or NULL?)

It doesn't matter, you want to stay in HYP for the next kernel.

> (reverting kvm_mmu_init())
> - Do we need to free page tables and a bounce(trampoline) page?

I don't think that's useful, you've killed the kernel already.

> Is this good enough for safely shutting down kvm?
> Do I miss anything essential, or can I skip anything?

I've outlined the steps a couple of days ago there: http://www.spinics.net/lists/arm-kernel/msg395177.html and the outline above should give you a nice list of things to look at. All in all, it is pretty straightforward, provided that you're careful enough about (commit 5a677ce044f has most of the information, just read it backward... ;-).

Thanks,

M.
On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote: > > On 29/01/15 09:57, AKASHI Takahiro wrote: > > Hello, > > > > On 01/27/2015 04:19 AM, Mark Rutland wrote: > >> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > >>> Add runtime checks that fail the arm64 kexec syscall for situations that would > >>> result in system instability do to problems in the KVM kernel support. > >>> These checks should be removed when the KVM problems are resolved fixed. > >>> > >>> Signed-off-by: Geoff Levand <geoff@infradead.org> > >>> --- > >>> arch/arm64/kernel/machine_kexec.c | 10 ++++++++++ > >>> 1 file changed, 10 insertions(+) > >>> > >>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c > >>> index 3d84759..a36459d 100644 > >>> --- a/arch/arm64/kernel/machine_kexec.c > >>> +++ b/arch/arm64/kernel/machine_kexec.c > >>> @@ -16,6 +16,9 @@ > >>> #include <asm/cacheflush.h> > >>> #include <asm/system_misc.h> > >>> > >>> +/* TODO: Remove this include when KVM can support a kexec reboot. */ > >>> +#include <asm/virt.h> > >>> + > >>> /* Global variables for the relocate_kernel routine. */ > >>> extern const unsigned char relocate_new_kernel[]; > >>> extern const unsigned long relocate_new_kernel_size; > >>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image) > >>> > >>> kexec_image_info(image); > >>> > >>> + /* TODO: Remove this message when KVM can support a kexec reboot. */ > >>> + if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) { > >>> + pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n", > >>> + __func__); > >>> + return -ENOSYS; > >>> + } > >> > >> If you really don't want to implement KVM teardown, surely this should > >> be at the start of the series, so we don't have a point in the middle > >> where things may explode in this case? > > > > I'm going to fix this KVM issue (teardown) in cooperation with Geoff. 
> > > > Looking into kvm init code, kvm_arch_init() in particular, > > I guess that teardown function (kvm_arch_exit()) should do > > (reverting kvm_timer_hyp_init() per cpu) > > - stop arch timer > > No need, there shouldn't be any guest running at that point. > > > (reverting cpu_init_hyp_mode() per cpu) > > - flush TLB > > - jump into identical mapping (using boot_hyp_pgd?) > > - disable MMU? > > Yes, and for that, you need to go back to an idmap > > > - restore vbar_el2 to __hyp_stub_vectors (or NULL?) > > It doesn't matter, you want to stay in HYP for the next kernel. Well, it depends on how the teardown is implemented. I'd imagined we'd have KVM tear itself down (restoring the hyp stub) as part of shut down. Later kexec/soft_restart would call the stub to get back to EL2. That way kexec doesn't need to know anything about KVM, and it has one path that works regardless of whether KVM is compiled into the kernel. Mark.
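Putting the points above together, the per-CPU teardown might be shaped roughly as follows. This is pseudocode only: the names below are invented for illustration, and per the discussion the EL2 part has to run from the idmap.

```c
/* Pseudocode sketch only; kvm_cpu_teardown() and HVC_KVM_TEARDOWN are
 * hypothetical names from this discussion, not real kernel symbols. */
static void kvm_cpu_teardown(void *unused)
{
	/* No guests may run at this point, so no timer/vcpu work needed. */
	hvc(HVC_KVM_TEARDOWN);	/* trap into KVM's EL2 vectors */

	/*
	 * At EL2, running from the idmap:
	 *   - invalidate the EL2 TLBs
	 *   - disable the EL2 MMU (clear SCTLR_EL2.M)
	 *   - set vbar_el2 back to __hyp_stub_vectors
	 *
	 * Once the generic hyp stub is live again, a later
	 * soft_restart() can reach EL2 without knowing about KVM.
	 */
}
```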
Hello Marc, Mark Thank you for your useful comments. I need to look at Marc's commit(5a677ce044f) more carefully about trampoline code. On 01/30/2015 03:47 AM, Mark Rutland wrote: > On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote: >> >> On 29/01/15 09:57, AKASHI Takahiro wrote: >>> Hello, >>> >>> On 01/27/2015 04:19 AM, Mark Rutland wrote: >>>> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: >>>>> Add runtime checks that fail the arm64 kexec syscall for situations that would >>>>> result in system instability do to problems in the KVM kernel support. >>>>> These checks should be removed when the KVM problems are resolved fixed. >>>>> >>>>> Signed-off-by: Geoff Levand <geoff@infradead.org> >>>>> --- >>>>> arch/arm64/kernel/machine_kexec.c | 10 ++++++++++ >>>>> 1 file changed, 10 insertions(+) >>>>> >>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c >>>>> index 3d84759..a36459d 100644 >>>>> --- a/arch/arm64/kernel/machine_kexec.c >>>>> +++ b/arch/arm64/kernel/machine_kexec.c >>>>> @@ -16,6 +16,9 @@ >>>>> #include <asm/cacheflush.h> >>>>> #include <asm/system_misc.h> >>>>> >>>>> +/* TODO: Remove this include when KVM can support a kexec reboot. */ >>>>> +#include <asm/virt.h> >>>>> + >>>>> /* Global variables for the relocate_kernel routine. */ >>>>> extern const unsigned char relocate_new_kernel[]; >>>>> extern const unsigned long relocate_new_kernel_size; >>>>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image) >>>>> >>>>> kexec_image_info(image); >>>>> >>>>> + /* TODO: Remove this message when KVM can support a kexec reboot. 
*/ >>>>> + if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) { >>>>> + pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n", >>>>> + __func__); >>>>> + return -ENOSYS; >>>>> + } >>>> >>>> If you really don't want to implement KVM teardown, surely this should >>>> be at the start of the series, so we don't have a point in the middle >>>> where things may explode in this case? >>> >>> I'm going to fix this KVM issue (teardown) in cooperation with Geoff. >>> >>> Looking into kvm init code, kvm_arch_init() in particular, >>> I guess that teardown function (kvm_arch_exit()) should do >>> (reverting kvm_timer_hyp_init() per cpu) >>> - stop arch timer >> >> No need, there shouldn't be any guest running at that point. >> >>> (reverting cpu_init_hyp_mode() per cpu) >>> - flush TLB >>> - jump into identical mapping (using boot_hyp_pgd?) >>> - disable MMU? >> >> Yes, and for that, you need to go back to an idmap >> >>> - restore vbar_el2 to __hyp_stub_vectors (or NULL?) >> >> It doesn't matter, you want to stay in HYP for the next kernel. > > Well, it depends on how the teardown is implemented. I'd imagined we'd > have KVM tear itself down (restoring the hyp stub) as part of shut down. > Later kexec/soft_restart would call the stub to get back to EL2. > > That way kexec doesn't need to know anything about KVM, and it has one > path that works regardless of whether KVM is compiled into the kernel. Initially, I thought that we would define kvm_arch_exit() and call it somewhere in the middle of kexec path (no idea yet). But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??), and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by machine_shutdown() from kernel_kexec(). (As you know, a hook is already there for cpu online in kvm/arm.c.) Is this the right place to put teardown function? (I'm not sure if this hook will be called also on a boot cpu.) 
Thanks, -Takahiro AKASHI > Mark. >
On Fri, Jan 30, 2015 at 06:10:53AM +0000, AKASHI Takahiro wrote: > Hello Marc, Mark > > Thank you for your useful comments. > I need to look at Marc's commit(5a677ce044f) more carefully > about trampoline code. > > On 01/30/2015 03:47 AM, Mark Rutland wrote: > > On Thu, Jan 29, 2015 at 10:59:45AM +0000, Marc Zyngier wrote: > >> > >> On 29/01/15 09:57, AKASHI Takahiro wrote: > >>> Hello, > >>> > >>> On 01/27/2015 04:19 AM, Mark Rutland wrote: > >>>> On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > >>>>> Add runtime checks that fail the arm64 kexec syscall for situations that would > >>>>> result in system instability do to problems in the KVM kernel support. > >>>>> These checks should be removed when the KVM problems are resolved fixed. > >>>>> > >>>>> Signed-off-by: Geoff Levand <geoff@infradead.org> > >>>>> --- > >>>>> arch/arm64/kernel/machine_kexec.c | 10 ++++++++++ > >>>>> 1 file changed, 10 insertions(+) > >>>>> > >>>>> diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c > >>>>> index 3d84759..a36459d 100644 > >>>>> --- a/arch/arm64/kernel/machine_kexec.c > >>>>> +++ b/arch/arm64/kernel/machine_kexec.c > >>>>> @@ -16,6 +16,9 @@ > >>>>> #include <asm/cacheflush.h> > >>>>> #include <asm/system_misc.h> > >>>>> > >>>>> +/* TODO: Remove this include when KVM can support a kexec reboot. */ > >>>>> +#include <asm/virt.h> > >>>>> + > >>>>> /* Global variables for the relocate_kernel routine. */ > >>>>> extern const unsigned char relocate_new_kernel[]; > >>>>> extern const unsigned long relocate_new_kernel_size; > >>>>> @@ -100,6 +103,13 @@ int machine_kexec_prepare(struct kimage *image) > >>>>> > >>>>> kexec_image_info(image); > >>>>> > >>>>> + /* TODO: Remove this message when KVM can support a kexec reboot. 
*/ > >>>>> + if (IS_ENABLED(CONFIG_KVM) && is_hyp_mode_available()) { > >>>>> + pr_err("%s: Your kernel is configured with KVM support (CONFIG_KVM=y) which currently does not allow for kexec re-boot.\n", > >>>>> + __func__); > >>>>> + return -ENOSYS; > >>>>> + } > >>>> > >>>> If you really don't want to implement KVM teardown, surely this should > >>>> be at the start of the series, so we don't have a point in the middle > >>>> where things may explode in this case? > >>> > >>> I'm going to fix this KVM issue (teardown) in cooperation with Geoff. > >>> > >>> Looking into kvm init code, kvm_arch_init() in particular, > >>> I guess that teardown function (kvm_arch_exit()) should do > >>> (reverting kvm_timer_hyp_init() per cpu) > >>> - stop arch timer > >> > >> No need, there shouldn't be any guest running at that point. > >> > >>> (reverting cpu_init_hyp_mode() per cpu) > >>> - flush TLB > >>> - jump into identical mapping (using boot_hyp_pgd?) > >>> - disable MMU? > >> > >> Yes, and for that, you need to go back to an idmap > >> > >>> - restore vbar_el2 to __hyp_stub_vectors (or NULL?) > >> > >> It doesn't matter, you want to stay in HYP for the next kernel. > > > > Well, it depends on how the teardown is implemented. I'd imagined we'd > > have KVM tear itself down (restoring the hyp stub) as part of shut down. > > Later kexec/soft_restart would call the stub to get back to EL2. > > > > That way kexec doesn't need to know anything about KVM, and it has one > > path that works regardless of whether KVM is compiled into the kernel. > > Initially, I thought that we would define kvm_arch_exit() and call it > somewhere in the middle of kexec path (no idea yet). > But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??), > and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by > machine_shutdown() from kernel_kexec(). > (As you know, a hook is already there for cpu online in kvm/arm.c.) 
I think it would make far more sense to have the KVM teardown entirely contained within KVM (e.g. in kvm_arch_exit). That way there's no need for an interdependency between KVM and kexec, and we'd be a step closer to KVM as a module. I'm not keen on having KVM teardown in the middle of the kexec path. Mark.
Hi Takahiro.

On Fri, 2015-01-30 at 15:10 +0900, AKASHI Takahiro wrote:
> Initially, I thought that we would define kvm_arch_exit() and call it
> somewhere in the middle of kexec path (no idea yet).
> But Geoff suggested me to implement a new hvc call, HVC_CPU_SHUTDOWN(??),
> and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by
> machine_shutdown() from kernel_kexec().

As an initial implementation we can hook into the CPU_DYING_FROZEN notifier sent to hyp_init_cpu_notify(). The longer term solution should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable().

The calls to cpu_notifier(CPU_DYING_FROZEN) are part of cpu hot plug, and independent of kexec. If someone were to add spin-table cpu un-plug, then it would be used for that also. It seems we should be able to test without kexec by using cpu hot plug.

To tear down KVM you need to get back to hyp mode, and hence the need for HVC_CPU_SHUTDOWN. The sequence I envisioned would be like this:

  cpu_notifier(CPU_DYING_FROZEN)
    -> kvm_cpu_shutdown()
       prepare for hvc
    -> HVC_CPU_SHUTDOWN
       now in hyp mode, do KVM tear down, restore default exception vectors

Once the default exception vectors are restored soft_restart() can then execute the cpu_reset routine in EL2.

Some notes are here for those with access: https://cards.linaro.org/browse/KWG-611

-Geoff
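A rough sketch of the hook-up described above, as pseudocode: kvm_cpu_shutdown() and HVC_CPU_SHUTDOWN are proposed names that do not exist in the tree at this point, so treat this as an outline rather than an implementation.

```c
/* Pseudocode sketch: extend the existing KVM cpu notifier so the
 * CPU_DYING path mirrors the CPU_STARTING one. */
static int hyp_init_cpu_notify(struct notifier_block *self,
			       unsigned long action, void *cpu)
{
	switch (action) {
	case CPU_STARTING:
	case CPU_STARTING_FROZEN:
		cpu_init_hyp_mode(NULL);	/* existing online path */
		break;
	case CPU_DYING:
	case CPU_DYING_FROZEN:
		kvm_cpu_shutdown();		/* proposed: issue
						 * hvc #HVC_CPU_SHUTDOWN,
						 * tear down EL2 state,
						 * restore the hyp stub */
		break;
	}
	return NOTIFY_OK;
}
```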
Hi Catalin, On Tue, 2015-01-27 at 17:57 +0000, Catalin Marinas wrote: > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > > ENTRY(cpu_reset) > > - mrs x1, sctlr_el1 > > - bic x1, x1, #1 > > - msr sctlr_el1, x1 // disable the MMU > > + mrs x2, sctlr_el1 > > + bic x2, x2, #1 > > + msr sctlr_el1, x2 // disable the MMU > > isb > > - ret x0 > > + > > + cbz x0, 1f // el2_switch? > > + mov x0, x1 > > + mov x1, xzr > > + mov x2, xzr > > + mov x3, xzr > > + hvc #HVC_CALL_FUNC // no return > > If that's the only user of HVC_CALL_FUNC, why do we bother with > arguments, calling convention? It was intended that HVC_CALL_FUNC be a mechanism to call a generic function in hyp mode. cpu_reset() is the only user of it now. As Mark explained in another post, the use of x18 by the hyp stub is to avoid the complication of setting up a stack for EL2. We thought this solution was acceptable since there are relatively few HVC_CALL_FUNC calls and we could assure they would do the right thing. -Geoff
Hi Catalin, On Tue, 2015-01-27 at 17:39 +0000, Catalin Marinas wrote: > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > index 99c319c..4f23a48 100644 > > --- a/arch/arm64/include/asm/virt.h > > +++ b/arch/arm64/include/asm/virt.h > > @@ -41,6 +41,19 @@ > > > > #define HVC_CALL_HYP 3 > > > > +/* > > + * HVC_CALL_FUNC - Execute a function at EL2. > > + * > > + * @x0: Physical address of the function to be executed. > > + * @x1: Passed as the first argument to the function. > > + * @x2: Passed as the second argument to the function. > > + * @x3: Passed as the third argument to the function. > > + * > > + * The called function must preserve the contents of register x18. > > Can you pick a register that's normally callee saved? Mark covered this in his reply. > > + */ > > + > > +#define HVC_CALL_FUNC 4 > > + > > #ifndef __ASSEMBLY__ > > > > /* > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > > index e3db3fd..b5d36e7 100644 > > --- a/arch/arm64/kernel/hyp-stub.S > > +++ b/arch/arm64/kernel/hyp-stub.S > > @@ -66,9 +66,20 @@ el1_sync: > > mrs x0, vbar_el2 > > b 2f > > > > -1: cmp x18, #HVC_SET_VECTORS > > - b.ne 2f > > - msr vbar_el2, x0 > > +1: cmp x18, #HVC_SET_VECTORS > > This line doesn't seem to have any change, apart from some whitespace. > Or did you want to drop the label? Some whitespace problems got in there from so many rebases. I went through the series and cleaned them up. > > + b.ne 1f > > + msr vbar_el2, x0 > > + b 2f > > + > > +1: cmp x18, #HVC_CALL_FUNC > > + b.ne 2f > > + mov x18, lr > > + mov lr, x0 > > + mov x0, x1 > > + mov x1, x2 > > + mov x2, x3 > > + blr lr > > + mov lr, x18 > > > > 2: eret > > ENDPROC(el1_sync) > > What is the calling convention for this HVC? You mentioned x18 above but > what about other registers that the called function may corrupt (x18 is > a temporary register, so it's not expected to be callee saved). 
Again, Mark covered this in his reply.
On Mon, 2015-01-26 at 18:26 +0000, Catalin Marinas wrote: > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > > To allow for additional hcalls to be defined and to make the arm64 hcall API > > more consistent across exception vector routines, change the hcall implementations > > to use the ISS field of the ESR_EL2 register to specify the hcall type. > > > > The existing arm64 hcall implementations are limited in that they only allow > > for two distinct hcalls; with the x0 register either zero, or not zero. Also, > > the API of the hyp-stub exception vector routines and the KVM exception vector > > routines differ; hyp-stub uses a non-zero value in x0 to implement > > __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp. > > > > Define three new preprocessor macros HVC_GET_VECTORS, HVC_SET_VECTORS and > > HVC_CALL_HYP and to be used as hcall type specifiers and convert the > > existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines > > to use these new macros when executing an HVC call. Also change the > > corresponding hyp-stub and KVM el1_sync exception vector routines to use these > > new macros. > > > > Signed-off-by: Geoff Levand <geoff@infradead.org> > > Using the #imm value for HVC to separate what gets called looks fine to > me. However, I'd like to see a review from Marc/Christoffer on this > patch. Marc, Christoffer, comments please? > Some comments below: > > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > index 7a5df52..99c319c 100644 > > --- a/arch/arm64/include/asm/virt.h > > +++ b/arch/arm64/include/asm/virt.h > > @@ -21,6 +21,26 @@ > > #define BOOT_CPU_MODE_EL1 (0xe11) > > #define BOOT_CPU_MODE_EL2 (0xe12) > > > > +/* > > + * HVC_GET_VECTORS - Return the value of the vbar_el2 register. > > + */ > > + > > +#define HVC_GET_VECTORS 1 > > + > > +/* > > + * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
> > + * > > + * @x0: Physical address of the new vector table. > > + */ > > + > > +#define HVC_SET_VECTORS 2 > > + > > +/* > > + * HVC_CALL_HYP - Execute a hyp routine. > > + */ > > + > > +#define HVC_CALL_HYP 3 > > I think you can ignore this case (make it the default), just define it > as 0 as that's the normal use-case after initialisation and avoid > checking it explicitly. OK, I changed this so that HVC_CALL_HYP is the default at 0. > > /* > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > > index a272f33..e3db3fd 100644 > > --- a/arch/arm64/kernel/hyp-stub.S > > +++ b/arch/arm64/kernel/hyp-stub.S > > @@ -22,6 +22,7 @@ > > #include <linux/irqchip/arm-gic-v3.h> > > > > #include <asm/assembler.h> > > +#include <asm/kvm_arm.h> > > #include <asm/ptrace.h> > > #include <asm/virt.h> > > > > @@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors) > > .align 11 > > > > el1_sync: > > - mrs x1, esr_el2 > > - lsr x1, x1, #26 > > - cmp x1, #0x16 > > - b.ne 2f // Not an HVC trap > > - cbz x0, 1f > > - msr vbar_el2, x0 // Set vbar_el2 > > + mrs x18, esr_el2 > > + lsr x17, x18, #ESR_ELx_EC_SHIFT > > + and x18, x18, #ESR_ELx_ISS_MASK > > + > > + cmp x17, #ESR_ELx_EC_HVC64 > > + b.ne 2f // Not an HVC trap > > + > > + cmp x18, #HVC_GET_VECTORS > > + b.ne 1f > > + mrs x0, vbar_el2 > > b 2f > > -1: mrs x0, vbar_el2 // Return vbar_el2 > > + > > +1: cmp x18, #HVC_SET_VECTORS > > + b.ne 2f > > + msr vbar_el2, x0 > > + > > 2: eret > > ENDPROC(el1_sync) > > You seem to be using x17 and x18 here freely. Do you have any guarantees > that the caller saved/restored those registers? I guess you assume they > are temporary registers and the caller first branches to a function > (like __kvm_hyp_call) and expects them to be corrupted. But I'm not sure > that's always the case. Take for example the __invoke_psci_fn_hvc where > the function is in C (we should change this for other reasons). Yes, I assume the compiler will not expect them to be preserved. 
I missed __invoke_psci_fn_hvc. Can we just add x17 and x18 to the clobbered list?

	asm volatile(
			__asmeq("%0", "x0")
			__asmeq("%1", "x1")
			__asmeq("%2", "x2")
			__asmeq("%3", "x3")
			"hvc	#0\n"
		: "+r" (function_id)
-		: "r" (arg0), "r" (arg1), "r" (arg2));
+		: "r" (arg0), "r" (arg1), "r" (arg2)
+		: "x17", "x18");

> > diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S > > index c0d8202..1916c89 100644 > > --- a/arch/arm64/kvm/hyp.S > > +++ b/arch/arm64/kvm/hyp.S > > @@ -27,6 +27,7 @@ > > #include <asm/kvm_asm.h> > > #include <asm/kvm_mmu.h> > > #include <asm/memory.h> > > +#include <asm/virt.h> > > > > #define CPU_GP_REG_OFFSET(x) (CPU_GP_REGS + x) > > #define CPU_XREG_OFFSET(x) CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x) > > @@ -1106,12 +1107,9 @@ __hyp_panic_str: > > * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c). Return values are > > * passed in r0 and r1. > > * > > - * A function pointer with a value of 0 has a special meaning, and is > > - * used to implement __hyp_get_vectors in the same way as in > > - * arch/arm64/kernel/hyp_stub.S. > > */ > > ENTRY(kvm_call_hyp) > > - hvc #0 > > + hvc #HVC_CALL_HYP > > ret > > ENDPROC(kvm_call_hyp) > > > > @@ -1142,6 +1140,7 @@ el1_sync: // Guest trapped into EL2 > > > > mrs x1, esr_el2 > > lsr x2, x1, #ESR_ELx_EC_SHIFT > > + and x0, x1, #ESR_ELx_ISS_MASK > > > > cmp x2, #ESR_ELx_EC_HVC64 > > b.ne el1_trap > > @@ -1150,15 +1149,19 @@ el1_sync: // Guest trapped into EL2 > > cbnz x3, el1_trap // called HVC > > > > /* Here, we're pretty sure the host called HVC. */ > > + mov x18, x0 > > Same comment here about corrupting x18. If it is safe, maybe add some > comments in the calling place. I added a comment regarding this to virt.h where the HVC_XXX macros are defined. I'll post that fixed up patch for review.
> > > pop x2, x3 > > pop x0, x1 > > > > - /* Check for __hyp_get_vectors */ > > - cbnz x0, 1f > > + cmp x18, #HVC_GET_VECTORS > > + b.ne 1f > > mrs x0, vbar_el2 > > b 2f > > > > -1: push lr, xzr > > +1: cmp x18, #HVC_CALL_HYP > > + b.ne 2f > > + > > + push lr, xzr > > At this point, we expect either HVC_GET_VECTORS or HVC_CALL_HYP. I think > you can simply assume HVC_CALL_HYP as default and ignore the additional > cmp. OK, did that. -Geoff
Geoff, On 01/31/2015 04:48 AM, Geoff Levand wrote: > Hi Takahiro. > > On Fri, 2015-01-30 at 15:10 +0900, AKASHI Takahiro wrote: >> Initially, I thought that we would define kvm_arch_exit() and call it >> somewhere in the middle of kexec path (no idea yet). >> But Geoff suggested that I implement a new hvc call, HVC_CPU_SHUTDOWN(??), >> and make it called via cpu_notifier(CPU_DYING_FROZEN) initiated by >> machine_shutdown() from kernel_kexec(). > > As an initial implementation we can hook into the CPU_DYING_FROZEN > notifier sent to hyp_init_cpu_notify(). The longer term solution > should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable(). Are these two different approaches? I mean, kexec will initiate cpu hotplug: kernel_kexec() -> machine_shutdown() -> disable_nonboot_cpus() -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...) On the other hand, kvm already has a hook into kvm_arch_hardware_disable(): (ignoring kvm_usage_count here) kvm_cpu_hotplug(CPU_DYING) -> hardware_disable() -> hardware_disable_nolock() -> kvm_arch_hardware_disable() So it seems that we don't have to add a new hook at hyp_init_cpu_notify() if kvm_arch_hardware_disable() is properly implemented. disable_nonboot_cpus() will not invoke cpu hotplug on the *boot* cpu, though, and we should handle it in a separate way. Do I misunderstand anything here? -Takahiro AKASHI > The calls to cpu_notifier(CPU_DYING_FROZEN) are part of cpu hot > plug, and independent of kexec. If someone were to add spin-table > cpu un-plug, then it would be used for that also. It seems we should > be able to test without kexec by using cpu hot plug. > > To tear down KVM you need to get back to hyp mode, and hence > the need for HVC_CPU_SHUTDOWN. 
The sequence I envisioned would > be like this: > > cpu_notifier(CPU_DYING_FROZEN) > -> kvm_cpu_shutdown() > prepare for hvc > -> HVC_CPU_SHUTDOWN > now in hyp mode, do KVM tear down, restore default exception vectors > > Once the default exception vectors are restored soft_restart() > can then execute the cpu_reset routine in EL2. > > Some notes are here for those with access: https://cards.linaro.org/browse/KWG-611 > > -Geoff >
On Fri, Jan 30, 2015 at 11:31:21PM +0000, Geoff Levand wrote: > On Mon, 2015-01-26 at 18:26 +0000, Catalin Marinas wrote: > > On Sat, Jan 17, 2015 at 12:23:34AM +0000, Geoff Levand wrote: > > > /* > > > diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S > > > index a272f33..e3db3fd 100644 > > > --- a/arch/arm64/kernel/hyp-stub.S > > > +++ b/arch/arm64/kernel/hyp-stub.S > > > @@ -22,6 +22,7 @@ > > > #include <linux/irqchip/arm-gic-v3.h> > > > > > > #include <asm/assembler.h> > > > +#include <asm/kvm_arm.h> > > > #include <asm/ptrace.h> > > > #include <asm/virt.h> > > > > > > @@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors) > > > .align 11 > > > > > > el1_sync: > > > - mrs x1, esr_el2 > > > - lsr x1, x1, #26 > > > - cmp x1, #0x16 > > > - b.ne 2f // Not an HVC trap > > > - cbz x0, 1f > > > - msr vbar_el2, x0 // Set vbar_el2 > > > + mrs x18, esr_el2 > > > + lsr x17, x18, #ESR_ELx_EC_SHIFT > > > + and x18, x18, #ESR_ELx_ISS_MASK > > > + > > > + cmp x17, #ESR_ELx_EC_HVC64 > > > + b.ne 2f // Not an HVC trap > > > + > > > + cmp x18, #HVC_GET_VECTORS > > > + b.ne 1f > > > + mrs x0, vbar_el2 > > > b 2f > > > -1: mrs x0, vbar_el2 // Return vbar_el2 > > > + > > > +1: cmp x18, #HVC_SET_VECTORS > > > + b.ne 2f > > > + msr vbar_el2, x0 > > > + > > > 2: eret > > > ENDPROC(el1_sync) > > > > You seem to be using x17 and x18 here freely. Do you have any guarantees > > that the caller saved/restored those registers? I guess you assume they > > are temporary registers and the caller first branches to a function > > (like __kvm_hyp_call) and expects them to be corrupted. But I'm not sure > > that's always the case. Take for example the __invoke_psci_fn_hvc where > > the function is in C (we should change this for other reasons). > > Yes, I assume the compiler will not expect them to be preserved. I > missed __invoke_psci_fn_hvc. Can we just add x17 and x18 to the > clobbered list? 
> > asm volatile( > __asmeq("%0", "x0") > __asmeq("%1", "x1") > __asmeq("%2", "x2") > __asmeq("%3", "x3") > "hvc #0\n" > : "+r" (function_id) > - : "r" (arg0), "r" (arg1), "r" (arg2)); > + : "r" (arg0), "r" (arg1), "r" (arg2) > + : "x17", "x18"); I think we can ignore these because they would be called from a guest context and IIUC we would only clobber x18 on the host HVC side.
Hi Takahiro, On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote: > On 01/31/2015 04:48 AM, Geoff Levand wrote: > > As an initial implementation we can hook into the CPU_DYING_FROZEN > > notifier sent to hyp_init_cpu_notify(). The longer term solution > > should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable(). > > Are these two different approaches? Yes, these are two different solutions: one initial work-around, and a more involved proper solution. Hooking into the CPU_DYING_FROZEN notifier would be an initial fix. The proper solution would be to move the KVM setup to kvm_arch_hardware_enable(), and the shutdown to kvm_arch_hardware_disable(). > kernel_kexec() -> machine_shutdown() -> disable_nonboot_cpus() > -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...) > > On the other hand, kvm already has a hook into kvm_arch_hardware_disable(): > (ignoring kvm_usage_count here) > kvm_cpu_hotplug(CPU_DYING) -> hardware_disable() > -> hardware_disable_nolock() -> kvm_arch_hardware_disable() > > So it seems that we don't have to add a new hook at hyp_init_cpu_notify() > if kvm_arch_hardware_disable() is properly implemented. Yes, that is correct. But, as above, you would also need to update the KVM startup to use kvm_arch_hardware_enable(). > disable_nonboot_cpus() will not invoke cpu hotplug on the *boot* cpu, though, and > we should handle it in a separate way. IIRC, the secondary cpus go through PSCI on shutdown, and that path is working OK. Maybe I am mistaken though. The primary cpu shutdown (hyp stubs restored) is what is missing. The primary cpu goes through cpu_soft_restart(), and that is what is currently failing. -Geoff
On 02/06/2015 09:11 AM, Geoff Levand wrote: > Hi Takahiro, > > On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote: >> On 01/31/2015 04:48 AM, Geoff Levand wrote: >>> As an initial implementation we can hook into the CPU_DYING_FROZEN >>> notifier sent to hyp_init_cpu_notify(). The longer term solution >>> should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable(). >> >> Are these two different approaches? > > Yes, these are two different solutions: one initial work-around, and a > more involved proper solution. Hooking into the CPU_DYING_FROZEN > notifier would be an initial fix. The proper solution would be to move > the KVM setup to kvm_arch_hardware_enable(), and the shutdown to > kvm_arch_hardware_disable(). > > >> kernel_kexec() -> machine_shutdown() -> disable_nonboot_cpus() >> -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...) >> >> On the other hand, kvm already has a hook into kvm_arch_hardware_disable(): >> (ignoring kvm_usage_count here) >> kvm_cpu_hotplug(CPU_DYING) -> hardware_disable() >> -> hardware_disable_nolock() -> kvm_arch_hardware_disable() >> >> So it seems that we don't have to add a new hook at hyp_init_cpu_notify() >> if kvm_arch_hardware_disable() is properly implemented. > > Yes, that is correct. But, as above, you would also need to update the > KVM startup to use kvm_arch_hardware_enable(). > >> disable_nonboot_cpus() will not invoke cpu hotplug on the *boot* cpu, though, and >> we should handle it in a separate way. > > IIRC, the secondary cpus go through PSCI on shutdown, and that path > is working OK. Maybe I am mistaken though. If so, why should we add a hook at hyp_init_cpu_notify() as an initial work-around? > The primary cpu shutdown (hyp stubs restored) is what is missing. The > primary cpu goes through cpu_soft_restart(), and that is what is > currently failing. Yeah, we will call the teardown function manually in soft_restart(). -Takahiro AKASHI > > -Geoff >
On Fri, 2015-02-06 at 13:18 +0900, AKASHI Takahiro wrote: > On 02/06/2015 09:11 AM, Geoff Levand wrote: > > Hi Takahiro, > > > > On Mon, 2015-02-02 at 17:18 +0900, AKASHI Takahiro wrote: > >> On 01/31/2015 04:48 AM, Geoff Levand wrote: > >>> As an initial implementation we can hook into the CPU_DYING_FROZEN > >>> notifier sent to hyp_init_cpu_notify(). The longer term solution > >>> should use kvm_arch_hardware_enable() and kvm_arch_hardware_disable(). > >> > >> Are these two different approaches? > > > > Yes, these are two different solutions, One initial work-around, and a > > more involved proper solution. Hooking into the CPU_DYING_FROZEN > > notifier would be a initial fix. The proper solution would be to move > > the KVM setup to kvm_arch_hardware_enable(), and the shutdown to > > kvm_arch_hardware_disable(). > > > > > >> kernel_exec() -> machine_shutdown() -> disable_nonboot_cpu() > >> -> _cpu_down() -> cpu_notify_nofail(CPU_DEAD|...) > >> > >> On the other hand, kvm already has a hook into kvm_arch_hardware_disable(): > >> (ignoring kvm_usage_count here) > >> kvm_cpu_hotplug(CPU_DYING) -> hardware_disable() > >> -> hardware_disable_nolock() -> kvm_arch_hardware_disable() > >> > >> So it seems that we don't have to add a new hook at hyp_init_cpu_notify() > >> if kvm_arch_hardware_disable() is properly implemented. > > > > Yes, that is correct. But, as above, you would also need to update the > > KVM startup to use kvm_arch_hardware_enable(). > > > >> disable_nonboot_cpu() will not inovke cpu hotplug on *boot* cpu, and > >> we should handle it in a separate way though. > > > > IIRC, the secondary cpus go through PSCI on shutdown, and that path > > is working OK. Maybe I am mistaken though. > > If so, why should we add a hook at hyp_init_cpu_notify() as initial work-around? To tear down KVM on the primary cpu. > > The primary cpu shutdown (hyp stubs restored) is what is missing. 
The > > primary cpu goes through cpu_soft_restart(), and that is what is > > currently failing. > > Yeah, we will call the teardown function manually in soft_restart(); I think the KVM tear down should happen independently of soft_restart(). When soft_restart() is called, KVM should have already been torn down. -Geoff
On Fri, Jan 30, 2015 at 03:33:48PM -0800, Geoff Levand wrote: > To allow for additional hcalls to be defined and to make the arm64 hcall API > more consistent across exception vector routines, change the hcall implementations > to use the ISS field of the ESR_EL2 register to specify the hcall type. how does this make things more consistent? Do we have other examples of things using the immediate field which I'm missing? > > The existing arm64 hcall implementations are limited in that they only allow > for two distinct hcalls; with the x0 register either zero, or not zero. Also, > the API of the hyp-stub exception vector routines and the KVM exception vector > routines differ; hyp-stub uses a non-zero value in x0 to implement > __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp. this seems orthogonal to the use of the immediate field vs. x0 though, so why is the immediate field preferred again? > > Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and > HVC_SET_VECTORS to be used as hcall type specifiers and convert the > existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines > to use these new macros when executing an HVC call. Also change the > corresponding hyp-stub and KVM el1_sync exception vector routines to use these > new macros. 
> > Signed-off-by: Geoff Levand <geoff@infradead.org> > --- > arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++ > arch/arm64/kernel/hyp-stub.S | 32 +++++++++++++++++++++----------- > arch/arm64/kernel/psci.c | 3 ++- > arch/arm64/kvm/hyp.S | 16 +++++++++------- > 4 files changed, 59 insertions(+), 19 deletions(-) > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > index 7a5df52..eb10368 100644 > --- a/arch/arm64/include/asm/virt.h > +++ b/arch/arm64/include/asm/virt.h > @@ -18,6 +18,33 @@ > #ifndef __ASM__VIRT_H > #define __ASM__VIRT_H > > +/* > + * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to > + * specify the hcall type. The exception handlers are allowed to use registers > + * x17 and x18 in their implementation. Any routine issuing an hcall must not > + * expect these registers to be preserved. > + */ I thought the existing use of registers was based on the arm procedure call standard so we didn't have to worry about adding more caller-save registers. Don't we now have to start adding code around callers to make sure callers know that x17 and x18 may be clobbered? Thanks, -Christoffer
Hi Christoffer, On Thu, 2015-02-19 at 21:57 +0100, Christoffer Dall wrote: > On Fri, Jan 30, 2015 at 03:33:48PM -0800, Geoff Levand wrote: > > To allow for additional hcalls to be defined and to make the arm64 hcall API > > more consistent across exception vector routines, change the hcall implementations > > to use the ISS field of the ESR_EL2 register to specify the hcall type. > > how does this make things more consistent? Do we have other examples of > things using the immediate field which I'm missing? As I detail in the next paragraph, by consistent I mean in the API exposed, not in the implementation. This point is really secondary to the need for more hyper calls as I discuss below. > > The existing arm64 hcall implementations are limited in that they only allow > > for two distinct hcalls; with the x0 register either zero, or not zero. Also, > > the API of the hyp-stub exception vector routines and the KVM exception vector > > routines differ; hyp-stub uses a non-zero value in x0 to implement > > __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp. > > this seems orthogonal to the use of the immediate field vs. x0 though, > so why it the immediate field preferred again? When a CPU is reset via cpu_soft_restart() we need to execute the caller supplied reset routine at the exception level the kernel was entered at. So, for a kernel entered at EL2, we need a way to execute that routine at EL2. The current hyp-stub vector implementation, which uses x0, is limited to two hyper calls; __hyp_get_vectors and __hyp_set_vectors. To support cpu_soft_restart() we need a third hyper call, one which allows for code to be executed at EL2. My proposed use of the immediate value of the hvc instruction will allow for 2^16 distinct hyper calls. 
> > Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and > > HVC_SET_VECTORS to be used as hcall type specifiers and convert the > > existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines > > to use these new macros when executing an HVC call. Also change the > > corresponding hyp-stub and KVM el1_sync exception vector routines to use these > > new macros. > > > > Signed-off-by: Geoff Levand <geoff@infradead.org> > > --- > > arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++ > > arch/arm64/kernel/hyp-stub.S | 32 +++++++++++++++++++++----------- > > arch/arm64/kernel/psci.c | 3 ++- > > arch/arm64/kvm/hyp.S | 16 +++++++++------- > > 4 files changed, 59 insertions(+), 19 deletions(-) > > > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > index 7a5df52..eb10368 100644 > > --- a/arch/arm64/include/asm/virt.h > > +++ b/arch/arm64/include/asm/virt.h > > @@ -18,6 +18,33 @@ > > #ifndef __ASM__VIRT_H > > #define __ASM__VIRT_H > > > > +/* > > + * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to > > + * specify the hcall type. The exception handlers are allowed to use registers > > + * x17 and x18 in their implementation. Any routine issuing an hcall must not > > + * expect these registers to be preserved. > > + */ > > I thought the existing use of registers were based on the arm procedure > call standard so we didn't have to worry about adding more caller-save > registers. > > Don't we now have to start adding code around callers to make sure > callers know that x17 and x18 may be clobbered? We use x17 and x18 to allow hyper calls to work without a stack, which is needed for cpu_soft_restart(). The procedure call standard says that these are temporary registers, so a C compiler should not expect these to be preserved. -Geoff
On Wed, Feb 25, 2015 at 02:09:30PM -0800, Geoff Levand wrote: > Hi Christoffer, > > On Thu, 2015-02-19 at 21:57 +0100, Christoffer Dall wrote: > > On Fri, Jan 30, 2015 at 03:33:48PM -0800, Geoff Levand wrote: > > > To allow for additional hcalls to be defined and to make the arm64 hcall API > > > more consistent across exception vector routines, change the hcall implementations > > > to use the ISS field of the ESR_EL2 register to specify the hcall type. > > > > how does this make things more consistent? Do we have other examples of > > things using the immediate field which I'm missing? > > As I detail in the next paragraph, by consistent I mean in the API exposed, > not in the implementation. This point is really secondary to the need for > more hyper calls as I discuss below. > > > > The existing arm64 hcall implementations are limited in that they only allow > > > for two distinct hcalls; with the x0 register either zero, or not zero. Also, > > > the API of the hyp-stub exception vector routines and the KVM exception vector > > > routines differ; hyp-stub uses a non-zero value in x0 to implement > > > __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp. > > > > this seems orthogonal to the use of the immediate field vs. x0 though, > > so why it the immediate field preferred again? > > When a CPU is reset via cpu_soft_restart() we need to execute the caller > supplied reset routine at the exception level the kernel was entered at. > So, for a kernel entered at EL2, we need a way to execute that routine at > EL2. > > The current hyp-stub vector implementation, which uses x0, is limited > to two hyper calls; __hyp_get_vectors and __hyp_set_vectors. To > support cpu_soft_restart() we need a third hyper call, one which > allows for code to be executed at EL2. My proposed use of the > immediate value of the hvc instruction will allow for 2^16 distinct > hyper calls. > right, but using x0 allows for 2^64 distinct hypercalls. 
Just to be clear, I'm fine with using immediate field if there are no good reasons not to, I was just curious as to what direct benefit it has. After thinking about it a bit, from my point of view, the benefit would be the clarity that x0 is first argument like a normal procedure call, so no need to shift things around. Is this part of the equation or am I missing the overall purpose here? > > > Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and > > > HVC_SET_VECTORS to be used as hcall type specifiers and convert the > > > existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines > > > to use these new macros when executing an HVC call. Also change the > > > corresponding hyp-stub and KVM el1_sync exception vector routines to use these > > > new macros. > > > > > > Signed-off-by: Geoff Levand <geoff@infradead.org> > > > --- > > > arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++ > > > arch/arm64/kernel/hyp-stub.S | 32 +++++++++++++++++++++----------- > > > arch/arm64/kernel/psci.c | 3 ++- > > > arch/arm64/kvm/hyp.S | 16 +++++++++------- > > > 4 files changed, 59 insertions(+), 19 deletions(-) > > > > > > diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h > > > index 7a5df52..eb10368 100644 > > > --- a/arch/arm64/include/asm/virt.h > > > +++ b/arch/arm64/include/asm/virt.h > > > @@ -18,6 +18,33 @@ > > > #ifndef __ASM__VIRT_H > > > #define __ASM__VIRT_H > > > > > > +/* > > > + * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to > > > + * specify the hcall type. The exception handlers are allowed to use registers > > > + * x17 and x18 in their implementation. Any routine issuing an hcall must not > > > + * expect these registers to be preserved. > > > + */ > > > > I thought the existing use of registers were based on the arm procedure > > call standard so we didn't have to worry about adding more caller-save > > registers. 
> > > > Don't we now have to start adding code around callers to make sure > > callers know that x17 and x18 may be clobbered? > > We use x17 and x18 to allow hyper calls to work without a stack, which > is needed for cpu_soft_restart(). The procedure call standard says that > these are temporary registers, so a C compiler should not expect these > to be preserved. > Then why not use r9-15 or r0-r7 as the AAPCS clearly specifies as caller-saved instead of the registers which may have special meaning etc.? -Christoffer
Hi Christoffer, On Mon, 2015-03-02 at 14:13 -0800, Christoffer Dall wrote: > On Wed, Feb 25, 2015 at 02:09:30PM -0800, Geoff Levand wrote: > > The current hyp-stub vector implementation, which uses x0, is limited > > to two hyper calls; __hyp_get_vectors and __hyp_set_vectors. To > > support cpu_soft_restart() we need a third hyper call, one which > > allows for code to be executed at EL2. My proposed use of the > > immediate value of the hvc instruction will allow for 2^16 distinct > > hyper calls. > > right, but using x0 allows for 2^64 distinct hypercalls. Just to be > clear, I'm fine with using immediate field if there are no good reasons > not to, I was just curious as to what direct benefit it has. After > thinking about it a bit, from my point of view, the benefit would be the > clarity that x0 is first argument like a normal procedure call, so no > need to shift things around. Is this part of the equation or am I > missing the overall purpose here? Yes, in general it will make marshaling of args, etc. easier. Also, to me, if we are going to change the implementation it seems to be the most natural way. > > > > + * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to > > > > + * specify the hcall type. The exception handlers are allowed to use registers > > > > + * x17 and x18 in their implementation. Any routine issuing an hcall must not > > > > + * expect these registers to be preserved. > > > > + */ > > > > > > I thought the existing use of registers were based on the arm procedure > > > call standard so we didn't have to worry about adding more caller-save > > > registers. > > > > > > Don't we now have to start adding code around callers to make sure > > > callers know that x17 and x18 may be clobbered? > > > > We use x17 and x18 to allow hyper calls to work without a stack, which > > is needed for cpu_soft_restart(). 
The procedure call standard says that > > these are temporary registers, so a C compiler should not expect these > > to be preserved. > > > Then why not use r9-15 or r0-r7 as the AAPCS clearly specifies as > caller-saved instead of the registers which may have special meaning > etc.? OK, I will change these to x9, x10. I'll post a v8 patch set soon. -Geoff
Hi Geoff, On 03/02/2015 06:22 PM, Geoff Levand wrote: > Hi Christoffer, > > On Mon, 2015-03-02 at 14:13 -0800, Christoffer Dall wrote: >> On Wed, Feb 25, 2015 at 02:09:30PM -0800, Geoff Levand wrote: >>> The current hyp-stub vector implementation, which uses x0, is limited >>> to two hyper calls; __hyp_get_vectors and __hyp_set_vectors. To >>> support cpu_soft_restart() we need a third hyper call, one which >>> allows for code to be executed at EL2. My proposed use of the >>> immediate value of the hvc instruction will allow for 2^16 distinct >>> hyper calls. >> >> right, but using x0 allows for 2^64 distinct hypercalls. Just to be >> clear, I'm fine with using immediate field if there are no good reasons >> not to, I was just curious as to what direct benefit it has. After >> thinking about it a bit, from my point of view, the benefit would be the >> clarity that x0 is first argument like a normal procedure call, so no >> need to shift things around. Is this part of the equation or am I >> missing the overall purpose here? > > Yes, in general it will make marshaling of args, etc. easier. Also, > to me, if we are going to change the implementation it seems to be > the most natural way. From reading the architecture documentation, I too expected the hypervisor call instruction's immediate and the instruction specific syndrome to be used. However, I vaguely recall someone pointing out that reading the exception syndrome register and extracting the instruction specific syndrome is bound to take longer than simply using a general purpose register. One might also consider alignment with the SMC Calling Convention document [1], which while originally written for SMC, is also used for HVC by PSCI [2]. 1. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0028a/index.html 2. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0022c/index.html Chris
Hi Christopher, On Tue, 2015-03-03 at 16:47 -0500, Christopher Covington wrote: > On 03/02/2015 06:22 PM, Geoff Levand wrote: > > Yes, in general it will make marshaling of args, etc. easier. Also, > > to me, if we are going to change the implementation it seems to be > > the most natural way. > > From reading the architecture documentation, I too expected the hypervisor > call instruction's immediate and the instruction specific syndrome to be used. > However I vaguely recall someone pointing out that reading the exception > syndrome register and extracting the instruction specific syndrome is bound to > take longer than simply using a general purpose register. > > One might also consider alignment with the SMC Calling Convention document > [1], which while originally written for SMC, is also used for HVC by PSCI [2]. > > 1. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0028a/index.html > 2. http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0022c/index.html On looking at the SMC document, I found this: The SMC instruction encodes an immediate value as defined by the ARM architecture [1][2]. The size of this and mechanism to access the immediate value differ between the ARM instruction sets. Additionally, it is time consuming for 32-bit Secure Monitor code to access this immediate value. Consequently: o An SMC immediate value of Zero must be used. o All other SMC immediate values are reserved. The first problem of differing access methods does not exist for our case; the kernel will always use the same method. As for the second problem, the current implementation already reads esr_el2. The new code just adds an AND instruction to mask the ISS field. I don't think this would be more overhead than shifting registers. One alternative would be to use a high register, say x7, and limit the hcalls to args x0-x6, but I don't think this gains much over using the immediate. -Geoff