Message ID | 20180710160013.26559-6-peter.maydell@linaro.org |
---|---|
State | New |
Series | accel/tcg: Support execution from MMIO and small MMU regions |
On 07/10/2018 09:00 AM, Peter Maydell wrote:
> Now that all the callers can handle get_page_addr_code() returning -1,
> remove all the code which tries to handle execution from MMIO regions
> or small-MMU-region RAM areas. This will mean that we can correctly
> execute from these areas, rather than ending up either aborting QEMU
> or delivering an incorrect guest exception.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  accel/tcg/cputlb.c | 95 +++++----------------------------------------------------
>  1 file changed, 10 insertions(+), 85 deletions(-)

Yay!

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
On 07/10/2018 01:00 PM, Peter Maydell wrote:
> Now that all the callers can handle get_page_addr_code() returning -1,
> remove all the code which tries to handle execution from MMIO regions
> or small-MMU-region RAM areas. This will mean that we can correctly
> execute from these areas, rather than ending up either aborting QEMU
> or delivering an incorrect guest exception.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

> ---
>  accel/tcg/cputlb.c | 95 +++++----------------------------------------------------
>  1 file changed, 10 insertions(+), 85 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index c491703f15f..abb0225dc79 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -741,39 +741,6 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
>                              prot, mmu_idx, size);
>  }
>
> -static void report_bad_exec(CPUState *cpu, target_ulong addr)
> -{
> -    /* Accidentally executing outside RAM or ROM is quite common for
> -     * several user-error situations, so report it in a way that
> -     * makes it clear that this isn't a QEMU bug and provide suggestions
> -     * about what a user could do to fix things.
> -     */
> -    error_report("Trying to execute code outside RAM or ROM at 0x"
> -                 TARGET_FMT_lx, addr);
> -    error_printf("This usually means one of the following happened:\n\n"
> -                 "(1) You told QEMU to execute a kernel for the wrong machine "
> -                 "type, and it crashed on startup (eg trying to run a "
> -                 "raspberry pi kernel on a versatilepb QEMU machine)\n"
> -                 "(2) You didn't give QEMU a kernel or BIOS filename at all, "
> -                 "and QEMU executed a ROM full of no-op instructions until "
> -                 "it fell off the end\n"
> -                 "(3) Your guest kernel has a bug and crashed by jumping "
> -                 "off into nowhere\n\n"
> -                 "This is almost always one of the first two, so check your "
> -                 "command line and that you are using the right type of kernel "
> -                 "for this machine.\n"
> -                 "If you think option (3) is likely then you can try debugging "
> -                 "your guest with the -d debug options; in particular "
> -                 "-d guest_errors will cause the log to include a dump of the "
> -                 "guest register state at this point.\n\n"
> -                 "Execution cannot continue; stopping here.\n\n");
> -
> -    /* Report also to the logs, with more detail including register dump */
> -    qemu_log_mask(LOG_GUEST_ERROR, "qemu: fatal: Trying to execute code "
> -                  "outside RAM or ROM at 0x" TARGET_FMT_lx "\n", addr);
> -    log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
> -}
> -
>  static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
>  {
>      ram_addr_t ram_addr;
> @@ -963,7 +930,6 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
>      MemoryRegionSection *section;
>      CPUState *cpu = ENV_GET_CPU(env);
>      CPUIOTLBEntry *iotlbentry;
> -    hwaddr physaddr, mr_offset;
>
>      index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
>      mmu_idx = cpu_mmu_index(env, true);
> @@ -977,65 +943,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
>                    (TLB_RECHECK | TLB_INVALID_MASK)) == TLB_RECHECK)) {
>          /*
>           * This is a TLB_RECHECK access, where the MMU protection
> -         * covers a smaller range than a target page, and we must
> -         * repeat the MMU check here. This tlb_fill() call might
> -         * longjump out if this access should cause a guest exception.
> -         */
> -        int index;
> -        target_ulong tlb_addr;
> -
> -        tlb_fill(cpu, addr, 0, MMU_INST_FETCH, mmu_idx, 0);
> -
> -        index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
> -        tlb_addr = env->tlb_table[mmu_idx][index].addr_code;
> -        if (!(tlb_addr & ~(TARGET_PAGE_MASK | TLB_RECHECK))) {
> -            /* RAM access. We can't handle this, so for now just stop */
> -            cpu_abort(cpu, "Unable to handle guest executing from RAM within "
> -                      "a small MPU region at 0x" TARGET_FMT_lx, addr);
> -        }
> -        /*
> -         * Fall through to handle IO accesses (which will almost certainly
> -         * also result in failure)
> +         * covers a smaller range than a target page. Return -1 to
> +         * indicate that we cannot simply execute from RAM here;
> +         * we will perform the necessary repeat of the MMU check
> +         * when the "execute a single insn" code performs the
> +         * load of the guest insn.
>           */
> +        return -1;
>      }
>
>      iotlbentry = &env->iotlb[mmu_idx][index];
>      section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
>      mr = section->mr;
>      if (memory_region_is_unassigned(mr)) {
> -        qemu_mutex_lock_iothread();
> -        if (memory_region_request_mmio_ptr(mr, addr)) {
> -            qemu_mutex_unlock_iothread();
> -            /* A MemoryRegion is potentially added so re-run the
> -             * get_page_addr_code.
> -             */
> -            return get_page_addr_code(env, addr);
> -        }
> -        qemu_mutex_unlock_iothread();
> -
> -        /* Give the new-style cpu_transaction_failed() hook first chance
> -         * to handle this.
> -         * This is not the ideal place to detect and generate CPU
> -         * exceptions for instruction fetch failure (for instance
> -         * we don't know the length of the access that the CPU would
> -         * use, and it would be better to go ahead and try the access
> -         * and use the MemTXResult it produced). However it is the
> -         * simplest place we have currently available for the check.
> +        /*
> +         * Not guest RAM, so there is no ram_addr_t for it. Return -1,
> +         * and we will execute a single insn from this device.
>           */
> -        mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
> -        physaddr = mr_offset +
> -            section->offset_within_address_space -
> -            section->offset_within_region;
> -        cpu_transaction_failed(cpu, physaddr, addr, 0, MMU_INST_FETCH, mmu_idx,
> -                               iotlbentry->attrs, MEMTX_DECODE_ERROR, 0);
> -
> -        cpu_unassigned_access(cpu, addr, false, true, 0, 4);
> -        /* The CPU's unassigned access hook might have longjumped out
> -         * with an exception. If it didn't (or there was no hook) then
> -         * we can't proceed further.
> -         */
> -        report_bad_exec(cpu, addr);
> -        exit(1);
> +        return -1;
>      }
>      p = (void *)((uintptr_t)addr + env->tlb_table[mmu_idx][index].addend);
>      return qemu_ram_addr_from_host_nofail(p);
>
On 2018-07-10 18:00, Peter Maydell wrote:
> Now that all the callers can handle get_page_addr_code() returning -1,
> remove all the code which tries to handle execution from MMIO regions
> or small-MMU-region RAM areas. This will mean that we can correctly
> execute from these areas, rather than ending up either aborting QEMU
> or delivering an incorrect guest exception.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  accel/tcg/cputlb.c | 95 +++++----------------------------------------------------
>  1 file changed, 10 insertions(+), 85 deletions(-)
>
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index c491703f15f..abb0225dc79 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -741,39 +741,6 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
>                              prot, mmu_idx, size);
>  }
>
> -static void report_bad_exec(CPUState *cpu, target_ulong addr)
> -{
> -    /* Accidentally executing outside RAM or ROM is quite common for
> -     * several user-error situations, so report it in a way that
> -     * makes it clear that this isn't a QEMU bug and provide suggestions
> -     * about what a user could do to fix things.
> -     */
> -    error_report("Trying to execute code outside RAM or ROM at 0x"
> -                 TARGET_FMT_lx, addr);
> -    error_printf("This usually means one of the following happened:\n\n"
> -                 "(1) You told QEMU to execute a kernel for the wrong machine "
> -                 "type, and it crashed on startup (eg trying to run a "
> -                 "raspberry pi kernel on a versatilepb QEMU machine)\n"
> -                 "(2) You didn't give QEMU a kernel or BIOS filename at all, "
> -                 "and QEMU executed a ROM full of no-op instructions until "
> -                 "it fell off the end\n"
> -                 "(3) Your guest kernel has a bug and crashed by jumping "
> -                 "off into nowhere\n\n"
> -                 "This is almost always one of the first two, so check your "
> -                 "command line and that you are using the right type of kernel "
> -                 "for this machine.\n"
> -                 "If you think option (3) is likely then you can try debugging "
> -                 "your guest with the -d debug options; in particular "
> -                 "-d guest_errors will cause the log to include a dump of the "
> -                 "guest register state at this point.\n\n"
> -                 "Execution cannot continue; stopping here.\n\n");

Hi Peter!

Looks like this patch now causes QEMU to segfault instead of printing
the above error message in certain cases, e.g.:

$ gdb --args aarch64-softmmu/qemu-system-aarch64 -M n800
[...]
(gdb) r
Starting program: aarch64-softmmu/qemu-system-aarch64 -M n800
[...]
Program received signal SIGSEGV, Segmentation fault.
[...]
(gdb) bt
#0 0x0000555555addc68 in onenand_read (opaque=0x555557600600, addr=98304, size=4) at hw/block/onenand.c:612
#1 0x00005555558b175c in memory_region_read_accessor (mr=0x555557600b80, addr=98304, value=0x7fffdbffe360, size=4, shift=0, mask=4294967295, attrs=...) at memory.c:440
#2 0x00005555558ae669 in access_with_adjusted_size (addr=addr@entry=98304, value=value@entry=0x7fffdbffe360, size=size@entry=4, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=access_fn@entry=0x5555558b1720 <memory_region_read_accessor>, mr=mr@entry=0x555557600b80, attrs=attrs@entry=...) at memory.c:570
#3 0x00005555558b3016 in memory_region_dispatch_read (attrs=..., size=4, pval=0x7fffdbffe360, addr=98304, mr=0x555557600b80) at memory.c:1375
#4 0x00005555558b3016 in memory_region_dispatch_read (mr=0x555557600b80, addr=addr@entry=98304, pval=pval@entry=0x7fffdbffe360, size=size@entry=4, attrs=...) at memory.c:1402
#5 0x000055555583cb23 in io_readx (env=env@entry=0x555556b58a30, iotlbentry=iotlbentry@entry=0x555556b6d6b0, mmu_idx=mmu_idx@entry=1, addr=addr@entry=98304, retaddr=retaddr@entry=0, recheck=<optimized out>, access_type=access_type@entry=MMU_INST_FETCH, size=size@entry=4) at accel/tcg/cputlb.c:729
#6 0x00005555558d79cd in helper_le_ldl_cmmu (access_type=MMU_INST_FETCH, recheck=<optimized out>, retaddr=0, addr=98304, index=96, mmu_idx=1, env=0x555556b58a30) at accel/tcg/softmmu_template.h:106
#7 0x00005555558d79cd in helper_le_ldl_cmmu (env=env@entry=0x555556b58a30, addr=addr@entry=98304, oi=33, retaddr=retaddr@entry=0) at accel/tcg/softmmu_template.h:144
#8 0x00005555559d2595 in arm_tr_translate_insn (retaddr=0, ptr=98304, env=0x555556b58a30) at include/exec/cpu_ldst_template.h:102

Any clue what's going on here?

Thomas
On 11/14/18 6:19 PM, Thomas Huth wrote:
> Program received signal SIGSEGV, Segmentation fault.
> [...]
> (gdb) bt
> #0 0x0000555555addc68 in onenand_read (opaque=0x555557600600, addr=98304, size=4) at hw/block/onenand.c:612

So the crash is an off-by-one on the line above:

--- a/hw/block/onenand.c
+++ b/hw/block/onenand.c
@@ -608,7 +608,7 @@ static uint64_t onenand_read(void *opaque, hwaddr addr,
     int offset = addr >> s->shift;

     switch (offset) {
-    case 0x0000 ... 0xc000:
+    case 0x0000 ... 0xbfff:
         return lduw_le_p(s->boot[0] + addr);

     case 0xf000: /* Manufacturer ID */

as the memory segment has size 0xc000.

The guest will now eventually crash with

onenand_read: unknown OneNAND register c000
...
onenand_read: unknown OneNAND register fefe
qemu: hardware error: onenand_read: implement ECC

CPU #0:
R00=00000000 R01=00000000 R02=00000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=00000000 R14=00000000 R15=0001fe04
PSR=400001d3 -Z-- A svc32
s00=00000000 s01=00000000 d00=0000000000000000
s02=00000000 s03=00000000 d01=0000000000000000
s04=00000000 s05=00000000 d02=0000000000000000
s06=00000000 s07=00000000 d03=0000000000000000
s08=00000000 s09=00000000 d04=0000000000000000
s10=00000000 s11=00000000 d05=0000000000000000
s12=00000000 s13=00000000 d06=0000000000000000
s14=00000000 s15=00000000 d07=0000000000000000
s16=00000000 s17=00000000 d08=0000000000000000
s18=00000000 s19=00000000 d09=0000000000000000
s20=00000000 s21=00000000 d10=0000000000000000
s22=00000000 s23=00000000 d11=0000000000000000
s24=00000000 s25=00000000 d12=0000000000000000
s26=00000000 s27=00000000 d13=0000000000000000
s28=00000000 s29=00000000 d14=0000000000000000
s30=00000000 s31=00000000 d15=0000000000000000
FPSCR: 00000000
Aborted (core dumped)

I'll note that fprintf at the end of onenand_read should be
qemu_log(LOG_GUEST_ERROR) instead.

r~
On 15 November 2018 at 07:32, Richard Henderson <rth@twiddle.net> wrote:
> On 11/14/18 6:19 PM, Thomas Huth wrote:
>> Program received signal SIGSEGV, Segmentation fault.
>> [...]
>> (gdb) bt
>> #0 0x0000555555addc68 in onenand_read (opaque=0x555557600600, addr=98304, size=4) at hw/block/onenand.c:612
>
> So the crash is an off-by-one on the line above:
>
> --- a/hw/block/onenand.c
> +++ b/hw/block/onenand.c
> @@ -608,7 +608,7 @@ static uint64_t onenand_read(void *opaque, hwaddr addr,
>      int offset = addr >> s->shift;
>
>      switch (offset) {
> -    case 0x0000 ... 0xc000:
> +    case 0x0000 ... 0xbfff:
>          return lduw_le_p(s->boot[0] + addr);
>
>      case 0xf000: /* Manufacturer ID */
>
> as the memory segment has size 0xc000.

Presumably it should be ... 0xbffe, since we are
doing a 16-bit load ?

> The guest will now eventually crash with
>
> onenand_read: unknown OneNAND register c000
> ...
> onenand_read: unknown OneNAND register fefe
> qemu: hardware error: onenand_read: implement ECC
>
> CPU #0:
> R00=00000000 R01=00000000 R02=00000000 R03=00000000
> R04=00000000 R05=00000000 R06=00000000 R07=00000000
> R08=00000000 R09=00000000 R10=00000000 R11=00000000
> R12=00000000 R13=00000000 R14=00000000 R15=0001fe04
> PSR=400001d3 -Z-- A svc32
> s00=00000000 s01=00000000 d00=0000000000000000
> s02=00000000 s03=00000000 d01=0000000000000000
> s04=00000000 s05=00000000 d02=0000000000000000
> s06=00000000 s07=00000000 d03=0000000000000000
> s08=00000000 s09=00000000 d04=0000000000000000
> s10=00000000 s11=00000000 d05=0000000000000000
> s12=00000000 s13=00000000 d06=0000000000000000
> s14=00000000 s15=00000000 d07=0000000000000000
> s16=00000000 s17=00000000 d08=0000000000000000
> s18=00000000 s19=00000000 d09=0000000000000000
> s20=00000000 s21=00000000 d10=0000000000000000
> s22=00000000 s23=00000000 d11=0000000000000000
> s24=00000000 s25=00000000 d12=0000000000000000
> s26=00000000 s27=00000000 d13=0000000000000000
> s28=00000000 s29=00000000 d14=0000000000000000
> s30=00000000 s31=00000000 d15=0000000000000000
> FPSCR: 00000000
> Aborted (core dumped)
>
> I'll note that fprintf at the end of onenand_read should be
> qemu_log(LOG_GUEST_ERROR) instead.

Yeah, I'll put together a patch which makes it use the qemu_log
facilities rather than fprintf() and hw_error(). With that
plus the case statement fix then QEMU correctly just sits there
as the guest execution races through memory...

thanks
-- PMM
On 11/15/18 2:53 PM, Peter Maydell wrote:
>>      switch (offset) {
>> -    case 0x0000 ... 0xc000:
>> +    case 0x0000 ... 0xbfff:
>>          return lduw_le_p(s->boot[0] + addr);
>>
>>      case 0xf000: /* Manufacturer ID */
>>
>> as the memory segment has size 0xc000.
>
> Presumably it should be ... 0xbffe, since we are
> doing a 16-bit load ?

Ah, true.

> Yeah, I'll put together a patch which makes it use the qemu_log
> facilities rather than fprintf() and hw_error(). With that
> plus the case statement fix then QEMU correctly just sits there
> as the guest execution races through memory...

Excellent, thanks.

r~
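
The bound discussed in this subthread comes down to whether a load of a given width fits entirely inside a buffer of a given size. The following is a minimal sketch of that check, not the actual hw/block/onenand.c code: `BOOT_AREA_SIZE`, `access_in_bounds()` and `load_u16_le()` are hypothetical names introduced for illustration, and the `addr >> s->shift` scaling done in `onenand_read()` is deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the segment as stated in the thread; the name is hypothetical. */
#define BOOT_AREA_SIZE ((size_t)0xc000)

/*
 * True if an access of `width` bytes starting at `offset` lies entirely
 * inside an area of `area_size` bytes. Written to avoid overflow in
 * `offset + width`.
 */
static inline bool access_in_bounds(size_t offset, size_t width,
                                    size_t area_size)
{
    return width <= area_size && offset <= area_size - width;
}

/*
 * Bounds-checked little-endian 16-bit load (hypothetical helper).
 * Returns false instead of reading past the end of the buffer.
 */
static inline bool load_u16_le(const uint8_t *buf, size_t buf_size,
                               size_t offset, uint16_t *out)
{
    assert(buf != NULL && out != NULL);
    if (!access_in_bounds(offset, 2, buf_size)) {
        return false;
    }
    *out = (uint16_t)(buf[offset] | (buf[offset + 1] << 8));
    return true;
}
```

With a 0xc000-byte area, a 2-byte load is in bounds at offset 0xbffe but not at 0xbfff or 0xc000, which matches the reasoning behind the suggested 0xbffe upper limit.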
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c491703f15f..abb0225dc79 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -741,39 +741,6 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
                             prot, mmu_idx, size);
 }
 
-static void report_bad_exec(CPUState *cpu, target_ulong addr)
-{
-    /* Accidentally executing outside RAM or ROM is quite common for
-     * several user-error situations, so report it in a way that
-     * makes it clear that this isn't a QEMU bug and provide suggestions
-     * about what a user could do to fix things.
-     */
-    error_report("Trying to execute code outside RAM or ROM at 0x"
-                 TARGET_FMT_lx, addr);
-    error_printf("This usually means one of the following happened:\n\n"
-                 "(1) You told QEMU to execute a kernel for the wrong machine "
-                 "type, and it crashed on startup (eg trying to run a "
-                 "raspberry pi kernel on a versatilepb QEMU machine)\n"
-                 "(2) You didn't give QEMU a kernel or BIOS filename at all, "
-                 "and QEMU executed a ROM full of no-op instructions until "
-                 "it fell off the end\n"
-                 "(3) Your guest kernel has a bug and crashed by jumping "
-                 "off into nowhere\n\n"
-                 "This is almost always one of the first two, so check your "
-                 "command line and that you are using the right type of kernel "
-                 "for this machine.\n"
-                 "If you think option (3) is likely then you can try debugging "
-                 "your guest with the -d debug options; in particular "
-                 "-d guest_errors will cause the log to include a dump of the "
-                 "guest register state at this point.\n\n"
-                 "Execution cannot continue; stopping here.\n\n");
-
-    /* Report also to the logs, with more detail including register dump */
-    qemu_log_mask(LOG_GUEST_ERROR, "qemu: fatal: Trying to execute code "
-                  "outside RAM or ROM at 0x" TARGET_FMT_lx "\n", addr);
-    log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
-}
-
 static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
 {
     ram_addr_t ram_addr;
@@ -963,7 +930,6 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
     MemoryRegionSection *section;
     CPUState *cpu = ENV_GET_CPU(env);
     CPUIOTLBEntry *iotlbentry;
-    hwaddr physaddr, mr_offset;
 
     index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = cpu_mmu_index(env, true);
@@ -977,65 +943,24 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
                   (TLB_RECHECK | TLB_INVALID_MASK)) == TLB_RECHECK)) {
         /*
          * This is a TLB_RECHECK access, where the MMU protection
-         * covers a smaller range than a target page, and we must
-         * repeat the MMU check here. This tlb_fill() call might
-         * longjump out if this access should cause a guest exception.
-         */
-        int index;
-        target_ulong tlb_addr;
-
-        tlb_fill(cpu, addr, 0, MMU_INST_FETCH, mmu_idx, 0);
-
-        index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-        tlb_addr = env->tlb_table[mmu_idx][index].addr_code;
-        if (!(tlb_addr & ~(TARGET_PAGE_MASK | TLB_RECHECK))) {
-            /* RAM access. We can't handle this, so for now just stop */
-            cpu_abort(cpu, "Unable to handle guest executing from RAM within "
-                      "a small MPU region at 0x" TARGET_FMT_lx, addr);
-        }
-        /*
-         * Fall through to handle IO accesses (which will almost certainly
-         * also result in failure)
+         * covers a smaller range than a target page. Return -1 to
+         * indicate that we cannot simply execute from RAM here;
+         * we will perform the necessary repeat of the MMU check
+         * when the "execute a single insn" code performs the
+         * load of the guest insn.
          */
+        return -1;
     }
 
     iotlbentry = &env->iotlb[mmu_idx][index];
     section = iotlb_to_section(cpu, iotlbentry->addr, iotlbentry->attrs);
     mr = section->mr;
     if (memory_region_is_unassigned(mr)) {
-        qemu_mutex_lock_iothread();
-        if (memory_region_request_mmio_ptr(mr, addr)) {
-            qemu_mutex_unlock_iothread();
-            /* A MemoryRegion is potentially added so re-run the
-             * get_page_addr_code.
-             */
-            return get_page_addr_code(env, addr);
-        }
-        qemu_mutex_unlock_iothread();
-
-        /* Give the new-style cpu_transaction_failed() hook first chance
-         * to handle this.
-         * This is not the ideal place to detect and generate CPU
-         * exceptions for instruction fetch failure (for instance
-         * we don't know the length of the access that the CPU would
-         * use, and it would be better to go ahead and try the access
-         * and use the MemTXResult it produced). However it is the
-         * simplest place we have currently available for the check.
+        /*
+         * Not guest RAM, so there is no ram_addr_t for it. Return -1,
+         * and we will execute a single insn from this device.
          */
-        mr_offset = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
-        physaddr = mr_offset +
-            section->offset_within_address_space -
-            section->offset_within_region;
-        cpu_transaction_failed(cpu, physaddr, addr, 0, MMU_INST_FETCH, mmu_idx,
-                               iotlbentry->attrs, MEMTX_DECODE_ERROR, 0);
-
-        cpu_unassigned_access(cpu, addr, false, true, 0, 4);
-        /* The CPU's unassigned access hook might have longjumped out
-         * with an exception. If it didn't (or there was no hook) then
-         * we can't proceed further.
-         */
-        report_bad_exec(cpu, addr);
-        exit(1);
+        return -1;
     }
     p = (void *)((uintptr_t)addr + env->tlb_table[mmu_idx][index].addend);
     return qemu_ram_addr_from_host_nofail(p);
Now that all the callers can handle get_page_addr_code() returning -1,
remove all the code which tries to handle execution from MMIO regions
or small-MMU-region RAM areas. This will mean that we can correctly
execute from these areas, rather than ending up either aborting QEMU
or delivering an incorrect guest exception.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 accel/tcg/cputlb.c | 95 +++++----------------------------------------------------
 1 file changed, 10 insertions(+), 85 deletions(-)
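
The contract described in the commit message can be illustrated with a toy model: the lookup returns -1 when there is no RAM address to execute from, and the caller falls back to handling one instruction at a time. This is only a sketch under stated assumptions, not QEMU's code; every `toy_*` name, `region_kind` and `NO_RAM_ADDR` are hypothetical, and the real callers do considerably more than adjust an instruction count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for tb_page_addr_t; the typedef name is hypothetical. */
typedef uint64_t toy_page_addr_t;

/* The "-1" sentinel: there is no RAM address for this guest address. */
#define NO_RAM_ADDR ((toy_page_addr_t)-1)

/* Simplified classification of what backs a guest page (assumption). */
enum region_kind {
    REGION_RAM,          /* ordinary RAM, can be executed from directly */
    REGION_MMIO,         /* device memory, no RAM address exists */
    REGION_SMALL_MMU     /* protection covers less than a target page */
};

struct toy_tlb_entry {
    enum region_kind kind;
    toy_page_addr_t ram_addr;   /* only meaningful for REGION_RAM */
};

/* Toy version of the lookup: a RAM address, or the -1 sentinel. */
static inline toy_page_addr_t
toy_get_page_addr_code(const struct toy_tlb_entry *e)
{
    assert(e != NULL);
    return e->kind == REGION_RAM ? e->ram_addr : NO_RAM_ADDR;
}

/*
 * Toy caller: how many guest instructions to translate in one go.
 * On the sentinel it drops to a single instruction, so that each
 * instruction fetch goes through the normal load path.
 */
static inline int toy_max_insns(const struct toy_tlb_entry *e,
                                int default_max)
{
    return toy_get_page_addr_code(e) == NO_RAM_ADDR ? 1 : default_max;
}
```

The point of the sentinel is that the lookup no longer needs to decide whether execution is possible; it only reports that no RAM address exists and leaves the per-instruction fetch to detect any real fault.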