Message ID | 1383073495-5332-7-git-send-email-sebastian@macke.de |
---|---|
State | New |
Headers | show |
On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> wrote: > At the moment there are two TLBs. The OpenRISC TLB followed > by the QEMU's own TLB. > At the end of the TLB miss handler a tlb_flush of QEMUs TLB > is executed which is exactly what we want to avoid. > As long as there is no context switch we don't have to flush the TLB. So this flush was needed in order to clean QEMU TLB in case DTLB/ITLB translation was enabled/disabled, right? But since you already have mmu index for nommu operation, wouldn't it be easier to indicate mmu index correctly for data and code access and drop this flush?
On 29/10/2013 2:01 PM, Max Filippov wrote: > On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> wrote: >> At the moment there are two TLBs. The OpenRISC TLB followed >> by the QEMU's own TLB. >> At the end of the TLB miss handler a tlb_flush of QEMUs TLB >> is executed which is exactly what we want to avoid. >> As long as there is no context switch we don't have to flush the TLB. > So this flush was needed in order to clean QEMU TLB in case > DTLB/ITLB translation was enabled/disabled, right? But since you > already have mmu index for nommu operation, wouldn't it be easier > to indicate mmu index correctly for data and code access and drop > this flush? > The mmu index is already set correctly and this patch removes the flush. 1. Problem The problem is if there is a context switch. OpenRISC clears its own small tlb page by page. But this does mean it flushes all pages in the big QEMU TLB. This is the reason why l.rfe instruction clears the tlb which is the instruction used to return to user mode But according to the specification this is wrong. 2. Problem which is the case you mentioned. Your are right, this is one solution and its written in the patchnotes as point 1. But this would not solve the problem No 1. I mentioned in this email. Confused? I am :) Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't. With this patch I remove the flush and solve all problems by assuming a global tlb flush if you invalidate the first entry of the small OpenRISC TLB.
On Wed, Oct 30, 2013 at 1:53 AM, Sebastian Macke <sebastian@macke.de> wrote: > On 29/10/2013 2:01 PM, Max Filippov wrote: >> >> On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> >> wrote: >>> >>> At the moment there are two TLBs. The OpenRISC TLB followed >>> by the QEMU's own TLB. >>> At the end of the TLB miss handler a tlb_flush of QEMUs TLB >>> is executed which is exactly what we want to avoid. >>> As long as there is no context switch we don't have to flush the TLB. >> >> So this flush was needed in order to clean QEMU TLB in case >> DTLB/ITLB translation was enabled/disabled, right? But since you >> already have mmu index for nommu operation, wouldn't it be easier >> to indicate mmu index correctly for data and code access and drop >> this flush? >> > > The mmu index is already set correctly and this patch removes the flush. I'm not sure: cpu_mmu_index only checks SR_IME, not SR_DME. > 1. Problem > The problem is if there is a context switch. OpenRISC clears its own small > tlb page by page. But this does mean it flushes all pages in the big QEMU I think there shouldn't be any entries in QEMU TLB other than those in OpenRISC TLB, otherwise it would be possible to access unmapped virtual addresses without getting pagefaults. > TLB. This is the reason why l.rfe instruction clears the tlb which is the > instruction used to return to user mode > But according to the specification this is wrong. > > 2. Problem which is the case you mentioned. > Your are right, this is one solution and its written in the patchnotes as > point 1. > But this would not solve the problem No 1. I mentioned in this email. > > Confused? I am :) > > Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't. > With this patch I remove the flush and solve all problems by assuming a > global tlb flush if you invalidate the first entry of the small OpenRISC > TLB.
On 29/10/2013 3:20 PM, Max Filippov wrote: > On Wed, Oct 30, 2013 at 1:53 AM, Sebastian Macke <sebastian@macke.de> wrote: >> On 29/10/2013 2:01 PM, Max Filippov wrote: >>> On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> >>> wrote: >>>> At the moment there are two TLBs. The OpenRISC TLB followed >>>> by the QEMU's own TLB. >>>> At the end of the TLB miss handler a tlb_flush of QEMUs TLB >>>> is executed which is exactly what we want to avoid. >>>> As long as there is no context switch we don't have to flush the TLB. >>> So this flush was needed in order to clean QEMU TLB in case >>> DTLB/ITLB translation was enabled/disabled, right? But since you >>> already have mmu index for nommu operation, wouldn't it be easier >>> to indicate mmu index correctly for data and code access and drop >>> this flush? >>> >> The mmu index is already set correctly and this patch removes the flush. > I'm not sure: cpu_mmu_index only checks SR_IME, not SR_DME. And again, correct. I saw this problem too and wanted to correct it later. But using Linux the IME and DME flags are always changed together and therefore distinguishing between them don't play a role. So I forgot it in the end. >> 1. Problem >> The problem is if there is a context switch. OpenRISC clears its own small >> tlb page by page. But this does mean it flushes all pages in the big QEMU > I think there shouldn't be any entries in QEMU TLB other than those in > OpenRISC TLB, otherwise it would be possible to access unmapped virtual > addresses without getting pagefaults. Correct. And this was one dilemma I had. How to increase the whole speed of tlb misses by an order of magnitude and not break any rules. Without the two patches it works like this. User mode OpenRISC TLB-Miss -> Exception -> QEMU TLB flush -> Set new page in the tlb handler -> return via l.rfe -> QEMU TLB flush In the end we had always only one valid page in the QEMU TLB which is kind of crazy. ( Now I am unsure, because with my logic we would have never a valid page in the QEMU TLB and would run in an endless loop) So with the patches QEMU's TLB could have indeed more entries than the OpenRISC TLB right now but which is 99% fine if you run it under Linux. Linux has the option to clear certain pages from the tlb which then of course will not reckognize those pages in QEMUs TLB. So there is a small chance for an error. Three options: 1. Removing QEMUs TLBs. 2. Another option might be to make the flushing mmu_index dependend to increase the speed. But this is not implemented as far as I see. 3. I could also invalidate the previous page if the OpenRISC TLB entry is overwritten. But then I don't know the correct mmu_index. So this has to be done for all possible mmu indices. Option three might be possible. I didn't think about this before. Hopefully OpenRISC will get Hardware TLB refill next year. This would solve the problem. >> TLB. This is the reason why l.rfe instruction clears the tlb which is the >> instruction used to return to user mode >> But according to the specification this is wrong. >> >> 2. Problem which is the case you mentioned. >> Your are right, this is one solution and its written in the patchnotes as >> point 1. >> But this would not solve the problem No 1. I mentioned in this email. >> >> Confused? I am :) >> >> Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't. >> With this patch I remove the flush and solve all problems by assuming a >> global tlb flush if you invalidate the first entry of the small OpenRISC >> TLB.
diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h index 057821d..bac61e5 100644 --- a/target-openrisc/cpu.h +++ b/target-openrisc/cpu.h @@ -412,7 +412,7 @@ static inline void cpu_get_tb_cpu_state(CPUOpenRISCState *env, *pc = env->pc; *cs_base = 0; /* D_FLAG -- branch instruction exception */ - *flags = (env->flags & D_FLAG) | (env->sr & SR_SM); + *flags = (env->flags & D_FLAG) | (env->sr & (SR_SM | SR_DME | SR_IME)); } static inline int cpu_mmu_index(CPUOpenRISCState *env) diff --git a/target-openrisc/interrupt_helper.c b/target-openrisc/interrupt_helper.c index ae187f5..1ad3a78 100644 --- a/target-openrisc/interrupt_helper.c +++ b/target-openrisc/interrupt_helper.c @@ -25,10 +25,6 @@ void HELPER(rfe)(CPUOpenRISCState *env) { OpenRISCCPU *cpu = openrisc_env_get_cpu(env); CPUState *cs = CPU(cpu); -#ifndef CONFIG_USER_ONLY - int need_flush_tlb = (cpu->env.sr & (SR_SM | SR_IME | SR_DME)) ^ - (cpu->env.esr & (SR_SM | SR_IME | SR_DME)); -#endif cpu->env.pc = cpu->env.epcr; ENV_SET_SR(&(cpu->env), cpu->env.esr); @@ -48,10 +44,6 @@ void HELPER(rfe)(CPUOpenRISCState *env) cpu->env.tlb->cpu_openrisc_map_address_code = &cpu_openrisc_get_phys_nommu; } - - if (need_flush_tlb) { - tlb_flush(&cpu->env, 1); - } #endif cs->interrupt_request |= CPU_INTERRUPT_EXITTB; } diff --git a/target-openrisc/sys_helper.c b/target-openrisc/sys_helper.c index 3eda0e1..b7c987d 100644 --- a/target-openrisc/sys_helper.c +++ b/target-openrisc/sys_helper.c @@ -77,6 +77,10 @@ void HELPER(mtspr)(CPUOpenRISCState *env, idx = spr - TO_SPR(1, 512); if (!(rb & 1)) { tlb_flush_page(env, env->tlb->dtlb[0][idx].mr & TARGET_PAGE_MASK); + if (idx == 0) { + /* Assume that the whole tlb buffer must be flushed */ + tlb_flush(&cpu->env, 1); + } } env->tlb->dtlb[0][idx].mr = rb; break;
At the moment there are two TLBs. The OpenRISC TLB followed by the QEMU's own TLB. At the end of the TLB miss handler a tlb_flush of QEMUs TLB is executed which is exactly what we want to avoid. As long as there is no context switch we don't have to flush the TLB. There are two options: 1. If l.rfe is executed and the MMU is off we don't flush the buffer. This would work under Linux 2. We flush the TLB only if the correct DTLB and ITLB registers are cleared. Unfortunately the OpenRISC TLB is smaller and there is no register which commands to flush the whole buffer. In this case we have to reckognize when the whole TLB is flushed. This would have the benefit that syscalls which do not change the context would not flush the TLB. Both solutions would break weakly the specification but the speed benefit is more than a factor of two for memory intense programs. In this patch, I choose solution no. 2. If the first TLB register is cleared the whole TLB is cleared. Signed-off-by: Sebastian Macke <sebastian@macke.de> --- target-openrisc/cpu.h | 2 +- target-openrisc/interrupt_helper.c | 8 -------- target-openrisc/sys_helper.c | 4 ++++ 3 files changed, 5 insertions(+), 9 deletions(-)