diff mbox

[06/13] target-openrisc: Remove TLB flush from l.rfe instruction

Message ID 1383073495-5332-7-git-send-email-sebastian@macke.de
State New
Headers show

Commit Message

Sebastian Macke Oct. 29, 2013, 7:04 p.m. UTC
At the moment there are two TLBs. The OpenRISC TLB followed
by the QEMU's own TLB.
At the end of the TLB miss handler a tlb_flush of QEMUs TLB
is executed which is exactly what we want to avoid.
As long as there is no context switch we don't have to flush the TLB.

There are two options:
1. If l.rfe is executed and the MMU is off we don't flush the buffer.
This would work under Linux

2. We flush the TLB only if the correct DTLB and ITLB registers are
cleared. Unfortunately the OpenRISC TLB is smaller and there is no
register which commands to flush the whole buffer.
In this case we have to reckognize when the whole TLB is flushed.
This would have the benefit that syscalls which do not change
the context would not flush the TLB.

Both solutions would break weakly the specification but the speed benefit
is more than a factor of two for memory intense programs.
In this patch, I choose solution no. 2.
If the first TLB register is cleared the whole TLB is cleared.

Signed-off-by: Sebastian Macke <sebastian@macke.de>
---
 target-openrisc/cpu.h              | 2 +-
 target-openrisc/interrupt_helper.c | 8 --------
 target-openrisc/sys_helper.c       | 4 ++++
 3 files changed, 5 insertions(+), 9 deletions(-)

Comments

Max Filippov Oct. 29, 2013, 9:01 p.m. UTC | #1
On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> wrote:
> At the moment there are two TLBs. The OpenRISC TLB followed
> by the QEMU's own TLB.
> At the end of the TLB miss handler a tlb_flush of QEMUs TLB
> is executed which is exactly what we want to avoid.
> As long as there is no context switch we don't have to flush the TLB.

So this flush was needed in order to clean QEMU TLB in case
DTLB/ITLB translation was enabled/disabled, right? But since you
already have mmu index for nommu operation, wouldn't it be easier
to indicate mmu index correctly for data and code access and drop
this flush?
Sebastian Macke Oct. 29, 2013, 9:53 p.m. UTC | #2
On 29/10/2013 2:01 PM, Max Filippov wrote:
> On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de> wrote:
>> At the moment there are two TLBs. The OpenRISC TLB followed
>> by the QEMU's own TLB.
>> At the end of the TLB miss handler a tlb_flush of QEMUs TLB
>> is executed which is exactly what we want to avoid.
>> As long as there is no context switch we don't have to flush the TLB.
> So this flush was needed in order to clean QEMU TLB in case
> DTLB/ITLB translation was enabled/disabled, right? But since you
> already have mmu index for nommu operation, wouldn't it be easier
> to indicate mmu index correctly for data and code access and drop
> this flush?
>

The mmu index is already set correctly and this patch removes the flush.
1. Problem
The problem is if there is a context switch.  OpenRISC clears its own 
small tlb page by page. But this does mean it flushes all pages in the 
big QEMU TLB.  This is the reason why l.rfe instruction clears the tlb 
which is the instruction used to return to user mode
But according to the specification this is wrong.

2. Problem which is the case you mentioned.
Your are right, this is one solution and its written in the patchnotes 
as point 1.
But this would not solve the problem No 1. I mentioned in this email.

Confused? I am :)

Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't.
With this patch I remove the flush and solve all problems by assuming a 
global tlb flush if you invalidate the first entry of the small OpenRISC 
TLB.
Max Filippov Oct. 29, 2013, 10:20 p.m. UTC | #3
On Wed, Oct 30, 2013 at 1:53 AM, Sebastian Macke <sebastian@macke.de> wrote:
> On 29/10/2013 2:01 PM, Max Filippov wrote:
>>
>> On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de>
>> wrote:
>>>
>>> At the moment there are two TLBs. The OpenRISC TLB followed
>>> by the QEMU's own TLB.
>>> At the end of the TLB miss handler a tlb_flush of QEMUs TLB
>>> is executed which is exactly what we want to avoid.
>>> As long as there is no context switch we don't have to flush the TLB.
>>
>> So this flush was needed in order to clean QEMU TLB in case
>> DTLB/ITLB translation was enabled/disabled, right? But since you
>> already have mmu index for nommu operation, wouldn't it be easier
>> to indicate mmu index correctly for data and code access and drop
>> this flush?
>>
>
> The mmu index is already set correctly and this patch removes the flush.

I'm not sure: cpu_mmu_index only checks SR_IME, not SR_DME.

> 1. Problem
> The problem is if there is a context switch.  OpenRISC clears its own small
> tlb page by page. But this does mean it flushes all pages in the big QEMU

I think there shouldn't be any entries in QEMU TLB other than those in
OpenRISC TLB, otherwise it would be possible to access unmapped virtual
addresses without getting pagefaults.

> TLB.  This is the reason why l.rfe instruction clears the tlb which is the
> instruction used to return to user mode
> But according to the specification this is wrong.
>
> 2. Problem which is the case you mentioned.
> Your are right, this is one solution and its written in the patchnotes as
> point 1.
> But this would not solve the problem No 1. I mentioned in this email.
>
> Confused? I am :)
>
> Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't.
> With this patch I remove the flush and solve all problems by assuming a
> global tlb flush if you invalidate the first entry of the small OpenRISC
> TLB.
Sebastian Macke Oct. 29, 2013, 11:14 p.m. UTC | #4
On 29/10/2013 3:20 PM, Max Filippov wrote:
> On Wed, Oct 30, 2013 at 1:53 AM, Sebastian Macke <sebastian@macke.de> wrote:
>> On 29/10/2013 2:01 PM, Max Filippov wrote:
>>> On Tue, Oct 29, 2013 at 11:04 PM, Sebastian Macke <sebastian@macke.de>
>>> wrote:
>>>> At the moment there are two TLBs. The OpenRISC TLB followed
>>>> by the QEMU's own TLB.
>>>> At the end of the TLB miss handler a tlb_flush of QEMUs TLB
>>>> is executed which is exactly what we want to avoid.
>>>> As long as there is no context switch we don't have to flush the TLB.
>>> So this flush was needed in order to clean QEMU TLB in case
>>> DTLB/ITLB translation was enabled/disabled, right? But since you
>>> already have mmu index for nommu operation, wouldn't it be easier
>>> to indicate mmu index correctly for data and code access and drop
>>> this flush?
>>>
>> The mmu index is already set correctly and this patch removes the flush.
> I'm not sure: cpu_mmu_index only checks SR_IME, not SR_DME.

And again, correct. I saw this problem too and wanted to correct it later.
But using Linux the IME and DME flags are always changed together and 
therefore distinguishing between them don't play a role.
So I forgot it in the end.

>> 1. Problem
>> The problem is if there is a context switch.  OpenRISC clears its own small
>> tlb page by page. But this does mean it flushes all pages in the big QEMU
> I think there shouldn't be any entries in QEMU TLB other than those in
> OpenRISC TLB, otherwise it would be possible to access unmapped virtual
> addresses without getting pagefaults.
Correct. And this was one dilemma I had. How to increase the whole speed 
of tlb misses by an order of magnitude
and not break any rules.

Without the two patches it works like this.
User mode OpenRISC TLB-Miss -> Exception -> QEMU TLB flush -> Set new 
page in the tlb handler -> return via l.rfe -> QEMU TLB flush

In the end we had always only one valid page in the QEMU TLB which is 
kind of crazy. ( Now I am unsure, because with my logic we would have 
never a valid page in the QEMU TLB and would run in an endless loop)

So with the patches QEMU's TLB could have indeed more entries than the 
OpenRISC TLB right now but which is 99% fine if you run it under Linux. 
Linux has the option to clear certain pages from the tlb which then of 
course will not reckognize those pages in QEMUs TLB. So there is a small 
chance for an error.

Three options:

1. Removing QEMUs TLBs.
2. Another option might be to make the flushing mmu_index dependend to 
increase the speed. But this is not implemented as far as I see.
3. I could also invalidate the previous page if the OpenRISC TLB entry 
is overwritten. But then I don't know the correct mmu_index. So this has 
to be done for all possible mmu indices.

Option three might be possible. I didn't think about this before.

Hopefully OpenRISC will get Hardware TLB refill next year. This would 
solve the problem.


>> TLB.  This is the reason why l.rfe instruction clears the tlb which is the
>> instruction used to return to user mode
>> But according to the specification this is wrong.
>>
>> 2. Problem which is the case you mentioned.
>> Your are right, this is one solution and its written in the patchnotes as
>> point 1.
>> But this would not solve the problem No 1. I mentioned in this email.
>>
>> Confused? I am :)
>>
>> Easy: l.rfe is not supposed to clear the tlb. It can but it shouldn't.
>> With this patch I remove the flush and solve all problems by assuming a
>> global tlb flush if you invalidate the first entry of the small OpenRISC
>> TLB.
diff mbox

Patch

diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
index 057821d..bac61e5 100644
--- a/target-openrisc/cpu.h
+++ b/target-openrisc/cpu.h
@@ -412,7 +412,7 @@  static inline void cpu_get_tb_cpu_state(CPUOpenRISCState *env,
     *pc = env->pc;
     *cs_base = 0;
     /* D_FLAG -- branch instruction exception */
-    *flags = (env->flags & D_FLAG) | (env->sr & SR_SM);
+    *flags = (env->flags & D_FLAG) | (env->sr & (SR_SM | SR_DME | SR_IME));
 }
 
 static inline int cpu_mmu_index(CPUOpenRISCState *env)
diff --git a/target-openrisc/interrupt_helper.c b/target-openrisc/interrupt_helper.c
index ae187f5..1ad3a78 100644
--- a/target-openrisc/interrupt_helper.c
+++ b/target-openrisc/interrupt_helper.c
@@ -25,10 +25,6 @@  void HELPER(rfe)(CPUOpenRISCState *env)
 {
     OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-#ifndef CONFIG_USER_ONLY
-    int need_flush_tlb = (cpu->env.sr & (SR_SM | SR_IME | SR_DME)) ^
-                         (cpu->env.esr & (SR_SM | SR_IME | SR_DME));
-#endif
     cpu->env.pc = cpu->env.epcr;
     ENV_SET_SR(&(cpu->env), cpu->env.esr);
 
@@ -48,10 +44,6 @@  void HELPER(rfe)(CPUOpenRISCState *env)
         cpu->env.tlb->cpu_openrisc_map_address_code =
             &cpu_openrisc_get_phys_nommu;
     }
-
-    if (need_flush_tlb) {
-        tlb_flush(&cpu->env, 1);
-    }
 #endif
     cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
 }
diff --git a/target-openrisc/sys_helper.c b/target-openrisc/sys_helper.c
index 3eda0e1..b7c987d 100644
--- a/target-openrisc/sys_helper.c
+++ b/target-openrisc/sys_helper.c
@@ -77,6 +77,10 @@  void HELPER(mtspr)(CPUOpenRISCState *env,
         idx = spr - TO_SPR(1, 512);
         if (!(rb & 1)) {
             tlb_flush_page(env, env->tlb->dtlb[0][idx].mr & TARGET_PAGE_MASK);
+            if (idx == 0) {
+                /* Assume that the whole tlb buffer must be flushed */
+                tlb_flush(&cpu->env, 1);
+            }
         }
         env->tlb->dtlb[0][idx].mr = rb;
         break;