Patchwork [19/26] target-xtensa: implement loop option

login
register
mail settings
Submitter Max Filippov
Date May 17, 2011, 10:32 p.m.
Message ID <1305671572-5899-20-git-send-email-jcmvbkbc@gmail.com>
Download mbox | patch
Permalink /patch/96066/
State New
Headers show

Comments

Max Filippov - May 17, 2011, 10:32 p.m.
See ISA, 4.3.2 for details.

Operations that change LEND SR value invalidate TBs at the old and at
the new LEND. LEND value at TB compilation time is considered constant
and loop instruction is generated based on this value.

Invalidation may be avoided for the TB at the old LEND address, since
looping code verifies actual LEND value.

Invalidation may be avoided for the TB at the new LEND address if
there's a way to associate LEND address with TB at compilation time and
later verify that it doesn't change.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
RFC -> PATCH changes:
- add XTENSA_TBFLAG_EXCM, use it in loop ending check;
---
 target-xtensa/cpu.h       |    8 +++++
 target-xtensa/helpers.h   |    1 +
 target-xtensa/op_helper.c |   11 +++++++
 target-xtensa/translate.c |   65 +++++++++++++++++++++++++++++++++++++++-----
 4 files changed, 77 insertions(+), 8 deletions(-)
Richard Henderson - May 19, 2011, 9:51 p.m.
On 05/17/2011 03:32 PM, Max Filippov wrote:
> +    if (env->sregs[LEND] != v) {
> +        tb_invalidate_phys_page_range(
> +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
> +        env->sregs[LEND] = v;
> +        tb_invalidate_phys_page_range(
> +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
> +    }

Why are you invalidating twice?

> +static void gen_check_loop_end(DisasContext *dc, int slot)
> +{
> +    if (option_enabled(dc, XTENSA_OPTION_LOOP) &&
> +            !(dc->tb->flags & XTENSA_TBFLAG_EXCM) &&
> +            dc->next_pc == dc->lend) {
> +        int label = gen_new_label();
> +
> +        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_SR[LEND], dc->next_pc, label);
> +        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_SR[LCOUNT], 0, label);
> +        tcg_gen_subi_i32(cpu_SR[LCOUNT], cpu_SR[LCOUNT], 1);
> +        gen_jump(dc, cpu_SR[LBEG]);
> +        gen_set_label(label);
> +        gen_jumpi(dc, dc->next_pc, slot);

If you're going to pretend that LEND is a constant, you might as well
pretend that LBEG is also a constant, so that you get to chain the TB's
around the loop.

> +static void gen_jumpi_check_loop_end(DisasContext *dc, int slot)
> +{
> +    gen_check_loop_end(dc, slot);
> +    gen_jumpi(dc, dc->next_pc, slot);

You're generating duplicate jumpi's here; that can probably be avoided
fairly easily.


r~
Max Filippov - May 20, 2011, 7:25 a.m.
> > +    if (env->sregs[LEND] != v) {
> > +        tb_invalidate_phys_page_range(
> > +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
> > +        env->sregs[LEND] = v;
> > +        tb_invalidate_phys_page_range(
> > +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
> > +    }
> 
> Why are you invalidating twice?

TB at the old LEND and at the new. Although it will work correctly without first invalidation.

> > +static void gen_check_loop_end(DisasContext *dc, int slot)
> > +{
> > +    if (option_enabled(dc, XTENSA_OPTION_LOOP) &&
> > +            !(dc->tb->flags & XTENSA_TBFLAG_EXCM) &&
> > +            dc->next_pc == dc->lend) {
> > +        int label = gen_new_label();
> > +
> > +        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_SR[LEND], dc->next_pc, label);
> > +        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_SR[LCOUNT], 0, label);
> > +        tcg_gen_subi_i32(cpu_SR[LCOUNT], cpu_SR[LCOUNT], 1);
> > +        gen_jump(dc, cpu_SR[LBEG]);
> > +        gen_set_label(label);
> > +        gen_jumpi(dc, dc->next_pc, slot);
> 
> If you're going to pretend that LEND is a constant, you might as well
> pretend that LBEG is also a constant, so that you get to chain the TB's
> around the loop.

But there may be three exits from TB at the LEND if its last command is a branch: to the LBEG, to the branch target and to the next insn.

Thanks.
-- Max
Max Filippov - May 20, 2011, 9:10 a.m.
>> > +    if (env->sregs[LEND] != v) {
>> > +        tb_invalidate_phys_page_range(
>> > +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
>> > +        env->sregs[LEND] = v;
>> > +        tb_invalidate_phys_page_range(
>> > +                env->sregs[LEND] - 1, env->sregs[LEND], 0);
>> > +    }
>>
>> Why are you invalidating twice?
>
> TB at the old LEND and at the new. Although it will work correctly without first invalidation.
>
>> > +static void gen_check_loop_end(DisasContext *dc, int slot)
>> > +{
>> > +    if (option_enabled(dc, XTENSA_OPTION_LOOP) &&
>> > +            !(dc->tb->flags & XTENSA_TBFLAG_EXCM) &&
>> > +            dc->next_pc == dc->lend) {
>> > +        int label = gen_new_label();
>> > +
>> > +        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_SR[LEND], dc->next_pc, label);
>> > +        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_SR[LCOUNT], 0, label);
>> > +        tcg_gen_subi_i32(cpu_SR[LCOUNT], cpu_SR[LCOUNT], 1);
>> > +        gen_jump(dc, cpu_SR[LBEG]);
>> > +        gen_set_label(label);
>> > +        gen_jumpi(dc, dc->next_pc, slot);
>>
>> If you're going to pretend that LEND is a constant, you might as well
>> pretend that LBEG is also a constant, so that you get to chain the TB's
>> around the loop.
>
> But there may be three exits from TB at the LEND if its last command is a branch: to the LBEG, to the branch target and to the next insn.

Ok, I guess that I need to add gen_wsr_lbeg that invalidates TB at the
current LEND, pretend that LBEG is constant and use given slot to jump
to it. And also to get rid of tcg_gen_brcondi_i32(TCG_COND_NE,
cpu_SR[LEND], dc->next_pc, label);
Richard Henderson - May 20, 2011, 2:14 p.m.
On 05/20/2011 02:10 AM, Max Filippov wrote:
>>> If you're going to pretend that LEND is a constant, you might as well
>>> pretend that LBEG is also a constant, so that you get to chain the TB's
>>> around the loop.
>>
>> But there may be three exits from TB at the LEND if its last
>> command is a branch: to the LBEG, to the branch target and to the
>> next insn.
> 
> Ok, I guess that I need to add gen_wsr_lbeg that invalidates TB at the
> current LEND, pretend that LBEG is constant and use given slot to jump
> to it. And also to get rid of tcg_gen_brcondi_i32(TCG_COND_NE,
> cpu_SR[LEND], dc->next_pc, label);

Yes.

Consider that the code is written to assume that the loop cycles,
so the most likely exit at LEND is LBEG.  If we choose to mirror
that logic inside TCG, then of the 3 possible exits from the block
one of them should be LBEG so that the most likely edge can get
chained.


r~

Patch

diff --git a/target-xtensa/cpu.h b/target-xtensa/cpu.h
index 223535c..026b854 100644
--- a/target-xtensa/cpu.h
+++ b/target-xtensa/cpu.h
@@ -106,6 +106,9 @@  enum {
 };
 
 enum {
+    LBEG = 0,
+    LEND = 1,
+    LCOUNT = 2,
     SAR = 3,
     SCOMPARE1 = 12,
     WINDOW_BASE = 72,
@@ -257,12 +260,17 @@  static inline int cpu_mmu_index(CPUState *env)
     return xtensa_get_cring(env) != 0;
 }
 
+#define XTENSA_TBFLAG_EXCM 0x1
+
 static inline void cpu_get_tb_cpu_state(CPUState *env, target_ulong *pc,
         target_ulong *cs_base, int *flags)
 {
     *pc = env->pc;
     *cs_base = 0;
     *flags = 0;
+    if (env->sregs[PS] & PS_EXCM) {
+        *flags |= XTENSA_TBFLAG_EXCM;
+    }
 }
 
 #include "cpu-all.h"
diff --git a/target-xtensa/helpers.h b/target-xtensa/helpers.h
index 0971fde..7babf73 100644
--- a/target-xtensa/helpers.h
+++ b/target-xtensa/helpers.h
@@ -12,6 +12,7 @@  DEF_HELPER_1(rotw, void, i32)
 DEF_HELPER_2(window_check, void, i32, i32)
 DEF_HELPER_0(restore_owb, void)
 DEF_HELPER_1(movsp, void, i32)
+DEF_HELPER_1(wsr_lend, void, i32)
 DEF_HELPER_0(dump_state, void)
 
 #include "def-helper.h"
diff --git a/target-xtensa/op_helper.c b/target-xtensa/op_helper.c
index 61d82cf..3a0fa01 100644
--- a/target-xtensa/op_helper.c
+++ b/target-xtensa/op_helper.c
@@ -280,6 +280,17 @@  void HELPER(movsp)(uint32_t pc)
     }
 }
 
+void HELPER(wsr_lend)(uint32_t v)
+{
+    if (env->sregs[LEND] != v) {
+        tb_invalidate_phys_page_range(
+                env->sregs[LEND] - 1, env->sregs[LEND], 0);
+        env->sregs[LEND] = v;
+        tb_invalidate_phys_page_range(
+                env->sregs[LEND] - 1, env->sregs[LEND], 0);
+    }
+}
+
 void HELPER(dump_state)(void)
 {
     cpu_dump_state(env, stderr, fprintf, 0);
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index 323d67b..e5e4ce7 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -46,6 +46,7 @@  typedef struct DisasContext {
     uint32_t pc;
     uint32_t next_pc;
     int mem_idx;
+    uint32_t lend;
     int is_jmp;
     int singlestep_enabled;
 
@@ -64,6 +65,9 @@  static TCGv_i32 cpu_UR[256];
 #include "gen-icount.h"
 
 static const char * const sregnames[256] = {
+    [LBEG] = "LBEG",
+    [LEND] = "LEND",
+    [LCOUNT] = "LCOUNT",
     [SAR] = "SAR",
     [SCOMPARE1] = "SCOMPARE1",
     [WINDOW_BASE] = "WINDOW_BASE",
@@ -237,13 +241,35 @@  static void gen_callwi(DisasContext *dc, int callinc, uint32_t dest, int slot)
     tcg_temp_free(tmp);
 }
 
+static void gen_check_loop_end(DisasContext *dc, int slot)
+{
+    if (option_enabled(dc, XTENSA_OPTION_LOOP) &&
+            !(dc->tb->flags & XTENSA_TBFLAG_EXCM) &&
+            dc->next_pc == dc->lend) {
+        int label = gen_new_label();
+
+        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_SR[LEND], dc->next_pc, label);
+        tcg_gen_brcondi_i32(TCG_COND_EQ, cpu_SR[LCOUNT], 0, label);
+        tcg_gen_subi_i32(cpu_SR[LCOUNT], cpu_SR[LCOUNT], 1);
+        gen_jump(dc, cpu_SR[LBEG]);
+        gen_set_label(label);
+        gen_jumpi(dc, dc->next_pc, slot);
+    }
+}
+
+static void gen_jumpi_check_loop_end(DisasContext *dc, int slot)
+{
+    gen_check_loop_end(dc, slot);
+    gen_jumpi(dc, dc->next_pc, slot);
+}
+
 static void gen_brcond(DisasContext *dc, TCGCond cond,
         TCGv_i32 t0, TCGv_i32 t1, uint32_t offset)
 {
     int label = gen_new_label();
 
     tcg_gen_brcond_i32(cond, t0, t1, label);
-    gen_jumpi(dc, dc->next_pc, 0);
+    gen_jumpi_check_loop_end(dc, 0);
     gen_set_label(label);
     gen_jumpi(dc, dc->pc + offset, 1);
 }
@@ -273,6 +299,11 @@  static void gen_rsr(DisasContext *dc, TCGv_i32 d, uint32_t sr)
     }
 }
 
+static void gen_wsr_lend(DisasContext *dc, uint32_t sr, TCGv_i32 v)
+{
+    gen_helper_wsr_lend(v);
+}
+
 static void gen_wsr_sar(DisasContext *dc, uint32_t sr, TCGv_i32 s)
 {
     tcg_gen_andi_i32(cpu_SR[sr], s, 0x3f);
@@ -292,6 +323,7 @@  static void gen_wsr(DisasContext *dc, uint32_t sr, TCGv_i32 s)
 {
     static void (* const wsr_handler[256])(DisasContext *dc,
             uint32_t sr, TCGv_i32 v) = {
+        [LEND] = gen_wsr_lend,
         [SAR] = gen_wsr_sar,
         [WINDOW_BASE] = gen_wsr_windowbase,
     };
@@ -1502,15 +1534,29 @@  static void disas_xtensa_insn(DisasContext *dc)
                     break;
 
                 case 8: /*LOOP*/
-                    TBD();
-                    break;
-
                 case 9: /*LOOPNEZ*/
-                    TBD();
-                    break;
-
                 case 10: /*LOOPGTZ*/
-                    TBD();
+                    HAS_OPTION(XTENSA_OPTION_LOOP);
+                    {
+                        uint32_t lend = dc->pc + RRI8_IMM8 + 4;
+                        TCGv_i32 tmp = tcg_const_i32(lend);
+
+                        tcg_gen_subi_i32(cpu_SR[LCOUNT], cpu_R[RRI8_S], 1);
+                        tcg_gen_movi_i32(cpu_SR[LBEG], dc->next_pc);
+                        gen_wsr_lend(dc, LEND, tmp);
+                        tcg_temp_free(tmp);
+
+                        if (BRI8_R > 8) {
+                            int label = gen_new_label();
+                            tcg_gen_brcondi_i32(
+                                    BRI8_R == 9 ? TCG_COND_NE : TCG_COND_GT,
+                                    cpu_R[RRI8_S], 0, label);
+                            gen_jumpi(dc, lend, 1);
+                            gen_set_label(label);
+                        }
+
+                        gen_jumpi(dc, dc->next_pc, 0);
+                    }
                     break;
 
                 default: /*reserved*/
@@ -1687,7 +1733,9 @@  static void disas_xtensa_insn(DisasContext *dc)
         break;
     }
 
+    gen_check_loop_end(dc, 0);
     dc->pc = dc->next_pc;
+
     return;
 
 invalid_opcode:
@@ -1731,6 +1779,7 @@  static void gen_intermediate_code_internal(
     dc.tb = tb;
     dc.pc = env->pc;
     dc.mem_idx = cpu_mmu_index(env);
+    dc.lend = env->sregs[LEND];
     dc.is_jmp = DISAS_NEXT;
 
     reset_sar_tracker(&dc);