Message ID | a38b25540910221831u1f43a337s8863b93930f275e9@mail.gmail.com |
---|---|
State | New |

TeLeMan wrote:
> Tested i386-softmmu only. Now tci can run Windows XP SP2 and its speed
> is about 6 times slower than jit.
> --
> SUN OF A BEACH

Great. Many thanks for the fixes, enhancements and for the testing, too.

Is patch 4 (call handling) needed, or is it an optimization?
If it is needed, the tcg disassembler has to be extended as well.

And did patch 5 (inline) speed up the code? I had expected
that static functions don't need inline, because the compiler
can optimize them anyway.

Regards,
Stefan

On Sat, Oct 24, 2009 at 02:58, Stefan Weil <weil@mail.berlios.de> wrote:
> Is patch 4 (call handling) needed, or is it an optimization?
> If it is needed, the tcg disassembler has to be extended as well.

In fact tci has no stack or clobber registers and doesn't need to
simulate the CPU's work. I am trying to remove tcg_reg_alloc() in
tcg_reg_alloc_op() & tcg_reg_alloc_call() and access the temporary
variables directly in tci.

> And did patch 5 (inline) speed up the code? I had expected
> that static functions don't need inline, because the compiler
> can optimize them anyway.

You are right, patch 5 is not needed.

On Sat, Oct 24, 2009 at 11:23:43AM +0800, TeLeMan wrote:
> On Sat, Oct 24, 2009 at 02:58, Stefan Weil <weil@mail.berlios.de> wrote:
> > Is patch 4 (call handling) needed, or is it an optimization?
> > If it is needed, the tcg disassembler has to be extended as well.
>
> In fact tci has no stack or clobber registers and doesn't need to
> simulate the CPU's work. I am trying to remove tcg_reg_alloc() in
> tcg_reg_alloc_op() & tcg_reg_alloc_call() and access the temporary
> variables directly in tci.

'Doesn't need' doesn't necessarily mean 'is better without', though.
Perhaps it's best for TCI to reflect the behaviour of other TCG targets
where possible? (You can then compare the code that is generated with
different numbers of registers, and different constraints, etc.)

Cheers,

diff --git a/tcg/tci.c b/tcg/tci.c
index e467b3a..81c415c 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -206,7 +206,7 @@ static uint16_t tci_read_r16(uint8_t **tb_ptr)
 }
 
 /* Read indexed register (16 bit signed) from bytecode. */
-static uint16_t tci_read_r16s(uint8_t **tb_ptr)
+static int16_t tci_read_r16s(uint8_t **tb_ptr)
 {
     uint16_t value = tci_read_reg16s(**tb_ptr);
     *tb_ptr += 1;
@@ -549,7 +549,7 @@ unsigned long tcg_qemu_tb_exec(uint8_t *tb_ptr)
             t0 = *tb_ptr++;
             t1 = tci_read_ri32(&tb_ptr);
             t2 = tci_read_ri32(&tb_ptr);
-            tci_write_reg32(t0, (t1 >> t2) | (t1 & (1UL << 31)));
+            tci_write_reg32(t0, ((int32_t)t1 >> t2));
             break;
 #ifdef TCG_TARGET_HAS_rot_i32
         case INDEX_op_rotl_i32:
@@ -794,7 +794,7 @@ unsigned long tcg_qemu_tb_exec(uint8_t *tb_ptr)
             t0 = *tb_ptr++;
             t1 = tci_read_ri64(&tb_ptr);
             t2 = tci_read_ri64(&tb_ptr);
-            tci_write_reg64(t0, (t1 >> t2) | (t1 & (1ULL << 63)));
+            tci_write_reg64(t0, ((int64_t)t1 >> t2));
             break;
 #ifdef TCG_TARGET_HAS_rot_i64
         case INDEX_op_rotl_i64:
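
For readers following the diff, here is a small standalone sketch of why the two kinds of change matter. It is not part of the patch: the helper names (sar32_old, sar32_new, sar32_portable, ret_u16, ret_s16) are made up for illustration, and only the two shift expressions are copied from the hunks above. The old expression does a logical shift and ORs back only bit 31, so the bits in between stay zero. The new expression relies on the compiler sign-extending a right shift of a negative value, which C leaves implementation-defined but GCC documents as arithmetic. A variant without that assumption is included for comparison. Shift counts are assumed to be in the range 0..31.

```c
#include <stdint.h>

/* Pre-patch expression: logical shift, then only the sign bit is restored. */
static uint32_t sar32_old(uint32_t t1, uint32_t t2)
{
    return (t1 >> t2) | (t1 & (UINT32_C(1) << 31));
}

/* Post-patch expression: shift the value as a signed 32 bit integer.
 * Assumes the compiler implements >> on negative values as an
 * arithmetic shift (implementation-defined in C, true for GCC). */
static uint32_t sar32_new(uint32_t t1, uint32_t t2)
{
    return (uint32_t)((int32_t)t1 >> t2);
}

/* Hypothetical portable variant: fill the vacated high bits by hand
 * when the sign bit is set. Valid for 0 <= t2 <= 31. */
static uint32_t sar32_portable(uint32_t t1, uint32_t t2)
{
    uint32_t fill = (t1 & (UINT32_C(1) << 31)) ? ~(UINT32_MAX >> t2) : 0;
    return (t1 >> t2) | fill;
}

/* Return type matters when the caller widens the result:
 * a uint16_t result is zero-extended, an int16_t result is sign-extended.
 * The cast to int16_t assumes two's complement wrap-around. */
static uint16_t ret_u16(uint16_t raw)
{
    return raw;
}

static int16_t ret_s16(uint16_t raw)
{
    return (int16_t)raw;
}
```

With t1 = 0x80000000 and t2 = 4, sar32_old() yields 0x88000000, while sar32_new() and sar32_portable() yield 0xF8000000. For non-negative inputs all three agree, which may be why the old code went unnoticed for a while. Likewise, widening ret_u16(0xFFFF) to int32_t gives 65535, whereas ret_s16(0xFFFF) gives -1.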