From patchwork Wed Jan 20 17:19:00 2010
X-Patchwork-Submitter: identifier scorpio
X-Patchwork-Id: 43326
Message-ID: <682404.21141.qm@web15904.mail.cnb.yahoo.com>
Date: Thu, 21 Jan 2010 01:19:00 +0800 (CST)
From: identifier scorpio
Subject: Re: [Qemu-devel] [PATCH] Porting TCG to alpha platform
To: Richard Henderson
Cc: qemu-devel@nongnu.org

Thank you all, especially Richard, for reviewing. I've partly amended my code according to your advice, but the result is not very encouraging: I can still run the linux-0.2.img image and still can't run MS Windows. I think most of your advice relates to performance and may significantly reduce the TB size. Below I'll append my newly generated patch against stable-0.10; in case it is mangled, I have also put it in the attachment.

Now, some answers to your questions.

> > +static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
> > +{
> > +    const char *ct_str = *pct_str;
> > +
> > +    switch(ct_str[0])
> > +    {
> > +    case 'r':
> ...
> > +    case 'L':
>
> Do you really need extra temporaries for L?  You already have 3.

In qemu_ld/st we must use $16, $17 and $18 as temporaries, because they are passed as arguments to helper functions such as qemu_ld/st_helpers[].

...
> Err.. "8 insns"?  You'd only ever need to output 5.  Also, why would you
> ever want to explicitly never elide one of these insns if you could?
> Say, if only L0 and L3 were non-zero?

Yes, the number of output instructions is 5, and my comment is a bit out of date. Your method here is more elegant, and I'll migrate to your "tcg_out_op_long()" version tomorrow.

...
> With I/J constraints you don't need this special casing.

I'm not very familiar with I/J constraints; I'll study them later.

...
> > +        tcg_out_reloc(s, s->code_ptr, R_ALPHA_REFQUAD, label_index, 0);
> > +        s->code_ptr += 4;
>
> I realize that it doesn't really matter what value you use here, so long
> as things are consistent with patch_reloc, but it'll be less confusing if
> you use the proper relocation type: R_ALPHA_BRADDR.

You are right, R_ALPHA_BRADDR is clearer.

...
> > +        tcg_out_inst2(s, opc^4, TMP_REG1, 1);
> > +    /* record relocation infor */
> > +        tcg_out_reloc(s, s->code_ptr, R_ALPHA_REFQUAD, label_index, 0);
> > +        s->code_ptr += 4;
>
> Bug: You've applied the relocation to the wrong instruction.
> Bug: What's with the "opc^4"?

What do you mean by "applied the relocation to the wrong instruction"? Can't I apply a relocation to the INDEX_op_brcond_i32 operation? The opc^4 here toggles between OP_BLBC (opcode 0x38) and OP_BLBS (opcode 0x3c). Ugly code :)

...
> > +    /* if VM is of 32-bit arch, clear higher 32-bit of addr */
>
> Use a zapnot insn for this.

zapnot is a good thing.

...
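For illustration only (this is not code from the patch): the five-instruction constant load and the zapnot suggestion discussed above can be modelled in plain C. The names below (sext16, build_const64, zapnot, check_const_load) are hypothetical. The sketch assumes the sequence ldah / lda / sll 32 / ldah / lda, where lda and ldah sign-extend their 16-bit displacement, so each 16-bit field has to absorb a carry from the field below it.

```c
#include <assert.h>
#include <stdint.h>

/* Branch opcodes named in the reply; they differ only in bit 2. */
enum { OP_BLBC = 0x38, OP_BLBS = 0x3c };

/* Sign-extend the low 16 bits of v, as lda/ldah do with their displacement. */
uint64_t sext16(uint64_t v)
{
    return ((v & 0xffffu) ^ 0x8000u) - 0x8000u;
}

/*
 * Hypothetical model of: ldah ra,d3($31); lda ra,d2(ra); sll ra,32,ra;
 * ldah ra,d1(ra); lda ra,d0(ra).  Returns the value left in ra.
 */
uint64_t build_const64(uint64_t arg)
{
    uint64_t f0 = arg & 0xffffu;
    uint64_t f1 = (arg >> 16) & 0xffffu;
    uint64_t f2 = (arg >> 32) & 0xffffu;
    uint64_t f3 = (arg >> 48) & 0xffffu;

    /* A field >= 0x8000 is added as a negative number, so the next field
       up must be one larger.  Adding 0x8000 and taking bit 16 also covers
       the case where the incremented field wrapped around to zero. */
    uint64_t c0 = (f0 + 0x8000u) >> 16;
    uint64_t c1 = (f1 + c0 + 0x8000u) >> 16;
    uint64_t c2 = (f2 + c1 + 0x8000u) >> 16;

    uint64_t d0 = f0;
    uint64_t d1 = (f1 + c0) & 0xffffu;
    uint64_t d2 = (f2 + c1) & 0xffffu;
    uint64_t d3 = (f3 + c2) & 0xffffu;

    uint64_t ra;
    ra  = sext16(d3) << 16;   /* ldah ra, d3($31) */
    ra += sext16(d2);         /* lda  ra, d2(ra)  */
    ra <<= 32;                /* sll  ra, 32, ra  */
    ra += sext16(d1) << 16;   /* ldah ra, d1(ra)  */
    ra += sext16(d0);         /* lda  ra, d0(ra)  */
    return ra;
}

/* Model of "zapnot r,mask,r": byte i survives only if bit i of mask is set. */
uint64_t zapnot(uint64_t r, unsigned mask)
{
    uint64_t keep = 0;
    for (unsigned i = 0; i < 8; i++) {
        if (mask & (1u << i)) {
            keep |= UINT64_C(0xff) << (8 * i);
        }
    }
    return r & keep;
}

/* Small self-check of the three points above. */
void check_const_load(void)
{
    assert(build_const64(UINT64_C(0x123456789abcdef0)) == UINT64_C(0x123456789abcdef0));
    assert(build_const64(UINT64_C(0x0000000080000000)) == UINT64_C(0x0000000080000000));
    assert(zapnot(UINT64_C(0x1122334455667788), 0x0f) == UINT64_C(0x55667788));
    assert((OP_BLBC ^ 4) == OP_BLBS);
}
```

Calling check_const_load() from a test main exercises the model. A mask of 0x0f keeps the low four bytes, which is why a single zapnot clears the upper 32 bits of a guest address.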
> You don't need to push/pop anything here.  $26 should be saved by the
> prologue we emitted, and $15 is call-saved.  What you could usefully do
> is define a register constraint for $27 so that TCG automatically loads
> the value into that register and saves you a register move here.

I push/pop them here just to be safe.

> > +    case INDEX_op_sar_i32:
> > +        tcg_out_inst4i(s, OP_SHIFT, args[1], 32, FUNC_SLL, args[1]);
> > +        tcg_out_inst4i(s, OP_SHIFT, args[1], 32, FUNC_SRA, args[1]);
>
> That last shift can be combined with the requested shift via addition.
> For constant input, this saves an insn; for register input, the addition
> can be done in parallel with the first shift.

I changed this to use "addl r, 0, r".

> For comparing 32-bit inputs, it doesn't actually matter how you extend
> the inputs, so long as you do it the same for both inputs.  Therefore the
> best solution here is to sign-extend both inputs with "addl r,0,r".  Note
> as well that you don't need temporaries, as the inputs only have 32-bits
> defined; high bits are garbage in, garbage out.

I changed this to use "addl r, 0, r" too.

> You'll also want to define INDEX_op_ext32s_i64 as "addl r,0,r".

Added.

> > +    case INDEX_op_div2_i32:
> > +    case INDEX_op_divu2_i32:
>
> Don't define these, but you will need to define
>
>   div_i32, divu_i32, rem_i32, remu_i32
>   div_i64, divu_i64, rem_i64, remu_i64

I think that when qemu meets x86 divide instructions, it calls helper functions to emulate them. Must I define div_i32/divu_i32/...?

...
> > +    tcg_out_push(s, TCG_REG_26);
> > +    tcg_out_push(s, TCG_REG_27);
> > +    tcg_out_push(s, TCG_REG_28);
> > +    tcg_out_push(s, TCG_REG_29);
>
> Of these only $26 needs to be saved.

Again, I save them to be safe.

++++++++++++++++++++++++++++++++++
Below is the newest patch.
...
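A sketch, not taken from the patch, of the 32-bit points in the exchange above: sign-extending both operands with "addl r,0,r" before a 64-bit compare, and folding the sign-extension into the shift count for sar_i32. The function names (sext32, sra64, sar32, ltu32, lt32, check_ext32) are hypothetical, and the instructions are modelled with unsigned C arithmetic rather than emitted.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "addl r,0,r": keep the low 32 bits, sign-extend to 64. */
uint64_t sext32(uint64_t r)
{
    return ((r & 0xffffffffu) ^ 0x80000000u) - 0x80000000u;
}

/* 64-bit arithmetic right shift for 1 <= n <= 63, avoiding signed >>. */
uint64_t sra64(uint64_t v, unsigned n)
{
    uint64_t fill = (v >> 63) ? ~(~UINT64_C(0) >> n) : 0;
    return (v >> n) | fill;
}

/* sar_i32 as "sll r,32,r ; sra r,32+n,r" for n in 0..31. */
uint64_t sar32(uint64_t r, unsigned n)
{
    return sra64(r << 32, 32 + n);
}

/* Unsigned 32-bit a < b, as cmpult on the sign-extended inputs. */
int ltu32(uint64_t a, uint64_t b)
{
    return sext32(a) < sext32(b);
}

/* Signed 32-bit a < b, as cmplt on the sign-extended inputs
   (flipping bit 63 turns a signed compare into an unsigned one). */
int lt32(uint64_t a, uint64_t b)
{
    const uint64_t sign = UINT64_C(1) << 63;
    return (sext32(a) ^ sign) < (sext32(b) ^ sign);
}

/* Small self-check; the high input bits are deliberately garbage. */
void check_ext32(void)
{
    assert(sext32(UINT64_C(0x1234567880000000)) == UINT64_C(0xffffffff80000000));
    assert(sar32(UINT64_C(0xaaaaaaaa80000000), 4) == UINT64_C(0xfffffffff8000000));
    assert(ltu32(UINT64_C(0xdead000000000001), UINT64_C(0xbeef0000ffffffff)));
    assert(lt32(UINT64_C(0xffffffff), 1));
}
```

Sign extension maps the 32-bit unsigned range monotonically into the 64-bit unsigned range, which is why the same extension serves both the signed and the unsigned compares.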
From 7cc2acddfb7333ab3f1f6b17fa8fa5dcdd3c0095 Mon Sep 17 00:00:00 2001
From: Dong Weiyu
Date: Wed, 20 Jan 2010 23:48:55 +0800
Subject: [PATCH] Porting TCG to alpha platform.

---
 cpu-all.h              |    2 +-
 tcg/alpha/tcg-target.c | 1196 ++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/alpha/tcg-target.h |   70 +++
 3 files changed, 1267 insertions(+), 1 deletions(-)
 create mode 100644 tcg/alpha/tcg-target.c
 create mode 100644 tcg/alpha/tcg-target.h

+
+/* used for function call generation */
+#define TCG_REG_CALL_STACK TCG_REG_30
+#define TCG_TARGET_STACK_ALIGN 16
+#define TCG_TARGET_CALL_STACK_OFFSET 0
+
+/* we have signed extension instructions */
+#define TCG_TARGET_HAS_ext8s_i32
+#define TCG_TARGET_HAS_ext16s_i32
+#define TCG_TARGET_HAS_ext8s_i64
+#define TCG_TARGET_HAS_ext16s_i64
+#define TCG_TARGET_HAS_ext32s_i64
+
+/* Note: must be synced with dyngen-exec.h */
+#define TCG_AREG0 TCG_REG_15
+#define TCG_AREG1 TCG_REG_9
+#define TCG_AREG2 TCG_REG_10
+#define TCG_AREG3 TCG_REG_11
+#define TCG_AREG4 TCG_REG_12
+#define TCG_AREG5 TCG_REG_13
+#define TCG_AREG6 TCG_REG_14
+
+#define TMP_REG1 TCG_REG_23
+#define TMP_REG2 TCG_REG_24
+#define TMP_REG3 TCG_REG_25
+
+static inline void flush_icache_range(unsigned long start, unsigned long stop)
+{
+    __asm__ __volatile__ ("call_pal 0x86");
+}
+

diff --git a/cpu-all.h b/cpu-all.h
index e0c3efd..bdf6fb2 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,7 +22,7 @@
 #include "qemu-common.h"

-#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__)
+#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__) || defined(__alpha__)
 #define WORDS_ALIGNED
 #endif

diff --git a/tcg/alpha/tcg-target.c b/tcg/alpha/tcg-target.c
new file mode 100644
index 0000000..143f576
--- /dev/null
+++ b/tcg/alpha/tcg-target.c
@@ -0,0 +1,1196 @@
+/*
+ * Tiny Code Generator for QEMU on ALPHA platform
+*/
+
+#ifndef NDEBUG
+static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
+    "$0", "$1", "$2", "$3", "$4", "$5", "$6", "$7",
+    "$8", "$9", "$10", "$11", "$12", "$13", "$14", "$15",
+    "$16", "$17", "$18", "$19", "$20", "$21", "$22", "$23",
+    "$24", "$25", "$26", "$27", "$28", "$29", "$30", "$31",
+};
+#endif
+
+/*
+ * $26 ~ $31 are special, reserved,
+ * and $25 is deliberately reserved for jcc operation
+ * and $0 is usually used for return function result, better allocate it later
+ * and $15 is used for cpu_env pointer, allocate it at last
+*/
+static const int tcg_target_reg_alloc_order[] = {
+    TCG_REG_9, TCG_REG_10, TCG_REG_11, TCG_REG_12, TCG_REG_13, TCG_REG_14,
+    TCG_REG_1, TCG_REG_2, TCG_REG_3, TCG_REG_4, TCG_REG_5, TCG_REG_6,
+    TCG_REG_7, TCG_REG_8, TCG_REG_22,
+    TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19, TCG_REG_20, TCG_REG_21
+};
+
+/*
+ * according to alpha calling convention, these 6 registers are used for
+ * function parameter passing. if function has more than 6 parameters, remained
+ * ones are stored on stack.
+*/
+static const int tcg_target_call_iarg_regs[6] = {
+    TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19, TCG_REG_20, TCG_REG_21
+};
+
+/*
+ * according to alpha calling convention, $0 is used for returning function result.
+*/
+static const int tcg_target_call_oarg_regs[1] = { TCG_REG_0 };
+
+/*
+ * save the address of TB's epilogue.
+*/
+static uint8_t *tb_ret_addr;
+
+#define INSN_OP(x) (((x) & 0x3f) << 26)
+#define INSN_FUNC1(x) (((x) & 0x3) << 14)
+#define INSN_FUNC2(x) (((x) & 0x7f) << 5)
+#define INSN_RA(x) (((x) & 0x1f) << 21)
+#define INSN_RB(x) (((x) & 0x1f) << 16)
+#define INSN_RC(x) ((x) & 0x1f)
+#define INSN_LIT(x) (((x) & 0xff) << 13)
+#define INSN_DISP16(x) ((x) & 0xffff)
+#define INSN_DISP21(x) ((x) & 0x1fffff)
+#define INSN_RSVED(x) ((x) & 0x3fff)
+
+#define INSN_JMP (INSN_OP(0x1a) | INSN_FUNC1(0))
+#define INSN_CALL (INSN_OP(0x1a) | INSN_FUNC1(1))
+#define INSN_RET (INSN_OP(0x1a) | INSN_FUNC1(2))
+#define INSN_BR INSN_OP(0x30)
+#define INSN_BEQ INSN_OP(0x39)
+#define INSN_BNE INSN_OP(0x3d)
+#define INSN_BLBC INSN_OP(0x38)
+#define INSN_BLBS INSN_OP(0x3c)
+#define INSN_ADDL (INSN_OP(0x10) | INSN_FUNC2(0))
+#define INSN_SUBL (INSN_OP(0x10) | INSN_FUNC2(0x9))
+#define INSN_ADDQ (INSN_OP(0x10) | INSN_FUNC2(0x20))
+#define INSN_SUBQ (INSN_OP(0x10) | INSN_FUNC2(0x29))
+#define INSN_CMPEQ (INSN_OP(0x10) | INSN_FUNC2(0x2d))
+#define INSN_CMPLT (INSN_OP(0x10) | INSN_FUNC2(0x4d))
+#define INSN_CMPLE (INSN_OP(0x10) | INSN_FUNC2(0x6d))
+#define INSN_CMPULT (INSN_OP(0x10) | INSN_FUNC2(0x1d))
+#define INSN_CMPULE (INSN_OP(0x10) | INSN_FUNC2(0x3d))
+#define INSN_MULL (INSN_OP(0x13) | INSN_FUNC2(0))
+#define INSN_MULQ (INSN_OP(0x13) | INSN_FUNC2(0x20))
+#define INSN_AND (INSN_OP(0x11) | INSN_FUNC2(0))
+#define INSN_BIS (INSN_OP(0x11) | INSN_FUNC2(0x20))
+#define INSN_XOR (INSN_OP(0x11) | INSN_FUNC2(0x40))
+#define INSN_SLL (INSN_OP(0x12) | INSN_FUNC2(0x39))
+#define INSN_SRL (INSN_OP(0x12) | INSN_FUNC2(0x34))
+#define INSN_SRA (INSN_OP(0x12) | INSN_FUNC2(0x3c))
+#define INSN_ZAPNOT (INSN_OP(0x12) | INSN_FUNC2(0x31))
+#define INSN_SEXTB (INSN_OP(0x1c) | INSN_FUNC2(0))
+#define INSN_SEXTW (INSN_OP(0x1c) | INSN_FUNC2(0x1))
+#define INSN_LDA INSN_OP(0x8)
+#define INSN_LDAH INSN_OP(0x9)
+#define INSN_LDBU INSN_OP(0xa)
+#define INSN_LDWU INSN_OP(0xc)
+#define INSN_LDL INSN_OP(0x28)
+#define INSN_LDQ INSN_OP(0x29)
+#define INSN_STB INSN_OP(0xe)
+#define INSN_STW INSN_OP(0xd)
+#define INSN_STL INSN_OP(0x2c)
+#define INSN_STQ INSN_OP(0x2d)
+
+/*
+ * return the # of regs used for parameter passing on procedure calling.
+ * note that alpha uses $16~$21 to transfer the first 6 parameters of a procedure.
+*/
+static inline int tcg_target_get_call_iarg_regs_count(int flags)
+{
+    return 6;
+}
+
+/*
+ * given constraint, return available register set. this function is called once
+ * for each op at qemu's initialization stage.
+*/
+static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
+{
+    const char *ct_str = *pct_str;
+
+    switch(ct_str[0])
+    {
+    case 'r':
+        /* constraint 'r' means any register is okay */
+        ct->ct |= TCG_CT_REG;
+        tcg_regset_set32(ct->u.regs, 0, 0xffffffffu);
+        break;
+
+    case 'L':
+        /*
+         * constraint 'L' is used for qemu_ld/st, which has 2 meanings:
+         * 1st, the argument needs to be allocated a register.
+         * 2nd, we should reserve some registers that belong to caller-clobbered
+         * list for qemu_ld/st local usage, so these registers must not be
+         * allocated to the argument that the 'L' constraint is describing.
+         *
+         * note that op qemu_ld/st has the TCG_OPF_CALL_CLOBBER flag, and
+         * tcg will free all call-clobbered registers before generating target
+         * insn for qemu_ld/st, so we can use these registers directly without
+         * worrying about destroying their content.
+         */
+        ct->ct |= TCG_CT_REG;
+        tcg_regset_set32(ct->u.regs, 0, 0xffffffffu);
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_0);
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_16);
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_17);
+        tcg_regset_reset_reg(ct->u.regs, TCG_REG_18);
+        break;
+
+    default:
+        return -1;
+    }
+
+    ct_str++;
+    *pct_str = ct_str;
+    return 0;
+}
+
+/*
+ * whether op's input argument may use constant
+*/
+static inline int tcg_target_const_match( \
+    tcg_target_long val, const TCGArgConstraint *arg_ct)
+{
+    int ct = arg_ct->ct;
+    return (ct & TCG_CT_CONST) ? 1 : 0;
+}
+
+static inline void tcg_out_fmt_br(TCGContext *s, int opc, int ra, int disp)
+{
+    tcg_out32(s, (opc)|INSN_RA(ra)|INSN_DISP21(disp));
+}
+
+static inline void tcg_out_fmt_mem(TCGContext *s, int opc, int ra, int rb, int disp)
+{
+    tcg_out32(s, (opc)|INSN_RA(ra)|INSN_RB(rb)|INSN_DISP16(disp));
+}
+
+static inline void tcg_out_fmt_jmp(TCGContext *s, int opc, int ra, int rb, int rsved)
+{
+    tcg_out32(s, (opc)|INSN_RA(ra)|INSN_RB(rb)|INSN_RSVED(rsved));
+}
+
+static inline void tcg_out_fmt_opr(TCGContext *s, int opc, int ra, int rb, int rc)
+{
+    tcg_out32(s, (opc)|INSN_RA(ra)|INSN_RB(rb)|INSN_RC(rc));
+}
+
+static inline void tcg_out_fmt_opi(TCGContext *s, int opc, int ra, int lit, int rc)
+{
+    tcg_out32(s, (opc)|INSN_RA(ra)|INSN_LIT(lit)|INSN_RC(rc)|(1<<12));
+}
+
+/*
+ * mov from a reg to another
+*/
+static inline void tcg_out_mov(TCGContext *s, int rc, int rb)
+{
+    if ( rb != rc ) {
+        tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, rb, rc);
+    }
+}
+
+/*
+ * mov a 64-bit immediate 'arg' to register 'ra', this function will
+ * generate fixed length (5 insns) of target insn sequence.
+*/
+static void tcg_out_movi_fixl( \
+    TCGContext *s, TCGType type, int ra, tcg_target_long arg)
+{
+    tcg_target_long l0, l1, l2, l3;
+    tcg_target_long l1_tmp, l2_tmp, l3_tmp;
+
+    l0 = arg & 0xffffu;
+    l1_tmp = l1 = ( arg >> 16) & 0xffffu;
+    l2_tmp = l2 = ( arg >> 32) & 0xffffu;
+    l3_tmp = l3 = ( arg >> 48) & 0xffffu;
+
+    if ( l0 & 0x8000u)
+        l1_tmp = (l1 + 1) & 0xffffu;
+    if ( (l1_tmp & 0x8000u) || ((l1_tmp == 0) && (l1_tmp != l1)))
+        l2_tmp = (l2 + 1) & 0xffffu;
+    if ( (l2_tmp & 0x8000u) || ((l2_tmp == 0) && (l2_tmp != l2)))
+        l3_tmp = (l3 + 1) & 0xffffu;
+
+    tcg_out_fmt_mem(s, INSN_LDAH, ra, TCG_REG_31, l3_tmp);
+    tcg_out_fmt_mem(s, INSN_LDA, ra, ra, l2_tmp);
+    tcg_out_fmt_opi(s, INSN_SLL, ra, 32, ra);
+    tcg_out_fmt_mem(s, INSN_LDAH, ra, ra, l1_tmp);
+    tcg_out_fmt_mem(s, INSN_LDA, ra, ra, l0);
+}
+
+/*
+ * mov 64-bit immediate 'arg' to register 'ra'. this function will
+ * generate variable length of target insn sequence.
+*/
+static inline void tcg_out_movi( \
+    TCGContext *s, TCGType type, int ra, tcg_target_long arg)
+{
+    if ( type == TCG_TYPE_I32)
+        arg = (int32_t)arg;
+
+    if( arg == (int16_t)arg ) {
+        tcg_out_fmt_mem(s, INSN_LDA, ra, TCG_REG_31, arg);
+    } else if( arg == (int32_t)arg ) {
+        tcg_out_fmt_mem(s, INSN_LDAH, ra, TCG_REG_31, (arg>>16));
+        if( arg & ((tcg_target_ulong)0x8000) ) {
+            tcg_out_fmt_mem(s, INSN_LDAH, ra, ra, 1);
+        }
+        tcg_out_fmt_mem(s, INSN_LDA, ra, ra, arg);
+    } else {
+        tcg_out_movi_fixl(s, type, ra, arg);
+    }
+}
+
+static inline int _is_tmp_reg( int r)
+{
+    if ( r == TMP_REG1 || r == TMP_REG2 || r == TMP_REG3)
+        return 1;
+    else
+        return 0;
+}
+
+/*
+ * load value in disp(Rb) to Ra.
+*/
+static inline void tcg_out_ld( \
+    TCGContext *s, TCGType type, int ra, int rb, tcg_target_long disp)
+{
+    int opc;
+
+    if ( _is_tmp_reg(ra) || _is_tmp_reg(rb))
+        tcg_abort();
+
+    opc = ((type == TCG_TYPE_I32) ?
+        INSN_LDL : INSN_LDQ);
+
+    if( disp != (int16_t)disp ) {
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, disp);
+        tcg_out_fmt_opr(s, INSN_ADDQ, rb, TMP_REG1, TMP_REG1);
+        tcg_out_fmt_mem(s, opc, ra, TMP_REG1, 0);
+    }
+    else
+        tcg_out_fmt_mem(s, opc, ra, rb, disp);
+}
+
+/*
+ * store value in Ra to disp(Rb).
+*/
+static inline void tcg_out_st( \
+    TCGContext *s, TCGType type, int ra, int rb, tcg_target_long disp)
+{
+    int opc;
+
+    if ( _is_tmp_reg(ra) || _is_tmp_reg(rb))
+        tcg_abort();
+
+    opc = ((type == TCG_TYPE_I32) ? INSN_STL : INSN_STQ);
+
+    if( disp != (int16_t)disp ) {
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, disp);
+        tcg_out_fmt_opr(s, INSN_ADDQ, rb, TMP_REG1, TMP_REG1);
+        tcg_out_fmt_mem(s, opc, ra, TMP_REG1, 0);
+    }
+    else
+        tcg_out_fmt_mem(s, opc, ra, rb, disp);
+}
+
+/*
+ * generate arithmetic instruction with immediate. ra is used as both
+ * input and output, and val is used as another input.
+*/
+static inline void tgen_arithi( \
+    TCGContext *s, int opc, int ra, tcg_target_long val)
+{
+    if ( _is_tmp_reg(ra))
+        tcg_abort();
+
+    if (val == (uint8_t)val) {
+        tcg_out_fmt_opi(s, opc, ra, val, ra);
+    } else {
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, val);
+        tcg_out_fmt_opr(s, opc, ra, TMP_REG1, ra);
+    }
+}
+
+static void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val)
+{
+    if (val != 0)
+        tgen_arithi(s, INSN_ADDQ, reg, val);
+}
+
+static inline void tcg_out_push(TCGContext *s, int reg)
+{
+    tcg_out_fmt_opi(s, INSN_SUBQ, TCG_REG_30, 8, TCG_REG_30);
+    tcg_out_fmt_mem(s, INSN_STQ, reg, TCG_REG_30, 0);
+}
+
+static inline void tcg_out_pop(TCGContext *s, int reg)
+{
+    tcg_out_fmt_mem(s, INSN_LDQ, reg, TCG_REG_30, 0);
+    tcg_out_fmt_opi(s, INSN_ADDQ, TCG_REG_30, 8, TCG_REG_30);
+}
+
+static const uint64_t tcg_cond_to_jcc[10] = {
+    [TCG_COND_EQ] = INSN_CMPEQ,
+    [TCG_COND_NE] = INSN_CMPEQ,
+    [TCG_COND_LT] = INSN_CMPLT,
+    [TCG_COND_GE] = INSN_CMPLT,
+    [TCG_COND_LE] = INSN_CMPLE,
+    [TCG_COND_GT] = INSN_CMPLE,
+    [TCG_COND_LTU] = INSN_CMPULT,
+    [TCG_COND_GEU] =
+        INSN_CMPULT,
+    [TCG_COND_LEU] = INSN_CMPULE,
+    [TCG_COND_GTU] = INSN_CMPULE
+};
+
+static void patch_reloc(uint8_t *code_ptr, \
+    int type, tcg_target_long value, tcg_target_long addend)
+{
+    TCGContext s;
+    tcg_target_long val;
+
+    if ( type != R_ALPHA_BRADDR)
+        tcg_abort();
+
+    s.code_ptr = code_ptr;
+    val = (value - (tcg_target_long)s.code_ptr - 4) >> 2;
+    if ( !(val >= -0x100000 && val < 0x100000)) {
+        tcg_abort();
+    }
+
+    tcg_out_fmt_br(&s, INSN_BR, TCG_REG_31, val);
+}
+
+static void tcg_out_br(TCGContext *s, int label_index)
+{
+    TCGLabel *l = &s->labels[label_index];
+
+    if (l->has_value) {
+        tcg_target_long val;
+        val = ((tcg_target_long)(l->u.value) - (tcg_target_long)s->code_ptr - 4) >> 2;
+        if ( val >= -0x100000 && val < 0x100000) {
+            // if distance can be put into 21-bit field
+            tcg_out_fmt_br(s, INSN_BR, TCG_REG_31, val);
+        } else {
+            tcg_abort();
+        }
+    } else {
+        tcg_out_reloc(s, s->code_ptr, R_ALPHA_BRADDR, label_index, 0);
+        s->code_ptr += 4;
+    }
+}
+
+static void tcg_out_brcond( TCGContext *s, int cond, \
+    TCGArg arg1, TCGArg arg2, int const_arg2, int label_index)
+{
+    int opc;
+    TCGLabel *l = &s->labels[label_index];
+
+    if ( cond < TCG_COND_EQ || cond > TCG_COND_GTU || const_arg2)
+        tcg_abort();
+
+    opc = tcg_cond_to_jcc[cond];
+    tcg_out_fmt_opr(s, opc, arg1, arg2, TMP_REG1);
+
+    if (l->has_value) {
+        tcg_target_long val;
+        val = ((tcg_target_long)l->u.value - (tcg_target_long)s->code_ptr - 4) >> 2;
+        if ( val >= -0x100000 && val < 0x100000) {
+            // if distance can be put into 21-bit field
+            opc = (cond & 1) ? INSN_BLBC : INSN_BLBS;
+            tcg_out_fmt_br(s, opc, TMP_REG1, val);
+        } else {
+            tcg_abort();
+        }
+    } else {
+        opc = (cond & 1) ?
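+            INSN_BLBS : INSN_BLBC;
+        tcg_out_fmt_br(s, opc, TMP_REG1, 1);
+        tcg_out_reloc(s, s->code_ptr, R_ALPHA_BRADDR, label_index, 0);
+        s->code_ptr += 4;
+    }
+}
+
+#if defined(CONFIG_SOFTMMU)
+
+#include "../../softmmu_defs.h"
+
+static void *qemu_ld_helpers[4] = {
+    __ldb_mmu,
+    __ldw_mmu,
+    __ldl_mmu,
+    __ldq_mmu,
+};
+
+static void *qemu_st_helpers[4] = {
+    __stb_mmu,
+    __stw_mmu,
+    __stl_mmu,
+    __stq_mmu,
+};
+
+#endif
+
+static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc)
+{
+    int addr_reg, data_reg, r0, r1, mem_index, s_bits;
+    tcg_target_long val;
+
+#if defined(CONFIG_SOFTMMU)
+    uint8_t *label1_ptr, *label2_ptr;
+#endif
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+    s_bits = opc & 3;
+
+    r0 = TCG_REG_16;
+    r1 = TCG_REG_17;
+
+#if defined(CONFIG_SOFTMMU)
+
+    tcg_out_mov(s, r1, addr_reg);
+    tcg_out_mov(s, r0, addr_reg);
+
+#if TARGET_LONG_BITS == 32
+    /* if VM is of 32-bit arch, clear higher 32-bit of addr */
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r0, 0x0f, r0);
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r1, 0x0f, r1);
+#endif
+
+    tgen_arithi(s, INSN_AND, r0, TARGET_PAGE_MASK|((1<<s_bits)-1));
+
+    tgen_arithi(s, INSN_SRL, r1, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+    tgen_arithi(s, INSN_AND, r1, (CPU_TLB_SIZE-1) << CPU_TLB_ENTRY_BITS);
+
+    tcg_out_addi(s, r1, offsetof(CPUState, tlb_table[mem_index][0].addr_read));
+    tcg_out_fmt_opr(s, INSN_ADDQ, r1, TCG_REG_15, r1);
+
+#if TARGET_LONG_BITS == 32
+    tcg_out_fmt_mem(s, INSN_LDL, TMP_REG1, r1, 0);
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, TMP_REG1, 0x0f, TMP_REG1);
+#else
+    tcg_out_fmt_mem(s, INSN_LDQ, TMP_REG1, r1, 0);
+#endif
+
+    //
+    // now, r0 contains the page# and TMP_REG1 contains the addr to tlb_entry.addr_read
+    // we below will compare them
+    //
+    tcg_out_fmt_opr(s, INSN_CMPEQ, TMP_REG1, r0, TMP_REG1);
+
+    tcg_out_mov(s, r0, addr_reg);
+#if TARGET_LONG_BITS == 32
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r0, 0x0f, r0);
+#endif
+
+    //
+    // if equal, we jump to label1. since label1 is not resolved yet,
+    // we just record a relocation.
+    //
+    label1_ptr = s->code_ptr;
+    s->code_ptr += 4;
+
+    //
+    // here, unequal, TLB-miss.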
+    //
+    tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_17, mem_index);
+    tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, (tcg_target_long)qemu_ld_helpers[s_bits]);
+    tcg_out_push(s, addr_reg);
+    //tcg_out_push(s, TCG_REG_26);
+    //tcg_out_push(s, TCG_REG_15);
+    tcg_out_mov(s, TCG_REG_27, TMP_REG1);
+    tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, TMP_REG1, 0);
+    //tcg_out_pop(s, TCG_REG_15);
+    //tcg_out_pop(s, TCG_REG_26);
+    tcg_out_pop(s, addr_reg);
+
+    //
+    // after helper function call, the result of ld is saved in $0
+    //
+    switch(opc) {
+    case 0 | 4:
+        tcg_out_fmt_opr(s, INSN_SEXTB, TCG_REG_31, TCG_REG_0, data_reg);
+        break;
+    case 1 | 4:
+        tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, TCG_REG_0, data_reg);
+        break;
+    case 2 | 4:
+        tcg_out_fmt_opr(s, INSN_ADDL, TCG_REG_0, TCG_REG_31, data_reg);
+        break;
+    case 0:
+        tcg_out_fmt_opi(s, INSN_ZAPNOT, TCG_REG_0, 0x1, data_reg);
+        break;
+    case 1:
+        tcg_out_fmt_opi(s, INSN_ZAPNOT, TCG_REG_0, 0x3, data_reg);
+        break;
+    case 2:
+        tcg_out_fmt_opi(s, INSN_ZAPNOT, TCG_REG_0, 0xf, data_reg);
+        break;
+    case 3:
+        tcg_out_mov(s, data_reg, TCG_REG_0);
+        break;
+    default:
+        tcg_abort();
+        break;
+    }
+
+    //
+    // we have done, jmp to label2. label2 is not resolved yet,
+    // we record a relocation.
+    //
+    label2_ptr = s->code_ptr;
+    s->code_ptr += 4;
+
+    // patch jmp to label1
+    val = (s->code_ptr - label1_ptr - 4) >> 2;
+    if ( !(val >= -0x100000 && val < 0x100000)) {
+        tcg_abort();
+    }
+    *(uint32_t *)label1_ptr = (uint32_t) \
+        ( INSN_BNE | ( TMP_REG1 << 21 ) | ( val & 0x1fffff));
+
+    //
+    // if we get here, a TLB entry is hit, r0 contains the guest addr and
+    // r1 contains the ptr that point to tlb_entry.addr_read.
+    // what we should do is to load the tlb_entry.addend (64-bit on alpha)
+    // and add it to r0 to get the host VA
+    //
+    tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, \
+        offsetof(CPUTLBEntry, addend) - offsetof(CPUTLBEntry, addr_read));
+    tcg_out_fmt_opr(s, INSN_ADDQ, r1, TMP_REG1, r1);
+    tcg_out_fmt_mem(s, INSN_LDQ, TMP_REG1, r1, 0);
+    tcg_out_fmt_opr(s, INSN_ADDQ, r0, TMP_REG1, r0);
+
+#else
+    r0 = addr_reg;
+#endif // endif defined(CONFIG_SOFTMMU)
+
+#ifdef TARGET_WORDS_BIGENDIAN
+    tcg_abort();
+#endif
+
+    //
+    // when we get here, r0 contains the host VA that can be used to access guest PA
+    //
+    switch(opc) {
+    case 0:
+        tcg_out_fmt_mem(s, INSN_LDBU, data_reg, r0, 0);
+        break;
+    case 0 | 4:
+        tcg_out_fmt_mem(s, INSN_LDBU, data_reg, r0, 0);
+        tcg_out_fmt_opr(s, INSN_SEXTB, TCG_REG_31, data_reg, data_reg);
+        break;
+    case 1:
+        tcg_out_fmt_mem(s, INSN_LDWU, data_reg, r0, 0);
+        break;
+    case 1 | 4:
+        tcg_out_fmt_mem(s, INSN_LDWU, data_reg, r0, 0);
+        tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, data_reg, data_reg);
+        break;
+    case 2:
+        tcg_out_fmt_mem(s, INSN_LDL, data_reg, r0, 0);
+        tcg_out_fmt_opi(s, INSN_ZAPNOT, data_reg, 0xf, data_reg);
+        break;
+    case 2 | 4:
+        tcg_out_fmt_mem(s, INSN_LDL, data_reg, r0, 0);
+        break;
+    case 3:
+        tcg_out_fmt_mem(s, INSN_LDQ, data_reg, r0, 0);
+        break;
+    default:
+        tcg_abort();
+    }
+
+#if defined(CONFIG_SOFTMMU)
+    /* label2: */
+    val = (s->code_ptr - label2_ptr - 4) >> 2;
+    if ( !(val >= -0x100000 && val < 0x100000)) {
+        tcg_abort();
+    }
+    *(uint32_t *)label2_ptr = (uint32_t)( INSN_BR \
+        | ( TCG_REG_31 << 21 ) | ( val & 0x1fffff) );
+#endif
+}
+
+static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc)
+{
+    int addr_reg, data_reg, r0, r1, mem_index, s_bits;
+    tcg_target_long val;
+
+#if defined(CONFIG_SOFTMMU)
+    uint8_t *label1_ptr, *label2_ptr;
+#endif
+
+    data_reg = *args++;
+    addr_reg = *args++;
+    mem_index = *args;
+    s_bits = opc&3;
+
+    r0 = TCG_REG_16;
+    r1 = TCG_REG_17;
+
+#if defined(CONFIG_SOFTMMU)
+
+    tcg_out_mov(s, r1, addr_reg);
+    tcg_out_mov(s, r0, addr_reg);
+
+#if TARGET_LONG_BITS == 32
+    /* if VM is of 32-bit arch, clear higher 32-bit of addr */
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r0, 0x0f, r0);
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r1, 0x0f, r1);
+#endif
+
+    tgen_arithi(s, INSN_AND, r0, TARGET_PAGE_MASK | ((1 << s_bits) - 1));
+
+    tgen_arithi(s, INSN_SRL, r1, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+    tgen_arithi(s, INSN_AND, r1, (CPU_TLB_SIZE-1) << CPU_TLB_ENTRY_BITS);
+
+    tcg_out_addi(s, r1, offsetof(CPUState, tlb_table[mem_index][0].addr_write));
+    tcg_out_fmt_opr(s, INSN_ADDQ, r1, TCG_REG_15, r1);
+
+#if TARGET_LONG_BITS == 32
+    tcg_out_fmt_mem(s, INSN_LDL, TMP_REG1, r1, 0);
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, TMP_REG1, 0x0f, TMP_REG1);
+#else
+    tcg_out_fmt_mem(s, INSN_LDQ, TMP_REG1, r1, 0);
+#endif
+
+    //
+    // now, r0 contains the page# and TMP_REG1 contains the value of tlb_entry.addr_write
+    // we below will compare them
+    //
+    tcg_out_fmt_opr(s, INSN_CMPEQ, TMP_REG1, r0, TMP_REG1);
+
+    tcg_out_mov(s, r0, addr_reg);
+#if TARGET_LONG_BITS == 32
+    tcg_out_fmt_opi(s, INSN_ZAPNOT, r0, 0x0f, r0);
+#endif
+
+    //
+    // if equal, we jump to label1. since label1 is not resolved yet,
+    // we just record a relocation.
+    //
+    label1_ptr = s->code_ptr;
+    s->code_ptr += 4;
+
+    // here, unequal, TLB-miss, ...
+    tcg_out_mov(s, TCG_REG_17, data_reg);
+    tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_18, mem_index);
+    tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, (tcg_target_long)qemu_st_helpers[s_bits]);
+
+    tcg_out_push(s, data_reg);
+    tcg_out_push(s, addr_reg);
+    //tcg_out_push(s, TCG_REG_26);
+    //tcg_out_push(s, TCG_REG_15);
+    tcg_out_mov(s, TCG_REG_27,TMP_REG1);
+    tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, TMP_REG1, 0);
+    //tcg_out_pop(s, TCG_REG_15);
+    //tcg_out_pop(s, TCG_REG_26);
+    tcg_out_pop(s, addr_reg);
+    tcg_out_pop(s, data_reg);
+
+    //
+    // we have done, jmp to label2. label2 is not resolved yet,
+    // we record a relocation.
+    //
+    label2_ptr = s->code_ptr;
+    s->code_ptr += 4;
+
+    // patch jmp to label1
+    val = (s->code_ptr - label1_ptr - 4) >> 2;
+    if ( !(val >= -0x100000 && val < 0x100000)) {
+        tcg_abort();
+    }
+    *(uint32_t *)label1_ptr = (uint32_t) \
+        ( INSN_BNE | ( TMP_REG1 << 21 ) | ( val & 0x1fffff));
+
+    //
+    // if we get here, a TLB entry is hit, r0 contains the guest addr and
+    // r1 contains the ptr that points to tlb_entry.addr_write. what we should
+    // do is to load the tlb_entry.addend (64-bit on alpha) and add it to
+    // r0 to get the host VA
+    //
+    tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, \
+        offsetof(CPUTLBEntry, addend) - offsetof(CPUTLBEntry, addr_write));
+    tcg_out_fmt_opr(s, INSN_ADDQ, r1, TMP_REG1, r1);
+    tcg_out_fmt_mem(s, INSN_LDQ, TMP_REG1, r1, 0);
+    tcg_out_fmt_opr(s, INSN_ADDQ, r0, TMP_REG1, r0);
+
+#else
+    r0 = addr_reg;
+#endif
+
+#ifdef TARGET_WORDS_BIGENDIAN
+    tcg_abort();
+#endif
+
+    //
+    // when we get here, r0 contains the host VA that can be used to access guest PA
+    //
+    switch(opc) {
+    case 0:
+        tcg_out_fmt_mem(s, INSN_STB, data_reg, r0, 0);
+        break;
+    case 1:
+        tcg_out_fmt_mem(s, INSN_STW, data_reg, r0, 0);
+        break;
+    case 2:
+        tcg_out_fmt_mem(s, INSN_STL, data_reg, r0, 0);
+        break;
+    case 3:
+        tcg_out_fmt_mem(s, INSN_STQ, data_reg, r0, 0);
+        break;
+    default:
+        tcg_abort();
+    }
+
+#if defined(CONFIG_SOFTMMU)
+    /* patch jmp to label2: */
+    val = (s->code_ptr - label2_ptr - 4) >> 2;
+    if ( !(val >= -0x100000 && val < 0x100000)) {
+        tcg_abort();
+    }
+    *(uint32_t *)label2_ptr = (uint32_t)( INSN_BR \
+        | ( TCG_REG_31 << 21 ) | ( val & 0x1fffff));
+#endif
+}
+
+static inline void tgen_ldxx( TCGContext *s, int ra, int rb, tcg_target_long disp, int flags)
+{
+    int opc_array[4] = { INSN_LDBU, INSN_LDWU, INSN_LDL, INSN_LDQ};
+    int opc = opc_array[flags & 3];
+
+    if ( _is_tmp_reg(ra) || _is_tmp_reg(rb))
+        tcg_abort();
+
+    if( disp != (int16_t)disp ) {
+        /* disp cannot be stored in insn directly */
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, disp);
+        tcg_out_fmt_opr(s, INSN_ADDQ, rb, TMP_REG1, TMP_REG1);
+        tcg_out_fmt_mem(s, opc, ra, TMP_REG1, 0);
+    } else {
+        tcg_out_fmt_mem(s, opc, ra, rb, disp);
+    }
+
+    switch ( flags & 7) {
+    case 0:
+    case 1:
+    case 2|4:
+    case 3:
+        break;
+    case 0|4:
+        tcg_out_fmt_opr(s, INSN_SEXTB, TCG_REG_31, ra, ra);
+        break;
+    case 1|4:
+        tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, ra, ra);
+        break;
+    case 2:
+        tcg_out_fmt_opi(s, INSN_ZAPNOT, ra, 0x0f, ra);
+        break;
+    default:
+        tcg_abort();
+    }
+}
+
+static inline void tgen_stxx( TCGContext *s, int ra, int rb, tcg_target_long disp, int flags)
+{
+    int opc_array[4] = { INSN_STB, INSN_STW, INSN_STL, INSN_STQ};
+    int opc = opc_array[flags & 3];
+
+    if( disp != (int16_t)disp ) {
+        /* disp cannot be stored in insn directly */
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, disp);
+        tcg_out_fmt_opr(s, INSN_ADDQ, rb, TMP_REG1, TMP_REG1);
+        tcg_out_fmt_mem(s, opc, ra, TMP_REG1, 0);
+    } else {
+        tcg_out_fmt_mem(s, opc, ra, rb, disp);
+    }
+}
+
+static inline void tcg_out_op(TCGContext *s, \
+    int opc, const TCGArg *args, const int *const_args)
+{
+    int oc;
+    switch(opc)
+    {
+    case INDEX_op_exit_tb:
+        /*
+         * exit_tb t0, where t0 is always constant and should be returned to engine
+         * since we'll back to engine soon, $0 and $1 will never be used
+         */
+        tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_0, args[0]);
+        tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, (tcg_target_long)tb_ret_addr);
+        tcg_out_fmt_jmp(s, INSN_JMP, TCG_REG_31, TMP_REG1, 0);
+        break;
+
+    case INDEX_op_goto_tb:
+        /* goto_tb idx, where idx is constant 0 or 1, indicating the branch # */
+        if (s->tb_jmp_offset) {
+            /* we don't support direct jmp */
+            tcg_abort();
+        } else {
+            tcg_out_movi( s, TCG_TYPE_I64, TMP_REG1, (tcg_target_long)(s->tb_next + args[0]));
+            tcg_out_fmt_mem(s, INSN_LDQ, TMP_REG1, TMP_REG1, 0);
+            tcg_out_fmt_jmp(s, INSN_JMP, TCG_REG_31, TMP_REG1, 0);
+        }
+        s->tb_next_offset[args[0]] = s->code_ptr - s->code_buf;
+        break;
+
+    case INDEX_op_call:
+        if (const_args[0]) {
+            tcg_abort();
+        }
else { + //tcg_out_push( s, TCG_REG_26); + //tcg_out_push( s, TCG_REG_15); + tcg_out_mov( s, TCG_REG_27, args[0]); + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, args[0], 0); + //tcg_out_pop( s, TCG_REG_15); + //tcg_out_pop( s, TCG_REG_26); + } + break; + + case INDEX_op_jmp: + if (const_args[0]) { + tcg_abort(); + } else { + tcg_out_fmt_jmp(s, INSN_JMP, TCG_REG_31, args[0], 0); + } + break; + + case INDEX_op_br: + tcg_out_br(s, args[0]); + break; + + case INDEX_op_ld8u_i32: + case INDEX_op_ld8u_i64: + tgen_ldxx( s, args[0], args[1], args[2], 0); + break; + case INDEX_op_ld8s_i32: + case INDEX_op_ld8s_i64: + tgen_ldxx( s, args[0], args[1], args[2], 0|4); + break; + case INDEX_op_ld16u_i32: + case INDEX_op_ld16u_i64: + tgen_ldxx( s, args[0], args[1], args[2], 1); + break; + case INDEX_op_ld16s_i32: + case INDEX_op_ld16s_i64: + tgen_ldxx( s, args[0], args[1], args[2], 1|4); + break; + case INDEX_op_ld32u_i64: + tgen_ldxx( s, args[0], args[1], args[2], 2); + break; + case INDEX_op_ld_i32: + case INDEX_op_ld32s_i64: + tgen_ldxx( s, args[0], args[1], args[2], 2|4); + break; + case INDEX_op_ld_i64: + tgen_ldxx( s, args[0], args[1], args[2], 3); + break; + + case INDEX_op_st8_i32: + case INDEX_op_st8_i64: + tgen_stxx( s, args[0], args[1], args[2], 0); + break; + case INDEX_op_st16_i32: + case INDEX_op_st16_i64: + tgen_stxx( s, args[0], args[1], args[2], 1); + break; + case INDEX_op_st_i32: + case INDEX_op_st32_i64: + tgen_stxx( s, args[0], args[1], args[2], 2); + break; + case INDEX_op_st_i64: + tgen_stxx( s, args[0], args[1], args[2], 3); + break; + + case INDEX_op_add_i32: + case INDEX_op_add_i64: + oc = INSN_ADDQ; + goto gen_arith; + case INDEX_op_sub_i32: + case INDEX_op_sub_i64: + oc = INSN_SUBQ; + goto gen_arith; + case INDEX_op_mul_i32: + oc = INSN_MULL; + goto gen_arith; + case INDEX_op_mul_i64: + oc = INSN_MULQ; + goto gen_arith; + case INDEX_op_and_i32: + case INDEX_op_and_i64: + oc = INSN_AND; + goto gen_arith; + case INDEX_op_or_i32: + case INDEX_op_or_i64: + oc 
= INSN_BIS; + goto gen_arith; + case INDEX_op_xor_i32: + case INDEX_op_xor_i64: + oc = INSN_XOR; + goto gen_arith; + case INDEX_op_shl_i32: + case INDEX_op_shl_i64: + oc = INSN_SLL; + goto gen_arith; + case INDEX_op_shr_i32: + tcg_out_fmt_opi(s, INSN_ZAPNOT, args[1], 0x0f, args[1]); + case INDEX_op_shr_i64: + oc = INSN_SRL; + goto gen_arith; + case INDEX_op_sar_i32: + tcg_out_fmt_opr(s, INSN_ADDL, args[1], TCG_REG_31, args[1]); + case INDEX_op_sar_i64: + oc = INSN_SRA; + gen_arith: + if (const_args[2]) { + tcg_abort(); + } else { + tcg_out_fmt_opr(s, oc, args[1], args[2], args[0]); + } + break; + + case INDEX_op_brcond_i32: + tcg_out_fmt_opr(s, INSN_ADDL, args[0], TCG_REG_31, args[0]); + tcg_out_fmt_opr(s, INSN_ADDL, args[1], TCG_REG_31, args[1]); + tcg_out_brcond(s, args[2], args[0], args[1], const_args[1], args[3]); + break; + case INDEX_op_brcond_i64: + tcg_out_brcond(s, args[2], args[0], args[1], const_args[1], args[3]); + break; + + case INDEX_op_ext8s_i32: + case INDEX_op_ext8s_i64: + tcg_out_fmt_opr(s, INSN_SEXTB, TCG_REG_31, args[1], args[0]); + break; + case INDEX_op_ext16s_i32: + case INDEX_op_ext16s_i64: + tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, args[1], args[0]); + break; + case INDEX_op_ext32s_i64: + tcg_out_fmt_opr(s, INSN_ADDL, args[1], TCG_REG_31, args[0]); + break; + + case INDEX_op_qemu_ld8u: + tcg_out_qemu_ld(s, args, 0); + break; + case INDEX_op_qemu_ld8s: + tcg_out_qemu_ld(s, args, 0 | 4); + break; + case INDEX_op_qemu_ld16u: + tcg_out_qemu_ld(s, args, 1); + break; + case INDEX_op_qemu_ld16s: + tcg_out_qemu_ld(s, args, 1 | 4); + break; + case INDEX_op_qemu_ld32u: + tcg_out_qemu_ld(s, args, 2); + break; + case INDEX_op_qemu_ld32s: + tcg_out_qemu_ld(s, args, 2 | 4); + break; + case INDEX_op_qemu_ld64: + tcg_out_qemu_ld(s, args, 3); + break; + + case INDEX_op_qemu_st8: + tcg_out_qemu_st(s, args, 0); + break; + case INDEX_op_qemu_st16: + tcg_out_qemu_st(s, args, 1); + break; + case INDEX_op_qemu_st32: + tcg_out_qemu_st(s, args, 2); + break; + 
case INDEX_op_qemu_st64: + tcg_out_qemu_st(s, args, 3); + break; + + case INDEX_op_movi_i32: + case INDEX_op_movi_i64: + case INDEX_op_mov_i32: + case INDEX_op_mov_i64: + case INDEX_op_div2_i32: + case INDEX_op_divu2_i32: + default: + tcg_abort(); + } +} + +static const TCGTargetOpDef alpha_op_defs[] = { + { INDEX_op_exit_tb, { } }, + { INDEX_op_goto_tb, { } }, + { INDEX_op_call, { "r" } }, + { INDEX_op_jmp, { "r" } }, + { INDEX_op_br, { } }, + + { INDEX_op_mov_i32, { "r", "r" } }, + { INDEX_op_movi_i32, { "r" } }, + { INDEX_op_ld8u_i32, { "r", "r" } }, + { INDEX_op_ld8s_i32, { "r", "r" } }, + { INDEX_op_ld16u_i32, { "r", "r" } }, + { INDEX_op_ld16s_i32, { "r", "r" } }, + { INDEX_op_ld_i32, { "r", "r" } }, + { INDEX_op_st8_i32, { "r", "r" } }, + { INDEX_op_st16_i32, { "r", "r" } }, + { INDEX_op_st_i32, { "r", "r" } }, + + { INDEX_op_add_i32, { "r", "0", "r" } }, + { INDEX_op_mul_i32, { "r", "0", "r" } }, + //{ INDEX_op_div2_i32, { "a", "d", "0", "1", "r" } }, + //{ INDEX_op_divu2_i32, { "a", "d", "0", "1", "r" } }, + { INDEX_op_sub_i32, { "r", "0", "r" } }, + { INDEX_op_and_i32, { "r", "0", "r" } }, + { INDEX_op_or_i32, { "r", "0", "r" } }, + { INDEX_op_xor_i32, { "r", "0", "r" } }, + + { INDEX_op_shl_i32, { "r", "0", "r" } }, + { INDEX_op_shr_i32, { "r", "0", "r" } }, + { INDEX_op_sar_i32, { "r", "0", "r" } }, + + { INDEX_op_brcond_i32, { "r", "r" } }, + + { INDEX_op_mov_i64, { "r", "r" } }, + { INDEX_op_movi_i64, { "r" } }, + { INDEX_op_ld8u_i64, { "r", "r" } }, + { INDEX_op_ld8s_i64, { "r", "r" } }, + { INDEX_op_ld16u_i64, { "r", "r" } }, + { INDEX_op_ld16s_i64, { "r", "r" } }, + { INDEX_op_ld32u_i64, { "r", "r" } }, + { INDEX_op_ld32s_i64, { "r", "r" } }, + { INDEX_op_ld_i64, { "r", "r" } }, + { INDEX_op_st8_i64, { "r", "r" } }, + { INDEX_op_st16_i64, { "r", "r" } }, + { INDEX_op_st32_i64, { "r", "r" } }, + { INDEX_op_st_i64, { "r", "r" } }, + + { INDEX_op_add_i64, { "r", "0", "r" } }, + { INDEX_op_mul_i64, { "r", "0", "r" } }, + //{ INDEX_op_div2_i64, { "a", 
"d", "0", "1", "r" } }, + //{ INDEX_op_divu2_i64, { "a", "d", "0", "1", "r" } }, + { INDEX_op_sub_i64, { "r", "0", "r" } }, + { INDEX_op_and_i64, { "r", "0", "r" } }, + { INDEX_op_or_i64, { "r", "0", "r" } }, + { INDEX_op_xor_i64, { "r", "0", "r" } }, + + { INDEX_op_shl_i64, { "r", "0", "r" } }, + { INDEX_op_shr_i64, { "r", "0", "r" } }, + { INDEX_op_sar_i64, { "r", "0", "r" } }, + + { INDEX_op_brcond_i64, { "r", "r" } }, + + { INDEX_op_ext8s_i32, { "r", "r"} }, + { INDEX_op_ext16s_i32, { "r", "r"} }, + { INDEX_op_ext8s_i64, { "r", "r"} }, + { INDEX_op_ext16s_i64, { "r", "r"} }, + { INDEX_op_ext32s_i64, { "r", "r"} }, + + { INDEX_op_qemu_ld8u, { "r", "L" } }, + { INDEX_op_qemu_ld8s, { "r", "L" } }, + { INDEX_op_qemu_ld16u, { "r", "L" } }, + { INDEX_op_qemu_ld16s, { "r", "L" } }, + { INDEX_op_qemu_ld32u, { "r", "L" } }, + { INDEX_op_qemu_ld32s, { "r", "L" } }, + { INDEX_op_qemu_ld64, { "r", "L" } }, + + { INDEX_op_qemu_st8, { "L", "L" } }, + { INDEX_op_qemu_st16, { "L", "L" } }, + { INDEX_op_qemu_st32, { "L", "L" } }, + //{ INDEX_op_qemu_st64, { "L", "L", "L"} }, + { INDEX_op_qemu_st64, { "L", "L"} }, + { -1 }, +}; + + +static int tcg_target_callee_save_regs[] = { + TCG_REG_15, // used for the global env, so no need to save + TCG_REG_9, + TCG_REG_10, + TCG_REG_11, + TCG_REG_12, + TCG_REG_13, + TCG_REG_14 +}; + +/* + * Generate global QEMU prologue and epilogue code +*/ +void tcg_target_qemu_prologue(TCGContext *s) +{ + int i, frame_size, push_size, stack_addend; + + /* TB prologue */ + /*printf("TB prologue @ %lx\n", s->code_ptr);*/ + + /* save TCG_REG_26 */ + tcg_out_push(s, TCG_REG_26); + tcg_out_push(s, TCG_REG_27); + tcg_out_push(s, TCG_REG_28); + tcg_out_push(s, TCG_REG_29); + + /* save all callee saved registers */ + for(i = 0; i < ARRAY_SIZE(tcg_target_callee_save_regs); i++) { + tcg_out_push(s, tcg_target_callee_save_regs[i]); + } + + /* reserve some stack space */ + push_size = 8 + (4 + ARRAY_SIZE(tcg_target_callee_save_regs)) * 8; + frame_size = push_size 
+ 4*TCG_STATIC_CALL_ARGS_SIZE; + frame_size = (frame_size + TCG_TARGET_STACK_ALIGN - 1) & ~(TCG_TARGET_STACK_ALIGN - 1); + stack_addend = frame_size - push_size; + tcg_out_addi(s, TCG_REG_30, -stack_addend); + + tcg_out_fmt_jmp(s, INSN_JMP, TCG_REG_31, TCG_REG_16, 0); /* jmp $16 */ + + /* TB epilogue */ + tb_ret_addr = s->code_ptr; + tcg_out_addi(s, TCG_REG_30, stack_addend); + for(i = ARRAY_SIZE(tcg_target_callee_save_regs) - 1; i >= 0; i--) { + tcg_out_pop(s, tcg_target_callee_save_regs[i]); + } + + tcg_out_pop(s, TCG_REG_29); + tcg_out_pop(s, TCG_REG_28); + tcg_out_pop(s, TCG_REG_27); + tcg_out_pop(s, TCG_REG_26); + tcg_out_fmt_jmp(s, INSN_RET, TCG_REG_31, TCG_REG_26, 0); /* ret */ +} + + +void tcg_target_init(TCGContext *s) +{ + /* fail safe */ + if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) + tcg_abort(); + + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffffffff); + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffffffff); + tcg_regset_set32(tcg_target_call_clobber_regs, 0, + (1 << TCG_REG_1 ) | (1 << TCG_REG_2 ) | (1 << TCG_REG_3 ) | (1 << TCG_REG_4 ) | + (1 << TCG_REG_5 ) | (1 << TCG_REG_6 ) | (1 << TCG_REG_7 ) | (1 << TCG_REG_8 ) | + (1 << TCG_REG_22) | (1 << TCG_REG_23) | (1 << TCG_REG_24) | (1 << TCG_REG_25) | + (1 << TCG_REG_16) | (1 << TCG_REG_17) | (1 << TCG_REG_18) | (1 << TCG_REG_19) | + (1 << TCG_REG_20) | (1 << TCG_REG_21) | (1 << TCG_REG_0 )); + + //tcg_regset_set32( tcg_target_call_clobber_regs, 0, 0xffffffff); + + tcg_regset_clear(s->reserved_regs); + // $26~$31 not allocated by tcg.c + tcg_regset_set_reg(s->reserved_regs, TCG_REG_26); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_27); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_28); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_29); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_30); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_31); + // resved registers for tmp usage + tcg_regset_set_reg(s->reserved_regs, TMP_REG1); + 
tcg_regset_set_reg(s->reserved_regs, TMP_REG2); + tcg_regset_set_reg(s->reserved_regs, TMP_REG3); + + tcg_add_target_add_op_defs(alpha_op_defs); +} + diff --git a/tcg/alpha/tcg-target.h b/tcg/alpha/tcg-target.h new file mode 100644 index 0000000..79c57af --- /dev/null +++ b/tcg/alpha/tcg-target.h @@ -0,0 +1,70 @@ +/* + * Tiny Code Generator for QEMU + * + * Copyright (c) 2008 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ +#define TCG_TARGET_ALPHA 1 + +#define TCG_TARGET_REG_BITS 64 + +#define TCG_TARGET_NB_REGS 32 + +enum { + TCG_REG_0 = 0, TCG_REG_1, TCG_REG_2, TCG_REG_3, + TCG_REG_4, TCG_REG_5, TCG_REG_6, TCG_REG_7, + TCG_REG_8, TCG_REG_9, TCG_REG_10, TCG_REG_11, + TCG_REG_12, TCG_REG_13, TCG_REG_14, TCG_REG_15, + TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19, + TCG_REG_20, TCG_REG_21, TCG_REG_22, TCG_REG_23, + TCG_REG_24, TCG_REG_25, TCG_REG_26, TCG_REG_27, + TCG_REG_28, TCG_REG_29, TCG_REG_30, TCG_REG_31 +}; + +/* used for function call generation */ +#define TCG_REG_CALL_STACK TCG_REG_30 +#define TCG_TARGET_STACK_ALIGN 16 +#define TCG_TARGET_CALL_STACK_OFFSET 0 + +/* we have signed extension instructions */ +#define TCG_TARGET_HAS_ext8s_i32 +#define TCG_TARGET_HAS_ext16s_i32 +#define TCG_TARGET_HAS_ext8s_i64 +#define TCG_TARGET_HAS_ext16s_i64 +#define TCG_TARGET_HAS_ext32s_i64 + +/* Note: must be synced with dyngen-exec.h */ +#define TCG_AREG0 TCG_REG_15 +#define TCG_AREG1 TCG_REG_9 +#define TCG_AREG2 TCG_REG_10 +#define TCG_AREG3 TCG_REG_11 +#define TCG_AREG4 TCG_REG_12 +#define TCG_AREG5 TCG_REG_13 +#define TCG_AREG6 TCG_REG_14 + +#define TMP_REG1 TCG_REG_23 +#define TMP_REG2 TCG_REG_24 +#define TMP_REG3 TCG_REG_25 + +static inline void flush_icache_range(unsigned long start, unsigned long stop) +{ + __asm__ __volatile__ ("call_pal 0x86"); +} + -- 1.6.3.3
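
For readers following the patch, here is a minimal standalone sketch (not part of the patch) of two pieces of arithmetic it relies on: the test that decides whether a displacement can be encoded directly in a load/store instruction, and the rounding of the prologue frame size up to the stack alignment. The helper names fits_disp16 and align_up are hypothetical and exist only for this illustration. The range comparison is assumed to be equivalent to the patch's `disp != (int16_t)disp` cast test on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero when disp fits a signed 16-bit
 * displacement field. Written as a range check so that it does not
 * depend on implementation-defined narrowing conversions.
 */
static int fits_disp16(int64_t disp)
{
    return disp >= INT16_MIN && disp <= INT16_MAX;
}

/*
 * Hypothetical helper: rounds size up to the next multiple of align,
 * using the same mask expression as the prologue code in the patch.
 * align must be a power of two for the mask trick to be valid.
 */
static int64_t align_up(int64_t size, int64_t align)
{
    assert(align > 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

With these definitions, fits_disp16(32767) holds while fits_disp16(32768) does not, which is the case where the patch falls back to loading the displacement into TMP_REG1 first. Likewise align_up(100, 16) gives 112, and a size that is already a multiple of 16 is returned unchanged.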