From patchwork Tue Jan 26 01:19:13 2010
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 43671
Message-ID: <4B5E4311.10707@twiddle.net>
Date: Mon, 25 Jan 2010 17:19:13 -0800
From: Richard Henderson
To: identifier scorpio
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] Porting TCG to alpha platform
References: <242393.28161.qm@web15901.mail.cnb.yahoo.com>
In-Reply-To: <242393.28161.qm@web15901.mail.cnb.yahoo.com>

I've rearranged the code a tad, along the lines I had in mind.

I *think* I may have found the cause of some of your problems running
Windows.  Mainly, tcg_target_long was used in places that could be holding
HOST values, which means that host values would have been truncated to
32 bits.

Anyway, give this a try and let me know what happens.  All I could
actually try was a cross-compile of the code...


r~

diff --git a/cpu-common.h b/cpu-common.h
index 6302372..01f9980 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -3,7 +3,8 @@ /* CPU interfaces that are target indpendent.
 */
-#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__)
+#if defined(__arm__) || defined(__sparc__) || defined(__mips__) \
+    || defined(__hppa__) || defined(__alpha__)
 #define WORDS_ALIGNED
 #endif
diff --git a/exec-all.h b/exec-all.h
index 820b59e..3e03aef 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -114,7 +114,8 @@ static inline int tlb_set_page(CPUState *env1, target_ulong vaddr,
 #define CODE_GEN_AVG_BLOCK_SIZE 64
 #endif
-#if defined(_ARCH_PPC) || defined(__x86_64__) || defined(__arm__) || defined(__i386__)
+#if defined(_ARCH_PPC) || defined(__x86_64__) || defined(__arm__) \
+    || defined(__i386__) || defined(__alpha__)
 #define USE_DIRECT_JUMP
 #endif
@@ -189,6 +190,9 @@ extern int code_gen_max_blocks;
 #if defined(_ARCH_PPC)
 extern void ppc_tb_set_jmp_target(unsigned long jmp_addr, unsigned long addr);
 #define tb_set_jmp_target1 ppc_tb_set_jmp_target
+#elif defined(__alpha__)
+extern void alpha_tb_set_jmp_target(unsigned long, unsigned long);
+#define tb_set_jmp_target1 alpha_tb_set_jmp_target
 #elif defined(__i386__) || defined(__x86_64__)
 static inline void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr)
 {
diff --git a/tcg/alpha/tcg-target.c b/tcg/alpha/tcg-target.c
new file mode 100644
index 0000000..7e0e367
--- /dev/null
+++ b/tcg/alpha/tcg-target.c
@@ -0,0 +1,1398 @@
+/*
+ * Tiny Code Generator for QEMU on ALPHA platform.
+ */
+
+#ifndef NDEBUG
+static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
+    "$0", "$1", "$2", "$3", "$4", "$5", "$6", "$7",
+    "$8", "$9", "$10", "$11", "$12", "$13", "$14", "$15",
+    "$16", "$17", "$18", "$19", "$20", "$21", "$22", "$23",
+    "$24", "$25", "$26", "$27", "$28", "$29", "$30", "$31",
+};
+#endif
+
+/*
+ * $15 is the cpu_env register,
+ * $30 is the stack pointer,
+ * $31 is the zero register,
+ * $23, $28, $29 are reserved as temporaries.
+ */
+static const int tcg_target_reg_alloc_order[] = {
+    TCG_REG_9, TCG_REG_10, TCG_REG_11, TCG_REG_12, TCG_REG_13, TCG_REG_14,
+    TCG_REG_1, TCG_REG_2, TCG_REG_3, TCG_REG_4, TCG_REG_5, TCG_REG_6,
+    TCG_REG_7, TCG_REG_8, TCG_REG_22, TCG_REG_24, TCG_REG_25, TCG_REG_26,
+    TCG_REG_27, TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19, TCG_REG_20,
+    TCG_REG_21
+};
+
+/*
+ * According to the Alpha calling convention, these 6 registers are used
+ * for function parameter passing.  If a function has more than 6
+ * parameters, the remaining arguments are stored on the stack.
+ */
+static const int tcg_target_call_iarg_regs[6] = {
+    TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19, TCG_REG_20, TCG_REG_21
+};
+
+/*
+ * According to the Alpha calling convention, $0 is used for returning
+ * the function result.
+ */
+static const int tcg_target_call_oarg_regs[1] = { TCG_REG_0 };
+
+/*
+ * Save the address of TB's epilogue.
+ */ +#define TB_RET_OFS TCG_STATIC_CALL_ARGS_SIZE + +#define INSN_OP(x) (((x) & 0x3f) << 26) +#define INSN_FUNC1(x) (((x) & 0x3) << 14) +#define INSN_FUNC2(x) (((x) & 0x7f) << 5) +#define INSN_RA(x) (((x) & 0x1f) << 21) +#define INSN_RB(x) (((x) & 0x1f) << 16) +#define INSN_RC(x) ((x) & 0x1f) +#define INSN_LIT(x) (((x) & 0xff) << 13) +#define INSN_DISP16(x) ((x) & 0xffff) +#define INSN_DISP21(x) ((x) & 0x1fffff) +#define INSN_RSVED(x) ((x) & 0x3fff) + +#define INSN_ADDL (INSN_OP(0x10) | INSN_FUNC2(0x00)) +#define INSN_ADDQ (INSN_OP(0x10) | INSN_FUNC2(0x20)) +#define INSN_AND (INSN_OP(0x11) | INSN_FUNC2(0x00)) +#define INSN_BEQ INSN_OP(0x39) +#define INSN_BGE INSN_OP(0x3e) +#define INSN_BGT INSN_OP(0x3f) +#define INSN_BIC (INSN_OP(0x11) | INSN_FUNC2(0x08)) +#define INSN_BIS (INSN_OP(0x11) | INSN_FUNC2(0x20)) +#define INSN_BLE INSN_OP(0x3b) +#define INSN_BLT INSN_OP(0x3a) +#define INSN_BNE INSN_OP(0x3d) +#define INSN_BR INSN_OP(0x30) +#define INSN_CALL (INSN_OP(0x1a) | INSN_FUNC1(1)) +#define INSN_CMPEQ (INSN_OP(0x10) | INSN_FUNC2(0x2d)) +#define INSN_CMPLE (INSN_OP(0x10) | INSN_FUNC2(0x6d)) +#define INSN_CMPLT (INSN_OP(0x10) | INSN_FUNC2(0x4d)) +#define INSN_CMPULE (INSN_OP(0x10) | INSN_FUNC2(0x3d)) +#define INSN_CMPULT (INSN_OP(0x10) | INSN_FUNC2(0x1d)) +#define INSN_EQV (INSN_OP(0x11) | INSN_FUNC2(0x48)) +#define INSN_EXTBL (INSN_OP(0x12) | INSN_FUNC2(0x06)) +#define INSN_EXTWH (INSN_OP(0x12) | INSN_FUNC2(0x16)) +#define INSN_INSLH (INSN_OP(0x12) | INSN_FUNC2(0x67)) +#define INSN_INSWL (INSN_OP(0x12) | INSN_FUNC2(0x57)) +#define INSN_JMP (INSN_OP(0x1a) | INSN_FUNC1(0)) +#define INSN_LDA INSN_OP(0x8) +#define INSN_LDAH INSN_OP(0x9) +#define INSN_LDBU INSN_OP(0xa) +#define INSN_LDL INSN_OP(0x28) +#define INSN_LDQ INSN_OP(0x29) +#define INSN_LDWU INSN_OP(0xc) +#define INSN_MULL (INSN_OP(0x13) | INSN_FUNC2(0x00)) +#define INSN_MULQ (INSN_OP(0x13) | INSN_FUNC2(0x20)) +#define INSN_ORNOT (INSN_OP(0x11) | INSN_FUNC2(0x28)) +#define INSN_RET (INSN_OP(0x1a) | INSN_FUNC1(2)) +#define INSN_SEXTB (INSN_OP(0x1c) | INSN_FUNC2(0x00)) +#define INSN_SEXTW (INSN_OP(0x1c) | INSN_FUNC2(0x1)) +#define INSN_SLL (INSN_OP(0x12) | INSN_FUNC2(0x39)) +#define INSN_SRA (INSN_OP(0x12) | INSN_FUNC2(0x3c)) +#define INSN_SRL (INSN_OP(0x12) | INSN_FUNC2(0x34)) +#define INSN_STB INSN_OP(0xe) +#define INSN_STL INSN_OP(0x2c) +#define INSN_STQ INSN_OP(0x2d) +#define INSN_STW INSN_OP(0xd) +#define INSN_SUBL (INSN_OP(0x10) | INSN_FUNC2(0x09)) +#define INSN_SUBQ (INSN_OP(0x10) | INSN_FUNC2(0x29)) +#define INSN_XOR (INSN_OP(0x11) | INSN_FUNC2(0x40)) +#define INSN_ZAPNOT (INSN_OP(0x12) | INSN_FUNC2(0x31)) +#define INSN_BUGCHK (INSN_OP(0x00) | INSN_DISP16(0x81)) + +/* + * Return the # of regs used for parameter passing on procedure calling. + * note that alpha use $16~$21 to transfer the first 6 paramenters of a + * procedure. + */ +static inline int tcg_target_get_call_iarg_regs_count(int flags) +{ + return 6; +} + +/* + * Given a constraint, fill in the available register set or constant range. + */ +static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str) +{ + const char *ct_str = *pct_str; + + switch (ct_str[0]) { + case 'r': + /* Constaint 'r' means any register is okay. */ + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 0xffffffffu); + break; + + case 'a': + /* Constraint 'a' means $24, one of the division inputs. */ + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 1u << 24); + break; + + case 'b': + /* Constraint 'b' means $25, one of the division inputs. 
*/ + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 1u << 25); + break; + + case 'c': + /* Constraint 'c' means $27, the call procedure vector. */ + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 1u << 27); + break; + + case 'L': + /* Constraint for qemu_ld/st. The extra reserved registers are + used for passing the parameters to the helper function. */ + ct->ct |= TCG_CT_REG; + tcg_regset_set32(ct->u.regs, 0, 0xffffffffu); + tcg_regset_reset_reg(ct->u.regs, TCG_REG_16); + tcg_regset_reset_reg(ct->u.regs, TCG_REG_17); + tcg_regset_reset_reg(ct->u.regs, TCG_REG_18); + break; + + case 'I': + /* Constraint 'I' means an immediate 0 ... 255. */ + ct->ct |= TCG_CT_CONST_U8; + break; + + case 'J': + /* Constraint 'J' means the immediate 0. */ + ct->ct |= TCG_CT_CONST_ZERO; + break; + + default: + return -1; + } + + ct_str++; + *pct_str = ct_str; + return 0; +} + +static inline int tcg_target_const_match(tcg_target_long val, + const TCGArgConstraint *arg_ct) +{ + int ct = arg_ct->ct; + if (ct & TCG_CT_CONST) { + return 1; + } else if (ct & TCG_CT_CONST_U8) { + return val == (uint8_t)val; + } else if (ct & TCG_CT_CONST_ZERO) { + return val == 0; + } else { + return 0; + } +} + +static inline void tcg_out_fmt_br(TCGContext *s, int opc, int ra, int disp) +{ + tcg_out32(s, (opc) | INSN_RA(ra) | INSN_DISP21(disp)); +} + +static inline void tcg_out_fmt_mem(TCGContext *s, int opc, int ra, + int rb, int disp) +{ + if (disp != (int16_t)disp) { + tcg_abort(); + } + tcg_out32(s, (opc) | INSN_RA(ra) | INSN_RB(rb) | INSN_DISP16(disp)); +} + +static inline void tcg_out_fmt_jmp(TCGContext *s, int opc, int ra, + int rb, int rsved) +{ + tcg_out32(s, (opc) | INSN_RA(ra) | INSN_RB(rb) | INSN_RSVED(rsved)); +} + +static inline void tcg_out_fmt_opr(TCGContext *s, int opc, int ra, + int rb, int rc) +{ + tcg_out32(s, (opc) | INSN_RA(ra) | INSN_RB(rb) | INSN_RC(rc)); +} + +static inline void tcg_out_fmt_opi(TCGContext *s, int opc, int ra, + int lit, int rc) +{ + if (lit & ~0xff) { + tcg_abort(); + } + tcg_out32(s, (opc) | INSN_RA(ra) | INSN_LIT(lit) | INSN_RC(rc) | (1<<12)); +} + +/* + * mov from a reg to another + */ +static inline void tcg_out_mov(TCGContext *s, int rc, int rb) +{ + if (rb != rc) { + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, rb, rc); + } +} + +/* + * Helper function to emit a memory format operation with a displacement + * that may be larger than the 16 bits accepted by the real instruction. + */ +static void tcg_out_mem_long(TCGContext *s, int opc, int ra, int rb, long orig) +{ + long l0, l1, extra = 0, val = orig; + int rs; + + /* Pick a scratch register. Use the output register, if possible. 
*/ + switch (opc) { + default: + if (ra != rb) { + rs = ra; + break; + } + /* FALLTHRU */ + + case INSN_STB: + case INSN_STW: + case INSN_STL: + case INSN_STQ: + if (ra == TMP_REG1) { + tcg_abort(); + } + rs = TMP_REG1; + break; + } + + l0 = (int16_t)val; + val = (val - l0) >> 16; + l1 = (int16_t)val; + + if (orig >> 31 == -1 || orig >> 31 == 0) { + if (l1 < 0 && orig >= 0) { + extra = 0x4000; + l1 = (int16_t)(val - 0x4000); + } + } else { + long l2, l3; + int rh = TCG_REG_31; + + val = (val - l1) >> 16; + l2 = (int16_t)val; + val = (val - l2) >> 16; + l3 = (int16_t)val; + + if (l3) { + tcg_out_fmt_mem(s, INSN_LDAH, rs, rh, l3); + rh = rs; + } + if (l2) { + tcg_out_fmt_mem(s, INSN_LDA, rs, rh, l2); + rh = rs; + } + tcg_out_fmt_opi(s, INSN_SLL, rh, 32, rs); + + if (rb != TCG_REG_31) { + tcg_out_fmt_opr(s, INSN_ADDQ, rs, rb, rs); + } + rb = rs; + } + + if (l1) { + tcg_out_fmt_mem(s, INSN_LDAH, rs, rb, l1); + rb = rs; + } + if (extra) { + tcg_out_fmt_mem(s, INSN_LDAH, rs, rb, extra); + rb = rs; + } + + if (opc != INSN_LDA || rb != ra || l0 != 0) { + tcg_out_fmt_mem(s, opc, ra, rb, l0); + } +} + +static inline void tcg_out_movi(TCGContext *s, TCGType type, int ra, long val) +{ + if (type == TCG_TYPE_I32) { + val = (int32_t)val; + } + tcg_out_mem_long(s, INSN_LDA, ra, TCG_REG_31, val); +} + +static inline void tcg_out_ld(TCGContext *s, int type, + int ra, int rb, long disp) +{ + tcg_out_mem_long(s, type == TCG_TYPE_I32 ? INSN_LDL : INSN_LDQ, + ra, rb, disp); +} + +static inline void tcg_out_st(TCGContext *s, int type, + int ra, int rb, long disp) +{ + tcg_out_mem_long(s, type == TCG_TYPE_I32 ? INSN_STL : INSN_STQ, + ra, rb, disp); +} + +static inline void tcg_out_addi(TCGContext *s, int reg, long val) +{ + if (val != 0) { + tcg_out_mem_long(s, INSN_LDA, reg, reg, val); + } +} + +static void tcg_out_andi(TCGContext *s, int ra, long val, int rc) +{ + if (val == (uint8_t)val) { + tcg_out_fmt_opi(s, INSN_AND, ra, val, rc); + } else if (~val == (uint8_t)~val) { + tcg_out_fmt_opi(s, INSN_BIC, ra, ~val, rc); + } else { + long mask0, maskff; + + /* Check and see if the value matches a ZAPNOT mask. This is fairly + common. Since we know this is an alpha host, speed the check by + using cmpbge to compare 8 bytes at once, and incidentally also + produce the zapnot mask. */ + /* ??? This builtin was implemented sometime in 2002, + perhaps in the GCC 3.1 timeframe. */ + mask0 = __builtin_alpha_cmpbge(0, val); + maskff = __builtin_alpha_cmpbge(val, -1); + + /* Here, mask0 contains the bytes that are 0, maskff contains + the bytes that are 0xff; that should cover the entire word. */ + if ((mask0 | maskff) == 0xff) { + tcg_out_fmt_opi(s, INSN_ZAPNOT, ra, maskff, rc); + } else { + /* Val contains bytes that are neither 0 nor 0xff, which + means that we cannot use zapnot. Load the constant. 
*/ + tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, val); + tcg_out_fmt_opr(s, INSN_AND, ra, TMP_REG1, rc); + } + } +} + +static void tcg_out_extend(TCGContext *s, int sizeop, int ra, int rc) +{ + switch (sizeop) { + case 0: + tcg_out_fmt_opi(s, INSN_AND, ra, 0xff, rc); + break; + case 0 | 4: + tcg_out_fmt_opr(s, INSN_SEXTB, TCG_REG_31, ra, rc); + break; + case 1: + tcg_out_fmt_opi(s, INSN_ZAPNOT, ra, 0x03, rc); + break; + case 1 | 4: + tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, ra, rc); + break; + case 2: + tcg_out_fmt_opi(s, INSN_ZAPNOT, ra, 0x0f, rc); + break; + case 2 | 4: + tcg_out_fmt_opr(s, INSN_ADDL, TCG_REG_31, ra, rc); + break; + case 3: + tcg_out_mov(s, ra, rc); + break; + default: + tcg_abort(); + } +} + +static void tcg_out_bswap(TCGContext *s, int sizeop, int ra, int rc) +{ + const int t0 = TMP_REG1, t1 = TMP_REG2, t2 = TMP_REG3; + + switch (sizeop) { + case 1: /* 16-bit swap, unsigned result */ + case 1 | 4: /* 16-bit swap, signed result */ + /* input value = xxxx xxAB */ + tcg_out_fmt_opi(s, INSN_EXTWH, ra, 7, t0); /* .... ..B. */ + tcg_out_fmt_opi(s, INSN_EXTBL, ra, 1, rc); /* .... ...A */ + tcg_out_fmt_opr(s, INSN_BIS, rc, t0, rc); /* .... ..BA */ + if (sizeop & 4) { + tcg_out_fmt_opr(s, INSN_SEXTW, TCG_REG_31, rc, rc); + } + break; + + case 2: /* 32-bit swap, unsigned result */ + case 2 | 4: /* 32-bit swap, signed result */ + /* input value = xxxx ABCD */ + tcg_out_fmt_opi(s, INSN_INSLH, ra, 7, t0); /* .... .ABC */ + tcg_out_fmt_opi(s, INSN_INSWL, ra, 3, t1); /* ...C D... */ + tcg_out_fmt_opr(s, INSN_BIS, t0, t1, t1); /* ...C DABC */ + tcg_out_fmt_opi(s, INSN_SRL, t1, 16, t0); /* .... .CDA */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t1, 0x0A, t1); /* .... D.B. */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t1, 0x05, t1); /* .... .C.A */ + tcg_out_fmt_opr(s, (sizeop & 4 ? INSN_ADDL : INSN_BIS), t0, t1, rc); + break; + + case 3: /* 64-bit swap */ + /* input value = ABCD EFGH */ + tcg_out_fmt_opi(s, INSN_SRL, ra, 24, t0); /* ...A BCDE */ + tcg_out_fmt_opi(s, INSN_SLL, ra, 24, t1); /* DEFG H... */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t0, 0x11, t0); /* ...A ...E */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t1, 0x88, t1); /* D... H... */ + tcg_out_fmt_opr(s, INSN_BIS, t0, t1, t2); /* D..A H..E */ + tcg_out_fmt_opi(s, INSN_SRL, ra, 8, t0); /* .ABC DEFG */ + tcg_out_fmt_opi(s, INSN_SLL, ra, 8, t1); /* BCDE FGH. */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t0, 0x22, t0); /* ..B. ..F. */ + tcg_out_fmt_opi(s, INSN_ZAPNOT, t1, 0x44, t1); /* .C.. .G.. */ + tcg_out_fmt_opr(s, INSN_BIS, t0, t2, t2); /* D.BA H.FE */ + tcg_out_fmt_opr(s, INSN_BIS, t1, t2, t2); /* DCBA HGFE */ + tcg_out_fmt_opi(s, INSN_SRL, t2, 32, t0); /* .... DCBA */ + tcg_out_fmt_opi(s, INSN_SLL, t2, 32, t1); /* HGFE .... 
*/ + tcg_out_fmt_opr(s, INSN_BIS, t0, t1, rc); /* HGFE DCBA */ + break; + + default: + tcg_abort(); + } +} + +static void tcg_out_ld_sz(TCGContext *s, int sizeop, int ra, int rb, + int64_t disp) +{ + static const int ld_opc[4] = { INSN_LDBU, INSN_LDWU, INSN_LDL, INSN_LDQ }; + + tcg_out_mem_long(s, ld_opc[sizeop & 3], ra, rb, disp); + + switch (sizeop) { + case 0 | 4 | 8: + case 0 | 4: + case 1 | 4: + case 2: + tcg_out_extend(s, sizeop, ra, ra); + break; + + case 0: + case 0 | 8: + case 1: + case 2 | 4: + case 3: + break; + + case 1 | 8: + case 1 | 4 | 8: + case 2 | 8: + case 2 | 4 | 8: + case 3 | 8: + tcg_out_bswap(s, sizeop & 7, ra, ra); + break; + + default: + tcg_abort(); + } +} + +static void tcg_out_st_sz(TCGContext *s, int sizeop, int ra, int rb, + int64_t disp) +{ + static const int st_opc[4] = { INSN_STB, INSN_STW, INSN_STL, INSN_STQ }; + + if ((sizeop & 8) && (sizeop & 3) > 0) { + tcg_out_bswap(s, sizeop & 3, ra, TMP_REG1); + ra = TMP_REG1; + } + + tcg_out_mem_long(s, st_opc[sizeop & 3], ra, rb, disp); +} + +static void patch_reloc(uint8_t *x_ptr, int type, int64_t value, int64_t add) +{ + uint32_t *code_ptr = (uint32_t *)x_ptr; + uint32_t insn = *code_ptr; + + value += add; + switch (type) { + case R_ALPHA_BRADDR: + value -= (long)x_ptr + 4; + if ((value & 3) || value < -0x400000 || value >= 0x400000) { + tcg_abort(); + } + *code_ptr = insn | INSN_DISP21(value >> 2); + break; + + default: + tcg_abort(); + } +} + +static void tcg_out_br(TCGContext *s, int opc, int ra, int label_index) +{ + TCGLabel *l = &s->labels[label_index]; + long value; + + if (l->has_value) { + value = l->u.value; + value -= (long)s->code_ptr + 4; + if ((value & 3) || value < -0x400000 || value >= 0x400000) { + tcg_abort(); + } + value >>= 2; + } else { + tcg_out_reloc(s, s->code_ptr, R_ALPHA_BRADDR, label_index, 0); + value = 0; + } + tcg_out_fmt_br(s, opc, ra, value); +} + +static void tcg_out_brcond(TCGContext *s, int cond, TCGArg arg1, + TCGArg arg2, int const_arg2, int label_index) +{ + static const int br_opc[10] = { + [TCG_COND_EQ] = INSN_BEQ, + [TCG_COND_NE] = INSN_BNE, + [TCG_COND_LT] = INSN_BLT, + [TCG_COND_GE] = INSN_BGE, + [TCG_COND_LE] = INSN_BLE, + [TCG_COND_GT] = INSN_BGT + }; + + static const uint64_t cmp_opc[10] = { + [TCG_COND_EQ] = INSN_CMPEQ, + [TCG_COND_NE] = INSN_CMPEQ, + [TCG_COND_LT] = INSN_CMPLT, + [TCG_COND_GE] = INSN_CMPLT, + [TCG_COND_LE] = INSN_CMPLE, + [TCG_COND_GT] = INSN_CMPLE, + [TCG_COND_LTU] = INSN_CMPULT, + [TCG_COND_GEU] = INSN_CMPULT, + [TCG_COND_LEU] = INSN_CMPULE, + [TCG_COND_GTU] = INSN_CMPULE + }; + + int opc = 0; + + if (const_arg2) { + if (arg2 == 0) { + opc = br_opc[cond]; + } else if (cond == TCG_COND_EQ || cond == TCG_COND_NE) { + tcg_out_mem_long(s, INSN_LDA, TMP_REG1, arg1, -arg2); + opc = (cond == TCG_COND_EQ ? INSN_BEQ : INSN_BNE); + } + } + + if (opc == 0) { + opc = cmp_opc[cond]; + if (const_arg2) { + tcg_out_fmt_opi(s, opc, arg1, arg2, TMP_REG1); + } else { + tcg_out_fmt_opr(s, opc, arg1, arg2, TMP_REG1); + } + opc = (cond & 1) ? INSN_BEQ : INSN_BNE; + arg1 = TMP_REG1; + } + + tcg_out_br(s, opc, arg1, label_index); +} + +static void tcg_out_div(TCGContext *s, int sizeop) +{ + /* Note that these functions don't have normal C calling conventions. 
*/ + typedef long divfn(long, long); + extern divfn __divl, __divlu, __reml, __remlu; + extern divfn __divq, __divqu, __remq, __remqu; + + static divfn * const libc_div[16] = { + [2] = __divlu, + [2 | 8] = __remlu, + [2 | 4] = __divl, + [2 | 4 | 8] = __reml, + + [3] = __divqu, + [3 | 8] = __remqu, + [3 | 4] = __divq, + [3 | 4 | 8] = __remq, + }; + + long val; + +#ifndef __linux__ + /* ??? A comment in GCC suggests that OSF/1 and WinNT both have a + bug such that even 32-bit inputs must be extended to 64-bits. + This is known to work properly on Linux, at least. */ + if ((sizeop & 3) != 3) { + tcg_out_extend(s, sizeop, TCG_REG_24, TCG_REG_24); + tcg_out_extend(s, sizeop, TCG_REG_25, TCG_REG_25); + } +#endif + + val = (long) libc_div[sizeop]; + if (val == 0) { + tcg_abort(); + } + + tcg_out_movi(s, TCG_TYPE_I64, TMP_REG1, val); + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_23, TMP_REG1, val); +} + +#if defined(CONFIG_SOFTMMU) + +#include "../../softmmu_defs.h" + +static void *qemu_ld_helpers[4] = { + __ldb_mmu, + __ldw_mmu, + __ldl_mmu, + __ldq_mmu, +}; + +static void *qemu_st_helpers[4] = { + __stb_mmu, + __stw_mmu, + __stl_mmu, + __stq_mmu, +}; + +static void tcg_out_tlb_cmp(TCGContext *s, int sizeop, int r0, int r1, + int addr_reg, int label1, long tlb_offset) +{ + long val; + + /* Mask the page, plus the low bits of the access, into R0. Note + that the low bits are added in order to catch unaligned accesses, + as those bits won't be set in the TLB entry. For 32-bit targets, + force the high bits of the mask to be zero, as the high bits of + the input register are garbage. */ + val = TARGET_PAGE_MASK | ((1 << (sizeop & 3)) - 1); + if (TARGET_LONG_BITS == 32) { + val &= 0xffffffffu; + } + tcg_out_andi(s, addr_reg, val, r0); + + /* Compute the index into the TLB into R1. Again, note that the + high bits of a 32-bit address must be cleared. */ + tcg_out_fmt_opi(s, INSN_SRL, addr_reg, + TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS, r1); + + val = (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS; + if (TARGET_LONG_BITS == 32) { + val &= 0xffffffffu >> (TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS); + } + tcg_out_andi(s, r1, val, r1); + + /* Load the word at (R1 + CPU_ENV + TLB_OFFSET). Note that we + arrange for a 32-bit load to be zero-extended. */ + tcg_out_fmt_opr(s, INSN_ADDQ, r1, TCG_AREG0, r1); + tcg_out_ld_sz(s, TARGET_LONG_BITS == 32 ? 2 : 3, + TMP_REG2, r1, tlb_offset); + + /* Compare R0 with the value loaded from the TLB. */ + tcg_out_brcond(s, TCG_COND_NE, TMP_REG2, r0, 0, label1); +} +#endif /* SOFTMMU */ + +static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int sizeop) +{ + int addr_reg, data_reg, r0, r1, mem_index, bswap; + long val; +#if defined(CONFIG_SOFTMMU) + int label1, label2; +#endif + + data_reg = *args++; + addr_reg = *args++; + mem_index = *args; + + r0 = TCG_REG_16; + r1 = TCG_REG_17; + +#if defined(CONFIG_SOFTMMU) + label1 = gen_new_label(); + label2 = gen_new_label(); + + tcg_out_tlb_cmp(s, sizeop, r0, r1, addr_reg, label1, + offsetof(CPUState, tlb_table[mem_index][0].addr_read)); + + /* TLB Hit. Note that Alpha statically predicts forward branch as + not taken, so arrange the fallthru as the common case. + + ADDR_REG contains the guest address, and R1 contains the pointer + to TLB_ENTRY.ADDR_READ. We need to load TLB_ENTRY.ADDEND and + add it to ADDR_REG to get the host address. 
*/ + + tcg_out_ld(s, TCG_TYPE_I64, r1, r1, + offsetof(CPUTLBEntry, addend) + - offsetof(CPUTLBEntry, addr_read)); + + if (TARGET_LONG_BITS == 32) { + tcg_out_extend(s, 2, addr_reg, r0); + addr_reg = r0; + } + + tcg_out_fmt_opr(s, INSN_ADDQ, addr_reg, r1, r0); + val = 0; +#else + r0 = addr_reg; + val = GUEST_BASE; +#endif + +#if defined(TARGET_WORDS_BIGENDIAN) + /* Signal byte swap necessary. */ + bswap = 8; +#else + bswap = 0; +#endif + + /* Perform the actual load. */ + tcg_out_ld_sz(s, sizeop | bswap, data_reg, r0, val); + +#if defined(CONFIG_SOFTMMU) + tcg_out_br(s, INSN_BR, TCG_REG_31, label2); + + /* TLB miss. Call the helper function. */ + tcg_out_label(s, label1, (long)s->code_ptr); + tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_17, mem_index); + + val = (long)qemu_ld_helpers[sizeop & 3]; + tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_27, val); + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, TCG_REG_27, val); + + /* The helper routines have no defined data extension. + Properly extend the result to whatever data type we need. */ + tcg_out_extend(s, sizeop, TCG_REG_0, data_reg); + + tcg_out_label(s, label2, (long)s->code_ptr); +#endif +} + +static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int sizeop) +{ + int addr_reg, data_reg, r0, r1, mem_index, bswap; + long val; +#if defined(CONFIG_SOFTMMU) + int label1, label2; +#endif + + data_reg = *args++; + addr_reg = *args++; + mem_index = *args; + + r0 = TCG_REG_16; + r1 = TCG_REG_17; + +#if defined(CONFIG_SOFTMMU) + label1 = gen_new_label(); + label2 = gen_new_label(); + + tcg_out_tlb_cmp(s, sizeop, r0, r1, addr_reg, label1, + offsetof(CPUState, tlb_table[mem_index][0].addr_write)); + + /* TLB Hit. Note that Alpha statically predicts forward branch as + not taken, so arrange the fallthru as the common case. + + ADDR_REG contains the guest address, and R1 contains the pointer + to TLB_ENTRY.ADDR_READ. We need to load TLB_ENTRY.ADDEND and + add it to ADDR_REG to get the host address. */ + + tcg_out_ld(s, TCG_TYPE_I64, r1, r1, + offsetof(CPUTLBEntry, addend) + - offsetof(CPUTLBEntry, addr_write)); + + if (TARGET_LONG_BITS == 32) { + tcg_out_extend(s, 2, addr_reg, r0); + addr_reg = r0; + } + + tcg_out_fmt_opr(s, INSN_ADDQ, addr_reg, r1, r0); + val = 0; +#else + r0 = addr_reg; + val = GUEST_BASE; +#endif + +#if defined(TARGET_WORDS_BIGENDIAN) + /* Signal byte swap necessary. */ + bswap = 8; +#else + bswap = 0; +#endif + + /* Perform the actual load. */ + tcg_out_st_sz(s, sizeop | bswap, data_reg, r0, val); + +#if defined(CONFIG_SOFTMMU) + tcg_out_br(s, INSN_BR, TCG_REG_31, label2); + + /* TLB miss. Call the helper function. */ + tcg_out_label(s, label1, (long)s->code_ptr); + tcg_out_mov(s, TCG_REG_17, data_reg); + tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_18, mem_index); + + val = (long)qemu_st_helpers[sizeop & 3]; + tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_27, val); + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, TCG_REG_27, val); + + tcg_out_label(s, label2, (long)s->code_ptr); +#endif +} + +static inline void tcg_out_op(TCGContext *s, int opc, const TCGArg *args, + const int *const_args) +{ + long arg0, arg1, arg2; + int c; + + arg0 = args[0]; + arg1 = args[1]; + arg2 = args[2]; + + switch (opc) { + case INDEX_op_exit_tb: + tcg_out_ld(s, TCG_TYPE_I64, TMP_REG1, TCG_REG_30, TB_RET_OFS); + tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_0, arg0); + tcg_out_fmt_jmp(s, INSN_RET, TCG_REG_31, TMP_REG1, 0); + break; + + case INDEX_op_goto_tb: + if (s->tb_jmp_offset) { + /* Direct jump method. */ + /* Reserve 6 insns. 
In alpha_tb_set_jmp_target, if the + displacement happens to be in range (and I suspect it will + be much of the time) then install a direct branch. Otherwise + we can load the 64-bit constant into the first 5 insns and do + an indirect jump with the 6th. */ + s->tb_jmp_offset[arg0] = s->code_ptr - s->code_buf; + + /* Fill with BUGCHK plus 5 nops, to be sure it's filled in. */ + tcg_out32(s, INSN_BUGCHK); + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, TCG_REG_31, TCG_REG_31); + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, TCG_REG_31, TCG_REG_31); + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, TCG_REG_31, TCG_REG_31); + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, TCG_REG_31, TCG_REG_31); + tcg_out_fmt_opr(s, INSN_BIS, TCG_REG_31, TCG_REG_31, TCG_REG_31); + } else { + /* Indirect jump method. */ + tcg_abort(); + } + s->tb_next_offset[arg0] = s->code_ptr - s->code_buf; + break; + + case INDEX_op_call: + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_26, TCG_REG_27, 0); + break; + + case INDEX_op_jmp: + tcg_out_fmt_jmp(s, INSN_JMP, TCG_REG_31, arg0, 0); + break; + + case INDEX_op_br: + tcg_out_br(s, INSN_BR, TCG_REG_31, arg0); + break; + + case INDEX_op_ld8u_i32: + case INDEX_op_ld8u_i64: + c = 0; + goto do_load; + case INDEX_op_ld8s_i32: + case INDEX_op_ld8s_i64: + c = 0 | 4; + goto do_load; + case INDEX_op_ld16u_i32: + case INDEX_op_ld16u_i64: + c = 1; + goto do_load; + case INDEX_op_ld16s_i32: + case INDEX_op_ld16s_i64: + c = 1 | 4; + goto do_load; + case INDEX_op_ld32u_i64: + c = 2; + goto do_load; + case INDEX_op_ld_i32: + case INDEX_op_ld32s_i64: + c = 2 | 4; + goto do_load; + case INDEX_op_ld_i64: + c = 3; + do_load: + tcg_out_ld_sz(s, c, arg0, arg1, arg2); + break; + + case INDEX_op_st8_i32: + case INDEX_op_st8_i64: + c = 0; + goto do_store; + case INDEX_op_st16_i32: + case INDEX_op_st16_i64: + c = 1; + goto do_store; + case INDEX_op_st_i32: + case INDEX_op_st32_i64: + c = 2; + goto do_store; + case INDEX_op_st_i64: + c = 3; + do_store: + tcg_out_st_sz(s, c, arg0, arg1, arg2); + break; + + case INDEX_op_sub_i32: + if (const_args[2]) { + arg2 = -args[2]; + } else { + c = INSN_SUBL; + goto do_arith; + } + /* FALLTHRU */ + + case INDEX_op_add_i32: + /* Note there is no guarantee of a sign-extended result. 
+        */
+        if (const_args[2]) {
+            if (const_args[1]) {
+                arg1 = TCG_REG_31;
+            }
+            tcg_out_mem_long(s, INSN_LDA, arg0, arg1, (int32_t)arg2);
+        } else {
+            c = INSN_ADDL;
+            goto do_arith;
+        }
+        break;
+
+    case INDEX_op_sub_i64:
+        if (const_args[2]) {
+            arg2 = -args[2];
+        } else {
+            c = INSN_SUBQ;
+            goto do_arith;
+        }
+        /* FALLTHRU */
+
+    case INDEX_op_add_i64:
+        if (const_args[2]) {
+            if (const_args[1]) {
+                arg1 = TCG_REG_31;
+            }
+            tcg_out_mem_long(s, INSN_LDA, arg0, arg1, arg2);
+        } else {
+            c = INSN_ADDQ;
+            goto do_arith;
+        }
+        break;
+
+    case INDEX_op_mul_i32:
+        c = INSN_MULL;
+        goto do_arith;
+
+    case INDEX_op_mul_i64:
+        c = INSN_MULQ;
+        goto do_arith;
+
+    case INDEX_op_and_i32:
+    case INDEX_op_and_i64:
+        if (const_args[2]) {
+            if (const_args[1]) {
+                arg1 = TCG_REG_31;
+            }
+            if (opc == INDEX_op_and_i32) {
+                arg2 = (uint32_t)arg2;
+            }
+            tcg_out_andi(s, arg1, arg2, arg0);
+            break;
+        }
+        c = INSN_AND;
+        goto do_arith;
+
+    case INDEX_op_or_i32:
+    case INDEX_op_or_i64:
+        c = INSN_BIS;
+        goto do_arith;
+
+    case INDEX_op_xor_i32:
+    case INDEX_op_xor_i64:
+        c = INSN_XOR;
+        goto do_arith;
+
+    case INDEX_op_shl_i32:
+    case INDEX_op_shl_i64:
+        c = INSN_SLL;
+        goto do_arith;
+
+    case INDEX_op_shr_i32:
+        if (!const_args[1]) {
+            tcg_out_extend(s, 2, arg1, arg1);
+        }
+        /* FALLTHRU */
+    case INDEX_op_shr_i64:
+        c = INSN_SRL;
+        goto do_arith;
+
+    case INDEX_op_sar_i32:
+        if (!const_args[1]) {
+            tcg_out_extend(s, 2 | 4, arg1, arg1);
+        }
+        /* FALLTHRU */
+    case INDEX_op_sar_i64:
+        c = INSN_SRA;
+        goto do_arith;
+
+    do_arith:
+        if (const_args[1]) {
+            arg1 = TCG_REG_31;
+        }
+        if (const_args[2]) {
+            tcg_out_fmt_opi(s, c, arg1, arg2, arg0);
+        } else {
+            tcg_out_fmt_opr(s, c, arg1, arg2, arg0);
+        }
+        break;
+
+    case INDEX_op_not_i32:
+    case INDEX_op_not_i64:
+        if (const_args[1]) {
+            tcg_out_fmt_opi(s, INSN_ORNOT, TCG_REG_31, arg1, arg0);
+        } else {
+            tcg_out_fmt_opr(s, INSN_ORNOT, TCG_REG_31, arg1, arg0);
+        }
+        /* The ORNOT above is the complete operation; do not fall into
+           do_arith, which would emit a second, bogus instruction.  */
+        break;
+
+    case INDEX_op_brcond_i32:
+        /* 32-bit operands have undefined high bits.  Sign extend both
+           input operands to correct that.  Note that this extension
+           works even for unsigned comparisons, so long as both operands
+           are similarly extended.
*/ + if (!const_args[0]) { + tcg_out_extend(s, 2 | 4, arg0, arg0); + } + if (!const_args[1]) { + tcg_out_extend(s, 2 | 4, arg1, arg1); + } + /* FALLTHRU */ + case INDEX_op_brcond_i64: + if (const_args[0]) { + arg0 = TCG_REG_31; + } + tcg_out_brcond(s, arg2, arg0, arg1, const_args[1], args[3]); + break; + + case INDEX_op_ext8s_i32: + case INDEX_op_ext8s_i64: + c = 0 | 4; + goto do_sign_extend; + case INDEX_op_ext16s_i32: + case INDEX_op_ext16s_i64: + c = 1 | 4; + goto do_sign_extend; + case INDEX_op_ext32s_i64: + c = 2 | 4; + do_sign_extend: + if (const_args[1]) { + arg1 = TCG_REG_31; + } + tcg_out_extend(s, c, arg1, arg0); + break; + + case INDEX_op_div_i32: + c = 2 | 4; + goto do_div; + case INDEX_op_rem_i32: + c = 2 | 4 | 8; + goto do_div; + case INDEX_op_divu_i32: + c = 2; + goto do_div; + case INDEX_op_remu_i32: + c = 2 | 8; + goto do_div; + case INDEX_op_div_i64: + c = 3 | 4; + goto do_div; + case INDEX_op_rem_i64: + c = 3 | 4 | 8; + goto do_div; + case INDEX_op_divu_i64: + c = 3; + goto do_div; + case INDEX_op_remu_i64: + c = 3 | 8; + do_div: + tcg_out_div(s, c); + break; + + case INDEX_op_bswap16_i32: + case INDEX_op_bswap16_i64: + c = 1; + goto do_bswap; + case INDEX_op_bswap32_i32: + case INDEX_op_bswap32_i64: + c = 2; + goto do_bswap; + case INDEX_op_bswap64_i64: + c = 3; + do_bswap: + if (const_args[1]) { + arg1 = TCG_REG_31; + } + tcg_out_bswap(s, c, arg1, arg0); + break; + + case INDEX_op_qemu_ld8u: + c = 0; + goto do_qemu_load; + case INDEX_op_qemu_ld8s: + c = 0 | 4; + goto do_qemu_load; + case INDEX_op_qemu_ld16u: + c = 1; + goto do_qemu_load; + case INDEX_op_qemu_ld16s: + c = 1 | 4; + goto do_qemu_load; + case INDEX_op_qemu_ld32u: + c = 2; + goto do_qemu_load; + case INDEX_op_qemu_ld32s: + c = 2 | 4; + goto do_qemu_load; + case INDEX_op_qemu_ld64: + c = 3; + do_qemu_load: + tcg_out_qemu_ld(s, args, c); + break; + + case INDEX_op_qemu_st8: + c = 0; + goto do_qemu_store; + case INDEX_op_qemu_st16: + c = 1; + goto do_qemu_store; + case INDEX_op_qemu_st32: + c = 2; + goto do_qemu_store; + case INDEX_op_qemu_st64: + c = 3; + do_qemu_store: + tcg_out_qemu_st(s, args, c); + break; + + case INDEX_op_mov_i32: + case INDEX_op_mov_i64: + case INDEX_op_movi_i32: + case INDEX_op_movi_i64: + /* These four are handled by tcg.c directly. 
*/ + default: + tcg_abort(); + } +} + +static const TCGTargetOpDef alpha_op_defs[] = { + { INDEX_op_exit_tb, { } }, + { INDEX_op_goto_tb, { } }, + { INDEX_op_call, { "c" } }, + { INDEX_op_jmp, { "r" } }, + { INDEX_op_br, { } }, + + { INDEX_op_ld8u_i32, { "r", "r" } }, + { INDEX_op_ld8s_i32, { "r", "r" } }, + { INDEX_op_ld16u_i32, { "r", "r" } }, + { INDEX_op_ld16s_i32, { "r", "r" } }, + { INDEX_op_ld_i32, { "r", "r" } }, + { INDEX_op_st8_i32, { "r", "r" } }, + { INDEX_op_st16_i32, { "r", "r" } }, + { INDEX_op_st_i32, { "r", "r" } }, + + { INDEX_op_add_i32, { "r", "rJ", "ri" } }, + { INDEX_op_mul_i32, { "r", "rJ", "rI" } }, + { INDEX_op_sub_i32, { "r", "rJ", "ri" } }, + { INDEX_op_and_i32, { "r", "rJ", "ri" } }, + { INDEX_op_or_i32, { "r", "rJ", "rI" } }, + { INDEX_op_xor_i32, { "r", "rJ", "rI" } }, + { INDEX_op_not_i32, { "r", "rI" } }, + + { INDEX_op_shl_i32, { "r", "rJ", "rI" } }, + { INDEX_op_shr_i32, { "r", "rJ", "rI" } }, + { INDEX_op_sar_i32, { "r", "rJ", "rI" } }, + + { INDEX_op_div_i32, { "c", "a", "b" } }, + { INDEX_op_rem_i32, { "c", "a", "b" } }, + { INDEX_op_divu_i32, { "c", "a", "b" } }, + { INDEX_op_remu_i32, { "c", "a", "b" } }, + + { INDEX_op_brcond_i32, { "rJ", "rI" } }, + + { INDEX_op_ld8u_i64, { "r", "r" } }, + { INDEX_op_ld8s_i64, { "r", "r" } }, + { INDEX_op_ld16u_i64, { "r", "r" } }, + { INDEX_op_ld16s_i64, { "r", "r" } }, + { INDEX_op_ld32u_i64, { "r", "r" } }, + { INDEX_op_ld32s_i64, { "r", "r" } }, + { INDEX_op_ld_i64, { "r", "r" } }, + { INDEX_op_st8_i64, { "r", "r" } }, + { INDEX_op_st16_i64, { "r", "r" } }, + { INDEX_op_st32_i64, { "r", "r" } }, + { INDEX_op_st_i64, { "r", "r" } }, + + { INDEX_op_add_i64, { "r", "rJ", "ri" } }, + { INDEX_op_mul_i64, { "r", "rJ", "rI" } }, + { INDEX_op_sub_i64, { "r", "rJ", "ri" } }, + { INDEX_op_and_i64, { "r", "rJ", "ri" } }, + { INDEX_op_or_i64, { "r", "rJ", "rI" } }, + { INDEX_op_xor_i64, { "r", "rJ", "rI" } }, + { INDEX_op_not_i64, { "r", "rI" } }, + + { INDEX_op_shl_i64, { "r", "rJ", "rI" } }, + { INDEX_op_shr_i64, { "r", "rJ", "rI" } }, + { INDEX_op_sar_i64, { "r", "rJ", "rI" } }, + + { INDEX_op_div_i64, { "c", "a", "b" } }, + { INDEX_op_rem_i64, { "c", "a", "b" } }, + { INDEX_op_divu_i64, { "c", "a", "b" } }, + { INDEX_op_remu_i64, { "c", "a", "b" } }, + + { INDEX_op_brcond_i64, { "rJ", "rI" } }, + + { INDEX_op_ext8s_i32, { "r", "rJ" } }, + { INDEX_op_ext16s_i32, { "r", "rJ" } }, + { INDEX_op_ext8s_i64, { "r", "rJ" } }, + { INDEX_op_ext16s_i64, { "r", "rJ" } }, + { INDEX_op_ext32s_i64, { "r", "rJ" } }, + + { INDEX_op_bswap16_i32, { "r", "rJ" } }, + { INDEX_op_bswap32_i32, { "r", "rJ" } }, + { INDEX_op_bswap16_i64, { "r", "rJ" } }, + { INDEX_op_bswap32_i64, { "r", "rJ" } }, + { INDEX_op_bswap64_i64, { "r", "rJ" } }, + + { INDEX_op_qemu_ld8u, { "r", "L" } }, + { INDEX_op_qemu_ld8s, { "r", "L" } }, + { INDEX_op_qemu_ld16u, { "r", "L" } }, + { INDEX_op_qemu_ld16s, { "r", "L" } }, + { INDEX_op_qemu_ld32u, { "r", "L" } }, + { INDEX_op_qemu_ld32s, { "r", "L" } }, + { INDEX_op_qemu_ld64, { "r", "L" } }, + + { INDEX_op_qemu_st8, { "L", "L" } }, + { INDEX_op_qemu_st16, { "L", "L" } }, + { INDEX_op_qemu_st32, { "L", "L" } }, + { INDEX_op_qemu_st64, { "L", "L" } }, + { -1 }, +}; + + +/* + * Generate global QEMU prologue and epilogue code + */ +void tcg_target_qemu_prologue(TCGContext *s) +{ + static const int save_regs[] = { + TCG_REG_26, + TCG_REG_9, + TCG_REG_10, + TCG_REG_11, + TCG_REG_12, + TCG_REG_13, + TCG_REG_14, + }; + + long i, frame_size, save_ofs; + uint8_t *ret_loc, *ent_loc; + + /* The shape of the stack frame is: + 
input sp + [ Register save area ] + [ TB return address ] + [ TCG_STATIC_CALL_ARGS_SIZE ] + sp + */ + + save_ofs = TCG_STATIC_CALL_ARGS_SIZE + 8; + frame_size = save_ofs + ARRAY_SIZE(save_regs) * 8; + frame_size += TCG_TARGET_STACK_ALIGN - 1; + frame_size &= -TCG_TARGET_STACK_ALIGN; + + /* TB Prologue. */ + ent_loc = s->code_ptr; + + /* Allocate the stack frame. */ + tcg_out_fmt_mem(s, INSN_LDA, TCG_REG_30, TCG_REG_30, -frame_size); + + /* Save all callee saved registers. */ + for (i = 0; i < ARRAY_SIZE(save_regs); i++) { + tcg_out_fmt_mem(s, INSN_STQ, save_regs[i], TCG_REG_30, save_ofs + i*8); + } + + /* ??? Store the return address of the TB. Ideally this would be + done at the beginning of the TB, storing $26 or something. But + we don't actually have control of the beginning of the TB. So + we compute this via arithmetic based off $27, which holds the + start address of this prologue. */ + ret_loc = s->code_ptr; + tcg_out_fmt_mem(s, INSN_LDA, TMP_REG1, TCG_REG_27, 0); + tcg_out_fmt_mem(s, INSN_STQ, TMP_REG1, TCG_REG_30, TB_RET_OFS); + + /* Invoke the TB. */ + tcg_out_fmt_jmp(s, INSN_CALL, TCG_REG_31, TCG_REG_16, 0); + + /* Fill in the offset for the TB return address, as described above. */ + i = s->code_ptr - ent_loc; + if (i != (int16_t)i) { + tcg_abort(); + } + *(int16_t *)ret_loc = i; + + /* TB epilogue. */ + + /* Restore all callee saved registers. */ + for (i = 0; i < ARRAY_SIZE(save_regs); i++) { + tcg_out_fmt_mem(s, INSN_LDQ, save_regs[i], TCG_REG_30, save_ofs + i*8); + } + + /* Deallocate the stack frame. */ + tcg_out_fmt_mem(s, INSN_LDA, TCG_REG_30, TCG_REG_30, frame_size); + + tcg_out_fmt_jmp(s, INSN_RET, TCG_REG_31, TCG_REG_26, 0); +} + + +void tcg_target_init(TCGContext *s) +{ + /* fail safe */ + if ((1 << CPU_TLB_ENTRY_BITS) != sizeof(CPUTLBEntry)) + tcg_abort(); + + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffffffff); + tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffffffff); + + tcg_regset_set32(tcg_target_call_clobber_regs, 0, 0xffffffff); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_9); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_10); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_11); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_12); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_13); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_14); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_15); + tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_30); + + tcg_regset_clear(s->reserved_regs); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_30); + tcg_regset_set_reg(s->reserved_regs, TCG_REG_31); + tcg_regset_set_reg(s->reserved_regs, TMP_REG1); + tcg_regset_set_reg(s->reserved_regs, TMP_REG2); + tcg_regset_set_reg(s->reserved_regs, TMP_REG3); + + tcg_add_target_add_op_defs(alpha_op_defs); +} + +/* Called from exec-all.h, tb_set_jmp_target. 
+   */
+void alpha_tb_set_jmp_target(unsigned long jmp_addr, unsigned long addr);
+void alpha_tb_set_jmp_target(unsigned long jmp_addr, unsigned long addr)
+{
+    TCGContext s;
+    long disp;
+
+    s.code_ptr = s.code_buf = (uint8_t *)jmp_addr;
+
+    /* The branch displacement is relative to the instruction following
+       the one being patched, counted in instructions.  */
+    disp = addr - (jmp_addr + 4);
+    if (disp < -0x400000 || disp >= 0x400000) {
+        tcg_out_movi(&s, TCG_TYPE_I64, TMP_REG1, addr);
+        tcg_out_fmt_jmp(&s, INSN_JMP, TCG_REG_31, TMP_REG1, addr);
+    } else {
+        tcg_out_fmt_br(&s, INSN_BR, TCG_REG_31, disp >> 2);
+    }
+
+    flush_icache_range(0, -1);
+}
diff --git a/tcg/alpha/tcg-target.h b/tcg/alpha/tcg-target.h
new file mode 100644
index 0000000..278ff6a
--- /dev/null
+++ b/tcg/alpha/tcg-target.h
@@ -0,0 +1,89 @@
+/*
+ * Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#define TCG_TARGET_ALPHA 1
+
+#define TCG_TARGET_REG_BITS 64
+
+#define TCG_TARGET_NB_REGS 32
+
+enum {
+    TCG_REG_0 = 0, TCG_REG_1, TCG_REG_2, TCG_REG_3,
+    TCG_REG_4, TCG_REG_5, TCG_REG_6, TCG_REG_7,
+    TCG_REG_8, TCG_REG_9, TCG_REG_10, TCG_REG_11,
+    TCG_REG_12, TCG_REG_13, TCG_REG_14, TCG_REG_15,
+    TCG_REG_16, TCG_REG_17, TCG_REG_18, TCG_REG_19,
+    TCG_REG_20, TCG_REG_21, TCG_REG_22, TCG_REG_23,
+    TCG_REG_24, TCG_REG_25, TCG_REG_26, TCG_REG_27,
+    TCG_REG_28, TCG_REG_29, TCG_REG_30, TCG_REG_31
+};
+
+#define TCG_CT_CONST_U8    0x100
+#define TCG_CT_CONST_ZERO  0x200
+
+/* Used for function call generation. */
+#define TCG_REG_CALL_STACK TCG_REG_30
+#define TCG_TARGET_STACK_ALIGN 16
+#define TCG_TARGET_CALL_STACK_OFFSET 0
+
+/* We have signed extension instructions. */
+#define TCG_TARGET_HAS_ext8s_i32
+#define TCG_TARGET_HAS_ext16s_i32
+#define TCG_TARGET_HAS_ext8s_i64
+#define TCG_TARGET_HAS_ext16s_i64
+#define TCG_TARGET_HAS_ext32s_i64
+
+/* We have single-output division routines. */
+#define TCG_TARGET_HAS_div_i32
+#define TCG_TARGET_HAS_div_i64
+
+/* We have optimized bswap routines. */
+#define TCG_TARGET_HAS_bswap16_i32
+#define TCG_TARGET_HAS_bswap32_i32
+#define TCG_TARGET_HAS_bswap16_i64
+#define TCG_TARGET_HAS_bswap32_i64
+#define TCG_TARGET_HAS_bswap64_i64
+
+/* We have NOT via ORNOT.
*/ +#define TCG_TARGET_HAS_not_i32 +#define TCG_TARGET_HAS_not_i64 + +#define TCG_TARGET_HAS_GUEST_BASE + +/* Note: must be synced with dyngen-exec.h */ +#define TCG_AREG0 TCG_REG_15 +#define TCG_AREG1 TCG_REG_9 +#define TCG_AREG2 TCG_REG_10 +#define TCG_AREG3 TCG_REG_11 +#define TCG_AREG4 TCG_REG_12 +#define TCG_AREG5 TCG_REG_13 +#define TCG_AREG6 TCG_REG_14 + +#define TMP_REG1 TCG_REG_28 +#define TMP_REG2 TCG_REG_29 +#define TMP_REG3 TCG_REG_23 + +static inline void flush_icache_range(unsigned long start, unsigned long stop) +{ + __asm__ __volatile__ ("call_pal 0x86"); +}
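
For reference while reviewing tcg_out_mem_long above: the LDAH/LDA splitting
it performs for values in the 32-bit range (including the extra LDAH of
0x4000 when the high half of a non-negative value would otherwise come out
negative) can be sanity-checked off-target with a small host-side program
along the following lines.  This is only an illustrative sketch of that
decomposition; the helper name check_lda_split and the test values are
invented for the example and are not part of the patch.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mimic the l0/l1/extra split used by tcg_out_mem_long for a 32-bit
   value: LDA adds a sign-extended 16-bit constant, LDAH adds a
   sign-extended 16-bit constant shifted left by 16.  The arithmetic
   right shift of a negative value below mirrors the patch's own
   assumption about signed shifts.  */
static void check_lda_split(int32_t orig)
{
    int64_t val = orig, l0, l1, extra = 0;

    l0 = (int16_t)val;
    val = (val - l0) >> 16;
    l1 = (int16_t)val;

    /* A non-negative value whose high half came out negative would not
       fit the plain two-insn sequence; split the high half across two
       LDAHs, one of them 0x4000.  */
    if (l1 < 0 && orig >= 0) {
        extra = 0x4000;
        l1 = (int16_t)(val - 0x4000);
    }

    /* Reconstruct what LDAH(l1) + LDAH(extra) + LDA(l0) would produce. */
    assert((int64_t)orig == l1 * 0x10000 + extra * 0x10000 + l0);
}

int main(void)
{
    static const int32_t tests[] = {
        0, 1, -1, 0x7fff, 0x8000, -0x8000, 0x12348000,
        0x7fff8000, 0x7fffffff, INT32_MIN
    };
    int i;

    for (i = 0; i < (int)(sizeof(tests) / sizeof(tests[0])); i++) {
        check_lda_split(tests[i]);
    }
    printf("LDA/LDAH decomposition checks passed\n");
    return 0;
}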