From patchwork Wed Feb 20 07:51:59 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 222013
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 19 Feb 2013 23:51:59 -0800
Message-Id: <1361346746-8511-12-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.8.1.2
In-Reply-To: <1361346746-8511-1-git-send-email-rth@twiddle.net>
References: <1361346746-8511-1-git-send-email-rth@twiddle.net>
Cc: blauwirbel@gmail.com, aurelien@aurel32.net
Subject: [Qemu-devel] [PATCH 11/38] target-i386: Use mulu2 and muls2

The mulu2 and muls2 TCG ops correspond very closely to the insns that
we're emulating.

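For readers unfamiliar with the ops: as used in this patch, mulu2 and muls2 produce both halves of a double-width product, which is the same shape as the EDX:EAX result of the x86 one-operand MUL and IMUL. The following is a minimal C sketch of the 32-bit case, not QEMU code. The names mulu2_32, muls2_32 and imul_overflow_32 are hypothetical and exist only for illustration. The last function mirrors how the IMUL hunks derive CC_SRC (sign fill of the low half minus the high half), which is nonzero exactly when the signed product does not fit in 32 bits.

```c
#include <stdint.h>

/* Hypothetical model of an unsigned 32x32 -> 64 multiply split into halves. */
static void mulu2_32(uint32_t *lo, uint32_t *hi, uint32_t a, uint32_t b)
{
    uint64_t p = (uint64_t)a * b;

    *lo = (uint32_t)p;          /* low 32 bits, what MUL leaves in EAX */
    *hi = (uint32_t)(p >> 32);  /* high 32 bits, what MUL leaves in EDX */
}

/* Hypothetical model of a signed 32x32 -> 64 multiply split into halves. */
static void muls2_32(uint32_t *lo, uint32_t *hi, int32_t a, int32_t b)
{
    int64_t p = (int64_t)a * b;

    *lo = (uint32_t)p;
    *hi = (uint32_t)((uint64_t)p >> 32);
}

/*
 * Overflow indicator in the style of the IMUL hunks: replicate the sign
 * bit of the low half across 32 bits and subtract the high half.  The
 * result is zero iff the high half is just the sign extension of the
 * low half, i.e. the product fits in a signed 32-bit value.
 */
static uint32_t imul_overflow_32(uint32_t lo, uint32_t hi)
{
    uint32_t sign_fill = (lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;

    return sign_fill - hi;
}
```

The sign fill is computed with a mask rather than a signed right shift only to keep the sketch free of implementation-defined behaviour in C; the patch itself uses an arithmetic shift op for the same purpose.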
Signed-off-by: Richard Henderson
---
 target-i386/helper.h     |   4 --
 target-i386/int_helper.c |  40 ------------
 target-i386/translate.c  | 167 ++++++++++++++++-------------------------------
 3 files changed, 56 insertions(+), 155 deletions(-)

diff --git a/target-i386/helper.h b/target-i386/helper.h
index 26a0cc8..d6974df 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -14,12 +14,8 @@ DEF_HELPER_2(idivw_AX, void, env, tl)
 DEF_HELPER_2(divl_EAX, void, env, tl)
 DEF_HELPER_2(idivl_EAX, void, env, tl)
 #ifdef TARGET_X86_64
-DEF_HELPER_2(mulq_EAX_T0, void, env, tl)
-DEF_HELPER_2(imulq_EAX_T0, void, env, tl)
-DEF_HELPER_3(imulq_T0_T1, tl, env, tl, tl)
 DEF_HELPER_2(divq_EAX, void, env, tl)
 DEF_HELPER_2(idivq_EAX, void, env, tl)
-DEF_HELPER_FLAGS_2(umulh, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 #endif
 
 DEF_HELPER_2(aam, void, env, int)
diff --git a/target-i386/int_helper.c b/target-i386/int_helper.c
index 3b56075..74c7c36 100644
--- a/target-i386/int_helper.c
+++ b/target-i386/int_helper.c
@@ -374,46 +374,6 @@ static int idiv64(uint64_t *plow, uint64_t *phigh, int64_t b)
     return 0;
 }
 
-void helper_mulq_EAX_T0(CPUX86State *env, target_ulong t0)
-{
-    uint64_t r0, r1;
-
-    mulu64(&r0, &r1, EAX, t0);
-    EAX = r0;
-    EDX = r1;
-    CC_DST = r0;
-    CC_SRC = r1;
-}
-
-target_ulong helper_umulh(target_ulong t0, target_ulong t1)
-{
-    uint64_t h, l;
-    mulu64(&l, &h, t0, t1);
-    return h;
-}
-
-void helper_imulq_EAX_T0(CPUX86State *env, target_ulong t0)
-{
-    uint64_t r0, r1;
-
-    muls64(&r0, &r1, EAX, t0);
-    EAX = r0;
-    EDX = r1;
-    CC_DST = r0;
-    CC_SRC = ((int64_t)r1 != ((int64_t)r0 >> 63));
-}
-
-target_ulong helper_imulq_T0_T1(CPUX86State *env, target_ulong t0,
-                                target_ulong t1)
-{
-    uint64_t r0, r1;
-
-    muls64(&r0, &r1, t0, t1);
-    CC_DST = r0;
-    CC_SRC = ((int64_t)r1 != ((int64_t)r0 >> 63));
-    return r0;
-}
-
 void helper_divq_EAX(CPUX86State *env, target_ulong t0)
 {
     uint64_t r0, r1;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 439d19e..1545e3f 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4111,31 +4111,18 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 ot = s->dflag == 2 ? OT_QUAD : OT_LONG;
                 gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
                 switch (ot) {
-                    TCGv_i64 t0, t1;
                 default:
-                    t0 = tcg_temp_new_i64();
-                    t1 = tcg_temp_new_i64();
-#ifdef TARGET_X86_64
-                    tcg_gen_ext32u_i64(t0, cpu_T[0]);
-                    tcg_gen_ext32u_i64(t1, cpu_regs[R_EDX]);
-#else
-                    tcg_gen_extu_i32_i64(t0, cpu_T[0]);
-                    tcg_gen_extu_i32_i64(t0, cpu_regs[R_EDX]);
-#endif
-                    tcg_gen_mul_i64(t0, t0, t1);
-                    tcg_gen_trunc_i64_tl(cpu_T[0], t0);
-                    tcg_gen_shri_i64(t0, t0, 32);
-                    tcg_gen_trunc_i64_tl(cpu_T[1], t0);
-                    tcg_temp_free_i64(t0);
-                    tcg_temp_free_i64(t1);
-                    gen_op_mov_reg_T0(OT_LONG, s->vex_v);
-                    gen_op_mov_reg_T1(OT_LONG, reg);
+                    tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+                    tcg_gen_trunc_tl_i32(cpu_tmp3_i32, cpu_regs[R_EDX]);
+                    tcg_gen_mulu2_i32(cpu_tmp2_i32, cpu_tmp3_i32,
+                                      cpu_tmp2_i32, cpu_tmp3_i32);
+                    tcg_gen_extu_i32_tl(cpu_regs[s->vex_v], cpu_tmp2_i32);
+                    tcg_gen_extu_i32_tl(cpu_regs[reg], cpu_tmp3_i32);
                     break;
 #ifdef TARGET_X86_64
                 case OT_QUAD:
-                    tcg_gen_mov_tl(cpu_T[1], cpu_regs[R_EDX]);
-                    tcg_gen_mul_tl(cpu_regs[s->vex_v], cpu_T[0], cpu_T[1]);
-                    gen_helper_umulh(cpu_regs[reg], cpu_T[0], cpu_T[1]);
+                    tcg_gen_mulu2_i64(cpu_regs[s->vex_v], cpu_regs[reg],
+                                      cpu_T[0], cpu_regs[R_EDX]);
                     break;
 #endif
                 }
@@ -5034,39 +5021,22 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
                 break;
             default:
             case OT_LONG:
-#ifdef TARGET_X86_64
-                gen_op_mov_TN_reg(OT_LONG, 1, R_EAX);
-                tcg_gen_ext32u_tl(cpu_T[0], cpu_T[0]);
-                tcg_gen_ext32u_tl(cpu_T[1], cpu_T[1]);
-                tcg_gen_mul_tl(cpu_T[0], cpu_T[0], cpu_T[1]);
-                gen_op_mov_reg_T0(OT_LONG, R_EAX);
-                tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-                tcg_gen_shri_tl(cpu_T[0], cpu_T[0], 32);
-                gen_op_mov_reg_T0(OT_LONG, R_EDX);
-                tcg_gen_mov_tl(cpu_cc_src, cpu_T[0]);
-#else
-                {
-                    TCGv_i64 t0, t1;
-                    t0 = tcg_temp_new_i64();
-                    t1 = tcg_temp_new_i64();
-                    gen_op_mov_TN_reg(OT_LONG, 1, R_EAX);
-                    tcg_gen_extu_i32_i64(t0, cpu_T[0]);
-                    tcg_gen_extu_i32_i64(t1, cpu_T[1]);
-                    tcg_gen_mul_i64(t0, t0, t1);
-                    tcg_gen_trunc_i64_i32(cpu_T[0], t0);
-                    gen_op_mov_reg_T0(OT_LONG, R_EAX);
-                    tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-                    tcg_gen_shri_i64(t0, t0, 32);
-                    tcg_gen_trunc_i64_i32(cpu_T[0], t0);
-                    gen_op_mov_reg_T0(OT_LONG, R_EDX);
-                    tcg_gen_mov_tl(cpu_cc_src, cpu_T[0]);
-                }
-#endif
+                tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+                tcg_gen_trunc_tl_i32(cpu_tmp3_i32, cpu_regs[R_EAX]);
+                tcg_gen_mulu2_i32(cpu_tmp2_i32, cpu_tmp3_i32,
+                                  cpu_tmp2_i32, cpu_tmp3_i32);
+                tcg_gen_extu_i32_tl(cpu_regs[R_EAX], cpu_tmp2_i32);
+                tcg_gen_extu_i32_tl(cpu_regs[R_EDX], cpu_tmp3_i32);
+                tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[R_EAX]);
+                tcg_gen_mov_tl(cpu_cc_src, cpu_regs[R_EDX]);
                 set_cc_op(s, CC_OP_MULL);
                 break;
 #ifdef TARGET_X86_64
             case OT_QUAD:
-                gen_helper_mulq_EAX_T0(cpu_env, cpu_T[0]);
+                tcg_gen_mulu2_i64(cpu_regs[R_EAX], cpu_regs[R_EDX],
+                                  cpu_T[0], cpu_regs[R_EAX]);
+                tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[R_EAX]);
+                tcg_gen_mov_tl(cpu_cc_src, cpu_regs[R_EDX]);
                 set_cc_op(s, CC_OP_MULQ);
                 break;
 #endif
@@ -5102,41 +5072,25 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
                 break;
             default:
             case OT_LONG:
-#ifdef TARGET_X86_64
-                gen_op_mov_TN_reg(OT_LONG, 1, R_EAX);
-                tcg_gen_ext32s_tl(cpu_T[0], cpu_T[0]);
-                tcg_gen_ext32s_tl(cpu_T[1], cpu_T[1]);
-                tcg_gen_mul_tl(cpu_T[0], cpu_T[0], cpu_T[1]);
-                gen_op_mov_reg_T0(OT_LONG, R_EAX);
-                tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-                tcg_gen_ext32s_tl(cpu_tmp0, cpu_T[0]);
-                tcg_gen_sub_tl(cpu_cc_src, cpu_T[0], cpu_tmp0);
-                tcg_gen_shri_tl(cpu_T[0], cpu_T[0], 32);
-                gen_op_mov_reg_T0(OT_LONG, R_EDX);
-#else
-                {
-                    TCGv_i64 t0, t1;
-                    t0 = tcg_temp_new_i64();
-                    t1 = tcg_temp_new_i64();
-                    gen_op_mov_TN_reg(OT_LONG, 1, R_EAX);
-                    tcg_gen_ext_i32_i64(t0, cpu_T[0]);
-                    tcg_gen_ext_i32_i64(t1, cpu_T[1]);
-                    tcg_gen_mul_i64(t0, t0, t1);
-                    tcg_gen_trunc_i64_i32(cpu_T[0], t0);
-                    gen_op_mov_reg_T0(OT_LONG, R_EAX);
-                    tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-                    tcg_gen_sari_tl(cpu_tmp0, cpu_T[0], 31);
-                    tcg_gen_shri_i64(t0, t0, 32);
-                    tcg_gen_trunc_i64_i32(cpu_T[0], t0);
-                    gen_op_mov_reg_T0(OT_LONG, R_EDX);
-                    tcg_gen_sub_tl(cpu_cc_src, cpu_T[0], cpu_tmp0);
-                }
-#endif
+                tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+                tcg_gen_trunc_tl_i32(cpu_tmp3_i32, cpu_regs[R_EAX]);
+                tcg_gen_muls2_i32(cpu_tmp2_i32, cpu_tmp3_i32,
+                                  cpu_tmp2_i32, cpu_tmp3_i32);
+                tcg_gen_extu_i32_tl(cpu_regs[R_EAX], cpu_tmp2_i32);
+                tcg_gen_extu_i32_tl(cpu_regs[R_EDX], cpu_tmp3_i32);
+                tcg_gen_sari_i32(cpu_tmp2_i32, cpu_tmp2_i32, 31);
+                tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[R_EAX]);
+                tcg_gen_sub_i32(cpu_tmp2_i32, cpu_tmp2_i32, cpu_tmp3_i32);
+                tcg_gen_extu_i32_tl(cpu_cc_src, cpu_tmp2_i32);
                 set_cc_op(s, CC_OP_MULL);
                 break;
 #ifdef TARGET_X86_64
             case OT_QUAD:
-                gen_helper_imulq_EAX_T0(cpu_env, cpu_T[0]);
+                tcg_gen_muls2_i64(cpu_regs[R_EAX], cpu_regs[R_EDX],
+                                  cpu_T[0], cpu_regs[R_EAX]);
+                tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[R_EAX]);
+                tcg_gen_sari_tl(cpu_cc_src, cpu_regs[R_EAX], 63);
+                tcg_gen_sub_tl(cpu_cc_src, cpu_cc_src, cpu_regs[R_EDX]);
                 set_cc_op(s, CC_OP_MULQ);
                 break;
 #endif
@@ -5391,37 +5345,27 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
         } else {
             gen_op_mov_TN_reg(ot, 1, reg);
         }
-
-#ifdef TARGET_X86_64
-        if (ot == OT_QUAD) {
-            gen_helper_imulq_T0_T1(cpu_T[0], cpu_env, cpu_T[0], cpu_T[1]);
-        } else
-#endif
-        if (ot == OT_LONG) {
+        switch (ot) {
 #ifdef TARGET_X86_64
-            tcg_gen_ext32s_tl(cpu_T[0], cpu_T[0]);
-            tcg_gen_ext32s_tl(cpu_T[1], cpu_T[1]);
-            tcg_gen_mul_tl(cpu_T[0], cpu_T[0], cpu_T[1]);
-            tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-            tcg_gen_ext32s_tl(cpu_tmp0, cpu_T[0]);
-            tcg_gen_sub_tl(cpu_cc_src, cpu_T[0], cpu_tmp0);
-#else
-            {
-                TCGv_i64 t0, t1;
-                t0 = tcg_temp_new_i64();
-                t1 = tcg_temp_new_i64();
-                tcg_gen_ext_i32_i64(t0, cpu_T[0]);
-                tcg_gen_ext_i32_i64(t1, cpu_T[1]);
-                tcg_gen_mul_i64(t0, t0, t1);
-                tcg_gen_trunc_i64_i32(cpu_T[0], t0);
-                tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
-                tcg_gen_sari_tl(cpu_tmp0, cpu_T[0], 31);
-                tcg_gen_shri_i64(t0, t0, 32);
-                tcg_gen_trunc_i64_i32(cpu_T[1], t0);
-                tcg_gen_sub_tl(cpu_cc_src, cpu_T[1], cpu_tmp0);
-            }
+        case OT_QUAD:
+            tcg_gen_muls2_i64(cpu_regs[reg], cpu_T[1], cpu_T[0], cpu_T[1]);
+            tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[reg]);
+            tcg_gen_sari_tl(cpu_cc_src, cpu_cc_dst, 63);
+            tcg_gen_sub_tl(cpu_cc_src, cpu_cc_src, cpu_T[1]);
+            break;
 #endif
-        } else {
+        case OT_LONG:
+            tcg_gen_trunc_tl_i32(cpu_tmp2_i32, cpu_T[0]);
+            tcg_gen_trunc_tl_i32(cpu_tmp3_i32, cpu_T[1]);
+            tcg_gen_muls2_i32(cpu_tmp2_i32, cpu_tmp3_i32,
+                              cpu_tmp2_i32, cpu_tmp3_i32);
+            tcg_gen_extu_i32_tl(cpu_regs[reg], cpu_tmp2_i32);
+            tcg_gen_sari_i32(cpu_tmp2_i32, cpu_tmp2_i32, 31);
+            tcg_gen_mov_tl(cpu_cc_dst, cpu_regs[reg]);
+            tcg_gen_sub_i32(cpu_tmp2_i32, cpu_tmp2_i32, cpu_tmp3_i32);
+            tcg_gen_extu_i32_tl(cpu_cc_src, cpu_tmp2_i32);
+            break;
+        default:
             tcg_gen_ext16s_tl(cpu_T[0], cpu_T[0]);
             tcg_gen_ext16s_tl(cpu_T[1], cpu_T[1]);
             /* XXX: use 32 bit mul which could be faster */
@@ -5429,8 +5373,9 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
             tcg_gen_mov_tl(cpu_cc_dst, cpu_T[0]);
             tcg_gen_ext16s_tl(cpu_tmp0, cpu_T[0]);
             tcg_gen_sub_tl(cpu_cc_src, cpu_T[0], cpu_tmp0);
+            gen_op_mov_reg_T0(ot, reg);
+            break;
         }
-        gen_op_mov_reg_T0(ot, reg);
         set_cc_op(s, CC_OP_MULB + ot);
         break;
     case 0x1c0: