From patchwork Thu Jan 20 17:16:33 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 79735
Message-ID: <4D386DF1.4060802@st.com>
Date: Thu, 20 Jan 2011 18:16:33 +0100
From: Christophe Lyon
To: "qemu-devel@nongnu.org"
Subject: [Qemu-devel] [PATCH] target-arm: Set the right overflow bit for neon 32 and 64 bit saturating add/sub.
List-Id: qemu-devel.nongnu.org

Set the right overflow bit for neon 32-bit and 64-bit saturating add/sub.

Also move the neon 64-bit saturating add/sub helpers to neon_helper.c
for consistency with the 32-bit versions.

There is probably still room to factor out common code, though.

Peter, this patch is based upon your patch 6f83e7d and adds the 64-bit case.
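
For readers unfamiliar with the pattern, here is a minimal standalone sketch of the 32-bit saturating-add logic that the helpers in this patch use. It is not QEMU code: `sketch_cpu`, `sketch_qadd_u32`, `sketch_qadd_s32`, `SKETCH_SIGNBIT` and `sketch_self_check` are hypothetical names, and the plain `qc` field is only an assumed stand-in for whatever `SET_QC()` updates in the real CPU state (its definition is not part of this patch).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state. Only a saturation flag is
 * modelled; this is NOT QEMU's CPUState. */
typedef struct {
    int qc; /* set to 1 whenever a result had to be clamped */
} sketch_cpu;

#define SKETCH_SIGNBIT 0x80000000u

/* Unsigned saturating add: if the sum wrapped around, it is smaller
 * than either operand, so clamp to the maximum and record it. */
static uint32_t sketch_qadd_u32(sketch_cpu *cpu, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    if (res < a) {
        cpu->qc = 1;
        res = UINT32_MAX;
    }
    return res;
}

/* Signed saturating add on two's-complement bit patterns: overflow
 * happens only when both operands share a sign and the sign of the
 * result differs from it. Clamp toward the sign of the operands. */
static uint32_t sketch_qadd_s32(sketch_cpu *cpu, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    if (((res ^ a) & SKETCH_SIGNBIT) && !((a ^ b) & SKETCH_SIGNBIT)) {
        cpu->qc = 1;
        res = (a & SKETCH_SIGNBIT) ? SKETCH_SIGNBIT : ~SKETCH_SIGNBIT;
    }
    return res;
}

/* Small self-check exercising the in-range and clamped paths. */
static void sketch_self_check(void)
{
    sketch_cpu cpu = { 0 };

    assert(sketch_qadd_u32(&cpu, 1u, 2u) == 3u);
    assert(cpu.qc == 0);

    assert(sketch_qadd_u32(&cpu, UINT32_MAX, 1u) == UINT32_MAX);
    assert(cpu.qc == 1);

    cpu.qc = 0;
    assert(sketch_qadd_s32(&cpu, 0x7fffffffu, 1u) == 0x7fffffffu);
    assert(cpu.qc == 1);
}
```

The subtract variants have the same shape; for the signed case the second condition flips, since a subtraction can overflow only when the operands have different signs.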

Signed-off-by: Christophe Lyon
Signed-off-by: Peter Maydell
---
 target-arm/helpers.h     |   12 ++++--
 target-arm/neon_helper.c |   89 ++++++++++++++++++++++++++++++++++++++++++++++
 target-arm/op_helper.c   |   49 -------------------------
 target-arm/translate.c   |   18 ++++-----
 4 files changed, 105 insertions(+), 63 deletions(-)

diff --git a/target-arm/helpers.h b/target-arm/helpers.h
index b88ebae..8a2564e 100644
--- a/target-arm/helpers.h
+++ b/target-arm/helpers.h
@@ -137,10 +137,6 @@ DEF_HELPER_2(rsqrte_f32, f32, f32, env)
 DEF_HELPER_2(recpe_u32, i32, i32, env)
 DEF_HELPER_2(rsqrte_u32, i32, i32, env)
 DEF_HELPER_4(neon_tbl, i32, i32, i32, i32, i32)
-DEF_HELPER_2(neon_add_saturate_u64, i64, i64, i64)
-DEF_HELPER_2(neon_add_saturate_s64, i64, i64, i64)
-DEF_HELPER_2(neon_sub_saturate_u64, i64, i64, i64)
-DEF_HELPER_2(neon_sub_saturate_s64, i64, i64, i64)
 
 DEF_HELPER_2(add_cc, i32, i32, i32)
 DEF_HELPER_2(adc_cc, i32, i32, i32)
@@ -160,10 +156,18 @@ DEF_HELPER_3(neon_qadd_u8, i32, env, i32, i32)
 DEF_HELPER_3(neon_qadd_s8, i32, env, i32, i32)
 DEF_HELPER_3(neon_qadd_u16, i32, env, i32, i32)
 DEF_HELPER_3(neon_qadd_s16, i32, env, i32, i32)
+DEF_HELPER_3(neon_qadd_u32, i32, env, i32, i32)
+DEF_HELPER_3(neon_qadd_s32, i32, env, i32, i32)
 DEF_HELPER_3(neon_qsub_u8, i32, env, i32, i32)
 DEF_HELPER_3(neon_qsub_s8, i32, env, i32, i32)
 DEF_HELPER_3(neon_qsub_u16, i32, env, i32, i32)
 DEF_HELPER_3(neon_qsub_s16, i32, env, i32, i32)
+DEF_HELPER_3(neon_qsub_u32, i32, env, i32, i32)
+DEF_HELPER_3(neon_qsub_s32, i32, env, i32, i32)
+DEF_HELPER_3(neon_qadd_u64, i64, env, i64, i64)
+DEF_HELPER_3(neon_qadd_s64, i64, env, i64, i64)
+DEF_HELPER_3(neon_qsub_u64, i64, env, i64, i64)
+DEF_HELPER_3(neon_qsub_s64, i64, env, i64, i64)
 
 DEF_HELPER_2(neon_hadd_s8, i32, i32, i32)
 DEF_HELPER_2(neon_hadd_u8, i32, i32, i32)
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 20f3c16..c1619c0 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -198,6 +198,28 @@ NEON_VOP_ENV(qadd_u16, neon_u16, 2)
 #undef NEON_FN
 #undef NEON_USAT
 
+uint32_t HELPER(neon_qadd_u32)(CPUState *env, uint32_t a, uint32_t b)
+{
+    uint32_t res = a + b;
+    if (res < a) {
+        SET_QC();
+        res = ~0;
+    }
+    return res;
+}
+
+uint64_t HELPER(neon_qadd_u64)(CPUState *env, uint64_t src1, uint64_t src2)
+{
+    uint64_t res;
+
+    res = src1 + src2;
+    if (res < src1) {
+        SET_QC();
+        res = ~(uint64_t)0;
+    }
+    return res;
+}
+
 #define NEON_SSAT(dest, src1, src2, type) do { \
     int32_t tmp = (uint32_t)src1 + (uint32_t)src2; \
     if (tmp != (type)tmp) { \
@@ -218,6 +240,28 @@ NEON_VOP_ENV(qadd_s16, neon_s16, 2)
 #undef NEON_FN
 #undef NEON_SSAT
 
+uint32_t HELPER(neon_qadd_s32)(CPUState *env, uint32_t a, uint32_t b)
+{
+    uint32_t res = a + b;
+    if (((res ^ a) & SIGNBIT) && !((a ^ b) & SIGNBIT)) {
+        SET_QC();
+        res = ~(((int32_t)a >> 31) ^ SIGNBIT);
+    }
+    return res;
+}
+
+uint64_t HELPER(neon_qadd_s64)(CPUState *env, uint64_t src1, uint64_t src2)
+{
+    uint64_t res;
+
+    res = src1 + src2;
+    if (((res ^ src1) & SIGNBIT64) && !((src1 ^ src2) & SIGNBIT64)) {
+        SET_QC();
+        res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64;
+    }
+    return res;
+}
+
 #define NEON_USAT(dest, src1, src2, type) do { \
     uint32_t tmp = (uint32_t)src1 - (uint32_t)src2; \
     if (tmp != (type)tmp) { \
@@ -234,6 +278,29 @@ NEON_VOP_ENV(qsub_u16, neon_u16, 2)
 #undef NEON_FN
 #undef NEON_USAT
 
+uint32_t HELPER(neon_qsub_u32)(CPUState *env, uint32_t a, uint32_t b)
+{
+    uint32_t res = a - b;
+    if (res > a) {
+        SET_QC();
+        res = 0;
+    }
+    return res;
+}
+
+uint64_t HELPER(neon_qsub_u64)(CPUState *env, uint64_t src1, uint64_t src2)
+{
+    uint64_t res;
+
+    if (src1 < src2) {
+        SET_QC();
+        res = 0;
+    } else {
+        res = src1 - src2;
+    }
+    return res;
+}
+
 #define NEON_SSAT(dest, src1, src2, type) do { \
     int32_t tmp = (uint32_t)src1 - (uint32_t)src2; \
     if (tmp != (type)tmp) { \
@@ -254,6 +321,28 @@ NEON_VOP_ENV(qsub_s16, neon_s16, 2)
 #undef NEON_FN
 #undef NEON_SSAT
 
+uint32_t HELPER(neon_qsub_s32)(CPUState *env, uint32_t a, uint32_t b)
+{
+    uint32_t res = a - b;
+    if (((res ^ a) & SIGNBIT) && ((a ^ b) & SIGNBIT)) {
+        SET_QC();
+        res = ~(((int32_t)a >> 31) ^ SIGNBIT);
+    }
+    return res;
+}
+
+uint64_t HELPER(neon_qsub_s64)(CPUState *env, uint64_t src1, uint64_t src2)
+{
+    uint64_t res;
+
+    res = src1 - src2;
+    if (((res ^ src1) & SIGNBIT64) && ((src1 ^ src2) & SIGNBIT64)) {
+        SET_QC();
+        res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64;
+    }
+    return res;
+}
+
 #define NEON_FN(dest, src1, src2) dest = (src1 + src2) >> 1
 NEON_VOP(hadd_s8, neon_s8, 4)
 NEON_VOP(hadd_u8, neon_u8, 4)
diff --git a/target-arm/op_helper.c b/target-arm/op_helper.c
index 43baa63..3de2610 100644
--- a/target-arm/op_helper.c
+++ b/target-arm/op_helper.c
@@ -424,52 +424,3 @@ uint32_t HELPER(ror_cc)(uint32_t x, uint32_t i)
         return ((uint32_t)x >> shift) | (x << (32 - shift));
     }
 }
-
-uint64_t HELPER(neon_add_saturate_s64)(uint64_t src1, uint64_t src2)
-{
-    uint64_t res;
-
-    res = src1 + src2;
-    if (((res ^ src1) & SIGNBIT64) && !((src1 ^ src2) & SIGNBIT64)) {
-        env->QF = 1;
-        res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64;
-    }
-    return res;
-}
-
-uint64_t HELPER(neon_add_saturate_u64)(uint64_t src1, uint64_t src2)
-{
-    uint64_t res;
-
-    res = src1 + src2;
-    if (res < src1) {
-        env->QF = 1;
-        res = ~(uint64_t)0;
-    }
-    return res;
-}
-
-uint64_t HELPER(neon_sub_saturate_s64)(uint64_t src1, uint64_t src2)
-{
-    uint64_t res;
-
-    res = src1 - src2;
-    if (((res ^ src1) & SIGNBIT64) && ((src1 ^ src2) & SIGNBIT64)) {
-        env->QF = 1;
-        res = ((int64_t)src1 >> 63) ^ ~SIGNBIT64;
-    }
-    return res;
-}
-
-uint64_t HELPER(neon_sub_saturate_u64)(uint64_t src1, uint64_t src2)
-{
-    uint64_t res;
-
-    if (src1 < src2) {
-        env->QF = 1;
-        res = 0;
-    } else {
-        res = src1 - src2;
-    }
-    return res;
-}
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 41cbb96..d4566f2 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3539,12 +3539,6 @@ static inline void gen_neon_rsb(int size, TCGv t0, TCGv t1)
 #define gen_helper_neon_pmin_s32 gen_helper_neon_min_s32
 #define gen_helper_neon_pmin_u32 gen_helper_neon_min_u32
 
-/* FIXME: This is wrong. They set the wrong overflow bit. */
-#define gen_helper_neon_qadd_s32(a, e, b, c) gen_helper_add_saturate(a, b, c)
-#define gen_helper_neon_qadd_u32(a, e, b, c) gen_helper_add_usaturate(a, b, c)
-#define gen_helper_neon_qsub_s32(a, e, b, c) gen_helper_sub_saturate(a, b, c)
-#define gen_helper_neon_qsub_u32(a, e, b, c) gen_helper_sub_usaturate(a, b, c)
-
 #define GEN_NEON_INTEGER_OP_ENV(name) do { \
     switch ((size << 1) | u) { \
     case 0: \
@@ -4233,16 +4227,20 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                 switch (op) {
                 case 1: /* VQADD */
                     if (u) {
-                        gen_helper_neon_add_saturate_u64(CPU_V001);
+                        gen_helper_neon_qadd_u64(cpu_V0, cpu_env,
+                                                 cpu_V0, cpu_V1);
                     } else {
-                        gen_helper_neon_add_saturate_s64(CPU_V001);
+                        gen_helper_neon_qadd_s64(cpu_V0, cpu_env,
+                                                 cpu_V0, cpu_V1);
                     }
                     break;
                 case 5: /* VQSUB */
                     if (u) {
-                        gen_helper_neon_sub_saturate_u64(CPU_V001);
+                        gen_helper_neon_qsub_u64(cpu_V0, cpu_env,
+                                                 cpu_V0, cpu_V1);
                     } else {
-                        gen_helper_neon_sub_saturate_s64(CPU_V001);
+                        gen_helper_neon_qsub_s64(cpu_V0, cpu_env,
+                                                 cpu_V0, cpu_V1);
                     }
                     break;
                 case 8: /* VSHL */