From patchwork Fri Feb 11 15:10:57 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Lyon X-Patchwork-Id: 82803 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 45E7BB715C for ; Sat, 12 Feb 2011 04:37:06 +1100 (EST) Received: from localhost ([127.0.0.1]:48654 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1Pnujd-0002Qm-85 for incoming@patchwork.ozlabs.org; Fri, 11 Feb 2011 10:16:37 -0500 Received: from [140.186.70.92] (port=54048 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PnueX-0008RY-KY for qemu-devel@nongnu.org; Fri, 11 Feb 2011 10:11:24 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PnueQ-0003ci-P1 for qemu-devel@nongnu.org; Fri, 11 Feb 2011 10:11:21 -0500 Received: from eu1sys200aog107.obsmtp.com ([207.126.144.123]:47550) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PnueQ-0003c8-Dp for qemu-devel@nongnu.org; Fri, 11 Feb 2011 10:11:14 -0500 Received: from source ([164.129.1.35]) (using TLSv1) by eu1sys200aob107.postini.com ([207.126.147.11]) with SMTP ID DSNKTVVRkASAg5CrfS7C7CKGhWzTi99HPNTr@postini.com; Fri, 11 Feb 2011 15:11:14 UTC Received: from zeta.dmz-eu.st.com (ns2.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 267D0171 for ; Fri, 11 Feb 2011 15:11:12 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas5.st.com [10.75.90.71]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 04F224D11 for ; Fri, 11 Feb 2011 15:11:11 +0000 (GMT) Received: from localhost.localdomain (164.129.122.40) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.2.234.1; Fri, 11 Feb 2011 16:11:11 +0100 From: To: Date: Fri, 11 Feb 2011 16:10:57 +0100 Message-ID: <1297437062-6118-2-git-send-email-christophe.lyon@st.com> X-Mailer: git-send-email 1.7.2.3 In-Reply-To: <1297437062-6118-1-git-send-email-christophe.lyon@st.com> References: <1297437062-6118-1-git-send-email-christophe.lyon@st.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Received-From: 207.126.144.123 Subject: [Qemu-devel] [PATCH 1/6] target-arm: Fix rounding constant addition for Neon shift instructions. X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Christophe Lyon Handle cases where adding the rounding constant could overflow in Neon shift instructions: VRSHR, VRSRA, VQRSHRN, VQRSHRUN, VRSHRN. Signed-off-by: Christophe Lyon --- target-arm/neon_helper.c | 149 ++++++++++++++++++++++++++++++++++++++++++---- 1 files changed, 137 insertions(+), 12 deletions(-) diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c index cf82072..3f1f3d4 100644 --- a/target-arm/neon_helper.c +++ b/target-arm/neon_helper.c @@ -558,9 +558,34 @@ uint64_t HELPER(neon_shl_s64)(uint64_t valop, uint64_t shiftop) }} while (0) NEON_VOP(rshl_s8, neon_s8, 4) NEON_VOP(rshl_s16, neon_s16, 2) -NEON_VOP(rshl_s32, neon_s32, 1) #undef NEON_FN +/* The addition of the rounding constant may overflow, so we use an + * intermediate 64 bits accumulator. */ +uint32_t HELPER(neon_rshl_s32)(uint32_t valop, uint32_t shiftop) +{ + int32_t dest; + int32_t val = (int32_t)valop; + int8_t shift = (int8_t)shiftop; + if (shift >= 32) { + dest = 0; + } else if (shift < -32) { + dest = val >> 31; + } else if (shift == -32) { + dest = val >> 31; + dest++; + dest >>= 1; + } else if (shift < 0) { + int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); + dest = big_dest >> -shift; + } else { + dest = val << shift; + } + return dest; +} + +/* Handling addition overflow with 64 bits inputs values is more + * tricky than with 32 bits values. */ uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) { int8_t shift = (int8_t)shiftop; @@ -574,7 +599,16 @@ uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) val++; val >>= 1; } else if (shift < 0) { - val = (val + ((int64_t)1 << (-1 - shift))) >> -shift; + val >>= (-shift - 1); + if (val == INT64_MAX) { + /* In this case, it means that the rounding constant is 1, + * and the addition would overflow. Return the actual + * result directly. */ + val = 0x4000000000000000LL; + } else { + val++; + val >>= 1; + } } else { val <<= shift; } @@ -596,9 +630,29 @@ uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) }} while (0) NEON_VOP(rshl_u8, neon_u8, 4) NEON_VOP(rshl_u16, neon_u16, 2) -NEON_VOP(rshl_u32, neon_u32, 1) #undef NEON_FN +/* The addition of the rounding constant may overflow, so we use an + * intermediate 64 bits accumulator. */ +uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shiftop) +{ + uint32_t dest; + int8_t shift = (int8_t)shiftop; + if (shift >= 32 || shift < -32) { + dest = 0; + } else if (shift == -32) { + dest = val >> 31; + } else if (shift < 0) { + uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); + dest = big_dest >> -shift; + } else { + dest = val << shift; + } + return dest; +} + +/* Handling addition overflow with 64 bits inputs values is more + * tricky than with 32 bits values. */ uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop) { int8_t shift = (uint8_t)shiftop; @@ -607,9 +661,17 @@ uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop) } else if (shift == -64) { /* Rounding a 1-bit result just preserves that bit. */ val >>= 63; - } if (shift < 0) { - val = (val + ((uint64_t)1 << (-1 - shift))) >> -shift; - val >>= -shift; + } else if (shift < 0) { + val >>= (-shift - 1); + if (val == UINT64_MAX) { + /* In this case, it means that the rounding constant is 1, + * and the addition would overflow. Return the actual + * result directly. */ + val = 0x8000000000000000ULL; + } else { + val++; + val >>= 1; + } } else { val <<= shift; } @@ -784,14 +846,43 @@ uint64_t HELPER(neon_qshlu_s64)(CPUState *env, uint64_t valop, uint64_t shiftop) }} while (0) NEON_VOP_ENV(qrshl_u8, neon_u8, 4) NEON_VOP_ENV(qrshl_u16, neon_u16, 2) -NEON_VOP_ENV(qrshl_u32, neon_u32, 1) #undef NEON_FN +/* The addition of the rounding constant may overflow, so we use an + * intermediate 64 bits accumulator. */ +uint32_t HELPER(neon_qrshl_u32)(CPUState *env, uint32_t val, uint32_t shiftop) +{ + uint32_t dest; + int8_t shift = (int8_t)shiftop; + if (shift < 0) { + uint64_t big_dest = ((uint64_t)val + ( 1 << (-1 - shift))); + dest = big_dest >> -shift; + } else { + dest = val << shift; + if ((dest >> shift) != val) { + SET_QC(); + dest = ~0; + } + } + return dest; +} + +/* Handling addition overflow with 64 bits inputs values is more + * tricky than with 32 bits values. */ uint64_t HELPER(neon_qrshl_u64)(CPUState *env, uint64_t val, uint64_t shiftop) { int8_t shift = (int8_t)shiftop; if (shift < 0) { - val = (val + (1 << (-1 - shift))) >> -shift; + val >>= (-shift - 1); + if (val == UINT64_MAX) { + /* In this case, it means that the rounding constant is 1, + * and the addition would overflow. Return the actual + * result directly. */ + val = 0x8000000000000000ULL; + } else { + val++; + val >>= 1; + } } else { \ uint64_t tmp = val; val <<= shift; @@ -817,22 +908,56 @@ uint64_t HELPER(neon_qrshl_u64)(CPUState *env, uint64_t val, uint64_t shiftop) }} while (0) NEON_VOP_ENV(qrshl_s8, neon_s8, 4) NEON_VOP_ENV(qrshl_s16, neon_s16, 2) -NEON_VOP_ENV(qrshl_s32, neon_s32, 1) #undef NEON_FN +/* The addition of the rounding constant may overflow, so we use an + * intermediate 64 bits accumulator. */ +uint32_t HELPER(neon_qrshl_s32)(CPUState *env, uint32_t valop, uint32_t shiftop) +{ + int32_t dest; + int32_t val = (int32_t)valop; + int8_t shift = (int8_t)shiftop; + if (shift < 0) { + int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); + dest = big_dest >> -shift; + } else { + dest = val << shift; + if ((dest >> shift) != val) { + SET_QC(); + dest = (uint32_t)(1 << (sizeof(val) * 8 - 1)) - (val > 0 ? 1 : 0); + } + } + return dest; +} + +/* Handling addition overflow with 64 bits inputs values is more + * tricky than with 32 bits values. */ uint64_t HELPER(neon_qrshl_s64)(CPUState *env, uint64_t valop, uint64_t shiftop) { int8_t shift = (uint8_t)shiftop; int64_t val = valop; if (shift < 0) { - val = (val + (1 << (-1 - shift))) >> -shift; + val >>= (-shift - 1); + if (val == INT64_MAX) { + /* In this case, it means that the rounding constant is 1, + * and the addition would overflow. Return the actual + * result directly. */ + val = 0x4000000000000000ULL; + } else { + val++; + val >>= 1; + } } else { - int64_t tmp = val;; + int64_t tmp = val; val <<= shift; if ((val >> shift) != tmp) { SET_QC(); - val = tmp >> 31; + if (tmp < 0) { + val = INT64_MIN; + } else { + val = INT64_MAX; + } } } return val;