From patchwork Tue Dec 15 03:50:04 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 41166 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 0ACCEB6F06 for ; Tue, 15 Dec 2009 14:50:52 +1100 (EST) Received: from localhost ([127.0.0.1]:48673 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NKOQz-0006wb-4a for incoming@patchwork.ozlabs.org; Mon, 14 Dec 2009 22:50:49 -0500 Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1NKOQN-0006tP-S2 for qemu-devel@nongnu.org; Mon, 14 Dec 2009 22:50:11 -0500 Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1NKOQM-0006sy-O4 for qemu-devel@nongnu.org; Mon, 14 Dec 2009 22:50:11 -0500 Received: from [199.232.76.173] (port=56933 helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1NKOQM-0006sU-H3 for qemu-devel@nongnu.org; Mon, 14 Dec 2009 22:50:10 -0500 Received: from are.twiddle.net ([75.149.56.221]:38604) by monty-python.gnu.org with esmtp (Exim 4.60) (envelope-from ) id 1NKOQK-0003Ek-5I for qemu-devel@nongnu.org; Mon, 14 Dec 2009 22:50:09 -0500 Received: from stone.twiddle.home (stone.twiddle.home [172.31.0.16]) by are.twiddle.net (Postfix) with ESMTPSA id CD482486; Mon, 14 Dec 2009 19:50:04 -0800 (PST) Message-ID: <4B27076C.6020806@twiddle.net> Date: Mon, 14 Dec 2009 19:50:04 -0800 From: Richard Henderson User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.4pre) Gecko/20090922 Fedora/3.0-3.9.b4.fc12 Thunderbird/3.0b4 MIME-Version: 1.0 To: Laurent Desnogues Subject: Re: [Qemu-devel] target-alpha: An approach to fp insn qualifiers References: <4B267DBC.8050909@twiddle.net> <761ea48b0912141211keb5bbben584d7fe76f44d78c@mail.gmail.com> <4B26D8EF.10801@twiddle.net> In-Reply-To: <4B26D8EF.10801@twiddle.net> X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6 (newer, 2) Cc: qemu-devel@nongnu.org X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org On 12/14/2009 04:31 PM, Richard Henderson wrote: > On 12/14/2009 12:11 PM, Laurent Desnogues wrote: >> I'll take a closer look at your patch tomorrow. > > For the record, I believe this finishes what I had in mind for the > exception handling there in op_handler.c. Hmph. One more patch for correctness. With this 183.equake runs correctly. I couldn't remember all the hoops to get runspec.pl to work, to do the whole testsuite, but I did run this one by hand. ./quake-amd64: Done. Terminating the simulation. real 0m34.943s user 0m34.913s sys 0m0.024s ./quake-axp: Done. Terminating the simulation. real 33m24.105s user 33m23.674s sys 0m0.116s with identical output. r~ commit daf11ad5cd50c56d44e36e4ea334c660f8fe4c16 Author: Richard Henderson Date: Mon Dec 14 19:46:57 2009 -0800 target-alpha: Don't ever saturate cvttq. The previous patch tried allowing saturation if /S; that doesn't match what the kernels generate. diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index d031f56..2d1c3d5 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -1220,120 +1220,106 @@ uint64_t helper_cvtqs (uint64_t a, uint32_t quals) is used by the compiler to get unsigned conversion for free with the same instruction. */ -static uint64_t cvttq_noqual_internal(uint64_t a, uint32_t rounding_mode) +static uint64_t cvttq_internal(uint64_t a) { uint64_t frac, ret = 0; - uint32_t exp, sign; + uint32_t exp, sign, exc = 0; int shift; sign = (a >> 63); exp = (uint32_t)(a >> 52) & 0x7ff; frac = a & 0xfffffffffffffull; - /* We already handled denormals in remap_ieee_input; infinities and - nans are defined to return zero as per truncation. */ - if (exp == 0 || exp == 0x7ff) - return 0; - - /* Restore implicit bit. */ - frac |= 0x10000000000000ull; - - /* Note that neither overflow exceptions nor inexact exceptions - are desired. This lets us streamline the checks quite a bit. */ - shift = exp - 1023 - 52; - if (shift >= 0) { - /* In this case the number is so large that we must shift - the fraction left. There is no rounding to do. */ - if (shift < 63) { - ret = frac << shift; - } + if (exp == 0) { + if (unlikely(frac != 0)) + goto do_underflow; + } else if (exp == 0x7ff) { + if (frac == 0) + exc = float_flag_overflow; + else + exc = float_flag_invalid; } else { - uint64_t round; - - /* In this case the number is smaller than the fraction as - represented by the 52 bit number. Here we must think - about rounding the result. Handle this by shifting the - fractional part of the number into the high bits of ROUND. - This will let us efficiently handle round-to-nearest. */ - shift = -shift; - if (shift < 63) { - ret = frac >> shift; - round = frac << (64 - shift); + /* Restore implicit bit. */ + frac |= 0x10000000000000ull; + + /* Note that neither overflow exceptions nor inexact exceptions + are desired. This lets us streamline the checks quite a bit. */ + shift = exp - 1023 - 52; + if (shift >= 0) { + /* In this case the number is so large that we must shift + the fraction left. There is no rounding to do. */ + if (shift < 63) { + ret = frac << shift; + if ((ret >> shift) != frac) + exc = float_flag_overflow; + } } else { - /* The exponent is so small we shift out everything. - Leave a sticky bit for proper rounding below. */ - round = 1; - } + uint64_t round; + + /* In this case the number is smaller than the fraction as + represented by the 52 bit number. Here we must think + about rounding the result. Handle this by shifting the + fractional part of the number into the high bits of ROUND. + This will let us efficiently handle round-to-nearest. */ + shift = -shift; + if (shift < 63) { + ret = frac >> shift; + round = frac << (64 - shift); + } else { + /* The exponent is so small we shift out everything. + Leave a sticky bit for proper rounding below. */ + do_underflow: + round = 1; + } - if (round) { - switch (rounding_mode) { - case float_round_nearest_even: - if (round == (1ull << 63)) { - /* Remaining fraction is exactly 0.5; round to even. */ - ret += (ret & 1); - } else if (round > (1ull << 63)) { - ret += 1; + if (round) { + exc = float_flag_inexact; + switch (FP_STATUS.float_rounding_mode) { + case float_round_nearest_even: + if (round == (1ull << 63)) { + /* Fraction is exactly 0.5; round to even. */ + ret += (ret & 1); + } else if (round > (1ull << 63)) { + ret += 1; + } + break; + case float_round_to_zero: + break; + case float_round_up: + if (!sign) + ret += 1; + break; + case float_round_down: + if (sign) + ret += 1; + break; } - break; - case float_round_to_zero: - break; - case float_round_up: - if (!sign) - ret += 1; - break; - case float_round_down: - if (sign) - ret += 1; - break; } } + if (sign) + ret = -ret; } + if (unlikely(exc)) + float_raise(exc, &FP_STATUS); - if (sign) - ret = -ret; return ret; } uint64_t helper_cvttq (uint64_t a, uint32_t quals) { uint64_t ret; + uint32_t token; - a = remap_ieee_input(quals, a); - - if (quals & QUAL_V) { - float64 fa = t_to_float64(a); - uint32_t token; - - token = begin_fp_exception(); - if ((quals & QUAL_RM_MASK) == QUAL_RM_C) { - ret = float64_to_int64_round_to_zero(fa, &FP_STATUS); - } else { - token |= begin_fp_roundmode(quals); - ret = float64_to_int64(fa, &FP_STATUS); - end_fp_roundmode(token); - } - end_fp_exception(quals, token); - } else { - uint32_t round_mode; - - switch (quals & QUAL_RM_MASK) { - case QUAL_RM_N: - round_mode = float_round_nearest_even; - break; - case QUAL_RM_C: - default: - round_mode = float_round_to_zero; - break; - case QUAL_RM_M: - round_mode = float_round_down; - break; - case QUAL_RM_D: - round_mode = FP_STATUS.float_rounding_mode; - break; - } + /* ??? There's an arugument to be made that when /S is enabled, we + should provide the standard IEEE saturated result, instead of + the truncated result that we *must* provide when /V is disabled. + However, that's not how either the Tru64 or Linux completion + handlers actually work, and GCC knows it. */ - ret = cvttq_noqual_internal(a, round_mode); - } + token = begin_fp(quals); + a = remap_ieee_input(quals, a); + ret = cvttq_internal(a); + end_fp(quals, token); return ret; }