Patchwork [V4,01/22] softfloat: Fix float64_to_uint64

Submitter Tom Musta
Date Dec. 18, 2013, 8:19 p.m.
Message ID <1387397961-4894-2-git-send-email-tommusta@gmail.com>
Permalink /patch/303090/
State New

Comments

Tom Musta - Dec. 18, 2013, 8:19 p.m.
The comment preceding the float64_to_uint64 routine suggests that
the implementation is broken.  And this is, indeed, the case.

This patch properly implements the conversion of a 64-bit floating
point number to an unsigned, 64 bit integer.

This contribution can be licensed under either the softfloat-2a or -2b
license.

V2: Added softfloat license statement.

V3: Modified to meet QEMU coding conventions.

V4: Fixed incorrect handling of small negatives, which, if rounded
up to zero should not set the inexact flag.

Signed-off-by: Tom Musta <tommusta@gmail.com>
---
 fpu/softfloat.c |   98 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 89 insertions(+), 9 deletions(-)
Peter Maydell - Dec. 19, 2013, 10:11 p.m.
On 18 December 2013 20:19, Tom Musta <tommusta@gmail.com> wrote:
> The comment preceding the float64_to_uint64 routine suggests that
> the implementation is broken.  And this is, indeed, the case.
>
> This patch properly implements the conversion of a 64-bit floating
> point number to an unsigned, 64 bit integer.
>
> This contribution can be licensed under either the softfloat-2a or -2b
> license.
>
> V2: Added softfloat license statement.
>
> V3: Modified to meet QEMU coding conventions.
>
> V4: Fixed incorrect handling of small negatives, which, if rounded
> up to zero should not set the inexact flag.
>
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> ---
>  fpu/softfloat.c |   98 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 89 insertions(+), 9 deletions(-)
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index dbda61b..ec23908 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -161,7 +161,6 @@ static int32 roundAndPackInt32( flag zSign, uint64_t absZ STATUS_PARAM)
>  | exception is raised and the largest positive or negative integer is
>  | returned.
>  *----------------------------------------------------------------------------*/
> -
>  static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATUS_PARAM)
>  {
>      int8 roundingMode;
> @@ -204,6 +203,56 @@ static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATU
>  }
>
>  /*----------------------------------------------------------------------------
> +| Takes the 128-bit fixed-point value formed by concatenating `absZ0' and
> +| `absZ1', with binary point between bits 63 and 64 (between the input words),
> +| and returns the properly rounded 64-bit unsigned integer corresponding to the
> +| input.  Ordinarily, the fixed-point input is simply rounded to an integer,
> +| with the inexact exception raised if the input cannot be represented exactly
> +| as an integer.  However, if the fixed-point input is too large, the invalid
> +| exception is raised and the largest unsigned integer is returned.
> +*----------------------------------------------------------------------------*/

You should probably say in this comment what the behaviour is for
negative inputs.

> +uint64_t float64_to_uint64(float64 a STATUS_PARAM)
> +{
> +    flag aSign;
> +    int_fast16_t aExp, shiftCount;
> +    uint64_t aSig, aSigExtra;
> +    a = float64_squash_input_denormal(a STATUS_VAR);
>
> -    return v - INT64_MIN;
> +    aSig = extractFloat64Frac(a);
> +    aExp = extractFloat64Exp(a);
> +    aSign = extractFloat64Sign(a);
> +    if (aSign && (aExp > 1022)) {
> +        float_raise(float_flag_invalid STATUS_VAR);
> +        return 0;

This incorrectly returns 0 rather than largest-positive-integer
for NaNs with the sign bit set.

> +    }
> +    if (aExp) {
> +        aSig |= LIT64(0x0010000000000000);
> +    }
> +    shiftCount = 0x433 - aExp;
> +    if (shiftCount <= 0) {
> +        if (0x43E < aExp) {
> +            float_raise(float_flag_invalid STATUS_VAR);
> +            return LIT64(0xFFFFFFFFFFFFFFFF);
> +        }
> +        aSigExtra = 0;
> +        aSig <<= -shiftCount;
> +    } else {
> +        shift64ExtraRightJamming(aSig, 0, shiftCount, &aSig, &aSigExtra);
> +    }
> +    return roundAndPackUint64(aSign, aSig, aSigExtra STATUS_VAR);
>  }

Other than that, the code *looks* OK, but it's really easy for
"not quite right" code to slip through here (especially on corner
cases like NaNs, denormals and odd rounding modes). How much
testing have you given this? I really recommend testing by firing a
huge pile of random (and semi random) test vectors at whatever
guest instruction you're implementing and comparing against
results on reference hardware.

thanks
-- PMM
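
The NaN remark above can be checked at the bit level. What follows is a minimal sketch, assuming raw IEEE 754 binary64 bit patterns. The helper names (f64_frac, f64_exp, f64_sign, f64_is_nan, v4_takes_early_exit, special_case_result) are illustrative stand-ins and are not softfloat functions. special_case_result shows one possible ordering of the checks, with the NaN test first; it is an assumption for illustration, not the fix that was later posted.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for softfloat's extractFloat64Frac/Exp/Sign,
 * operating on the raw bit pattern of a double. */
uint64_t f64_frac(uint64_t bits) { return bits & UINT64_C(0x000FFFFFFFFFFFFF); }
int f64_exp(uint64_t bits) { return (int)((bits >> 52) & 0x7FF); }
int f64_sign(uint64_t bits) { return (int)(bits >> 63); }

/* A NaN has an all-ones exponent and a nonzero fraction. */
int f64_is_nan(uint64_t bits)
{
    return f64_exp(bits) == 0x7FF && f64_frac(bits) != 0;
}

/* The V4 early exit: sign bit set and biased exponent above 1022.
 * A NaN with the sign bit set has exponent 0x7FF, so it matches too. */
int v4_takes_early_exit(uint64_t bits)
{
    return f64_sign(bits) && f64_exp(bits) > 1022;
}

/* Hypothetical ordering: test for NaN before the negative-value exit.
 * Sets *handled to 1 when the input is one of these special cases. */
uint64_t special_case_result(uint64_t bits, int *handled)
{
    assert(handled);
    *handled = 1;
    if (f64_is_nan(bits)) {
        return UINT64_MAX;      /* largest positive integer for any NaN */
    }
    if (v4_takes_early_exit(bits)) {
        return 0;               /* negative with magnitude >= 1 */
    }
    *handled = 0;
    return 0;
}
```

With the bit pattern 0xFFF8000000000000 (a quiet NaN with the sign bit set), v4_takes_early_exit is true, which is why the V4 code returns 0 for it, while special_case_result returns UINT64_MAX. Both paths would still need to raise the invalid flag; flag handling is left out of the sketch.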
Tom Musta - Dec. 20, 2013, 8:05 p.m.
On 12/19/2013 4:11 PM, Peter Maydell wrote:
> On 18 December 2013 20:19, Tom Musta <tommusta@gmail.com> wrote:
>> The comment preceding the float64_to_uint64 routine suggests that
>> the implementation is broken.  And this is, indeed, the case.
>>
>> This patch properly implements the conversion of a 64-bit floating
>> point number to an unsigned, 64 bit integer.
>>
>> This contribution can be licensed under either the softfloat-2a or -2b
>> license.
>>
>> V2: Added softfloat license statement.
>>
>> V3: Modified to meet QEMU coding conventions.
>>
>> V4: Fixed incorrect handling of small negatives, which, if rounded
>> up to zero should not set the inexact flag.
>>
>> Signed-off-by: Tom Musta <tommusta@gmail.com>
>> ---
>>  fpu/softfloat.c |   98 +++++++++++++++++++++++++++++++++++++++++++++++++-----
>>  1 files changed, 89 insertions(+), 9 deletions(-)
>>
>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>> index dbda61b..ec23908 100644
>> --- a/fpu/softfloat.c
>> +++ b/fpu/softfloat.c
>> @@ -161,7 +161,6 @@ static int32 roundAndPackInt32( flag zSign, uint64_t absZ STATUS_PARAM)
>>  | exception is raised and the largest positive or negative integer is
>>  | returned.
>>  *----------------------------------------------------------------------------*/
>> -
>>  static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATUS_PARAM)
>>  {
>>      int8 roundingMode;
>> @@ -204,6 +203,56 @@ static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATU
>>  }
>>
>>  /*----------------------------------------------------------------------------
>> +| Takes the 128-bit fixed-point value formed by concatenating `absZ0' and
>> +| `absZ1', with binary point between bits 63 and 64 (between the input words),
>> +| and returns the properly rounded 64-bit unsigned integer corresponding to the
>> +| input.  Ordinarily, the fixed-point input is simply rounded to an integer,
>> +| with the inexact exception raised if the input cannot be represented exactly
>> +| as an integer.  However, if the fixed-point input is too large, the invalid
>> +| exception is raised and the largest unsigned integer is returned.
>> +*----------------------------------------------------------------------------*/
> 
> You should probably say in this comment what the behaviour is for
> negative inputs.
> 
>> +uint64_t float64_to_uint64(float64 a STATUS_PARAM)
>> +{
>> +    flag aSign;
>> +    int_fast16_t aExp, shiftCount;
>> +    uint64_t aSig, aSigExtra;
>> +    a = float64_squash_input_denormal(a STATUS_VAR);
>>
>> -    return v - INT64_MIN;
>> +    aSig = extractFloat64Frac(a);
>> +    aExp = extractFloat64Exp(a);
>> +    aSign = extractFloat64Sign(a);
>> +    if (aSign && (aExp > 1022)) {
>> +        float_raise(float_flag_invalid STATUS_VAR);
>> +        return 0;
> 
> This incorrectly returns 0 rather than largest-positive-integer
> for NaNs with the sign bit set.
> 
>> +    }
>> +    if (aExp) {
>> +        aSig |= LIT64(0x0010000000000000);
>> +    }
>> +    shiftCount = 0x433 - aExp;
>> +    if (shiftCount <= 0) {
>> +        if (0x43E < aExp) {
>> +            float_raise(float_flag_invalid STATUS_VAR);
>> +            return LIT64(0xFFFFFFFFFFFFFFFF);
>> +        }
>> +        aSigExtra = 0;
>> +        aSig <<= -shiftCount;
>> +    } else {
>> +        shift64ExtraRightJamming(aSig, 0, shiftCount, &aSig, &aSigExtra);
>> +    }
>> +    return roundAndPackUint64(aSign, aSig, aSigExtra STATUS_VAR);
>>  }
> 
> Other than that, the code *looks* OK, but it's really easy for
> "not quite right" code to slip through here (especially on corner
> cases like NaNs, denormals and odd rounding modes). How much
> testing have you given this? I really recommend testing by firing a
> huge pile of random (and semi random) test vectors at whatever
> guest instruction you're implementing and comparing against
> results on reference hardware.
> 
> thanks
> -- PMM
> 

Peter:

I agree with the comments and also with the bug.  I will fix.

I do test as you suggested: random patterns with some biasing to try to get into corner
cases.  The bug you found was masked by the PowerPC code that wraps the call to
float64_to_uint64.  I have constructed a variant of my test harness that invokes
float64_to_uint64 directly, and it has uncovered the bug.
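
For readers who want to try this style of testing, here is a minimal sketch of a biased random-vector harness. It is not the harness described above. It assumes the host double is IEEE 754 binary64 with the same byte order as uint64_t. All names (next_random, next_vector, in_defined_range, model_trunc, host_trunc, count_mismatches) are hypothetical. To stay self-contained it compares a small truncating model against the host's own cast, and only for inputs where the C cast is well defined; it does not call softfloat and does not check exception flags or other rounding modes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* xorshift64 generator; the state must be nonzero. */
uint64_t next_random(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Biased exponents near the boundaries the conversion code branches on. */
static const uint16_t corner_exps[] = {
    0x000, 0x001, 0x3FE, 0x3FF, 0x432, 0x433, 0x434, 0x43E, 0x43F, 0x7FE, 0x7FF
};

/* Roughly every other vector has its exponent forced to a corner value. */
uint64_t next_vector(uint64_t *state)
{
    uint64_t bits = next_random(state);
    uint64_t pick = next_random(state);
    if (pick & 1) {
        size_t n = sizeof(corner_exps) / sizeof(corner_exps[0]);
        uint64_t e = corner_exps[(pick >> 1) % n];
        bits = (bits & ~(UINT64_C(0x7FF) << 52)) | (e << 52);
    }
    return bits;
}

/* The C cast is defined when the truncated value fits in uint64_t:
 * |a| < 1 (either sign), or positive with a < 2^64. */
int in_defined_range(uint64_t bits)
{
    int exp = (int)((bits >> 52) & 0x7FF);
    int sign = (int)(bits >> 63);
    return exp < 0x3FF || (!sign && exp <= 0x43E);
}

/* Truncating model, valid only for inputs accepted by in_defined_range. */
uint64_t model_trunc(uint64_t bits)
{
    int exp = (int)((bits >> 52) & 0x7FF);
    uint64_t sig = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    int shift = 0x433 - exp;            /* 0x433 = exponent bias + 52 */
    if (exp < 0x3FF) {
        return 0;                       /* magnitude below 1 truncates to 0 */
    }
    sig |= UINT64_C(0x0010000000000000); /* restore the implicit leading 1 */
    return shift > 0 ? sig >> shift : sig << -shift;
}

/* Reference: reinterpret the bits as a double and let the host convert. */
uint64_t host_trunc(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return (uint64_t)d;
}

/* Runs `count` vectors and returns how many in-range inputs disagree. */
size_t count_mismatches(uint64_t seed, size_t count, size_t *tested)
{
    uint64_t state = seed ? seed : 1;
    size_t mismatches = 0;
    assert(tested);
    *tested = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t v = next_vector(&state);
        if (!in_defined_range(v)) {
            continue;
        }
        (*tested)++;
        if (model_trunc(v) != host_trunc(v)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A caller would invoke count_mismatches with a seed and a vector count, then report the mismatch count together with the number of inputs actually tested. Covering NaNs, overflow and negative inputs, as the review asks, needs a reference that defines those results (for example the guest hardware), since the C cast does not.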

Patch

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index dbda61b..ec23908 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -161,7 +161,6 @@  static int32 roundAndPackInt32( flag zSign, uint64_t absZ STATUS_PARAM)
 | exception is raised and the largest positive or negative integer is
 | returned.
 *----------------------------------------------------------------------------*/
-
 static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATUS_PARAM)
 {
     int8 roundingMode;
@@ -204,6 +203,56 @@  static int64 roundAndPackInt64( flag zSign, uint64_t absZ0, uint64_t absZ1 STATU
 }
 
 /*----------------------------------------------------------------------------
+| Takes the 128-bit fixed-point value formed by concatenating `absZ0' and
+| `absZ1', with binary point between bits 63 and 64 (between the input words),
+| and returns the properly rounded 64-bit unsigned integer corresponding to the
+| input.  Ordinarily, the fixed-point input is simply rounded to an integer,
+| with the inexact exception raised if the input cannot be represented exactly
+| as an integer.  However, if the fixed-point input is too large, the invalid
+| exception is raised and the largest unsigned integer is returned.
+*----------------------------------------------------------------------------*/
+
+static int64 roundAndPackUint64(flag zSign, uint64_t absZ0,
+                                uint64_t absZ1 STATUS_PARAM)
+{
+    int8 roundingMode;
+    flag roundNearestEven, increment;
+
+    roundingMode = STATUS(float_rounding_mode);
+    roundNearestEven = (roundingMode == float_round_nearest_even);
+    increment = ((int64_t)absZ1 < 0);
+    if (!roundNearestEven) {
+        if (roundingMode == float_round_to_zero) {
+            increment = 0;
+        } else if (absZ1) {
+            if (zSign) {
+                increment = (roundingMode == float_round_down) && absZ1;
+            } else {
+                increment = (roundingMode == float_round_up) && absZ1;
+            }
+        }
+    }
+    if (increment) {
+        ++absZ0;
+        if (absZ0 == 0) {
+            float_raise(float_flag_invalid STATUS_VAR);
+            return LIT64(0xFFFFFFFFFFFFFFFF);
+        }
+        absZ0 &= ~(((uint64_t)(absZ1<<1) == 0) & roundNearestEven);
+    }
+
+    if (zSign && absZ0) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        return 0;
+    }
+
+    if (absZ1) {
+        STATUS(float_exception_flags) |= float_flag_inexact;
+    }
+    return absZ0;
+}
+
+/*----------------------------------------------------------------------------
 | Returns the fraction bits of the single-precision floating-point value `a'.
 *----------------------------------------------------------------------------*/
 
@@ -6536,18 +6585,49 @@  uint_fast16_t float64_to_uint16_round_to_zero(float64 a STATUS_PARAM)
     return res;
 }
 
-/* FIXME: This looks broken.  */
-uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
-{
-    int64_t v;
+/*----------------------------------------------------------------------------
+| Returns the result of converting the double-precision floating-point value
+| `a' to the 64-bit unsigned integer format.  The conversion is
+| performed according to the IEC/IEEE Standard for Binary Floating-Point
+| Arithmetic---which means in particular that the conversion is rounded
+| according to the current rounding mode.  If `a' is a NaN, the largest
+| positive integer is returned.  If the conversion overflows, the
+| largest unsigned integer is returned.  If `a' is negative, zero is
+| returned.
+*----------------------------------------------------------------------------*/
 
-    v = float64_val(int64_to_float64(INT64_MIN STATUS_VAR));
-    v += float64_val(a);
-    v = float64_to_int64(make_float64(v) STATUS_VAR);
+uint64_t float64_to_uint64(float64 a STATUS_PARAM)
+{
+    flag aSign;
+    int_fast16_t aExp, shiftCount;
+    uint64_t aSig, aSigExtra;
+    a = float64_squash_input_denormal(a STATUS_VAR);
 
-    return v - INT64_MIN;
+    aSig = extractFloat64Frac(a);
+    aExp = extractFloat64Exp(a);
+    aSign = extractFloat64Sign(a);
+    if (aSign && (aExp > 1022)) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        return 0;
+    }
+    if (aExp) {
+        aSig |= LIT64(0x0010000000000000);
+    }
+    shiftCount = 0x433 - aExp;
+    if (shiftCount <= 0) {
+        if (0x43E < aExp) {
+            float_raise(float_flag_invalid STATUS_VAR);
+            return LIT64(0xFFFFFFFFFFFFFFFF);
+        }
+        aSigExtra = 0;
+        aSig <<= -shiftCount;
+    } else {
+        shift64ExtraRightJamming(aSig, 0, shiftCount, &aSig, &aSigExtra);
+    }
+    return roundAndPackUint64(aSign, aSig, aSigExtra STATUS_VAR);
 }
 
+
 uint64_t float64_to_uint64_round_to_zero (float64 a STATUS_PARAM)
 {
     int64_t v;