[1/9] softfloat: Introduce float32_to_uint64_round_to_zero

Message ID 1395866754-18673-2-git-send-email-tommusta@gmail.com
State New

Commit Message

Tom Musta March 26, 2014, 8:45 p.m. UTC
This change adds the float32_to_uint64_round_to_zero function to the softfloat
library.  This function fills out the complement of float32 to INT round-to-zero
conversion routines, where INT is {int32_t, uint32_t, int64_t, uint64_t}.

This contribution can be licensed under either the softfloat-2a or -2b
license.

Signed-off-by: Tom Musta <tommusta@gmail.com>
Tested-by: Tom Musta <tommusta@gmail.com>
---
 fpu/softfloat.c         |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 include/fpu/softfloat.h |    1 +
 2 files changed, 55 insertions(+), 0 deletions(-)
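
For readers unfamiliar with the semantics described above, here is a minimal reference model in plain C. It is only a sketch and not the softfloat code: it assumes the host `float` is IEEE binary32, and the names `model_result`, `MODEL_FLAG_*` and `model_f32_to_u64_rtz` are hypothetical, introduced purely for illustration. It follows the behaviour in the patch's doc comment and code: NaN and overflow saturate to the largest unsigned value, negative inputs return zero, and discarding a nonzero fraction sets an inexact flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits and result type, for illustration only. */
enum { MODEL_FLAG_INVALID = 1u, MODEL_FLAG_INEXACT = 2u };

typedef struct {
    uint64_t value;   /* converted integer */
    unsigned flags;   /* MODEL_FLAG_* bits raised by the conversion */
} model_result;

/* Reference model of a float32 -> uint64 conversion that always truncates. */
static model_result model_f32_to_u64_rtz(float a)
{
    model_result r = { 0, 0 };

    /* Assumption: the host float is a 32-bit IEEE binary32 value. */
    assert(sizeof(float) == 4);

    if (a != a || a >= 0x1p64f) {
        /* NaN, +infinity, or too large for 64 bits: saturate, raise invalid. */
        r.value = UINT64_MAX;
        r.flags = MODEL_FLAG_INVALID;
    } else if (a <= -1.0f) {
        /* Negative with magnitude >= 1: result is 0, raise invalid. */
        r.flags = MODEL_FLAG_INVALID;
    } else {
        /* a is in (-1, 2^64): the C cast discards the fraction. */
        r.value = (uint64_t)a;
        if ((float)r.value != a) {
            r.flags = MODEL_FLAG_INEXACT;
        }
    }
    return r;
}
```

For example, under these assumptions `model_f32_to_u64_rtz(1.75f)` yields the value 1 with the inexact flag set, while `model_f32_to_u64_rtz(-2.0f)` yields 0 with the invalid flag set.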

Comments

Alexander Graf March 31, 2014, 5:26 p.m. UTC | #1
On 03/26/2014 09:45 PM, Tom Musta wrote:
> This change adds the float32_to_uint64_round_to_zero function to the softfloat
> library.  This function fills out the complement of float32 to INT round-to-zero
> conversion routines, where INT is {int32_t, uint32_t, int64_t, uint64_t}.
>
> This contribution can be licensed under either the softfloat-2a or -2b
> license.
>
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> Tested-by: Tom Musta <tommusta@gmail.com>

Peter, could you please ack? This is a requirement for a bug fix series 
that I'd like to see in 2.0.


Alex

Peter Maydell March 31, 2014, 5:48 p.m. UTC | #2
On 26 March 2014 20:45, Tom Musta <tommusta@gmail.com> wrote:
> This change adds the float32_to_uint64_round_to_zero function to the softfloat
> library.  This function fills out the complement of float32 to INT round-to-zero
> conversion routines, where INT is {int32_t, uint32_t, int64_t, uint64_t}.
>
> This contribution can be licensed under either the softfloat-2a or -2b
> license.
>
> Signed-off-by: Tom Musta <tommusta@gmail.com>
> Tested-by: Tom Musta <tommusta@gmail.com>
> ---
>  fpu/softfloat.c         |   54 +++++++++++++++++++++++++++++++++++++++++++++++
>  include/fpu/softfloat.h |    1 +
>  2 files changed, 55 insertions(+), 0 deletions(-)
>
> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
> index 5f02c16..d6df78a 100644
> --- a/fpu/softfloat.c
> +++ b/fpu/softfloat.c
> @@ -1628,6 +1628,60 @@ uint64 float32_to_uint64(float32 a STATUS_PARAM)
>
>  /*----------------------------------------------------------------------------
>  | Returns the result of converting the single-precision floating-point value
> +| `a' to the 64-bit unsigned integer format.  The conversion is
> +| performed according to the IEC/IEEE Standard for Binary Floating-Point
> +| Arithmetic, except that the conversion is always rounded toward zero.  If
> +| `a' is a NaN, the largest unsigned integer is returned.  Otherwise, if the
> +| conversion overflows, the largest unsigned integer is returned.  If the
> +| 'a' is negative, the result is rounded and zero is returned; values that do
> +| not round to zero will raise the inexact flag.
> +*----------------------------------------------------------------------------*/
> +
> +uint64 float32_to_uint64_round_to_zero(float32 a STATUS_PARAM)
> +{
> +    flag aSign;
> +    int_fast16_t aExp, shiftCount;
> +    uint32_t aSig;
> +    uint64_t aSig64;
> +    uint64_t z;

So, float64_to_uint64_round_to_zero() works by temporarily
fiddling with the rounding mode and then calling
float64_to_uint64(). Is there a reason for doing this
function like this rather than in the same way?

thanks
-- PMM
Tom Musta March 31, 2014, 6:07 p.m. UTC | #3
On 3/31/2014 12:48 PM, Peter Maydell wrote:
> So, float64_to_uint64_round_to_zero() works by temporarily
> fiddling with the rounding mode and then calling
> float64_to_uint64(). Is there a reason for doing this
> function like this rather than in the same way?
> 
> thanks
> -- PMM
> 

True.  But not all of the *_round_to_zero() routines do this, e.g.
float32_to_int64_round_to_zero().  So no matter what I do, it is
inconsistent with something.

Do you prefer the fiddle-and-reuse approach?  (I think I do, actually).
If so, I will respin the patch.
Peter Maydell March 31, 2014, 6:12 p.m. UTC | #4
On 31 March 2014 19:07, Tom Musta <tommusta@gmail.com> wrote:
> On 3/31/2014 12:48 PM, Peter Maydell wrote:
>> So, float64_to_uint64_round_to_zero() works by temporarily
>> fiddling with the rounding mode and then calling
>> float64_to_uint64(). Is there a reason for doing this
>> function like this rather than in the same way?

> True.  But not all of the *_round_to_zero() routines do this, e.g.
> float32_to_int64_round_to_zero().  So no matter what I do, it is
> inconsistent with something.
>
> Do you prefer the fiddle-and-reuse approach?  (I think I do, actually).
> If so, I will respin the patch.

I think that would be easier to review for correctness :-)

thanks
-- PMM
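
The "fiddle-and-reuse" approach discussed in this thread can be sketched as follows. This is a self-contained illustration, not QEMU code: `sk_status`, `sk_round_mode`, `sk_f32_to_u64` and the flag constants are hypothetical stand-ins for the softfloat status structure and the general-purpose conversion, the sketch assumes the host `float` is IEEE binary32, and the stand-in conversion only models two rounding modes. The point is the wrapper at the bottom: save the rounding mode, force round-to-zero, call the existing conversion, then restore the mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the softfloat status and flags. */
typedef enum { SK_ROUND_NEAREST_EVEN, SK_ROUND_TO_ZERO } sk_round_mode;
enum { SK_INVALID = 1u, SK_INEXACT = 2u };

typedef struct {
    sk_round_mode rounding_mode;
    unsigned flags;
} sk_status;

/* Stand-in for the general conversion: honours s->rounding_mode. */
static uint64_t sk_f32_to_u64(float a, sk_status *s)
{
    assert(s != NULL);
    int nearest = (s->rounding_mode == SK_ROUND_NEAREST_EVEN);

    if (a != a || a >= 0x1p64f) {          /* NaN or overflow */
        s->flags |= SK_INVALID;
        return UINT64_MAX;
    }
    /* Negative inputs whose rounded result would be below zero. */
    if (a <= -1.0f || (nearest && a < -0.5f)) {
        s->flags |= SK_INVALID;
        return 0;
    }
    uint64_t z = (uint64_t)a;              /* the cast truncates toward zero */
    float frac = a - (float)z;             /* the discarded part, exact */
    if (frac != 0.0f) {
        s->flags |= SK_INEXACT;
        /* Round half to even when in nearest mode. */
        if (nearest && (frac > 0.5f || (frac == 0.5f && (z & 1)))) {
            z++;
        }
    }
    return z;
}

/* The wrapper: temporarily force round-to-zero and reuse the conversion. */
static uint64_t sk_f32_to_u64_round_to_zero(float a, sk_status *s)
{
    sk_round_mode saved = s->rounding_mode;
    s->rounding_mode = SK_ROUND_TO_ZERO;
    uint64_t z = sk_f32_to_u64(a, s);
    s->rounding_mode = saved;              /* restore the caller's mode */
    return z;
}
```

With the status set to nearest-even, `sk_f32_to_u64(2.75f, &s)` gives 3 in this sketch while `sk_f32_to_u64_round_to_zero(2.75f, &s)` gives 2. The wrapper has no bit-level logic of its own to check, which fits the remark above that this form is easier to review for correctness.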
Patch

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 5f02c16..d6df78a 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1628,6 +1628,60 @@  uint64 float32_to_uint64(float32 a STATUS_PARAM)
 
 /*----------------------------------------------------------------------------
 | Returns the result of converting the single-precision floating-point value
+| `a' to the 64-bit unsigned integer format.  The conversion is
+| performed according to the IEC/IEEE Standard for Binary Floating-Point
+| Arithmetic, except that the conversion is always rounded toward zero.  If
+| `a' is a NaN, the largest unsigned integer is returned.  Otherwise, if the
+| conversion overflows, the largest unsigned integer is returned.  If `a' is
+| negative, zero is returned: nonzero values that truncate to zero raise the
+| inexact flag, and values of magnitude 1.0 or greater raise the invalid flag.
+*----------------------------------------------------------------------------*/
+
+uint64 float32_to_uint64_round_to_zero(float32 a STATUS_PARAM)
+{
+    flag aSign;
+    int_fast16_t aExp, shiftCount;
+    uint32_t aSig;
+    uint64_t aSig64;
+    uint64_t z;
+    a = float32_squash_input_denormal(a STATUS_VAR);
+
+    aSig = extractFloat32Frac(a);
+    aExp = extractFloat32Exp(a);
+    aSign = extractFloat32Sign(a);
+    if ((aSign) && (aExp > 126)) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        if (float32_is_any_nan(a)) {
+            return LIT64(0xFFFFFFFFFFFFFFFF);
+        } else {
+            return 0;
+        }
+    }
+    shiftCount = 0xBE - aExp;
+    if (aExp) {
+        aSig |= 0x00800000;
+    }
+    if (shiftCount < 0) {
+        float_raise(float_flag_invalid STATUS_VAR);
+        return LIT64(0xFFFFFFFFFFFFFFFF);
+    } else if (aExp <= 0x7E) {
+        if (aExp | aSig) {
+            STATUS(float_exception_flags) |= float_flag_inexact;
+        }
+        return 0;
+    }
+
+    aSig64 = aSig;
+    aSig64 <<= 40;
+    z = aSig64 >> shiftCount;
+    if (shiftCount && ((uint64_t)(aSig64 << (-shiftCount & 63)))) {
+        STATUS(float_exception_flags) |= float_flag_inexact;
+    }
+    return z;
+}
+
+/*----------------------------------------------------------------------------
+| Returns the result of converting the single-precision floating-point value
 | `a' to the 64-bit two's complement integer format.  The conversion is
 | performed according to the IEC/IEEE Standard for Binary Floating-Point
 | Arithmetic, except that the conversion is always rounded toward zero.  If
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index db878c1..4b3090c 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -342,6 +342,7 @@  uint32 float32_to_uint32( float32 STATUS_PARAM );
 uint32 float32_to_uint32_round_to_zero( float32 STATUS_PARAM );
 int64 float32_to_int64( float32 STATUS_PARAM );
 uint64 float32_to_uint64(float32 STATUS_PARAM);
+uint64 float32_to_uint64_round_to_zero(float32 STATUS_PARAM);
 int64 float32_to_int64_round_to_zero( float32 STATUS_PARAM );
 float64 float32_to_float64( float32 STATUS_PARAM );
 floatx80 float32_to_floatx80( float32 STATUS_PARAM );
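
To make the shift arithmetic in the patch easier to follow, here is a worked sketch of the in-range path only. It takes the raw binary32 bit pattern directly; the names `bits_result` and `bits_trunc_to_u64` are hypothetical, and the function deliberately handles only positive inputs with a biased exponent from 0x7F to 0xBE (values from 1.0 up to just under 2^64). The constant 0xBE is the exponent bias 127 plus 63: once the 24-bit significand is moved left by 40 so that its implicit bit sits at bit 63, shifting right by `0xBE - exp` leaves the integer part.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t value;   /* truncated integer result */
    int inexact;      /* nonzero if fraction bits were discarded */
} bits_result;

/* In-range path only: positive input, biased exponent in 0x7F..0xBE. */
static bits_result bits_trunc_to_u64(uint32_t bits)
{
    uint32_t frac = bits & 0x007FFFFF;        /* 23 stored fraction bits */
    int exp = (int)((bits >> 23) & 0xFF);     /* biased exponent */
    int sign = (int)(bits >> 31);

    /* Precondition of this sketch; the patch handles the other cases. */
    assert(!sign && exp >= 0x7F && exp <= 0xBE);

    int shift = 0xBE - exp;                   /* 0..63 */
    /* Add the implicit leading 1, then place it at bit 63. */
    uint64_t sig64 = (uint64_t)(frac | 0x00800000) << 40;

    bits_result r;
    r.value = sig64 >> shift;
    /* Bits shifted out on the right are the discarded fraction. */
    r.inexact = shift != 0 && (sig64 << (-shift & 63)) != 0;
    return r;
}
```

Worked through by hand: 3.5f is 0x40600000, so `exp` is 0x80 and `shift` is 62; `sig64` is 0xE000000000000000, which shifted right by 62 gives 3, and the two discarded bits are nonzero, so the result is inexact. For 1.0f (0x3F800000) the shift is 63 and the result is exactly 1.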