softfloat: float32_to_float16() should do inexact instead of underflow for rounding case

Message ID CABtrw+M-DmL-SbNo7NCQ4GguTR4Y4T=tdkoK-8XyM5prbpCyow@mail.gmail.com
State New
Commit Message

Alexey Starikovskiy April 25, 2012, 4:11 p.m. UTC
See A2.7.7
Floating-point exceptions

Inexact. The flag is set to 1 if the result of an operation is not
equivalent to the value that would be produced if the operation were
performed with unbounded precision and exponent range.

Underflow. The flag is set to 1 if the absolute value of the result of
an operation, produced before rounding, is less than the minimum
positive normalized number for the destination precision, and the
rounded result is inexact.

In this case we can see that the result is not less than the minimum
positive normalized number for the destination precision...
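
As a rough illustration of the two quoted definitions (not part of the patch), the sketch below classifies a float32 input by whether converting it to IEEE binary16 would be inexact and whether it would also count as underflow. The helper names (f32_exponent, half_exact, sets_inexact, sets_underflow) are hypothetical and do not exist in softfloat; the sketch assumes a finite input whose magnitude does not overflow binary16, and it relies on the minimum positive normalized binary16 value being 2^-14.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Unbiased exponent of a float32 (-127 for zero and float32 subnormals). */
static int f32_exponent(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)((bits >> 23) & 0xff) - 127;
}

/* True if x is exactly representable in binary16.
 * Assumes x is finite and not too large for binary16. */
static bool half_exact(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    int exp = f32_exponent(x);
    uint32_t frac = bits & 0x007fffffu;

    if (exp == -127) {
        /* Zero is exact; nonzero float32 subnormals lie far below 2^-24. */
        return frac == 0;
    }

    uint32_t sig = frac | 0x00800000u;   /* 24-bit significand */

    /* Low significand bits that binary16 cannot keep:
     * 13 in the normal range, more once the binary16 ulp is pinned at 2^-24. */
    int drop = (exp >= -14) ? 13 : -1 - exp;
    if (drop >= 24) {
        return false;                    /* the whole value is below one ulp */
    }
    return (sig & ((1u << drop) - 1u)) == 0;
}

/* Inexact: the converted result differs from the unbounded-precision value. */
static bool sets_inexact(float x)
{
    return !half_exact(x);
}

/* Underflow (as quoted): below 2^-14 before rounding AND inexact. */
static bool sets_underflow(float x)
{
    return f32_exponent(x) < -14 && !half_exact(x);
}
```

Under these assumptions, an input such as 0x1p-14f + 0x1p-25f is inexact but not below the minimum normal, so only sets_inexact() returns true for it, while 0x1p-25f satisfies both conditions.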

Signed-off-by: Alexey Starikovskiy <aystarik@gmail.com>

---
 fpu/softfloat.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Peter Maydell April 25, 2012, 4:15 p.m. UTC | #1
On 25 April 2012 17:11, Alexey Starikovskiy <aystarik@gmail.com> wrote:
> See A2.7.7
> Floating-point exceptions
>
> Inexact. The flag is set to 1 if the result of an operation is not
> equivalent to the value that would be produced if the operation were
> performed with unbounded precision and exponent range.

You can't justify a patch to the generic IEEE floating point
code with a quotation from the ARM ARM. (And don't quote documents
without saying what document you're quoting!)

Can you provide a test case for an instruction or operation which
generates the wrong results?

-- PMm
Patch

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 9e1b5f9..8078f75 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3067,7 +3067,7 @@  float16 float32_to_float16(float32 a, flag ieee STATUS_PARAM)
         mask = 0x00001fff;
     }
     if (aSig & mask) {
-        float_raise( float_flag_underflow STATUS_VAR );
+        float_raise( float_flag_inexact STATUS_VAR );
         roundingMode = STATUS(float_rounding_mode);
         switch (roundingMode) {
         case float_round_nearest_even: