diff mbox

FMULX should flushes operators to zero when FZ is set.

Message ID 1422456666-12270-1-git-send-email-libhu.so@gmail.com
State New
Headers show

Commit Message

Xiangyu Hu Jan. 28, 2015, 2:51 p.m. UTC
The difference between FMULX and FMUL is that FMULX will return 2.0f when one operator is 
FPInfinity and the other one is FPZero, whilst FMUL will return a Default NaN. Without 
this patch, the emulation would result in inconsistency.

Signed-off-by: Xiangyu Hu <libhu.so@gmail.com>
---
 target-arm/helper-a64.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Xiangyu Hu Jan. 28, 2015, 3:42 p.m. UTC | #1
Thanks Peter. 

I have sent another patch with updated commit message.

- xiangyu

> On 28 Jan, 2015, at 10:54 pm, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On 28 January 2015 at 14:51, Xiangyu Hu <libhu.so@gmail.com <mailto:libhu.so@gmail.com>> wrote:
>> The difference between FMULX and FMUL is that FMULX will return 2.0f when one operator is
>> FPInfinity and the other one is FPZero, whilst FMUL will return a Default NaN. Without
>> this patch, the emulation would result in inconsistency.
>> 
>> Signed-off-by: Xiangyu Hu <libhu.so@gmail.com>
>> ---
>> target-arm/helper-a64.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>> 
>> diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
>> index 81066ca..ebd9247 100644
>> --- a/target-arm/helper-a64.c
>> +++ b/target-arm/helper-a64.c
>> @@ -135,6 +135,9 @@ float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
>> {
>>     float_status *fpst = fpstp;
>> 
>> +    a = float32_squash_input_denormal(a, fpst);
>> +    b = float32_squash_input_denormal(b, fpst);
>> +
>>     if ((float32_is_zero(a) && float32_is_infinity(b)) ||
>>         (float32_is_infinity(a) && float32_is_zero(b))) {
>>         /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
>> @@ -148,6 +151,9 @@ float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
>> {
>>     float_status *fpst = fpstp;
>> 
>> +    a = float64_squash_input_denormal(a, fpst);
>> +    b = float64_squash_input_denormal(b, fpst);
>> +
>>     if ((float64_is_zero(a) && float64_is_infinity(b)) ||
>>         (float64_is_infinity(a) && float64_is_zero(b))) {
>>         /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
> 
> I think this code change is correct but the commit message doesn't
> do a very good job of describing what the change is or why it is
> needed... (in particular the difference between FMULX and FMUL is
> completely irrelevant to the need for this fix, but it is pretty
> much the only thing the commit message talks about).
> 
> thanks
> -- PMM
Alex Bennée Jan. 28, 2015, 3:54 p.m. UTC | #2
Xiangyu Hu <libhu.so@gmail.com> writes:

> The difference between FMULX and FMUL is that FMULX will return 2.0f when one operator is 
> FPInfinity and the other one is FPZero, whilst FMUL will return a Default NaN. Without 
> this patch, the emulation would result in inconsistency.
>
> Signed-off-by: Xiangyu Hu <libhu.so@gmail.com>
> ---
>  target-arm/helper-a64.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
> index 81066ca..ebd9247 100644
> --- a/target-arm/helper-a64.c
> +++ b/target-arm/helper-a64.c
> @@ -135,6 +135,9 @@ float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
>  {
>      float_status *fpst = fpstp;
>  
> +    a = float32_squash_input_denormal(a, fpst);
> +    b = float32_squash_input_denormal(b, fpst);
> +
>      if ((float32_is_zero(a) && float32_is_infinity(b)) ||
>          (float32_is_infinity(a) && float32_is_zero(b))) {
>          /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
> @@ -148,6 +151,9 @@ float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
>  {
>      float_status *fpst = fpstp;
>  
> +    a = float64_squash_input_denormal(a, fpst);
> +    b = float64_squash_input_denormal(b, fpst);
> +
>      if ((float64_is_zero(a) && float64_is_infinity(b)) ||
>          (float64_is_infinity(a) && float64_is_zero(b))) {
>          /* 2.0 with the sign bit set to sign(A) XOR sign(B) */

Do we have test cases that trip up here? It would be nice to include
them in our testing as the random nature of RISU has obviously failed to
trip up on this instruction.
Peter Maydell Jan. 28, 2015, 3:57 p.m. UTC | #3
On 28 January 2015 at 15:54, Alex Bennée <alex.bennee@linaro.org> wrote:
> Do we have test cases that trip up here? It would be nice to include
> them in our testing as the random nature of RISU has obviously failed to
> trip up on this instruction.

Risu would probably catch this if we generated and ran test cases
which set the FPSCR bits to something other than the default.
(At least the 32-bit risugen lets you do this; I forget whether
we wired up that bit in the 64-bit support code.)

-- PMM
Xiangyu Hu Jan. 28, 2015, 4:11 p.m. UTC | #4
If RISU sets random FPSCR (FZ bit) values, I think such cases would be covered; it 
doesn’t look like such a corner case.

Maybe I can include some focus tests on this scenario if RISU failed to generate this pattern?

Thanks

- xiangyu

> On 28 Jan, 2015, at 11:57 pm, Peter Maydell <peter.maydell@linaro.org> wrote:
> 
> On 28 January 2015 at 15:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Do we have test cases that trip up here? It would be nice to include
>> them in our testing as the random nature of RISU has obviously failed to
>> trip up on this instruction.
> 
> Risu would probably catch this if we generated and ran test cases
> which set the FPSCR bits to something other than the default.
> (At least the 32-bit risugen lets you do this; I forget whether
> we wired up that bit in the 64-bit support code.)
> 
> -- PMM
Claudio Fontana Jan. 28, 2015, 4:12 p.m. UTC | #5
Hi Peter,

On 28.01.2015 16:57, Peter Maydell wrote:
> On 28 January 2015 at 15:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Do we have test cases that trip up here? It would be nice to include
>> them in our testing as the random nature of RISU has obviously failed to
>> trip up on this instruction.
> 
> Risu would probably catch this if we generated and ran test cases
> which set the FPSCR bits to something other than the default.
> (At least the 32-bit risugen lets you do this; I forget whether
> we wired up that bit in the 64-bit support code.)
> 
> -- PMM
> 

If nobody improved it from my implementation, the risugen script
will generate code which sets FPSR always unconditionally to 0,
while the FPCR is wired up with the user-provided "fpscr" option.
Not that there's any good reason behind it, probably both should be configurable..

Ciao,

Claudio
Peter Maydell Jan. 28, 2015, 4:25 p.m. UTC | #6
On 28 January 2015 at 16:12, Claudio Fontana <claudio.fontana@huawei.com> wrote:
> If nobody improved it from my implementation, the risugen script
> will generate code which sets FPSR always unconditionally to 0,
> while the FPCR is wired up with the user-provided "fpscr" option.
> Not that there's any good reason behind it, probably both should be
> configurable..

That's all you actually require, really -- there's not much benefit
from setting the FPSR flags on startup.

-- PMM
Peter Maydell Jan. 29, 2015, 7:22 p.m. UTC | #7
On 28 January 2015 at 15:57, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 28 January 2015 at 15:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>> Do we have test cases that trip up here? It would be nice to include
>> them in our testing as the random nature of RISU has obviously failed to
>> trip up on this instruction.
>
> Risu would probably catch this if we generated and ran test cases
> which set the FPSCR bits to something other than the default.

This bug is indeed caught by the following risugen:
./risugen --numinsns 10000 --pattern "FMULX A64_V" --fpscr 0x01000000
aarch64.risu FMULX_S3SAME_V_squash.out

(that fpscr value being "set DZ, nothing else".)

Alex, I don't suppose your automation of these tests makes
it easy to do a complete extra run (or better, just of the
Neon and FP insn tests) with different FPSCR flags?

-- PMM
Alex Bennée Feb. 3, 2015, 1:44 p.m. UTC | #8
Peter Maydell <peter.maydell@linaro.org> writes:

> On 28 January 2015 at 15:57, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 28 January 2015 at 15:54, Alex Bennée <alex.bennee@linaro.org> wrote:
>>> Do we have test cases that trip up here? It would be nice to include
>>> them in our testing as the random nature of RISU has obviously failed to
>>> trip up on this instruction.
>>
>> Risu would probably catch this if we generated and ran test cases
>> which set the FPSCR bits to something other than the default.
>
> This bug is indeed caught by the following risugen:
> ./risugen --numinsns 10000 --pattern "FMULX A64_V" --fpscr 0x01000000
> aarch64.risu FMULX_S3SAME_V_squash.out
>
> (that fpscr value being "set DZ, nothing else".)
>
> Alex, I don't suppose your automation of these tests makes
> it easy to do a complete extra run (or better, just of the
> Neon and FP insn tests) with different FPSCR flags?

I just need to add the additional test sequence and trace to the tarball
and the script should pick it up automatically.

>
> -- PMM
diff mbox

Patch

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 81066ca..ebd9247 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -135,6 +135,9 @@  float32 HELPER(vfp_mulxs)(float32 a, float32 b, void *fpstp)
 {
     float_status *fpst = fpstp;
 
+    a = float32_squash_input_denormal(a, fpst);
+    b = float32_squash_input_denormal(b, fpst);
+
     if ((float32_is_zero(a) && float32_is_infinity(b)) ||
         (float32_is_infinity(a) && float32_is_zero(b))) {
         /* 2.0 with the sign bit set to sign(A) XOR sign(B) */
@@ -148,6 +151,9 @@  float64 HELPER(vfp_mulxd)(float64 a, float64 b, void *fpstp)
 {
     float_status *fpst = fpstp;
 
+    a = float64_squash_input_denormal(a, fpst);
+    b = float64_squash_input_denormal(b, fpst);
+
     if ((float64_is_zero(a) && float64_is_infinity(b)) ||
         (float64_is_infinity(a) && float64_is_zero(b))) {
         /* 2.0 with the sign bit set to sign(A) XOR sign(B) */