Patchwork [1/7] target-arm: Make Neon helper routines use correct FP status

login
register
mail settings
Submitter Peter Maydell
Date March 11, 2011, 6:12 p.m.
Message ID <1299867146-22049-2-git-send-email-peter.maydell@linaro.org>
Download mbox | patch
Permalink /patch/86444/
State New
Headers show

Comments

Peter Maydell - March 11, 2011, 6:12 p.m.
Make the Neon helper routines use the correct FP status from
the CPUEnv rather than using a dummy static one. This means
they will correctly handle denormals and NaNs and will set
FPSCR exception bits properly.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helpers.h     |   22 +++++++++++-----------
 target-arm/neon_helper.c |   21 ++++++++++-----------
 target-arm/translate.c   |   42 ++++++++++++++++++++++--------------------
 3 files changed, 43 insertions(+), 42 deletions(-)
Nathan Froyd - March 11, 2011, 6:30 p.m.
On Fri, Mar 11, 2011 at 06:12:20PM +0000, Peter Maydell wrote:
> Make the Neon helper routines use the correct FP status from
> the CPUEnv rather than using a dummy static one. This means
> they will correctly handle denormals and NaNs and will set
> FPSCR exception bits properly.

Is there a reason that you don't simply use the global env rather than
passing in an extra parameter everywhere?  I wonder if it'd be
worthwhile just to merge these functions into op_helper.c, since we have
a proper FP status for NEON bits now.

-Nathan
Peter Maydell - March 11, 2011, 10:31 p.m.
On 11 March 2011 18:30, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Fri, Mar 11, 2011 at 06:12:20PM +0000, Peter Maydell wrote:
>> Make the Neon helper routines use the correct FP status from
>> the CPUEnv rather than using a dummy static one. This means
>> they will correctly handle denormals and NaNs and will set
>> FPSCR exception bits properly.
>
> Is there a reason that you don't simply use the global env rather than
> passing in an extra parameter everywhere?

Just following the pattern that generally seems to be used by
most helper functions, ie if you want the CPU env pass it in
as a parameter. As far as I know, you can't use the global
env unless you're in op_helper.c because that's the only
source file compiled with the right flags.

> I wonder if it'd be
> worthwhile just to merge these functions into op_helper.c,
> since we have a proper FP status for NEON bits now.

Why move these and not (for instance) the VFP helpers
in helper.c which use the CPU env for more or less the
same reasons?

-- PMM
Nathan Froyd - March 14, 2011, 5:35 a.m.
On Fri, Mar 11, 2011 at 10:31:31PM +0000, Peter Maydell wrote:
> On 11 March 2011 18:30, Nathan Froyd <froydnj@codesourcery.com> wrote:
> > Is there a reason that you don't simply use the global env rather than
> > passing in an extra parameter everywhere?
> 
> Just following the pattern that generally seems to be used by
> most helper functions, ie if you want the CPU env pass it in
> as a parameter. As far as I know, you can't use the global
> env unless you're in op_helper.c because that's the only
> source file compiled with the right flags.

Oh, right.  I am ambivalent as to whether passing env to such functions
is the right thing to do or not.

> > I wonder if it'd be worthwhile just to merge these functions into
> > op_helper.c, since we have a proper FP status for NEON bits now.
> 
> Why move these and not (for instance) the VFP helpers
> in helper.c which use the CPU env for more or less the
> same reasons?

No reason, other than that I wasn't thinking about the VFP helpers. :)

-Nathan
Peter Maydell - March 28, 2011, 2:15 p.m.
On 14 March 2011 05:35, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Fri, Mar 11, 2011 at 10:31:31PM +0000, Peter Maydell wrote:
>> On 11 March 2011 18:30, Nathan Froyd <froydnj@codesourcery.com> wrote:
>> > Is there a reason that you don't simply use the global env rather than
>> > passing in an extra parameter everywhere?
>>
>> Just following the pattern that generally seems to be used by
>> most helper functions, ie if you want the CPU env pass it in
>> as a parameter. As far as I know, you can't use the global
>> env unless you're in op_helper.c because that's the only
>> source file compiled with the right flags.
>
> Oh, right.  I am ambivalent as to whether passing env to such functions
> is the right thing to do or not.

So did this amount to a request for a change to this patchset,
or are you happy to let it pass?

(I'm planning to stick this patchset into a pull-request with some
of the other ARM patches that have had a few weeks for review comment
later this week, so if you'd like a change now would be a good time
to say so...)

thanks
-- PMM
Nathan Froyd - March 30, 2011, 6:38 p.m.
On Mon, Mar 28, 2011 at 03:15:08PM +0100, Peter Maydell wrote:
> On 14 March 2011 05:35, Nathan Froyd <froydnj@codesourcery.com> wrote:
> > Oh, right.  I am ambivalent as to whether passing env to such functions
> > is the right thing to do or not.
> 
> So did this amount to a request for a change to this patchset,
> or are you happy to let it pass?

I am happy to let it pass.

-Nathan

Patch

diff --git a/target-arm/helpers.h b/target-arm/helpers.h
index bd6977c..e2260b6 100644
--- a/target-arm/helpers.h
+++ b/target-arm/helpers.h
@@ -350,17 +350,17 @@  DEF_HELPER_2(neon_qneg_s8, i32, env, i32)
 DEF_HELPER_2(neon_qneg_s16, i32, env, i32)
 DEF_HELPER_2(neon_qneg_s32, i32, env, i32)
 
-DEF_HELPER_2(neon_min_f32, i32, i32, i32)
-DEF_HELPER_2(neon_max_f32, i32, i32, i32)
-DEF_HELPER_2(neon_abd_f32, i32, i32, i32)
-DEF_HELPER_2(neon_add_f32, i32, i32, i32)
-DEF_HELPER_2(neon_sub_f32, i32, i32, i32)
-DEF_HELPER_2(neon_mul_f32, i32, i32, i32)
-DEF_HELPER_2(neon_ceq_f32, i32, i32, i32)
-DEF_HELPER_2(neon_cge_f32, i32, i32, i32)
-DEF_HELPER_2(neon_cgt_f32, i32, i32, i32)
-DEF_HELPER_2(neon_acge_f32, i32, i32, i32)
-DEF_HELPER_2(neon_acgt_f32, i32, i32, i32)
+DEF_HELPER_3(neon_min_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_max_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_abd_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_add_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_sub_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_mul_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_ceq_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_cge_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_cgt_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_acge_f32, i32, env, i32, i32)
+DEF_HELPER_3(neon_acgt_f32, i32, env, i32, i32)
 
 /* iwmmxt_helper.c */
 DEF_HELPER_2(iwmmxt_maddsq, i64, i64, i64)
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index 002a9c1..97bc1e6 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -18,8 +18,7 @@ 
 
 #define SET_QC() env->vfp.xregs[ARM_VFP_FPSCR] = CPSR_Q
 
-static float_status neon_float_status;
-#define NFS &neon_float_status
+#define NFS (&env->vfp.standard_fp_status)
 
 /* Helper routines to perform bitwise copies between float and int.  */
 static inline float32 vfp_itos(uint32_t i)
@@ -1794,21 +1793,21 @@  uint32_t HELPER(neon_qneg_s32)(CPUState *env, uint32_t x)
 }
 
 /* NEON Float helpers.  */
-uint32_t HELPER(neon_min_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_min_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     float32 f0 = vfp_itos(a);
     float32 f1 = vfp_itos(b);
     return (float32_compare_quiet(f0, f1, NFS) == -1) ? a : b;
 }
 
-uint32_t HELPER(neon_max_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_max_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     float32 f0 = vfp_itos(a);
     float32 f1 = vfp_itos(b);
     return (float32_compare_quiet(f0, f1, NFS) == 1) ? a : b;
 }
 
-uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_abd_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     float32 f0 = vfp_itos(a);
     float32 f1 = vfp_itos(b);
@@ -1817,24 +1816,24 @@  uint32_t HELPER(neon_abd_f32)(uint32_t a, uint32_t b)
                     : float32_sub(f1, f0, NFS));
 }
 
-uint32_t HELPER(neon_add_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_add_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     return vfp_stoi(float32_add(vfp_itos(a), vfp_itos(b), NFS));
 }
 
-uint32_t HELPER(neon_sub_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_sub_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     return vfp_stoi(float32_sub(vfp_itos(a), vfp_itos(b), NFS));
 }
 
-uint32_t HELPER(neon_mul_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_mul_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     return vfp_stoi(float32_mul(vfp_itos(a), vfp_itos(b), NFS));
 }
 
 /* Floating point comparisons produce an integer result.  */
 #define NEON_VOP_FCMP(name, cmp) \
-uint32_t HELPER(neon_##name)(uint32_t a, uint32_t b) \
+uint32_t HELPER(neon_##name)(CPUState *env, uint32_t a, uint32_t b) \
 { \
     if (float32_compare_quiet(vfp_itos(a), vfp_itos(b), NFS) cmp 0) \
         return ~0; \
@@ -1846,14 +1845,14 @@  NEON_VOP_FCMP(ceq_f32, ==)
 NEON_VOP_FCMP(cge_f32, >=)
 NEON_VOP_FCMP(cgt_f32, >)
 
-uint32_t HELPER(neon_acge_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_acge_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     float32 f0 = float32_abs(vfp_itos(a));
     float32 f1 = float32_abs(vfp_itos(b));
     return (float32_compare_quiet(f0, f1,NFS) >= 0) ? ~0 : 0;
 }
 
-uint32_t HELPER(neon_acgt_f32)(uint32_t a, uint32_t b)
+uint32_t HELPER(neon_acgt_f32)(CPUState *env, uint32_t a, uint32_t b)
 {
     float32 f0 = float32_abs(vfp_itos(a));
     float32 f1 = float32_abs(vfp_itos(b));
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 062de5e..f4be8dc 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4517,56 +4517,56 @@  static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
         case 26: /* Floating point arithnetic.  */
             switch ((u << 2) | size) {
             case 0: /* VADD */
-                gen_helper_neon_add_f32(tmp, tmp, tmp2);
+                gen_helper_neon_add_f32(tmp, cpu_env, tmp, tmp2);
                 break;
             case 2: /* VSUB */
-                gen_helper_neon_sub_f32(tmp, tmp, tmp2);
+                gen_helper_neon_sub_f32(tmp, cpu_env, tmp, tmp2);
                 break;
             case 4: /* VPADD */
-                gen_helper_neon_add_f32(tmp, tmp, tmp2);
+                gen_helper_neon_add_f32(tmp, cpu_env, tmp, tmp2);
                 break;
             case 6: /* VABD */
-                gen_helper_neon_abd_f32(tmp, tmp, tmp2);
+                gen_helper_neon_abd_f32(tmp, cpu_env, tmp, tmp2);
                 break;
             default:
                 return 1;
             }
             break;
         case 27: /* Float multiply.  */
-            gen_helper_neon_mul_f32(tmp, tmp, tmp2);
+            gen_helper_neon_mul_f32(tmp, cpu_env, tmp, tmp2);
             if (!u) {
                 tcg_temp_free_i32(tmp2);
                 tmp2 = neon_load_reg(rd, pass);
                 if (size == 0) {
-                    gen_helper_neon_add_f32(tmp, tmp, tmp2);
+                    gen_helper_neon_add_f32(tmp, cpu_env, tmp, tmp2);
                 } else {
-                    gen_helper_neon_sub_f32(tmp, tmp2, tmp);
+                    gen_helper_neon_sub_f32(tmp, cpu_env, tmp2, tmp);
                 }
             }
             break;
         case 28: /* Float compare.  */
             if (!u) {
-                gen_helper_neon_ceq_f32(tmp, tmp, tmp2);
+                gen_helper_neon_ceq_f32(tmp, cpu_env, tmp, tmp2);
             } else {
                 if (size == 0)
-                    gen_helper_neon_cge_f32(tmp, tmp, tmp2);
+                    gen_helper_neon_cge_f32(tmp, cpu_env, tmp, tmp2);
                 else
-                    gen_helper_neon_cgt_f32(tmp, tmp, tmp2);
+                    gen_helper_neon_cgt_f32(tmp, cpu_env, tmp, tmp2);
             }
             break;
         case 29: /* Float compare absolute.  */
             if (!u)
                 return 1;
             if (size == 0)
-                gen_helper_neon_acge_f32(tmp, tmp, tmp2);
+                gen_helper_neon_acge_f32(tmp, cpu_env, tmp, tmp2);
             else
-                gen_helper_neon_acgt_f32(tmp, tmp, tmp2);
+                gen_helper_neon_acgt_f32(tmp, cpu_env, tmp, tmp2);
             break;
         case 30: /* Float min/max.  */
             if (size == 0)
-                gen_helper_neon_max_f32(tmp, tmp, tmp2);
+                gen_helper_neon_max_f32(tmp, cpu_env, tmp, tmp2);
             else
-                gen_helper_neon_min_f32(tmp, tmp, tmp2);
+                gen_helper_neon_min_f32(tmp, cpu_env, tmp, tmp2);
             break;
         case 31:
             if (size == 0)
@@ -5230,7 +5230,7 @@  static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                                 gen_helper_neon_qrdmulh_s32(tmp, cpu_env, tmp, tmp2);
                             }
                         } else if (op & 1) {
-                            gen_helper_neon_mul_f32(tmp, tmp, tmp2);
+                            gen_helper_neon_mul_f32(tmp,  cpu_env, tmp, tmp2);
                         } else {
                             switch (size) {
                             case 0: gen_helper_neon_mul_u8(tmp, tmp, tmp2); break;
@@ -5248,13 +5248,15 @@  static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                                 gen_neon_add(size, tmp, tmp2);
                                 break;
                             case 1:
-                                gen_helper_neon_add_f32(tmp, tmp, tmp2);
+                                gen_helper_neon_add_f32(tmp, cpu_env,
+                                                        tmp, tmp2);
                                 break;
                             case 4:
                                 gen_neon_rsb(size, tmp, tmp2);
                                 break;
                             case 5:
-                                gen_helper_neon_sub_f32(tmp, tmp2, tmp);
+                                gen_helper_neon_sub_f32(tmp,  cpu_env,
+                                                        tmp2, tmp);
                                 break;
                             default:
                                 abort();
@@ -5639,21 +5641,21 @@  static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             break;
                         case 24: case 27: /* Float VCGT #0, Float VCLE #0 */
                             tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cgt_f32(tmp, tmp, tmp2);
+                            gen_helper_neon_cgt_f32(tmp,  cpu_env, tmp, tmp2);
                             tcg_temp_free(tmp2);
                             if (op == 27)
                                 tcg_gen_not_i32(tmp, tmp);
                             break;
                         case 25: case 28: /* Float VCGE #0, Float VCLT #0 */
                             tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_cge_f32(tmp, tmp, tmp2);
+                            gen_helper_neon_cge_f32(tmp,  cpu_env, tmp, tmp2);
                             tcg_temp_free(tmp2);
                             if (op == 28)
                                 tcg_gen_not_i32(tmp, tmp);
                             break;
                         case 26: /* Float VCEQ #0 */
                             tmp2 = tcg_const_i32(0);
-                            gen_helper_neon_ceq_f32(tmp, tmp, tmp2);
+                            gen_helper_neon_ceq_f32(tmp,  cpu_env, tmp, tmp2);
                             tcg_temp_free(tmp2);
                             break;
                         case 30: /* Float VABS */