Patchwork [applied] NEON vldN optimization

Submitter Paul Brook
Date June 11, 2010, 7:15 p.m.
Message ID <201006112015.56847.paul@codesourcery.com>
Permalink /patch/55349/
State New

Comments

Paul Brook - June 11, 2010, 7:15 p.m.
When combining multiple values as part of a NEON array load, do an explicit
shift/or rather than using gen_bfi.  This avoids redundant mask
operations.

Signed-off-by: Paul Brook <paul@codesourcery.com>
---
 target-arm/translate.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)
Paul Brook - June 11, 2010, 7:40 p.m.
> +                            tcg_gen_shli_i32(tmp2, tmp, 16);
> +                            tcg_gen_or_i32(tmp, tmp, tmp2);

Which, if you're paying attention, is incorrect: it shifts tmp rather than
tmp2. The actual committed patch is correct:
tcg_gen_shli_i32(tmp2, tmp2, 16);

Paul
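
For readers following the thread, here is a minimal plain-C sketch of why the shift/or form can replace the bitfield insert, and of what the posted first hunk computes instead. It models the operations on ordinary integers and is not QEMU code: bfi_model, shift_or_model, posted_hunk1_model and check_halfword_combine are hypothetical names, and bfi_model assumes gen_bfi behaves as a masked bitfield insert. The equivalence relies on the loaded values being zero-extended, as the gen_ld16u in the first hunk suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of a bitfield insert: clear the (mask << shift) field of
 * base, then insert the masked value there. Costs two extra AND operations. */
uint32_t bfi_model(uint32_t base, uint32_t val, int shift, uint32_t mask)
{
    return (base & ~(mask << shift)) | ((val & mask) << shift);
}

/* Explicit shift/or. Matches bfi_model only when val already fits in the
 * field and base is zero in that field (true for zero-extended loads). */
uint32_t shift_or_model(uint32_t base, uint32_t val, int shift)
{
    return base | (val << shift);
}

/* What the first hunk as posted computes: tmp2 is overwritten with
 * tmp << 16, so the second loaded halfword is lost. */
uint32_t posted_hunk1_model(uint32_t tmp, uint32_t tmp2)
{
    tmp2 = tmp << 16;
    return tmp | tmp2;
}

/* For zero-extended 16-bit inputs the two forms agree. */
void check_halfword_combine(uint16_t lo, uint16_t hi)
{
    assert(shift_or_model(lo, hi, 16) == bfi_model(lo, hi, 16, 0xffff));
}
```

With lo = 0x1234 and hi = 0xabcd, both bfi_model and shift_or_model give 0xabcd1234, while posted_hunk1_model gives 0x12341234. That is the slip pointed out in the follow-up above.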

Patch

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0eccca5..11ff055 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3854,7 +3854,8 @@  static int disas_neon_ls_insn(CPUState * env, DisasContext *s, uint32_t insn)
                             tcg_gen_addi_i32(addr, addr, stride);
                             tmp2 = gen_ld16u(addr, IS_USER(s));
                             tcg_gen_addi_i32(addr, addr, stride);
-                            gen_bfi(tmp, tmp, tmp2, 16, 0xffff);
+                            tcg_gen_shli_i32(tmp2, tmp, 16);
+                            tcg_gen_or_i32(tmp, tmp, tmp2);
                             dead_tmp(tmp2);
                             neon_store_reg(rd, pass, tmp);
                         } else {
@@ -3875,7 +3876,8 @@  static int disas_neon_ls_insn(CPUState * env, DisasContext *s, uint32_t insn)
                                 if (n == 0) {
                                     tmp2 = tmp;
                                 } else {
-                                    gen_bfi(tmp2, tmp2, tmp, n * 8, 0xff);
+                                    tcg_gen_shli_i32(tmp, tmp, n * 8);
+                                    tcg_gen_or_i32(tmp2, tmp2, tmp);
                                     dead_tmp(tmp);
                                 }
                             }