target-arm: Fix garbage collection of temporaries in Neon emulation.

Message ID 4D35A4FB.3030403@st.com
State New

Commit Message

Christophe Lyon Jan. 18, 2011, 2:34 p.m. UTC
Fix garbage collection of temporaries in Neon emulation.


Signed-off-by: Christophe Lyon <christophe.lyon@st.com>
---
 target-arm/translate.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)

 /* Translate a NEON data processing instruction.  Return nonzero if the
@@ -4840,7 +4852,7 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
                 if (size == 3) {
                     tcg_temp_free_i64(tmp64);
                 } else {
-                    dead_tmp(tmp2);
+                    tcg_temp_free_i32(tmp2);
                 }
             } else if (op == 10) {
                 /* VSHLL */
@@ -5076,8 +5088,6 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
                     case 8: case 9: case 10: case 11: case 12: case 13:
                         /* VMLAL, VQDMLAL, VMLSL, VQDMLSL, VMULL,
VQDMULL */
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp2);
-                        dead_tmp(tmp);
                         break;
                     case 14: /* Polynomial VMULL */
                         cpu_abort(env, "Polynomial VMULL not implemented");
@@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
                             tmp = neon_load_reg(rn, 0);
                         } else {
                             tmp = tmp3;
+			    /* tmp2 has been discarded in
+			       gen_neon_mull during pass 0, we need to
+			       recreate it.  */
+			    tmp2 = neon_get_scalar(size, rm);
                         }
                         gen_neon_mull(cpu_V0, tmp, tmp2, size, u);
-                        dead_tmp(tmp);
                         if (op == 6 || op == 7) {
                             gen_neon_negl(cpu_V0, size);
                         }
@@ -5264,7 +5277,6 @@ static int disas_neon_data_insn(CPUState * env,
DisasContext *s, uint32_t insn)
                         neon_store_reg64(cpu_V0, rd + pass);
                     }

-                    dead_tmp(tmp2);

                     break;
                 default: /* 14 and 15 are RESERVED */

Comments

Peter Maydell Jan. 18, 2011, 3:26 p.m. UTC | #1
On 18 January 2011 14:34, Christophe Lyon <christophe.lyon@st.com> wrote:
> +
> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
> +       Don't forget to clean them now.  */
> +    switch ((size << 1) | u) {
> +    case 0:
> +    case 1:
> +    case 2:
> +    case 3:
> +      dead_tmp(a);
> +      dead_tmp(b);
> +      break;
> +    }
>  }

This seems a rather convoluted way to write "if (size < 2) { ... }"
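
A minimal standalone check of that equivalence, assuming size is in 0..3 and u is 0 or 1. The helper names below are illustrative only and are not QEMU functions:

```c
#include <stdbool.h>

/* Condition as written in the patch: selector values 0..3. */
static bool frees_via_switch(int size, int u)
{
    switch ((size << 1) | u) {
    case 0:
    case 1:
    case 2:
    case 3:
        return true;
    default:
        return false;
    }
}

/* Condition as suggested in the review. */
static bool frees_via_compare(int size)
{
    return size < 2;
}

/* Returns true if both forms agree for every size in 0..3 and u in 0..1. */
static bool conditions_agree(void)
{
    for (int size = 0; size <= 3; size++) {
        for (int u = 0; u <= 1; u++) {
            if (frees_via_switch(size, u) != frees_via_compare(size)) {
                return false;
            }
        }
    }
    return true;
}
```

Since u only occupies bit 0 of the selector, values 0..3 cover exactly size 0 and size 1, so the switch does not depend on u at all.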

> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
> DisasContext *s, uint32_t insn)
>                             tmp = neon_load_reg(rn, 0);
>                         } else {
>                             tmp = tmp3;
> +                           /* tmp2 has been discarded in
> +                              gen_neon_mull during pass 0, we need to
> +                              recreate it.  */
> +                           tmp2 = neon_get_scalar(size, rm);
>                         }

I think this will give the wrong results for instructions where the
scalar operand is in the same Neon register as the destination
for the first pass, because calling neon_get_scalar() again will
do a reload from the Neon register and it might have changed.
(Also loading it once at the start rather than in every pass is
more efficient as well as being correct :-))
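
The hazard can be sketched with a toy register file. This is a simplified model under stated assumptions (32-bit elements, two passes, a flat array standing in for the Neon registers), not the translator's actual code; all names are hypothetical:

```c
#include <stdint.h>

#define NPASS 2

/* Toy register file (hypothetical model, not QEMU's data structures).
   Each pass writes regs[rd + pass] = regs[rn + pass] * scalar. */

static void mul_by_scalar_load_once(uint32_t *regs, int rd, int rn, int rm)
{
    uint32_t scalar = regs[rm];              /* read before any write */
    for (int pass = 0; pass < NPASS; pass++) {
        regs[rd + pass] = regs[rn + pass] * scalar;
    }
}

static void mul_by_scalar_reload(uint32_t *regs, int rd, int rn, int rm)
{
    for (int pass = 0; pass < NPASS; pass++) {
        uint32_t scalar = regs[rm];          /* may see pass 0's result */
        regs[rd + pass] = regs[rn + pass] * scalar;
    }
}

/* Runs one variant with the scalar aliased to the pass-0 destination
   (rd == rm == 0) and returns the pass-1 result. */
static uint32_t pass1_result(int reload)
{
    uint32_t regs[4] = { 3, 0, 5, 7 };       /* scalar 3 lives in regs[0] */
    if (reload) {
        mul_by_scalar_reload(regs, 0, 2, 0);
    } else {
        mul_by_scalar_load_once(regs, 0, 2, 0);
    }
    return regs[1];
}
```

With the scalar 3 stored in the pass-0 destination, pass1_result(0) yields 7 * 3 = 21, while pass1_result(1) yields 7 * 15 = 105 because pass 0 has already overwritten the scalar with 15.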

Also your patch has hard-coded tabs in it (please see
CODING_STYLE on the subject of whitespace) and your
mail client or server has line-wrapped long lines in the patch
so it doesn't apply cleanly...

-- PMM
Peter Maydell Jan. 18, 2011, 3:36 p.m. UTC | #2
Incidentally there are some correctness fixes for the multiply-by-scalar
neon insns from the qemu-meego tree which are on my list to push
upstream. So you probably aren't getting the right results even if
you've managed to shut up qemu's warnings :-)

-- PMM
Christophe Lyon Jan. 18, 2011, 4:58 p.m. UTC | #3
On 18.01.2011 16:26, Peter Maydell wrote:
> On 18 January 2011 14:34, Christophe Lyon <christophe.lyon@st.com> wrote:
>> +
>> +    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
>> +       Don't forget to clean them now.  */
>> +    switch ((size << 1) | u) {
>> +    case 0:
>> +    case 1:
>> +    case 2:
>> +    case 3:
>> +      dead_tmp(a);
>> +      dead_tmp(b);
>> +      break;
>> +    }
>>  }
> 
> This seems a rather convoluted way to write "if (size < 2) { ... }"
> 
It was for consistency/readability with the preceding paragraph.

>> @@ -5235,9 +5245,12 @@ static int disas_neon_data_insn(CPUState * env,
>> DisasContext *s, uint32_t insn)
>>                             tmp = neon_load_reg(rn, 0);
>>                         } else {
>>                             tmp = tmp3;
>> +                           /* tmp2 has been discarded in
>> +                              gen_neon_mull during pass 0, we need to
>> +                              recreate it.  */
>> +                           tmp2 = neon_get_scalar(size, rm);
>>                         }
> 
> I think this will give the wrong results for instructions where the
> scalar operand is in the same Neon register as the destination
> for the first pass, because calling neon_get_scalar() again will
> do a reload from the Neon register and it might have changed.
> (Also loading it once at the start rather than in every pass is
> more efficient as well as being correct :-))

I agree it's more efficient, but as the temporary is freed by gen_neon_mull, how can I make an efficient copy?

If we decide not to free the temporary in gen_mul[us]_i64_i32, we'll have to make sure cleanup is performed correctly in many places.
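
One way to reconcile the two constraints (load the scalar once, yet call a helper that frees its operands) is to hand the helper a fresh copy on each pass and free the original after the loop. The sketch below is a toy ownership model with a live-temporary counter. Temp, temp_new, temp_dup, temp_free and mull_consuming are hypothetical names, not TCG or QEMU APIs, and the mapping to real TCG temporaries is an assumption:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a translator temporary. */
typedef struct Temp { unsigned value; } Temp;

static int live_temps;   /* temporaries currently allocated */

static Temp *temp_new(unsigned value)
{
    Temp *t = malloc(sizeof(*t));
    if (!t) {
        abort();
    }
    t->value = value;
    live_temps++;
    return t;
}

static void temp_free(Temp *t)
{
    free(t);
    live_temps--;
}

static Temp *temp_dup(const Temp *t)
{
    return temp_new(t->value);
}

/* Multiplies and frees both operands, modelling a consuming helper. */
static unsigned mull_consuming(Temp *a, Temp *b)
{
    unsigned result = a->value * b->value;
    temp_free(a);
    temp_free(b);
    return result;
}

/* Two passes sharing one scalar that is loaded exactly once.
   Returns the sum of both products. */
static unsigned two_pass_mull_sum(unsigned scalar_value,
                                  unsigned in0, unsigned in1)
{
    unsigned in[2] = { in0, in1 };
    unsigned sum = 0;
    Temp *scalar = temp_new(scalar_value);
    for (int pass = 0; pass < 2; pass++) {
        Temp *operand = temp_new(in[pass]);
        Temp *copy = temp_dup(scalar);   /* the helper frees the copy */
        sum += mull_consuming(operand, copy);
    }
    temp_free(scalar);                   /* the original is freed once */
    return sum;
}
```

After two_pass_mull_sum(3, 5, 7) the counter live_temps is back at 0, so nothing leaks and the scalar is never re-read from the register file. The cost is one extra copy per pass.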

 
> Also your patch has hard-coded tabs in it (please see
> CODING_STYLE on the subject of whitespace) and your
> mail client or server has line-wrapped long lines in the patch
> so it doesn't apply cleanly...

Sorry, I know we have some trouble with the mail client or server. Is it possible to send patches as attachments on this list?
Christophe Lyon Jan. 18, 2011, 5 p.m. UTC | #4
On 18.01.2011 16:36, Peter Maydell wrote:
> Incidentally there are some correctness fixes for the multiply-by-scalar
> neon insns from the qemu-meego tree which are on my list to push
> upstream. So you probably aren't getting the right results even if
> you've managed to shut up qemu's warnings :-)
> 

Actually, the patch does more than shut up qemu's warnings: qemu was asserting. Once I had fixed the asserts, it did indeed warn a lot about resource leakage, which I tried to fix with this patch.

And yes I can confirm there are many other wrong results in the Neon support, which I am currently fixing.


Christophe.
Peter Maydell Jan. 18, 2011, 5:09 p.m. UTC | #5
On 18 January 2011 17:00, Christophe Lyon <christophe.lyon@st.com> wrote:
> On 18.01.2011 16:36, Peter Maydell wrote:
>> Incidentally there are some correctness fixes for the multiply-by-scalar
>> neon insns from the qemu-meego tree which are on my list to push
>> upstream. So you probably aren't getting the right results even if
>> you've managed to shut up qemu's warnings :-)

> And yes I can confirm there are many other wrong results in the
> Neon support, which I am currently fixing.

Please coordinate this with me! I have a big pile of fixes which
I am working through, testing and submitting upstream, so you
are in significant danger of duplicating work, which would be
unfortunate.

-- PMM
Patch

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 57664bc..363351e 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4176,6 +4176,18 @@  static inline void gen_neon_mull(TCGv_i64 dest,
TCGv a, TCGv b, int size, int u)
         break;
     default: abort();
     }
+
+    /* gen_helper_neon_mull_[su]{8|16} do not free their parameters.
+       Don't forget to clean them now.  */
+    switch ((size << 1) | u) {
+    case 0:
+    case 1:
+    case 2:
+    case 3:
+      dead_tmp(a);
+      dead_tmp(b);
+      break;
+    }
 }