Patchwork [4/7] tcg: Optimize double-word comparisons against zero

Submitter Richard Henderson
Date Sept. 27, 2012, 5:19 p.m.
Message ID <1348766397-20731-5-git-send-email-rth@twiddle.net>
Permalink /patch/187417/
State New

Comments

Richard Henderson - Sept. 27, 2012, 5:19 p.m.
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)
Aurelien Jarno - Oct. 1, 2012, 6:43 p.m.
On Thu, Sep 27, 2012 at 10:19:54AM -0700, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/optimize.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index d39926e..c972e4f 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -799,6 +799,57 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
>              }
>              args += 6;
>              break;
> +        case INDEX_op_brcond2_i32:
> +            /* Simplify LT/GE comparisons vs zero to a single compare
> +               vs the high word of the input.  */
> +            if ((args[4] == TCG_COND_LT || args[4] == TCG_COND_GE)
> +                && temps[args[2]].state == TCG_TEMP_CONST
> +                && temps[args[3]].state == TCG_TEMP_CONST
> +                && temps[args[2]].val == 0
> +                && temps[args[2]].val == 0) {

The value comparison here is wrong, probably a copy & paste issue: the
last test repeats temps[args[2]].val instead of checking
temps[args[3]].val. I wonder how it could work.

> +                gen_opc_buf[op_index] = INDEX_op_brcond_i32;
> +                args[0] = args[1];
> +                args[1] = args[3];
> +                args[2] = args[4];
> +                args[3] = args[5];
> +                gen_args += 4;
> +            } else {
> +                gen_args[0] = args[0];
> +                gen_args[1] = args[1];
> +                gen_args[2] = args[2];
> +                gen_args[3] = args[3];
> +                gen_args[4] = args[4];
> +                gen_args[5] = args[5];
> +                gen_args += 6;
> +            }
> +            memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
> +            args += 6;
> +            break;
> +        case INDEX_op_setcond2_i32:
> +            /* Simplify LT/GE comparisons vs zero to a single compare
> +               vs the high word of the input.  */
> +            if ((args[5] == TCG_COND_LT || args[5] == TCG_COND_GE)
> +                && temps[args[3]].state == TCG_TEMP_CONST
> +                && temps[args[4]].state == TCG_TEMP_CONST
> +                && temps[args[3]].val == 0
> +                && temps[args[4]].val == 0) {

Here it is fine.

> +                gen_opc_buf[op_index] = INDEX_op_setcond_i32;
> +                args[1] = args[2];
> +                args[2] = args[4];
> +                args[3] = args[5];
> +                gen_args += 4;
> +            } else {
> +                reset_temp(args[0]);
> +                gen_args[0] = args[0];
> +                gen_args[1] = args[1];
> +                gen_args[2] = args[2];
> +                gen_args[3] = args[3];
> +                gen_args[4] = args[4];
> +                gen_args[5] = args[5];
> +                gen_args += 6;
> +            }
> +            args += 6;
> +            break;
>          case INDEX_op_call:
>              nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
>              if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {

While it's a nice optimization to have, a case that seems to come up a
lot more often is the two high parts being equal. That happens when the
guest is working on (u)int32_t values.
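
To illustrate the case described above, here is a minimal standalone sketch (not QEMU code; the function names are hypothetical). It assumes one reading of the remark: when both high words are equal, a signed double-word less-than is decided entirely by the low words, compared as unsigned values. That reduction is an inference for illustration, not something spelled out in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Full signed double-word compare: (ah:al) < (bh:bl). */
static bool lt_double(uint32_t al, uint32_t ah, uint32_t bl, uint32_t bh)
{
    int64_t a = (int64_t)(((uint64_t)ah << 32) | al);
    int64_t b = (int64_t)(((uint64_t)bh << 32) | bl);
    return a < b;
}

/* Reduced form, valid only when the two high words are equal:
   the low words are compared as unsigned values. */
static bool lt_low_unsigned(uint32_t al, uint32_t bl)
{
    return al < bl;
}

/* Check the reduction over a few boundary values, using the same
   high word on both sides each time. */
static void check_equal_high_parts(void)
{
    static const uint32_t v[] = {
        0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    const size_t n = sizeof(v) / sizeof(v[0]);

    for (size_t h = 0; h < n; h++) {
        for (size_t i = 0; i < n; i++) {
            for (size_t j = 0; j < n; j++) {
                assert(lt_double(v[i], v[h], v[j], v[h])
                       == lt_low_unsigned(v[i], v[j]));
            }
        }
    }
}
```

Calling check_equal_high_parts() from a test harness exercises both positive and negative high words; the unsigned compare on the low half holds in either case because the shared high word contributes the same amount to both operands.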
Richard Henderson - Oct. 1, 2012, 6:47 p.m.
On 2012-10-01 11:43, Aurelien Jarno wrote:
> While it's a nice optimization to have, one that seems to happen a lot
> more often is the two high parts being equal. It happens when the guest
> is working on (u)int32_t.

It depends on which target you're looking at.  For an alpha guest, all
branches are comparisons vs zero, so LT/GE happens with some regularity.



r~
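
As an illustration of why the rewrite in this patch is sound, here is a minimal standalone sketch (not QEMU code; the function names are hypothetical and chosen for this example). A signed double-word LT or GE comparison against zero depends only on the sign bit, which lives in the high word, so comparing the high word alone gives the same answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Double-word form: the 64-bit value is split into low and high
   32-bit halves, as in the brcond2/setcond2 operands. */
static bool lt_zero_double(uint32_t lo, uint32_t hi)
{
    int64_t v = (int64_t)(((uint64_t)hi << 32) | lo);
    return v < 0;
}

/* Single-word form after the rewrite: only the high half is examined. */
static bool lt_zero_high(uint32_t hi)
{
    return (int32_t)hi < 0;
}

/* Check LT, and GE as its negation, over a few boundary values
   for both halves. */
static void check_high_word_rewrite(void)
{
    static const uint32_t v[] = {
        0u, 1u, 0x7fffffffu, 0x80000000u, 0xffffffffu
    };
    const size_t n = sizeof(v) / sizeof(v[0]);

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            bool lt_full = lt_zero_double(v[i], v[j]);
            bool lt_high = lt_zero_high(v[j]);
            assert(lt_full == lt_high);      /* TCG_COND_LT case */
            assert(!lt_full == !lt_high);    /* TCG_COND_GE case */
        }
    }
}
```

The low word never affects the result, which is why the simplified op can drop it. The same argument does not carry over to the other conditions (for example LE or GT against zero), which is consistent with the patch restricting itself to LT and GE.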

Patch

diff --git a/tcg/optimize.c b/tcg/optimize.c
index d39926e..c972e4f 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -799,6 +799,57 @@  static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             }
             args += 6;
             break;
+        case INDEX_op_brcond2_i32:
+            /* Simplify LT/GE comparisons vs zero to a single compare
+               vs the high word of the input.  */
+            if ((args[4] == TCG_COND_LT || args[4] == TCG_COND_GE)
+                && temps[args[2]].state == TCG_TEMP_CONST
+                && temps[args[3]].state == TCG_TEMP_CONST
+                && temps[args[2]].val == 0
+                && temps[args[2]].val == 0) {
+                gen_opc_buf[op_index] = INDEX_op_brcond_i32;
+                args[0] = args[1];
+                args[1] = args[3];
+                args[2] = args[4];
+                args[3] = args[5];
+                gen_args += 4;
+            } else {
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args[2] = args[2];
+                gen_args[3] = args[3];
+                gen_args[4] = args[4];
+                gen_args[5] = args[5];
+                gen_args += 6;
+            }
+            memset(temps, 0, nb_temps * sizeof(struct tcg_temp_info));
+            args += 6;
+            break;
+        case INDEX_op_setcond2_i32:
+            /* Simplify LT/GE comparisons vs zero to a single compare
+               vs the high word of the input.  */
+            if ((args[5] == TCG_COND_LT || args[5] == TCG_COND_GE)
+                && temps[args[3]].state == TCG_TEMP_CONST
+                && temps[args[4]].state == TCG_TEMP_CONST
+                && temps[args[3]].val == 0
+                && temps[args[4]].val == 0) {
+                gen_opc_buf[op_index] = INDEX_op_setcond_i32;
+                args[1] = args[2];
+                args[2] = args[4];
+                args[3] = args[5];
+                gen_args += 4;
+            } else {
+                reset_temp(args[0]);
+                gen_args[0] = args[0];
+                gen_args[1] = args[1];
+                gen_args[2] = args[2];
+                gen_args[3] = args[3];
+                gen_args[4] = args[4];
+                gen_args[5] = args[5];
+                gen_args += 6;
+            }
+            args += 6;
+            break;
         case INDEX_op_call:
             nb_call_args = (args[0] >> 16) + (args[0] & 0xffff);
             if (!(args[nb_call_args + 1] & (TCG_CALL_CONST | TCG_CALL_PURE))) {