Patchwork [v4,33/33] tcg-ppc64: Handle deposit of zero

login
register
mail settings
Submitter Richard Henderson
Date April 4, 2013, 10:56 p.m.
Message ID <1365116186-19382-34-git-send-email-rth@twiddle.net>
Download mbox | patch
Permalink /patch/233992/
State New
Headers show

Comments

Richard Henderson - April 4, 2013, 10:56 p.m.
The TCG optimizer does great work when inserting constants, being able
to fold the open-coded deposit expansion to just an AND or an OR.  Avoid
a bit the regression caused by having the deposit opcode by expanding
deposit of zero as an AND.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/ppc64/tcg-target.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)
Aurelien Jarno - April 15, 2013, 8:14 a.m.
On Thu, Apr 04, 2013 at 05:56:26PM -0500, Richard Henderson wrote:
> The TCG optimizer does great work when inserting constants, being able
> to fold the open-coded deposit expansion to just an AND or an OR.  Avoid
> a bit the regression caused by having the deposit opcode by expanding
> deposit of zero as an AND.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/ppc64/tcg-target.c | 22 ++++++++++++++++------
>  1 file changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c
> index 9583cf9..772f4ac 100644
> --- a/tcg/ppc64/tcg-target.c
> +++ b/tcg/ppc64/tcg-target.c
> @@ -1928,12 +1928,22 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args,
>          break;
>  
>      case INDEX_op_deposit_i32:
> -        tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
> -                    32 - args[3] - args[4], 31 - args[3]);
> +        if (const_args[2]) {
> +            uint32_t mask = ((2u << (args[4] - 1)) - 1) << args[3];
> +            tcg_out_andi32(s, args[0], args[0], ~mask);
> +        } else {
> +            tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
> +                        32 - args[3] - args[4], 31 - args[3]);
> +        }
>          break;
>      case INDEX_op_deposit_i64:
> -        tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
> -                    64 - args[3] - args[4]);
> +        if (const_args[2]) {
> +            uint64_t mask = ((2ull << (args[4] - 1)) - 1) << args[3];
> +            tcg_out_andi64(s, args[0], args[0], ~mask);
> +        } else {
> +            tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
> +                        64 - args[3] - args[4]);
> +        }
>          break;
>  
>      case INDEX_op_movcond_i32:
> @@ -2136,8 +2146,8 @@ static const TCGTargetOpDef ppc_op_defs[] = {
>      { INDEX_op_bswap32_i64, { "r", "r" } },
>      { INDEX_op_bswap64_i64, { "r", "r" } },
>  
> -    { INDEX_op_deposit_i32, { "r", "0", "r" } },
> -    { INDEX_op_deposit_i64, { "r", "0", "r" } },
> +    { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
> +    { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
>  
>      { INDEX_op_add2_i64, { "r", "r", "r", "rI", "r", "rZM" } },
>      { INDEX_op_sub2_i64, { "r", "r", "rI", "r", "rZM", "r" } },

I first thought this should go into the middle end, but OTOH it will
de-optimize some TCG targets which have a zero register like MIPS.

In the long term I think we should allow the middle end to query the
backend for constraints, and take the right decision (which would also
improve constant propagation on some hosts). In the meantime it looks
like the right thing to do, so:

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

Patch

diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c
index 9583cf9..772f4ac 100644
--- a/tcg/ppc64/tcg-target.c
+++ b/tcg/ppc64/tcg-target.c
@@ -1928,12 +1928,22 @@  static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args,
         break;
 
     case INDEX_op_deposit_i32:
-        tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
-                    32 - args[3] - args[4], 31 - args[3]);
+        if (const_args[2]) {
+            uint32_t mask = ((2u << (args[4] - 1)) - 1) << args[3];
+            tcg_out_andi32(s, args[0], args[0], ~mask);
+        } else {
+            tcg_out_rlw(s, RLWIMI, args[0], args[2], args[3],
+                        32 - args[3] - args[4], 31 - args[3]);
+        }
         break;
     case INDEX_op_deposit_i64:
-        tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
-                    64 - args[3] - args[4]);
+        if (const_args[2]) {
+            uint64_t mask = ((2ull << (args[4] - 1)) - 1) << args[3];
+            tcg_out_andi64(s, args[0], args[0], ~mask);
+        } else {
+            tcg_out_rld(s, RLDIMI, args[0], args[2], args[3],
+                        64 - args[3] - args[4]);
+        }
         break;
 
     case INDEX_op_movcond_i32:
@@ -2136,8 +2146,8 @@  static const TCGTargetOpDef ppc_op_defs[] = {
     { INDEX_op_bswap32_i64, { "r", "r" } },
     { INDEX_op_bswap64_i64, { "r", "r" } },
 
-    { INDEX_op_deposit_i32, { "r", "0", "r" } },
-    { INDEX_op_deposit_i64, { "r", "0", "r" } },
+    { INDEX_op_deposit_i32, { "r", "0", "rZ" } },
+    { INDEX_op_deposit_i64, { "r", "0", "rZ" } },
 
     { INDEX_op_add2_i64, { "r", "r", "r", "rI", "r", "rZM" } },
     { INDEX_op_sub2_i64, { "r", "r", "rI", "r", "rZM", "r" } },