From patchwork Fri Jan 11 23:42:52 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 211455 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 86C6C2C035A for ; Sat, 12 Jan 2013 10:43:31 +1100 (EST) Received: from localhost ([::1]:36659 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TtoG1-0004CX-L0 for incoming@patchwork.ozlabs.org; Fri, 11 Jan 2013 18:43:29 -0500 Received: from eggs.gnu.org ([208.118.235.92]:53252) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TtoFe-00046I-5F for qemu-devel@nongnu.org; Fri, 11 Jan 2013 18:43:10 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TtoFZ-0004KY-TD for qemu-devel@nongnu.org; Fri, 11 Jan 2013 18:43:06 -0500 Received: from mail-qc0-f182.google.com ([209.85.216.182]:40128) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TtoFZ-0004KS-NX for qemu-devel@nongnu.org; Fri, 11 Jan 2013 18:43:01 -0500 Received: by mail-qc0-f182.google.com with SMTP id k19so1449177qcs.41 for ; Fri, 11 Jan 2013 15:43:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:from:to:cc:subject:date:message-id:x-mailer :in-reply-to:references; bh=ytLuFBv0WB/hORK/CHYIHn7FHtxxTrcXhaxOYYAMq7I=; b=0aAVL/fGIkojiuJQEht71xBTX3OtZWdMTsQfDnMt8kb3FvJ9ZIjNGadhxbZBBxzuir 9u322Mvbqj+LUK78mf2FhR5IenjzCqrt4puVkiyvKXckOjmJEDaDU0iZ73W4CoGL361q XJWj8ERnRaSLKbXcS0SJFkHfdWNx55+3ZC8y6UEPw2KanWzWuDv45blnTaLbSoNOHO9m e2K/2X7Qn9DPX0EuxeYNFvDNQqV4drvSoawzsGLpsoAAqTtqo/PpVg3+v18G9Z98oSGY gTHbRg4Fn/KBs12VrOWq2Uo5PosGqrsYpQhv7rCbRHrqJPna+VwvSDU6jPWuvlXMzuNH ZORw== 
X-Received: by 10.49.118.162 with SMTP id kn2mr72777167qeb.65.1357947781144; Fri, 11 Jan 2013 15:43:01 -0800 (PST) Received: from anchor.twiddle.home.com (50-194-63-110-static.hfc.comcastbusiness.net. [50.194.63.110]) by mx.google.com with ESMTPS id ff4sm4767977qab.10.2013.01.11.15.42.59 (version=TLSv1 cipher=RC4-SHA bits=128/128); Fri, 11 Jan 2013 15:43:00 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 11 Jan 2013 15:42:52 -0800 Message-Id: <1357947773-31051-3-git-send-email-rth@twiddle.net> X-Mailer: git-send-email 1.7.11.7 In-Reply-To: <1357947773-31051-1-git-send-email-rth@twiddle.net> References: <1357947773-31051-1-git-send-email-rth@twiddle.net> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [fuzzy] X-Received-From: 209.85.216.182 Cc: Paolo Bonzini , Aurelien Jarno Subject: [Qemu-devel] [PATCH 2/3] optimize: track nonzero bits of registers X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Paolo Bonzini Add a "mask" field to the tcg_temp_info struct. A bit that is zero in "mask" will always be zero in the corresponding temporary. Zero bits in the mask can be produced by moves of immediates, zero-extensions, ANDs with constants, and shifts; they can then be propagated by logical operations, shifts, sign-extensions, negations, deposit operations, and conditional moves. Other operations will just reset the mask to all-ones, i.e. unknown.
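As a rough standalone illustration of the propagation rules described above (not the code in tcg/optimize.c), the sketch below models a few of them as pure functions on masks. The names mask_t, MASK_UNKNOWN and the mask_* helpers are hypothetical and exist only for this example; a 64-bit host word is assumed in place of tcg_target_ulong.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for tcg_target_ulong; 64 bits assumed here. */
typedef uint64_t mask_t;

/* All-ones means "nothing is known": any bit may be nonzero. */
#define MASK_UNKNOWN ((mask_t)-1)

/* Immediate move: every bit that is 0 in the constant is known to be 0. */
static mask_t mask_movi(mask_t val) { return val; }

/* AND: a result bit may be nonzero only if it may be nonzero in both inputs. */
static mask_t mask_and(mask_t a, mask_t b) { return a & b; }

/* OR and XOR: a result bit may be nonzero if it may be nonzero in either input. */
static mask_t mask_or_xor(mask_t a, mask_t b) { return a | b; }

/* Logical shifts by a known constant count move the possibly-nonzero bits. */
static mask_t mask_shr(mask_t a, unsigned n) { assert(n < 64); return a >> n; }
static mask_t mask_shl(mask_t a, unsigned n) { assert(n < 64); return a << n; }

/* Negation: every bit at or above the lowest possibly-nonzero bit may be set. */
static mask_t mask_neg(mask_t a) { return -(a & -a); }

/* Zero-extension from 8 bits behaves like an AND with 0xff. */
static mask_t mask_ext8u(mask_t a) { return a & 0xff; }

/* Sign-extension from 8 bits: if bit 7 is known to be zero it behaves like
   a zero-extension; otherwise nothing is known about the result. */
static mask_t mask_ext8s(mask_t a)
{
    return (a & 0x80) ? MASK_UNKNOWN : (a & 0xff);
}
```

For instance, under these assumptions an unknown value ANDed with the constant 0xff yields the mask 0xff, and shifting that right by 4 yields 0x0f, so a later zero-extension from 8 bits could be recognised as redundant.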
[rth: s/target_ulong/tcg_target_ulong/] Signed-off-by: Paolo Bonzini Signed-off-by: Richard Henderson --- tcg/optimize.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 110 insertions(+), 22 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 9d05a72..090efbc 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -46,6 +46,7 @@ struct tcg_temp_info { uint16_t prev_copy; uint16_t next_copy; tcg_target_ulong val; + tcg_target_ulong mask; }; static struct tcg_temp_info temps[TCG_MAX_TEMPS]; @@ -63,6 +64,7 @@ static void reset_temp(TCGArg temp) } } temps[temp].state = TCG_TEMP_UNDEF; + temps[temp].mask = -1; } /* Reset all temporaries, given that there are NB_TEMPS of them. */ @@ -71,6 +73,7 @@ static void reset_all_temps(int nb_temps) int i; for (i = 0; i < nb_temps; i++) { temps[i].state = TCG_TEMP_UNDEF; + temps[i].mask = -1; } } @@ -148,33 +151,35 @@ static bool temps_are_copies(TCGArg arg1, TCGArg arg2) static void tcg_opt_gen_mov(TCGContext *s, TCGArg *gen_args, TCGArg dst, TCGArg src) { - reset_temp(dst); - assert(temps[src].state != TCG_TEMP_CONST); - - if (s->temps[src].type == s->temps[dst].type) { - if (temps[src].state != TCG_TEMP_COPY) { - temps[src].state = TCG_TEMP_COPY; - temps[src].next_copy = src; - temps[src].prev_copy = src; - } - temps[dst].state = TCG_TEMP_COPY; - temps[dst].next_copy = temps[src].next_copy; - temps[dst].prev_copy = src; - temps[temps[dst].next_copy].prev_copy = dst; - temps[src].next_copy = dst; + reset_temp(dst); + temps[dst].mask = temps[src].mask; + assert(temps[src].state != TCG_TEMP_CONST); + + if (s->temps[src].type == s->temps[dst].type) { + if (temps[src].state != TCG_TEMP_COPY) { + temps[src].state = TCG_TEMP_COPY; + temps[src].next_copy = src; + temps[src].prev_copy = src; } + temps[dst].state = TCG_TEMP_COPY; + temps[dst].next_copy = temps[src].next_copy; + temps[dst].prev_copy = src; + temps[temps[dst].next_copy].prev_copy = dst; + temps[src].next_copy = dst; + } - gen_args[0] 
= dst; - gen_args[1] = src; + gen_args[0] = dst; + gen_args[1] = src; } static void tcg_opt_gen_movi(TCGArg *gen_args, TCGArg dst, TCGArg val) { - reset_temp(dst); - temps[dst].state = TCG_TEMP_CONST; - temps[dst].val = val; - gen_args[0] = dst; - gen_args[1] = val; + reset_temp(dst); + temps[dst].state = TCG_TEMP_CONST; + temps[dst].val = val; + temps[dst].mask = val; + gen_args[0] = dst; + gen_args[1] = val; } static TCGOpcode op_to_mov(TCGOpcode op) @@ -479,6 +484,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr, TCGArg *args, TCGOpDef *tcg_op_defs) { int i, nb_ops, op_index, nb_temps, nb_globals, nb_call_args; + tcg_target_ulong mask; TCGOpcode op; const TCGOpDef *def; TCGArg *gen_args; @@ -621,6 +627,87 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr, break; } + /* Simplify using known-zero bits */ + mask = -1; + switch (op) { + CASE_OP_32_64(ext8s): + if ((temps[args[1]].mask & 0x80) != 0) { + break; + } + CASE_OP_32_64(ext8u): + mask = 0xff; + goto and_const; + CASE_OP_32_64(ext16s): + if ((temps[args[1]].mask & 0x8000) != 0) { + break; + } + CASE_OP_32_64(ext16u): + mask = 0xffff; + goto and_const; + case INDEX_op_ext32s_i64: + if ((temps[args[1]].mask & 0x80000000) != 0) { + break; + } + case INDEX_op_ext32u_i64: + mask = 0xffffffffU; + goto and_const; + + CASE_OP_32_64(and): + mask = temps[args[2]].mask; + if (temps[args[2]].state == TCG_TEMP_CONST) { + and_const: + ; + } + mask = temps[args[1]].mask & mask; + break; + + CASE_OP_32_64(sar): + if (temps[args[2]].state == TCG_TEMP_CONST) { + mask = ((tcg_target_long)temps[args[1]].mask + >> temps[args[2]].val); + } + break; + + CASE_OP_32_64(shr): + if (temps[args[2]].state == TCG_TEMP_CONST) { + mask = temps[args[1]].mask >> temps[args[2]].val; + } + break; + + CASE_OP_32_64(shl): + if (temps[args[2]].state == TCG_TEMP_CONST) { + mask = temps[args[1]].mask << temps[args[2]].val; + } + break; + + CASE_OP_32_64(neg): + /* Set to 1 all bits to the left 
of the rightmost. */ + mask = -(temps[args[1]].mask & -temps[args[1]].mask); + break; + + CASE_OP_32_64(deposit): + tmp = ((1ull << args[4]) - 1); + mask = ((temps[args[1]].mask & ~(tmp << args[3])) + | ((temps[args[2]].mask & tmp) << args[3])); + break; + + CASE_OP_32_64(or): + CASE_OP_32_64(xor): + mask = temps[args[1]].mask | temps[args[2]].mask; + break; + + CASE_OP_32_64(setcond): + mask = 1; + break; + + CASE_OP_32_64(movcond): + mask = temps[args[3]].mask | temps[args[4]].mask; + break; + + default: + break; + } + /* Simplify expression for "op r, a, 0 => movi r, 0" cases */ switch (op) { CASE_OP_32_64(and): @@ -947,7 +1034,8 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr, /* Default case: we know nothing about operation (or were unable to compute the operation result) so no propagation is done. We trash everything if the operation is the end of a basic - block, otherwise we only trash the output args. */ + block, otherwise we only trash the output args. "mask" is + the non-zero bits mask for the first output arg. */ if (def->flags & TCG_OPF_BB_END) { reset_all_temps(nb_temps); } else {