From patchwork Tue Oct 2 18:32:30 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 188611
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Aurelien Jarno
Date: Tue, 2 Oct 2012 11:32:30 -0700
Message-Id: <1349202750-16815-11-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.7.11.4
In-Reply-To: <1349202750-16815-1-git-send-email-rth@twiddle.net>
References: <1349202750-16815-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH 10/10] tcg: Optimize mulu2

Like add2, do operand ordering, constant folding, and dead operand
elimination. The latter applies to about 15% of all mulu2 operations
during an x86_64 BIOS boot.

Signed-off-by: Richard Henderson
Reviewed-by: Aurelien Jarno
---
 tcg/optimize.c | 26 ++++++++++++++++++++++++++
 tcg/tcg-op.h   |  2 ++
 tcg/tcg.c      | 19 +++++++++++++++++++
 3 files changed, 47 insertions(+)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 05891ef..a06c8eb 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -543,6 +543,9 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             swap_commutative(args[0], &args[2], &args[4]);
             swap_commutative(args[1], &args[3], &args[5]);
             break;
+        case INDEX_op_mulu2_i32:
+            swap_commutative(args[0], &args[2], &args[3]);
+            break;
         case INDEX_op_brcond2_i32:
             if (swap_commutative2(&args[0], &args[2])) {
                 args[4] = tcg_swap_cond(args[4]);
@@ -831,6 +834,29 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             }
             goto do_default;
 
+        case INDEX_op_mulu2_i32:
+            if (temps[args[2]].state == TCG_TEMP_CONST
+                && temps[args[3]].state == TCG_TEMP_CONST) {
+                uint32_t a = temps[args[2]].val;
+                uint32_t b = temps[args[3]].val;
+                uint64_t r = (uint64_t)a * b;
+                TCGArg rl, rh;
+
+                /* We emit the extra nop when we emit the mulu2. */
+                assert(gen_opc_buf[op_index + 1] == INDEX_op_nop);
+
+                rl = args[0];
+                rh = args[1];
+                gen_opc_buf[op_index] = INDEX_op_movi_i32;
+                gen_opc_buf[++op_index] = INDEX_op_movi_i32;
+                tcg_opt_gen_movi(&gen_args[0], rl, (uint32_t)r);
+                tcg_opt_gen_movi(&gen_args[2], rh, (uint32_t)(r >> 32));
+                gen_args += 4;
+                args += 4;
+                break;
+            }
+            goto do_default;
+
         case INDEX_op_brcond2_i32:
             tmp = do_constant_folding_cond2(&args[0], &args[2], args[4]);
             if (tmp != 2) {
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 1f5a021..044e648 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -997,6 +997,8 @@ static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 
     tcg_gen_op4_i32(INDEX_op_mulu2_i32, TCGV_LOW(t0), TCGV_HIGH(t0),
                     TCGV_LOW(arg1), TCGV_LOW(arg2));
+    /* Allow the optimizer room to replace mulu2 with two moves. */
+    tcg_gen_op0(INDEX_op_nop);
 
     tcg_gen_mul_i32(t1, TCGV_LOW(arg1), TCGV_HIGH(arg2));
     tcg_gen_add_i32(TCGV_HIGH(t0), TCGV_HIGH(t0), t1);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 21c1074..8280489 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1337,6 +1337,25 @@ static void tcg_liveness_analysis(TCGContext *s)
                 }
                 goto do_not_remove;
 
+            case INDEX_op_mulu2_i32:
+                args -= 4;
+                nb_iargs = 2;
+                nb_oargs = 2;
+                /* Likewise, test for the high part of the operation dead. */
+                if (dead_temps[args[1]]) {
+                    if (dead_temps[args[0]]) {
+                        goto do_remove;
+                    }
+                    gen_opc_buf[op_index] = op = INDEX_op_mul_i32;
+                    args[1] = args[2];
+                    args[2] = args[3];
+                    assert(gen_opc_buf[op_index + 1] == INDEX_op_nop);
+                    tcg_set_nop(s, gen_opc_buf + op_index + 1, args + 3, 1);
+                    /* Fall through and mark the single-word operation live. */
+                    nb_oargs = 1;
+                }
+                goto do_not_remove;
+
             default:
                 /* XXX: optimize by hardcoding common cases (e.g. triadic ops) */
                 args -= def->nb_args;