From patchwork Mon Apr 15 18:40:50 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 236671
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: av1474@comtv.ru, aurelien@aurel32.net
Date: Mon, 15 Apr 2013 20:40:50 +0200
Message-Id: <1366051272-12979-12-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.8.1.4
In-Reply-To: <1366051272-12979-1-git-send-email-rth@twiddle.net>
References: <1366051272-12979-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH v5 11/33] tcg-ppc64: Improve constant add and sub ops.

Improve constant addition -- previously we'd emit useless addi with 0.
Use new constraints to force the driver to pull full 64-bit constants
into a register.

Reviewed-by: Aurelien Jarno
Signed-off-by: Richard Henderson
---
 tcg/ppc64/tcg-target.c | 108 +++++++++++++++++++++++++++++--------------------
 1 file changed, 64 insertions(+), 44 deletions(-)

diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c
index 6ba09ab..384946b 100644
--- a/tcg/ppc64/tcg-target.c
+++ b/tcg/ppc64/tcg-target.c
@@ -988,32 +988,6 @@ static void tcg_out_st (TCGContext *s, TCGType type, TCGReg arg, TCGReg arg1,
         tcg_out_ldsta (s, arg, arg1, arg2, STD, STDX);
 }
 
-static void ppc_addi32(TCGContext *s, TCGReg rt, TCGReg ra, tcg_target_long si)
-{
-    if (!si && rt == ra)
-        return;
-
-    if (si == (int16_t) si)
-        tcg_out32(s, ADDI | TAI(rt, ra, si));
-    else {
-        uint16_t h = ((si >> 16) & 0xffff) + ((uint16_t) si >> 15);
-        tcg_out32(s, ADDIS | TAI(rt, ra, h));
-        tcg_out32(s, ADDI | TAI(rt, rt, si));
-    }
-}
-
-static void ppc_addi64(TCGContext *s, TCGReg rt, TCGReg ra, tcg_target_long si)
-{
-    /* XXX: suboptimal */
-    if (si == (int16_t) si
-        || ((((uint64_t) si >> 31) == 0) && (si & 0x8000) == 0))
-        ppc_addi32 (s, rt, ra, si);
-    else {
-        tcg_out_movi (s, TCG_TYPE_I64, 0, si);
-        tcg_out32(s, ADD | TAB(rt, ra, 0));
-    }
-}
-
 static void tcg_out_cmp (TCGContext *s, int cond, TCGArg arg1, TCGArg arg2,
                          int const_arg2, int cr, int arch64)
 {
@@ -1232,6 +1206,7 @@ void ppc_tb_set_jmp_target (unsigned long jmp_addr, unsigned long addr)
 static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args,
                         const int *const_args)
 {
+    TCGArg a0, a1, a2;
     int c;
 
     switch (opc) {
@@ -1320,16 +1295,31 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args,
         break;
 
     case INDEX_op_add_i32:
-        if (const_args[2])
-            ppc_addi32 (s, args[0], args[1], args[2]);
-        else
-            tcg_out32 (s, ADD | TAB (args[0], args[1], args[2]));
+        a0 = args[0], a1 = args[1], a2 = args[2];
+        if (const_args[2]) {
+            int32_t l, h;
+        do_addi_32:
+            l = (int16_t)a2;
+            h = a2 - l;
+            if (h) {
+                tcg_out32(s, ADDIS | TAI(a0, a1, h >> 16));
+                a1 = a0;
+            }
+            if (l || a0 != a1) {
+                tcg_out32(s, ADDI | TAI(a0, a1, l));
+            }
+        } else {
+            tcg_out32(s, ADD | TAB(a0, a1, a2));
+        }
         break;
     case INDEX_op_sub_i32:
-        if (const_args[2])
-            ppc_addi32 (s, args[0], args[1], -args[2]);
-        else
-            tcg_out32 (s, SUBF | TAB (args[0], args[2], args[1]));
+        a0 = args[0], a1 = args[1], a2 = args[2];
+        if (const_args[2]) {
+            a2 = -a2;
+            goto do_addi_32;
+        } else {
+            tcg_out32(s, SUBF | TAB(a0, a2, a1));
+        }
         break;
 
     case INDEX_op_and_i64:
@@ -1459,16 +1449,46 @@ static void tcg_out_op (TCGContext *s, TCGOpcode opc, const TCGArg *args,
         break;
 
     case INDEX_op_add_i64:
-        if (const_args[2])
-            ppc_addi64 (s, args[0], args[1], args[2]);
-        else
-            tcg_out32 (s, ADD | TAB (args[0], args[1], args[2]));
+        a0 = args[0], a1 = args[1], a2 = args[2];
+        if (const_args[2]) {
+            int32_t l0, h1, h2;
+        do_addi_64:
+            /* We can always split any 32-bit signed constant into 3 pieces.
+               Note the positive 0x80000000 coming from the sub_i64 path,
+               handled with the same code we need for eg 0x7fff8000. */
+            assert(a2 == (int32_t)a2 || a2 == 0x80000000);
+            l0 = (int16_t)a2;
+            h1 = a2 - l0;
+            h2 = 0;
+            if (h1 < 0 && (int64_t)a2 > 0) {
+                h2 = 0x40000000;
+                h1 = a2 - h2 - l0;
+            }
+            assert((TCGArg)h2 + h1 + l0 == a2);
+
+            if (h2) {
+                tcg_out32(s, ADDIS | TAI(a0, a1, h2 >> 16));
+                a1 = a0;
+            }
+            if (h1) {
+                tcg_out32(s, ADDIS | TAI(a0, a1, h1 >> 16));
+                a1 = a0;
+            }
+            if (l0 || a0 != a1) {
+                tcg_out32(s, ADDI | TAI(a0, a1, l0));
+            }
+        } else {
+            tcg_out32(s, ADD | TAB(a0, a1, a2));
+        }
         break;
     case INDEX_op_sub_i64:
-        if (const_args[2])
-            ppc_addi64 (s, args[0], args[1], -args[2]);
-        else
-            tcg_out32 (s, SUBF | TAB (args[0], args[2], args[1]));
+        a0 = args[0], a1 = args[1], a2 = args[2];
+        if (const_args[2]) {
+            a2 = -a2;
+            goto do_addi_64;
+        } else {
+            tcg_out32(s, SUBF | TAB(a0, a2, a1));
+        }
         break;
 
     case INDEX_op_shl_i64:
@@ -1634,8 +1654,8 @@ static const TCGTargetOpDef ppc_op_defs[] = {
     { INDEX_op_neg_i32, { "r", "r" } },
     { INDEX_op_not_i32, { "r", "r" } },
 
-    { INDEX_op_add_i64, { "r", "r", "ri" } },
-    { INDEX_op_sub_i64, { "r", "r", "ri" } },
+    { INDEX_op_add_i64, { "r", "r", "rT" } },
+    { INDEX_op_sub_i64, { "r", "r", "rT" } },
     { INDEX_op_and_i64, { "r", "r", "rU" } },
     { INDEX_op_or_i64, { "r", "r", "rU" } },
     { INDEX_op_xor_i64, { "r", "r", "rU" } },
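
For readers following the do_addi_64 path, here is a minimal standalone sketch of the three-piece split, restated in plain 64-bit arithmetic so that it does not rely on narrowing conversions. It is not the code from the patch: AddiPieces, split_addi64 and split_is_valid are hypothetical names used only for illustration. The sketch assumes the same input range the patch asserts, namely any signed 32-bit value plus +0x80000000, which arises when the sub_i64 path negates the most negative 32-bit constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type, not part of the patch. */
typedef struct {
    int64_t h2;  /* optional extra high piece: 0 or 0x40000000 */
    int64_t h1;  /* high piece, always a multiple of 0x10000 */
    int64_t l0;  /* low piece, a sign-extended 16-bit value */
} AddiPieces;

/* Split a2 so that h2 + h1 + l0 == a2 and each piece fits a single
   ADDIS (piece >> 16) or ADDI (l0) signed 16-bit immediate. */
static AddiPieces split_addi64(int64_t a2)
{
    AddiPieces p = { 0, 0, 0 };
    int64_t low = a2 & 0xffff;

    /* Same range the patch asserts: int32_t, or exactly +0x80000000. */
    assert(a2 >= INT32_MIN && a2 <= INT64_C(0x80000000));

    /* Sign-extend the low 16 bits, as ADDI does with its immediate. */
    p.l0 = low >= 0x8000 ? low - 0x10000 : low;
    p.h1 = a2 - p.l0;

    /* Compensating for a negative l0 can push the high part up to
       +0x80000000, which no longer fits a signed 16-bit ADDIS
       immediate; peel off 0x40000000 into a second ADDIS. */
    if (p.h1 > INT32_MAX) {
        p.h2 = 0x40000000;
        p.h1 -= p.h2;
    }
    return p;
}

/* Return 1 if every piece is a valid immediate and they sum to a2. */
static int split_is_valid(int64_t a2)
{
    AddiPieces p = split_addi64(a2);
    int64_t i2 = p.h2 / 0x10000;  /* exact: h2 is a multiple of 0x10000 */
    int64_t i1 = p.h1 / 0x10000;  /* exact: h1 is a multiple of 0x10000 */

    return i2 >= INT16_MIN && i2 <= INT16_MAX
        && i1 >= INT16_MIN && i1 <= INT16_MAX
        && p.l0 >= INT16_MIN && p.l0 <= INT16_MAX
        && i2 * 0x10000 + i1 * 0x10000 + p.l0 == a2;
}
```

For 0x7fff8000 the sketch gives h2 = 0x40000000, h1 = 0x40000000 and l0 = -0x8000, which corresponds to two ADDIS and one ADDI. For constants whose high part fits directly, h2 stays 0 and at most two instructions are needed.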