From patchwork Sat Mar 30 20:43:16 2013
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 232561
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: Aurelien Jarno
Date: Sat, 30 Mar 2013 13:43:16 -0700
Message-Id: <1364676207-21516-8-git-send-email-rth@twiddle.net>
In-Reply-To: <1364676207-21516-1-git-send-email-rth@twiddle.net>
References: <1364676207-21516-1-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.8.1.4
Subject: [Qemu-devel] [PATCH v4 07/18] tcg-arm: Improve constant generation

Try fully rotated arguments to mov and mvn before trying movt or
full decomposition. Begin decomposition with mvn when it looks
like it'll help.
Examples include

-:        mov   r9, #0x00000fa0
-:        orr   r9, r9, #0x000ee000
-:        orr   r9, r9, #0x0ff00000
-:        orr   r9, r9, #0xf0000000

+:        mvn   r9, #0x0000005f
+:        eor   r9, r9, #0x00011000

Signed-off-by: Richard Henderson
---
 tcg/arm/tcg-target.c | 67 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 44 insertions(+), 23 deletions(-)

diff --git a/tcg/arm/tcg-target.c b/tcg/arm/tcg-target.c
index bef2e66..ca76902 100644
--- a/tcg/arm/tcg-target.c
+++ b/tcg/arm/tcg-target.c
@@ -427,15 +427,31 @@ static inline void tcg_out_dat_imm(TCGContext *s,
                     (rn << 16) | (rd << 12) | im);
 }
 
-static inline void tcg_out_movi32(TCGContext *s,
-                int cond, int rd, uint32_t arg)
-{
-    /* TODO: This is very suboptimal, we can easily have a constant
-     * pool somewhere after all the instructions. */
-    if ((int)arg < 0 && (int)arg >= -0x100) {
-        tcg_out_dat_imm(s, cond, ARITH_MVN, rd, 0, (~arg) & 0xff);
-    } else if (use_armv7_instructions) {
-        /* use movw/movt */
+static void tcg_out_movi32(TCGContext *s, int cond, int rd, uint32_t arg)
+{
+    int rot, opc, rn;
+
+    /* For armv7, make sure not to use movw+movt when mov/mvn would do.
+       Speed things up by only checking when movt would be required.
+       Prior to armv7, have one go at fully rotated immediates before
+       doing the decomposition thing below. */
+    if (!use_armv7_instructions || (arg & 0xffff0000)) {
+        rot = encode_imm(arg);
+        if (rot >= 0) {
+            tcg_out_dat_imm(s, cond, ARITH_MOV, rd, 0,
+                            rotl(arg, rot) | (rot << 7));
+            return;
+        }
+        rot = encode_imm(~arg);
+        if (rot >= 0) {
+            tcg_out_dat_imm(s, cond, ARITH_MVN, rd, 0,
+                            rotl(~arg, rot) | (rot << 7));
+            return;
+        }
+    }
+
+    /* Use movw + movt. */
+    if (use_armv7_instructions) {
         /* movw */
         tcg_out32(s, (cond << 28) | 0x03000000 | (rd << 12)
                   | ((arg << 4) & 0x000f0000) | (arg & 0xfff));
@@ -444,22 +460,27 @@ static inline void tcg_out_movi32(TCGContext *s,
             tcg_out32(s, (cond << 28) | 0x03400000 | (rd << 12)
                       | ((arg >> 12) & 0x000f0000) | ((arg >> 16) & 0xfff));
         }
-    } else {
-        int opc = ARITH_MOV;
-        int rn = 0;
-
-        do {
-            int i, rot;
-
-            i = ctz32(arg) & ~1;
-            rot = ((32 - i) << 7) & 0xf00;
-            tcg_out_dat_imm(s, cond, opc, rd, rn, ((arg >> i) & 0xff) | rot);
-            arg &= ~(0xff << i);
+        return;
+    }
 
-            opc = ARITH_ORR;
-            rn = rd;
-        } while (arg);
+    /* TODO: This is very suboptimal, we can easily have a constant
+       pool somewhere after all the instructions. */
+    opc = ARITH_MOV;
+    rn = 0;
+    /* If we have lots of leading 1's, we can shorten the sequence by
+       beginning with mvn and then clearing higher bits with eor. */
+    if (clz32(~arg) > clz32(arg)) {
+        opc = ARITH_MVN, arg = ~arg;
     }
+    do {
+        int i = ctz32(arg) & ~1;
+        rot = ((32 - i) << 7) & 0xf00;
+        tcg_out_dat_imm(s, cond, opc, rd, rn, ((arg >> i) & 0xff) | rot);
+        arg &= ~(0xff << i);
+
+        opc = ARITH_EOR;
+        rn = rd;
+    } while (arg);
 }
 
 static inline void tcg_out_dat_rI(TCGContext *s, int cond, int opc, TCGArg dst,
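
For readers who want to check the pre-ARMv7 path of the patch above by hand, the following is a minimal standalone sketch, not QEMU code. It models only the logic: whether a constant fits an ARM rotated 8-bit immediate, how many instructions the old mov+orr loop needs, and what the mov/mvn-then-eor sequence produces. All names here (fits_rotated_imm8, plain_count, simulate_movi32, movi_seq, the *_sketch helpers) are hypothetical stand-ins; in particular fits_rotated_imm8 is assumed to accept the same constants as encode_imm but is not its implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits (n taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Count trailing zeros; the caller must pass a nonzero value. */
static int ctz32_sketch(uint32_t x)
{
    int n = 0;
    assert(x != 0);
    while (!(x & 1)) { x >>= 1; n++; }
    return n;
}

/* Count leading zeros; returns 32 for zero. */
static int clz32_sketch(uint32_t x)
{
    int n = 0;
    while (n < 32 && !(x & 0x80000000u)) { x <<= 1; n++; }
    return n;
}

/* Hypothetical stand-in for encode_imm(): return an even left-rotation
   that brings imm into the low 8 bits, or -1 if there is none. */
static int fits_rotated_imm8(uint32_t imm)
{
    for (int rot = 0; rot < 32; rot += 2) {
        if ((rotl32(imm, rot) & ~0xffu) == 0) {
            return rot;
        }
    }
    return -1;
}

/* Instruction count of the plain mov + orr decomposition (arg != 0). */
static int plain_count(uint32_t arg)
{
    int n = 0;
    assert(arg != 0);
    do {
        int i = ctz32_sketch(arg) & ~1;      /* even bit position */
        arg &= ~((uint32_t)0xff << i);       /* consume one 8-bit chunk */
        n++;
    } while (arg);
    return n;
}

struct movi_seq { int count; uint32_t value; };

/* Model of the patched pre-ARMv7 path: single mov/mvn if possible,
   otherwise mov-or-mvn followed by eor for each remaining chunk. */
static struct movi_seq simulate_movi32(uint32_t arg)
{
    struct movi_seq s = { 0, 0 };
    int invert = 0;

    if (fits_rotated_imm8(arg) >= 0 || fits_rotated_imm8(~arg) >= 0) {
        s.count = 1;                          /* one mov or one mvn */
        s.value = arg;
        return s;
    }
    if (clz32_sketch(~arg) > clz32_sketch(arg)) {
        invert = 1;                           /* start with mvn */
        arg = ~arg;
    }
    do {
        int i = ctz32_sketch(arg) & ~1;
        uint32_t chunk = arg & ((uint32_t)0xff << i);
        if (s.count == 0) {
            s.value = invert ? ~chunk : chunk; /* mvn or mov */
        } else {
            s.value ^= chunk;                  /* eor */
        }
        arg &= ~chunk;
        s.count++;
    } while (arg);
    return s;
}
```

With the constant from the commit message, 0xfffeefa0, plain_count gives 4 and simulate_movi32 gives a count of 2 with the value reconstructed exactly. Because the chunks never overlap, eor behaves like orr in the non-inverted case, which is presumably why the loop can use eor unconditionally.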