From patchwork Mon Aug 24 19:36:50 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 510296
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Mon, 24 Aug 2015 12:36:50 -0700
Message-Id: <1440445026-26522-3-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 2.4.3
In-Reply-To: <1440445026-26522-1-git-send-email-rth@twiddle.net>
References: <1440445026-26522-1-git-send-email-rth@twiddle.net>
Cc: peter.maydell@linaro.org, Aurelien Jarno
Subject: [Qemu-devel] [PULL 02/18] tcg/optimize: optimize temps tracking

From: Aurelien Jarno

The tcg_temp_info structure uses 24 bytes per temp. Now that we emulate
vector registers on most guests, it's not uncommon to have more than 100
used temps. This means we have to initialize more than 2kB at least twice
per TB, often more when there are a few goto_tb.

Instead, use a TCGTempSet bit array to track which temps are in use in
the current basic block. This means there are only around 16 bytes to
initialize.

This improves the boot time of a MIPS guest on an x86-64 host by around
7% and moves tcg_optimize out from the top of the profiler list.

[rth: Handle TCG_CALL_DUMMY_ARG]
Signed-off-by: Aurelien Jarno
Signed-off-by: Richard Henderson
---
 tcg/optimize.c | 43 ++++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index cd0e793..413920f 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -50,6 +50,7 @@ struct tcg_temp_info {
 };
 
 static struct tcg_temp_info temps[TCG_MAX_TEMPS];
+static TCGTempSet temps_used;
 
 /* Reset TEMP's state to TCG_TEMP_UNDEF. If TEMP only had one copy, remove
    the copy flag from the left temp. */
@@ -67,6 +68,22 @@ static void reset_temp(TCGArg temp)
     temps[temp].mask = -1;
 }
 
+/* Reset all temporaries, given that there are NB_TEMPS of them. */
+static void reset_all_temps(int nb_temps)
+{
+    bitmap_zero(temps_used.l, nb_temps);
+}
+
+/* Initialize and activate a temporary. */
+static void init_temp_info(TCGArg temp)
+{
+    if (!test_bit(temp, temps_used.l)) {
+        temps[temp].state = TCG_TEMP_UNDEF;
+        temps[temp].mask = -1;
+        set_bit(temp, temps_used.l);
+    }
+}
+
 static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
                                 TCGOpcode opc, int nargs)
 {
@@ -98,16 +115,6 @@ static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
     return new_op;
 }
 
-/* Reset all temporaries, given that there are NB_TEMPS of them. */
-static void reset_all_temps(int nb_temps)
-{
-    int i;
-    for (i = 0; i < nb_temps; i++) {
-        temps[i].state = TCG_TEMP_UNDEF;
-        temps[i].mask = -1;
-    }
-}
-
 static int op_bits(TCGOpcode op)
 {
     const TCGOpDef *def = &tcg_op_defs[op];
@@ -598,12 +605,24 @@ void tcg_optimize(TCGContext *s)
         const TCGOpDef *def = &tcg_op_defs[opc];
 
         oi_next = op->next;
+
+        /* Count the arguments, and initialize the temps that are
+           going to be used */
         if (opc == INDEX_op_call) {
             nb_oargs = op->callo;
             nb_iargs = op->calli;
+            for (i = 0; i < nb_oargs + nb_iargs; i++) {
+                tmp = args[i];
+                if (tmp != TCG_CALL_DUMMY_ARG) {
+                    init_temp_info(tmp);
+                }
+            }
         } else {
             nb_oargs = def->nb_oargs;
             nb_iargs = def->nb_iargs;
+            for (i = 0; i < nb_oargs + nb_iargs; i++) {
+                init_temp_info(args[i]);
+            }
         }
 
         /* Do copy propagation */
@@ -1299,7 +1318,9 @@ void tcg_optimize(TCGContext *s)
             if (!(args[nb_oargs + nb_iargs + 1]
                   & (TCG_CALL_NO_READ_GLOBALS | TCG_CALL_NO_WRITE_GLOBALS))) {
                 for (i = 0; i < nb_globals; i++) {
-                    reset_temp(i);
+                    if (test_bit(i, temps_used.l)) {
+                        reset_temp(i);
+                    }
                 }
             }
             goto do_reset_output;
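
For readers who want to try the idea outside the QEMU tree, here is a minimal standalone sketch of the lazy-initialization scheme this patch describes: a "used" bitmap is cleared on reset, and a temp's info is initialized only the first time the temp is seen. It is not the QEMU code. MAX_TEMPS, struct temp_info, temp_is_used() and the hand-rolled bit operations are assumptions standing in for TCG_MAX_TEMPS, struct tcg_temp_info, test_bit()/set_bit() and TCGTempSet.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the QEMU definitions; names, sizes and
   fields are assumptions made for illustration only. */
#define MAX_TEMPS 512
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

enum temp_state { TEMP_UNDEF = 0, TEMP_CONST, TEMP_COPY };

struct temp_info {
    enum temp_state state;
    unsigned long mask;
};

static struct temp_info temps[MAX_TEMPS];
static unsigned long temps_used[WORDS_FOR(MAX_TEMPS)];

/* Forget every temp. Only the bitmap words covering NB_TEMPS bits are
   cleared; the temps[] array itself is left stale. Returns the number
   of bytes cleared so the cost is visible to the caller. */
static size_t reset_all_temps(size_t nb_temps)
{
    size_t bytes = WORDS_FOR(nb_temps) * sizeof(unsigned long);

    memset(temps_used, 0, bytes);
    return bytes;
}

/* Report whether TEMP has been initialized since the last reset. */
static bool temp_is_used(size_t temp)
{
    return (temps_used[temp / BITS_PER_WORD] >> (temp % BITS_PER_WORD)) & 1UL;
}

/* Initialize TEMP on first use after a reset. Later calls keep whatever
   state has been recorded for it in the meantime. */
static void init_temp_info(size_t temp)
{
    if (!temp_is_used(temp)) {
        temps[temp].state = TEMP_UNDEF;
        temps[temp].mask = ~0UL;
        temps_used[temp / BITS_PER_WORD] |= 1UL << (temp % BITS_PER_WORD);
    }
}
```

With 100 temps and 8-bit bytes, reset_all_temps(100) in this sketch clears 16 bytes, which lines up with the "around 16 bytes" figure in the commit message. The trade-off is that every reader of temps[] has to go through init_temp_info() or check the bitmap first, which is why the call-handling loop in the patch tests temps_used before calling reset_temp().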