From patchwork Fri Feb 28 16:43:43 2020
X-Patchwork-Submitter: Taylor Simpson
X-Patchwork-Id: 1246711
From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: riku.voipio@iki.fi, richard.henderson@linaro.org, laurent@vivier.eu,
 Taylor Simpson, philmd@redhat.com, aleksandar.m.mail@gmail.com
Subject: [RFC PATCH v2 47/67] Hexagon TCG generation - step 09
Date: Fri, 28 Feb 2020 10:43:43 -0600
Message-Id: <1582908244-304-48-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>
References: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>

Override instructions to speed up qemu

Signed-off-by: Taylor Simpson
---
 target/hexagon/helper_overrides.h | 97 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 97 insertions(+)

diff --git a/target/hexagon/helper_overrides.h b/target/hexagon/helper_overrides.h
index d18aea4..5443a94e 100644
--- a/target/hexagon/helper_overrides.h
+++ b/target/hexagon/helper_overrides.h
@@ -1230,4 +1230,101 @@
         gen_helper_vacsh_pred(PeV, cpu_env, RxxV, RssV, RttV); \
     } while (0)
 
+/*
+ * The following fWRAP macros are to speed up QEMU.
+ * We can add more over time.
+ */
+
+/*
+ * Add or subtract with carry.
+ * Predicate register is used as an extra input and output.
+ * r5:4 = add(r1:0, r3:2, p1):carry
+ */
+#define fWRAP_A4_addp_c(GENHLPR, SHORTCODE) \
+    do { \
+        TCGv LSB = tcg_temp_new(); \
+        TCGv_i64 LSB_i64 = tcg_temp_new_i64(); \
+        TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+        TCGv tmp = tcg_temp_new(); \
+        tcg_gen_add_i64(RddV, RssV, RttV); \
+        fLSBOLD(PxV); \
+        tcg_gen_extu_i32_i64(LSB_i64, LSB); \
+        tcg_gen_add_i64(RddV, RddV, LSB_i64); \
+        fCARRY_FROM_ADD(RssV, RttV, LSB_i64); \
+        tcg_gen_extrl_i64_i32(tmp, RssV); \
+        f8BITSOF(PxV, tmp); \
+        tcg_temp_free(LSB); \
+        tcg_temp_free_i64(LSB_i64); \
+        tcg_temp_free_i64(tmp_i64); \
+        tcg_temp_free(tmp); \
+    } while (0)
+
+/* r5:4 = sub(r1:0, r3:2, p1):carry */
+#define fWRAP_A4_subp_c(GENHLPR, SHORTCODE) \
+    do { \
+        TCGv LSB = tcg_temp_new(); \
+        TCGv_i64 LSB_i64 = tcg_temp_new_i64(); \
+        TCGv_i64 tmp_i64 = tcg_temp_new_i64(); \
+        TCGv tmp = tcg_temp_new(); \
+        tcg_gen_not_i64(tmp_i64, RttV); \
+        tcg_gen_add_i64(RddV, RssV, tmp_i64); \
+        fLSBOLD(PxV); \
+        tcg_gen_extu_i32_i64(LSB_i64, LSB); \
+        tcg_gen_add_i64(RddV, RddV, LSB_i64); \
+        fCARRY_FROM_ADD(RssV, tmp_i64, LSB_i64); \
+        tcg_gen_extrl_i64_i32(tmp, RssV); \
+        f8BITSOF(PxV, tmp); \
+        tcg_temp_free(LSB); \
+        tcg_temp_free_i64(LSB_i64); \
+        tcg_temp_free_i64(tmp_i64); \
+        tcg_temp_free(tmp); \
+    } while (0)
+
+/*
+ * Compare each of the 8 unsigned bytes.
+ * The minimum is placed in each byte of the destination.
+ * Each bit of the predicate is set true if the byte from the first operand
+ * is greater than the byte from the second operand.
+ * r5:4,p1 = vminub(r1:0, r3:2)
+ */
+#define fWRAP_A6_vminub_RdP(GENHLPR, SHORTCODE) \
+    do { \
+        TCGv BYTE = tcg_temp_new(); \
+        TCGv left = tcg_temp_new(); \
+        TCGv right = tcg_temp_new(); \
+        TCGv tmp = tcg_temp_new(); \
+        int i; \
+        tcg_gen_movi_tl(PeV, 0); \
+        tcg_gen_movi_i64(RddV, 0); \
+        for (i = 0; i < 8; i++) { \
+            fGETUBYTE(i, RttV); \
+            tcg_gen_mov_tl(left, BYTE); \
+            fGETUBYTE(i, RssV); \
+            tcg_gen_mov_tl(right, BYTE); \
+            tcg_gen_setcond_tl(TCG_COND_GT, tmp, left, right); \
+            fSETBIT(i, PeV, tmp); \
+            fMIN(tmp, left, right); \
+            fSETBYTE(i, RddV, tmp); \
+        } \
+        tcg_temp_free(BYTE); \
+        tcg_temp_free(left); \
+        tcg_temp_free(right); \
+        tcg_temp_free(tmp); \
+    } while (0)
+
+#define fWRAP_J2_call(GENHLPR, SHORTCODE) \
+    gen_call(riV)
+#define fWRAP_J2_callr(GENHLPR, SHORTCODE) \
+    gen_callr(RsV)
+
+#define fWRAP_J2_loop0r(GENHLPR, SHORTCODE) \
+    gen_loop0r(RsV, riV, insn)
+#define fWRAP_J2_loop1r(GENHLPR, SHORTCODE) \
+    gen_loop1r(RsV, riV, insn)
+
+#define fWRAP_J2_endloop0(GENHLPR, SHORTCODE) \
+    gen_endloop0()
+#define fWRAP_J2_endloop1(GENHLPR, SHORTCODE) \
+    gen_endloop1()
+
 #endif
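
Note for reviewers (not part of the patch): the comments above describe the
add/sub-with-carry and vminub semantics only briefly, so a host-side reference
model may help when checking the TCG overrides. The sketch below is plain C
with no TCG calls. The names ref_addp_c, ref_subp_c, ref_vminub and
ref_selfcheck are hypothetical. It assumes the carry-out is replicated into
all 8 predicate bits (as the use of f8BITSOF suggests), that subtraction is
computed as Rss + ~Rtt + carry-in (as the tcg_gen_not_i64 in the sub macro
suggests), and that byte 0 is the least significant byte.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical reference model of Rdd = add(Rss, Rtt, Px):carry.
 * Bit 0 of *px is the carry-in; on return *px is 0xff if the 64-bit
 * addition carried out, else 0x00 (assumed predicate layout).
 */
static inline uint64_t ref_addp_c(uint64_t rss, uint64_t rtt, uint8_t *px)
{
    uint64_t cin = *px & 1;
    uint64_t partial = rss + rtt;
    uint64_t sum = partial + cin;
    /* A carry-out occurred if either unsigned addition wrapped around. */
    int carry = (partial < rss) || (sum < partial);

    *px = carry ? 0xff : 0x00;
    return sum;
}

/* Rdd = sub(Rss, Rtt, Px):carry, modelled as Rss + ~Rtt + carry-in. */
static inline uint64_t ref_subp_c(uint64_t rss, uint64_t rtt, uint8_t *px)
{
    return ref_addp_c(rss, ~rtt, px);
}

/*
 * Rdd,Pe = vminub(Rtt, Rss): per-byte unsigned minimum.
 * Bit i of *pe is set when byte i of rtt is greater than byte i of rss.
 */
static inline uint64_t ref_vminub(uint64_t rtt, uint64_t rss, uint8_t *pe)
{
    uint64_t rdd = 0;

    *pe = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t left = (uint8_t)(rtt >> (8 * i));
        uint8_t right = (uint8_t)(rss >> (8 * i));
        uint8_t min = left < right ? left : right;

        if (left > right) {
            *pe |= (uint8_t)(1u << i);
        }
        rdd |= (uint64_t)min << (8 * i);
    }
    return rdd;
}

/* Small self-check showing how the model can be exercised. */
static inline void ref_selfcheck(void)
{
    uint8_t p = 0x01;

    /* All-ones plus carry-in wraps to zero and sets the carry. */
    assert(ref_addp_c(UINT64_MAX, 0, &p) == 0 && p == 0xff);

    /* 5 - 3 with carry-in set (no borrow) gives 2 and carry-out set. */
    p = 0x01;
    assert(ref_subp_c(5, 3, &p) == 2 && p == 0xff);

    /* Low four bytes of the first operand are the larger ones. */
    assert(ref_vminub(UINT64_C(0x0102030405060708),
                      UINT64_C(0x0807060504030201), &p)
           == UINT64_C(0x0102030404030201) && p == 0x0f);
}
```

Calling ref_selfcheck() from a small test program exercises all three
functions. Results from such a model could be compared with what the
overridden instructions produce under QEMU, provided the assumptions listed
above match the architecture manual.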