From patchwork Sat Apr 20 07:34:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088301 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="xPNr4aeD"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPpj2900z9s4Y for ; Sat, 20 Apr 2019 17:38:37 +1000 (AEST) Received: from localhost ([127.0.0.1]:38087 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHka3-0002Xj-8D for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:38:35 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40059) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWR-0008KO-Ur for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWQ-0007em-HA for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:51 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:39734) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWQ-0007eQ-B4 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:50 -0400 Received: by mail-pg1-x544.google.com with SMTP id l18so3459315pgj.6 for ; Sat, 20 Apr 2019 00:34:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mMJwBlVyB/MmlQKnXsB2p1E0fVeRhNBVAwRu1nQHFzg=; b=xPNr4aeDIXraWxpq6k+ZDGDaWZnuC5X6I6f8DabSDFMELVOOPuhhmNsOWUiuvvwb4Y DOSy9m8+tQK6cZLeZXA1kYlNZ5ZcMTxb7JWq+1j7qQxuTlv6+KkrKrEj/+78BaGcFanW dfNbT8Aa0PQvi5q5QVJjyfTb4TSpM8xfDu6vhC6s4qG8KlVGDkvO1s3fDUxep8SqRc02 I3IBFwTRm5UuQHolV6cNXzbozSrxVNj6XmBuMM7kaijbQepeqHj1zgBFtOOS8P4eGkIP csrSy8rASzSsPedbbOHkAmOtv/0bitSTAzdCWv4f4DrUcC8dswMtH/QB/kDp/OfOou6L z9oQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mMJwBlVyB/MmlQKnXsB2p1E0fVeRhNBVAwRu1nQHFzg=; b=JJmStUq7fLHiJ41nFBc4aDGoVK3d00DSkNYXRc7qmcFA/Ty4ygsMEzjlKwcLKs5Vew OtvKoc+UIBQNsJCGfC6VJr/WYg8rwrF88MNMPssA19kR329hPAps8xcCNi3YoFUuZloV i0m3mjVuRFU0QKHQgCD+4aFb1hP2OKnNPRmfcpkx09s5IalRZfnzEH0YTOOP50G/59ML Xtif4sUsQMuRuJOzn/toZL2GV7niHBZDQfyLuMYheRgTm9/In9ArpWmuvTJn7iv+OYW+ ntG6hJ513uCyw+iOPp40zU2HpqMctL5IwEfq+H+DjdBJQaArDgYV1jKAkdT/S5s9DpO2 az2w== X-Gm-Message-State: APjAAAW5u9OmaFxxcSxW9ZMyNuny0OYbMhJskb24FHMgoU93l4RqTZIr LEzSAYXJDSrXT2Kc6kTQK1ehfqpPzsE= X-Google-Smtp-Source: APXvYqyMtQ2BeezvlDW/qxAauSwZUNnDPx6iLPifuMHTRWgJzu5KGf6NWNnhKAGEdS3JsjTFrW5gxA== X-Received: by 2002:a65:5106:: with SMTP id f6mr8135741pgq.253.1555745689102; Sat, 20 Apr 2019 00:34:49 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.47 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:06 -1000 Message-Id: <20190420073442.7488-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH 02/38] tcg: Assert fixed_reg is read-only X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The only fixed_reg is cpu_env, and it should not be modified during any TB. Therefore code that tries to special-case moves into a fixed_reg is dead. Remove it. Signed-off-by: Richard Henderson Reviewed-by: David Hildenbrand --- tcg/tcg.c | 85 +++++++++++++++++++++++++------------------------------ 1 file changed, 38 insertions(+), 47 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index ade6050982..4f77a957b0 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3279,11 +3279,8 @@ static void tcg_reg_alloc_do_movi(TCGContext *s, TCGTemp *ots, tcg_target_ulong val, TCGLifeData arg_life, TCGRegSet preferred_regs) { - if (ots->fixed_reg) { - /* For fixed registers, we do not do any constant propagation. */ - tcg_out_movi(s, ots->type, ots->reg, val); - return; - } + /* ENV should not be modified. */ + tcg_debug_assert(!ots->fixed_reg); /* The movi is not explicitly generated here. */ if (ots->val_type == TEMP_VAL_REG) { @@ -3319,6 +3316,9 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) ots = arg_temp(op->args[0]); ts = arg_temp(op->args[1]); + /* ENV should not be modified. */ + tcg_debug_assert(!ots->fixed_reg); + /* Note that otype != itype for no-op truncation. */ otype = ots->type; itype = ts->type; @@ -3343,7 +3343,7 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) } tcg_debug_assert(ts->val_type == TEMP_VAL_REG); - if (IS_DEAD_ARG(0) && !ots->fixed_reg) { + if (IS_DEAD_ARG(0)) { /* mov to a non-saved dead register makes no sense (even with liveness analysis disabled). */ tcg_debug_assert(NEED_SYNC_ARG(0)); @@ -3356,7 +3356,7 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) } temp_dead(s, ots); } else { - if (IS_DEAD_ARG(1) && !ts->fixed_reg && !ots->fixed_reg) { + if (IS_DEAD_ARG(1) && !ts->fixed_reg) { /* the mov can be suppressed */ if (ots->val_type == TEMP_VAL_REG) { s->reg_to_temp[ots->reg] = NULL; @@ -3509,6 +3509,10 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) arg = op->args[i]; arg_ct = &def->args_ct[i]; ts = arg_temp(arg); + + /* ENV should not be modified. */ + tcg_debug_assert(!ts->fixed_reg); + if ((arg_ct->ct & TCG_CT_ALIAS) && !const_args[arg_ct->alias_index]) { reg = new_args[arg_ct->alias_index]; @@ -3517,29 +3521,19 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) i_allocated_regs | o_allocated_regs, op->output_pref[k], ts->indirect_base); } else { - /* if fixed register, we try to use it */ - reg = ts->reg; - if (ts->fixed_reg && - tcg_regset_test_reg(arg_ct->u.regs, reg)) { - goto oarg_end; - } reg = tcg_reg_alloc(s, arg_ct->u.regs, o_allocated_regs, op->output_pref[k], ts->indirect_base); } tcg_regset_set_reg(o_allocated_regs, reg); - /* if a fixed register is used, then a move will be done afterwards */ - if (!ts->fixed_reg) { - if (ts->val_type == TEMP_VAL_REG) { - s->reg_to_temp[ts->reg] = NULL; - } - ts->val_type = TEMP_VAL_REG; - ts->reg = reg; - /* temp value is modified, so the value kept in memory is - potentially not the same */ - ts->mem_coherent = 0; - s->reg_to_temp[reg] = ts; + if (ts->val_type == TEMP_VAL_REG) { + s->reg_to_temp[ts->reg] = NULL; } - oarg_end: + ts->val_type = TEMP_VAL_REG; + ts->reg = reg; + /* temp value is modified, so the value kept in memory is + potentially not the same */ + ts->mem_coherent = 0; + s->reg_to_temp[reg] = ts; new_args[i] = reg; } } @@ -3555,10 +3549,10 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) /* move the outputs in the correct register if needed */ for(i = 0; i < nb_oargs; i++) { ts = arg_temp(op->args[i]); - reg = new_args[i]; - if (ts->fixed_reg && ts->reg != reg) { - tcg_out_mov(s, ts->type, ts->reg, reg); - } + + /* ENV should not be modified. */ + tcg_debug_assert(!ts->fixed_reg); + if (NEED_SYNC_ARG(i)) { temp_sync(s, ts, o_allocated_regs, 0, IS_DEAD_ARG(i)); } else if (IS_DEAD_ARG(i)) { @@ -3679,26 +3673,23 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) for(i = 0; i < nb_oargs; i++) { arg = op->args[i]; ts = arg_temp(arg); + + /* ENV should not be modified. */ + tcg_debug_assert(!ts->fixed_reg); + reg = tcg_target_call_oarg_regs[i]; tcg_debug_assert(s->reg_to_temp[reg] == NULL); - - if (ts->fixed_reg) { - if (ts->reg != reg) { - tcg_out_mov(s, ts->type, ts->reg, reg); - } - } else { - if (ts->val_type == TEMP_VAL_REG) { - s->reg_to_temp[ts->reg] = NULL; - } - ts->val_type = TEMP_VAL_REG; - ts->reg = reg; - ts->mem_coherent = 0; - s->reg_to_temp[reg] = ts; - if (NEED_SYNC_ARG(i)) { - temp_sync(s, ts, allocated_regs, 0, IS_DEAD_ARG(i)); - } else if (IS_DEAD_ARG(i)) { - temp_dead(s, ts); - } + if (ts->val_type == TEMP_VAL_REG) { + s->reg_to_temp[ts->reg] = NULL; + } + ts->val_type = TEMP_VAL_REG; + ts->reg = reg; + ts->mem_coherent = 0; + s->reg_to_temp[reg] = ts; + if (NEED_SYNC_ARG(i)) { + temp_sync(s, ts, allocated_regs, 0, IS_DEAD_ARG(i)); + } else if (IS_DEAD_ARG(i)) { + temp_dead(s, ts); } } } From patchwork Sat Apr 20 07:34:07 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088298 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="irBPPWdM"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPld3HPKz9s4Y for ; Sat, 20 Apr 2019 17:35:57 +1000 (AEST) Received: from localhost ([127.0.0.1]:38037 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXT-0008QJ-Av for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:35:55 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40079) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWU-0008MX-9A for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWS-0007gC-K6 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:54 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:41505) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWS-0007fa-CN for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:52 -0400 Received: by mail-pl1-x644.google.com with SMTP id d1so3532485plj.8 for ; Sat, 20 Apr 2019 00:34:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=++rfimdjNhvgXJCPSNhLZGoRFxyZo6MTZOKY7rGkHAA=; b=irBPPWdMGVBtZIMgKyrAIyngAAQWT+BE5Qh+J5Bj5eZUSpbGS1Q++zkg85QeJ5Zb1X 2TovKtr1xwa1zFbWIWiiRB+8RI6PKHh/DVM18EywuowGvYRWJCaIvT1pZQIiwDB+Bjmm cOh8F6OiHdYU4qsxxdra3DEwILaum41jOY5m36R4wtdFNVIXsnPPVTC3PlRo3h6PCvTF EkAEbbZiLOqk7+8ijHP0kwyRDwa2snGEpohJpKo7YP47cYh0DlYy67vktVCB9mq+f0Hq l/wPyUEJScIyU4+XWCiAsJiaeBXDED3TvpoH59SHvFLafP40obAAIYfozgqALA9pU3W7 22IA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=++rfimdjNhvgXJCPSNhLZGoRFxyZo6MTZOKY7rGkHAA=; b=DHL2qRU1x4Xtwaof+Gy0JK7I50D3gviwZsPj79bBjiPZmuD4eDSbsX2MyFblo2cD3F Hl6Nn4E2HsQT+QsCAcxs0fO1O6XTNcVX+Tq5mFRcleTP7t/JLVZa+AMtToPqS7UQrwTb 43OyJzSYdJ0hji0sp9MYQP6TuK9b0bwvs3wz7ofVo8HoktHGok6rFEQdxrAUhJo4E0mj Qbp5pCB/q8BCdK+wC9ZSVJQOztXgSgN3lV8iUphzfDNWD7RPxg1I2LS+KURuEqcBUMfC Sbm6r7BzMypmFCvkmsIlCO5gO4J6aR9P3yy12BhQN+eK//IrOHDoBAAfQPIoXugC+Cc4 mSGQ== X-Gm-Message-State: APjAAAVIXAmHTgtKqs+/000cewOM/if3KfksQXS9OM2rbSJ4EmAPJs+6 OI5nyFmd69N+Rc9+cDGiKRI8kUhuQl4= X-Google-Smtp-Source: APXvYqyPGAEcqGXcr/iGF+fYeRI6v/cd6TRp0mxYiT3ZmRqmwdLY5OP41WTj/2BN0tAyXmumxpH7Iw== X-Received: by 2002:a17:902:b210:: with SMTP id t16mr8282405plr.84.1555745690904; Sat, 20 Apr 2019 00:34:50 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.49 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:07 -1000 Message-Id: <20190420073442.7488-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::644 Subject: [Qemu-devel] [PATCH 03/38] tcg: Return bool success from tcg_out_mov X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This patch merely changes the interface, aborting on all failures, of which there are currently none. Reviewed-by: David Gibson Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: David Hildenbrand --- tcg/aarch64/tcg-target.inc.c | 5 +++-- tcg/arm/tcg-target.inc.c | 7 +++++-- tcg/i386/tcg-target.inc.c | 5 +++-- tcg/mips/tcg-target.inc.c | 3 ++- tcg/ppc/tcg-target.inc.c | 3 ++- tcg/riscv/tcg-target.inc.c | 5 +++-- tcg/s390/tcg-target.inc.c | 3 ++- tcg/sparc/tcg-target.inc.c | 3 ++- tcg/tcg.c | 14 ++++++++++---- tcg/tci/tcg-target.inc.c | 3 ++- 10 files changed, 34 insertions(+), 17 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 8b93598bce..b2d3f9c0a5 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -938,10 +938,10 @@ static void tcg_out_ldst(TCGContext *s, AArch64Insn insn, TCGReg rd, tcg_out_ldst_r(s, insn, rd, rn, TCG_TYPE_I64, TCG_REG_TMP); } -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { if (ret == arg) { - return; + return true; } switch (type) { case TCG_TYPE_I32: @@ -970,6 +970,7 @@ static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) default: g_assert_not_reached(); } + return true; } static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, diff --git a/tcg/arm/tcg-target.inc.c b/tcg/arm/tcg-target.inc.c index 6873b0cf95..34e6652142 100644 --- a/tcg/arm/tcg-target.inc.c +++ b/tcg/arm/tcg-target.inc.c @@ -2275,10 +2275,13 @@ static inline bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, return false; } -static inline void tcg_out_mov(TCGContext *s, TCGType type, +static inline bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { - tcg_out_dat_reg(s, COND_AL, ARITH_MOV, ret, 0, arg, SHIFT_IMM_LSL(0)); + if (ret != arg) { + tcg_out_dat_reg(s, COND_AL, ARITH_MOV, ret, 0, arg, SHIFT_IMM_LSL(0)); + } + return true; } static inline void tcg_out_movi(TCGContext *s, TCGType type, diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 1fa833840e..817a167767 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -809,12 +809,12 @@ static inline void tgen_arithr(TCGContext *s, int subop, int dest, int src) tcg_out_modrm(s, OPC_ARITH_GvEv + (subop << 3) + ext, dest, src); } -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { int rexw = 0; if (arg == ret) { - return; + return true; } switch (type) { case TCG_TYPE_I64: @@ -852,6 +852,7 @@ static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) default: g_assert_not_reached(); } + return true; } static void tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, diff --git a/tcg/mips/tcg-target.inc.c b/tcg/mips/tcg-target.inc.c index 8a92e916dd..f31ebb43bf 100644 --- a/tcg/mips/tcg-target.inc.c +++ b/tcg/mips/tcg-target.inc.c @@ -558,13 +558,14 @@ static inline void tcg_out_dsra(TCGContext *s, TCGReg rd, TCGReg rt, TCGArg sa) tcg_out_opc_sa64(s, OPC_DSRA, OPC_DSRA32, rd, rt, sa); } -static inline void tcg_out_mov(TCGContext *s, TCGType type, +static inline bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { /* Simple reg-reg move, optimising out the 'do nothing' case */ if (ret != arg) { tcg_out_opc_reg(s, OPC_OR, ret, arg, TCG_REG_ZERO); } + return true; } static void tcg_out_movi(TCGContext *s, TCGType type, diff --git a/tcg/ppc/tcg-target.inc.c b/tcg/ppc/tcg-target.inc.c index 773690f1d9..ec8e336be8 100644 --- a/tcg/ppc/tcg-target.inc.c +++ b/tcg/ppc/tcg-target.inc.c @@ -566,12 +566,13 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type, static void tcg_out_mem_long(TCGContext *s, int opi, int opx, TCGReg rt, TCGReg base, tcg_target_long offset); -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { tcg_debug_assert(TCG_TARGET_REG_BITS == 64 || type == TCG_TYPE_I32); if (ret != arg) { tcg_out32(s, OR | SAB(arg, ret, arg)); } + return true; } static inline void tcg_out_rld(TCGContext *s, int op, TCGReg ra, TCGReg rs, diff --git a/tcg/riscv/tcg-target.inc.c b/tcg/riscv/tcg-target.inc.c index b785f4acb7..e2bf1c2c6e 100644 --- a/tcg/riscv/tcg-target.inc.c +++ b/tcg/riscv/tcg-target.inc.c @@ -515,10 +515,10 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type, * TCG intrinsics */ -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { if (ret == arg) { - return; + return true; } switch (type) { case TCG_TYPE_I32: @@ -528,6 +528,7 @@ static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) default: g_assert_not_reached(); } + return true; } static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd, diff --git a/tcg/s390/tcg-target.inc.c b/tcg/s390/tcg-target.inc.c index 7db90b3bae..eb22188d1d 100644 --- a/tcg/s390/tcg-target.inc.c +++ b/tcg/s390/tcg-target.inc.c @@ -548,7 +548,7 @@ static void tcg_out_sh32(TCGContext* s, S390Opcode op, TCGReg dest, tcg_out_insn_RS(s, op, dest, sh_reg, 0, sh_imm); } -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg dst, TCGReg src) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg dst, TCGReg src) { if (src != dst) { if (type == TCG_TYPE_I32) { @@ -557,6 +557,7 @@ static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg dst, TCGReg src) tcg_out_insn(s, RRE, LGR, dst, src); } } + return true; } static const S390Opcode lli_insns[4] = { diff --git a/tcg/sparc/tcg-target.inc.c b/tcg/sparc/tcg-target.inc.c index 7a61839dc1..83295955a7 100644 --- a/tcg/sparc/tcg-target.inc.c +++ b/tcg/sparc/tcg-target.inc.c @@ -407,12 +407,13 @@ static void tcg_out_arithc(TCGContext *s, TCGReg rd, TCGReg rs1, | (val2const ? INSN_IMM13(val2) : INSN_RS2(val2))); } -static inline void tcg_out_mov(TCGContext *s, TCGType type, +static inline bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { if (ret != arg) { tcg_out_arith(s, ret, arg, TCG_REG_G0, ARITH_OR); } + return true; } static inline void tcg_out_sethi(TCGContext *s, TCGReg ret, uint32_t arg) diff --git a/tcg/tcg.c b/tcg/tcg.c index 4f77a957b0..b083faacd2 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -102,7 +102,7 @@ static const char *target_parse_constraint(TCGArgConstraint *ct, const char *ct_str, TCGType type); static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg1, intptr_t arg2); -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg); +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg); static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg ret, tcg_target_long arg); static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, @@ -3372,7 +3372,9 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) allocated_regs, preferred_regs, ots->indirect_base); } - tcg_out_mov(s, otype, ots->reg, ts->reg); + if (!tcg_out_mov(s, otype, ots->reg, ts->reg)) { + abort(); + } } ots->val_type = TEMP_VAL_REG; ots->mem_coherent = 0; @@ -3472,7 +3474,9 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) i_allocated_regs, 0); reg = tcg_reg_alloc(s, arg_ct->u.regs, i_allocated_regs, o_preferred_regs, ts->indirect_base); - tcg_out_mov(s, ts->type, reg, ts->reg); + if (!tcg_out_mov(s, ts->type, reg, ts->reg)) { + abort(); + } } new_args[i] = reg; const_args[i] = 0; @@ -3629,7 +3633,9 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) if (ts->val_type == TEMP_VAL_REG) { if (ts->reg != reg) { tcg_reg_free(s, reg, allocated_regs); - tcg_out_mov(s, ts->type, reg, ts->reg); + if (!tcg_out_mov(s, ts->type, reg, ts->reg)) { + abort(); + } } } else { TCGRegSet arg_set = 0; diff --git a/tcg/tci/tcg-target.inc.c b/tcg/tci/tcg-target.inc.c index 0015a98485..992d50cb1e 100644 --- a/tcg/tci/tcg-target.inc.c +++ b/tcg/tci/tcg-target.inc.c @@ -509,7 +509,7 @@ static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg1, old_code_ptr[1] = s->code_ptr - old_code_ptr; } -static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) +static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) { uint8_t *old_code_ptr = s->code_ptr; tcg_debug_assert(ret != arg); @@ -521,6 +521,7 @@ static void tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) tcg_out_r(s, ret); tcg_out_r(s, arg); old_code_ptr[1] = s->code_ptr - old_code_ptr; + return true; } static void tcg_out_movi(TCGContext *s, TCGType type, From patchwork Sat Apr 20 07:34:08 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088304 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="jDXjecbu"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPsm1z6dz9s70 for ; Sat, 20 Apr 2019 17:41:15 +1000 (AEST) Received: from localhost ([127.0.0.1]:38145 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkcb-0004tN-LX for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:41:13 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40089) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWU-0008NH-VX for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:56 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWU-0007iN-0L for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:54 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:46119) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWT-0007gn-Qr for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:53 -0400 Received: by mail-pl1-x642.google.com with SMTP id o7so1289443pll.13 for ; Sat, 20 Apr 2019 00:34:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=+m09nvTdIxOMcWt67/JzzqfApcaXJMaZOV8+XsA1b8w=; b=jDXjecbufA8HCLc4BVGrGoN6GM1C/2OHWdaiV7aLES35e77AWrQWx7HZoK/ljV26P+ MBItbZtC+DszK3EgMQVElXCt9wnm/2TTEwAGgmRPS0wR5XYqItCQQygdHUIcitPH9hYL Pl6H4teYjzmKOYt52DE9gbH4DoraxRMIWhj9GkVnmGrJD8pKYLrA7GL25yvkPzmSNBHF R3D0isHI+q45sK0ZuZReA7LsdWkTopdNSaqwZkjgVLhw9PSENfAv/pll2n/VcRocUWts V746gSHcVEHa5CYdrNlPGBc5zci8MzpAkbEO3ME3/2+80iFXniebtJ0Mgu64L6LrZcI9 mYCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=+m09nvTdIxOMcWt67/JzzqfApcaXJMaZOV8+XsA1b8w=; b=HHuCHiKeIqdBcVbNFchj2aNi8ZobHz1hxR3bMwXjr5AV7/nRplkt+zB/0M333acop5 7euSbMjGlDkz/qbDPqB5bItwxE6ZpzIKfAaMoKjCN6+RJA+6aIebLd53YENVQLWcQUHD F6o6UPTRuLLrdcQjtVfhlznEiRA8s1dDZnmtx1HLCNS/I90XDRooVEqLNQL+PyGAtoTX 69dllwObpeDVRTPS06nVekP6V/vMRdFFqOJ4lyw5CsHpl6FOuux7ukpbkGCDhZFE29Fa csAr3nOiANbwmo9d5Ms1RgA1N5XH1jxptmVbNKYOQpZIAk9H9s2Zuy46mC4JE6ZLqLYH GvXA== X-Gm-Message-State: APjAAAWKxEnwf95bZugS0pN7W3SFW7t7huCWZe0ZqfW6PqJoRbqff1ap 5XLW/tru80ts6HqTRsqaEUoVgtXhIyA= X-Google-Smtp-Source: APXvYqwgICDAbAmi47nvRWVwTybzOeZH3KuzJSXS/Pd4M5xXCafuPz5OOQAZ0yk9/rQiiVvF7DMBjw== X-Received: by 2002:a17:902:b617:: with SMTP id b23mr7833382pls.73.1555745692591; Sat, 20 Apr 2019 00:34:52 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.51 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:08 -1000 Message-Id: <20190420073442.7488-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::642 Subject: [Qemu-devel] [PATCH 04/38] tcg: Support cross-class moves without instruction support X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" PowerPC Altivec does not support direct moves between vector registers and general registers. So when tcg_out_mov fails, we can use the backing memory for the temporary to perform the move. Signed-off-by: Richard Henderson Acked-by: David Hildenbrand --- tcg/tcg.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index b083faacd2..d3dcfe3dca 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3373,7 +3373,18 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) ots->indirect_base); } if (!tcg_out_mov(s, otype, ots->reg, ts->reg)) { - abort(); + /* Cross register class move not supported. + Store the source register into the destination slot + and leave the destination temp as TEMP_VAL_MEM. */ + assert(!ots->fixed_reg); + if (!ts->mem_allocated) { + temp_allocate_frame(s, ots); + } + tcg_out_st(s, ts->type, ts->reg, + ots->mem_base->reg, ots->mem_offset); + ots->mem_coherent = 1; + temp_free_or_dead(s, ots, -1); + return; } } ots->val_type = TEMP_VAL_REG; @@ -3475,7 +3486,11 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) reg = tcg_reg_alloc(s, arg_ct->u.regs, i_allocated_regs, o_preferred_regs, ts->indirect_base); if (!tcg_out_mov(s, ts->type, reg, ts->reg)) { - abort(); + /* Cross register class move not supported. Sync the + temp back to its slot and load from there. */ + temp_sync(s, ts, i_allocated_regs, 0, 0); + tcg_out_ld(s, ts->type, reg, + ts->mem_base->reg, ts->mem_offset); } } new_args[i] = reg; @@ -3634,7 +3649,11 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op) if (ts->reg != reg) { tcg_reg_free(s, reg, allocated_regs); if (!tcg_out_mov(s, ts->type, reg, ts->reg)) { - abort(); + /* Cross register class move not supported. Sync the + temp back to its slot and load from there. */ + temp_sync(s, ts, allocated_regs, 0, 0); + tcg_out_ld(s, ts->type, reg, + ts->mem_base->reg, ts->mem_offset); } } } else { From patchwork Sat Apr 20 07:34:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088299 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="AJfzaWwT"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPlj2HlPz9s4Y for ; Sat, 20 Apr 2019 17:36:01 +1000 (AEST) Received: from localhost ([127.0.0.1]:38039 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXX-0008Qz-2Q for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40111) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWW-0008Og-P6 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWV-0007mj-LP for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:56 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:45262) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWV-0007kM-FN for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:55 -0400 Received: by mail-pl1-x642.google.com with SMTP id bf11so3528508plb.12 for ; Sat, 20 Apr 2019 00:34:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TM1gXfRu5rNa8H31jQ9LQek5zgysMvrWVqkLd98KzZU=; b=AJfzaWwT/c1toDadjaTeZ2pixHROHU4oc3oo4/BGxbvxkiyGuLmXOL0++OHt+hPyCb By3Lkz4pkSy2Y1ONCf8ssYxoILb8SXkqZG2iqMhYsAlYLhcWDK7ypWDHiAQsJf22/LXd XyC+NBZcPJymFFhTJo6peA2gCq/4FuxbnB0o4UovjM3W/ptssLVwdcekT4KWRJQSxDz2 emLga+LETqVrGRy7610NoPOwTC8hb+phxZ/EnvOwVmbSg/U3YjJvD1l1/oSl5VDXy9Og fPRZz2RZCdLx8OWnKc6Fgjr5tzRHiFj4tA3pZmysfAEjSnAwc6cUc21jSCl4kc8+aeGQ W6vg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TM1gXfRu5rNa8H31jQ9LQek5zgysMvrWVqkLd98KzZU=; b=gslMYHa2/W7KWNDswUAJsl7IMLdic+Nd+k0N37efbIxrLTvLbSRPrJtGTrjsTO9YnF xk28oKQ8BXisZaFL5SOY3IgSNu+M+tZrMCHWCqNSx+qPzPNYUZ8/RUkanawimwJow48a zhrbHS7qP/MJYUuu4wjembp5E4BHwE6/sHJWyHB9dZh6W2wvJXs5L7G/g3B0XsYAmr7P 2NpWeVBIC/Yd/mB667IZWqnkCFYB/n+yei3NWCci8flemwGewY7i5BmXO2X66gHhM8Jh LgCEPf1f17OQDYq8IJ+BBHLFLvHdzzTlcllyGNM+0XXhzndTQl4456tbZ0toiWsjsGxJ heIA== X-Gm-Message-State: APjAAAUcPNezzs37VSA8+TstEgPSi2otViBxfAWII4oxzyRH1sTwOLFA w1ymAs/hQRiE7gP1hA/rUUu0LL42B4Q= X-Google-Smtp-Source: APXvYqzJu5wYCuPYM1LLXdhrBYhqAzGZPaeGWgx0JdF4jeOIhiCEU17hQybHbqqHTFkw7ai5sE18JA== X-Received: by 2002:a17:902:3e5:: with SMTP id d92mr8124255pld.11.1555745694257; Sat, 20 Apr 2019 00:34:54 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.52 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:09 -1000 Message-Id: <20190420073442.7488-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::642 Subject: [Qemu-devel] [PATCH 05/38] tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: David Hildenbrand --- tcg/tcg-op-vec.c | 49 ++++++++++++++++++++++++++++++++---------------- 1 file changed, 33 insertions(+), 16 deletions(-) diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 27f65600c3..cfb18682b1 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -226,16 +226,6 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr b, TCGArg o, TCGType low_type) vec_gen_3(INDEX_op_st_vec, low_type, 0, ri, bi, o); } -void tcg_gen_add_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) -{ - vec_gen_op3(INDEX_op_add_vec, vece, r, a, b); -} - -void tcg_gen_sub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) -{ - vec_gen_op3(INDEX_op_sub_vec, vece, r, a, b); -} - void tcg_gen_and_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { vec_gen_op3(INDEX_op_and_vec, 0, r, a, b); @@ -296,11 +286,30 @@ void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) tcg_gen_not_vec(0, r, r); } +static bool do_op2(unsigned vece, TCGv_vec r, TCGv_vec a, TCGOpcode opc) +{ + TCGTemp *rt = tcgv_vec_temp(r); + TCGTemp *at = tcgv_vec_temp(a); + TCGArg ri = temp_arg(rt); + TCGArg ai = temp_arg(at); + TCGType type = rt->base_type; + int can; + + tcg_debug_assert(at->base_type >= type); + can = tcg_can_emit_vec_op(opc, type, vece); + if (can > 0) { + vec_gen_2(opc, type, vece, ri, ai); + } else if (can < 0) { + tcg_expand_vec_op(opc, type, vece, ri, ai); + } else { + return false; + } + return true; +} + void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a) { - if (TCG_TARGET_HAS_not_vec) { - vec_gen_op2(INDEX_op_not_vec, 0, r, a); - } else { + if (!TCG_TARGET_HAS_not_vec || !do_op2(vece, r, a, INDEX_op_not_vec)) { TCGv_vec t = tcg_const_ones_vec_matching(r); tcg_gen_xor_vec(0, r, a, t); tcg_temp_free_vec(t); @@ -309,9 +318,7 @@ void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a) void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a) { - if (TCG_TARGET_HAS_neg_vec) { - vec_gen_op2(INDEX_op_neg_vec, vece, r, a); - } else { + if (!TCG_TARGET_HAS_neg_vec || !do_op2(vece, r, a, INDEX_op_neg_vec)) { TCGv_vec t = tcg_const_zeros_vec_matching(r); tcg_gen_sub_vec(vece, r, t, a); tcg_temp_free_vec(t); @@ -409,6 +416,16 @@ static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, } } +void tcg_gen_add_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_add_vec); +} + +void tcg_gen_sub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_sub_vec); +} + void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { do_op3(vece, r, a, b, INDEX_op_mul_vec); From patchwork Sat Apr 20 07:34:10 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088302 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="iyv87db6"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPpw2wjmz9s6w for ; Sat, 20 Apr 2019 17:38:48 +1000 (AEST) Received: from localhost ([127.0.0.1]:38096 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkaE-0002ir-8x for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:38:46 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40125) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWY-0008Q6-Dg for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWX-0007qX-91 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:58 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:33399) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWX-0007pt-2f for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:57 -0400 Received: by mail-pg1-x541.google.com with SMTP id k19so3604414pgh.0 for ; Sat, 20 Apr 2019 00:34:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TZV1ulrf0nOxaaoYwAiOAuzczUUGNeQajyWxzhn5VlI=; b=iyv87db64nfsC1gEQSDC8yrSzILDZA7ZB7wPOCOJJ6CfpFNBScbUSazRXyB9csI+3L YJPFGeumcLYsFggJLxY7mg46ddZxLyobp6DvC2A1lr/ls6juKYnzL9hquANvcxJaTjAJ NR0ygqM9FEzXHBcQLL1MAT9Mak23+RPn/uBea6h3zdHN4XJBXJGCD8atYPwd6rRstbZ3 AWL05mEigLtJzkeh83kbcelAASh43SnnTDR7U9jdHtEai8X+IX5fVy2hIzckQhb5jSxY nLBsN4RA+5qOJlHK+UhvBc7PEyFGe12cRnbwgk/HEJXDNRBSHHu2rdb1lpVfXJzXh/Sw KxfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TZV1ulrf0nOxaaoYwAiOAuzczUUGNeQajyWxzhn5VlI=; b=W5kvcgPPC3EkvCqsYHWsiffmiMYWoQzY0eiHvj7wX3aLO697Q/MShqdIqsgbxMT0LO GsA3qs3tFgBsKLW6GsuGAwfnBn5UelyOHuOQRkVjH2NM7BIZ/hKV8h0H3KJTXwGSUKiC tAA9xbAt/BJPZJvlhG4fErS9IrhrMEn9dGbUaL7P0hhBzeA1fFgCVX8DR6HeyRB0Sh4B PuYxbYWFYAN0N2nVPDbniby4q3zN8T1Zv7GJmO0Z/UHF59xK1tZy7HxNGfwJ5uQXYR6J 3H05K1bDlXzjyg0cTI5u52zPHlFAg7gmj3vB95q6DeDhqs0dutjXd9r5SLSWuR7fc7GI JF0w== X-Gm-Message-State: APjAAAUp8SJnJmPnf2lM3hC4G+gs/UqbBhZK/drcG7P97by5kxXVcMUH n5elAHq11e1a82Gmz1XwNFA9zN5Uaxc= X-Google-Smtp-Source: APXvYqzWQznkqmK5LpdsrdQJj1mYSuoAumAQOmW7bDNhBHV/R/nCWn8cMwumJisPa4hvJyuWzEekYQ== X-Received: by 2002:aa7:8083:: with SMTP id v3mr8680326pff.135.1555745695732; Sat, 20 Apr 2019 00:34:55 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.54 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:10 -1000 Message-Id: <20190420073442.7488-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 Subject: [Qemu-devel] [PATCH 06/38] tcg: Promote tcg_out_{dup, dupi}_vec to backend interface X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The i386 backend already has these functions, and the aarch64 backend could easily split out one. Nothing is done with these functions yet, but this will aid register allocation of INDEX_op_dup_vec in a later patch. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 12 ++++++++++-- tcg/i386/tcg-target.inc.c | 3 ++- tcg/tcg.c | 14 ++++++++++++++ 3 files changed, 26 insertions(+), 3 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index b2d3f9c0a5..116ebd8c1a 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -799,7 +799,7 @@ static void tcg_out_logicali(TCGContext *s, AArch64Insn insn, TCGType ext, } static void tcg_out_dupi_vec(TCGContext *s, TCGType type, - TCGReg rd, uint64_t v64) + TCGReg rd, tcg_target_long v64) { int op, cmode, imm8; @@ -814,6 +814,14 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, } } +static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg rd, TCGReg rs) +{ + int is_q = type - TCG_TYPE_V64; + tcg_out_insn(s, 3605, DUP, is_q, rd, rs, 1 << vece, 0); + return true; +} + static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd, tcg_target_long value) { @@ -2197,7 +2205,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1); break; case INDEX_op_dup_vec: - tcg_out_insn(s, 3605, DUP, is_q, a0, a1, 1 << vece, 0); + tcg_out_dup_vec(s, type, vece, a0, a1); break; case INDEX_op_shli_vec: tcg_out_insn(s, 3614, SHL, is_q, a0, a1, a2 + (8 << vece)); diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 817a167767..04e3d37b05 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -855,7 +855,7 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) return true; } -static void tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, +static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg r, TCGReg a) { if (have_avx2) { @@ -888,6 +888,7 @@ static void tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, g_assert_not_reached(); } } + return true; } static void tcg_out_dupi_vec(TCGContext *s, TCGType type, diff --git a/tcg/tcg.c b/tcg/tcg.c index d3dcfe3dca..5ed9c7bee5 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -108,10 +108,24 @@ static void tcg_out_movi(TCGContext *s, TCGType type, static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, const int *const_args); #if TCG_TARGET_MAYBE_vec +static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg dst, TCGReg src); +static void tcg_out_dupi_vec(TCGContext *s, TCGType type, + TCGReg dst, tcg_target_long arg); static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg *args, const int *const_args); #else +static inline bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg dst, TCGReg src) +{ + g_assert_not_reached(); +} +static inline void tcg_out_dupi_vec(TCGContext *s, TCGType type, + TCGReg dst, tcg_target_long arg) +{ + g_assert_not_reached(); +} static inline void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg *args, const int *const_args) From patchwork Sat Apr 20 07:34:11 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088300 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="gyeWXKNd"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPpg6nMRz9s4Y for ; Sat, 20 Apr 2019 17:38:35 +1000 (AEST) Received: from localhost ([127.0.0.1]:38085 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHka1-0002XY-PK for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:38:33 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40139) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWa-0008Rq-2c for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWY-0007sV-QQ for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:00 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:46016) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWY-0007qt-I1 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:58 -0400 Received: by mail-pg1-x529.google.com with SMTP id y3so3581178pgk.12 for ; Sat, 20 Apr 2019 00:34:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=5tICn7CJRIM5TsWJu9Zu04ZhN4ShdN6K4BgFJ/iHjvI=; b=gyeWXKNd47Lw9/vspx+4SLp9yMQf6gFNpps3XE1+2CSWfG95+eELdI3py9Nr4d+2Zl Cy1k+UADWGTu+txb2Q4nmSLSB46krKAoR/WQR6iRa3ck56SOZaCJ6e8nWZ96xnLOTy63 pBtG08KKAyV52omzEE/OUezc6RGAFjfLmnUQyPlacAQ+2lz4Ke0bvs2XNeFNC5oFVjIN L0juAe5BJu41Tk05AQs1OpIvNV3so5m2hgthPhrx3CftQubMxlid7W0CfjssQofW2Pez DnJHxaUVUuhjIB+byio32hyBsLl27BmMpNEfdmTROSYOumq1MYE4upFzWLhw1bekBgFs mZYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=5tICn7CJRIM5TsWJu9Zu04ZhN4ShdN6K4BgFJ/iHjvI=; b=mGg+SyTW1rFyufEc4vJQ18QpqrnWG8XB8AxopVxFNVOPibiqx6LPA/ykx0WUX1MJeO IMQCACpGPs22///FHfHX5hAit1Z8frIB8nKdNEb2lvTbnocI493LUyhdTHxsH9OBl7bW JdyutXhOE/VnJoAHuoSS2h4ejaEzYnMsZeeyhabYlm6xju/zKnIJAaRUfikLehCtD/8E 0FwvmndabNJQJgMpu+X9mr6C/avh7coiYa5+ImbSySfSCELeGadz5LITQ5Ep2aXqNXGO KmuPTeK7dc2TCoqEQWUuj1llruIXPNZhXaUKzZQPT1do1bong7Ck6MjCznq9lxRUj6d7 85hg== X-Gm-Message-State: APjAAAUZtVu5fB4V8TQiobWh1+kaCO16mAe773FdXdkWbIS0hD2VWMhc 47BqEsuqfkKUdGzS6/ao7Jt58cpmXAw= X-Google-Smtp-Source: APXvYqyndRCS05rMAVr1xthSCrLDJdJ4OqU1sMl60acY2ksOY+A9016oaPGtzg3oP290PG1/QfmFSA== X-Received: by 2002:a62:448d:: with SMTP id m13mr8719534pfi.182.1555745697179; Sat, 20 Apr 2019 00:34:57 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.55 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:11 -1000 Message-Id: <20190420073442.7488-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::529 Subject: [Qemu-devel] [PATCH 07/38] tcg: Manually expand INDEX_op_dup_vec X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This case is similar to INDEX_op_mov_* in that we need to do different things depending on the current location of the source. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 9 ++-- tcg/i386/tcg-target.inc.c | 8 ++- tcg/tcg.c | 102 +++++++++++++++++++++++++++++++++++ 3 files changed, 109 insertions(+), 10 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 116ebd8c1a..b272822969 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -2104,10 +2104,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */ case INDEX_op_mov_i64: - case INDEX_op_mov_vec: case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi. */ case INDEX_op_movi_i64: - case INDEX_op_dupi_vec: case INDEX_op_call: /* Always emitted via tcg_out_call. */ default: g_assert_not_reached(); @@ -2204,9 +2202,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_not_vec: tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1); break; - case INDEX_op_dup_vec: - tcg_out_dup_vec(s, type, vece, a0, a1); - break; case INDEX_op_shli_vec: tcg_out_insn(s, 3614, SHL, is_q, a0, a1, a2 + (8 << vece)); break; @@ -2250,6 +2245,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, } } break; + + case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */ + case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */ + case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */ default: g_assert_not_reached(); } diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 04e3d37b05..49691c4f56 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -2601,10 +2601,8 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, break; case INDEX_op_mov_i32: /* Always emitted via tcg_out_mov. */ case INDEX_op_mov_i64: - case INDEX_op_mov_vec: case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi. */ case INDEX_op_movi_i64: - case INDEX_op_dupi_vec: case INDEX_op_call: /* Always emitted via tcg_out_call. */ default: tcg_abort(); @@ -2793,9 +2791,6 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_st_vec: tcg_out_st(s, type, a0, a1, a2); break; - case INDEX_op_dup_vec: - tcg_out_dup_vec(s, type, vece, a0, a1); - break; case INDEX_op_x86_shufps_vec: insn = OPC_SHUFPS; @@ -2837,6 +2832,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out8(s, a2); break; + case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */ + case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */ + case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */ default: g_assert_not_reached(); } diff --git a/tcg/tcg.c b/tcg/tcg.c index 5ed9c7bee5..55498b63d7 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -3410,6 +3410,105 @@ static void tcg_reg_alloc_mov(TCGContext *s, const TCGOp *op) } } +static void tcg_reg_alloc_dup(TCGContext *s, const TCGOp *op) +{ + const TCGLifeData arg_life = op->life; + TCGRegSet dup_out_regs, dup_in_regs; + TCGTemp *its, *ots; + TCGType itype, vtype; + unsigned vece; + bool ok; + + ots = arg_temp(op->args[0]); + its = arg_temp(op->args[1]); + + /* There should be no fixed vector registers. */ + tcg_debug_assert(!ots->fixed_reg); + + itype = its->type; + vece = TCGOP_VECE(op); + vtype = TCGOP_VECL(op) + TCG_TYPE_V64; + + if (its->val_type == TEMP_VAL_CONST) { + /* Propagate constant via movi -> dupi. */ + tcg_target_ulong val = its->val; + if (IS_DEAD_ARG(1)) { + temp_dead(s, its); + } + tcg_reg_alloc_do_movi(s, ots, val, arg_life, op->output_pref[0]); + return; + } + + dup_out_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[0].u.regs; + dup_in_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[1].u.regs; + + /* Allocate the output register now. */ + if (ots->val_type != TEMP_VAL_REG) { + TCGRegSet allocated_regs = s->reserved_regs; + + if (!IS_DEAD_ARG(1) && its->val_type == TEMP_VAL_REG) { + /* Make sure to not spill the input register. */ + tcg_regset_set_reg(allocated_regs, its->reg); + } + ots->reg = tcg_reg_alloc(s, dup_out_regs, allocated_regs, + op->output_pref[0], ots->indirect_base); + ots->val_type = TEMP_VAL_REG; + ots->mem_coherent = 0; + s->reg_to_temp[ots->reg] = ots; + } + + switch (its->val_type) { + case TEMP_VAL_REG: + /* + * The dup constriaints must be broad, covering all possible VECE. + * However, tcg_op_dup_vec() gets to see the VECE and we allow it + * to fail, indicating that extra moves are required for that case. + */ + if (tcg_regset_test_reg(dup_in_regs, its->reg)) { + if (tcg_out_dup_vec(s, vtype, vece, ots->reg, its->reg)) { + goto done; + } + /* Try again from memory or a vector input register. */ + } + if (!its->mem_coherent) { + /* + * The input register is not synced, and so an extra store + * would be required to use memory. Attempt an integer-vector + * register move first. We do not have a TCGRegSet for this. + */ + if (tcg_out_mov(s, itype, ots->reg, its->reg)) { + break; + } + /* Sync the temp back to its slot and load from there. */ + temp_sync(s, its, s->reserved_regs, 0, 0); + } + /* fall through */ + + case TEMP_VAL_MEM: + /* TODO: dup from memory */ + tcg_out_ld(s, itype, ots->reg, its->mem_base->reg, its->mem_offset); + break; + + default: + g_assert_not_reached(); + } + + /* We now have a vector input register, so dup must succeed. */ + ok = tcg_out_dup_vec(s, vtype, vece, ots->reg, ots->reg); + tcg_debug_assert(ok); + + done: + if (IS_DEAD_ARG(1)) { + temp_dead(s, its); + } + if (NEED_SYNC_ARG(0)) { + temp_sync(s, ots, s->reserved_regs, 0, 0); + } + if (IS_DEAD_ARG(0)) { + temp_dead(s, ots); + } +} + static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op) { const TCGLifeData arg_life = op->life; @@ -3978,6 +4077,9 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb) case INDEX_op_dupi_vec: tcg_reg_alloc_movi(s, op); break; + case INDEX_op_dup_vec: + tcg_reg_alloc_dup(s, op); + break; case INDEX_op_insn_start: if (num_insns >= 0) { size_t off = tcg_current_code_size(s); From patchwork Sat Apr 20 07:34:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088303 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Z0NxmqD6"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPpz3M2rz9s4Y for ; Sat, 20 Apr 2019 17:38:51 +1000 (AEST) Received: from localhost ([127.0.0.1]:38098 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkaH-0002ko-CT for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:38:49 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40147) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWa-0008Sh-W4 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWa-0007tf-00 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:00 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:35514) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWZ-0007tJ-Qa for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:34:59 -0400 Received: by mail-pg1-x544.google.com with SMTP id g8so3601647pgf.2 for ; Sat, 20 Apr 2019 00:34:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ZZYr6QJr2rUipHfB54gIFVWp7UFMrAAprd2zd5BHTXI=; b=Z0NxmqD6VG5KXqTJ8B5iyjXs4h3dlY11z9YNVik5iJZ8vdHY0SOnePGvoqRFdcdSCs p5IFzWFmitcm6K1mmCLPQ81Ol0vTOkAkadVydY//pt26mlGOMfjo2lljEZsDn8duYQ05 tPq532T8STQLuzLPsXiHRw4nG6VgysGJvQ3lHIPx88Ww3guEjH64LcqeOfEaNWt7G2nG nSZVqB6bsE9MMjFqojZ/NVi1BXZhKBzopbzAx3ZyonWGmbEjgvqg9xLA4aSHXBumlBQQ rFv4AF4XTwwbutRIAnweKxDS6RFRWiCBgc+BTOch0jQkzbbmyQiwz6r+8c+vd3owZ+ul fO5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ZZYr6QJr2rUipHfB54gIFVWp7UFMrAAprd2zd5BHTXI=; b=i+t2KDvRVulOutwlzDawxafBlzKr7GQv3ZJj0GNcJm2qTTyr4a7JzTCvN2psBZ/tub FN629pd9tsr3i4pmmOAEjmzHkWFmyle82/9CsX2bcHUnT6/hhzCZJBgI1IF8D9pr5532 8vooS+0KOZTky5jdNX1UWzyQN+Dty3E5cluHH+VVH6fj+FSiqnsAg98iICGHGeBG1wmY 83d538nkgYeeLZi3FUqbXCa0PpIRPMuJCEO18WqSs86tdm6VvdrISqxPc1443DBEvsnf znAgi0/RUDXfv+GarXBPzJ5IKW6JKT2b0cLMLNkQMyD8sLvJjP7d9JEAh621fPvbc9jc tXPQ== X-Gm-Message-State: APjAAAWvdqBUCblr4Co3phmCLbYZOoWgKf2MzOL+rO7tKliaSFuYNkBG BvIp/BcUXiOFcdUu3tOGYKPt/EvwpxQ= X-Google-Smtp-Source: APXvYqyxojhLTh24tODFIZ2D9N5t8fDQxKO38/MWMHWfdiKY/feucXVKjeDxRiBPwlrjmH0QRJ3i7Q== X-Received: by 2002:a62:1b8a:: with SMTP id b132mr8381430pfb.19.1555745698634; Sat, 20 Apr 2019 00:34:58 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.57 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:12 -1000 Message-Id: <20190420073442.7488-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH 08/38] tcg: Add tcg_out_dupm_vec to the backend interface X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Currently stubbed out in all backends that support vectors. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 6 ++++++ tcg/i386/tcg-target.inc.c | 7 +++++++ tcg/tcg.c | 19 ++++++++++++++++++- 3 files changed, 31 insertions(+), 1 deletion(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index b272822969..3f95930e88 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -822,6 +822,12 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, return true; } +static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg r, TCGReg base, intptr_t offset) +{ + return false; +} + static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd, tcg_target_long value) { diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 49691c4f56..0be7d8f589 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -891,6 +891,13 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, return true; } +static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg r, TCGReg base, intptr_t offset) +{ + return false; +} + + static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret, tcg_target_long arg) { diff --git a/tcg/tcg.c b/tcg/tcg.c index 55498b63d7..1c34c08791 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -110,6 +110,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, #if TCG_TARGET_MAYBE_vec static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg dst, TCGReg src); +static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg dst, TCGReg base, intptr_t offset); static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg dst, tcg_target_long arg); static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, @@ -121,6 +123,11 @@ static inline bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, { g_assert_not_reached(); } +static inline bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, + TCGReg dst, TCGReg base, intptr_t offset) +{ + g_assert_not_reached(); +} static inline void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg dst, tcg_target_long arg) { @@ -3416,6 +3423,7 @@ static void tcg_reg_alloc_dup(TCGContext *s, const TCGOp *op) TCGRegSet dup_out_regs, dup_in_regs; TCGTemp *its, *ots; TCGType itype, vtype; + intptr_t endian_fixup; unsigned vece; bool ok; @@ -3485,7 +3493,16 @@ static void tcg_reg_alloc_dup(TCGContext *s, const TCGOp *op) /* fall through */ case TEMP_VAL_MEM: - /* TODO: dup from memory */ +#ifdef HOST_WORDS_BIGENDIAN + endian_fixup = itype == TCG_TYPE_I32 ? 4 : 8; + endian_fixup -= 1 << vece; +#else + endian_fixup = 0; +#endif + if (tcg_out_dupm_vec(s, vtype, vece, ots->reg, its->mem_base->reg, + its->mem_offset + endian_fixup)) { + goto done; + } tcg_out_ld(s, itype, ots->reg, its->mem_base->reg, its->mem_offset); break; From patchwork Sat Apr 20 07:34:13 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088311 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="a2Fa43b5"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPx35cNgz9s6w for ; Sat, 20 Apr 2019 17:44:07 +1000 (AEST) Received: from localhost ([127.0.0.1]:38176 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkfN-0007Xa-N0 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:44:05 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40163) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWc-0008Up-SI for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWb-0007wj-LX for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:02 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:38049) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWb-0007ud-EZ for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:01 -0400 Received: by mail-pf1-x442.google.com with SMTP id 10so3468870pfo.5 for ; Sat, 20 Apr 2019 00:35:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gN4B/Stb2lbgTIviKYPnpm58ErCGyH4cO3TJuIucX7A=; b=a2Fa43b5mKhdlmgzqumM14WhxwTkr7Duf/AZDdGRGE/bEwnGliZSOVhwrLWiwVquLL +fpLjlHf+T90QfrG6VpgMKC3RogA6sWXkDziwWQUgeZ/UcEEHgRR1eaHyzX8FcP38lp+ fg03cDQvKS8iBfXtMi/hkDXz317yNWj0CJcmegsOQJek+l2tGKJBZGuh1WtwXzqJ2Nuw jZBWCQ/lnHB7ykrQRHRyvqxlvqE5Y16kuU1bElMaSx3J4VrBl8/ZY9o5TJqkOTQU9mio d1Sg7b+ZQ58qApKBYnGMmA9yZb1Ndy3Jerc2bGBE6Z3AUyiw+/6OTFfgApqt7gsUU9Ta xq/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gN4B/Stb2lbgTIviKYPnpm58ErCGyH4cO3TJuIucX7A=; b=CroBBuQ73F6/Fl3bIWEgLA4iFAN9iYCGJh/VlIX7/rH8Luf1Jp+S64BC98IOcJYvwI 4hrGMhur1ZJA/4jZwUnj50xFPIVkLV1r+zgBtFyQDpe2b04A2tXfCENTfTGtmuTGaXjX 4e2InHSf3dGUewrnJmOEhDjB2fQjSaPHkelLkhKB53XFVvcs/lKcdNJlbcbF65KHXHQs e6yjvQq6YpQQWhL3aLBimIFw8xcq5fh3q2zUUcS6dqmcQrBnijb5NmOkYuOEGO+aZ+rr xnszC7nGuzb8S9Qf05JbXSvoxodbNGudc2T/I66ICvsHrS3icoHOm8fFS7hRy91dZlFd wyDA== X-Gm-Message-State: APjAAAXYn1fe5wQOLIpXjy/4I6KdjTj6erOLeh7Ewksqrt05RPSbP3WD 8vS4bH2pDJiCbbMhryWjfLno9oF2aO0= X-Google-Smtp-Source: APXvYqzAf1iLB0UbzyX75pVo4KTLT1xpYxFdL0uksOPzEKHFprbIkRiESb7rv//L3V7v1v2rUeyIcA== X-Received: by 2002:a63:6f0a:: with SMTP id k10mr8017418pgc.78.1555745700138; Sat, 20 Apr 2019 00:35:00 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.34.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:34:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:13 -1000 Message-Id: <20190420073442.7488-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::442 Subject: [Qemu-devel] [PATCH 09/38] tcg/i386: Implement tcg_out_dupm_vec X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" At the same time, improve tcg_out_dupi_vec wrt broadcast from the constant pool. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.inc.c | 57 +++++++++++++++++++++++++++++---------- 1 file changed, 43 insertions(+), 14 deletions(-) diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 0be7d8f589..fcabc1bdf2 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -358,7 +358,6 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_MOVBE_MyGy (0xf1 | P_EXT38) #define OPC_MOVD_VyEy (0x6e | P_EXT | P_DATA16) #define OPC_MOVD_EyVy (0x7e | P_EXT | P_DATA16) -#define OPC_MOVDDUP (0x12 | P_EXT | P_SIMDF2) #define OPC_MOVDQA_VxWx (0x6f | P_EXT | P_DATA16) #define OPC_MOVDQA_WxVx (0x7f | P_EXT | P_DATA16) #define OPC_MOVDQU_VxWx (0x6f | P_EXT | P_SIMDF3) @@ -458,6 +457,10 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_UD2 (0x0b | P_EXT) #define OPC_VPBLENDD (0x02 | P_EXT3A | P_DATA16) #define OPC_VPBLENDVB (0x4c | P_EXT3A | P_DATA16) +#define OPC_VPINSRB (0x20 | P_EXT3A | P_DATA16) +#define OPC_VPINSRW (0xc4 | P_EXT | P_DATA16) +#define OPC_VBROADCASTSS (0x18 | P_EXT38 | P_DATA16) +#define OPC_VBROADCASTSD (0x19 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTB (0x78 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTW (0x79 | P_EXT38 | P_DATA16) #define OPC_VPBROADCASTD (0x58 | P_EXT38 | P_DATA16) @@ -855,16 +858,17 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg) return true; } +static const int avx2_dup_insn[4] = { + OPC_VPBROADCASTB, OPC_VPBROADCASTW, + OPC_VPBROADCASTD, OPC_VPBROADCASTQ, +}; + static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg r, TCGReg a) { if (have_avx2) { - static const int dup_insn[4] = { - OPC_VPBROADCASTB, OPC_VPBROADCASTW, - OPC_VPBROADCASTD, OPC_VPBROADCASTQ, - }; int vex_l = (type == TCG_TYPE_V256 ? P_VEXL : 0); - tcg_out_vex_modrm(s, dup_insn[vece] + vex_l, r, 0, a); + tcg_out_vex_modrm(s, avx2_dup_insn[vece] + vex_l, r, 0, a); } else { switch (vece) { case MO_8: @@ -894,10 +898,35 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg r, TCGReg base, intptr_t offset) { - return false; + if (have_avx2) { + int vex_l = (type == TCG_TYPE_V256 ? P_VEXL : 0); + tcg_out_vex_modrm_offset(s, avx2_dup_insn[vece] + vex_l, + r, 0, base, offset); + } else { + switch (vece) { + case MO_64: + tcg_out_vex_modrm_offset(s, OPC_VBROADCASTSD, r, 0, base, offset); + break; + case MO_32: + tcg_out_vex_modrm_offset(s, OPC_VBROADCASTSS, r, 0, base, offset); + break; + case MO_16: + tcg_out_vex_modrm_offset(s, OPC_VPINSRW, r, r, base, offset); + tcg_out8(s, 0); /* imm8 */ + tcg_out_dup_vec(s, type, vece, r, r); + break; + case MO_8: + tcg_out_vex_modrm_offset(s, OPC_VPINSRB, r, r, base, offset); + tcg_out8(s, 0); /* imm8 */ + tcg_out_dup_vec(s, type, vece, r, r); + break; + default: + g_assert_not_reached(); + } + } + return true; } - static void tcg_out_dupi_vec(TCGContext *s, TCGType type, TCGReg ret, tcg_target_long arg) { @@ -918,16 +947,16 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, } else if (have_avx2) { tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTQ + vex_l, ret); } else { - tcg_out_vex_modrm_pool(s, OPC_MOVDDUP, ret); + tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSD, ret); } new_pool_label(s, arg, R_386_PC32, s->code_ptr - 4, -4); - } else if (have_avx2) { - tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTD + vex_l, ret); - new_pool_label(s, arg, R_386_32, s->code_ptr - 4, 0); } else { - tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy, ret); + if (have_avx2) { + tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSD + vex_l, ret); + } else { + tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSS, ret); + } new_pool_label(s, arg, R_386_32, s->code_ptr - 4, 0); - tcg_out_dup_vec(s, type, MO_32, ret, ret); } } From patchwork Sat Apr 20 07:34:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088306 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="uLUVCTrQ"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPt02n38z9s4Y for ; Sat, 20 Apr 2019 17:41:28 +1000 (AEST) Received: from localhost ([127.0.0.1]:38152 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkco-00055V-90 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:41:26 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40182) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWe-0008Uq-6I for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWd-0007yl-50 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:04 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:38606) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWc-0007yB-VH for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:03 -0400 Received: by mail-pl1-x644.google.com with SMTP id f36so3538773plb.5 for ; Sat, 20 Apr 2019 00:35:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pV+13kF6P9BlfNwiU+pC3oUPKYg5eIexnBRNkKkWSQQ=; b=uLUVCTrQ9EuLTgdoWbM/n6PWVNrSAqioVlwSLoLTzuUJ4ojaMx3UXAcnnFp5Nwrdnr yNnZNq8/eNmLsRGSkdtx2hVBc6ZnYiahmDI3DwRaSJcoudNNv6ztCwWvSVnOxKkGLZTY b6M389lWgGmIASDDDb6G/NO/hXTagkS05iXrkdiDu9d/uS9FT/v5FhR9OP2STjPhlNgh 6i+pbwh78p0T6lu84yeUbwIJMwkKvDl+ouREVYMEuUMPzKIUwtpzPvwFxN97nCScxm6J b6VJe8Bt1VtPi0jycqOoev3d8CFC6sAMETXdQpAFLvIw5Mb4nVkXRHoAa6z3s8h2RCZu RrnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pV+13kF6P9BlfNwiU+pC3oUPKYg5eIexnBRNkKkWSQQ=; b=iIzkuzbYvMWUM7Xys3o0nz+NpqHPagZEf3S+wpY4PCmCaq2YznBc/oloVQT5igHGNm MWflFemRZGC5W/At5lNHrbo+2dl6lFDWzf3h96+aOFUtKy4CCtwIft3tBOnfqmOMLzG2 gBJOyPX9XmLQJke8kiN5U7OOhhgqc4qLL9VbrrGb6UiYkxkrEXdkad+Z6E869Ta59RjJ T9T89ODqs3NNmx3bfODx2WQRWN4xo+g5hypXrUox/zstWmDvz2loFHz8JU8puRKmCtwv bOH1cpP7swG/Q1yDXa8w9SBH23lCP7GiAusUvSqasNIaXYJQxLaXEJXyXvlfprCY1hpf ROFQ== X-Gm-Message-State: APjAAAWqczx3XyiyYNaPFDLWvNpfXVxCzPmCw+dqfVWb+1FgZFFGqVav tL0ZfcB39M7Su51iWZDFNbyQNNZM7xo= X-Google-Smtp-Source: APXvYqyoQD03Wg4ifoS7FrNUqFNEq27xyKfRvx5UUYC7MFdfBJBff2ukYW6raj3EdCFjKgpJ71/hTQ== X-Received: by 2002:a17:902:9a0c:: with SMTP id v12mr8411359plp.184.1555745701818; Sat, 20 Apr 2019 00:35:01 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.00 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:14 -1000 Message-Id: <20190420073442.7488-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::644 Subject: [Qemu-devel] [PATCH 10/38] tcg/aarch64: Implement tcg_out_dupm_vec X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 38 ++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 3f95930e88..1db4e22365 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -381,6 +381,9 @@ typedef enum { I3207_BLR = 0xd63f0000, I3207_RET = 0xd65f0000, + /* AdvSIMD load/store single structure. */ + I3303_LD1R = 0x0d40c000, + /* Load literal for loading the address at pc-relative offset */ I3305_LDR = 0x58000000, I3305_LDR_v64 = 0x5c000000, @@ -414,6 +417,8 @@ typedef enum { I3312_LDRVQ = 0x3c000000 | 3 << 22 | 0 << 30, I3312_STRVQ = 0x3c000000 | 2 << 22 | 0 << 30, + + I3312_TO_I3310 = 0x00200800, I3312_TO_I3313 = 0x01000000, @@ -566,7 +571,14 @@ static inline uint32_t tcg_in32(TCGContext *s) #define tcg_out_insn(S, FMT, OP, ...) \ glue(tcg_out_insn_,FMT)(S, glue(glue(glue(I,FMT),_),OP), ## __VA_ARGS__) -static void tcg_out_insn_3305(TCGContext *s, AArch64Insn insn, int imm19, TCGReg rt) +static void tcg_out_insn_3303(TCGContext *s, AArch64Insn insn, bool q, + TCGReg rt, TCGReg rn, unsigned size) +{ + tcg_out32(s, insn | (rt & 0x1f) | (rn << 5) | (size << 10) | (q << 30)); +} + +static void tcg_out_insn_3305(TCGContext *s, AArch64Insn insn, + int imm19, TCGReg rt) { tcg_out32(s, insn | (imm19 & 0x7ffff) << 5 | rt); } @@ -825,7 +837,29 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, TCGReg r, TCGReg base, intptr_t offset) { - return false; + if (offset != 0) { + AArch64Insn add_insn = I3401_ADDI; + TCGReg temp = TCG_REG_TMP; + + if (offset < 0) { + add_insn = I3401_SUBI; + offset = -offset; + } + if (offset <= 0xfff) { + tcg_out_insn_3401(s, add_insn, 1, temp, base, offset); + } else if (offset <= 0xffffff) { + tcg_out_insn_3401(s, add_insn, 1, temp, base, offset & 0xfff000); + if (offset & 0xfff) { + tcg_out_insn_3401(s, add_insn, 1, temp, base, offset & 0xfff); + } + } else { + tcg_out_movi(s, TCG_TYPE_PTR, temp, offset); + tcg_out_insn(s, 3502, ADD, 1, temp, temp, base); + } + base = temp; + } + tcg_out_insn(s, 3303, LD1R, type == TCG_TYPE_V128, r, base, vece); + return true; } static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd, From patchwork Sat Apr 20 07:34:15 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088315 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="YpEx3Q2A"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ0P6jfYz9s4Y for ; Sat, 20 Apr 2019 17:47:01 +1000 (AEST) Received: from localhost ([127.0.0.1]:38229 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkiB-0001ig-T7 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:46:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40197) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWg-0008VI-Cb for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWe-00080H-Rq for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:06 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:46058) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWe-0007zd-KN for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:04 -0400 Received: by mail-pf1-x443.google.com with SMTP id e24so3448169pfi.12 for ; Sat, 20 Apr 2019 00:35:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=0iYNBvM3tjMOVYl7nQvbch1xwYSCDKUEL4qDamRKVzw=; b=YpEx3Q2AE1mlP6/NkYmggTYYJEDSA0P4wssRCFhTpP1wntxGtd6oppka0yTmMOSg2X u+PR3iRUy1+n7s0/5nsXNA/BUxGFnMy+mxGdIAIrgzSVnVuB7/fdyqmqM6xNBx3psaxu JGOJJwBaDkd5Do6+uB0kWiFunumaGdfHrx338qULHmRH5ih5d78bQp5G+D9+c6rOzOO8 Yxjo9iUbZ2/oHUJqofoDdzPAy7z6m/FHgKd7fPg0axtGFMxku2QMeCh2g5YdxBH34oi9 ha54e52UojEZ2J+9KRxJ3t5iDbJHaEacP4fitH9+Y4q4t8gGA1tcrgqdAxen96T90oan EMew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=0iYNBvM3tjMOVYl7nQvbch1xwYSCDKUEL4qDamRKVzw=; b=Qa9P9BTO2XLiryxh9+ygjQSatyneyA4Cbcstog3cVHp+UsT6dasnVrLzvoGtI323cx yDamsd7TqAwaNMVBLTfbErXwWcLOOs32UtYTSc97jdNIovUD4gNpibKBB7JUjJAryKZC iWvmmUuX/iQgH9YTQTc6DbKjoeyW2MAQhI4kxJ4RxLs/6bW3NLGWxxpVF43Soxm7BiS3 6QxdTMZE5TrGWME482KSNmTxjDzyVx3lNl20jis2dM6OUNIjKXZm8D3sWY1kFGc1S004 zVaJSIIOKOgObdo+WqyY8y8/gkuCuZGcsTGFK8IFyUh4I3CHKCISXrWtKLyV5HB9jRXa iQGA== X-Gm-Message-State: APjAAAW3vTDinZNhVYDj/SduW8ApM3cj0dm7XPkKbYAsXR6wS5Ym8MLm fx/P5wyE2F54dTq+EJBh0QbfOaz8Eq8= X-Google-Smtp-Source: APXvYqx/Hdt9gsbErDIP0H729f51PEzfcQR/ijUgLZ+p1m31/gPyq/xsdS3HENWv43AxeqI2+TLrBg== X-Received: by 2002:a65:6205:: with SMTP id d5mr8108003pgv.61.1555745703316; Sat, 20 Apr 2019 00:35:03 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.01 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:15 -1000 Message-Id: <20190420073442.7488-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 Subject: [Qemu-devel] [PATCH 11/38] tcg: Add INDEX_op_dup_mem_vec X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Allow the backend to expand dup from memory directly, instead of forcing the value into a temp first. This is especially important if integer/vector register moves do not exist. Note that officially tcg_out_dupm_vec is allowed to fail. If it did, we could fix this up relatively easily: VECE == 32/64: Load the value into a vector register, then dup. Both of these must work. VECE == 8/16: If the value happens to be at an offset such that an aligned load would place the desired value in the least significant end of the register, go ahead and load w/garbage in high bits. Load the value w/INDEX_op_ld{8,16}_i32. Attempt a move directly to vector reg, which may fail. Store the value into the backing store for OTS. Load the value into the vector reg w/TCG_TYPE_I32, which must work. Duplicate from the vector reg into itself, which must work. All of which is well and good, except that all supported hosts can support dupm for all vece, so all of the failure paths would be dead code and untestable. Signed-off-by: Richard Henderson --- tcg/tcg-op.h | 1 + tcg/tcg-opc.h | 1 + tcg/aarch64/tcg-target.inc.c | 4 ++ tcg/i386/tcg-target.inc.c | 4 ++ tcg/tcg-op-gvec.c | 88 +++++++++++++++++++----------------- tcg/tcg-op-vec.c | 11 +++++ tcg/tcg.c | 1 + 7 files changed, 69 insertions(+), 41 deletions(-) diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 1f1824c30a..9fff9864f6 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -954,6 +954,7 @@ void tcg_gen_atomic_umax_fetch_i64(TCGv_i64, TCGv, TCGv_i64, TCGArg, TCGMemOp); void tcg_gen_mov_vec(TCGv_vec, TCGv_vec); void tcg_gen_dup_i32_vec(unsigned vece, TCGv_vec, TCGv_i32); void tcg_gen_dup_i64_vec(unsigned vece, TCGv_vec, TCGv_i64); +void tcg_gen_dup_mem_vec(unsigned vece, TCGv_vec, TCGv_ptr, tcg_target_long); void tcg_gen_dup8i_vec(TCGv_vec, uint32_t); void tcg_gen_dup16i_vec(TCGv_vec, uint32_t); void tcg_gen_dup32i_vec(TCGv_vec, uint32_t); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 1bad6e4208..4bf71f261f 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -219,6 +219,7 @@ DEF(dup2_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_REG_BITS == 32)) DEF(ld_vec, 1, 1, 1, IMPLVEC) DEF(st_vec, 0, 2, 1, IMPLVEC) +DEF(dupm_vec, 1, 1, 1, IMPLVEC) DEF(add_vec, 1, 2, 0, IMPLVEC) DEF(sub_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 1db4e22365..1c9f4b0cb3 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -2188,6 +2188,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_st_vec: tcg_out_st(s, type, a0, a1, a2); break; + case INDEX_op_dupm_vec: + tcg_out_dupm_vec(s, type, vece, a0, a1, a2); + break; case INDEX_op_add_vec: tcg_out_insn(s, 3616, ADD, is_q, vece, a0, a1, a2); break; @@ -2520,6 +2523,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) return &w_w; case INDEX_op_ld_vec: case INDEX_op_st_vec: + case INDEX_op_dupm_vec: return &w_r; case INDEX_op_dup_vec: return &w_wr; diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index fcabc1bdf2..4c42a2430d 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -2827,6 +2827,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_st_vec: tcg_out_st(s, type, a0, a1, a2); break; + case INDEX_op_dupm_vec: + tcg_out_dupm_vec(s, type, vece, a0, a1, a2); + break; case INDEX_op_x86_shufps_vec: insn = OPC_SHUFPS; @@ -3113,6 +3116,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_ld_vec: case INDEX_op_st_vec: + case INDEX_op_dupm_vec: return &x_r; case INDEX_op_add_vec: diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 0996ef0812..f056018713 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -390,6 +390,40 @@ static TCGType choose_vector_type(TCGOpcode op, unsigned vece, uint32_t size, return 0; } +static void do_dup_store(TCGType type, uint32_t dofs, uint32_t oprsz, + uint32_t maxsz, TCGv_vec t_vec) +{ + uint32_t i = 0; + + switch (type) { + case TCG_TYPE_V256: + /* Recall that ARM SVE allows vector sizes that are not a + * power of 2, but always a multiple of 16. The intent is + * that e.g. size == 80 would be expanded with 2x32 + 1x16. + */ + for (; i + 32 <= oprsz; i += 32) { + tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V256); + } + /* fallthru */ + case TCG_TYPE_V128: + for (; i + 16 <= oprsz; i += 16) { + tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V128); + } + break; + case TCG_TYPE_V64: + for (; i < oprsz; i += 8) { + tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V64); + } + break; + default: + g_assert_not_reached(); + } + + if (oprsz < maxsz) { + expand_clr(dofs + oprsz, maxsz - oprsz); + } +} + /* Set OPRSZ bytes at DOFS to replications of IN_32, IN_64 or IN_C. * Only one of IN_32 or IN_64 may be set; * IN_C is used if IN_32 and IN_64 are unset. @@ -429,49 +463,11 @@ static void do_dup(unsigned vece, uint32_t dofs, uint32_t oprsz, } else if (in_64) { tcg_gen_dup_i64_vec(vece, t_vec, in_64); } else { - switch (vece) { - case MO_8: - tcg_gen_dup8i_vec(t_vec, in_c); - break; - case MO_16: - tcg_gen_dup16i_vec(t_vec, in_c); - break; - case MO_32: - tcg_gen_dup32i_vec(t_vec, in_c); - break; - default: - tcg_gen_dup64i_vec(t_vec, in_c); - break; - } + tcg_gen_dupi_vec(vece, t_vec, in_c); } - - i = 0; - switch (type) { - case TCG_TYPE_V256: - /* Recall that ARM SVE allows vector sizes that are not a - * power of 2, but always a multiple of 16. The intent is - * that e.g. size == 80 would be expanded with 2x32 + 1x16. - */ - for (; i + 32 <= oprsz; i += 32) { - tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V256); - } - /* fallthru */ - case TCG_TYPE_V128: - for (; i + 16 <= oprsz; i += 16) { - tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V128); - } - break; - case TCG_TYPE_V64: - for (; i < oprsz; i += 8) { - tcg_gen_stl_vec(t_vec, cpu_env, dofs + i, TCG_TYPE_V64); - } - break; - default: - g_assert_not_reached(); - } - + do_dup_store(type, dofs, oprsz, maxsz, t_vec); tcg_temp_free_vec(t_vec); - goto done; + return; } /* Otherwise, inline with an integer type, unless "large". */ @@ -1287,6 +1283,16 @@ void tcg_gen_gvec_dup_i64(unsigned vece, uint32_t dofs, uint32_t oprsz, void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz) { + if (vece <= MO_64) { + TCGType type = choose_vector_type(0, vece, oprsz, 0); + if (type != 0) { + TCGv_vec t_vec = tcg_temp_new_vec(type); + tcg_gen_dup_mem_vec(vece, t_vec, cpu_env, aofs); + do_dup_store(type, dofs, oprsz, maxsz, t_vec); + tcg_temp_free_vec(t_vec); + return; + } + } if (vece <= MO_32) { TCGv_i32 in = tcg_temp_new_i32(); switch (vece) { diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index cfb18682b1..ce7987b858 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -194,6 +194,17 @@ void tcg_gen_dup_i32_vec(unsigned vece, TCGv_vec r, TCGv_i32 a) vec_gen_2(INDEX_op_dup_vec, type, vece, ri, ai); } +void tcg_gen_dup_mem_vec(unsigned vece, TCGv_vec r, TCGv_ptr b, + tcg_target_long ofs) +{ + TCGArg ri = tcgv_vec_arg(r); + TCGArg bi = tcgv_ptr_arg(b); + TCGTemp *rt = arg_temp(ri); + TCGType type = rt->base_type; + + vec_gen_3(INDEX_op_dupm_vec, type, vece, ri, bi, ofs); +} + static void vec_gen_ldst(TCGOpcode opc, TCGv_vec r, TCGv_ptr b, TCGArg o) { TCGArg ri = tcgv_vec_arg(r); diff --git a/tcg/tcg.c b/tcg/tcg.c index 1c34c08791..0b0b228bb5 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1605,6 +1605,7 @@ bool tcg_op_supported(TCGOpcode op) case INDEX_op_mov_vec: case INDEX_op_dup_vec: case INDEX_op_dupi_vec: + case INDEX_op_dupm_vec: case INDEX_op_ld_vec: case INDEX_op_st_vec: case INDEX_op_add_vec: From patchwork Sat Apr 20 07:34:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088307 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="PRH8Ra4U"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPt24j3Cz9s4Y for ; Sat, 20 Apr 2019 17:41:30 +1000 (AEST) Received: from localhost ([127.0.0.1]:38154 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkcq-00058T-IS for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:41:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40212) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWi-00005B-De for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWg-00082J-GA for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:08 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:34089) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWg-000811-7G for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:06 -0400 Received: by mail-pf1-x442.google.com with SMTP id b3so3480545pfd.1 for ; Sat, 20 Apr 2019 00:35:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=N72xGVkYFKWjpoxWm45+VRyg+W9fgYOLsjdxRNRncls=; b=PRH8Ra4UXVTJWymwPVmmMzN7rJSJGP4b6L+R5LYPEW90lAEBRgd7wxeforaYEHTLXK WQKtDWjec9YuT1DKFethInCc4ejrbqGjcjBG0gKZB6MZfAuEDCcHFTEmw+WV5SubeNBI BybJNWVDbMOmI9aSu7lk+6mPWAekuehy/r3tiCQT2VLDhKXwNSY4kcL7GX8lcdYvi8qi QgTmgPTC/eJAk7kmwmK7a0/C0baWgIjv/nY9IioacWxBmKldKmjur62pb9OAbq0KBtNN ouB2Pi9eJeP4iuGj4rtw+T/1rOk1JvYsqnC7TstcLUxZbnVTZ68+S0kupkBzhzgdOHHv 3/eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=N72xGVkYFKWjpoxWm45+VRyg+W9fgYOLsjdxRNRncls=; b=cgxtaP7n0lVKtgijtjEnXOs0uclCyx9tq5x/I6hngSfvwH5EyEk6J1hK3tzsf0zMEq 2BGFv8UemkbRh+DTUUpdJChlGta7dUy4zWN+nQpelLyj7sFTSlr5oWaBoi8m4T+0Kzhj um2EsWG5UAzeePdZTcDPH3ZTNk0jQb5BJ4xTRKhYrFXwMAnrk3dLLR0QGok7ozDzDPqn 4w7wYKiCRBY436K//X5HOoLGjzOpSMG7TfWZIrE0dE0NWqc9JSDvWM6GQ+ZaqlUs9Bmw RNsi6dNKHT9t8IjQ8HSmwtepjSFziFZKwg0l/TTYwLDJVugY1K+E0AvHSV1RGBFCVn6R UFNw== X-Gm-Message-State: APjAAAVKhfHrsThzh+WjiGkzPgPhl2wTF0Dxe30J2uhpMTkCvKSFo1sb 5kDGHJfp45uBg17ldZcSYh7ad9dsy2I= X-Google-Smtp-Source: APXvYqwV5pnLys4k/OzdNmSZXGx63j0Xg2poCGyhs31bAt641JfZt4VPCflxauOPxE/VEGA84lJcBQ== X-Received: by 2002:a62:6807:: with SMTP id d7mr8254995pfc.75.1555745705007; Sat, 20 Apr 2019 00:35:05 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:16 -1000 Message-Id: <20190420073442.7488-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::442 Subject: [Qemu-devel] [PATCH 12/38] tcg: Add gvec expanders for variable shift X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 15 ++++ tcg/tcg-op-gvec.h | 7 ++ tcg/tcg-op.h | 4 ++ accel/tcg/tcg-runtime-gvec.c | 132 +++++++++++++++++++++++++++++++++++ tcg/tcg-op-gvec.c | 87 +++++++++++++++++++++++ tcg/tcg-op-vec.c | 15 ++++ 6 files changed, 260 insertions(+) diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index dfe325625c..ed3ce5fd91 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -254,6 +254,21 @@ DEF_HELPER_FLAGS_3(gvec_sar16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sar32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_sar64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shl8v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shl16v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shl32v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shl64v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_shr8v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shr16v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shr32v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_shr64v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_sar8v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sar16v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sar32v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sar64v, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_eq8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_eq16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_eq32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index 850da32ded..1cd18a959a 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -294,6 +294,13 @@ void tcg_gen_gvec_shri(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_shrv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_sarv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz); + void tcg_gen_gvec_cmp(TCGCond cond, unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 9fff9864f6..833c6330b5 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -986,6 +986,10 @@ void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_sari_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); +void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); +void tcg_gen_shrv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); +void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); + void tcg_gen_cmp_vec(TCGCond cond, unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c index e2c6f24262..7b88f5590c 100644 --- a/accel/tcg/tcg-runtime-gvec.c +++ b/accel/tcg/tcg-runtime-gvec.c @@ -725,6 +725,138 @@ void HELPER(gvec_sar64i)(void *d, void *a, uint32_t desc) clear_high(d, oprsz, desc); } +void HELPER(gvec_shl8v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint8_t)) { + *(uint8_t *)(d + i) = *(uint8_t *)(a + i) << *(uint8_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shl16v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint16_t)) { + *(uint16_t *)(d + i) = *(uint16_t *)(a + i) << *(uint16_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shl32v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint32_t)) { + *(uint32_t *)(d + i) = *(uint32_t *)(a + i) << *(uint32_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shl64v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + *(uint64_t *)(d + i) = *(uint64_t *)(a + i) << *(uint64_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shr8v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint8_t)) { + *(uint8_t *)(d + i) = *(uint8_t *)(a + i) >> *(uint8_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shr16v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint16_t)) { + *(uint16_t *)(d + i) = *(uint16_t *)(a + i) >> *(uint16_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shr32v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint32_t)) { + *(uint32_t *)(d + i) = *(uint32_t *)(a + i) >> *(uint32_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_shr64v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + *(uint64_t *)(d + i) = *(uint64_t *)(a + i) >> *(uint64_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_sar8v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(vec8)) { + *(int8_t *)(d + i) = *(int8_t *)(a + i) >> *(int8_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_sar16v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int16_t)) { + *(int16_t *)(d + i) = *(int16_t *)(a + i) >> *(int16_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_sar32v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(vec32)) { + *(int32_t *)(d + i) = *(int32_t *)(a + i) >> *(int32_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_sar64v)(void *d, void *a, void *b, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(vec64)) { + *(int64_t *)(d + i) = *(int64_t *)(a + i) >> *(int64_t *)(b + i); + } + clear_high(d, oprsz, desc); +} + /* If vectors are enabled, the compiler fills in -1 for true. Otherwise, we must take care of this by hand. */ #ifdef CONFIG_VECTOR16 diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index f056018713..5d28184045 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2382,6 +2382,93 @@ void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, } } +void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_shlv_vec, + .fno = gen_helper_gvec_shl8v, + .opc = INDEX_op_shlv_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_shlv_vec, + .fno = gen_helper_gvec_shl16v, + .opc = INDEX_op_shlv_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_shl_i32, + .fniv = tcg_gen_shlv_vec, + .fno = gen_helper_gvec_shl32v, + .opc = INDEX_op_shlv_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_shl_i64, + .fniv = tcg_gen_shlv_vec, + .fno = gen_helper_gvec_shl64v, + .opc = INDEX_op_shlv_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .vece = MO_64 }, + }; + + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +void tcg_gen_gvec_shrv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_shrv_vec, + .fno = gen_helper_gvec_shr8v, + .opc = INDEX_op_shrv_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_shrv_vec, + .fno = gen_helper_gvec_shr16v, + .opc = INDEX_op_shrv_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_shr_i32, + .fniv = tcg_gen_shrv_vec, + .fno = gen_helper_gvec_shr32v, + .opc = INDEX_op_shrv_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_shr_i64, + .fniv = tcg_gen_shrv_vec, + .fno = gen_helper_gvec_shr64v, + .opc = INDEX_op_shrv_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .vece = MO_64 }, + }; + + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +void tcg_gen_gvec_sarv(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen3 g[4] = { + { .fniv = tcg_gen_sarv_vec, + .fno = gen_helper_gvec_sar8v, + .opc = INDEX_op_sarv_vec, + .vece = MO_8 }, + { .fniv = tcg_gen_sarv_vec, + .fno = gen_helper_gvec_sar16v, + .opc = INDEX_op_sarv_vec, + .vece = MO_16 }, + { .fni4 = tcg_gen_sar_i32, + .fniv = tcg_gen_sarv_vec, + .fno = gen_helper_gvec_sar32v, + .opc = INDEX_op_sarv_vec, + .vece = MO_32 }, + { .fni8 = tcg_gen_sar_i64, + .fniv = tcg_gen_sarv_vec, + .fno = gen_helper_gvec_sar64v, + .opc = INDEX_op_sarv_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .vece = MO_64 }, + }; + + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + /* Expand OPSZ bytes worth of three-operand operations using i32 elements. */ static void expand_cmp_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, TCGCond cond) diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index ce7987b858..6601cb8a8f 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -481,3 +481,18 @@ void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { do_op3(vece, r, a, b, INDEX_op_umax_vec); } + +void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_shlv_vec); +} + +void tcg_gen_shrv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_shrv_vec); +} + +void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) +{ + do_op3(vece, r, a, b, INDEX_op_sarv_vec); +} From patchwork Sat Apr 20 07:34:17 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088318 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="ASW0F1Fb"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ3z0tf9z9s4Y for ; Sat, 20 Apr 2019 17:50:06 +1000 (AEST) Received: from localhost ([127.0.0.1]:38253 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHklA-00045h-RJ for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:50:04 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40218) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWj-00005s-39 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWi-00084i-1l for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:09 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:34091) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWh-00083F-Rt for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:07 -0400 Received: by mail-pf1-x444.google.com with SMTP id b3so3480564pfd.1 for ; Sat, 20 Apr 2019 00:35:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=F1de5So3vFDWwHAWVYAQ29uD+09yUtcwgv6kBjYxLtM=; b=ASW0F1FbKt3Bq1COddmbp3oVZcJN2FNaQNqVewjp9+KYNX4Vg4voLCqr1DJG8H2qzK rn//YW51aZyO6RUEzkgWx6p+I65oUrOIZIyaBtvztJV8Id3JzZdM6+S3SIOh1pIJtVai taXs6nS8RZ5JZRO/JRSPVbT97Lgc/yD9LGPWMHuUbblNXR8QrGDvU0KaOvWJZzbVcW5T yvUAgCcboGKxg2Khqh5Wz0GX1Pw353l1046eidxslnpyopaOiAqPCJdv3IjMzt3tS6Y2 OfGbDgUhDg/p0w2dGtfYKj8EZgXOC3xgrL/4fqerdmvQel9PIR+n5LTJx5kjeo4hz35h nQSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=F1de5So3vFDWwHAWVYAQ29uD+09yUtcwgv6kBjYxLtM=; b=c/cVH3JP+jCRAEhmK13E4G0uGcudZCFO3YA7jiKsXaHpHEybvcgp4HyLMFSFAo+Ubk 1XXLPivEeqwiKCTqupnaHLuOnJ0FxudCQd12fsyJsSrA0XZQEBDsV7LHCG9FlnIhYBoP bXLTFvcqdVOVWRn3wuRaoZZLlUY1/5nKMt7c5D7I4qsKGZyp9HzDShZhuSMD3KhtaINJ G9Cg1ONKxKoxZJecePDnNvbV9EcYtv127G1ehvosXaQ2aZs4a/mb+eSI83P/aTpGcIn2 C5TbnCaTN37O4a8XB/02tjla6lwkDERTKio+iXhPMIrXl4vCEJlGc2K6f61xAt8Npu2w xiPA== X-Gm-Message-State: APjAAAWcXUpVIpxfGdSyVIk/db/TdlN6Dgwj77cIheXDw5yAml2FqHtZ UAYmlCMMzoa2dRcziIdC0GicnSqSoqs= X-Google-Smtp-Source: APXvYqxEraO7aRcg5SPlK1jetffmZ2O0DPFwZiu9+M1EhDnWgvQnnaJ3e1VM0KBW9AwNHZCrHFbODg== X-Received: by 2002:a05:6a00:dc:: with SMTP id e28mr8433637pfj.186.1555745706468; Sat, 20 Apr 2019 00:35:06 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.05 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:17 -1000 Message-Id: <20190420073442.7488-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::444 Subject: [Qemu-devel] [PATCH 13/38] tcg/i386: Support vector variable shift opcodes X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 241bf19413..b240633455 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -184,7 +184,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_neg_vec 0 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 0 -#define TCG_TARGET_HAS_shv_vec 0 +#define TCG_TARGET_HAS_shv_vec have_avx2 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 4c42a2430d..04e609c7b2 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -467,6 +467,11 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_VPBROADCASTQ (0x59 | P_EXT38 | P_DATA16) #define OPC_VPERMQ (0x00 | P_EXT3A | P_DATA16 | P_REXW) #define OPC_VPERM2I128 (0x46 | P_EXT3A | P_DATA16 | P_VEXL) +#define OPC_VPSLLVD (0x47 | P_EXT38 | P_DATA16) +#define OPC_VPSLLVQ (0x47 | P_EXT38 | P_DATA16 | P_REXW) +#define OPC_VPSRAVD (0x46 | P_EXT38 | P_DATA16) +#define OPC_VPSRLVD (0x45 | P_EXT38 | P_DATA16) +#define OPC_VPSRLVQ (0x45 | P_EXT38 | P_DATA16 | P_REXW) #define OPC_VZEROUPPER (0x77 | P_EXT) #define OPC_XCHG_ax_r32 (0x90) @@ -2705,6 +2710,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const umax_insn[4] = { OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_UD2 }; + static int const shlv_insn[4] = { + /* TODO: AVX512 adds support for MO_16. */ + OPC_UD2, OPC_UD2, OPC_VPSLLVD, OPC_VPSLLVQ + }; + static int const shrv_insn[4] = { + /* TODO: AVX512 adds support for MO_16. */ + OPC_UD2, OPC_UD2, OPC_VPSRLVD, OPC_VPSRLVQ + }; + static int const sarv_insn[4] = { + /* TODO: AVX512 adds support for MO_16, MO_64. */ + OPC_UD2, OPC_UD2, OPC_VPSRAVD, OPC_UD2 + }; TCGType type = vecl + TCG_TYPE_V64; int insn, sub; @@ -2757,6 +2774,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_umax_vec: insn = umax_insn[vece]; goto gen_simd; + case INDEX_op_shlv_vec: + insn = shlv_insn[vece]; + goto gen_simd; + case INDEX_op_shrv_vec: + insn = shrv_insn[vece]; + goto gen_simd; + case INDEX_op_sarv_vec: + insn = sarv_insn[vece]; + goto gen_simd; case INDEX_op_x86_punpckl_vec: insn = punpckl_insn[vece]; goto gen_simd; @@ -3134,6 +3160,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_umin_vec: case INDEX_op_smax_vec: case INDEX_op_umax_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3191,6 +3220,12 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } return 1; + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + return have_avx2 && vece >= MO_32; + case INDEX_op_sarv_vec: + return have_avx2 && vece == MO_32; + case INDEX_op_mul_vec: if (vece == MO_8) { /* We can expand the operation for MO_8. */ From patchwork Sat Apr 20 07:34:18 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088313 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Wzh4i7e6"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPxx5w0Nz9s4Y for ; Sat, 20 Apr 2019 17:44:53 +1000 (AEST) Received: from localhost ([127.0.0.1]:38180 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkg7-0008Bg-KR for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:44:51 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40239) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWk-00007Q-HC for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:11 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWj-00087V-ED for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:10 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:37887) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWj-000860-8P for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:09 -0400 Received: by mail-pg1-x544.google.com with SMTP id e6so3598420pgc.4 for ; Sat, 20 Apr 2019 00:35:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=qpLX9/DZZjJ+dprrH7G5JtH5NDJNPfgeQACEO5dLGO4=; b=Wzh4i7e6wasaoAYEAuOIkKNakbRfglvN8V46qqGkc1f7+9VMDOlCGJAm6CUYcagGms fy9RPIYB9E97bWJMOcFZtj2jKTzgZSKDBbigljJYHEBglXCq/PTZn/bcKZuyq0FZXvzi Q2XlSY/o8UlFl4bkq9+qVShsWdxc+ZmMKMw5udnAk+j3aBs/0Y5SvAE0L92pJQzryn8g 3Fhjw1KIERJyWnZvGBl9LqaAs+uTGb1jyiIlrdiCVeEYQe4xi7we6zBLSeHyNYXJW0q6 7qQ/kq6yp06IeQWL2NWxCboWfgcS7Mp9szR9D/2V4HycZbgjJ+85Np+OohdXcrLzMpMk 6zAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=qpLX9/DZZjJ+dprrH7G5JtH5NDJNPfgeQACEO5dLGO4=; b=dk0BzTGgUkdCG9PIRLt2itp2LO6TOg3Pso+E5vB3z5nygpkQEu5uVzmVH9e5iWaYO2 8r7rg9i4HHIGMa/PrOaAWBT3tTym2ajUUu+GOIcmkbWjaS23HBG75AgMpJHtH9DJMoxO 0kqLWtXBXnkMOKtTdkVQBwUSikEEQcT4PFm8goZSJmUKcMORZ8iuffL7sssrhaIotP1z mo6sfdEN597zvdyC2GMCleqMdw1I9+azY/tbxC6q+UOhrAaxj2KmH7GPjLrr37Y5Qt2W xavoXgbWQKOsHkCs7fNgafPUN8NPVJKZnPHdu0nA3XtZmgyUHl+yM7J2AIdTTIIF+lnL dfmA== X-Gm-Message-State: APjAAAWm6Nw8SYGfb8rNU2JNote++0riet3tH6hhMDa80v/b5qWdD3gL fTx9Co5hTBMyFHiDn3rAVtkhODkXaxk= X-Google-Smtp-Source: APXvYqxzdEW0oIUcPI/kzYPx3dvhDZbPTW3lg0zch4Q5y0zA96go4bQRye20KaKa9THwL3aU2Nmc4g== X-Received: by 2002:a62:2046:: with SMTP id g67mr8260862pfg.121.1555745707925; Sat, 20 Apr 2019 00:35:07 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:07 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:18 -1000 Message-Id: <20190420073442.7488-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH 14/38] tcg/aarch64: Support vector variable shift opcodes X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 +- tcg/aarch64/tcg-target.opc.h | 2 ++ tcg/aarch64/tcg-target.inc.c | 42 ++++++++++++++++++++++++++++++++++++ 3 files changed, 45 insertions(+), 1 deletion(-) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index ce2bb1f90b..f5640a229b 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -134,7 +134,7 @@ typedef enum { #define TCG_TARGET_HAS_neg_vec 1 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 0 -#define TCG_TARGET_HAS_shv_vec 0 +#define TCG_TARGET_HAS_shv_vec 1 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 diff --git a/tcg/aarch64/tcg-target.opc.h b/tcg/aarch64/tcg-target.opc.h index 4816a6c3d4..59e1d3f7f7 100644 --- a/tcg/aarch64/tcg-target.opc.h +++ b/tcg/aarch64/tcg-target.opc.h @@ -1,3 +1,5 @@ /* Target-specific opcodes for host vector expansion. These will be emitted by tcg_expand_vec_op. For those familiar with GCC internals, consider these to be UNSPEC with names. */ + +DEF(aa64_sshl_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 1c9f4b0cb3..7d2a8213ec 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -538,12 +538,14 @@ typedef enum { I3616_CMEQ = 0x2e208c00, I3616_SMAX = 0x0e206400, I3616_SMIN = 0x0e206c00, + I3616_SSHL = 0x0e204400, I3616_SQADD = 0x0e200c00, I3616_SQSUB = 0x0e202c00, I3616_UMAX = 0x2e206400, I3616_UMIN = 0x2e206c00, I3616_UQADD = 0x2e200c00, I3616_UQSUB = 0x2e202c00, + I3616_USHL = 0x2e204400, /* AdvSIMD two-reg misc. */ I3617_CMGT0 = 0x0e208800, @@ -2254,6 +2256,12 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_sari_vec: tcg_out_insn(s, 3614, SSHR, is_q, a0, a1, (16 << vece) - a2); break; + case INDEX_op_shlv_vec: + tcg_out_insn(s, 3616, USHL, is_q, vece, a0, a1, a2); + break; + case INDEX_op_aa64_sshl_vec: + tcg_out_insn(s, 3616, SSHL, is_q, vece, a0, a1, a2); + break; case INDEX_op_cmp_vec: { TCGCond cond = args[3]; @@ -2321,7 +2329,11 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_smin_vec: case INDEX_op_umax_vec: case INDEX_op_umin_vec: + case INDEX_op_shlv_vec: return 1; + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: + return -1; case INDEX_op_mul_vec: return vece < MO_64; @@ -2333,6 +2345,32 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { + va_list va; + TCGv_vec v0, v1, v2, t1; + + va_start(va, a0); + v0 = temp_tcgv_vec(arg_temp(a0)); + v1 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + v2 = temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + + switch (opc) { + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: + /* Right shifts are negative left shifts for AArch64. */ + t1 = tcg_temp_new_vec(type); + tcg_gen_neg_vec(vece, t1, v2); + opc = (opc == INDEX_op_shrv_vec + ? INDEX_op_shlv_vec : INDEX_op_aa64_sshl_vec); + vec_gen_3(opc, type, vece, tcgv_vec_arg(v0), + tcgv_vec_arg(v1), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); + break; + + default: + g_assert_not_reached(); + } + + va_end(va); } static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) @@ -2514,6 +2552,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_smin_vec: case INDEX_op_umax_vec: case INDEX_op_umin_vec: + case INDEX_op_shlv_vec: + case INDEX_op_shrv_vec: + case INDEX_op_sarv_vec: + case INDEX_op_aa64_sshl_vec: return &w_w_w; case INDEX_op_not_vec: case INDEX_op_neg_vec: From patchwork Sat Apr 20 07:34:19 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088312 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="iyelCnjj"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPx75V9nz9s6w for ; Sat, 20 Apr 2019 17:44:11 +1000 (AEST) Received: from localhost ([127.0.0.1]:38178 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkfR-0007aj-HU for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:44:09 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40270) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWn-0000AP-8T for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:14 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWl-0008CN-QX for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:13 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:39173) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWl-0008A7-I0 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:11 -0400 Received: by mail-pf1-x441.google.com with SMTP id i17so3465645pfo.6 for ; Sat, 20 Apr 2019 00:35:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=rwFKyv2b5NvlMBSdo8f6yzy4Wn8y0WFPdVUtat/RHMA=; b=iyelCnjj2WRCh+VeUbpUaz5l1itmrJ28CqUyxBW1bEEmF8cI70Xwpso0XRUtxbzRPe nThq+c35fxVSq4PVO67fPKzRlb2Kxm3R82XKDTi8POXycdYgv604BIOiAK2lYnsrJy3S vb+B7ATX+e+39acyQd3OKi0ddfoY4NA06tgll9/2Z5PscddrjbGj5ZMpo9WnfhSPFW9K +05+XGzmU46PqJEFbfDpzwSPpE00/GbjrP5Ab9tUsbQvEVvt9S07T8F7sS3eT5cUxt35 gY56uxuXPXEFv2bBMx+/NZoCFwzWLl5I79xIN7/t2c6c9Kho1IKMXPW6dFeTFolrRqav 6bMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=rwFKyv2b5NvlMBSdo8f6yzy4Wn8y0WFPdVUtat/RHMA=; b=nEgOwJaLnd13X/032UDJsrY0Tu9G1RDk9uTqkzAFPzUtLxR4Iq9G9Jbug19+imQHxU iDjJlzOX2nTded/NAF5k9O1oEJzKRBjV+ra5IvsZpz3xGCqeWcG2xADqKd5eQjMutKWF etRX6iEBcbpLN5T0NHb582Ozf6PPacPcPl/395nHx2npDXP2a0IG60jnTnhj7BB6rpM4 nhVdxQ7VRnYbpVtdUUK4M8JGIVvzmwoNZmPRmbDoru3JHFx+qJuuifRJt236Na+oXRiW uNJs6ZFRIhPKRJ2i0SyekU6Zto4V4vPg9/ejbA/Ka8xKHmP1IFqlB4LN9Bbxp+02JFJo /zWA== X-Gm-Message-State: APjAAAVCRh+CkKM8Pd5UKu0kUE/XO033m+Yzlu+b5RcOAs9tWMCFiugV FDj3BOsn7HYtVluKHb9xIc1wnETJWCM= X-Google-Smtp-Source: APXvYqwT2kRYcKcbWe/RIlDLpd62dH/EKQczjKRIKSNOj+lmBfV12tUp0esCQTpLi/vvxAveWCVE7Q== X-Received: by 2002:a63:ad4b:: with SMTP id y11mr8086160pgo.405.1555745709970; Sat, 20 Apr 2019 00:35:09 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.08 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:19 -1000 Message-Id: <20190420073442.7488-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 Subject: [Qemu-devel] [PATCH 15/38] tcg: Implement tcg_gen_gvec_3i() X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: David Hildenbrand Let's add tcg_gen_gvec_3i(), similar to tcg_gen_gvec_2i(), however without introducing "gen_helper_gvec_3i *fnoi", as it isn't needed for now. Signed-off-by: David Hildenbrand Message-Id: <20190416185301.25344-2-david@redhat.com> Signed-off-by: Richard Henderson --- tcg/tcg-op-gvec.h | 24 ++++++++ tcg/tcg-op-gvec.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 163 insertions(+) diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index 1cd18a959a..aaeb6e5d8b 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -164,6 +164,27 @@ typedef struct { bool load_dest; } GVecGen3; +typedef struct { + /* + * Expand inline as a 64-bit or 32-bit integer. Only one of these will be + * non-NULL. + */ + void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64, int64_t); + void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32, int32_t); + /* Expand inline with a host vector type. */ + void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, int64_t); + /* Expand out-of-line helper w/descriptor, data in descriptor. */ + gen_helper_gvec_3 *fno; + /* The opcode, if any, to which this corresponds. */ + TCGOpcode opc; + /* The vector element size, if applicable. */ + uint8_t vece; + /* Prefer i64 to v64. */ + bool prefer_i64; + /* Load dest as a 3rd source operand. */ + bool load_dest; +} GVecGen3i; + typedef struct { /* Expand inline as a 64-bit or 32-bit integer. Only one of these will be non-NULL. */ @@ -193,6 +214,9 @@ void tcg_gen_gvec_2s(uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz, TCGv_i64 c, const GVecGen2s *); void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz, const GVecGen3 *); +void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, + uint32_t oprsz, uint32_t maxsz, int64_t c, + const GVecGen3i *); void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, uint32_t maxsz, const GVecGen4 *); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 5d28184045..3eb9126f50 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -659,6 +659,29 @@ static void expand_3_i32(uint32_t dofs, uint32_t aofs, tcg_temp_free_i32(t0); } +static void expand_3i_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs, + uint32_t oprsz, int32_t c, bool load_dest, + void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, int32_t)) +{ + TCGv_i32 t0 = tcg_temp_new_i32(); + TCGv_i32 t1 = tcg_temp_new_i32(); + TCGv_i32 t2 = tcg_temp_new_i32(); + uint32_t i; + + for (i = 0; i < oprsz; i += 4) { + tcg_gen_ld_i32(t0, cpu_env, aofs + i); + tcg_gen_ld_i32(t1, cpu_env, bofs + i); + if (load_dest) { + tcg_gen_ld_i32(t2, cpu_env, dofs + i); + } + fni(t2, t0, t1, c); + tcg_gen_st_i32(t2, cpu_env, dofs + i); + } + tcg_temp_free_i32(t0); + tcg_temp_free_i32(t1); + tcg_temp_free_i32(t2); +} + /* Expand OPSZ bytes worth of three-operand operations using i32 elements. */ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, bool write_aofs, @@ -766,6 +789,29 @@ static void expand_3_i64(uint32_t dofs, uint32_t aofs, tcg_temp_free_i64(t0); } +static void expand_3i_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs, + uint32_t oprsz, int64_t c, bool load_dest, + void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, int64_t)) +{ + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2 = tcg_temp_new_i64(); + uint32_t i; + + for (i = 0; i < oprsz; i += 8) { + tcg_gen_ld_i64(t0, cpu_env, aofs + i); + tcg_gen_ld_i64(t1, cpu_env, bofs + i); + if (load_dest) { + tcg_gen_ld_i64(t2, cpu_env, dofs + i); + } + fni(t2, t0, t1, c); + tcg_gen_st_i64(t2, cpu_env, dofs + i); + } + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); +} + /* Expand OPSZ bytes worth of three-operand operations using i64 elements. */ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, bool write_aofs, @@ -879,6 +925,35 @@ static void expand_3_vec(unsigned vece, uint32_t dofs, uint32_t aofs, tcg_temp_free_vec(t0); } +/* + * Expand OPSZ bytes worth of three-vector operands and an immediate operand + * using host vectors. + */ +static void expand_3i_vec(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t tysz, + TCGType type, int64_t c, bool load_dest, + void (*fni)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, + int64_t)) +{ + TCGv_vec t0 = tcg_temp_new_vec(type); + TCGv_vec t1 = tcg_temp_new_vec(type); + TCGv_vec t2 = tcg_temp_new_vec(type); + uint32_t i; + + for (i = 0; i < oprsz; i += tysz) { + tcg_gen_ld_vec(t0, cpu_env, aofs + i); + tcg_gen_ld_vec(t1, cpu_env, bofs + i); + if (load_dest) { + tcg_gen_ld_vec(t2, cpu_env, dofs + i); + } + fni(vece, t2, t0, t1, c); + tcg_gen_st_vec(t2, cpu_env, dofs + i); + } + tcg_temp_free_vec(t0); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); +} + /* Expand OPSZ bytes worth of four-operand operations using host vectors. */ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, @@ -1170,6 +1245,70 @@ void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, } } +/* Expand a vector operation with three vectors and an immediate. */ +void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, + uint32_t oprsz, uint32_t maxsz, int64_t c, + const GVecGen3i *g) +{ + TCGType type; + uint32_t some; + + check_size_align(oprsz, maxsz, dofs | aofs | bofs); + check_overlap_3(dofs, aofs, bofs, maxsz); + + type = 0; + if (g->fniv) { + type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + } + switch (type) { + case TCG_TYPE_V256: + /* + * Recall that ARM SVE allows vector sizes that are not a + * power of 2, but always a multiple of 16. The intent is + * that e.g. size == 80 would be expanded with 2x32 + 1x16. + */ + some = QEMU_ALIGN_DOWN(oprsz, 32); + expand_3i_vec(g->vece, dofs, aofs, bofs, some, 32, TCG_TYPE_V256, + c, g->load_dest, g->fniv); + if (some == oprsz) { + break; + } + dofs += some; + aofs += some; + bofs += some; + oprsz -= some; + maxsz -= some; + /* fallthru */ + case TCG_TYPE_V128: + expand_3i_vec(g->vece, dofs, aofs, bofs, oprsz, 16, TCG_TYPE_V128, + c, g->load_dest, g->fniv); + break; + case TCG_TYPE_V64: + expand_3i_vec(g->vece, dofs, aofs, bofs, oprsz, 8, TCG_TYPE_V64, + c, g->load_dest, g->fniv); + break; + + case 0: + if (g->fni8 && check_size_impl(oprsz, 8)) { + expand_3i_i64(dofs, aofs, bofs, oprsz, c, g->load_dest, g->fni8); + } else if (g->fni4 && check_size_impl(oprsz, 4)) { + expand_3i_i32(dofs, aofs, bofs, oprsz, c, g->load_dest, g->fni4); + } else { + assert(g->fno != NULL); + tcg_gen_gvec_3_ool(dofs, aofs, bofs, oprsz, maxsz, c, g->fno); + return; + } + break; + + default: + g_assert_not_reached(); + } + + if (oprsz < maxsz) { + expand_clr(dofs + oprsz, maxsz - oprsz); + } +} + /* Expand a vector four-operand operation. */ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, uint32_t maxsz, const GVecGen4 *g) From patchwork Sat Apr 20 07:34:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088319 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="QQSbFICV"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ4t6lchz9s4Y for ; Sat, 20 Apr 2019 17:50:54 +1000 (AEST) Received: from localhost ([127.0.0.1]:38266 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHklv-0004iI-V1 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:50:52 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40337) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWx-0000IU-QH for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWo-0008J6-VH for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:20 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:44881) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWo-0008HN-IQ for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:14 -0400 Received: by mail-pf1-x443.google.com with SMTP id y13so3457200pfm.11 for ; Sat, 20 Apr 2019 00:35:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7VbeA7W5tV8GWycExsPXqZUAx55SU6reILwI+uAdJe4=; b=QQSbFICVD6v7FuPP+Je+JppYskZkbVe3rEw5bM5z2nbTy24q6vw6HGG0YqDqz5EpOR ihixxdp53miZ6md1TWIhD2BjqNI9gBQwQGIjFM0mLViGNQMgRzFBk/uuHzVbIELN9qFk mwduMYCzaqiySC3IszXsknS1omthv3xM/Teu/RezGM4X65qLj+mj7oRnzFkwKozNp55j HUenY4FgSws8JxX+EUkX9DzW83EYjfjH3+EicLtoEfZa3Kbrm+zsbdVjuzPgb2TzFDb8 6l2Pq90nZ7QlSYjzZCa1eLGWwauPHrAje5LJ2O12E3lyiz2nhmT/HFYLCR40z3f/XHiE X8Ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7VbeA7W5tV8GWycExsPXqZUAx55SU6reILwI+uAdJe4=; b=a28f22ICa9eBmjBZQd6Sm7fpzkELnFISfIrSZExi55jnjQsC+AjimjXv1VvxvV95l2 MO2H6K3+3NPaA5I92x+mKg7rADS3TPezWZSSBlX0fQ7kGNHAqtgdZ9V5cuc/y/NpifL/ oRPYG794YsS5EruQmdQ4Y3peA3NqUi68pNJuaufGmNAVducFQONe36VbVob5xTxa6Bd6 aUEm7MsoMG5aNmiamT8WO43TNPkPUptx7Pb6qXH5bROe74Lrj0motYPT3OMIFZjpvj0o 5Pqq6ThErfFLckSrX0dR/57yg2MRWFjPnpnS6MyXKy1ObYhvW51nHcjCp3BLALlMAF29 l1Dg== X-Gm-Message-State: APjAAAXl5NdHBjpv4cRynAid74XT4zfCFZnS+a+bYoTy/AbMLBx22heC x9HzIly5n4tf8bOZgDy8Y3rXIBXJGoY= X-Google-Smtp-Source: APXvYqyuVWPvwDbnlcegBvruWUT+QwAdPuERsliwQxC97KZhFIXkTmWSoleKqg0BBcsCeIr1I2UnVA== X-Received: by 2002:a62:6a81:: with SMTP id f123mr8654712pfc.40.1555745712500; Sat, 20 Apr 2019 00:35:12 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.10 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:20 -1000 Message-Id: <20190420073442.7488-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 Subject: [Qemu-devel] [PATCH 16/38] tcg: Specify optional vector requirements with a list X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Replace the single opcode in .opc with a null-terminated array in .opt_opc. We still require that all opcodes be used with the same .vece. Validate the contents of this list with CONFIG_DEBUG_TCG. All tcg_gen_*_vec functions will check any list active during .fniv expansion. Swap the active list in and out as we expand other opcodes, or take control away from the front-end function. Convert all existing vector aware front ends. Signed-off-by: Richard Henderson --- tcg/tcg-op-gvec.h | 24 +- tcg/tcg.h | 18 ++ target/arm/translate-sve.c | 9 +- target/arm/translate.c | 127 +++++++---- target/ppc/translate/vmx-impl.inc.c | 7 +- tcg/tcg-op-gvec.c | 341 ++++++++++++++++++---------- tcg/tcg-op-vec.c | 18 ++ 7 files changed, 365 insertions(+), 179 deletions(-) diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index aaeb6e5d8b..a0e0902f6c 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -91,8 +91,8 @@ typedef struct { void (*fniv)(unsigned, TCGv_vec, TCGv_vec); /* Expand out-of-line helper w/descriptor. */ gen_helper_gvec_2 *fno; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The data argument to the out-of-line helper. */ int32_t data; /* The vector element size, if applicable. */ @@ -112,8 +112,8 @@ typedef struct { gen_helper_gvec_2 *fno; /* Expand out-of-line helper w/descriptor, data as argument. */ gen_helper_gvec_2i *fnoi; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The vector element size, if applicable. */ uint8_t vece; /* Prefer i64 to v64. */ @@ -131,8 +131,8 @@ typedef struct { void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec); /* Expand out-of-line helper w/descriptor. */ gen_helper_gvec_2i *fno; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The data argument to the out-of-line helper. */ uint32_t data; /* The vector element size, if applicable. */ @@ -152,8 +152,8 @@ typedef struct { void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec); /* Expand out-of-line helper w/descriptor. */ gen_helper_gvec_3 *fno; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The data argument to the out-of-line helper. */ int32_t data; /* The vector element size, if applicable. */ @@ -175,8 +175,8 @@ typedef struct { void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, int64_t); /* Expand out-of-line helper w/descriptor, data in descriptor. */ gen_helper_gvec_3 *fno; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The vector element size, if applicable. */ uint8_t vece; /* Prefer i64 to v64. */ @@ -194,8 +194,8 @@ typedef struct { void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, TCGv_vec); /* Expand out-of-line helper w/descriptor. */ gen_helper_gvec_4 *fno; - /* The opcode, if any, to which this corresponds. */ - TCGOpcode opc; + /* The optional opcodes, if any, utilized by .fniv. */ + const TCGOpcode *opt_opc; /* The data argument to the out-of-line helper. */ int32_t data; /* The vector element size, if applicable. */ diff --git a/tcg/tcg.h b/tcg/tcg.h index 7b1c15b40b..48d4d2e03e 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -694,6 +694,7 @@ struct TCGContext { QSIMPLEQ_HEAD(, TCGLabel) labels; int temps_in_use; int goto_tb_issue_mask; + const TCGOpcode *vecop_list; #endif /* Code generation. Note that we specifically do not use tcg_insn_unit @@ -1493,4 +1494,21 @@ void helper_atomic_sto_le_mmu(CPUArchState *env, target_ulong addr, Int128 val, void helper_atomic_sto_be_mmu(CPUArchState *env, target_ulong addr, Int128 val, TCGMemOpIdx oi, uintptr_t retaddr); +#ifdef CONFIG_DEBUG_TCG +void tcg_assert_listed_vecop(TCGOpcode); +#else +static inline void tcg_assert_listed_vecop(TCGOpcode op) { } +#endif + +static inline const TCGOpcode *tcg_swap_vecop_list(const TCGOpcode *n) +{ +#ifdef CONFIG_DEBUG_TCG + const TCGOpcode *o = tcg_ctx->vecop_list; + tcg_ctx->vecop_list = n; + return o; +#else + return NULL; +#endif +} + #endif /* TCG_H */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 245cd82621..0682c0d32b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3302,29 +3302,30 @@ static bool trans_SUB_zzi(DisasContext *s, arg_rri_esz *a) static bool trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a) { + static const TCGOpcode vecop_list[] = { INDEX_op_sub_vec, 0 }; static const GVecGen2s op[4] = { { .fni8 = tcg_gen_vec_sub8_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_sve_subri_b, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list, .vece = MO_8, .scalar_first = true }, { .fni8 = tcg_gen_vec_sub16_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_sve_subri_h, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list, .vece = MO_16, .scalar_first = true }, { .fni4 = tcg_gen_sub_i32, .fniv = tcg_gen_sub_vec, .fno = gen_helper_sve_subri_s, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list, .vece = MO_32, .scalar_first = true }, { .fni8 = tcg_gen_sub_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_sve_subri_d, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64, .scalar_first = true } diff --git a/target/arm/translate.c b/target/arm/translate.c index 13e2dc6562..83a008e945 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5772,27 +5772,31 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } +static const TCGOpcode vecop_list_ssra[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 +}; + const GVecGen2i ssra_op[4] = { { .fni8 = gen_ssra8_i64, .fniv = gen_ssra_vec, .load_dest = true, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list_ssra, .vece = MO_8 }, { .fni8 = gen_ssra16_i64, .fniv = gen_ssra_vec, .load_dest = true, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list_ssra, .vece = MO_16 }, { .fni4 = gen_ssra32_i32, .fniv = gen_ssra_vec, .load_dest = true, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list_ssra, .vece = MO_32 }, { .fni8 = gen_ssra64_i64, .fniv = gen_ssra_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list_ssra, .load_dest = true, - .opc = INDEX_op_sari_vec, .vece = MO_64 }, }; @@ -5826,27 +5830,31 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } +static const TCGOpcode vecop_list_usra[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 +}; + const GVecGen2i usra_op[4] = { { .fni8 = gen_usra8_i64, .fniv = gen_usra_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_usra, .vece = MO_8, }, { .fni8 = gen_usra16_i64, .fniv = gen_usra_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_usra, .vece = MO_16, }, { .fni4 = gen_usra32_i32, .fniv = gen_usra_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_usra, .vece = MO_32, }, { .fni8 = gen_usra64_i64, .fniv = gen_usra_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_usra, .vece = MO_64, }, }; @@ -5904,27 +5912,29 @@ static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) } } +static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 }; + const GVecGen2i sri_op[4] = { { .fni8 = gen_shr8_ins_i64, .fniv = gen_shr_ins_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_sri, .vece = MO_8 }, { .fni8 = gen_shr16_ins_i64, .fniv = gen_shr_ins_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_sri, .vece = MO_16 }, { .fni4 = gen_shr32_ins_i32, .fniv = gen_shr_ins_vec, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_sri, .vece = MO_32 }, { .fni8 = gen_shr64_ins_i64, .fniv = gen_shr_ins_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .load_dest = true, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list_sri, .vece = MO_64 }, }; @@ -5980,27 +5990,29 @@ static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) } } +static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 }; + const GVecGen2i sli_op[4] = { { .fni8 = gen_shl8_ins_i64, .fniv = gen_shl_ins_vec, .load_dest = true, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list_sli, .vece = MO_8 }, { .fni8 = gen_shl16_ins_i64, .fniv = gen_shl_ins_vec, .load_dest = true, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list_sli, .vece = MO_16 }, { .fni4 = gen_shl32_ins_i32, .fniv = gen_shl_ins_vec, .load_dest = true, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list_sli, .vece = MO_32 }, { .fni8 = gen_shl64_ins_i64, .fniv = gen_shl_ins_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .load_dest = true, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list_sli, .vece = MO_64 }, }; @@ -6067,51 +6079,60 @@ static void gen_mls_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) /* Note that while NEON does not support VMLA and VMLS as 64-bit ops, * these tables are shared with AArch64 which does support them. */ + +static const TCGOpcode vecop_list_mla[] = { + INDEX_op_mul_vec, INDEX_op_add_vec, 0 +}; + +static const TCGOpcode vecop_list_mls[] = { + INDEX_op_mul_vec, INDEX_op_sub_vec, 0 +}; + const GVecGen3 mla_op[4] = { { .fni4 = gen_mla8_i32, .fniv = gen_mla_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mla, .vece = MO_8 }, { .fni4 = gen_mla16_i32, .fniv = gen_mla_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mla, .vece = MO_16 }, { .fni4 = gen_mla32_i32, .fniv = gen_mla_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mla, .vece = MO_32 }, { .fni8 = gen_mla64_i64, .fniv = gen_mla_vec, - .opc = INDEX_op_mul_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .load_dest = true, + .opt_opc = vecop_list_mla, .vece = MO_64 }, }; const GVecGen3 mls_op[4] = { { .fni4 = gen_mls8_i32, .fniv = gen_mls_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mls, .vece = MO_8 }, { .fni4 = gen_mls16_i32, .fniv = gen_mls_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mls, .vece = MO_16 }, { .fni4 = gen_mls32_i32, .fniv = gen_mls_vec, - .opc = INDEX_op_mul_vec, .load_dest = true, + .opt_opc = vecop_list_mls, .vece = MO_32 }, { .fni8 = gen_mls64_i64, .fniv = gen_mls_vec, - .opc = INDEX_op_mul_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .load_dest = true, + .opt_opc = vecop_list_mls, .vece = MO_64 }, }; @@ -6137,23 +6158,25 @@ static void gen_cmtst_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) tcg_gen_cmp_vec(TCG_COND_NE, vece, d, d, a); } +static const TCGOpcode vecop_list_cmtst[] = { INDEX_op_cmp_vec, 0 }; + const GVecGen3 cmtst_op[4] = { { .fni4 = gen_helper_neon_tst_u8, .fniv = gen_cmtst_vec, - .opc = INDEX_op_cmp_vec, + .opt_opc = vecop_list_cmtst, .vece = MO_8 }, { .fni4 = gen_helper_neon_tst_u16, .fniv = gen_cmtst_vec, - .opc = INDEX_op_cmp_vec, + .opt_opc = vecop_list_cmtst, .vece = MO_16 }, { .fni4 = gen_cmtst_i32, .fniv = gen_cmtst_vec, - .opc = INDEX_op_cmp_vec, + .opt_opc = vecop_list_cmtst, .vece = MO_32 }, { .fni8 = gen_cmtst_i64, .fniv = gen_cmtst_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opc = INDEX_op_cmp_vec, + .opt_opc = vecop_list_cmtst, .vece = MO_64 }, }; @@ -6168,26 +6191,30 @@ static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } +static const TCGOpcode vecop_list_uqadd[] = { + INDEX_op_usadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 +}; + const GVecGen4 uqadd_op[4] = { { .fniv = gen_uqadd_vec, .fno = gen_helper_gvec_uqadd_b, - .opc = INDEX_op_usadd_vec, .write_aofs = true, + .opt_opc = vecop_list_uqadd, .vece = MO_8 }, { .fniv = gen_uqadd_vec, .fno = gen_helper_gvec_uqadd_h, - .opc = INDEX_op_usadd_vec, .write_aofs = true, + .opt_opc = vecop_list_uqadd, .vece = MO_16 }, { .fniv = gen_uqadd_vec, .fno = gen_helper_gvec_uqadd_s, - .opc = INDEX_op_usadd_vec, .write_aofs = true, + .opt_opc = vecop_list_uqadd, .vece = MO_32 }, { .fniv = gen_uqadd_vec, .fno = gen_helper_gvec_uqadd_d, - .opc = INDEX_op_usadd_vec, .write_aofs = true, + .opt_opc = vecop_list_uqadd, .vece = MO_64 }, }; @@ -6202,25 +6229,29 @@ static void gen_sqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } +static const TCGOpcode vecop_list_sqadd[] = { + INDEX_op_ssadd_vec, INDEX_op_cmp_vec, INDEX_op_add_vec, 0 +}; + const GVecGen4 sqadd_op[4] = { { .fniv = gen_sqadd_vec, .fno = gen_helper_gvec_sqadd_b, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list_sqadd, .write_aofs = true, .vece = MO_8 }, { .fniv = gen_sqadd_vec, .fno = gen_helper_gvec_sqadd_h, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list_sqadd, .write_aofs = true, .vece = MO_16 }, { .fniv = gen_sqadd_vec, .fno = gen_helper_gvec_sqadd_s, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list_sqadd, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_sqadd_vec, .fno = gen_helper_gvec_sqadd_d, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list_sqadd, .write_aofs = true, .vece = MO_64 }, }; @@ -6236,25 +6267,29 @@ static void gen_uqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } +static const TCGOpcode vecop_list_uqsub[] = { + INDEX_op_ussub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 +}; + const GVecGen4 uqsub_op[4] = { { .fniv = gen_uqsub_vec, .fno = gen_helper_gvec_uqsub_b, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list_uqsub, .write_aofs = true, .vece = MO_8 }, { .fniv = gen_uqsub_vec, .fno = gen_helper_gvec_uqsub_h, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list_uqsub, .write_aofs = true, .vece = MO_16 }, { .fniv = gen_uqsub_vec, .fno = gen_helper_gvec_uqsub_s, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list_uqsub, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_uqsub_vec, .fno = gen_helper_gvec_uqsub_d, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list_uqsub, .write_aofs = true, .vece = MO_64 }, }; @@ -6270,25 +6305,29 @@ static void gen_sqsub_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, tcg_temp_free_vec(x); } +static const TCGOpcode vecop_list_sqsub[] = { + INDEX_op_sssub_vec, INDEX_op_cmp_vec, INDEX_op_sub_vec, 0 +}; + const GVecGen4 sqsub_op[4] = { { .fniv = gen_sqsub_vec, .fno = gen_helper_gvec_sqsub_b, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list_sqsub, .write_aofs = true, .vece = MO_8 }, { .fniv = gen_sqsub_vec, .fno = gen_helper_gvec_sqsub_h, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list_sqsub, .write_aofs = true, .vece = MO_16 }, { .fniv = gen_sqsub_vec, .fno = gen_helper_gvec_sqsub_s, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list_sqsub, .write_aofs = true, .vece = MO_32 }, { .fniv = gen_sqsub_vec, .fno = gen_helper_gvec_sqsub_d, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list_sqsub, .write_aofs = true, .vece = MO_64 }, }; diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c index eb10c533ca..c83e605a00 100644 --- a/target/ppc/translate/vmx-impl.inc.c +++ b/target/ppc/translate/vmx-impl.inc.c @@ -562,10 +562,15 @@ static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \ } \ static void glue(gen_, NAME)(DisasContext *ctx) \ { \ + static const TCGOpcode vecop_list[] = { \ + glue(glue(INDEX_op_, NORM), _vec), \ + glue(glue(INDEX_op_, SAT), _vec), \ + INDEX_op_cmp_vec, 0 \ + }; \ static const GVecGen4 g = { \ .fniv = glue(glue(gen_, NAME), _vec), \ .fno = glue(gen_helper_, NAME), \ - .opc = glue(glue(INDEX_op_, SAT), _vec), \ + .opt_opc = vecop_list, \ .write_aofs = true, \ .vece = VECE, \ }; \ diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 3eb9126f50..40858a83e0 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -26,6 +26,76 @@ #define MAX_UNROLL 4 +/* + * Vector optional opcode tracking. + * Except for the basic logical operations (and, or, xor), and + * data movement (mov, ld, st, dupi), many vector opcodes are + * optional and may not be supported on the host. Thank Intel + * for the irregularity in their instruction set. + * + * The gvec expanders allow custom vector operations to be composed, + * generally via the .fniv callback in the GVecGen* structures. At + * the same time, in deciding whether to use this hook we need to + * know if the host supports the required operations. This is + * presented as an array of opcodes, terminated by 0. Each opcode + * is assumed to be expanded with the given VECE. + * + * For debugging, we want to validate this array. Therefore, when + * tcg_ctx->vec_opt_opc is non-NULL, the tcg_gen_*_vec expanders + * will validate that their opcode is present in the list. + */ +#ifdef CONFIG_DEBUG_TCG +void tcg_assert_listed_vecop(TCGOpcode op) +{ + const TCGOpcode *p = tcg_ctx->vecop_list; + if (p) { + for (; *p; ++p) { + if (*p == op) { + return; + } + } + g_assert_not_reached(); + } +} + +static const TCGOpcode vecop_list_empty[1]; +#else +#define vecop_list_empty NULL +#endif + +static bool tcg_can_emit_vecop_list(const TCGOpcode *list, + TCGType type, unsigned vece) +{ + if (list == NULL) { + return true; + } + + for (; *list; ++list) { + TCGOpcode opc = *list; + + if (tcg_can_emit_vec_op(opc, type, vece)) { + continue; + } + + /* + * The opcode list is created by front ends based on what they + * actually invoke. We must mirror the logic in tcg-op-vec.c + * for generic expansions using other opcodes. + */ + switch (opc) { + case INDEX_op_neg_vec: + if (tcg_can_emit_vec_op(INDEX_op_sub_vec, type, vece)) { + continue; + } + break; + default: + break; + } + return false; + } + return true; +} + /* Verify vector size and alignment rules. OFS should be the OR of all of the operand offsets so that we can check them all at once. */ static void check_size_align(uint32_t oprsz, uint32_t maxsz, uint32_t ofs) @@ -360,31 +430,29 @@ static void gen_dup_i64(unsigned vece, TCGv_i64 out, TCGv_i64 in) * on elements of size VECE in the selected type. Do not select V64 if * PREFER_I64 is true. Return 0 if no vector type is selected. */ -static TCGType choose_vector_type(TCGOpcode op, unsigned vece, uint32_t size, - bool prefer_i64) +static TCGType choose_vector_type(const TCGOpcode *list, unsigned vece, + uint32_t size, bool prefer_i64) { if (TCG_TARGET_HAS_v256 && check_size_impl(size, 32)) { - if (op == 0) { - return TCG_TYPE_V256; - } - /* Recall that ARM SVE allows vector sizes that are not a + /* + * Recall that ARM SVE allows vector sizes that are not a * power of 2, but always a multiple of 16. The intent is * that e.g. size == 80 would be expanded with 2x32 + 1x16. * It is hard to imagine a case in which v256 is supported * but v128 is not, but check anyway. */ - if (tcg_can_emit_vec_op(op, TCG_TYPE_V256, vece) + if (tcg_can_emit_vecop_list(list, TCG_TYPE_V256, vece) && (size % 32 == 0 - || tcg_can_emit_vec_op(op, TCG_TYPE_V128, vece))) { + || tcg_can_emit_vecop_list(list, TCG_TYPE_V128, vece))) { return TCG_TYPE_V256; } } if (TCG_TARGET_HAS_v128 && check_size_impl(size, 16) - && (op == 0 || tcg_can_emit_vec_op(op, TCG_TYPE_V128, vece))) { + && tcg_can_emit_vecop_list(list, TCG_TYPE_V128, vece)) { return TCG_TYPE_V128; } if (TCG_TARGET_HAS_v64 && !prefer_i64 && check_size_impl(size, 8) - && (op == 0 || tcg_can_emit_vec_op(op, TCG_TYPE_V64, vece))) { + && tcg_can_emit_vecop_list(list, TCG_TYPE_V64, vece)) { return TCG_TYPE_V64; } return 0; @@ -452,7 +520,7 @@ static void do_dup(unsigned vece, uint32_t dofs, uint32_t oprsz, /* Implement inline with a vector type, if possible. * Prefer integer when 64-bit host and no variable dup. */ - type = choose_vector_type(0, vece, oprsz, + type = choose_vector_type(NULL, vece, oprsz, (TCG_TARGET_REG_BITS == 64 && in_32 == NULL && (in_64 == NULL || vece == MO_64))); if (type != 0) { @@ -987,6 +1055,8 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz, const GVecGen2 *g) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGType type; uint32_t some; @@ -995,7 +1065,7 @@ void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } switch (type) { case TCG_TYPE_V256: @@ -1028,13 +1098,14 @@ void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs, } else { assert(g->fno != NULL); tcg_gen_gvec_2_ool(dofs, aofs, oprsz, maxsz, g->data, g->fno); - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); @@ -1045,6 +1116,8 @@ void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz, int64_t c, const GVecGen2i *g) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGType type; uint32_t some; @@ -1053,7 +1126,7 @@ void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } switch (type) { case TCG_TYPE_V256: @@ -1095,13 +1168,14 @@ void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz, maxsz, c, g->fnoi); tcg_temp_free_i64(tcg_c); } - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); @@ -1119,9 +1193,11 @@ void tcg_gen_gvec_2s(uint32_t dofs, uint32_t aofs, uint32_t oprsz, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } if (type != 0) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGv_vec t_vec = tcg_temp_new_vec(type); uint32_t some; @@ -1159,6 +1235,7 @@ void tcg_gen_gvec_2s(uint32_t dofs, uint32_t aofs, uint32_t oprsz, g_assert_not_reached(); } tcg_temp_free_vec(t_vec); + tcg_swap_vecop_list(hold_list); } else if (g->fni8 && check_size_impl(oprsz, 8)) { TCGv_i64 t64 = tcg_temp_new_i64(); @@ -1186,6 +1263,8 @@ void tcg_gen_gvec_2s(uint32_t dofs, uint32_t aofs, uint32_t oprsz, void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz, const GVecGen3 *g) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGType type; uint32_t some; @@ -1194,7 +1273,7 @@ void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } switch (type) { case TCG_TYPE_V256: @@ -1232,13 +1311,14 @@ void tcg_gen_gvec_3(uint32_t dofs, uint32_t aofs, uint32_t bofs, assert(g->fno != NULL); tcg_gen_gvec_3_ool(dofs, aofs, bofs, oprsz, maxsz, g->data, g->fno); - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); @@ -1250,6 +1330,8 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz, int64_t c, const GVecGen3i *g) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGType type; uint32_t some; @@ -1258,7 +1340,7 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } switch (type) { case TCG_TYPE_V256: @@ -1296,13 +1378,14 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, } else { assert(g->fno != NULL); tcg_gen_gvec_3_ool(dofs, aofs, bofs, oprsz, maxsz, c, g->fno); - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); @@ -1313,6 +1396,8 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs, void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, uint32_t oprsz, uint32_t maxsz, const GVecGen4 *g) { + const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty; + const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list); TCGType type; uint32_t some; @@ -1321,7 +1406,7 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, type = 0; if (g->fniv) { - type = choose_vector_type(g->opc, g->vece, oprsz, g->prefer_i64); + type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64); } switch (type) { case TCG_TYPE_V256: @@ -1362,13 +1447,14 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs, assert(g->fno != NULL); tcg_gen_gvec_4_ool(dofs, aofs, bofs, cofs, oprsz, maxsz, g->data, g->fno); - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); @@ -1423,7 +1509,7 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz) { if (vece <= MO_64) { - TCGType type = choose_vector_type(0, vece, oprsz, 0); + TCGType type = choose_vector_type(NULL, vece, oprsz, 0); if (type != 0) { TCGv_vec t_vec = tcg_temp_new_vec(type); tcg_gen_dup_mem_vec(vece, t_vec, cpu_env, aofs); @@ -1573,6 +1659,8 @@ void tcg_gen_vec_add32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) tcg_temp_free_i64(t2); } +static const TCGOpcode vecop_list_add[] = { INDEX_op_add_vec, 0 }; + void tcg_gen_gvec_add(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { @@ -1580,22 +1668,22 @@ void tcg_gen_gvec_add(unsigned vece, uint32_t dofs, uint32_t aofs, { .fni8 = tcg_gen_vec_add8_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_add8, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_8 }, { .fni8 = tcg_gen_vec_add16_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_add16, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_16 }, { .fni4 = tcg_gen_add_i32, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_add32, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_32 }, { .fni8 = tcg_gen_add_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_add64, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1611,22 +1699,22 @@ void tcg_gen_gvec_adds(unsigned vece, uint32_t dofs, uint32_t aofs, { .fni8 = tcg_gen_vec_add8_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_adds8, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_8 }, { .fni8 = tcg_gen_vec_add16_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_adds16, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_16 }, { .fni4 = tcg_gen_add_i32, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_adds32, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .vece = MO_32 }, { .fni8 = tcg_gen_add_i64, .fniv = tcg_gen_add_vec, .fno = gen_helper_gvec_adds64, - .opc = INDEX_op_add_vec, + .opt_opc = vecop_list_add, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1643,6 +1731,8 @@ void tcg_gen_gvec_addi(unsigned vece, uint32_t dofs, uint32_t aofs, tcg_temp_free_i64(tmp); } +static const TCGOpcode vecop_list_sub[] = { INDEX_op_sub_vec, 0 }; + void tcg_gen_gvec_subs(unsigned vece, uint32_t dofs, uint32_t aofs, TCGv_i64 c, uint32_t oprsz, uint32_t maxsz) { @@ -1650,22 +1740,22 @@ void tcg_gen_gvec_subs(unsigned vece, uint32_t dofs, uint32_t aofs, { .fni8 = tcg_gen_vec_sub8_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_subs8, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_8 }, { .fni8 = tcg_gen_vec_sub16_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_subs16, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_16 }, { .fni4 = tcg_gen_sub_i32, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_subs32, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_32 }, { .fni8 = tcg_gen_sub_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_subs64, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1729,22 +1819,22 @@ void tcg_gen_gvec_sub(unsigned vece, uint32_t dofs, uint32_t aofs, { .fni8 = tcg_gen_vec_sub8_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_sub8, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_8 }, { .fni8 = tcg_gen_vec_sub16_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_sub16, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_16 }, { .fni4 = tcg_gen_sub_i32, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_sub32, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .vece = MO_32 }, { .fni8 = tcg_gen_sub_i64, .fniv = tcg_gen_sub_vec, .fno = gen_helper_gvec_sub64, - .opc = INDEX_op_sub_vec, + .opt_opc = vecop_list_sub, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1753,27 +1843,29 @@ void tcg_gen_gvec_sub(unsigned vece, uint32_t dofs, uint32_t aofs, tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); } +static const TCGOpcode vecop_list_mul[] = { INDEX_op_mul_vec, 0 }; + void tcg_gen_gvec_mul(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 g[4] = { { .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_mul8, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_8 }, { .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_mul16, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_16 }, { .fni4 = tcg_gen_mul_i32, .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_mul32, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_32 }, { .fni8 = tcg_gen_mul_i64, .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_mul64, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1788,21 +1880,21 @@ void tcg_gen_gvec_muls(unsigned vece, uint32_t dofs, uint32_t aofs, static const GVecGen2s g[4] = { { .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_muls8, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_8 }, { .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_muls16, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_16 }, { .fni4 = tcg_gen_mul_i32, .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_muls32, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .vece = MO_32 }, { .fni8 = tcg_gen_mul_i64, .fniv = tcg_gen_mul_vec, .fno = gen_helper_gvec_muls64, - .opc = INDEX_op_mul_vec, + .opt_opc = vecop_list_mul, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -1822,22 +1914,23 @@ void tcg_gen_gvec_muli(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_ssadd(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_ssadd_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_ssadd_vec, .fno = gen_helper_gvec_ssadd8, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_ssadd_vec, .fno = gen_helper_gvec_ssadd16, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fniv = tcg_gen_ssadd_vec, .fno = gen_helper_gvec_ssadd32, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fniv = tcg_gen_ssadd_vec, .fno = gen_helper_gvec_ssadd64, - .opc = INDEX_op_ssadd_vec, + .opt_opc = vecop_list, .vece = MO_64 }, }; tcg_debug_assert(vece <= MO_64); @@ -1847,22 +1940,23 @@ void tcg_gen_gvec_ssadd(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_sssub(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_sssub_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_sssub_vec, .fno = gen_helper_gvec_sssub8, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_sssub_vec, .fno = gen_helper_gvec_sssub16, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fniv = tcg_gen_sssub_vec, .fno = gen_helper_gvec_sssub32, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fniv = tcg_gen_sssub_vec, .fno = gen_helper_gvec_sssub64, - .opc = INDEX_op_sssub_vec, + .opt_opc = vecop_list, .vece = MO_64 }, }; tcg_debug_assert(vece <= MO_64); @@ -1888,24 +1982,25 @@ static void tcg_gen_usadd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) void tcg_gen_gvec_usadd(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_usadd_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd8, - .opc = INDEX_op_usadd_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd16, - .opc = INDEX_op_usadd_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_usadd_i32, .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd32, - .opc = INDEX_op_usadd_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_usadd_i64, .fniv = tcg_gen_usadd_vec, .fno = gen_helper_gvec_usadd64, - .opc = INDEX_op_usadd_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -1931,24 +2026,25 @@ static void tcg_gen_ussub_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_ussub_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub8, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub16, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_ussub_i32, .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub32, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_ussub_i64, .fniv = tcg_gen_ussub_vec, .fno = gen_helper_gvec_ussub64, - .opc = INDEX_op_ussub_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -1958,24 +2054,25 @@ void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_smin_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_smin_vec, .fno = gen_helper_gvec_smin8, - .opc = INDEX_op_smin_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_smin_vec, .fno = gen_helper_gvec_smin16, - .opc = INDEX_op_smin_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_smin_i32, .fniv = tcg_gen_smin_vec, .fno = gen_helper_gvec_smin32, - .opc = INDEX_op_smin_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_smin_i64, .fniv = tcg_gen_smin_vec, .fno = gen_helper_gvec_smin64, - .opc = INDEX_op_smin_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -1985,24 +2082,25 @@ void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_umin_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_umin_vec, .fno = gen_helper_gvec_umin8, - .opc = INDEX_op_umin_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_umin_vec, .fno = gen_helper_gvec_umin16, - .opc = INDEX_op_umin_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_umin_i32, .fniv = tcg_gen_umin_vec, .fno = gen_helper_gvec_umin32, - .opc = INDEX_op_umin_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_umin_i64, .fniv = tcg_gen_umin_vec, .fno = gen_helper_gvec_umin64, - .opc = INDEX_op_umin_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -2012,24 +2110,25 @@ void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_smax_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_smax_vec, .fno = gen_helper_gvec_smax8, - .opc = INDEX_op_smax_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_smax_vec, .fno = gen_helper_gvec_smax16, - .opc = INDEX_op_smax_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_smax_i32, .fniv = tcg_gen_smax_vec, .fno = gen_helper_gvec_smax32, - .opc = INDEX_op_smax_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_smax_i64, .fniv = tcg_gen_smax_vec, .fno = gen_helper_gvec_smax64, - .opc = INDEX_op_smax_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -2039,24 +2138,25 @@ void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_umax_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_umax_vec, .fno = gen_helper_gvec_umax8, - .opc = INDEX_op_umax_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_umax_vec, .fno = gen_helper_gvec_umax16, - .opc = INDEX_op_umax_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_umax_i32, .fniv = tcg_gen_umax_vec, .fno = gen_helper_gvec_umax32, - .opc = INDEX_op_umax_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_umax_i64, .fniv = tcg_gen_umax_vec, .fno = gen_helper_gvec_umax64, - .opc = INDEX_op_umax_vec, + .opt_opc = vecop_list, .vece = MO_64 } }; tcg_debug_assert(vece <= MO_64); @@ -2110,26 +2210,27 @@ void tcg_gen_vec_neg32_i64(TCGv_i64 d, TCGv_i64 b) void tcg_gen_gvec_neg(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_neg_vec, 0 }; static const GVecGen2 g[4] = { { .fni8 = tcg_gen_vec_neg8_i64, .fniv = tcg_gen_neg_vec, .fno = gen_helper_gvec_neg8, - .opc = INDEX_op_neg_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fni8 = tcg_gen_vec_neg16_i64, .fniv = tcg_gen_neg_vec, .fno = gen_helper_gvec_neg16, - .opc = INDEX_op_neg_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_neg_i32, .fniv = tcg_gen_neg_vec, .fno = gen_helper_gvec_neg32, - .opc = INDEX_op_neg_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_neg_i64, .fniv = tcg_gen_neg_vec, .fno = gen_helper_gvec_neg64, - .opc = INDEX_op_neg_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2145,7 +2246,6 @@ void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs, .fni8 = tcg_gen_and_i64, .fniv = tcg_gen_and_vec, .fno = gen_helper_gvec_and, - .opc = INDEX_op_and_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, }; @@ -2163,7 +2263,6 @@ void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs, .fni8 = tcg_gen_or_i64, .fniv = tcg_gen_or_vec, .fno = gen_helper_gvec_or, - .opc = INDEX_op_or_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, }; @@ -2181,7 +2280,6 @@ void tcg_gen_gvec_xor(unsigned vece, uint32_t dofs, uint32_t aofs, .fni8 = tcg_gen_xor_i64, .fniv = tcg_gen_xor_vec, .fno = gen_helper_gvec_xor, - .opc = INDEX_op_xor_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, }; @@ -2199,7 +2297,6 @@ void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs, .fni8 = tcg_gen_andc_i64, .fniv = tcg_gen_andc_vec, .fno = gen_helper_gvec_andc, - .opc = INDEX_op_andc_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, }; @@ -2217,7 +2314,6 @@ void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs, .fni8 = tcg_gen_orc_i64, .fniv = tcg_gen_orc_vec, .fno = gen_helper_gvec_orc, - .opc = INDEX_op_orc_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, }; @@ -2283,7 +2379,6 @@ static const GVecGen2s gop_ands = { .fni8 = tcg_gen_and_i64, .fniv = tcg_gen_and_vec, .fno = gen_helper_gvec_ands, - .opc = INDEX_op_and_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }; @@ -2309,7 +2404,6 @@ static const GVecGen2s gop_xors = { .fni8 = tcg_gen_xor_i64, .fniv = tcg_gen_xor_vec, .fno = gen_helper_gvec_xors, - .opc = INDEX_op_xor_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }; @@ -2335,7 +2429,6 @@ static const GVecGen2s gop_ors = { .fni8 = tcg_gen_or_i64, .fniv = tcg_gen_or_vec, .fno = gen_helper_gvec_ors, - .opc = INDEX_op_or_vec, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }; @@ -2374,26 +2467,27 @@ void tcg_gen_vec_shl16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) void tcg_gen_gvec_shli(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 }; static const GVecGen2i g[4] = { { .fni8 = tcg_gen_vec_shl8i_i64, .fniv = tcg_gen_shli_vec, .fno = gen_helper_gvec_shl8i, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fni8 = tcg_gen_vec_shl16i_i64, .fniv = tcg_gen_shli_vec, .fno = gen_helper_gvec_shl16i, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_shli_i32, .fniv = tcg_gen_shli_vec, .fno = gen_helper_gvec_shl32i, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_shli_i64, .fniv = tcg_gen_shli_vec, .fno = gen_helper_gvec_shl64i, - .opc = INDEX_op_shli_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2424,26 +2518,27 @@ void tcg_gen_vec_shr16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) void tcg_gen_gvec_shri(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 }; static const GVecGen2i g[4] = { { .fni8 = tcg_gen_vec_shr8i_i64, .fniv = tcg_gen_shri_vec, .fno = gen_helper_gvec_shr8i, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fni8 = tcg_gen_vec_shr16i_i64, .fniv = tcg_gen_shri_vec, .fno = gen_helper_gvec_shr16i, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_shri_i32, .fniv = tcg_gen_shri_vec, .fno = gen_helper_gvec_shr32i, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_shri_i64, .fniv = tcg_gen_shri_vec, .fno = gen_helper_gvec_shr64i, - .opc = INDEX_op_shri_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2488,26 +2583,27 @@ void tcg_gen_vec_sar16i_i64(TCGv_i64 d, TCGv_i64 a, int64_t c) void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_sari_vec, 0 }; static const GVecGen2i g[4] = { { .fni8 = tcg_gen_vec_sar8i_i64, .fniv = tcg_gen_sari_vec, .fno = gen_helper_gvec_sar8i, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fni8 = tcg_gen_vec_sar16i_i64, .fniv = tcg_gen_sari_vec, .fno = gen_helper_gvec_sar16i, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_sari_i32, .fniv = tcg_gen_sari_vec, .fno = gen_helper_gvec_sar32i, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_sari_i64, .fniv = tcg_gen_sari_vec, .fno = gen_helper_gvec_sar64i, - .opc = INDEX_op_sari_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2524,24 +2620,25 @@ void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_shlv_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_shlv_vec, .fno = gen_helper_gvec_shl8v, - .opc = INDEX_op_shlv_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_shlv_vec, .fno = gen_helper_gvec_shl16v, - .opc = INDEX_op_shlv_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_shl_i32, .fniv = tcg_gen_shlv_vec, .fno = gen_helper_gvec_shl32v, - .opc = INDEX_op_shlv_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_shl_i64, .fniv = tcg_gen_shlv_vec, .fno = gen_helper_gvec_shl64v, - .opc = INDEX_op_shlv_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2553,24 +2650,25 @@ void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_shrv(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_shrv_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_shrv_vec, .fno = gen_helper_gvec_shr8v, - .opc = INDEX_op_shrv_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_shrv_vec, .fno = gen_helper_gvec_shr16v, - .opc = INDEX_op_shrv_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_shr_i32, .fniv = tcg_gen_shrv_vec, .fno = gen_helper_gvec_shr32v, - .opc = INDEX_op_shrv_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_shr_i64, .fniv = tcg_gen_shrv_vec, .fno = gen_helper_gvec_shr64v, - .opc = INDEX_op_shrv_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2582,24 +2680,25 @@ void tcg_gen_gvec_shrv(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_sarv(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode vecop_list[] = { INDEX_op_sarv_vec, 0 }; static const GVecGen3 g[4] = { { .fniv = tcg_gen_sarv_vec, .fno = gen_helper_gvec_sar8v, - .opc = INDEX_op_sarv_vec, + .opt_opc = vecop_list, .vece = MO_8 }, { .fniv = tcg_gen_sarv_vec, .fno = gen_helper_gvec_sar16v, - .opc = INDEX_op_sarv_vec, + .opt_opc = vecop_list, .vece = MO_16 }, { .fni4 = tcg_gen_sar_i32, .fniv = tcg_gen_sarv_vec, .fno = gen_helper_gvec_sar32v, - .opc = INDEX_op_sarv_vec, + .opt_opc = vecop_list, .vece = MO_32 }, { .fni8 = tcg_gen_sar_i64, .fniv = tcg_gen_sarv_vec, .fno = gen_helper_gvec_sar64v, - .opc = INDEX_op_sarv_vec, + .opt_opc = vecop_list, .prefer_i64 = TCG_TARGET_REG_BITS == 64, .vece = MO_64 }, }; @@ -2667,6 +2766,7 @@ void tcg_gen_gvec_cmp(TCGCond cond, unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { + static const TCGOpcode cmp_list[] = { INDEX_op_cmp_vec, 0 }; static gen_helper_gvec_3 * const eq_fn[4] = { gen_helper_gvec_eq8, gen_helper_gvec_eq16, gen_helper_gvec_eq32, gen_helper_gvec_eq64 @@ -2699,6 +2799,8 @@ void tcg_gen_gvec_cmp(TCGCond cond, unsigned vece, uint32_t dofs, [TCG_COND_LTU] = ltu_fn, [TCG_COND_LEU] = leu_fn, }; + + const TCGOpcode *hold_list; TCGType type; uint32_t some; @@ -2711,10 +2813,12 @@ void tcg_gen_gvec_cmp(TCGCond cond, unsigned vece, uint32_t dofs, return; } - /* Implement inline with a vector type, if possible. + /* + * Implement inline with a vector type, if possible. * Prefer integer when 64-bit host and 64-bit comparison. */ - type = choose_vector_type(INDEX_op_cmp_vec, vece, oprsz, + hold_list = tcg_swap_vecop_list(cmp_list); + type = choose_vector_type(cmp_list, vece, oprsz, TCG_TARGET_REG_BITS == 64 && vece == MO_64); switch (type) { case TCG_TYPE_V256: @@ -2756,13 +2860,14 @@ void tcg_gen_gvec_cmp(TCGCond cond, unsigned vece, uint32_t dofs, assert(fn != NULL); } tcg_gen_gvec_3_ool(dofs, aofs, bofs, oprsz, maxsz, 0, fn[vece]); - return; + oprsz = maxsz; } break; default: g_assert_not_reached(); } + tcg_swap_vecop_list(hold_list); if (oprsz < maxsz) { expand_clr(dofs + oprsz, maxsz - oprsz); diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 6601cb8a8f..8d80e37f72 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -307,11 +307,14 @@ static bool do_op2(unsigned vece, TCGv_vec r, TCGv_vec a, TCGOpcode opc) int can; tcg_debug_assert(at->base_type >= type); + tcg_assert_listed_vecop(opc); can = tcg_can_emit_vec_op(opc, type, vece); if (can > 0) { vec_gen_2(opc, type, vece, ri, ai); } else if (can < 0) { + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); tcg_expand_vec_op(opc, type, vece, ri, ai); + tcg_swap_vecop_list(hold_list); } else { return false; } @@ -329,11 +332,17 @@ void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a) void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a) { + const TCGOpcode *hold_list; + + tcg_assert_listed_vecop(INDEX_op_neg_vec); + hold_list = tcg_swap_vecop_list(NULL); + if (!TCG_TARGET_HAS_neg_vec || !do_op2(vece, r, a, INDEX_op_neg_vec)) { TCGv_vec t = tcg_const_zeros_vec_matching(r); tcg_gen_sub_vec(vece, r, t, a); tcg_temp_free_vec(t); } + tcg_swap_vecop_list(hold_list); } static void do_shifti(TCGOpcode opc, unsigned vece, @@ -348,6 +357,7 @@ static void do_shifti(TCGOpcode opc, unsigned vece, tcg_debug_assert(at->base_type == type); tcg_debug_assert(i >= 0 && i < (8 << vece)); + tcg_assert_listed_vecop(opc); if (i == 0) { tcg_gen_mov_vec(r, a); @@ -361,8 +371,10 @@ static void do_shifti(TCGOpcode opc, unsigned vece, /* We leave the choice of expansion via scalar or vector shift to the target. Often, but not always, dupi can feed a vector shift easier than a scalar. */ + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); tcg_debug_assert(can < 0); tcg_expand_vec_op(opc, type, vece, ri, ai, i); + tcg_swap_vecop_list(hold_list); } } @@ -395,12 +407,15 @@ void tcg_gen_cmp_vec(TCGCond cond, unsigned vece, tcg_debug_assert(at->base_type >= type); tcg_debug_assert(bt->base_type >= type); + tcg_assert_listed_vecop(INDEX_op_cmp_vec); can = tcg_can_emit_vec_op(INDEX_op_cmp_vec, type, vece); if (can > 0) { vec_gen_4(INDEX_op_cmp_vec, type, vece, ri, ai, bi, cond); } else { + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); tcg_debug_assert(can < 0); tcg_expand_vec_op(INDEX_op_cmp_vec, type, vece, ri, ai, bi, cond); + tcg_swap_vecop_list(hold_list); } } @@ -418,12 +433,15 @@ static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, tcg_debug_assert(at->base_type >= type); tcg_debug_assert(bt->base_type >= type); + tcg_assert_listed_vecop(opc); can = tcg_can_emit_vec_op(opc, type, vece); if (can > 0) { vec_gen_3(opc, type, vece, ri, ai, bi); } else { + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); tcg_debug_assert(can < 0); tcg_expand_vec_op(opc, type, vece, ri, ai, bi); + tcg_swap_vecop_list(hold_list); } } From patchwork Sat Apr 20 07:34:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088305 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="SYDhz4/7"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPsm18Sfz9s6w for ; Sat, 20 Apr 2019 17:41:15 +1000 (AEST) Received: from localhost ([127.0.0.1]:38143 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkcZ-0004s9-K0 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:41:11 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40305) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWr-0000E4-7x for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWp-0008KI-IA for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:17 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:40670) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWp-0008IT-8r for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:15 -0400 Received: by mail-pf1-x443.google.com with SMTP id c207so3461875pfc.7 for ; Sat, 20 Apr 2019 00:35:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=sj8YQq+/YmayiyEfANOhjvVDSl2cyvdHLDn1MpvwzPI=; b=SYDhz4/7tpIyETUvXz8ClaanFcHTfll4UZHFC4R7ATjvvQXYNk2XmZc7qZt1tU7Le9 9xfQ4DIxM3Wa9K6Bb5gVEPIh9gHCUOo7e68LwKwmVpF82d2GAI4SlyAKxbmjEIN6EO4H IlJZVQyE2W0/+0y5gmO82hQ2dibOxuzPRwdtp2Lkbbb0RKKFHlBfK+KTZUXmmgj+Vfia P5PxcWi5iqklI7uQwnr1tcuWssvfjp77JeCtZxqtgjYZpdjcFk3hQwEvCtdWOfQ4VOY2 e0NrxJA7f1IAuYDKfvUtgi4IxmaPvrNt3958m6f/4KSyPBJ3IzqTODTmhc40VB5fKwG0 epAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=sj8YQq+/YmayiyEfANOhjvVDSl2cyvdHLDn1MpvwzPI=; b=YOfHcw//3E8Ke2zY/3gtoOxL9nt2RFsq2lXK1lshO531pIn5kK7ZCpZwOXlHjS3lrv ezFDR9gJvyGqUzimw4523bbCDk7QDl7sEU0487nJSjSHswEjHaeub9lOpGuhkrU52b3A xCdylquLPwmfk0C/gPh2lhSYdR9ejRBQC2xwVNj9RqMm/6ElPCs9ePGiQqtsRxzuEMCs GKLSyexwOP1P00XwgUAW/efzdwvmvKffd2On50jhz9NVZZMhhK6lhjhibGWRu3RZ2u2q mwGTI/gQmelFQdeN7B0sbU7mtXixR67rQj2r03Io4eK+KCETfYPPDydu+T6Yv30JcCCH Op6A== X-Gm-Message-State: APjAAAUSc4dZTdkJm4DdSGtIhWTFd0qMItNVal0Lm2hIMtT40a5trgrv FsLkAEmzuZApASuvqQZekp5tYWL4tLo= X-Google-Smtp-Source: APXvYqwwVsIjg4c27C6vLDrMpHp9ak0wXhsP5yw11felSNhZk0Akhd46FYOiv7fJEsZla+KMxtoH9A== X-Received: by 2002:a63:5947:: with SMTP id j7mr8102249pgm.62.1555745713965; Sat, 20 Apr 2019 00:35:13 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:21 -1000 Message-Id: <20190420073442.7488-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 Subject: [Qemu-devel] [PATCH 17/38] tcg: Add gvec expanders for vector shift by scalar X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/tcg-op-gvec.h | 7 ++ tcg/tcg-op.h | 4 + tcg/tcg-op-gvec.c | 210 ++++++++++++++++++++++++++++++++++++++++++++++ tcg/tcg-op-vec.c | 54 ++++++++++++ 4 files changed, 275 insertions(+) diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index a0e0902f6c..f9c6058e92 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -318,6 +318,13 @@ void tcg_gen_gvec_shri(unsigned vece, uint32_t dofs, uint32_t aofs, void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, int64_t shift, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_shls(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_shrs(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_sars(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz); + void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); void tcg_gen_gvec_shrv(unsigned vece, uint32_t dofs, uint32_t aofs, diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 833c6330b5..472b73cb38 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -986,6 +986,10 @@ void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); void tcg_gen_sari_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i); +void tcg_gen_shls_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 s); +void tcg_gen_shrs_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 s); +void tcg_gen_sars_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 s); + void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); void tcg_gen_shrv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 40858a83e0..4eb0747ddd 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2617,6 +2617,216 @@ void tcg_gen_gvec_sari(unsigned vece, uint32_t dofs, uint32_t aofs, } } +/* + * Specialized generation vector shifts by a non-constant scalar. + */ + +static void expand_2sh_vec(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t oprsz, uint32_t tysz, TCGType type, + TCGv_i32 shift, + void (*fni)(unsigned, TCGv_vec, TCGv_vec, TCGv_i32)) +{ + TCGv_vec t0 = tcg_temp_new_vec(type); + uint32_t i; + + for (i = 0; i < oprsz; i += tysz) { + tcg_gen_ld_vec(t0, cpu_env, aofs + i); + fni(vece, t0, t0, shift); + tcg_gen_st_vec(t0, cpu_env, dofs + i); + } + tcg_temp_free_vec(t0); +} + +static void do_shifts(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz, + void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32), + void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64), + void (*fniv_s)(unsigned, TCGv_vec, TCGv_vec, TCGv_i32), + void (*fniv_v)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec), + gen_helper_gvec_2 *fno, + const TCGOpcode *s_list, const TCGOpcode *v_list) +{ + TCGType type; + uint32_t some; + + check_size_align(oprsz, maxsz, dofs | aofs); + check_overlap_2(dofs, aofs, maxsz); + + /* If the backend has a scalar expansion, great. */ + type = choose_vector_type(s_list, vece, oprsz, vece == MO_64); + if (type) { + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); + switch (type) { + case TCG_TYPE_V256: + some = QEMU_ALIGN_DOWN(oprsz, 32); + expand_2sh_vec(vece, dofs, aofs, some, 32, + TCG_TYPE_V256, shift, fniv_s); + if (some == oprsz) { + break; + } + dofs += some; + aofs += some; + oprsz -= some; + maxsz -= some; + /* fallthru */ + case TCG_TYPE_V128: + expand_2sh_vec(vece, dofs, aofs, oprsz, 16, + TCG_TYPE_V128, shift, fniv_s); + break; + case TCG_TYPE_V64: + expand_2sh_vec(vece, dofs, aofs, oprsz, 8, + TCG_TYPE_V64, shift, fniv_s); + break; + default: + g_assert_not_reached(); + } + tcg_swap_vecop_list(hold_list); + goto clear_tail; + } + + /* If the backend supports variable vector shifts, also cool. */ + type = choose_vector_type(v_list, vece, oprsz, vece == MO_64); + if (type) { + const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); + TCGv_vec v_shift = tcg_temp_new_vec(type); + + if (vece == MO_64) { + TCGv_i64 sh64 = tcg_temp_new_i64(); + tcg_gen_extu_i32_i64(sh64, shift); + tcg_gen_dup_i64_vec(MO_64, v_shift, sh64); + tcg_temp_free_i64(sh64); + } else { + tcg_gen_dup_i32_vec(vece, v_shift, shift); + } + + switch (type) { + case TCG_TYPE_V256: + some = QEMU_ALIGN_DOWN(oprsz, 32); + expand_2s_vec(vece, dofs, aofs, some, 32, TCG_TYPE_V256, + v_shift, false, fniv_v); + if (some == oprsz) { + break; + } + dofs += some; + aofs += some; + oprsz -= some; + maxsz -= some; + /* fallthru */ + case TCG_TYPE_V128: + expand_2s_vec(vece, dofs, aofs, oprsz, 16, TCG_TYPE_V128, + v_shift, false, fniv_v); + break; + case TCG_TYPE_V64: + expand_2s_vec(vece, dofs, aofs, oprsz, 8, TCG_TYPE_V64, + v_shift, false, fniv_v); + break; + default: + g_assert_not_reached(); + } + tcg_temp_free_vec(v_shift); + tcg_swap_vecop_list(hold_list); + goto clear_tail; + } + + /* Otherwise fall back to integral... */ + if (fni8) { + TCGv_i64 sh64 = tcg_temp_new_i64(); + tcg_gen_extu_i32_i64(sh64, shift); + expand_2s_i64(dofs, aofs, oprsz, sh64, false, fni8); + tcg_temp_free_i64(sh64); + goto clear_tail; + } + if (fni4) { + expand_2s_i32(dofs, aofs, oprsz, shift, false, fni4); + goto clear_tail; + } + + /* Otherwise fall back to out of line. */ + tcg_debug_assert(fno); + { + TCGv_ptr a0 = tcg_temp_new_ptr(); + TCGv_ptr a1 = tcg_temp_new_ptr(); + TCGv_i32 desc = tcg_temp_new_i32(); + + tcg_gen_shli_i32(desc, shift, SIMD_DATA_SHIFT); + tcg_gen_ori_i32(desc, desc, simd_desc(oprsz, maxsz, 0)); + tcg_gen_addi_ptr(a0, cpu_env, dofs); + tcg_gen_addi_ptr(a1, cpu_env, aofs); + + fno(a0, a1, desc); + + tcg_temp_free_ptr(a0); + tcg_temp_free_ptr(a1); + tcg_temp_free_i32(desc); + return; + } + + clear_tail: + if (oprsz < maxsz) { + expand_clr(dofs + oprsz, maxsz - oprsz); + } +} + +void tcg_gen_gvec_shls(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode scalar_list[] = { INDEX_op_shls_vec, 0 }; + static const TCGOpcode vector_list[] = { INDEX_op_shlv_vec, 0 }; + static gen_helper_gvec_2 * const fno[4] = { + gen_helper_gvec_shl8i, + gen_helper_gvec_shl16i, + gen_helper_gvec_shl32i, + gen_helper_gvec_shl64i, + }; + + tcg_debug_assert(vece <= MO_64); + do_shifts(vece, dofs, aofs, shift, oprsz, maxsz, + vece == MO_32 ? tcg_gen_shl_i32 : NULL, + vece == MO_64 ? tcg_gen_shl_i64 : NULL, + tcg_gen_shls_vec, tcg_gen_shlv_vec, fno[vece], + scalar_list, vector_list); +} + +void tcg_gen_gvec_shrs(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode scalar_list[] = { INDEX_op_shrs_vec, 0 }; + static const TCGOpcode vector_list[] = { INDEX_op_shrv_vec, 0 }; + static gen_helper_gvec_2 * const fno[4] = { + gen_helper_gvec_shr8i, + gen_helper_gvec_shr16i, + gen_helper_gvec_shr32i, + gen_helper_gvec_shr64i, + }; + + tcg_debug_assert(vece <= MO_64); + do_shifts(vece, dofs, aofs, shift, oprsz, maxsz, + vece == MO_32 ? tcg_gen_shr_i32 : NULL, + vece == MO_64 ? tcg_gen_shr_i64 : NULL, + tcg_gen_shrs_vec, tcg_gen_shrv_vec, fno[vece], + scalar_list, vector_list); +} + +void tcg_gen_gvec_sars(unsigned vece, uint32_t dofs, uint32_t aofs, + TCGv_i32 shift, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode scalar_list[] = { INDEX_op_sars_vec, 0 }; + static const TCGOpcode vector_list[] = { INDEX_op_sarv_vec, 0 }; + static gen_helper_gvec_2 * const fno[4] = { + gen_helper_gvec_sar8i, + gen_helper_gvec_sar16i, + gen_helper_gvec_sar32i, + gen_helper_gvec_sar64i, + }; + + tcg_debug_assert(vece <= MO_64); + do_shifts(vece, dofs, aofs, shift, oprsz, maxsz, + vece == MO_32 ? tcg_gen_sar_i32 : NULL, + vece == MO_64 ? tcg_gen_sar_i64 : NULL, + tcg_gen_sars_vec, tcg_gen_sarv_vec, fno[vece], + scalar_list, vector_list); +} + void tcg_gen_gvec_shlv(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 8d80e37f72..5ec8e0b1a0 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -514,3 +514,57 @@ void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { do_op3(vece, r, a, b, INDEX_op_sarv_vec); } + +static void do_shifts(unsigned vece, TCGv_vec r, TCGv_vec a, + TCGv_i32 s, TCGOpcode opc_s, TCGOpcode opc_v) +{ + TCGTemp *rt = tcgv_vec_temp(r); + TCGTemp *at = tcgv_vec_temp(a); + TCGTemp *st = tcgv_i32_temp(s); + TCGArg ri = temp_arg(rt); + TCGArg ai = temp_arg(at); + TCGArg si = temp_arg(st); + TCGType type = rt->base_type; + const TCGOpcode *hold_list; + int can; + + tcg_debug_assert(at->base_type >= type); + tcg_assert_listed_vecop(opc_s); + hold_list = tcg_swap_vecop_list(NULL); + + can = tcg_can_emit_vec_op(opc_s, type, vece); + if (can > 0) { + vec_gen_3(opc_s, type, vece, ri, ai, si); + } else if (can < 0) { + tcg_expand_vec_op(opc_s, type, vece, ri, ai, si); + } else { + TCGv_vec vec_s = tcg_temp_new_vec(type); + + if (vece == MO_64) { + TCGv_i64 s64 = tcg_temp_new_i64(); + tcg_gen_extu_i32_i64(s64, s); + tcg_gen_dup_i64_vec(MO_64, vec_s, s64); + tcg_temp_free_i64(s64); + } else { + tcg_gen_dup_i32_vec(vece, vec_s, s); + } + do_op3(vece, r, a, vec_s, opc_v); + tcg_temp_free_vec(vec_s); + } + tcg_swap_vecop_list(hold_list); +} + +void tcg_gen_shls_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 b) +{ + do_shifts(vece, r, a, b, INDEX_op_shls_vec, INDEX_op_shlv_vec); +} + +void tcg_gen_shrs_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 b) +{ + do_shifts(vece, r, a, b, INDEX_op_shrs_vec, INDEX_op_shrv_vec); +} + +void tcg_gen_sars_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 b) +{ + do_shifts(vece, r, a, b, INDEX_op_sars_vec, INDEX_op_sarv_vec); +} From patchwork Sat Apr 20 07:34:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088317 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="d7v/fGWU"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ1M558Mz9s4Y for ; Sat, 20 Apr 2019 17:47:51 +1000 (AEST) Received: from localhost ([127.0.0.1]:38235 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkiz-0002Ln-Hl for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:47:49 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40311) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWs-0000E5-7s for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWr-0008NC-1W for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:18 -0400 Received: from mail-pg1-x52f.google.com ([2607:f8b0:4864:20::52f]:43188) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWq-0008LU-S7 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:16 -0400 Received: by mail-pg1-x52f.google.com with SMTP id z9so3580511pgu.10 for ; Sat, 20 Apr 2019 00:35:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=LtKlGjbuWemXwNkHto3iydMd/+NtQ4CVvCNq9/MhWAw=; b=d7v/fGWU6nMrGcBWXUOtp2qL57i2eB8B+FIviw3QiM3ObFsWKo1zJmiFEqer3L2T9g hHrpTSfbmbOR/UHguT228X9FsO5YQ5tx32ZsFgH1I1PHcSRtOCbkFcH6wrb8Saya7B+Z m+bS/nxvlXtKDW+1tHMrLzru5+gkcD1Fx3WW2eWxtbGfgM4TvXYNWo7RibuBhSttQUKs bRaQwjB/NSf7RJ0QfZdotE5/Q7MURZJS5mxC7Aiwg183VPmKChRsxgkjbI+qnnlEk//6 JKcVTr9vEGQSkQtidgn47QJIVNlPqdn0k7UNclqGUSthfLolzpSw/k/B6E/uEOjLCdmq XNSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=LtKlGjbuWemXwNkHto3iydMd/+NtQ4CVvCNq9/MhWAw=; b=FhQ9bSk913DlnIADqhEamawTEjVw5AjZVcV60QYCazH59D9mr+OlHy0+DiEQN2Dzb+ qpRW5cHyFiFgKow63/BbPP6Ftj/sYWFddgQHZLHzFIzSiRg04F1w5E2NohfMSNYEgj+O BKoPU+Rf0nn8e2Nq4ezqq3oCJcdKSWVd/bMqfU6JPWcm+IzHFeLUPwjg4cslkEXrGiS4 KC6Z3r9QOJ/6ulOcPvud3mOOYNB04XXri9u8wPYhOiEmKib/BNkYdh45KFD4zc5PUoNG ouWkxVP7IpWPl2vRxGXAMwVfWJlgco7clps9L1/S2toShh3zryk0W69dFzx3yMMBMIQl PX5w== X-Gm-Message-State: APjAAAVFvccXcK0GcQu6ABA5Xe3WP/K6GmTUY5BVAGEvBlF0SR4lGKm4 HBmTwdfdp7RSZxkBGMumTibVAWgm7Cg= X-Google-Smtp-Source: APXvYqzM6kerCE9yYCBU520nEWJlT07oVEoy/Hm/1PalftDDsi4DB9YwoJB6wUDSABzYOgV35oBqwQ== X-Received: by 2002:a63:7885:: with SMTP id t127mr7885953pgc.338.1555745715547; Sat, 20 Apr 2019 00:35:15 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:22 -1000 Message-Id: <20190420073442.7488-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::52f Subject: [Qemu-devel] [PATCH 18/38] tcg/i386: Support vector scalar shift opcodes X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 35 +++++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index b240633455..618aa520d2 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -183,7 +183,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_not_vec 0 #define TCG_TARGET_HAS_neg_vec 0 #define TCG_TARGET_HAS_shi_vec 1 -#define TCG_TARGET_HAS_shs_vec 0 +#define TCG_TARGET_HAS_shs_vec 1 #define TCG_TARGET_HAS_shv_vec have_avx2 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 04e609c7b2..85b68e4326 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -420,6 +420,14 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_PSHIFTW_Ib (0x71 | P_EXT | P_DATA16) /* /2 /6 /4 */ #define OPC_PSHIFTD_Ib (0x72 | P_EXT | P_DATA16) /* /2 /6 /4 */ #define OPC_PSHIFTQ_Ib (0x73 | P_EXT | P_DATA16) /* /2 /6 /4 */ +#define OPC_PSLLW (0xf1 | P_EXT | P_DATA16) +#define OPC_PSLLD (0xf2 | P_EXT | P_DATA16) +#define OPC_PSLLQ (0xf3 | P_EXT | P_DATA16) +#define OPC_PSRAW (0xe1 | P_EXT | P_DATA16) +#define OPC_PSRAD (0xe2 | P_EXT | P_DATA16) +#define OPC_PSRLW (0xd1 | P_EXT | P_DATA16) +#define OPC_PSRLD (0xd2 | P_EXT | P_DATA16) +#define OPC_PSRLQ (0xd3 | P_EXT | P_DATA16) #define OPC_PSUBB (0xf8 | P_EXT | P_DATA16) #define OPC_PSUBW (0xf9 | P_EXT | P_DATA16) #define OPC_PSUBD (0xfa | P_EXT | P_DATA16) @@ -2722,6 +2730,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, /* TODO: AVX512 adds support for MO_16, MO_64. */ OPC_UD2, OPC_UD2, OPC_VPSRAVD, OPC_UD2 }; + static int const shls_insn[4] = { + OPC_UD2, OPC_PSLLW, OPC_PSLLD, OPC_PSLLQ + }; + static int const shrs_insn[4] = { + OPC_UD2, OPC_PSRLW, OPC_PSRLD, OPC_PSRLQ + }; + static int const sars_insn[4] = { + OPC_UD2, OPC_PSRAW, OPC_PSRAD, OPC_UD2 + }; TCGType type = vecl + TCG_TYPE_V64; int insn, sub; @@ -2783,6 +2800,15 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_sarv_vec: insn = sarv_insn[vece]; goto gen_simd; + case INDEX_op_shls_vec: + insn = shls_insn[vece]; + goto gen_simd; + case INDEX_op_shrs_vec: + insn = shrs_insn[vece]; + goto gen_simd; + case INDEX_op_sars_vec: + insn = sars_insn[vece]; + goto gen_simd; case INDEX_op_x86_punpckl_vec: insn = punpckl_insn[vece]; goto gen_simd; @@ -3163,6 +3189,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_shlv_vec: case INDEX_op_shrv_vec: case INDEX_op_sarv_vec: + case INDEX_op_shls_vec: + case INDEX_op_shrs_vec: + case INDEX_op_sars_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3220,6 +3249,12 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) } return 1; + case INDEX_op_shls_vec: + case INDEX_op_shrs_vec: + return vece >= MO_16; + case INDEX_op_sars_vec: + return vece >= MO_16 && vece <= MO_32; + case INDEX_op_shlv_vec: case INDEX_op_shrv_vec: return have_avx2 && vece >= MO_32; From patchwork Sat Apr 20 07:34:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088316 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="mhEfVSiF"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ0S6fzxz9s4Y for ; Sat, 20 Apr 2019 17:47:04 +1000 (AEST) Received: from localhost ([127.0.0.1]:38231 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkiE-0001lq-Ss for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:47:02 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40331) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkWt-0000Es-Tg for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkWs-0008PV-H3 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:19 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:36383) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkWs-0008OZ-B5 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:18 -0400 Received: by mail-pg1-x541.google.com with SMTP id 85so3602807pgc.3 for ; Sat, 20 Apr 2019 00:35:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=4wzV6OFZGlF90Lzl5pNsHL4DyaHrlHsg3fnJ58nfXjc=; b=mhEfVSiF1e0kyBao6RH+J09n3EDKdVTvtKnA4cJZQ7Q966kmY3L6wQ2SPOe/emq8Aw Ew728XxBgwAhKFisVtLLjAkKBw1/WniOaRq6T1zMCTxcmJOpPHPyVi8C72YTAe4W5hsc Owxw8tQs+uQpMtqpSdKM6Sh1N8DK/fThtBPiqfJkr/Ce8+FwkY9KoJ4RQUPyJBWGwyPN 6bPQv7IHw2cKYtLdMYBthFlWWdKEn71+xPRzYbwPMMfY1hEdFA1zZ/RrsVJcnZbbOoz9 4W5qPAEsk8i6G2rh9sQvjB7vLgh0KjTCxZZEix1VvnbcTowQXVDwGP56fXvYhk+RG8Bl zkyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=4wzV6OFZGlF90Lzl5pNsHL4DyaHrlHsg3fnJ58nfXjc=; b=WlGtOYDv3sgqhRqp3nnU0ijG+YG7iJ6/Cg3qCAJSkV2WLpJ9UU4CZD8EKA30cQYl7F Tbgzk7U6Cib+wCaOVHBvakiDzmFQGo6SM2XgsckuuWEwPE5nCvcrZegme/VQulgG0Toa tvXz5D6cHhClvuC4Ru8S36fp/eNbAtAjNDHG7c3yWtAu/zPA2QEgN6JdIGWolJvtDB1z rKNF/0c8xBexgpWJHGEO8vjkunWIzJCUvJxeh9J8KA0oHsIGoFQrMl0xLJouq4KeeQJx Z/yBI8kCBeBQoT7uO+qm5EKJ3QkoYFRcmdUmImnJe5xt8u8jIVnLN0cFi8xI9bLjeYRb t0Yg== X-Gm-Message-State: APjAAAUzz9HIRv+/XZ3LKkcuYHWFWfAgNX4/vYyseqpUq8zbg+dstoUD 03InwYGw09Gvlp5NEfLWyWXyixAKbwg= X-Google-Smtp-Source: APXvYqx56eWui2Ql/qN2F3dbUvO/064yJ9CKXvUcXpGW9Kq1y3DBI4lSrN5QCDKjklEZSV5ouwshFA== X-Received: by 2002:a63:e818:: with SMTP id s24mr8150779pgh.190.1555745717133; Sat, 20 Apr 2019 00:35:17 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.15 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:16 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:23 -1000 Message-Id: <20190420073442.7488-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::541 Subject: [Qemu-devel] [PATCH 19/38] tcg: Add support for integer absolute value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Remove a function of the same name from target/arm/. Use a branchless implementation of abs that gcc uses for x86. Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: David Hildenbrand --- tcg/tcg-op.h | 5 +++++ target/arm/translate.c | 10 ---------- tcg/tcg-op.c | 20 ++++++++++++++++++++ 3 files changed, 25 insertions(+), 10 deletions(-) diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 472b73cb38..660fe205d0 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -335,6 +335,7 @@ void tcg_gen_smin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2); void tcg_gen_smax_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2); void tcg_gen_umin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2); void tcg_gen_umax_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2); +void tcg_gen_abs_i32(TCGv_i32, TCGv_i32); static inline void tcg_gen_discard_i32(TCGv_i32 arg) { @@ -534,6 +535,7 @@ void tcg_gen_smin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2); void tcg_gen_smax_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2); void tcg_gen_umin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2); void tcg_gen_umax_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2); +void tcg_gen_abs_i64(TCGv_i64, TCGv_i64); #if TCG_TARGET_REG_BITS == 64 static inline void tcg_gen_discard_i64(TCGv_i64 arg) @@ -973,6 +975,7 @@ void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a); void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a); +void tcg_gen_abs_vec(unsigned vece, TCGv_vec r, TCGv_vec a); void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); @@ -1019,6 +1022,7 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t); #define tcg_gen_addi_tl tcg_gen_addi_i64 #define tcg_gen_sub_tl tcg_gen_sub_i64 #define tcg_gen_neg_tl tcg_gen_neg_i64 +#define tcg_gen_abs_tl tcg_gen_abs_i64 #define tcg_gen_subfi_tl tcg_gen_subfi_i64 #define tcg_gen_subi_tl tcg_gen_subi_i64 #define tcg_gen_and_tl tcg_gen_and_i64 @@ -1131,6 +1135,7 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t); #define tcg_gen_addi_tl tcg_gen_addi_i32 #define tcg_gen_sub_tl tcg_gen_sub_i32 #define tcg_gen_neg_tl tcg_gen_neg_i32 +#define tcg_gen_abs_tl tcg_gen_abs_i32 #define tcg_gen_subfi_tl tcg_gen_subfi_i32 #define tcg_gen_subi_tl tcg_gen_subi_i32 #define tcg_gen_and_tl tcg_gen_and_i32 diff --git a/target/arm/translate.c b/target/arm/translate.c index 83a008e945..721171794d 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -603,16 +603,6 @@ static void gen_sar(TCGv_i32 dest, TCGv_i32 t0, TCGv_i32 t1) tcg_temp_free_i32(tmp1); } -static void tcg_gen_abs_i32(TCGv_i32 dest, TCGv_i32 src) -{ - TCGv_i32 c0 = tcg_const_i32(0); - TCGv_i32 tmp = tcg_temp_new_i32(); - tcg_gen_neg_i32(tmp, src); - tcg_gen_movcond_i32(TCG_COND_GT, dest, src, c0, src, tmp); - tcg_temp_free_i32(c0); - tcg_temp_free_i32(tmp); -} - static void shifter_out_im(TCGv_i32 var, int shift) { if (shift == 0) { diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index a00d1df37e..0ac291f1c4 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -1091,6 +1091,16 @@ void tcg_gen_umax_i32(TCGv_i32 ret, TCGv_i32 a, TCGv_i32 b) tcg_gen_movcond_i32(TCG_COND_LTU, ret, a, b, b, a); } +void tcg_gen_abs_i32(TCGv_i32 ret, TCGv_i32 a) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + tcg_gen_sari_i32(t, a, 31); + tcg_gen_xor_i32(ret, a, t); + tcg_gen_sub_i32(ret, ret, t); + tcg_temp_free_i32(t); +} + /* 64-bit ops */ #if TCG_TARGET_REG_BITS == 32 @@ -2548,6 +2558,16 @@ void tcg_gen_umax_i64(TCGv_i64 ret, TCGv_i64 a, TCGv_i64 b) tcg_gen_movcond_i64(TCG_COND_LTU, ret, a, b, b, a); } +void tcg_gen_abs_i64(TCGv_i64 ret, TCGv_i64 a) +{ + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_sari_i64(t, a, 63); + tcg_gen_xor_i64(ret, a, t); + tcg_gen_sub_i64(ret, ret, t); + tcg_temp_free_i64(t); +} + /* Size changing operations. */ void tcg_gen_extrl_i64_i32(TCGv_i32 ret, TCGv_i64 arg) From patchwork Sat Apr 20 07:34:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088335 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="R7BqKhI6"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQMd4Pzxz9s70 for ; Sat, 20 Apr 2019 18:03:41 +1000 (AEST) Received: from localhost ([127.0.0.1]:38459 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkyJ-0007QR-DA for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:03:39 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40536) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012L-1p for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXV-0000mG-V1 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pf1-x42d.google.com ([2607:f8b0:4864:20::42d]:40783) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXU-0008Qa-Ah for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: by mail-pf1-x42d.google.com with SMTP id c207so3461927pfc.7 for ; Sat, 20 Apr 2019 00:35:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cy+pR6Cwkwdlfgf9cgc3XGVqnfEWs+Spaq6vwR8dvps=; b=R7BqKhI6asYdSYT607K0zDQfbRF2FZbLK6re7j7VOUDZvEN5JD3DKDPq8gFD5SG28r GD7BZk8mEz/SgajkgW7htgSyJXejhCF7UpfqOJdkwshQlrlxwoeCJYD+BDan4r07INoQ tZiNesVgdE3wqmZvUdytaFTbiJKksaxws2YDA6seyPxfj2ul/mi7oCPhSfY+15qmNu6z vx3GtaxmblXRB6UhTVSfT2xZWKMyfKqXAKF1gy/BLBedVvCGaajMgu0IHAVSo0bTqXSb 80xZff9Nkzzy9IDaZ8LxnInFADKfnZedIntx/EWwdhXzTGUU8XGQqcn7P1LfzdJPeV7N 4E1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cy+pR6Cwkwdlfgf9cgc3XGVqnfEWs+Spaq6vwR8dvps=; b=pL+uE+AO0thLfpB2ifb5XyAuDs8mTqTG2e5ziPfysNCz5HUkhA71cgVqDWTaaFHyTE ONDHVkgYFNYdUdb6X8sCubF7EsG33f8bouMLRJpuXFpJv4bXRJ0upiwZm1Z9tAC2+kR7 IYg3nejWHuJRvrkYFp3g6NBk4df6bliww3SvnhhsvYOMIZZaECIFlv0Pk1OtYgFr4Zw1 KD51Wp1YF+xjBTeWI5k8Zj+Cb5ners6fdWXeiX31+CsT4PUaaR3tU1hn4wuo4XYzpqlo y/rYjPM1hW9ZB47C0DgLrSjvmevdGcKoESnCvlPQkCxg6zLKUrd/KTGY2bzDS4M2HYkN AvNQ== X-Gm-Message-State: APjAAAXhXCu3cv6x8UNDCNfgTtpITzk8rNH2EKLWBZI89msgbNxxn979 /PWL5srOvFsKJIdc2fyi21wuWCUinPo= X-Google-Smtp-Source: APXvYqzq2pps+iNMfpGFbeP7b2ojKbGP7crBr69mDdgvQjpaccAqgak7Dcl9E42Gf6NoDcuqW1JUgg== X-Received: by 2002:a65:4689:: with SMTP id h9mr7958296pgr.295.1555745718741; Sat, 20 Apr 2019 00:35:18 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.17 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:18 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:24 -1000 Message-Id: <20190420073442.7488-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::42d Subject: [Qemu-devel] [PATCH 20/38] tcg: Add support for vector absolute value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 5 +++ tcg/aarch64/tcg-target.h | 1 + tcg/i386/tcg-target.h | 1 + tcg/tcg-op-gvec.h | 2 + tcg/tcg-opc.h | 1 + tcg/tcg.h | 1 + accel/tcg/tcg-runtime-gvec.c | 48 ++++++++++++++++++++++++ tcg/tcg-op-gvec.c | 71 ++++++++++++++++++++++++++++++++++++ tcg/tcg-op-vec.c | 31 ++++++++++++++++ tcg/tcg.c | 2 + tcg/README | 4 ++ 11 files changed, 167 insertions(+) diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index ed3ce5fd91..6d73dc2d65 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -225,6 +225,11 @@ DEF_HELPER_FLAGS_3(gvec_neg16, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_neg32, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_neg64, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_abs8, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_abs16, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_abs32, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_abs64, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(gvec_not, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_and, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_or, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index f5640a229b..21d06d928c 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -132,6 +132,7 @@ typedef enum { #define TCG_TARGET_HAS_orc_vec 1 #define TCG_TARGET_HAS_not_vec 1 #define TCG_TARGET_HAS_neg_vec 1 +#define TCG_TARGET_HAS_abs_vec 0 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 0 #define TCG_TARGET_HAS_shv_vec 1 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 618aa520d2..7445f05885 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -182,6 +182,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_orc_vec 0 #define TCG_TARGET_HAS_not_vec 0 #define TCG_TARGET_HAS_neg_vec 0 +#define TCG_TARGET_HAS_abs_vec 0 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 1 #define TCG_TARGET_HAS_shv_vec have_avx2 diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h index f9c6058e92..46f58febbf 100644 --- a/tcg/tcg-op-gvec.h +++ b/tcg/tcg-op-gvec.h @@ -228,6 +228,8 @@ void tcg_gen_gvec_not(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz); void tcg_gen_gvec_neg(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t oprsz, uint32_t maxsz); +void tcg_gen_gvec_abs(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t oprsz, uint32_t maxsz); void tcg_gen_gvec_add(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 4bf71f261f..4a2dd116eb 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -225,6 +225,7 @@ DEF(add_vec, 1, 2, 0, IMPLVEC) DEF(sub_vec, 1, 2, 0, IMPLVEC) DEF(mul_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_mul_vec)) DEF(neg_vec, 1, 1, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_neg_vec)) +DEF(abs_vec, 1, 1, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_abs_vec)) DEF(ssadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(usadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) DEF(sssub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec)) diff --git a/tcg/tcg.h b/tcg/tcg.h index 48d4d2e03e..986055fdfa 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -176,6 +176,7 @@ typedef uint64_t TCGRegSet; && !defined(TCG_TARGET_HAS_v128) \ && !defined(TCG_TARGET_HAS_v256) #define TCG_TARGET_MAYBE_vec 0 +#define TCG_TARGET_HAS_abs_vec 0 #define TCG_TARGET_HAS_neg_vec 0 #define TCG_TARGET_HAS_not_vec 0 #define TCG_TARGET_HAS_andc_vec 0 diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c index 7b88f5590c..dd08095773 100644 --- a/accel/tcg/tcg-runtime-gvec.c +++ b/accel/tcg/tcg-runtime-gvec.c @@ -398,6 +398,54 @@ void HELPER(gvec_neg64)(void *d, void *a, uint32_t desc) clear_high(d, oprsz, desc); } +void HELPER(gvec_abs8)(void *d, void *a, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int8_t)) { + int8_t aa = *(int8_t *)(a + i); + *(int8_t *)(d + i) = aa < 0 ? -aa : aa; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_abs16)(void *d, void *a, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int16_t)) { + int16_t aa = *(int16_t *)(a + i); + *(int16_t *)(d + i) = aa < 0 ? -aa : aa; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_abs32)(void *d, void *a, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int32_t)) { + int32_t aa = *(int32_t *)(a + i); + *(int32_t *)(d + i) = aa < 0 ? -aa : aa; + } + clear_high(d, oprsz, desc); +} + +void HELPER(gvec_abs64)(void *d, void *a, uint32_t desc) +{ + intptr_t oprsz = simd_oprsz(desc); + intptr_t i; + + for (i = 0; i < oprsz; i += sizeof(int64_t)) { + int64_t aa = *(int64_t *)(a + i); + *(int64_t *)(d + i) = aa < 0 ? -aa : aa; + } + clear_high(d, oprsz, desc); +} + void HELPER(gvec_mov)(void *d, void *a, uint32_t desc) { intptr_t oprsz = simd_oprsz(desc); diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 4eb0747ddd..87d5a01cc9 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -88,6 +88,14 @@ static bool tcg_can_emit_vecop_list(const TCGOpcode *list, continue; } break; + case INDEX_op_abs_vec: + if (tcg_can_emit_vec_op(INDEX_op_sub_vec, type, vece) + && (tcg_can_emit_vec_op(INDEX_op_smax_vec, type, vece) > 0 + || tcg_can_emit_vec_op(INDEX_op_sari_vec, type, vece) > 0 + || tcg_can_emit_vec_op(INDEX_op_cmp_vec, type, vece))) { + continue; + } + break; default: break; } @@ -2239,6 +2247,69 @@ void tcg_gen_gvec_neg(unsigned vece, uint32_t dofs, uint32_t aofs, tcg_gen_gvec_2(dofs, aofs, oprsz, maxsz, &g[vece]); } +static void gen_absv_mask(TCGv_i64 d, TCGv_i64 b, unsigned vece) +{ + TCGv_i64 t = tcg_temp_new_i64(); + int nbit = 8 << vece; + + /* Create -1 for each negative element. */ + tcg_gen_shri_i64(t, b, nbit - 1); + tcg_gen_andi_i64(t, t, dup_const(vece, 1)); + tcg_gen_muli_i64(t, t, (1 << nbit) - 1); + + /* + * Invert (via xor -1) and add one (via sub -1). + * Because of the ordering the msb is cleared, + * so we never have carry into the next element. + */ + tcg_gen_xor_i64(d, b, t); + tcg_gen_sub_i64(d, d, t); + + tcg_temp_free_i64(t); +} + +static void tcg_gen_vec_abs8_i64(TCGv_i64 d, TCGv_i64 b) +{ + gen_absv_mask(d, b, MO_8); +} + +static void tcg_gen_vec_abs16_i64(TCGv_i64 d, TCGv_i64 b) +{ + gen_absv_mask(d, b, MO_16); +} + +void tcg_gen_gvec_abs(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode vecop_list[] = { INDEX_op_abs_vec, 0 }; + static const GVecGen2 g[4] = { + { .fni8 = tcg_gen_vec_abs8_i64, + .fniv = tcg_gen_abs_vec, + .fno = gen_helper_gvec_abs8, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = tcg_gen_vec_abs16_i64, + .fniv = tcg_gen_abs_vec, + .fno = gen_helper_gvec_abs16, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = tcg_gen_abs_i32, + .fniv = tcg_gen_abs_vec, + .fno = gen_helper_gvec_abs32, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = tcg_gen_abs_i64, + .fniv = tcg_gen_abs_vec, + .fno = gen_helper_gvec_abs64, + .opt_opc = vecop_list, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .vece = MO_64 }, + }; + + tcg_debug_assert(vece <= MO_64); + tcg_gen_gvec_2(dofs, aofs, oprsz, maxsz, &g[vece]); +} + void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t oprsz, uint32_t maxsz) { diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 5ec8e0b1a0..b7f21145bb 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -345,6 +345,37 @@ void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a) tcg_swap_vecop_list(hold_list); } +void tcg_gen_abs_vec(unsigned vece, TCGv_vec r, TCGv_vec a) +{ + const TCGOpcode *hold_list; + + tcg_assert_listed_vecop(INDEX_op_abs_vec); + hold_list = tcg_swap_vecop_list(NULL); + + if (!do_op2(vece, r, a, INDEX_op_abs_vec)) { + TCGType type = tcgv_vec_temp(r)->base_type; + TCGv_vec t = tcg_temp_new_vec(type); + + tcg_debug_assert(tcg_can_emit_vec_op(INDEX_op_sub_vec, type, vece)); + if (tcg_can_emit_vec_op(INDEX_op_smax_vec, type, vece) > 0) { + tcg_gen_neg_vec(vece, t, a); + tcg_gen_smax_vec(vece, r, a, t); + } else { + if (tcg_can_emit_vec_op(INDEX_op_sari_vec, type, vece) > 0) { + tcg_gen_sari_vec(vece, t, a, (8 << vece) - 1); + } else { + do_dupi_vec(t, MO_REG, 0); + tcg_gen_cmp_vec(TCG_COND_LT, vece, t, a, t); + } + tcg_gen_xor_vec(vece, r, a, t); + tcg_gen_sub_vec(vece, r, r, t); + } + + tcg_temp_free_vec(t); + } + tcg_swap_vecop_list(hold_list); +} + static void do_shifti(TCGOpcode opc, unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i) { diff --git a/tcg/tcg.c b/tcg/tcg.c index 0b0b228bb5..86a95a636b 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1621,6 +1621,8 @@ bool tcg_op_supported(TCGOpcode op) return have_vec && TCG_TARGET_HAS_not_vec; case INDEX_op_neg_vec: return have_vec && TCG_TARGET_HAS_neg_vec; + case INDEX_op_abs_vec: + return have_vec && TCG_TARGET_HAS_abs_vec; case INDEX_op_andc_vec: return have_vec && TCG_TARGET_HAS_andc_vec; case INDEX_op_orc_vec: diff --git a/tcg/README b/tcg/README index c30e5418a6..cbdfd3b6bc 100644 --- a/tcg/README +++ b/tcg/README @@ -561,6 +561,10 @@ E.g. VECL=1 -> 64 << 1 -> v128, and VECE=2 -> 1 << 2 -> i32. Similarly, v0 = -v1. +* abs_vec v0, v1 + + Similarly, v0 = v1 < 0 ? -v1 : v1, in elements across the vector. + * smin_vec: * umin_vec: From patchwork Sat Apr 20 07:34:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088322 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="B9VSKeJM"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ845yX8z9s4Y for ; Sat, 20 Apr 2019 17:53:40 +1000 (AEST) Received: from localhost ([127.0.0.1]:38311 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkoc-00074K-KN for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:53:38 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40379) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXI-0000ks-8I for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXD-0000Ix-Sk for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:42 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:44882) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXB-0008S7-Sn for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:38 -0400 Received: by mail-pf1-x443.google.com with SMTP id y13so3457282pfm.11 for ; Sat, 20 Apr 2019 00:35:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=l9h7+bEChm8+9kEmXTq3lryUpPorB+hViRmREKufQQg=; b=B9VSKeJM5LnOPVPo+ZA8NsQ0lHQ+j1NVAHQECvtOMVbo15cUc+h0QV9vcRMfA282rB gGHxE/kujNWcCdJCiZLi5deID4dI218HVELp9Sl6nlrU39ffhHFSZYiWle7FNX2CQLNA gZHV1iTQzmfrwtkHqCoowMgrrZTQtaacPXRwYOqazjGBTdbiEq+N2BuovzVKFd26XE5I mlkX1FtJoiF6oPGkE7oOMXcSAiRgQDImwPNNKuLLTnyiTUFtKHQT+tVUrpgBZ/uwqFbw wV/cxGJOG6ACYPxipaH+D0V01VCWhE46FkWkb4PdOdaMccWMVwpGMQqWJr2VoJN+6Pkg QGYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=l9h7+bEChm8+9kEmXTq3lryUpPorB+hViRmREKufQQg=; b=NEVbfXCenUoC3kw0rVzJl14ezhetiSLTSYPFQFQOM6kS/mGmgHvaUOJX0OJwMa/dCU ozCkDdhTPVytEPpkeZv6ypGwPzI6MX+u6wvcg3iYDgIpSKji/WdsCRFjLDLAsxA7osdD a3z4UDV8HI0ccXxi9j1yxpdwZ+inT+eiwtGgHwlNSQZM+X417bpt8KgfTUumLht2gczg Yo6nsE5TelhKsg1fXm4GFlEnosAHaNScQ7WPzj7ccPK9ljwfsLgNa5qR4bCd+ckSiQ1j 9BRs0342Tf0BObOx9cTjjsXkyGFi6IAeS4jWPCvaUQtDQrpbROQXcmIm+G84tUTElG2c 6ndg== X-Gm-Message-State: APjAAAXM4WSSxBgztTw05UFlFBNx1+2P/jvXP7kdmfmbnK6wxVGmJsbH lZJQjSXNdmSrrQHhZo2bBSJSg92AEJg= X-Google-Smtp-Source: APXvYqwRb90uu1jS9CCFQzmVygD2W/VozG126pmwq36wKnhI1QyY2LYoYJw8KQXGFiZPhz+/PaVn5g== X-Received: by 2002:a62:424f:: with SMTP id p76mr8423430pfa.141.1555745720293; Sat, 20 Apr 2019 00:35:20 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:25 -1000 Message-Id: <20190420073442.7488-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 Subject: [Qemu-devel] [PATCH 21/38] target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 2 -- target/arm/neon_helper.c | 5 ----- target/arm/translate-a64.c | 41 +++++--------------------------------- target/arm/translate.c | 11 +++------- 4 files changed, 8 insertions(+), 51 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index a09566f795..3d90b5be66 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -347,8 +347,6 @@ DEF_HELPER_2(neon_ceq_u8, i32, i32, i32) DEF_HELPER_2(neon_ceq_u16, i32, i32, i32) DEF_HELPER_2(neon_ceq_u32, i32, i32, i32) -DEF_HELPER_1(neon_abs_s8, i32, i32) -DEF_HELPER_1(neon_abs_s16, i32, i32) DEF_HELPER_1(neon_clz_u8, i32, i32) DEF_HELPER_1(neon_clz_u16, i32, i32) DEF_HELPER_1(neon_cls_s8, i32, i32) diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index ed1c6fc41c..4259056723 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -1228,11 +1228,6 @@ NEON_VOP(ceq_u16, neon_u16, 2) NEON_VOP(ceq_u32, neon_u32, 1) #undef NEON_FN -#define NEON_FN(dest, src, dummy) dest = (src < 0) ? -src : src -NEON_VOP1(abs_s8, neon_s8, 4) -NEON_VOP1(abs_s16, neon_s16, 2) -#undef NEON_FN - /* Count Leading Sign/Zero Bits. */ static inline int do_clz8(uint8_t x) { diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index dcdeb80176..fd8921565e 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -9468,11 +9468,7 @@ static void handle_2misc_64(DisasContext *s, int opcode, bool u, if (u) { tcg_gen_neg_i64(tcg_rd, tcg_rn); } else { - TCGv_i64 tcg_zero = tcg_const_i64(0); - tcg_gen_neg_i64(tcg_rd, tcg_rn); - tcg_gen_movcond_i64(TCG_COND_GT, tcg_rd, tcg_rn, tcg_zero, - tcg_rn, tcg_rd); - tcg_temp_free_i64(tcg_zero); + tcg_gen_abs_i64(tcg_rd, tcg_rn); } break; case 0x2f: /* FABS */ @@ -12366,11 +12362,12 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) } break; case 0xb: - if (u) { /* NEG */ + if (u) { /* ABS, NEG */ gen_gvec_fn2(s, is_q, rd, rn, tcg_gen_gvec_neg, size); - return; + } else { + gen_gvec_fn2(s, is_q, rd, rn, tcg_gen_gvec_abs, size); } - break; + return; } if (size == 3) { @@ -12438,17 +12435,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_neon_qabs_s32(tcg_res, cpu_env, tcg_op); } break; - case 0xb: /* ABS, NEG */ - if (u) { - tcg_gen_neg_i32(tcg_res, tcg_op); - } else { - TCGv_i32 tcg_zero = tcg_const_i32(0); - tcg_gen_neg_i32(tcg_res, tcg_op); - tcg_gen_movcond_i32(TCG_COND_GT, tcg_res, tcg_op, - tcg_zero, tcg_op, tcg_res); - tcg_temp_free_i32(tcg_zero); - } - break; case 0x2f: /* FABS */ gen_helper_vfp_abss(tcg_res, tcg_op); break; @@ -12561,23 +12547,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) tcg_temp_free_i32(tcg_zero); break; } - case 0xb: /* ABS, NEG */ - if (u) { - TCGv_i32 tcg_zero = tcg_const_i32(0); - if (size) { - gen_helper_neon_sub_u16(tcg_res, tcg_zero, tcg_op); - } else { - gen_helper_neon_sub_u8(tcg_res, tcg_zero, tcg_op); - } - tcg_temp_free_i32(tcg_zero); - } else { - if (size) { - gen_helper_neon_abs_s16(tcg_res, tcg_op); - } else { - gen_helper_neon_abs_s8(tcg_res, tcg_op); - } - } - break; case 0x4: /* CLS, CLZ */ if (u) { if (size == 0) { diff --git a/target/arm/translate.c b/target/arm/translate.c index 721171794d..911ad0bdab 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -8031,6 +8031,9 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_2RM_VNEG: tcg_gen_gvec_neg(size, rd_ofs, rm_ofs, vec_size, vec_size); break; + case NEON_2RM_VABS: + tcg_gen_gvec_abs(size, rd_ofs, rm_ofs, vec_size, vec_size); + break; default: elementwise: @@ -8136,14 +8139,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } tcg_temp_free_i32(tmp2); break; - case NEON_2RM_VABS: - switch(size) { - case 0: gen_helper_neon_abs_s8(tmp, tmp); break; - case 1: gen_helper_neon_abs_s16(tmp, tmp); break; - case 2: tcg_gen_abs_i32(tmp, tmp); break; - default: abort(); - } - break; case NEON_2RM_VCGT0_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); From patchwork Sat Apr 20 07:34:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088334 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="VCtlkd1w"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQMc3Q8nz9s70 for ; Sat, 20 Apr 2019 18:03:40 +1000 (AEST) Received: from localhost ([127.0.0.1]:38457 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkyI-0007Pp-GZ for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:03:38 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40565) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012p-Hs for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000nT-Ge for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:37102) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXW-0008T6-A8 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:58 -0400 Received: by mail-pf1-x444.google.com with SMTP id 8so3469384pfr.4 for ; Sat, 20 Apr 2019 00:35:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6My5ewhJQTmrC7aPZrFJpyFIvvkO/0+nmBhhQ9PFA+k=; b=VCtlkd1w9plw02C6y3qV3WppgBB7Nh7i2ft8+M6T15fhTGSlaWktvrOR+u2oL3aJiq c5ZTfmxnMLQiSd7ycV1scZ29yhceZWZLdQLW1JiN0SUdb7p3ymZSs2/qEJmfHZScD4gF MazgtX/XV+mbATF2bG77E3j6x0kg4Nuc7KIFHi0zy3PCwzee2IplHMcAq+CeyMqaa7F1 TCZxlhxFXJvCdBQ0ro1uNojmrBgxSqZP5HQZFG1WOwYOat10P7N3+XBDT2xeFwSQ9yYV yiqy3WGq1LCgGLcNwdGbPRMS6jy2jljcDyGs0zfvPRcPML2b37TOX4H8OQt9SaJt16Bf jzbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6My5ewhJQTmrC7aPZrFJpyFIvvkO/0+nmBhhQ9PFA+k=; b=TB0O+NCp2mio9tZyEypGVArpnnOyfE8JX45ESSmfjR4mEAVnQUjK0dzX4UuuFKYdPh yPh47aOUhkwphAuPC8ONSuQVU/I0QDD0HCta5AOscxA7/d7bX9RZpUx/oDPOfPIyI0b1 d4WicFwvR+lnuusz8nyoDgSfgbtHhjbbKlMOYbVJYsu/pIc8DX4jtOUncFjkCaxEeT0u 7qc96BH0W3ixL2fES/DMgjIua52HuZP8p96TL5cM4MHEwzsNYmaQYcDh1qwv+P9HyosV vZPZQ9moDPmqpq2SuzLZnYOwxd+P726X58wKfcmgaN+QsC5wyZa5ILT3tEJ2mEixMlqC X3Iw== X-Gm-Message-State: APjAAAWLChe/e4EZSV3mRbN4gczYD8s0p5sqNwtkhEbSWwavrOwxmkek 7SeyREpvUEZR+86dzY+2fKvAwl5FO4E= X-Google-Smtp-Source: APXvYqwAIcRnBFR1ZYFLhTyeNmpJOOFRM5j/GAfF1iOMXzRTfosS3jRrRYq9O2pJyl+MShyh1/srtQ== X-Received: by 2002:a62:6e05:: with SMTP id j5mr8517570pfc.5.1555745721726; Sat, 20 Apr 2019 00:35:21 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.20 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:26 -1000 Message-Id: <20190420073442.7488-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::444 Subject: [Qemu-devel] [PATCH 22/38] target/cris: Use tcg_gen_abs_tl X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- target/cris/translate.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/target/cris/translate.c b/target/cris/translate.c index 11b2c11174..0374718f66 100644 --- a/target/cris/translate.c +++ b/target/cris/translate.c @@ -1685,18 +1685,11 @@ static int dec_cmp_r(CPUCRISState *env, DisasContext *dc) static int dec_abs_r(CPUCRISState *env, DisasContext *dc) { - TCGv t0; - LOG_DIS("abs $r%u, $r%u\n", dc->op1, dc->op2); cris_cc_mask(dc, CC_MASK_NZ); - t0 = tcg_temp_new(); - tcg_gen_sari_tl(t0, cpu_R[dc->op1], 31); - tcg_gen_xor_tl(cpu_R[dc->op2], cpu_R[dc->op1], t0); - tcg_gen_sub_tl(cpu_R[dc->op2], cpu_R[dc->op2], t0); - tcg_temp_free(t0); - + tcg_gen_abs_tl(cpu_R[dc->op2], cpu_R[dc->op1]); cris_alu(dc, CC_OP_MOVE, cpu_R[dc->op2], cpu_R[dc->op2], cpu_R[dc->op2], 4); return 2; From patchwork Sat Apr 20 07:34:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088320 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Lq9jyXh8"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ5309Xgz9s4Y for ; Sat, 20 Apr 2019 17:51:03 +1000 (AEST) Received: from localhost ([127.0.0.1]:38293 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkm5-0004xI-1H for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:51:01 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40373) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXG-0000gZ-IK for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXC-0000H0-0t for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:39 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:42015) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXB-0008U2-Oj for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:37 -0400 Received: by mail-pl1-x642.google.com with SMTP id cv12so3533292plb.9 for ; Sat, 20 Apr 2019 00:35:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=BobRUDDq3dnwwt/qXps0agrFTDkezRLKK87vg29L7R4=; b=Lq9jyXh8dPQfAeeu4RrTpPeVB2gyMKcVeSZMBgb5+1gQ/oyvIohjbIT9XgDzqxFuLW 3WdexrYihohU3f2hHtsidux2k6boZnbiC5f5zP1SkMnJmfbhVEQCDUZnToVTYe4NIAJA 5ICft3oUq+SgsijbfXuWdAzGJFsxF/Na9e53cLT7ha9EcKySJo0LSsGUmGeraVZBpUo/ KpY1gWMDl3B0gqhQP3islPdYkG2wOgeRXqVNdh32pz9GU0p623qaY7SJpHoDhVB+K6ka 5+nFUoCwT8BWH0pYcOAvGGqBcR6SYDeRezoWY3VRBfAJgOLtEDuehP59kHnL/RqHEw8P Lc2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=BobRUDDq3dnwwt/qXps0agrFTDkezRLKK87vg29L7R4=; b=ldVaWplsR1jKH9HhIcZA2LsJrnWPWcZuP/NNpPASTKXtwkYDFTL6ZBtH/xav6kK21L 7SGl/nZocUmWyisSHiTogry612zdIhrDDDR1oGwSJ5zhTkWr8rkCZPKu+HnrctVKCjFR fbN0Tr2WL4LBkuw1Ne4R08o+fvlzxitXFizFjLzag3m4Zy5K+oKAP8K0Pv1EVkuYxnmu yTUdLnz0J/BL0waUiBDbW6TB40WWBNcHB02cSAncO1sm8rpDWIPMVVAspkY5raR/VHd/ mXrH/KHcP1w18SRSDjF77DXZvwhpS1XtkBz6UrcsYmrStcwyptS1EOmhJBSC82af6+ML Uz6A== X-Gm-Message-State: APjAAAVl7g/F7AL28jZg3iSQo57+ALrsVAMedR3aUFbuqeGJOzfBARqQ H3RgYSG8uu3TiCjNPhutNL7V3jUGypI= X-Google-Smtp-Source: APXvYqyruKzEVeVZw1fZQHuZSaox0yDBE5vRpFeyromre/MLinJhm9W4LHsizjhNcBUyyudnyj6J4A== X-Received: by 2002:a17:902:778d:: with SMTP id o13mr2589862pll.314.1555745723146; Sat, 20 Apr 2019 00:35:23 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.21 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:27 -1000 Message-Id: <20190420073442.7488-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::642 Subject: [Qemu-devel] [PATCH 23/38] target/ppc: Use tcg_gen_abs_tl X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/ppc/translate.c | 80 +++++++++++++++++------------------------- 1 file changed, 32 insertions(+), 48 deletions(-) diff --git a/target/ppc/translate.c b/target/ppc/translate.c index badc1ae1a3..97b8e8ddaf 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -5013,39 +5013,27 @@ static void gen_ecowx(DisasContext *ctx) /* abs - abs. */ static void gen_abs(DisasContext *ctx) { - TCGLabel *l1 = gen_new_label(); - TCGLabel *l2 = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_GE, cpu_gpr[rA(ctx->opcode)], 0, l1); - tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - tcg_gen_br(l2); - gen_set_label(l1); - tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - gen_set_label(l2); - if (unlikely(Rc(ctx->opcode) != 0)) - gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]); + TCGv d = cpu_gpr[rD(ctx->opcode)]; + TCGv a = cpu_gpr[rA(ctx->opcode)]; + + tcg_gen_abs_tl(d, a); + if (unlikely(Rc(ctx->opcode) != 0)) { + gen_set_Rc0(ctx, d); + } } /* abso - abso. */ static void gen_abso(DisasContext *ctx) { - TCGLabel *l1 = gen_new_label(); - TCGLabel *l2 = gen_new_label(); - TCGLabel *l3 = gen_new_label(); - /* Start with XER OV disabled, the most likely case */ - tcg_gen_movi_tl(cpu_ov, 0); - tcg_gen_brcondi_tl(TCG_COND_GE, cpu_gpr[rA(ctx->opcode)], 0, l2); - tcg_gen_brcondi_tl(TCG_COND_NE, cpu_gpr[rA(ctx->opcode)], 0x80000000, l1); - tcg_gen_movi_tl(cpu_ov, 1); - tcg_gen_movi_tl(cpu_so, 1); - tcg_gen_br(l2); - gen_set_label(l1); - tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - tcg_gen_br(l3); - gen_set_label(l2); - tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - gen_set_label(l3); - if (unlikely(Rc(ctx->opcode) != 0)) - gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]); + TCGv d = cpu_gpr[rD(ctx->opcode)]; + TCGv a = cpu_gpr[rA(ctx->opcode)]; + + tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_ov, a, 0x80000000); + tcg_gen_abs_tl(d, a); + tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov); + if (unlikely(Rc(ctx->opcode) != 0)) { + gen_set_Rc0(ctx, d); + } } /* clcs */ @@ -5265,33 +5253,29 @@ static void gen_mulo(DisasContext *ctx) /* nabs - nabs. */ static void gen_nabs(DisasContext *ctx) { - TCGLabel *l1 = gen_new_label(); - TCGLabel *l2 = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_GT, cpu_gpr[rA(ctx->opcode)], 0, l1); - tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - tcg_gen_br(l2); - gen_set_label(l1); - tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - gen_set_label(l2); - if (unlikely(Rc(ctx->opcode) != 0)) - gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]); + TCGv d = cpu_gpr[rD(ctx->opcode)]; + TCGv a = cpu_gpr[rA(ctx->opcode)]; + + tcg_gen_abs_tl(d, a); + tcg_gen_neg_tl(d, d); + if (unlikely(Rc(ctx->opcode) != 0)) { + gen_set_Rc0(ctx, d); + } } /* nabso - nabso. */ static void gen_nabso(DisasContext *ctx) { - TCGLabel *l1 = gen_new_label(); - TCGLabel *l2 = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_GT, cpu_gpr[rA(ctx->opcode)], 0, l1); - tcg_gen_mov_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - tcg_gen_br(l2); - gen_set_label(l1); - tcg_gen_neg_tl(cpu_gpr[rD(ctx->opcode)], cpu_gpr[rA(ctx->opcode)]); - gen_set_label(l2); + TCGv d = cpu_gpr[rD(ctx->opcode)]; + TCGv a = cpu_gpr[rA(ctx->opcode)]; + + tcg_gen_abs_tl(d, a); + tcg_gen_neg_tl(d, d); /* nabs never overflows */ tcg_gen_movi_tl(cpu_ov, 0); - if (unlikely(Rc(ctx->opcode) != 0)) - gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]); + if (unlikely(Rc(ctx->opcode) != 0)) { + gen_set_Rc0(ctx, d); + } } /* rlmi - rlmi. */ From patchwork Sat Apr 20 07:34:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088323 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="MT+oLvLp"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ872YKzz9s4Y for ; Sat, 20 Apr 2019 17:53:43 +1000 (AEST) Received: from localhost ([127.0.0.1]:38313 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkof-00075R-46 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:53:41 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40380) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXI-0000kt-8K for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXE-0000JG-2Y for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:42 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:34314) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXB-0008Uw-Sv for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:38 -0400 Received: by mail-pg1-x543.google.com with SMTP id v12so3609682pgq.1 for ; Sat, 20 Apr 2019 00:35:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=o+cYp+5ys6XsltkQeTLvq7AcmzszeQmf5jssa+GAZ9E=; b=MT+oLvLpOzp5uYF1oMT4e1K81r2Ji7eRVeo9UvbPKZ9fwr06fJxGnIB65UmL5O3Tmo EVDpG7KFA2gevzmcaknkxweS39KrN0+OkYh/IW9thmdPABmou7gIjG01/MaFUpCv+DXe VrfQaWiJfmli0AzARASryV/eWI+nU86RrJkn1qCOTOOjRXa4/OhgF+Hdhsl7/5NKwimo vmAV/URnxlVPfkht5snutgD4eb8cv4Ii2ux1FPeSDbVImUTJOvOc833jiTETDYWEr6Wy jlSJ7uNWHOyGhPLgX3axi8IBcgYIET/cyOlZSagsy0HLTwef1GP/B1gv/KOY3nORpXEW p+AA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=o+cYp+5ys6XsltkQeTLvq7AcmzszeQmf5jssa+GAZ9E=; b=OWezX7OxZZu30nTsZu1NmPFBPKSien3eryuLklqkIEoyBmCICadLdglE1huAPzMD32 cS0rnqXD82A+5Gj4sNRBAPAfsHvBPdNQMntq2RA+f1N6fV0semxUdEsxbQKl8ROMJiKh nEmU0CAIRme+DaP8NLJoLtGvn0d4jqLTl/YYNhoOuBFd6VHObxKW+Rv6ali2MFqcd2Zl ZprWL/zPHNeVVeb++J34k/LopH2wStEfL+sWIw2FXNSELrPhpwThld2Y7HLgLEpJWQ0l 1udHjMSNAtpD/Ao+G4sB4THOvHHeDiQBLSHt54OhKotVp3UnPqMsVjlY8DNIQYxZBYN/ HbTA== X-Gm-Message-State: APjAAAUZHJxfaVDxdpJBdIoVk7i5cwh+jPdU8sl8fO8eeIVJFcOoNPRX euCg7ot23sHum5gBvEm3dSLTeaUiyGE= X-Google-Smtp-Source: APXvYqyI3I3x2mWZeu2cpADf0s6onMgs9RNXgarvpdnd5M0zi/LPZXdxCxuCcrCauVG8VTTxKyPvwg== X-Received: by 2002:aa7:8494:: with SMTP id u20mr1377517pfn.76.1555745724610; Sat, 20 Apr 2019 00:35:24 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:28 -1000 Message-Id: <20190420073442.7488-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::543 Subject: [Qemu-devel] [PATCH 24/38] target/s390x: Use tcg_gen_abs_i64 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: David Hildenbrand Reviewed-by: Philippe Mathieu-Daudé --- target/s390x/translate.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/target/s390x/translate.c b/target/s390x/translate.c index 0afa8f7ca5..030129acbb 100644 --- a/target/s390x/translate.c +++ b/target/s390x/translate.c @@ -1407,13 +1407,7 @@ static DisasJumpType help_branch(DisasContext *s, DisasCompare *c, static DisasJumpType op_abs(DisasContext *s, DisasOps *o) { - TCGv_i64 z, n; - z = tcg_const_i64(0); - n = tcg_temp_new_i64(); - tcg_gen_neg_i64(n, o->in2); - tcg_gen_movcond_i64(TCG_COND_LT, o->out, o->in2, z, n, o->in2); - tcg_temp_free_i64(n); - tcg_temp_free_i64(z); + tcg_gen_abs_i64(o->out, o->in2); return DISAS_NEXT; } From patchwork Sat Apr 20 07:34:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088321 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="DS4tB6EW"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ6n27sJz9s4Y for ; Sat, 20 Apr 2019 17:52:33 +1000 (AEST) Received: from localhost ([127.0.0.1]:38303 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHknX-0006Gs-6H for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:52:31 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40381) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXI-0000ku-8P for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXD-0000I0-Ak for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:42 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:34830) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXB-0008W6-ST for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:37 -0400 Received: by mail-pf1-x441.google.com with SMTP id t21so3478344pfh.2 for ; Sat, 20 Apr 2019 00:35:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pqun8ODH+BipBhEXC50lCfhJUx093WlLNlb2HRBn9Eg=; b=DS4tB6EWM9gB0boq5JwaDI+1dKKq4nZiqPuTm2e+J23E00Ib006TXDMiz6ohJZJuQN lyhJGPbkO3XAYK2A9jEnqTRxoY3mdrQgIp9tEgvIHZNRr2ldYph35frWtwvriq9jhH1K gdgxybrqfyESTv2WiK/G0+qX1qj/QEhq1tM4lNsixIl9Huqmz7GjZeCmBT8/BElRzlrQ vI3tquo9tbbX2N4zLL5UK7xpVJB4+0rMmWEDmqE283vrKJvUxJi5ya7zBInkhVNb4iPx 1sK1xnvx2mHRIxX2FtW4DBzehtOJBE/qIlOQGc7jUVWY1VLjvXVlED6VEwLO+7ynMoDt TgfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pqun8ODH+BipBhEXC50lCfhJUx093WlLNlb2HRBn9Eg=; b=MNx7nnFiiw112E9Y05TraSpgvkCkWkUdt89wHAAKaoaDvICLLLU94kMhHgx5NY8JOO DuALKkVZNCEcmYhgEH3PaCZQRVdBRLg75mXd/2TtznTCelcH2NYq9MMH/ffVu1xo7DEl EcxIj5ouCVE9d0RLa9L8GX1H5WoLh4k7usWpPTm8qhFiaNjCONi9ZE0+BQHxqVTq0z5t Vy369Kqd59YQwmIKVNnmPc9zhbyMbCE9W/HIqVURsWRNwnbU/IOV4ZGcbmex7UeQJn4K 9Xkk6+rS2PXl2YDoYyZaGgSXMc8zSo85wtEuE/XpKIUhDtkOLZbefVREJGY0okZCqvPr FmMA== X-Gm-Message-State: APjAAAWFWJRJbDsQIlL5m8zjDSLHOL2E3rg4dvHYXBxHDuTcTvZmvLCB wMcfgGrY03vikuYXRpMts+i9NoO4FeM= X-Google-Smtp-Source: APXvYqznuYg6IgFeZUVoiellRammUI6C/wFdUt/DZK5F/mupe/ulSlJ+RYm599rlUHW1GzDrq6miAw== X-Received: by 2002:a62:388d:: with SMTP id f135mr8555509pfa.103.1555745726081; Sat, 20 Apr 2019 00:35:26 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.24 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:29 -1000 Message-Id: <20190420073442.7488-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 Subject: [Qemu-devel] [PATCH 25/38] target/xtensa: Use tcg_gen_abs_i32 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- target/xtensa/translate.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c index 65561d2c49..62be8a6f6a 100644 --- a/target/xtensa/translate.c +++ b/target/xtensa/translate.c @@ -1707,14 +1707,7 @@ void restore_state_to_opc(CPUXtensaState *env, TranslationBlock *tb, static void translate_abs(DisasContext *dc, const OpcodeArg arg[], const uint32_t par[]) { - TCGv_i32 zero = tcg_const_i32(0); - TCGv_i32 neg = tcg_temp_new_i32(); - - tcg_gen_neg_i32(neg, arg[1].in); - tcg_gen_movcond_i32(TCG_COND_GE, arg[0].out, - arg[1].in, zero, arg[1].in, neg); - tcg_temp_free(neg); - tcg_temp_free(zero); + tcg_gen_abs_i32(arg[0].out, arg[1].in); } static void translate_add(DisasContext *dc, const OpcodeArg arg[], From patchwork Sat Apr 20 07:34:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088328 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="ni+RjTjG"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQFq3nZbz9s70 for ; Sat, 20 Apr 2019 17:58:39 +1000 (AEST) Received: from localhost ([127.0.0.1]:38385 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHktR-0002xR-GC for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:58:37 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40461) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXW-00010B-4r for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXU-0000l0-Gp for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:57 -0400 Received: from mail-pg1-x52e.google.com ([2607:f8b0:4864:20::52e]:42203) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXT-00005G-UO for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: by mail-pg1-x52e.google.com with SMTP id p6so3585750pgh.9 for ; Sat, 20 Apr 2019 00:35:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=wLEaZTkk1NbZcHzSWXgsTBTywHt51Uv7RFioHP3q3zU=; b=ni+RjTjGK/dKzsY0d7JkK2+FvozFPeo+u0c9qEeWjdd8tVR/Sz4CABqYFNUpQuk8Ar yMnBEVrKkRfsZBOPm73xqj+kQ69TMJvJFudTNz7jSCOUnC59Q1Rdld67kOzkbWx2ue7V cnDH5nu7WjMFYrRC7M2pFhoOtpyiQWM5yl0C9BBNCo/voaskUHE1YxikFwUBU/IcssSB zCYNkwU186W1gvyuFi6IwpZ7RnPxIrInyxE/A9tlKkOFhP4y6ETBLuwGFWkMzXdhwVya Anr+GzLP/9VDxVLrRQ7cVJbhF3NEqzfrYUG4yOWBcJDsA9Y80SpIznD9YUDV8eQJTerU ebdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wLEaZTkk1NbZcHzSWXgsTBTywHt51Uv7RFioHP3q3zU=; b=hUj0hUynoWQXs3YWAEGVVfdqg1Lmvbrtje9i1MxggWsEFCV7LAjIz/5TrMs+cntK31 nn2pI+fguBk4fnmWEjbfyKFhycKMDioqp7wAEymiBHdtFpaaY2rkuARFMeiBacsl1pD6 K/s0uNb+4CsASuYLY9jbqJ6ctk4gm8Hax836bvMILBMAwMJhV5XpdxMQRpE9/LZxBlpP Ny6XJkbB/l1oU1+G/RJ06UC4u3SdiHEwEGQOx173W3JqxgSxXaBE368JFij9T6ebyAR6 rMAx31RC5F89hdGZ7Mn/UOPqNoksI2+He3X057QfrBe9v0F0YEF/f9rzWRq6TuGioEtU heeg== X-Gm-Message-State: APjAAAWjs2K7R4Nhif/DhnGhox5omJLsc8eRUwKJjdJqSobqo9CwO1Ku YqG+Q1ylTgem68Sl2nc8lAJvQMIxAW0= X-Google-Smtp-Source: APXvYqwFTJEPFhFYhDhFjV+xfbryJXIg5MJVeLH9VbsNEgE1HgmUOjBT4l0JCPhB187WJqtTEgbWUA== X-Received: by 2002:a62:76c1:: with SMTP id r184mr8605049pfc.229.1555745727700; Sat, 20 Apr 2019 00:35:27 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:30 -1000 Message-Id: <20190420073442.7488-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::52e Subject: [Qemu-devel] [PATCH 26/38] tcg/i386: Support vector absolute value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 15 +++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 7445f05885..66f16fbe3c 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -182,7 +182,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_orc_vec 0 #define TCG_TARGET_HAS_not_vec 0 #define TCG_TARGET_HAS_neg_vec 0 -#define TCG_TARGET_HAS_abs_vec 0 +#define TCG_TARGET_HAS_abs_vec 1 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 1 #define TCG_TARGET_HAS_shv_vec have_avx2 diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 85b68e4326..3dae0bf0c5 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -369,6 +369,9 @@ static inline int tcg_target_const_match(tcg_target_long val, TCGType type, #define OPC_MOVSLQ (0x63 | P_REXW) #define OPC_MOVZBL (0xb6 | P_EXT) #define OPC_MOVZWL (0xb7 | P_EXT) +#define OPC_PABSB (0x1c | P_EXT38 | P_DATA16) +#define OPC_PABSW (0x1d | P_EXT38 | P_DATA16) +#define OPC_PABSD (0x1e | P_EXT38 | P_DATA16) #define OPC_PACKSSDW (0x6b | P_EXT | P_DATA16) #define OPC_PACKSSWB (0x63 | P_EXT | P_DATA16) #define OPC_PACKUSDW (0x2b | P_EXT38 | P_DATA16) @@ -2739,6 +2742,10 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, static int const sars_insn[4] = { OPC_UD2, OPC_PSRAW, OPC_PSRAD, OPC_UD2 }; + static int const abs_insn[4] = { + /* TODO: AVX512 adds support for MO_64. */ + OPC_PABSB, OPC_PABSW, OPC_PABSD, OPC_UD2 + }; TCGType type = vecl + TCG_TYPE_V64; int insn, sub; @@ -2827,6 +2834,11 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, insn = OPC_PUNPCKLDQ; goto gen_simd; #endif + case INDEX_op_abs_vec: + insn = abs_insn[vece]; + a2 = a1; + a1 = 0; + goto gen_simd; gen_simd: tcg_debug_assert(insn != OPC_UD2); if (type == TCG_TYPE_V256) { @@ -3204,6 +3216,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_dup2_vec: #endif return &x_x_x; + case INDEX_op_abs_vec: case INDEX_op_dup_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: @@ -3281,6 +3294,8 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_umin_vec: case INDEX_op_umax_vec: return vece <= MO_32 ? 1 : -1; + case INDEX_op_abs_vec: + return vece <= MO_32; default: return 0; From patchwork Sat Apr 20 07:34:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088329 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="ClFSOIvv"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQFr6y0wz9s70 for ; Sat, 20 Apr 2019 17:58:40 +1000 (AEST) Received: from localhost ([127.0.0.1]:38388 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHktS-0002z0-Vp for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:58:39 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40517) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXX-00011a-DZ for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000mm-6F for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:40670) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXU-00006M-VL for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:57 -0400 Received: by mail-pf1-x441.google.com with SMTP id c207so3462049pfc.7 for ; Sat, 20 Apr 2019 00:35:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=L7/jGco+/JJYnhyme4mE2WKjitzinpGKSfbQsfSJn94=; b=ClFSOIvv3wKxiaBOLd3diX8DzrLQocKUTRav/6HZMfPGNjCCugVz/yzulcdHG6njKY lo9dfqDXRHPL2wA3+R4XlVqT1q4PhHFW0FDLj/4aLbpeYaUHuEr83RNiKzVmb59vr3T3 Q18zAxrE7ETxmVbET3JsInsJ0Q12K/zfmBtkHuIWbOnQfQoa3Dw2B1X6guZjMCyjac4F s5G0geHHH0K3DOn1N/LPzjVbj/ppZ4Kb8eai2lmvYyjS7rNb0PGOdioWNnfGyvNDhDUx A3vDvfWzoxsFz36ZWGITRpRH+k2MGvrYdC9Aw66W+l6ZMYYnE/0G/QgcLABLqbocUKOH UkbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=L7/jGco+/JJYnhyme4mE2WKjitzinpGKSfbQsfSJn94=; b=S4gQaVaE4M0VBdhCDAMUZntTtJ2g1JCj42TZxHmfcVyEyof+UZkpBPVfu4h1tcgbfA lguxeTprgEpFzcCtyiTYW9MSum2nCfS09g0nUxsnahaX07XHMKZeVVrggELtzysFvvmL cRMSIiajmX6nTnRQQ5ZcmcmqDsQTU+3rx3q+cc0FvJqfMqTAjb7Fk+3yvIkDQyfUUJ36 TEtwc+9iMW+ilvQlPG75c+dLVRm5jFxW+WvqlazFkr34N+JWYawDkIIFoZhJ9ippjxW0 9qeU0I8Tf2/y+XLGtIrYOdm1aNDwrulQNrZzbuxzZ2FNITkwfsQYBL8UU9TKsPzgrEeq ZO6g== X-Gm-Message-State: APjAAAUvxuOA09uZB6pcdyhFOpXjQ+A0tF0Yf2AQ3x3CIsnXJoDgFyA/ TbN2kbNYb9DVtffyDxEDwusIKbZmQM4= X-Google-Smtp-Source: APXvYqwe4exToZXzU9T7f5JKxz0o3wpqrVnAlp+dW9FGVRdWFAxiTIoD7RD2DkZa/UG0nIL/qYL8cA== X-Received: by 2002:aa7:9116:: with SMTP id 22mr1348838pfh.165.1555745729098; Sat, 20 Apr 2019 00:35:29 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:31 -1000 Message-Id: <20190420073442.7488-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 Subject: [Qemu-devel] [PATCH 27/38] tcg/aarch64: Support vector absolute value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 +- tcg/aarch64/tcg-target.inc.c | 6 ++++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index 21d06d928c..e43554c3c7 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -132,7 +132,7 @@ typedef enum { #define TCG_TARGET_HAS_orc_vec 1 #define TCG_TARGET_HAS_not_vec 1 #define TCG_TARGET_HAS_neg_vec 1 -#define TCG_TARGET_HAS_abs_vec 0 +#define TCG_TARGET_HAS_abs_vec 1 #define TCG_TARGET_HAS_shi_vec 1 #define TCG_TARGET_HAS_shs_vec 0 #define TCG_TARGET_HAS_shv_vec 1 diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 7d2a8213ec..cf891defd4 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -554,6 +554,7 @@ typedef enum { I3617_CMGE0 = 0x2e208800, I3617_CMLE0 = 0x2e20a800, I3617_NOT = 0x2e205800, + I3617_ABS = 0x0e20b800, I3617_NEG = 0x2e20b800, /* System instructions. */ @@ -2205,6 +2206,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, case INDEX_op_neg_vec: tcg_out_insn(s, 3617, NEG, is_q, vece, a0, a1); break; + case INDEX_op_abs_vec: + tcg_out_insn(s, 3617, ABS, is_q, vece, a0, a1); + break; case INDEX_op_and_vec: tcg_out_insn(s, 3616, AND, is_q, 0, a0, a1, a2); break; @@ -2316,6 +2320,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_andc_vec: case INDEX_op_orc_vec: case INDEX_op_neg_vec: + case INDEX_op_abs_vec: case INDEX_op_not_vec: case INDEX_op_cmp_vec: case INDEX_op_shli_vec: @@ -2559,6 +2564,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) return &w_w_w; case INDEX_op_not_vec: case INDEX_op_neg_vec: + case INDEX_op_abs_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: From patchwork Sat Apr 20 07:34:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088330 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="du+z4gqO"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQHN5mbsz9s9G for ; Sat, 20 Apr 2019 18:00:00 +1000 (AEST) Received: from localhost ([127.0.0.1]:38398 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkuk-00048b-QY for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:59:58 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40550) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012Z-9F for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000mz-8o for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pg1-x534.google.com ([2607:f8b0:4864:20::534]:33293) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXV-0000AK-UL for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:58 -0400 Received: by mail-pg1-x534.google.com with SMTP id k19so3604839pgh.0 for ; Sat, 20 Apr 2019 00:35:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=pLJ/jEUEC0kAS1hVRK9AblNcf3gsii98OZW+A7GlfOQ=; b=du+z4gqOhqEQ0Rpi83vlAznGWlhJYA/wYMPZkc7mp92s+m7Cwd+cMrJvqm2FKhYrue UwBwyrMyfOhm9uKip8bl0NGmVyb5yUXmgX5q5HELqw/6jy/gWcwdV/i+3NK1ka7W0x4a K1rgl2/+48jXUNUTTCD2YDsX3X9pbAoVQerM1JNpGPbobfIWpHk8JR0q0Pb07i1JYcGt RWZXAat2kE8GYWCeUs30mA9a9/LMycbLRHoZyBbLJZdM0yVqitS9TdQaozTr0ReHWTiv LG2LTA2FRbK99XsJw0dQ1823XHRXAPim9+ibOw6OGS0diyipamZVoShq4+YSYsHhgBhE O/tA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=pLJ/jEUEC0kAS1hVRK9AblNcf3gsii98OZW+A7GlfOQ=; b=DtBpDeOaDWeY/fhkr/lXrv5lWjf0VAAawmkTcRMlwf25YJJtVxiSClJ9kh4t5jFJh5 2GlswqEzzWgH06LPQ9faaZHbweHZ2eU5ad/fC2USxT2AtKOfjjRPgxuYw05fAMGf9X+n ruG0lz/W57V9NewbVQF3wmgD6j82aKFhFNFDMBZDXNIhfktG2aroMW0ASETKt6aGlQsG QLDaw3e+CON9GUodS1Bos346gJigACb0MNoJYDSHUikU/kx5eguc0tS18pQ1qIvGYp8m 1ssUY8ZQo4QnKUHTn7GP8yyWwZuhFNufBrJlrhBoZi8O/xPMKCTpzXNGPzvDHrLHQrlx /vLw== X-Gm-Message-State: APjAAAV7VRC80aSNTFdQQOb7EbZEO4pK/aQFiDta/9o4zb4xqulzMlFv es0I6bPwK/byGFDcipwfTAFjZkyLBRA= X-Google-Smtp-Source: APXvYqwP1xr/YFir9OL02PUkTcxbIAcX5TTvDxmqJASiun23qbLaZvhSROztDY/G3bhMAqr+Tn0T6Q== X-Received: by 2002:a63:3287:: with SMTP id y129mr8215796pgy.9.1555745730533; Sat, 20 Apr 2019 00:35:30 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.29 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:32 -1000 Message-Id: <20190420073442.7488-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::534 Subject: [Qemu-devel] [PATCH 28/38] tcg: Add support for vector comparison select X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" At present, only tcg_gen_cmpsel_vec added, which can be used by other target-specific vector expanders. It is not clear whether a full gvec expander would be worthwhile, given the unspecified nature of the selector. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 1 + tcg/i386/tcg-target.h | 1 + tcg/tcg-op.h | 2 ++ tcg/tcg-opc.h | 1 + tcg/tcg.h | 1 + tcg/tcg-op-gvec.c | 3 +++ tcg/tcg-op-vec.c | 37 +++++++++++++++++++++++++++++++++++++ tcg/tcg.c | 2 ++ tcg/README | 12 ++++++++++++ 9 files changed, 60 insertions(+) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index e43554c3c7..e1135e930a 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -140,6 +140,7 @@ typedef enum { #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 +#define TCG_TARGET_HAS_cmpsel_vec 0 #define TCG_TARGET_DEFAULT_MO (0) #define TCG_TARGET_HAS_MEMORY_BSWAP 1 diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 66f16fbe3c..683e029980 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -190,6 +190,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 +#define TCG_TARGET_HAS_cmpsel_vec 0 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \ diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h index 660fe205d0..6c4cd0aa14 100644 --- a/tcg/tcg-op.h +++ b/tcg/tcg-op.h @@ -999,6 +999,8 @@ void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec s); void tcg_gen_cmp_vec(TCGCond cond, unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b); +void tcg_gen_cmpsel_vec(unsigned vece, TCGv_vec r, TCGv_vec s, + TCGv_vec a, TCGv_vec b); void tcg_gen_ld_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset); void tcg_gen_st_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset); diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h index 4a2dd116eb..05fb9e3f37 100644 --- a/tcg/tcg-opc.h +++ b/tcg/tcg-opc.h @@ -255,6 +255,7 @@ DEF(shrv_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_shv_vec)) DEF(sarv_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_shv_vec)) DEF(cmp_vec, 1, 2, 1, IMPLVEC) +DEF(cmpsel_vec, 1, 3, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_cmpsel_vec)) DEF(last_generic, 0, 0, 0, TCG_OPF_NOT_PRESENT) diff --git a/tcg/tcg.h b/tcg/tcg.h index 986055fdfa..1abee6cbe5 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -187,6 +187,7 @@ typedef uint64_t TCGRegSet; #define TCG_TARGET_HAS_mul_vec 0 #define TCG_TARGET_HAS_sat_vec 0 #define TCG_TARGET_HAS_minmax_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 0 #else #define TCG_TARGET_MAYBE_vec 1 #endif diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 87d5a01cc9..e7029d26f4 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -96,6 +96,9 @@ static bool tcg_can_emit_vecop_list(const TCGOpcode *list, continue; } break; + case INDEX_op_cmpsel_vec: + /* Fallback expansion uses only required logial ops. */ + continue; default: break; } diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index b7f21145bb..7d8f7b490a 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -599,3 +599,40 @@ void tcg_gen_sars_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_i32 b) { do_shifts(vece, r, a, b, INDEX_op_sars_vec, INDEX_op_sarv_vec); } + +void tcg_gen_cmpsel_vec(unsigned vece, TCGv_vec r, TCGv_vec s, + TCGv_vec a, TCGv_vec b) +{ + TCGTemp *rt = tcgv_vec_temp(r); + TCGTemp *st = tcgv_vec_temp(s); + TCGTemp *at = tcgv_vec_temp(a); + TCGTemp *bt = tcgv_vec_temp(b); + TCGArg ri = temp_arg(rt); + TCGArg si = temp_arg(st); + TCGArg ai = temp_arg(at); + TCGArg bi = temp_arg(bt); + TCGType type = rt->base_type; + const TCGOpcode *hold_list; + int can; + + tcg_debug_assert(st->base_type >= type); + tcg_debug_assert(at->base_type >= type); + tcg_debug_assert(bt->base_type >= type); + tcg_assert_listed_vecop(INDEX_op_cmpsel_vec); + hold_list = tcg_swap_vecop_list(NULL); + + can = tcg_can_emit_vec_op(INDEX_op_cmpsel_vec, type, vece); + if (can > 0) { + vec_gen_4(INDEX_op_cmpsel_vec, type, vece, ri, si, ai, bi); + } else if (can < 0) { + tcg_expand_vec_op(INDEX_op_cmpsel_vec, type, vece, ri, si, ai, bi); + } else { + TCGv_vec t = tcg_temp_new_vec(type); + + tcg_gen_and_vec(vece, t, a, s); + tcg_gen_andc_vec(vece, r, b, s); + tcg_gen_or_vec(vece, r, r, t); + tcg_temp_free_vec(t); + } + tcg_swap_vecop_list(hold_list); +} diff --git a/tcg/tcg.c b/tcg/tcg.c index 86a95a636b..0c68bd5cf5 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -1651,6 +1651,8 @@ bool tcg_op_supported(TCGOpcode op) case INDEX_op_smax_vec: case INDEX_op_umax_vec: return have_vec && TCG_TARGET_HAS_minmax_vec; + case INDEX_op_cmpsel_vec: + return have_vec && TCG_TARGET_HAS_cmpsel_vec; default: tcg_debug_assert(op > INDEX_op_last_generic && op < NB_OPS); diff --git a/tcg/README b/tcg/README index cbdfd3b6bc..16977a8d62 100644 --- a/tcg/README +++ b/tcg/README @@ -627,6 +627,18 @@ E.g. VECL=1 -> 64 << 1 -> v128, and VECE=2 -> 1 << 2 -> i32. Compare vectors by element, storing -1 for true and 0 for false. +* cmpsel_vec v0, v1, v2, v3 + + If an element in v1 is "true", select the corresponding element + from v2, else select the corresponding element from v3, storing + the result in v0. Nominally, this can be implemented as + + v0 = (v2 & v1) | (v3 & ~v1) + + HOWEVER, it is unspecified which bits of v1 are significant. + Thus each element of v1 must be -1 or 0, as if by the result + of cmp_vec. + ********* Note 1: Some shortcuts are defined when the last operand is known to be From patchwork Sat Apr 20 07:34:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088310 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="tP+nEY2A"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mPwp6Qhtz9s4Y for ; Sat, 20 Apr 2019 17:43:54 +1000 (AEST) Received: from localhost ([127.0.0.1]:38170 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkfA-0007MN-O5 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:43:52 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40465) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXW-00010D-5X for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXU-0000kq-CU for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:57 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:45866) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXT-0000CP-Hn for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: by mail-pg1-x543.google.com with SMTP id y3so3581595pgk.12 for ; Sat, 20 Apr 2019 00:35:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=wJcaUBM/vMrRB+8pVZI+UVwqYtaFLioXLl9e3hXIiDY=; b=tP+nEY2ADU1xwbMpw7bbjzbv9adN9LWg3TjcNdZcu0NPliDIiq7oQHE0tCjV5EkVyd Zf2md4I7vWO2iVJMGdaXSYhofnMifuzBBix1hyhKpFk3U9EV6wZW4q853cX/+sSCaNcF jIx72Dc3azHiQww3t+hMJwPVjYHMmCZbQJNS2qmr/JRUJoI4P9F2YFS8hTdIzlH2TUSI ZiekbYnirFSbtMb4kfyIwPzrjhpO8Wrs5jig6lR1zdYQjCFXNGrbE3+OcIz/oZAXiBrS dRWIGPrt/c61YgDxze5dIV5RrGT9xEUjmjNv0GQVnNKfWVNQP3GTsPaaMR+qddLXR4wQ 73gw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=wJcaUBM/vMrRB+8pVZI+UVwqYtaFLioXLl9e3hXIiDY=; b=hdyu2tG3XUchd3DWCe5bwnnabsmGN2M5nxeze/gk2+bJC+kgt79kfoH9R1IQdW1GSJ icCEKGZLBgtEF9SZbVYVvAs892nsylaTWdgUbGXTS0CfyYIZxV2yfk8ocO/ca25W95KB yVIfQO5jiS+PT7TUgA1txpY8O1pdHhPhgsj70qopNWjsaJX7reMsOTYB/87qxWriXPzH HPQFvF5q4kEgrSiZAh+0Lu/kitaacbGPZ01uZfCwy9hnODaH12SfDUWRBK9xS3yorxLV QRNp0HgjyGSq/VrEUlDGinswma96IyXFlZi+0RooB+/9LW6wGwnLgkgo4y3nJLFNpS/R gZoQ== X-Gm-Message-State: APjAAAVK/+en+pzEHTCVddmJu+TADud2vH4myTGITW5v6ZOGSLiFtppp ftbdZ0HZs4SCzVqP9YWf8PkmMuytQD8= X-Google-Smtp-Source: APXvYqxgrK+FYqA6YrThe0xh0OylKh8U7D+IdeHshAvs2XWr6Gxm9OjDv00tp3C/qKl7iq+svtSKNg== X-Received: by 2002:a62:12d0:: with SMTP id 77mr8551481pfs.15.1555745732073; Sat, 20 Apr 2019 00:35:32 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.30 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:31 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:33 -1000 Message-Id: <20190420073442.7488-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::543 Subject: [Qemu-devel] [PATCH 29/38] tcg/i386: Support vector comparison select value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" We already had backend support for this feature, but with a backend-specific opcode. Remove the old name, and reorder the arguments to match the generic opcode. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.opc.h | 1 - tcg/i386/tcg-target.inc.c | 13 ++++++------- 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 683e029980..acdf96b99d 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -190,7 +190,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 -#define TCG_TARGET_HAS_cmpsel_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 1 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \ diff --git a/tcg/i386/tcg-target.opc.h b/tcg/i386/tcg-target.opc.h index e5fa88ba25..d761cca0cd 100644 --- a/tcg/i386/tcg-target.opc.h +++ b/tcg/i386/tcg-target.opc.h @@ -3,7 +3,6 @@ consider these to be UNSPEC with names. */ DEF(x86_shufps_vec, 1, 2, 1, IMPLVEC) -DEF(x86_vpblendvb_vec, 1, 3, 0, IMPLVEC) DEF(x86_blend_vec, 1, 2, 1, IMPLVEC) DEF(x86_packss_vec, 1, 2, 0, IMPLVEC) DEF(x86_packus_vec, 1, 2, 0, IMPLVEC) diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index 3dae0bf0c5..50296b4b2f 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -2921,13 +2921,13 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out8(s, sub); break; - case INDEX_op_x86_vpblendvb_vec: + case INDEX_op_cmpsel_vec: insn = OPC_VPBLENDVB; if (type == TCG_TYPE_V256) { insn |= P_VEXL; } - tcg_out_vex_modrm(s, insn, a0, a1, a2); - tcg_out8(s, args[3] << 4); + tcg_out_vex_modrm(s, insn, a0, a2, args[3]); + tcg_out8(s, a1 << 4); break; case INDEX_op_x86_psrldq_vec: @@ -3223,7 +3223,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_sari_vec: case INDEX_op_x86_psrldq_vec: return &x_x; - case INDEX_op_x86_vpblendvb_vec: + case INDEX_op_cmpsel_vec: return &x_x_x_x; default: @@ -3241,6 +3241,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_or_vec: case INDEX_op_xor_vec: case INDEX_op_andc_vec: + case INDEX_op_cmpsel_vec: return 1; case INDEX_op_cmp_vec: return -1; @@ -3537,9 +3538,7 @@ static void expand_vec_minmax(TCGType type, unsigned vece, TCGv_vec t2; t2 = v1, v1 = v2, v2 = t2; } - vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, - tcgv_vec_arg(v0), tcgv_vec_arg(v1), - tcgv_vec_arg(v2), tcgv_vec_arg(t1)); + tcg_gen_cmpsel_vec(vece, v0, t1, v1, v2); tcg_temp_free_vec(t1); } From patchwork Sat Apr 20 07:34:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088336 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="nZKcgEtq"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQQ53fKWz9s70 for ; Sat, 20 Apr 2019 18:05:49 +1000 (AEST) Received: from localhost ([127.0.0.1]:38506 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHl0N-0000aO-Dq for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:05:47 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40579) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012z-Rb for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000nL-Ft for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:39176) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXW-0000Do-6g for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:58 -0400 Received: by mail-pf1-x442.google.com with SMTP id i17so3465924pfo.6 for ; Sat, 20 Apr 2019 00:35:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=DFRXZD7NX2YzkScQMMhpfUbyNqz4uR3r+/iPuMUtZlg=; b=nZKcgEtqvIgJ6O+Fo0xRG00kc2v8zY51aZI/2Cjkhrs7zUZfhIcNnMDYUvG78zHPpS 6Aiddva4aNAKYnlcM2US7RYifgBCz8mDGjq4BTfZAL5Ed6r+supMHNwK9FesTN0I/5FJ 9g0uHQn7u6N8YD8uTm8whgK3rS2DADJqlGHRzBqs9tGPuZ85Nv1AQNqX1iA6OYkQAvUk 7c73Zer84KtBCmXAlFqxWpF8EsDIjzTn2IS2qzPQNr+tjlKzQKX2HlJb/8N9qSynZlNP 7R2b5YrwOVRCJQPJL2Rs2TNJQQzS1g65CGk944vhCov2n7sntJS6vBE/WzckO5XQQa4x X2KQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=DFRXZD7NX2YzkScQMMhpfUbyNqz4uR3r+/iPuMUtZlg=; b=OYJozarnIn+smp09jIOwljeaYXiI8UuWZ95IbFWRZb25D2mCnw2IEjUgZqZJd9hWmq Soa7TvX1qQPsR0YvNU7sRa3XZePm8DKCEqyKZBBfkZkv/Rz8IwsRotfQoXV2ceQW97PX K/hYfPuWlyLwJDVIwV+Et5UITcMB0pE5s1+31VcbozO/e4G+bwIlEMwkO8WW0ru2XeM5 Mvd0tZuJui6fzcV0FokHLsVW/RNdP5clCSlnaRYtTxPvy4tFKimPeatgNlHtWfBjHkLz b7QLtsiavp4Mk8iUe1E2BqKNsnG74tXmLqsNurSgk5H1qcr6k3PnxetJ/b9vHtEZGzd6 WtQg== X-Gm-Message-State: APjAAAU7Xv/QspYa8Ls3fHojCgso82FgTjcY5dp5liUfC2narpDFl+MO mPlK9La7i03hEZxoolZjIMUsFLNXldI= X-Google-Smtp-Source: APXvYqyYk5vsB6jkCmlKEqH5nt58W0TrYuck8e3hcjJz8nbB6haUaZy3Yo8jGlWu6VRv/WtTZRC3yw== X-Received: by 2002:a65:5202:: with SMTP id o2mr8032976pgp.402.1555745733492; Sat, 20 Apr 2019 00:35:33 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.32 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:34 -1000 Message-Id: <20190420073442.7488-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::442 Subject: [Qemu-devel] [PATCH 30/38] tcg/aarch64: Support vector comparison select value X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The instruction set has 3 insns that perform the same operation, only varying in which operand must overlap the destination. We can represent the operation without overlap and choose based on the operands seen. Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.h | 2 +- tcg/aarch64/tcg-target.inc.c | 24 +++++++++++++++++++++++- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h index e1135e930a..e030bf3c8f 100644 --- a/tcg/aarch64/tcg-target.h +++ b/tcg/aarch64/tcg-target.h @@ -140,7 +140,7 @@ typedef enum { #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 1 -#define TCG_TARGET_HAS_cmpsel_vec 0 +#define TCG_TARGET_HAS_cmpsel_vec 1 #define TCG_TARGET_DEFAULT_MO (0) #define TCG_TARGET_HAS_MEMORY_BSWAP 1 diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index cf891defd4..84d402acd8 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -525,6 +525,9 @@ typedef enum { I3616_ADD = 0x0e208400, I3616_AND = 0x0e201c00, I3616_BIC = 0x0e601c00, + I3616_BIF = 0x2ee01c00, + I3616_BIT = 0x2ea01c00, + I3616_BSL = 0x2e601c00, I3616_EOR = 0x2e201c00, I3616_MUL = 0x0e209c00, I3616_ORR = 0x0ea01c00, @@ -2178,7 +2181,7 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, TCGType type = vecl + TCG_TYPE_V64; unsigned is_q = vecl; - TCGArg a0, a1, a2; + TCGArg a0, a1, a2, a3; a0 = args[0]; a1 = args[1]; @@ -2301,6 +2304,20 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, } break; + case INDEX_op_cmpsel_vec: + a3 = args[3]; + if (a0 == a3) { + tcg_out_insn(s, 3616, BIT, is_q, 0, a0, a2, a1); + } else if (a0 == a2) { + tcg_out_insn(s, 3616, BIF, is_q, 0, a0, a3, a1); + } else { + if (a0 != a1) { + tcg_out_mov(s, type, a0, a1); + } + tcg_out_insn(s, 3616, BSL, is_q, 0, a0, a2, a3); + } + break; + case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov. */ case INDEX_op_dupi_vec: /* Always emitted via tcg_out_movi. */ case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec. */ @@ -2323,6 +2340,7 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_abs_vec: case INDEX_op_not_vec: case INDEX_op_cmp_vec: + case INDEX_op_cmpsel_vec: case INDEX_op_shli_vec: case INDEX_op_shri_vec: case INDEX_op_sari_vec: @@ -2405,6 +2423,8 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) = { .args_ct_str = { "r", "r", "rA", "rZ", "rZ" } }; static const TCGTargetOpDef add2 = { .args_ct_str = { "r", "r", "rZ", "rZ", "rA", "rMZ" } }; + static const TCGTargetOpDef w_w_w_w + = { .args_ct_str = { "w", "w", "w", "w" } }; switch (op) { case INDEX_op_goto_ptr: @@ -2577,6 +2597,8 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) return &w_wr; case INDEX_op_cmp_vec: return &w_w_wZ; + case INDEX_op_cmpsel_vec: + return &w_w_w_w; default: return NULL; From patchwork Sat Apr 20 07:34:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088314 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Rp8jtGDj"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQ075bQ8z9s4Y for ; Sat, 20 Apr 2019 17:46:46 +1000 (AEST) Received: from localhost ([127.0.0.1]:38223 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkhs-0001VN-Fv for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:46:40 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40455) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXV-000108-Vw for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXT-0000k8-Gb for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:36865) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXP-0000Eo-ME for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:53 -0400 Received: by mail-pg1-x544.google.com with SMTP id e6so3598764pgc.4 for ; Sat, 20 Apr 2019 00:35:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aLpUjQsuKJz6T6SbjeElUaQ0sY108vEanXCZ8S+u8lw=; b=Rp8jtGDjhHNIwYXfe5djKCIi44U5/YOw29tGwP1Tu709gLbgvYfBZHx/uf1DSg4gdR 71WnmzCUR4HBAdk5l3vGzqrR6dQs42BulIhwFaIwlRTrHKlfqY9WzQkTd5+MHbhTCSj/ P8e2JVi3LodRoVH2+tO2W9eVtFUUolWO1GPD/6Y0YXxToz7fSxoJ/L1OpxDaK4wuZncF 6ROMdi9JVuYWK1CYNhLLaPmhmobUX4qg4qrpWXvk6+AOfs3bi/N5MgmWmCjH111X8sCn lizjNYxWswtb7i2qDFLIFkiJjwQYs2I84tDKEjZCvyeKrCQlX7wEMbGJC/JyBq4Cj0b6 OCjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aLpUjQsuKJz6T6SbjeElUaQ0sY108vEanXCZ8S+u8lw=; b=LOYlcO/FNrVDOH1O31i6L2RbevGCGIWjd7td3p0is7DvBobOd1lhctJlueeXugr7aG dvNMSWT6cuUUnssr9n8oMjvrm7ma3+g4cdiM5foy0xuUUA9tyGcWT1hSptXbNxItDo+w /xN6JtKylYr3BQA2HZMnvQpYMTT1+U+ZkP5H294bwbYzFIIM7T4mceM1IPVewN5P0Xa8 ULppk8/+pfprmgqW5ENw+JFZFUslRc9SGkOcLL4esvUq+b8zfznNaUYkf8mouqvPJ33f Adf71FYN81z/pf/+ibILbwT4VH6xwLf2bnnnWGCM3urIyFLiKkZLcWXwW6LhxUxXv4yc 1K5g== X-Gm-Message-State: APjAAAUn9UjN8i29RfszZyjKsgsjUaShILW6pr6kCP3F+5wRllaNy056 z1OBZ1zbSY8rXkzCNwQQ+uuSpRY2kmo= X-Google-Smtp-Source: APXvYqzq+S+TX3DvtpDHmIi8fUyA3C3y1NfSo9fuJAn29Pdiyqm1I43Zuv1pFTMmu5zEZDpL6D+JXQ== X-Received: by 2002:a63:1247:: with SMTP id 7mr7965982pgs.352.1555745734929; Sat, 20 Apr 2019 00:35:34 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.33 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:34 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:35 -1000 Message-Id: <20190420073442.7488-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH 31/38] target/ppc: Use vector variable shifts for VS{L, R, RA}{B, H, W, D} X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/ppc/helper.h | 24 ++-- target/ppc/int_helper.c | 6 +- target/ppc/translate/vmx-impl.inc.c | 168 ++++++++++++++++++++++++++-- 3 files changed, 172 insertions(+), 26 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 638a6e99c4..5416dc55ce 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -180,18 +180,18 @@ DEF_HELPER_3(vmuloub, void, avr, avr, avr) DEF_HELPER_3(vmulouh, void, avr, avr, avr) DEF_HELPER_3(vmulouw, void, avr, avr, avr) DEF_HELPER_3(vmuluwm, void, avr, avr, avr) -DEF_HELPER_3(vsrab, void, avr, avr, avr) -DEF_HELPER_3(vsrah, void, avr, avr, avr) -DEF_HELPER_3(vsraw, void, avr, avr, avr) -DEF_HELPER_3(vsrad, void, avr, avr, avr) -DEF_HELPER_3(vsrb, void, avr, avr, avr) -DEF_HELPER_3(vsrh, void, avr, avr, avr) -DEF_HELPER_3(vsrw, void, avr, avr, avr) -DEF_HELPER_3(vsrd, void, avr, avr, avr) -DEF_HELPER_3(vslb, void, avr, avr, avr) -DEF_HELPER_3(vslh, void, avr, avr, avr) -DEF_HELPER_3(vslw, void, avr, avr, avr) -DEF_HELPER_3(vsld, void, avr, avr, avr) +DEF_HELPER_4(vsrab, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrah, void, avr, avr, avr, i32) +DEF_HELPER_4(vsraw, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrad, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrb, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrh, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrw, void, avr, avr, avr, i32) +DEF_HELPER_4(vsrd, void, avr, avr, avr, i32) +DEF_HELPER_4(vslb, void, avr, avr, avr, i32) +DEF_HELPER_4(vslh, void, avr, avr, avr, i32) +DEF_HELPER_4(vslw, void, avr, avr, avr, i32) +DEF_HELPER_4(vsld, void, avr, avr, avr, i32) DEF_HELPER_3(vslo, void, avr, avr, avr) DEF_HELPER_3(vsro, void, avr, avr, avr) DEF_HELPER_3(vsrv, void, avr, avr, avr) diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index 162add561e..35ec1ccdfb 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -1770,7 +1770,8 @@ VSHIFT(r, 0) #undef VSHIFT #define VSL(suffix, element, mask) \ - void helper_vsl##suffix(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \ + void helper_vsl##suffix(ppc_avr_t *r, ppc_avr_t *a, \ + ppc_avr_t *b, uint32_t desc) \ { \ int i; \ \ @@ -1958,7 +1959,8 @@ VNEG(vnegd, s64) #undef VNEG #define VSR(suffix, element, mask) \ - void helper_vsr##suffix(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \ + void helper_vsr##suffix(ppc_avr_t *r, ppc_avr_t *a, \ + ppc_avr_t *b, uint32_t desc) \ { \ int i; \ \ diff --git a/target/ppc/translate/vmx-impl.inc.c b/target/ppc/translate/vmx-impl.inc.c index c83e605a00..8cc2e99963 100644 --- a/target/ppc/translate/vmx-impl.inc.c +++ b/target/ppc/translate/vmx-impl.inc.c @@ -511,6 +511,150 @@ static void gen_vmrgow(DisasContext *ctx) tcg_temp_free_i64(avr); } +static void gen_vsl_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(b); + tcg_gen_dupi_vec(vece, t, (8 << vece) - 1); + tcg_gen_and_vec(vece, b, b, t); + tcg_temp_free_vec(t); + tcg_gen_shlv_vec(vece, d, a, b); +} + +static void gen_vslw_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_andi_i32(b, b, 31); + tcg_gen_shl_i32(d, a, b); +} + +static void gen_vsld_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_andi_i64(b, b, 63); + tcg_gen_shl_i64(d, a, b); +} + +static void gen__vsl(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode shlv_list[] = { INDEX_op_shlv_vec, 0 }; + static const GVecGen3 g[4] = { + { .fniv = gen_vsl_vec, + .fno = gen_helper_vslb, + .opt_opc = shlv_list, + .vece = MO_8 }, + { .fniv = gen_vsl_vec, + .fno = gen_helper_vslh, + .opt_opc = shlv_list, + .vece = MO_16 }, + { .fni4 = gen_vslw_i32, + .fniv = gen_vsl_vec, + .fno = gen_helper_vslw, + .opt_opc = shlv_list, + .vece = MO_32 }, + { .fni8 = gen_vsld_i64, + .fniv = gen_vsl_vec, + .fno = gen_helper_vsld, + .opt_opc = shlv_list, + .vece = MO_64 } + }; + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +static void gen_vsr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(b); + tcg_gen_dupi_vec(vece, t, (8 << vece) - 1); + tcg_gen_and_vec(vece, b, b, t); + tcg_temp_free_vec(t); + tcg_gen_shrv_vec(vece, d, a, b); +} + +static void gen_vsrw_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_andi_i32(b, b, 31); + tcg_gen_shr_i32(d, a, b); +} + +static void gen_vsrd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_andi_i64(b, b, 63); + tcg_gen_shr_i64(d, a, b); +} + +static void gen__vsr(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode shrv_list[] = { INDEX_op_shrv_vec, 0 }; + static const GVecGen3 g[4] = { + { .fniv = gen_vsr_vec, + .fno = gen_helper_vsrb, + .opt_opc = shrv_list, + .vece = MO_8 }, + { .fniv = gen_vsr_vec, + .fno = gen_helper_vsrh, + .opt_opc = shrv_list, + .vece = MO_16 }, + { .fni4 = gen_vsrw_i32, + .fniv = gen_vsr_vec, + .fno = gen_helper_vsrw, + .opt_opc = shrv_list, + .vece = MO_32 }, + { .fni8 = gen_vsrd_i64, + .fniv = gen_vsr_vec, + .fno = gen_helper_vsrd, + .opt_opc = shrv_list, + .vece = MO_64 } + }; + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + +static void gen_vsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(b); + tcg_gen_dupi_vec(vece, t, (8 << vece) - 1); + tcg_gen_and_vec(vece, b, b, t); + tcg_temp_free_vec(t); + tcg_gen_sarv_vec(vece, d, a, b); +} + +static void gen_vsraw_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + tcg_gen_andi_i32(b, b, 31); + tcg_gen_sar_i32(d, a, b); +} + +static void gen_vsrad_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + tcg_gen_andi_i64(b, b, 63); + tcg_gen_sar_i64(d, a, b); +} + +static void gen__vsra(unsigned vece, uint32_t dofs, uint32_t aofs, + uint32_t bofs, uint32_t oprsz, uint32_t maxsz) +{ + static const TCGOpcode sarv_list[] = { INDEX_op_sarv_vec, 0 }; + static const GVecGen3 g[4] = { + { .fniv = gen_vsra_vec, + .fno = gen_helper_vsrab, + .opt_opc = sarv_list, + .vece = MO_8 }, + { .fniv = gen_vsra_vec, + .fno = gen_helper_vsrah, + .opt_opc = sarv_list, + .vece = MO_16 }, + { .fni4 = gen_vsraw_i32, + .fniv = gen_vsra_vec, + .fno = gen_helper_vsraw, + .opt_opc = sarv_list, + .vece = MO_32 }, + { .fni8 = gen_vsrad_i64, + .fniv = gen_vsra_vec, + .fno = gen_helper_vsrad, + .opt_opc = sarv_list, + .vece = MO_64 } + }; + tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]); +} + GEN_VXFORM(vmuloub, 4, 0); GEN_VXFORM(vmulouh, 4, 1); GEN_VXFORM(vmulouw, 4, 2); @@ -526,21 +670,21 @@ GEN_VXFORM(vmuleuw, 4, 10); GEN_VXFORM(vmulesb, 4, 12); GEN_VXFORM(vmulesh, 4, 13); GEN_VXFORM(vmulesw, 4, 14); -GEN_VXFORM(vslb, 2, 4); -GEN_VXFORM(vslh, 2, 5); -GEN_VXFORM(vslw, 2, 6); +GEN_VXFORM_V(vslb, MO_8, gen__vsl, 2, 4); +GEN_VXFORM_V(vslh, MO_16, gen__vsl, 2, 5); +GEN_VXFORM_V(vslw, MO_32, gen__vsl, 2, 6); +GEN_VXFORM_V(vsld, MO_64, gen__vsl, 2, 23); GEN_VXFORM(vrlwnm, 2, 6); GEN_VXFORM_DUAL(vslw, PPC_ALTIVEC, PPC_NONE, \ vrlwnm, PPC_NONE, PPC2_ISA300) -GEN_VXFORM(vsld, 2, 23); -GEN_VXFORM(vsrb, 2, 8); -GEN_VXFORM(vsrh, 2, 9); -GEN_VXFORM(vsrw, 2, 10); -GEN_VXFORM(vsrd, 2, 27); -GEN_VXFORM(vsrab, 2, 12); -GEN_VXFORM(vsrah, 2, 13); -GEN_VXFORM(vsraw, 2, 14); -GEN_VXFORM(vsrad, 2, 15); +GEN_VXFORM_V(vsrb, MO_8, gen__vsr, 2, 8); +GEN_VXFORM_V(vsrh, MO_16, gen__vsr, 2, 9); +GEN_VXFORM_V(vsrw, MO_32, gen__vsr, 2, 10); +GEN_VXFORM_V(vsrd, MO_64, gen__vsr, 2, 27); +GEN_VXFORM_V(vsrab, MO_8, gen__vsra, 2, 12); +GEN_VXFORM_V(vsrah, MO_16, gen__vsra, 2, 13); +GEN_VXFORM_V(vsraw, MO_32, gen__vsra, 2, 14); +GEN_VXFORM_V(vsrad, MO_64, gen__vsra, 2, 15); GEN_VXFORM(vsrv, 2, 28); GEN_VXFORM(vslv, 2, 29); GEN_VXFORM(vslo, 6, 16); From patchwork Sat Apr 20 07:34:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088333 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="Zr2u0cMZ"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQLV2Pqhz9s70 for ; Sat, 20 Apr 2019 18:02:41 +1000 (AEST) Received: from localhost ([127.0.0.1]:38451 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkxK-0006bW-V7 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:02:38 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40510) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXX-00011S-92 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXU-0000kz-Go for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:38420) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXT-0000GB-FL for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: by mail-pg1-x544.google.com with SMTP id j26so3592677pgl.5 for ; Sat, 20 Apr 2019 00:35:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=c0vmagyfgwQZ6owRzZSMoEhDUfxJwisKXvt2xHvVFNQ=; b=Zr2u0cMZ9InkwKU/iGL21byBdl7AHGs1a4i7zYc762zaPsAJGt3Jrmtd9BBGResDEM xm0Gux3rgp8fDbD9giUBnTu6v43ic34BD60T3+C0hSr1PkMuQwDRMejGKrmaROzjg+mm 3D2CAqzbuPI8328PVNDGFHLMOqZTbhy0DXeL3cv8UOnGo3ccTubu5lx6ihiJc+NEPt0t 6ivKb3zPTvV36pl9qWzOawpok0veGGYoj7Fq9eoGWs2AEtdOo9kqKXu7Nk8GDlSQluSI iEpzM/p+oLiZF7oEp1pxSOOuUt8GJKnKrHBWorGmqnKSfz76aQDXAofEK88ykXCyre8B z/NQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=c0vmagyfgwQZ6owRzZSMoEhDUfxJwisKXvt2xHvVFNQ=; b=IKINDXmB6It/6EYbYeUDt6de5uEyQbUxfLHENV+IbG8sVtl5UOBwYYZiOvU/PP/OP6 TuJWc+NBKsvd8kJzQBLdl5TdTnj1ksuoT2BBT71SNM2iPvU5xW9TY81U70fVlEGa8CL+ qk73wEQ9XY9CGPYEpAP78mtJL+tNl44kiLKYa5oJR/7Q0wbzhHb4w5RsJv0LlDnUQdGX rCk9EIzDgAw5bb0/iO2NYqi97eSNxSNTAhf51TCjMNvYElLYYjLGH1CADIaLIAwWF7iY rdEaKvwA5u8pdMhhNIxyamno/aK5nMlSdHan+Oix3S9pNeEJ4iYw5M5ulh/n542GZe90 XZwQ== X-Gm-Message-State: APjAAAWm2nXTkZMBRRGVwFS/YY9rI/JSZ16KS5OUhR2kUJgRPjIJUdOV h0P6ZDKZu0e9hyi/Ga75gmf/HmCELzI= X-Google-Smtp-Source: APXvYqzhfFnAGd06yJi5EqMgqgg0O+9Iq+pVU+fbBHn08323xZamZGO4OnlTNyC2iCXX+zd8+VbrEQ== X-Received: by 2002:a63:6988:: with SMTP id e130mr8153226pgc.150.1555745736726; Sat, 20 Apr 2019 00:35:36 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.35 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:36 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:36 -1000 Message-Id: <20190420073442.7488-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::544 Subject: [Qemu-devel] [PATCH 32/38] target/arm: Vectorize USHL and SSHL X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" These instructions shift left or right depending on the sign of the input, and 7 bits are significant to the shift. This requires several masks and selects in addition to the actual shifts to form the complete answer. Signed-off-by: Richard Henderson --- target/arm/helper.h | 15 +- target/arm/translate.h | 6 + target/arm/neon_helper.c | 33 ----- target/arm/translate-a64.c | 18 +-- target/arm/translate.c | 288 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 176 +++++++++++++++++++++++ 6 files changed, 470 insertions(+), 66 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 3d90b5be66..1c0de661fb 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -292,14 +292,8 @@ DEF_HELPER_2(neon_abd_s16, i32, i32, i32) DEF_HELPER_2(neon_abd_u32, i32, i32, i32) DEF_HELPER_2(neon_abd_s32, i32, i32, i32) -DEF_HELPER_2(neon_shl_u8, i32, i32, i32) -DEF_HELPER_2(neon_shl_s8, i32, i32, i32) DEF_HELPER_2(neon_shl_u16, i32, i32, i32) DEF_HELPER_2(neon_shl_s16, i32, i32, i32) -DEF_HELPER_2(neon_shl_u32, i32, i32, i32) -DEF_HELPER_2(neon_shl_s32, i32, i32, i32) -DEF_HELPER_2(neon_shl_u64, i64, i64, i64) -DEF_HELPER_2(neon_shl_s64, i64, i64, i64) DEF_HELPER_2(neon_rshl_u8, i32, i32, i32) DEF_HELPER_2(neon_rshl_s8, i32, i32, i32) DEF_HELPER_2(neon_rshl_u16, i32, i32, i32) @@ -686,6 +680,15 @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr) +DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sshl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_sshl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_ushl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_ushl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_ushl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 912cc2a4a5..633668fa1b 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -244,6 +244,8 @@ extern const GVecGen3 bif_op; extern const GVecGen3 mla_op[4]; extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; +extern const GVecGen3 sshl_op[4]; +extern const GVecGen3 ushl_op[4]; extern const GVecGen2i ssra_op[4]; extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; @@ -253,6 +255,10 @@ extern const GVecGen4 sqadd_op[4]; extern const GVecGen4 uqsub_op[4]; extern const GVecGen4 sqsub_op[4]; void gen_cmtst_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); +void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); +void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); /* * Forward to the isar_feature_* tests given a DisasContext pointer. diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index 4259056723..c581ffb7d3 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -615,24 +615,9 @@ NEON_VOP(abd_u32, neon_u32, 1) } else { \ dest = src1 << tmp; \ }} while (0) -NEON_VOP(shl_u8, neon_u8, 4) NEON_VOP(shl_u16, neon_u16, 2) -NEON_VOP(shl_u32, neon_u32, 1) #undef NEON_FN -uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop) -{ - int8_t shift = (int8_t)shiftop; - if (shift >= 64 || shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= -shift; - } else { - val <<= shift; - } - return val; -} - #define NEON_FN(dest, src1, src2) do { \ int8_t tmp; \ tmp = (int8_t)src2; \ @@ -645,27 +630,9 @@ uint64_t HELPER(neon_shl_u64)(uint64_t val, uint64_t shiftop) } else { \ dest = src1 << tmp; \ }} while (0) -NEON_VOP(shl_s8, neon_s8, 4) NEON_VOP(shl_s16, neon_s16, 2) -NEON_VOP(shl_s32, neon_s32, 1) #undef NEON_FN -uint64_t HELPER(neon_shl_s64)(uint64_t valop, uint64_t shiftop) -{ - int8_t shift = (int8_t)shiftop; - int64_t val = valop; - if (shift >= 64) { - val = 0; - } else if (shift <= -64) { - val >>= 63; - } else if (shift < 0) { - val >>= -shift; - } else { - val <<= shift; - } - return val; -} - #define NEON_FN(dest, src1, src2) do { \ int8_t tmp; \ tmp = (int8_t)src2; \ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index fd8921565e..c30f99c7cd 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -8845,9 +8845,9 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u, break; case 0x8: /* SSHL, USHL */ if (u) { - gen_helper_neon_shl_u64(tcg_rd, tcg_rn, tcg_rm); + gen_ushl_i64(tcg_rd, tcg_rn, tcg_rm); } else { - gen_helper_neon_shl_s64(tcg_rd, tcg_rn, tcg_rm); + gen_sshl_i64(tcg_rd, tcg_rn, tcg_rm); } break; case 0x9: /* SQSHL, UQSHL */ @@ -11242,6 +11242,10 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) is_q ? 16 : 8, vec_full_reg_size(s), (u ? uqsub_op : sqsub_op) + size); return; + case 0x08: /* SSHL, USHL */ + gen_gvec_op3(s, is_q, rd, rn, rm, + u ? &ushl_op[size] : &sshl_op[size]); + return; case 0x0c: /* SMAX, UMAX */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_umax, size); @@ -11357,16 +11361,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genfn = fns[size][u]; break; } - case 0x8: /* SSHL, USHL */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_shl_s8, gen_helper_neon_shl_u8 }, - { gen_helper_neon_shl_s16, gen_helper_neon_shl_u16 }, - { gen_helper_neon_shl_s32, gen_helper_neon_shl_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x9: /* SQSHL, UQSHL */ { static NeonGenTwoOpEnvFn * const fns[3][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index 911ad0bdab..1fd31228f7 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -5285,13 +5285,13 @@ static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift, if (u) { switch (size) { case 1: gen_helper_neon_shl_u16(var, var, shift); break; - case 2: gen_helper_neon_shl_u32(var, var, shift); break; + case 2: gen_ushl_i32(var, var, shift); break; default: abort(); } } else { switch (size) { case 1: gen_helper_neon_shl_s16(var, var, shift); break; - case 2: gen_helper_neon_shl_s32(var, var, shift); break; + case 2: gen_sshl_i32(var, var, shift); break; default: abort(); } } @@ -6170,6 +6170,270 @@ const GVecGen3 cmtst_op[4] = { .vece = MO_64 }, }; +void gen_ushl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 lval = tcg_temp_new_i32(); + TCGv_i32 rval = tcg_temp_new_i32(); + TCGv_i32 lsh = tcg_temp_new_i32(); + TCGv_i32 rsh = tcg_temp_new_i32(); + TCGv_i32 zero = tcg_const_i32(0); + TCGv_i32 max = tcg_const_i32(32); + + /* + * Perform possibly out of range shifts, trusting that the operation + * does not trap. Discard unused results after the fact. + */ + tcg_gen_ext8s_i32(lsh, b); + tcg_gen_neg_i32(rsh, lsh); + tcg_gen_shl_i32(lval, a, lsh); + tcg_gen_shr_i32(rval, a, rsh); + tcg_gen_movcond_i32(TCG_COND_LTU, d, lsh, max, lval, zero); + tcg_gen_movcond_i32(TCG_COND_LTU, d, rsh, max, rval, d); + + tcg_temp_free_i32(lval); + tcg_temp_free_i32(rval); + tcg_temp_free_i32(lsh); + tcg_temp_free_i32(rsh); + tcg_temp_free_i32(zero); + tcg_temp_free_i32(max); +} + +void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 lval = tcg_temp_new_i64(); + TCGv_i64 rval = tcg_temp_new_i64(); + TCGv_i64 lsh = tcg_temp_new_i64(); + TCGv_i64 rsh = tcg_temp_new_i64(); + TCGv_i64 zero = tcg_const_i64(0); + TCGv_i64 max = tcg_const_i64(64); + + /* + * Perform possibly out of range shifts, trusting that the operation + * does not trap. Discard unused results after the fact. + */ + tcg_gen_ext8s_i64(lsh, b); + tcg_gen_neg_i64(rsh, lsh); + tcg_gen_shl_i64(lval, a, lsh); + tcg_gen_shr_i64(rval, a, rsh); + tcg_gen_movcond_i64(TCG_COND_LTU, d, lsh, max, lval, zero); + tcg_gen_movcond_i64(TCG_COND_LTU, d, rsh, max, rval, d); + + tcg_temp_free_i64(lval); + tcg_temp_free_i64(rval); + tcg_temp_free_i64(lsh); + tcg_temp_free_i64(rsh); + tcg_temp_free_i64(zero); + tcg_temp_free_i64(max); +} + +static void gen_ushl_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec lval = tcg_temp_new_vec_matching(d); + TCGv_vec rval = tcg_temp_new_vec_matching(d); + TCGv_vec lsh = tcg_temp_new_vec_matching(d); + TCGv_vec rsh = tcg_temp_new_vec_matching(d); + TCGv_vec msk, max; + + /* + * Since we don't have a sign-extend vector primitive, negate and mask. + * With the out-of-range check below, we'll select the correct answer. + */ + tcg_gen_neg_vec(vece, rsh, b); + if (vece == MO_8) { + tcg_gen_mov_vec(lsh, b); + } else { + msk = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, msk, 0xff); + tcg_gen_and_vec(vece, lsh, b, msk); + tcg_gen_and_vec(vece, rsh, rsh, msk); + tcg_temp_free_vec(msk); + } + + /* + * Perform possibly out of range shifts, trusting that the operation + * does not trap. Discard unused results after the fact. + */ + tcg_gen_shlv_vec(vece, lval, a, lsh); + tcg_gen_shrv_vec(vece, rval, a, rsh); + + max = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, max, 8 << vece); + tcg_gen_cmp_vec(TCG_COND_LTU, vece, lsh, lsh, max); + tcg_gen_cmp_vec(TCG_COND_LTU, vece, rsh, rsh, max); + tcg_temp_free_vec(max); + + tcg_gen_and_vec(vece, lval, lval, lsh); + tcg_gen_and_vec(vece, rval, rval, rsh); + tcg_gen_or_vec(vece, d, lval, rval); + + tcg_temp_free_vec(lval); + tcg_temp_free_vec(rval); + tcg_temp_free_vec(lsh); + tcg_temp_free_vec(rsh); +} + +static const TCGOpcode ushl_list[] = { + INDEX_op_neg_vec, INDEX_op_shlv_vec, + INDEX_op_shrv_vec, INDEX_op_cmp_vec, 0 +}; + +const GVecGen3 ushl_op[4] = { + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_b, + .opt_opc = ushl_list, + .vece = MO_8 }, + { .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_h, + .opt_opc = ushl_list, + .vece = MO_16 }, + { .fni4 = gen_ushl_i32, + .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_s, + .opt_opc = ushl_list, + .vece = MO_32 }, + { .fni8 = gen_ushl_i64, + .fniv = gen_ushl_vec, + .fno = gen_helper_gvec_ushl_d, + .opt_opc = ushl_list, + .vece = MO_64 }, +}; + +void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 lval = tcg_temp_new_i32(); + TCGv_i32 rval = tcg_temp_new_i32(); + TCGv_i32 lsh = tcg_temp_new_i32(); + TCGv_i32 rsh = tcg_temp_new_i32(); + TCGv_i32 zero = tcg_const_i32(0); + TCGv_i32 max = tcg_const_i32(31); + + /* + * Perform possibly out of range shifts, trusting that the operation + * does not trap. Discard unused results after the fact. + */ + tcg_gen_ext8s_i32(lsh, b); + tcg_gen_neg_i32(rsh, lsh); + tcg_gen_shl_i32(lval, a, lsh); + tcg_gen_umin_i32(rsh, rsh, max); + tcg_gen_sar_i32(rval, a, rsh); + tcg_gen_movcond_i32(TCG_COND_LEU, lval, lsh, max, lval, zero); + tcg_gen_movcond_i32(TCG_COND_LT, d, lsh, zero, rval, lval); + + tcg_temp_free_i32(lval); + tcg_temp_free_i32(rval); + tcg_temp_free_i32(lsh); + tcg_temp_free_i32(rsh); + tcg_temp_free_i32(zero); + tcg_temp_free_i32(max); +} + +void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 lval = tcg_temp_new_i64(); + TCGv_i64 rval = tcg_temp_new_i64(); + TCGv_i64 lsh = tcg_temp_new_i64(); + TCGv_i64 rsh = tcg_temp_new_i64(); + TCGv_i64 zero = tcg_const_i64(0); + TCGv_i64 max = tcg_const_i64(63); + + /* + * Perform possibly out of range shifts, trusting that the operation + * does not trap. Discard unused results after the fact. + */ + tcg_gen_ext8s_i64(lsh, b); + tcg_gen_neg_i64(rsh, lsh); + tcg_gen_shl_i64(lval, a, lsh); + tcg_gen_umin_i64(rsh, rsh, max); + tcg_gen_sar_i64(rval, a, rsh); + tcg_gen_movcond_i64(TCG_COND_LEU, lval, lsh, max, lval, zero); + tcg_gen_movcond_i64(TCG_COND_LT, d, lsh, zero, rval, lval); + + tcg_temp_free_i64(lval); + tcg_temp_free_i64(rval); + tcg_temp_free_i64(lsh); + tcg_temp_free_i64(rsh); + tcg_temp_free_i64(zero); + tcg_temp_free_i64(max); +} + +static void gen_sshl_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec lval = tcg_temp_new_vec_matching(d); + TCGv_vec rval = tcg_temp_new_vec_matching(d); + TCGv_vec lsh = tcg_temp_new_vec_matching(d); + TCGv_vec rsh = tcg_temp_new_vec_matching(d); + TCGv_vec msk = tcg_temp_new_vec_matching(d); + TCGv_vec max = tcg_temp_new_vec_matching(d); + + /* + * Since we don't have a sign-extend vector primitive, negate and mask. + * With the out-of-range check below, we'll select the correct answer. + */ + tcg_gen_neg_vec(vece, rsh, b); + if (vece == MO_8) { + tcg_gen_mov_vec(lsh, b); + } else { + tcg_gen_dupi_vec(vece, msk, 0xff); + tcg_gen_and_vec(vece, lsh, b, msk); + tcg_gen_and_vec(vece, rsh, rsh, msk); + } + + /* Bound rsh so out of bound right shift gets -1. */ + tcg_gen_dupi_vec(vece, max, (8 << vece) - 1); + tcg_gen_umin_vec(vece, rsh, rsh, max); + + /* Select between left and right shift. */ + tcg_gen_dupi_vec(vece, msk, vece == MO_8 ? 0 : 0x80); + tcg_gen_cmp_vec(TCG_COND_LT, vece, msk, lsh, msk); + + tcg_gen_shlv_vec(vece, lval, a, lsh); + tcg_gen_sarv_vec(vece, rval, a, rsh); + + /* Select in-bound left shift. */ + tcg_gen_cmp_vec(TCG_COND_GT, vece, lsh, lsh, max); + + /* Merge the selections above. */ + tcg_gen_andc_vec(vece, lval, lval, lsh); + if (vece == MO_8) { + tcg_gen_cmpsel_vec(vece, d, msk, rval, lval); + } else { + tcg_gen_cmpsel_vec(vece, d, msk, lval, rval); + } + + tcg_temp_free_vec(lval); + tcg_temp_free_vec(rval); + tcg_temp_free_vec(lsh); + tcg_temp_free_vec(rsh); + tcg_temp_free_vec(msk); + tcg_temp_free_vec(max); +} + +static const TCGOpcode sshl_list[] = { + INDEX_op_neg_vec, INDEX_op_umin_vec, INDEX_op_shlv_vec, + INDEX_op_sarv_vec, INDEX_op_cmp_vec, INDEX_op_cmpsel_vec, 0 +}; + +const GVecGen3 sshl_op[4] = { + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_b, + .opt_opc = sshl_list, + .vece = MO_8 }, + { .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_h, + .opt_opc = sshl_list, + .vece = MO_16 }, + { .fni4 = gen_sshl_i32, + .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_s, + .opt_opc = sshl_list, + .vece = MO_32 }, + { .fni8 = gen_sshl_i64, + .fniv = gen_sshl_vec, + .fno = gen_helper_gvec_sshl_d, + .opt_opc = sshl_list, + .vece = MO_64 }, +}; + static void gen_uqadd_vec(unsigned vece, TCGv_vec t, TCGv_vec sat, TCGv_vec a, TCGv_vec b) { @@ -6573,6 +6837,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) vec_size, vec_size); } return 0; + + case NEON_3R_VSHL: + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, vec_size, vec_size, + u ? &ushl_op[size] : &sshl_op[size]); + return 0; } if (size == 3) { @@ -6581,13 +6850,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) neon_load_reg64(cpu_V0, rn + pass); neon_load_reg64(cpu_V1, rm + pass); switch (op) { - case NEON_3R_VSHL: - if (u) { - gen_helper_neon_shl_u64(cpu_V0, cpu_V1, cpu_V0); - } else { - gen_helper_neon_shl_s64(cpu_V0, cpu_V1, cpu_V0); - } - break; case NEON_3R_VQSHL: if (u) { gen_helper_neon_qshl_u64(cpu_V0, cpu_env, @@ -6622,7 +6884,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } pairwise = 0; switch (op) { - case NEON_3R_VSHL: case NEON_3R_VQSHL: case NEON_3R_VRSHL: case NEON_3R_VQRSHL: @@ -6702,9 +6963,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VHSUB: GEN_NEON_INTEGER_OP(hsub); break; - case NEON_3R_VSHL: - GEN_NEON_INTEGER_OP(shl); - break; case NEON_3R_VQSHL: GEN_NEON_INTEGER_OP_ENV(qshl); break; @@ -7113,9 +7371,9 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) } } else { if (input_unsigned) { - gen_helper_neon_shl_u64(cpu_V0, in, tmp64); + gen_ushl_i64(cpu_V0, in, tmp64); } else { - gen_helper_neon_shl_s64(cpu_V0, in, tmp64); + gen_sshl_i64(cpu_V0, in, tmp64); } } tmp = tcg_temp_new_i32(); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index dedef62403..9f8eee5611 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1046,3 +1046,179 @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm, do_fmlal_idx(vd, vn, vm, &env->vfp.fp_status, desc, get_flush_inputs_to_zero(&env->vfp.fp_status_f16)); } + +void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + int8_t mm = m[i]; + int8_t nn = n[i]; + int8_t res = 0; + if (mm >= 0) { + if (mm < 8) { + res = nn << mm; + } + } else { + res = nn >> (mm > -8 ? -mm : 7); + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sshl_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + int16_t nn = n[i]; + int16_t res = 0; + if (mm >= 0) { + if (mm < 16) { + res = nn << mm; + } + } else { + res = nn >> (mm > -16 ? -mm : 15); + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sshl_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + int32_t nn = n[i]; + int32_t res = 0; + if (mm >= 0) { + if (mm < 32) { + res = nn << mm; + } + } else { + res = nn >> (mm > -32 ? -mm : 31); + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sshl_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + int64_t nn = n[i]; + int64_t res = 0; + if (mm >= 0) { + if (mm < 64) { + res = nn << mm; + } + } else { + res = nn >> (mm > -64 ? -mm : 63); + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_ushl_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + int8_t mm = m[i]; + uint8_t nn = n[i]; + uint8_t res = 0; + if (mm >= 0) { + if (mm < 8) { + res = nn << mm; + } + } else { + if (mm > -8) { + res = nn >> -mm; + } + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_ushl_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + uint16_t nn = n[i]; + uint16_t res = 0; + if (mm >= 0) { + if (mm < 16) { + res = nn << mm; + } + } else { + if (mm > -16) { + res = nn >> -mm; + } + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_ushl_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + uint32_t nn = n[i]; + uint32_t res = 0; + if (mm >= 0) { + if (mm < 32) { + res = nn << mm; + } + } else { + if (mm > -32) { + res = nn >> -mm; + } + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_ushl_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + int8_t mm = m[i]; /* only 8 bits of shift are significant */ + uint64_t nn = n[i]; + uint64_t res = 0; + if (mm >= 0) { + if (mm < 64) { + res = nn << mm; + } + } else { + if (mm > -64) { + res = nn >> -mm; + } + } + d[i] = res; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Sat Apr 20 07:34:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088331 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="vDULNkCE"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQJc54kQz9s70 for ; Sat, 20 Apr 2019 18:01:04 +1000 (AEST) Received: from localhost ([127.0.0.1]:38436 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkvl-00050J-Ru for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:01:01 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40556) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012g-Dc for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000n9-Dt for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:38052) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXW-0000HX-6Y for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:58 -0400 Received: by mail-pf1-x444.google.com with SMTP id 10so3469324pfo.5 for ; Sat, 20 Apr 2019 00:35:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6wY+MX9gb/NGFl8f/7Ms1g+8OImy6UwT9MWlwo8Wb2o=; b=vDULNkCEVDAEy+8uXRYHenNyq/KPXhdCOJ3lrZFI20hgHAtRTlH7kGiNyRof7ozS4Q NPHHvzUakzyQzr4B8GH7ydiLHfQ8waw7OF1JntO0EP0R/qC1XqTLNNnUGKBKn7r1UKB5 8MZkEk7Wk9Jp91DlXm+Jr7qbv/E0wI8Uy2jFwyldJN9DgCRfyU2nPafHXC6QLiCQLrDA ywIjTEypuG9Ktp6MgKks6tISpWil9Ai5z3sqAtC2RfS03UuyOvNdiTbgBK8B+IVKBMhG Bysz7Da3uI3Iv49GK21bvj/G5CJ42KjH9KsYgF3bJhTfUflP82A/kYal3viogs0Qxxhq BRWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6wY+MX9gb/NGFl8f/7Ms1g+8OImy6UwT9MWlwo8Wb2o=; b=axu9eZIReIiG9Vn4CXW4qLLS0JYWNQNc6r55ni24mvHYOuS8Pb1qM6w4nuhwRsQFGi tZkg2g2USaMsKMPsxgMJQzxTJtS7i0qZXtwgbyStmTDxgX5j7qflawf+3G4XXEQzHFS1 Gib1N0uBXDgffWiUgn1+VqlCBNqv/DZGhCV5vbDhrExDyeAJe1aUm13HT0ktLsUVR7It ibPAG0I5mg65BxSrvlifVAbEX2yeULHdhEPtiV9mLliHEIxJmVSZywrfzAzRRSw2TqK4 ebSDN/2OC1h7wNOfAAJGzhcq64o8ALMkpX2ZgPnNxKvumUpwU0JxT+L2EF6K8jDQQogw qsIw== X-Gm-Message-State: APjAAAVo9jzQ5PRwtHsK3NA9aXFw0cSLdOBawInNnbSURV8vyiiMFIDJ bEedKa8omYdf5gpKb65I5v3Jk+bfwZI= X-Google-Smtp-Source: APXvYqzDTbHYGE+VpuCEipdw08Q84+/H5R5ektpgf7UBk6NTwlGwz8JrZ+jNBUudHkKi7RtREL3XGQ== X-Received: by 2002:a63:fa46:: with SMTP id g6mr8134255pgk.382.1555745738114; Sat, 20 Apr 2019 00:35:38 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:37 -1000 Message-Id: <20190420073442.7488-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::444 Subject: [Qemu-devel] [PATCH 33/38] tcg/aarch64: Do not advertise minmax for MO_64 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The min/max instructions are not available for 64-bit elements. Fixes: 93f332a50371 Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 84d402acd8..e68e4de08c 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -2348,16 +2348,16 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece) case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: case INDEX_op_ussub_vec: - case INDEX_op_smax_vec: - case INDEX_op_smin_vec: - case INDEX_op_umax_vec: - case INDEX_op_umin_vec: case INDEX_op_shlv_vec: return 1; case INDEX_op_shrv_vec: case INDEX_op_sarv_vec: return -1; case INDEX_op_mul_vec: + case INDEX_op_smax_vec: + case INDEX_op_smin_vec: + case INDEX_op_umax_vec: + case INDEX_op_umin_vec: return vece < MO_64; default: From patchwork Sat Apr 20 07:34:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088326 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="gOoNgjHu"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQC95SfVz9s4Y for ; Sat, 20 Apr 2019 17:56:21 +1000 (AEST) Received: from localhost ([127.0.0.1]:38367 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkrD-00011m-Li for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:56:19 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40405) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXN-0000uN-Rr for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXK-0000Ve-3u for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:49 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:43449) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXI-0000Jh-7g for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:45 -0400 Received: by mail-pf1-x441.google.com with SMTP id c8so3458979pfd.10 for ; Sat, 20 Apr 2019 00:35:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=iaXmBLRJ4AetvIw8U8LEOO46+IUqp272JP0xIL80EUs=; b=gOoNgjHukeRlvISeBnypWZmGThCjrKpAFA9VJde2eH4r3h58nQRzbELCV2hG/LMJXa VxwKP2n0q9gt5YlgVBLzmrrNPnglROvKNvkPimOAudF2fQP8Q9o2K0EN9aS1RtoOEUPY R6KR+lNyg6hwtFsloSPiti1L3tqfg3OpcDtS25TH7VS6nUyKd1TnuxWZYwuPQs7vX0i7 m1GaIVUljizfwxxzTJ6YMT5fOG1eeYUETuKg9Ynn4yOP+0FKhNH4M3Jv+KDuaxqVtDCt 9HEtgTMDk3nKSCwGXXyV//3nsorkXQR6ybwmz25dq10y3jgr9iKD9YqdFgKNwK1sr0Qi x4gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=iaXmBLRJ4AetvIw8U8LEOO46+IUqp272JP0xIL80EUs=; b=JeK64MMKrYutltM/uG7DSaLAt27Dg/weI/pt9CAqRZQgm4vUNF6Lmt+EUZQFb8TwR8 GgnJaVRKySv9P0+iornc7kJkqC/6xkh2zDtnAMwtjyfXkuD6Kow8OQGoLb6/XRHjqyq7 I8/1qgB0O5vpoCVsITjmG2xL0JPIE4tByaTL0A/1hGwGrt/iFNqru9mNYSLwf9rkI8Om enb5uT4N1DdfreYSMsPkL5nKhb/F4MyL6IMvSpf2La2d0I2JcbuGYc0txx09DAH3TZ3m dweoI4U1kpOqNQY1JehBP+mzwjgdFnT32ZH/VpMzj59W2gmPjQ6ME3o4Ut0Kl44JG2Mn LJkQ== X-Gm-Message-State: APjAAAWoW0hpmLm8plqZ+j7K5R1Vi8yFf6W1S/d4HOZJhM7eMxssi/9N 8RAnLRsMCvnntj0t3Aq9cl65NqfI4qk= X-Google-Smtp-Source: APXvYqyi0r9Pw3Lt4HethJXjU9cDCXIlvla/wAnTCS5zLmsW87d5970wu9BaA65ZIHh6hATJksmIlg== X-Received: by 2002:a62:4115:: with SMTP id o21mr8453186pfa.153.1555745739569; Sat, 20 Apr 2019 00:35:39 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:38 -1000 Message-Id: <20190420073442.7488-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::441 Subject: [Qemu-devel] [PATCH 34/38] tcg: Do not recreate INDEX_op_neg_vec unless supported X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Use tcg_can_emit_vec_op instead of just TCG_TARGET_HAS_neg_vec, so that we check the type and vece for the actual operation. Signed-off-by: Richard Henderson --- tcg/optimize.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tcg/optimize.c b/tcg/optimize.c index 5150c38a25..24faa06260 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -734,9 +734,13 @@ void tcg_optimize(TCGContext *s) } else if (opc == INDEX_op_sub_i64) { neg_op = INDEX_op_neg_i64; have_neg = TCG_TARGET_HAS_neg_i64; - } else { + } else if (TCG_TARGET_HAS_neg_vec) { + TCGType type = TCGOP_VECL(op) + TCG_TYPE_V64; + unsigned vece = TCGOP_VECE(op); neg_op = INDEX_op_neg_vec; - have_neg = TCG_TARGET_HAS_neg_vec; + have_neg = tcg_can_emit_vec_op(neg_op, type, vece) > 0; + } else { + break; } if (!have_neg) { break; From patchwork Sat Apr 20 07:34:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088332 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="eEyK3olj"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQJc6lGTz9s9G for ; Sat, 20 Apr 2019 18:01:02 +1000 (AEST) Received: from localhost ([127.0.0.1]:38434 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkvj-0004xZ-68 for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 04:00:59 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40508) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXX-00011Q-8W for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXV-0000mF-V1 for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:38608) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXU-0000Mw-AD for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:56 -0400 Received: by mail-pl1-x644.google.com with SMTP id f36so3539289plb.5 for ; Sat, 20 Apr 2019 00:35:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=lRGQ2wNmavOUPa4H3gsQKIdLCd5xhIibGFq5wA+mFqQ=; b=eEyK3olj9pgynR5TsnKROTz1B68RnDbqagojfxN7iUkdy4rws/aE7WVVKHbudPeCF/ yHmO5C3IBVDeT5gndOlpTnr0fonX9PfseZqbJtoDAT0/Oojpjh7xBQP63jDDQubBae6+ 6ZBRdKtRPfjRCvvuhqiwFX56DLFBCQvYNi7eoJWLVN1CLmYEOHjLlWlF7UFGCOmajcqu 3J1AQ7GbUScp5qX+FL+bWUVS5XfV5Bt9KuGPIPzSgnqc80TSJUPJfBMSB6PuZmKiowBf G9PPcbWR558Tbi1wEsPmUOnoTE5IJ5u1AFhwtIfPbFP9qdJIIYhAE9Q+S4MDxKLtPY+g yeKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lRGQ2wNmavOUPa4H3gsQKIdLCd5xhIibGFq5wA+mFqQ=; b=BzjYEFlmWw+uQx+gfLTgajQ32EmLlP70M2/aJrX9NZ1WtYSZsbGZnfEvNhYQCeiPcF OeNMGwtCHs/nqE5I+/pjUQ3ayF1DkZTObVj3VjGufV7sTTYkltqJ6KVx4rpo4Easv2yh /ql8+qeM+7tGziaTgdMwiiBCxxaxxZ3pk96y/an3Kp+UDia217ET3DEYdPJxN69lJ26A gNB0glutsqi/5VW1pCAu646EiYrJr7zQSgB1OMLGy2NA9JSTAfLpUSLBy8dZIxFhc6+H S6x7+LNHLL/oSV9Xhszfu+cWe9N5n2vOeQBeKxX/S1QakESlsHod8dtlDCcx/+g7KJMd GwxA== X-Gm-Message-State: APjAAAWN5IYHaJl5d639jzJDQJNwajIliP5eApPXrgFQLjnPyC+KPFfd RW3oE+bXTngGv2W7YkzHQA9WgyJhFts= X-Google-Smtp-Source: APXvYqz+gnFlO6NLLKsZVUMjmtlIGsEfkRDTQdH282Hgz2ToA50mR3fzjUgFon343Teb2q8QEB005A== X-Received: by 2002:a17:902:2b88:: with SMTP id l8mr8225061plb.262.1555745741032; Sat, 20 Apr 2019 00:35:41 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.39 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:40 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:39 -1000 Message-Id: <20190420073442.7488-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::644 Subject: [Qemu-devel] [PATCH 35/38] tcg: Introduce do_op3_nofail for vector expansion X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This makes do_op3 match do_op2 in allowing for failure, and thus fall back expansions. Signed-off-by: Richard Henderson --- tcg/tcg-op-vec.c | 45 +++++++++++++++++++++++++++------------------ 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 7d8f7b490a..5868a51270 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -450,7 +450,7 @@ void tcg_gen_cmp_vec(TCGCond cond, unsigned vece, } } -static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, +static bool do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b, TCGOpcode opc) { TCGTemp *rt = tcgv_vec_temp(r); @@ -468,82 +468,91 @@ static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a, can = tcg_can_emit_vec_op(opc, type, vece); if (can > 0) { vec_gen_3(opc, type, vece, ri, ai, bi); - } else { + } else if (can < 0) { const TCGOpcode *hold_list = tcg_swap_vecop_list(NULL); - tcg_debug_assert(can < 0); tcg_expand_vec_op(opc, type, vece, ri, ai, bi); tcg_swap_vecop_list(hold_list); + } else { + return false; } + return true; +} + +static void do_op3_nofail(unsigned vece, TCGv_vec r, TCGv_vec a, + TCGv_vec b, TCGOpcode opc) +{ + bool ok = do_op3(vece, r, a, b, opc); + tcg_debug_assert(ok); } void tcg_gen_add_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_add_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_add_vec); } void tcg_gen_sub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_sub_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_sub_vec); } void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_mul_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_mul_vec); } void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_ssadd_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_ssadd_vec); } void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_usadd_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_usadd_vec); } void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_sssub_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_sssub_vec); } void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_ussub_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_ussub_vec); } void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_smin_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_smin_vec); } void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_umin_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_umin_vec); } void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_smax_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_smax_vec); } void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_umax_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_umax_vec); } void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_shlv_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_shlv_vec); } void tcg_gen_shrv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_shrv_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_shrv_vec); } void tcg_gen_sarv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3(vece, r, a, b, INDEX_op_sarv_vec); + do_op3_nofail(vece, r, a, b, INDEX_op_sarv_vec); } static void do_shifts(unsigned vece, TCGv_vec r, TCGv_vec a, @@ -579,7 +588,7 @@ static void do_shifts(unsigned vece, TCGv_vec r, TCGv_vec a, } else { tcg_gen_dup_i32_vec(vece, vec_s, s); } - do_op3(vece, r, a, vec_s, opc_v); + do_op3_nofail(vece, r, a, vec_s, opc_v); tcg_temp_free_vec(vec_s); } tcg_swap_vecop_list(hold_list); From patchwork Sat Apr 20 07:34:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088327 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="qBKKszkz"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQDq42ZXz9s70 for ; Sat, 20 Apr 2019 17:57:47 +1000 (AEST) Received: from localhost ([127.0.0.1]:38377 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHksb-0002DE-IP for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:57:45 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40566) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXY-00012q-IL for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXW-0000nG-FA for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:36:00 -0400 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]:36978) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXW-0000Px-7j for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:58 -0400 Received: by mail-pl1-x62c.google.com with SMTP id w23so3539961ply.4 for ; Sat, 20 Apr 2019 00:35:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=R0QqxrQEMqfALEw2bMQE+5E2LW+Mhp5XpDv3PusuZQk=; b=qBKKszkzqmZLSpPi+dn+VAqtu/jNi1nmj6cL11ZW3BvZKYBebX6Op4paK4F2zm/0wv TH1CNck2bfDHz+B4beP3mczruHihNlVE6sQz0XRN3oFnhVcTm5V0zMl4HA1rsD4O+9RJ 8qEIbV5wIx2pZgzJ5N3rzcjm4Q5bAC9VjshX6FN5M9jnAg3qLBRWoeWkZ8CUuEfn5gL1 xF2vAhdn1VshTdNbVCd05D8ZAPydL37RCDy1W4ZFbNCPanzPorUyKV22V0UktDt+4+f7 oaxx2gSD1wV+Y3ROQlmghyZTQmaxGX3RYuaOskKUben/qcNPcyOqomBRtVXKuT3jNUjW vvyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=R0QqxrQEMqfALEw2bMQE+5E2LW+Mhp5XpDv3PusuZQk=; b=PvS/Ho4yemB2SaA5BosgV17tfsBOscSafIx0sSQ03nM4Y4RkYKUm32nxeUNybeXVkp cOudvT/1t//yNftMKcMmdixTtR+KhMeKHNR9syBkj9/PXZrBsIUbA7+jUJYa9pCdTnWu ZLSSG8e2Os/nGX9OvP8LggcaBOg9R4Q7U5g3gfzOSHkW9F/POLKYwdHTv2H3T3rPKtoP 1znJdJxupMMocQF19altjXI78l3+wHTsX1ObHp+AczRPckRaPVgt9OP5wS9NVj8Bzww6 3ZaiAT0HwJNOMmdBrXq8W3zeQWMbxzx4uwW0Hgjlle2ugNCrmAhagk6yEj3Jl+covSAj hBVQ== X-Gm-Message-State: APjAAAV2CfZMt3CUkpJLPRarEgIUKeZevYKH897lLSGJI/XJ0ABEbKXT FFnrYqomY4oOwHgVVdHxP8gt2AKRx14= X-Google-Smtp-Source: APXvYqwghTj5Nl1xZMKzljOeShUko7pPThAioDrcTbMU8lLwnYls6ZhxHvtn4icxxQexN2ZozL4MPQ== X-Received: by 2002:a17:902:2aa6:: with SMTP id j35mr2390510plb.236.1555745742505; Sat, 20 Apr 2019 00:35:42 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:40 -1000 Message-Id: <20190420073442.7488-37-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::62c Subject: [Qemu-devel] [PATCH 36/38] tcg: Expand vector minmax using cmp+cmpsel X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/tcg-op-gvec.c | 8 ++++++++ tcg/tcg-op-vec.c | 19 +++++++++++++++---- 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index e7029d26f4..dddb00719a 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -99,6 +99,14 @@ static bool tcg_can_emit_vecop_list(const TCGOpcode *list, case INDEX_op_cmpsel_vec: /* Fallback expansion uses only required logial ops. */ continue; + case INDEX_op_smin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umin_vec: + case INDEX_op_umax_vec: + if (tcg_can_emit_vec_op(INDEX_op_cmp_vec, type, vece)) { + continue; + } + break; default: break; } diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c index 5868a51270..43abeb0674 100644 --- a/tcg/tcg-op-vec.c +++ b/tcg/tcg-op-vec.c @@ -520,24 +520,35 @@ void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) do_op3_nofail(vece, r, a, b, INDEX_op_ussub_vec); } +static void do_minmax(unsigned vece, TCGv_vec r, TCGv_vec a, + TCGv_vec b, TCGOpcode opc, TCGCond cond) +{ + if (!do_op3(vece, r, a, b, opc)) { + TCGv_vec t = tcg_temp_new_vec_matching(r); + tcg_gen_cmp_vec(cond, vece, t, a, b); + tcg_gen_cmpsel_vec(vece, r, t, a, b); + tcg_temp_free_vec(t); + } +} + void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3_nofail(vece, r, a, b, INDEX_op_smin_vec); + do_minmax(vece, r, a, b, INDEX_op_smin_vec, TCG_COND_GT); } void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3_nofail(vece, r, a, b, INDEX_op_umin_vec); + do_minmax(vece, r, a, b, INDEX_op_umin_vec, TCG_COND_LTU); } void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3_nofail(vece, r, a, b, INDEX_op_smax_vec); + do_minmax(vece, r, a, b, INDEX_op_smax_vec, TCG_COND_GT); } void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) { - do_op3_nofail(vece, r, a, b, INDEX_op_umax_vec); + do_minmax(vece, r, a, b, INDEX_op_umax_vec, TCG_COND_GTU); } void tcg_gen_shlv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b) From patchwork Sat Apr 20 07:34:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088325 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="T2rdJeN+"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQC83zK5z9s4Y for ; Sat, 20 Apr 2019 17:56:20 +1000 (AEST) Received: from localhost ([127.0.0.1]:38365 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkrC-000115-HD for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:56:18 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40420) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXS-0000yt-Bm for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXO-0000fE-PB for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:51 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:43186) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXL-0000Sm-Tm for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:49 -0400 Received: by mail-pl1-x642.google.com with SMTP id n8so3526796plp.10 for ; Sat, 20 Apr 2019 00:35:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=I6XLb1qOv9FkxV4sEapnWKsb8dxwM3qKBe7iS0e8mFM=; b=T2rdJeN+sSY0/ciB0T/ptw6x/beeh0FKjxmYpWJ4TyBYNRvzY/pfE4BJpS3INUO2VJ yD3I6vI/DJxOF4FDW4fZ2E2MJsQpKUfEsE1AJpYqJRkc9H26P6Wg10rMCjpyPPbsv0pW ttYgAmR3qkRk2oe97Ogk6eBvLKpjTQEcSNDLU2AyAUb6VJWX8STzCqobqz7d92DU2GGb 1hjLRIFYOtwYTQskCjrXN4sizVJd5ygRTYH+1OEJNeprqoGjyN+WvDXgKF4M7r+EBMPN q9M16dHu/MakBv4B3b4f+La39jJZMG9DNimzwPJbzAcDZo+v3TBWfR8bikTx6eAsiqhr qh/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=I6XLb1qOv9FkxV4sEapnWKsb8dxwM3qKBe7iS0e8mFM=; b=jVrXEdj9/Oaj15caNGb3JhX6yKYDAAGtO0gY5KJf+6AXwg7oinxK5E89OZnwKv7f34 dB/HJRFnojkqwoeaMkusrXugJ969Kgas12VWa0YzG1eSkR5jcKvtVUO7pXyGoMFITNU1 ol66TbH37YpC/KfGqJZsBB47ijICt5SzTVU4mQytmI5AbgZy70AZc7MvXVbaPTIhFUVP e5Y/2o1AxThwBcy704JfdwfT3NLvTykYQf7TfMgePSoeZqbBv4m8ie0s2ZOZFZgDS9Rz 8/IOvUCXIwe4tRtleIr7z6RcMvN+7nmwnQ6ogdo7H/4GeCEqM6kpjzXjLIRHW3BM5VnN Rrdg== X-Gm-Message-State: APjAAAW20PKBjWnRptcJAfey+4a3ewwzJwArRbKj9K6GYfCd3EAzTQMi 8Pg7DGbBcSkgFJSashlAZf/f8gF/4aU= X-Google-Smtp-Source: APXvYqzu1Seh4zJdlQMpC78ZJ6S8q4vJcp+4LKSF6a0NlM9MP+xXYq6FeUJvDjQpcm/zWSORDDfMOw== X-Received: by 2002:a17:902:900a:: with SMTP id a10mr759928plp.336.1555745743903; Sat, 20 Apr 2019 00:35:43 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.42 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:41 -1000 Message-Id: <20190420073442.7488-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::642 Subject: [Qemu-devel] [PATCH 37/38] tcg/aarch64: Use MVNI for expansion of dupi X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index e68e4de08c..20c8699f79 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -513,6 +513,7 @@ typedef enum { /* AdvSIMD modified immediate */ I3606_MOVI = 0x0f000400, + I3606_MVNI = 0x2f000400, /* AdvSIMD shift by immediate */ I3614_SSHR = 0x0f000400, @@ -823,6 +824,9 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, if (is_fimm(v64, &op, &cmode, &imm8)) { tcg_out_insn(s, 3606, MOVI, type == TCG_TYPE_V128, rd, op, cmode, imm8); + } else if (is_fimm(~v64, &op, &cmode, &imm8) + && op == 0 && ((1 << cmode) & 0x03555)) { + tcg_out_insn(s, 3606, MVNI, type == TCG_TYPE_V128, rd, 0, cmode, imm8); } else if (type == TCG_TYPE_V128) { new_pool_l2(s, R_AARCH64_CONDBR19, s->code_ptr, 0, v64, v64); tcg_out_insn(s, 3305, LDR_v128, 0, rd); From patchwork Sat Apr 20 07:34:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 1088324 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.b="zmCZJcm2"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44mQB362wmz9s4Y for ; Sat, 20 Apr 2019 17:55:23 +1000 (AEST) Received: from localhost ([127.0.0.1]:38326 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkqH-0008VQ-EU for incoming@patchwork.ozlabs.org; Sat, 20 Apr 2019 03:55:21 -0400 Received: from eggs.gnu.org ([209.51.188.92]:40463) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hHkXW-00010C-5A for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hHkXU-0000kk-Ar for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:57 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:38052) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hHkXT-0000Vm-Er for qemu-devel@nongnu.org; Sat, 20 Apr 2019 03:35:55 -0400 Received: by mail-pf1-x443.google.com with SMTP id 10so3469439pfo.5 for ; Sat, 20 Apr 2019 00:35:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=mhAi7YY1D/Om/C4ZnAkNiJSYsKDzvLj0JsVFt1Dgh9U=; b=zmCZJcm27PstlkuhldKwmMKQmxp9Yzd0x15/C1jtoJKIo2kegOfxWkgX/uLivPsB3G 1pNV860ee579IRtM9JNNBH1p+aS+mQoZT1Px5z2Hkq5jzaY1JTygSalGcZ1AaCMBKuS0 9UWEEcxHTE5Y1jbSoUfZzJ7nFJELo2zgrdEwpzz2W83Qh0L6UOxBNkwPXZ4AqieD9KCH b1F7DBP8DvPSqRmJ2g7i/jUI/YN3B1M+JtFdqO19p+wka/evm8rvF88Hd53tbSI1Fqx/ gaJysFbORGKhjKw5d28iW1p5mE/xcGF/aWGADzY84LSq61Rj4P+Ns07bpyLy270UM3Om bKRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=mhAi7YY1D/Om/C4ZnAkNiJSYsKDzvLj0JsVFt1Dgh9U=; b=XaEiN20xyaydgmFqLNTXsiAo9Di34eIDJUuhRL04v4uENACLGkMq+kAiamU4QNPCEO dw2RuQCJEvwyUE4ZHvITd2p2/S9CAGz87CVMou51e1Kr5qp8quFwsRGKNYvcNYUpVuf1 pq8yEwNXGy8FCFiOax+mziQ5Nz6DdyJxMGpZVE3S6sSx3hF/mGfcsUpK+c2mqmVFzwu5 nv5sR+KJiqjHUP0zr2pZPokbqQd+j0lQ/wMOoBaOQSZQTrD7kQ0HQKC9cYh+S4NxGkiO pSwmJX+lsAn5a8D6x4fsQ/ljV4KXlUTukYK6k/jySGZeL5WXZjG/mqQaByBIg34gpxwx ynzw== X-Gm-Message-State: APjAAAUVUrw7CeBkJoxols3NmpZYZt2PZ2KmvrJBccPr9rtrkpfvaeSx lQS1o6m53m906TCf98lNMa9ZLMCEcQ8= X-Google-Smtp-Source: APXvYqyWHcPB//FAjqmE4cTgRzm7yJmRnpvRYEC1xQIhmI/c5wz3ijO3246VyUGmrFw7H80dnsdGDg== X-Received: by 2002:a65:5106:: with SMTP id f6mr8138632pgq.253.1555745745442; Sat, 20 Apr 2019 00:35:45 -0700 (PDT) Received: from localhost.localdomain (rrcs-66-91-136-155.west.biz.rr.com. [66.91.136.155]) by smtp.gmail.com with ESMTPSA id z22sm7025492pgv.23.2019.04.20.00.35.44 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 20 Apr 2019 00:35:44 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 19 Apr 2019 21:34:42 -1000 Message-Id: <20190420073442.7488-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190420073442.7488-1-richard.henderson@linaro.org> References: <20190420073442.7488-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 Subject: [Qemu-devel] [PATCH 38/38] tcg/aarch64: Use ORRI and BICI for vector logical operations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: david@redhat.com Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- tcg/aarch64/tcg-target.inc.c | 77 ++++++++++++++++++++++++++++++------ 1 file changed, 65 insertions(+), 12 deletions(-) diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c index 20c8699f79..1e79a60fb2 100644 --- a/tcg/aarch64/tcg-target.inc.c +++ b/tcg/aarch64/tcg-target.inc.c @@ -119,6 +119,8 @@ static inline bool patch_reloc(tcg_insn_unit *code_ptr, int type, #define TCG_CT_CONST_LIMM 0x200 #define TCG_CT_CONST_ZERO 0x400 #define TCG_CT_CONST_MONE 0x800 +#define TCG_CT_CONST_PVI 0x1000 +#define TCG_CT_CONST_IVI 0x2000 /* parse target specific constraints */ static const char *target_parse_constraint(TCGArgConstraint *ct, @@ -154,6 +156,12 @@ static const char *target_parse_constraint(TCGArgConstraint *ct, case 'M': /* minus one */ ct->ct |= TCG_CT_CONST_MONE; break; + case 'P': /* vector positive immediate */ + ct->ct |= TCG_CT_CONST_PVI; + break; + case 'I': /* vector inverted immediate */ + ct->ct |= TCG_CT_CONST_IVI; + break; case 'Z': /* zero */ ct->ct |= TCG_CT_CONST_ZERO; break; @@ -294,6 +302,7 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, const TCGArgConstraint *arg_ct) { int ct = arg_ct->ct; + int op, cmode, imm8; if (ct & TCG_CT_CONST) { return 1; @@ -313,6 +322,16 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type, if ((ct & TCG_CT_CONST_MONE) && val == -1) { return 1; } + if ((ct & TCG_CT_CONST_PVI) && + is_fimm(val, &op, &cmode, &imm8) && + op == 0 && ((1 << cmode) & 0x555)) { + return 1; + } + if ((ct & TCG_CT_CONST_IVI) && + is_fimm(~val, &op, &cmode, &imm8) && + op == 0 && ((1 << cmode) & 0x555)) { + return 1; + } return 0; } @@ -514,6 +533,8 @@ typedef enum { /* AdvSIMD modified immediate */ I3606_MOVI = 0x0f000400, I3606_MVNI = 0x2f000400, + I3606_ORRI = 0x0f001400, + I3606_BICI = 0x2f001400, /* AdvSIMD shift by immediate */ I3614_SSHR = 0x0f000400, @@ -2217,20 +2238,48 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, tcg_out_insn(s, 3617, ABS, is_q, vece, a0, a1); break; case INDEX_op_and_vec: - tcg_out_insn(s, 3616, AND, is_q, 0, a0, a1, a2); + if (const_args[2]) { + int op, cmode, imm8; + is_fimm(~a2, &op, &cmode, &imm8); + tcg_out_mov(s, type, a0, a1); + tcg_out_insn(s, 3606, BICI, is_q, a0, 0, cmode, imm8); + } else { + tcg_out_insn(s, 3616, AND, is_q, 0, a0, a1, a2); + } break; case INDEX_op_or_vec: - tcg_out_insn(s, 3616, ORR, is_q, 0, a0, a1, a2); + if (const_args[2]) { + int op, cmode, imm8; + is_fimm(a2, &op, &cmode, &imm8); + tcg_out_mov(s, type, a0, a1); + tcg_out_insn(s, 3606, ORRI, is_q, a0, 0, cmode, imm8); + } else { + tcg_out_insn(s, 3616, ORR, is_q, 0, a0, a1, a2); + } + break; + case INDEX_op_andc_vec: + if (const_args[2]) { + int op, cmode, imm8; + is_fimm(a2, &op, &cmode, &imm8); + tcg_out_mov(s, type, a0, a1); + tcg_out_insn(s, 3606, BICI, is_q, a0, 0, cmode, imm8); + } else { + tcg_out_insn(s, 3616, BIC, is_q, 0, a0, a1, a2); + } + break; + case INDEX_op_orc_vec: + if (const_args[2]) { + int op, cmode, imm8; + is_fimm(~a2, &op, &cmode, &imm8); + tcg_out_mov(s, type, a0, a1); + tcg_out_insn(s, 3606, ORRI, is_q, a0, 0, cmode, imm8); + } else { + tcg_out_insn(s, 3616, ORN, is_q, 0, a0, a1, a2); + } break; case INDEX_op_xor_vec: tcg_out_insn(s, 3616, EOR, is_q, 0, a0, a1, a2); break; - case INDEX_op_andc_vec: - tcg_out_insn(s, 3616, BIC, is_q, 0, a0, a1, a2); - break; - case INDEX_op_orc_vec: - tcg_out_insn(s, 3616, ORN, is_q, 0, a0, a1, a2); - break; case INDEX_op_ssadd_vec: tcg_out_insn(s, 3616, SQADD, is_q, vece, a0, a1, a2); break; @@ -2413,6 +2462,8 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) static const TCGTargetOpDef lZ_l = { .args_ct_str = { "lZ", "l" } }; static const TCGTargetOpDef r_r_r = { .args_ct_str = { "r", "r", "r" } }; static const TCGTargetOpDef w_w_w = { .args_ct_str = { "w", "w", "w" } }; + static const TCGTargetOpDef w_w_wI = { .args_ct_str = { "w", "w", "wI" } }; + static const TCGTargetOpDef w_w_wP = { .args_ct_str = { "w", "w", "wP" } }; static const TCGTargetOpDef w_w_wZ = { .args_ct_str = { "w", "w", "wZ" } }; static const TCGTargetOpDef r_r_ri = { .args_ct_str = { "r", "r", "ri" } }; static const TCGTargetOpDef r_r_rA = { .args_ct_str = { "r", "r", "rA" } }; @@ -2568,11 +2619,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_add_vec: case INDEX_op_sub_vec: case INDEX_op_mul_vec: - case INDEX_op_and_vec: - case INDEX_op_or_vec: case INDEX_op_xor_vec: - case INDEX_op_andc_vec: - case INDEX_op_orc_vec: case INDEX_op_ssadd_vec: case INDEX_op_sssub_vec: case INDEX_op_usadd_vec: @@ -2586,6 +2633,12 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op) case INDEX_op_sarv_vec: case INDEX_op_aa64_sshl_vec: return &w_w_w; + case INDEX_op_or_vec: + case INDEX_op_andc_vec: + return &w_w_wP; + case INDEX_op_and_vec: + case INDEX_op_orc_vec: + return &w_w_wI; case INDEX_op_not_vec: case INDEX_op_neg_vec: case INDEX_op_abs_vec: