From patchwork Fri Nov 11 00:52:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Simpson X-Patchwork-Id: 1702340 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=quicinc.com header.i=@quicinc.com header.a=rsa-sha256 header.s=qcppdkim1 header.b=S3/zI7KU; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4N7gDw5JSdz23lq for ; Fri, 11 Nov 2022 11:53:36 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1otIHU-0008Uu-Ba; Thu, 10 Nov 2022 19:52:28 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1otIHS-0008TN-3K for qemu-devel@nongnu.org; Thu, 10 Nov 2022 19:52:26 -0500 Received: from mx0a-0031df01.pphosted.com ([205.220.168.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1otIHO-0008Id-CG for qemu-devel@nongnu.org; Thu, 10 Nov 2022 19:52:25 -0500 Received: from pps.filterd (m0279867.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 2AB0hb0g002441; Fri, 11 Nov 2022 00:52:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=qcppdkim1; bh=DJz1mPXG682KSaV30XcjWSYc44MybFJA/jwGuiilIp8=; b=S3/zI7KUNaw8NnRCUofI/GKEcnhbn98K18NwafzgK8InZaKC5QjHuh1KvWIY+0ICAndH 2lTT/vA4JMUrVxe03cxoELTfTHY684OkX2spItiyxKrm5Hm4iTg1zJaCHfi4GeRtabPf YVOPnVyTLLQcPtQFszs5EP6o/iUjpBk0g2nIrXW87vpmlS8AVPR2xkjXNKEaEY7M9joA Q1p0uHA4hWKETINutWXGjVdHFaW9B7uv895BvoESSwarZL+CQ0gL9orc+5thB/Ie+QLM 9r3/DYeQTalQOqZOZAua0wQ3JHbJXiDoXVjmAhY7Z6DlJV20oJPPzkdinSkRQp5KEXSe mA== Received: from nalasppmta04.qualcomm.com (Global_NAT1.qualcomm.com [129.46.96.20]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3ksaqxr64j-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 11 Nov 2022 00:52:18 +0000 Received: from pps.filterd (NALASPPMTA04.qualcomm.com [127.0.0.1]) by NALASPPMTA04.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTP id 2AB0qHXn021265; Fri, 11 Nov 2022 00:52:17 GMT Received: from pps.reinject (localhost [127.0.0.1]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTPS id 3kngwm6u7m-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 11 Nov 2022 00:52:17 +0000 Received: from NALASPPMTA04.qualcomm.com (NALASPPMTA04.qualcomm.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2AB0qGfQ021231; Fri, 11 Nov 2022 00:52:16 GMT Received: from hu-devc-lv-u18-c.qualcomm.com (hu-tsimpson-lv.qualcomm.com [10.47.235.220]) by NALASPPMTA04.qualcomm.com (PPS) with ESMTP id 2AB0qGF7021226; Fri, 11 Nov 2022 00:52:16 +0000 Received: by hu-devc-lv-u18-c.qualcomm.com (Postfix, from userid 47164) id 478E25000AE; Thu, 10 Nov 2022 16:52:16 -0800 (PST) From: Taylor Simpson To: qemu-devel@nongnu.org Cc: tsimpson@quicinc.com, richard.henderson@linaro.org, philmd@linaro.org, peter.maydell@linaro.org, bcain@quicinc.com, quic_mathbern@quicinc.com, stefanha@redhat.com Subject: [PULL 02/11] Hexagon (target/hexagon) Fix predicated assignment to .tmp and .cur Date: Thu, 10 Nov 2022 16:52:05 -0800 Message-Id: <20221111005214.22764-3-tsimpson@quicinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20221111005214.22764-1-tsimpson@quicinc.com> References: <20221111005214.22764-1-tsimpson@quicinc.com> MIME-Version: 1.0 X-QCInternal: smtphost X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-ORIG-GUID: DqY-UJUkWYx8P-NxWW3SdldL4ON0zr8d X-Proofpoint-GUID: DqY-UJUkWYx8P-NxWW3SdldL4ON0zr8d X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-10_14,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 priorityscore=1501 mlxscore=0 bulkscore=0 clxscore=1015 adultscore=0 impostorscore=0 suspectscore=0 spamscore=0 phishscore=0 mlxlogscore=670 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211110004 Received-SPF: pass client-ip=205.220.168.131; envelope-from=tsimpson@qualcomm.com; helo=mx0a-0031df01.pphosted.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HEADER_FROM_DIFFERENT_DOMAINS=0.25, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Here are example instructions with a predicated .tmp/.cur assignment if (p1) v12.tmp = vmem(r7 + #0) if (p0) v12.cur = vmem(r9 + #0) The .tmp/.cur indicates that references to v12 in the same packet take the result of the load. However, when the predicate is false, the value at the start of the packet should be used. After the packet commits, the .tmp value is dropped, but the .cur value is maintained. To fix this bug, we preload the original value from the HVX register into the temporary used for the result. Test cases added to tests/tcg/hexagon/hvx_misc.c Acked-by: Richard Henderson Co-authored-by: Matheus Tavares Bernardino Signed-off-by: Matheus Tavares Bernardino Signed-off-by: Taylor Simpson Message-Id: <20221108162906.3166-3-tsimpson@quicinc.com> --- target/hexagon/translate.h | 6 +++ tests/tcg/hexagon/hvx_misc.c | 72 +++++++++++++++++++++++++++++++++ target/hexagon/gen_tcg_funcs.py | 12 ++++++ 3 files changed, 90 insertions(+) diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h index 115e29b84f..b8fcf615e8 100644 --- a/target/hexagon/translate.h +++ b/target/hexagon/translate.h @@ -86,6 +86,12 @@ static inline bool is_preloaded(DisasContext *ctx, int num) return test_bit(num, ctx->regs_written); } +static inline bool is_vreg_preloaded(DisasContext *ctx, int num) +{ + return test_bit(num, ctx->vregs_updated) || + test_bit(num, ctx->vregs_updated_tmp); +} + intptr_t ctx_future_vreg_off(DisasContext *ctx, int regnum, int num, bool alloc_ok); intptr_t ctx_tmp_vreg_off(DisasContext *ctx, int regnum, diff --git a/tests/tcg/hexagon/hvx_misc.c b/tests/tcg/hexagon/hvx_misc.c index 6e2c9ab3cd..53d5c9b44f 100644 --- a/tests/tcg/hexagon/hvx_misc.c +++ b/tests/tcg/hexagon/hvx_misc.c @@ -541,6 +541,75 @@ static void test_vshuff(void) check_output_b(__LINE__, 1); } +static void test_load_tmp_predicated(void) +{ + void *p0 = buffer0; + void *p1 = buffer1; + void *pout = output; + bool pred = true; + + for (int i = 0; i < BUFSIZE; i++) { + /* + * Load into v12 as .tmp with a predicate + * When the predicate is true, we get the vector from buffer1[i] + * When the predicate is false, we get a vector of all 1's + * Regardless of the predicate, the next packet should have + * a vector of all 1's + */ + asm("v3 = vmem(%0 + #0)\n\t" + "r1 = #1\n\t" + "v12 = vsplat(r1)\n\t" + "p1 = !cmp.eq(%3, #0)\n\t" + "{\n\t" + " if (p1) v12.tmp = vmem(%1 + #0)\n\t" + " v4.w = vadd(v12.w, v3.w)\n\t" + "}\n\t" + "v4.w = vadd(v4.w, v12.w)\n\t" + "vmem(%2 + #0) = v4\n\t" + : : "r"(p0), "r"(p1), "r"(pout), "r"(pred) + : "r1", "p1", "v12", "v3", "v4", "v6", "memory"); + p0 += sizeof(MMVector); + p1 += sizeof(MMVector); + pout += sizeof(MMVector); + + for (int j = 0; j < MAX_VEC_SIZE_BYTES / 4; j++) { + expect[i].w[j] = + pred ? buffer0[i].w[j] + buffer1[i].w[j] + 1 + : buffer0[i].w[j] + 2; + } + pred = !pred; + } + + check_output_w(__LINE__, BUFSIZE); +} + +static void test_load_cur_predicated(void) +{ + bool pred = true; + for (int i = 0; i < BUFSIZE; i++) { + asm volatile("p0 = !cmp.eq(%3, #0)\n\t" + "v3 = vmem(%0+#0)\n\t" + /* + * Preload v4 to make sure that the assignment from the + * packet below is not being ignored when pred is false. + */ + "r0 = #0x01237654\n\t" + "v4 = vsplat(r0)\n\t" + "{\n\t" + " if (p0) v3.cur = vmem(%1+#0)\n\t" + " v4 = v3\n\t" + "}\n\t" + "vmem(%2+#0) = v4\n\t" + : + : "r"(&buffer0[i]), "r"(&buffer1[i]), + "r"(&output[i]), "r"(pred) + : "r0", "p0", "v3", "v4", "memory"); + expect[i] = pred ? buffer1[i] : buffer0[i]; + pred = !pred; + } + check_output_w(__LINE__, BUFSIZE); +} + int main() { init_buffers(); @@ -578,6 +647,9 @@ int main() test_vshuff(); + test_load_tmp_predicated(); + test_load_cur_predicated(); + puts(err ? "FAIL" : "PASS"); return err ? 1 : 0; } diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py index 02a6565685..ca5fde91cc 100755 --- a/target/hexagon/gen_tcg_funcs.py +++ b/target/hexagon/gen_tcg_funcs.py @@ -173,6 +173,18 @@ def genptr_decl(f, tag, regtype, regid, regno): f.write(" ctx_future_vreg_off(ctx, %s%sN," % \ (regtype, regid)) f.write(" 1, true);\n"); + if 'A_CONDEXEC' in hex_common.attribdict[tag]: + f.write(" if (!is_vreg_preloaded(ctx, %s)) {\n" % (regN)) + f.write(" intptr_t src_off =") + f.write(" offsetof(CPUHexagonState, VRegs[%s%sN]);\n"% \ + (regtype, regid)) + f.write(" tcg_gen_gvec_mov(MO_64, %s%sV_off,\n" % \ + (regtype, regid)) + f.write(" src_off,\n") + f.write(" sizeof(MMVector),\n") + f.write(" sizeof(MMVector));\n") + f.write(" }\n") + if (not hex_common.skip_qemu_helper(tag)): f.write(" TCGv_ptr %s%sV = tcg_temp_new_ptr();\n" % \ (regtype, regid))