From patchwork Fri May 17 15:25:30 2024
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 1936547
Message-ID: <948a03cc-2e66-443f-907a-19755656a1e2@gmail.com>
Date: Fri, 17 May 2024 17:25:30 +0200
From: Robin Dapp
Subject: [PATCH] RISC-V: Add vwsll combine helpers.
To: gcc-patches
Cc: rdapp.gcc@gmail.com, palmer, Kito Cheng, "juzhe.zhong@rivai.ai", jeffreyalaw

Hi,

this patch enables the usage of vwsll in autovec context by adding the
necessary combine patterns and tests.

Regtested on rv64gcv_zvfh_zvbb.

Regards
 Robin

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*vwsll_zext1_): New pattern.
	(*vwsll_zext2_): Ditto.
	(*vwsll_zext1_scalar_): Ditto.
	(*vwsll_zext1_trunc_): Ditto.
	(*vwsll_zext2_trunc_): Ditto.
	(*vwsll_zext1_trunc_scalar_): Ditto.
	* config/riscv/vector-crypto.md: Make pattern similar to other
	narrowing/widening patterns.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vwsll-1.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vwsll-run.c: New test.
	* gcc.target/riscv/rvv/autovec/binop/vwsll-template.h: New test.
---
 gcc/config/riscv/autovec-opt.md               | 123 ++++++++++++++++++
 gcc/config/riscv/vector-crypto.md             |   2 +-
 .../riscv/rvv/autovec/binop/vwsll-1.c         |  10 ++
 .../riscv/rvv/autovec/binop/vwsll-run.c       |  67 ++++++++++
 .../riscv/rvv/autovec/binop/vwsll-template.h  |  49 +++++++
 5 files changed, 250 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-template.h

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 645dc53d868..06438f9e2f7 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1436,3 +1436,126 @@ (define_insn_and_split "*n"
     DONE;
   }
   [(set_attr "type" "vmalu")])
+
+;; vzext.vf2 + vsll = vwsll.
+(define_insn_and_split "*vwsll_zext1_"
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vr ")
+      (ashift:VWEXTI
+        (zero_extend:VWEXTI
+          (match_operand: 1 "register_operand" " vr "))
+        (match_operand: 2 "vector_shift_operand" "vrvk")))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_vwsll (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+(define_insn_and_split "*vwsll_zext2_"
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vr ")
+      (ashift:VWEXTI
+        (zero_extend:VWEXTI
+          (match_operand: 1 "register_operand" " vr "))
+        (zero_extend:VWEXTI
+          (match_operand: 2 "vector_shift_operand" "vrvk"))))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_vwsll (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+
+(define_insn_and_split "*vwsll_zext1_scalar_"
+  [(set (match_operand:VWEXTI 0 "register_operand" "=vr")
+      (ashift:VWEXTI
+        (zero_extend:VWEXTI
+          (match_operand: 1 "register_operand" " vr"))
+        (match_operand: 2 "vector_scalar_shift_operand" " rK")))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    if (GET_CODE (operands[2]) == SUBREG)
+      operands[2] = SUBREG_REG (operands[2]);
+    insn_code icode = code_for_pred_vwsll_scalar (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+;; For
+;;   uint16_t dst;
+;;   uint8_t a, b;
+;;   dst = vwsll (a, b)
+;; we seem to create
+;;   aa = (int) a;
+;;   bb = (int) b;
+;;   dst = (short) vwsll (aa, bb);
+;; The following patterns help to combine this idiom into one vwsll.
+
+(define_insn_and_split "*vwsll_zext1_trunc_"
+  [(set (match_operand: 0 "register_operand" "=vr ")
+    (truncate:
+      (ashift:VQEXTI
+        (zero_extend:VQEXTI
+          (match_operand: 1 "register_operand" " vr "))
+        (match_operand:VQEXTI 2 "vector_shift_operand" "vrvk"))))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_vwsll (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+(define_insn_and_split "*vwsll_zext2_trunc_"
+  [(set (match_operand: 0 "register_operand" "=vr ")
+    (truncate:
+      (ashift:VQEXTI
+        (zero_extend:VQEXTI
+          (match_operand: 1 "register_operand" " vr "))
+        (zero_extend:VQEXTI
+          (match_operand: 2 "vector_shift_operand" "vrvk")))))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_vwsll (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
+
+(define_insn_and_split "*vwsll_zext1_trunc_scalar_"
+  [(set (match_operand: 0 "register_operand" "=vr ")
+    (truncate:
+      (ashift:VQEXTI
+        (zero_extend:VQEXTI
+          (match_operand: 1 "register_operand" " vr "))
+        (match_operand: 2 "vector_scalar_shift_operand" " rK"))))]
+  "TARGET_ZVBB && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    if (GET_CODE (operands[2]) == SUBREG)
+      operands[2] = SUBREG_REG (operands[2]);
+    insn_code icode = code_for_pred_vwsll_scalar (mode);
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vwsll")])
diff --git a/gcc/config/riscv/vector-crypto.md b/gcc/config/riscv/vector-crypto.md
index e474ddf5da7..24822e2712c 100755
--- a/gcc/config/riscv/vector-crypto.md
+++ b/gcc/config/riscv/vector-crypto.md
@@ -298,7 +298,7 @@ (define_insn "@pred_vwsll"
 	    (match_operand: 4 "register_operand" "vr"))
 	  (match_operand:VWEXTI 2 "vector_merge_operand" "0vu")))]
   "TARGET_ZVBB"
-  "vwsll.vv\t%0,%3,%4%p1"
+  "vwsll.v%o4\t%0,%3,%4%p1"
   [(set_attr "type" "vwsll")
    (set_attr "mode" "")])

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-1.c
new file mode 100644
index 00000000000..a2e5b4f5aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include "vwsll-template.h"
+
+/* { dg-final { scan-assembler-times {\tvwsll\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwsll\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwsll\.vi} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-run.c
new file mode 100644
index 00000000000..ddb84618b50
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-run.c
@@ -0,0 +1,67 @@
+/* { dg-do run } */
+/* { dg-require-effective-target "riscv_zvbb_ok" } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include "vwsll-template.h"
+
+#include
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, VAL)                                                 \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      dst##TYPE1[i] = 0;                                                       \
+      a##TYPE2[i] = VAL;                                                       \
+      b##TYPE2[i] = i % 4;                                                     \
+    }                                                                          \
+  vwsll_vv##TYPE1 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                        \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i] == (VAL << (i % 4)));
+
+#define RUN2(TYPE1, TYPE2, VAL)                                                \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  TYPE2 a2##TYPE2[SZ];                                                         \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      dst2##TYPE1[i] = 0;                                                      \
+      a2##TYPE2[i] = VAL;                                                      \
+    }                                                                          \
+  TYPE2 b2##TYPE2 = 7;                                                         \
+  vwsll_vx##TYPE1 (dst2##TYPE1, a2##TYPE2, b2##TYPE2, SZ);                     \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst2##TYPE1[i] == (VAL << b2##TYPE2));
+
+#define RUN3(TYPE1, TYPE2, VAL)                                                \
+  TYPE1 dst3##TYPE1[SZ];                                                       \
+  TYPE2 a3##TYPE2[SZ];                                                         \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      dst3##TYPE1[i] = 0;                                                      \
+      a3##TYPE2[i] = VAL;                                                      \
+    }                                                                          \
+  vwsll_vi##TYPE1 (dst3##TYPE1, a3##TYPE2, SZ);                                \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst3##TYPE1[i] == (VAL << 6));
+
+#define RUN_ALL()                                                              \
+  RUN (uint16_t, uint8_t, 2)                                                   \
+  RUN (uint32_t, uint16_t, 2)                                                  \
+  RUN (uint64_t, uint32_t, 4)                                                  \
+  RUN2 (uint16_t, uint8_t, 8)                                                  \
+  RUN2 (uint32_t, uint16_t, 8)                                                 \
+  RUN2 (uint64_t, uint32_t, 10)                                                \
+  RUN3 (uint16_t, uint8_t, 255)                                                \
+  RUN3 (uint32_t, uint16_t, 34853)                                             \
+  RUN3 (uint64_t, uint32_t, 1794394)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-template.h
new file mode 100644
index 00000000000..376cbaee0d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vwsll-template.h
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-add-options "riscv_v" } */
+/* { dg-add-options "riscv_zvbb" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */
+
+#include
+
+#define TEST1_TYPE(TYPE1, TYPE2)                                               \
+  __attribute__ ((noipa)) void vwsll_vv##TYPE1 (TYPE1 *restrict dst,           \
+                                                TYPE2 *restrict a,             \
+                                                TYPE2 *restrict b, int n)      \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] = (TYPE1) a[i] << b[i];                                           \
+  }
+
+#define TEST2_TYPE(TYPE1, TYPE2)                                               \
+  __attribute__ ((noipa)) void vwsll_vx##TYPE1 (TYPE1 *restrict dst,           \
+                                                TYPE2 *restrict a, TYPE2 b,    \
+                                                int n)                         \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] = (TYPE1) a[i] << b;                                              \
+  }
+
+#define TEST3_TYPE(TYPE1, TYPE2)                                               \
+  __attribute__ ((noipa)) void vwsll_vi##TYPE1 (TYPE1 *restrict dst,           \
+                                                TYPE2 *restrict a, int n)      \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] = (TYPE1) a[i] << 6;                                              \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST1_TYPE (uint16_t, uint8_t)                                               \
+  TEST1_TYPE (uint32_t, uint16_t)                                              \
+  TEST1_TYPE (uint64_t, uint32_t)                                              \
+  TEST2_TYPE (uint16_t, uint8_t)                                               \
+  TEST2_TYPE (uint32_t, uint16_t)                                              \
+  TEST2_TYPE (uint64_t, uint32_t)                                              \
+  TEST3_TYPE (uint16_t, uint8_t)                                               \
+  TEST3_TYPE (uint32_t, uint16_t)                                              \
+  TEST3_TYPE (uint64_t, uint32_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvwsll\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwsll\.vx} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwsll\.vi} 3 } } */
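
The comment ahead of the *_trunc_ patterns in autovec-opt.md describes an idiom in which uint8_t operands are first promoted to int, shifted, and then truncated to uint16_t. A small scalar model may help to see why that can be treated as one widening shift. This is only a sketch and not part of the patch: the helper names shift_via_int, shift_direct and count_mismatches are hypothetical, and the check is deliberately restricted to shift amounts below 16, where the int shift cannot overflow. It says nothing about larger shift amounts.

```c
#include <stdint.h>

/* Models the promoted form from the comment: aa = (int) a; bb = (int) b;
   dst = (short) (aa << bb).  Hypothetical helper, for illustration only.  */
static uint16_t
shift_via_int (uint8_t a, uint8_t b)
{
  int aa = (int) a;
  int bb = (int) b;
  return (uint16_t) (aa << bb);
}

/* Models a shift carried out in a 16-bit wide element: shift one bit at a
   time and drop everything above bit 15 after each step.  */
static uint16_t
shift_direct (uint8_t a, uint8_t b)
{
  uint16_t r = a;
  for (unsigned i = 0; i < b; i++)
    r = (uint16_t) ((r << 1) & 0xFFFF);
  return r;
}

/* Compares both models for every uint8_t value and every shift amount
   below 16; returns the number of disagreements.  */
static int
count_mismatches (void)
{
  int mismatches = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 16; b++)
      if (shift_via_int ((uint8_t) a, (uint8_t) b)
          != shift_direct ((uint8_t) a, (uint8_t) b))
        mismatches++;
  return mismatches;
}
```

Calling count_mismatches () from a test driver and checking for zero covers the whole in-range input space for the uint8_t to uint16_t case; the same reasoning would apply to the wider type pairs used in vwsll-template.h.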