From patchwork Thu Sep 14 10:49:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6ZKf5bGF5ZOy?=
X-Patchwork-Id: 1834149
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V4] RISC-V: Expand VLS mode to scalar mode move [PR111391]
Date: Thu, 14 Sep 2023 18:49:52 +0800
Message-Id: <20230914104952.4173011-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3

This patch fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111391

Instead of forbidding a move from a subreg of an RVV VLS-mode register
to a scalar register, expand it through vec_extract.  RVV modes and
scalar modes are in different register classes, so the data has to be
moved from V_REGS to GR_REGS explicitly.  When a 64-bit value is
extracted but ELEN < 64, the low part and the high part are extracted
with EEW = 32 and then combined.

	PR target/111391

gcc/ChangeLog:

	* config/riscv/autovec.md (@vec_extract<mode><vel>): Remove @.
	(vec_extract<mode><vel>): Ditto.
	* config/riscv/riscv-vsetvl.cc (emit_vsetvl_insn): Also dump the
	insn the vsetvl is emitted for.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Get the AVL from
	curr_dem instead of from rinsn.
	* config/riscv/riscv.cc (riscv_legitimize_move): Expand VLS mode
	to scalar mode move.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/slp-9.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/pr111391-1.c: New test.
	* gcc.target/riscv/rvv/autovec/pr111391-2.c: New test.

---
 gcc/config/riscv/autovec.md                   |  2 +-
 gcc/config/riscv/riscv-vsetvl.cc              |  4 +-
 gcc/config/riscv/riscv.cc                     | 64 +++++++++++++++++++
 .../riscv/rvv/autovec/partial/slp-9.c         |  1 -
 .../gcc.target/riscv/rvv/autovec/pr111391-1.c | 28 ++++++++
 .../gcc.target/riscv/rvv/autovec/pr111391-2.c | 10 +++
 6 files changed, 106 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index e74a1695709..7121bab1716 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1442,7 +1442,7 @@
 ;; -------------------------------------------------------------------------
 ;; ---- [INT,FP] Extract a vector element.
 ;; -------------------------------------------------------------------------
-(define_expand "@vec_extract<mode><vel>"
+(define_expand "vec_extract<mode><vel>"
   [(set (match_operand:<VEL> 0 "register_operand")
      (vec_select:<VEL>
        (match_operand:V_VLS 1 "register_operand")
diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index dc02246756d..5f031c18df5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -649,6 +649,8 @@ emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type,
     {
       fprintf (dump_file, "\nInsert vsetvl insn PATTERN:\n");
       print_rtl_single (dump_file, pat);
+      fprintf (dump_file, "\nfor insn:\n");
+      print_rtl_single (dump_file, rinsn);
     }
 
   if (emit_type == EMIT_DIRECT)
@@ -3867,7 +3869,7 @@ pass_vsetvl::local_eliminate_vsetvl_insn (const bb_info *bb) const
               skip_one = true;
             }
 
-          curr_avl = get_avl (rinsn);
+          curr_avl = curr_dem.get_avl ();
 
           /* Some instrucion like pred_extract_first don't reqruie avl, so the avl
              is null, use vl_placeholder for unify the handling
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 762937b0e37..8c766e2e2be 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2513,6 +2513,70 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
         }
       return true;
     }
+  /* Expand
+       (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0))
+     Expand this data movement instead of simply forbid it since
+     we can improve the code generation for this following scenario
+     by RVV auto-vectorization:
+       (set (reg:V8QI 149) (vec_duplicate:V8QI (reg:QI))
+       (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0))
+     Since RVV mode and scalar mode are in different REG_CLASS,
+     we need to explicitly move data from V_REGS to GR_REGS by scalar move.  */
+  if (SUBREG_P (src) && riscv_v_ext_mode_p (GET_MODE (SUBREG_REG (src))))
+    {
+      machine_mode vmode = GET_MODE (SUBREG_REG (src));
+      unsigned int mode_size = GET_MODE_SIZE (mode).to_constant ();
+      unsigned int vmode_size = GET_MODE_SIZE (vmode).to_constant ();
+      unsigned int nunits = vmode_size / mode_size;
+      scalar_mode smode = as_a<scalar_mode> (mode);
+      unsigned int index = SUBREG_BYTE (src).to_constant () / mode_size;
+      unsigned int num = smode == DImode && !TARGET_VECTOR_ELEN_64 ? 2 : 1;
+
+      if (num == 2)
+        {
+          /* If we want to extract 64bit value but ELEN < 64,
+             we use RVV vector mode with EEW = 32 to extract
+             the highpart and lowpart.  */
+          smode = SImode;
+          nunits = nunits * 2;
+        }
+      vmode = riscv_vector::get_vector_mode (smode, nunits).require ();
+      enum insn_code icode
+        = convert_optab_handler (vec_extract_optab, vmode, smode);
+      gcc_assert (icode != CODE_FOR_nothing);
+      rtx v = gen_lowpart (vmode, SUBREG_REG (src));
+
+      for (unsigned int i = 0; i < num; i++)
+        {
+          class expand_operand ops[3];
+          rtx result;
+          if (num == 1)
+            result = dest;
+          else if (i == 0)
+            result = gen_lowpart (smode, dest);
+          else
+            result = gen_reg_rtx (smode);
+          create_output_operand (&ops[0], result, smode);
+          ops[0].target = 1;
+          create_input_operand (&ops[1], v, vmode);
+          create_integer_operand (&ops[2], index + i);
+          expand_insn (icode, 3, ops);
+          if (ops[0].value != result)
+            emit_move_insn (result, ops[0].value);
+
+          if (i == 1)
+            {
+              rtx tmp
+                = expand_binop (Pmode, ashl_optab, gen_lowpart (Pmode, result),
+                                gen_int_mode (32, Pmode), NULL_RTX, 0,
+                                OPTAB_DIRECT);
+              rtx tmp2 = expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
+                                       OPTAB_DIRECT);
+              emit_move_insn (dest, tmp2);
+            }
+        }
+      return true;
+    }
   /* Expand
        (set (reg:QI target) (mem:QI (address)))
      to
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
index 5fba27c7a35..7c42438c9d9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/slp-9.c
@@ -29,4 +29,3 @@
 TEST_ALL (VEC_PERM)
 
 /* { dg-final { scan-assembler-times {viota.m} 2 } } */
-/* { dg-final { scan-assembler-not {vmv\.v\.i} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
new file mode 100644
index 00000000000..a7f64c937c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -Wno-int-conversion -Wno-implicit-function -Wno-incompatible-pointer-types -Wno-implicit-function-declaration -Ofast -ftree-vectorize" } */
+
+int d ();
+typedef struct
+{
+  int b;
+} c;
+int
+e (char *f, long g)
+{
+  f += g;
+  while (g--)
+    *--f = d;
+}
+
+int
+d (c * f)
+{
+  while (h ())
+    switch (f->b)
+    case 'Q':
+      {
+        long a;
+        e (&a, sizeof (a));
+        i (a);
+      }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c
new file mode 100644
index 00000000000..1f170c962e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr111391-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zve32x_zvl128b -mabi=lp64d -Wno-int-conversion -Wno-implicit-function -Wno-incompatible-pointer-types -Wno-implicit-function-declaration -Ofast -ftree-vectorize" } */
+
+#include "pr111391-1.c"
+
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*2,\s*e32,\s*mf2,\s*t[au],\s*m[au]} 1 } } */
+/* { dg-final { scan-assembler-times {vmv\.x\.s} 2 } } */
+/* { dg-final { scan-assembler-times {vslidedown.vi\s+v[0-9]+,\s*v[0-9]+,\s*1} 1 } } */
+/* { dg-final { scan-assembler-times {slli\s+[a-x0-9]+,[a-x0-9]+,32} 1 } } */
+/* { dg-final { scan-assembler-times {or\s+[a-x0-9]+,[a-x0-9]+,[a-x0-9]+} 1 } } */
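
For reviewers, a rough scalar model of what the ELEN < 64 path is meant to compute for the byte-offset-0 case checked in pr111391-2.c. This is only a sketch in plain C, not compiler code: elen32_nunits, combine_halves and extract_di_from_v2si are hypothetical names that do not exist in GCC, and the model assumes that element 0 holds the low 32 bits and that the low half is zero-extended before the final or.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC function): number of 32-bit elements used
   when MODE_SIZE-byte scalars are read out of a VMODE_SIZE-byte vector and
   every scalar is split into two SImode halves (nunits = nunits * 2).  */
static inline unsigned int
elen32_nunits (unsigned int vmode_size, unsigned int mode_size)
{
  assert (mode_size != 0 && vmode_size % mode_size == 0);
  return (vmode_size / mode_size) * 2;
}

/* Hypothetical helper: rebuild the 64-bit value from the two extracted
   halves, i.e. the shift left by 32 followed by the inclusive or.  */
static inline uint64_t
combine_halves (uint32_t lo, uint32_t hi)
{
  return ((uint64_t) hi << 32) | (uint64_t) lo;
}

/* Model of the byte-offset-0 case: element 0 is the low half and
   element 1 is the high half (assumed element order).  */
static inline uint64_t
extract_di_from_v2si (const uint32_t v[2])
{
  return combine_halves (v[0], v[1]);
}
```

With an 8-byte vector and an 8-byte scalar, elen32_nunits gives 2, which lines up with the `vsetivli zero,2,e32,mf2` pattern the new test scans for. The two element reads correspond to the two expected vmv.x.s instructions, and the shift and or correspond to the expected slli and or.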