From patchwork Fri Nov 17 08:32:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu
X-Patchwork-Id: 1865849
From: Jiahao Xu
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu
Subject: [PATCH] LoongArch: Add support for xorsign.
Date: Fri, 17 Nov 2023 16:32:41 +0800
Message-Id: <20231117083241.28437-1-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1

This patch adds support for the xorsign pattern for scalar fp and vector modes.
With the new expanders, xorsign is handled uniformly using vector bitwise
logical operations.

On LoongArch64, floating-point registers and vector registers share the same
physical registers, so this patch also allows conversion between LSX vector
mode and scalar fp mode to avoid generating unnecessary instructions.

gcc/ChangeLog:

	* config/loongarch/lasx.md (xorsign<mode>3): New expander.
	* config/loongarch/loongarch.cc (loongarch_can_change_mode_class): Allow
	conversion between LSX vector mode and scalar fp mode.
	* config/loongarch/loongarch.md (@xorsign<mode>3): New expander.
	* config/loongarch/lsx.md (@xorsign<mode>3): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-xorsign.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-xorsign.c: New test.
	* gcc.target/loongarch/xorsign-run.c: New test.
	* gcc.target/loongarch/xorsign.c: New test.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index f0f2dd08dd8..5a4be588fb4 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1120,10 +1120,10 @@ (define_insn "umod<mode>3"
    (set_attr "mode" "<MODE>")])
 
 (define_insn "xor<mode>3"
-  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
-        (xor:ILASX
-          (match_operand:ILASX 1 "register_operand" "f,f,f")
-          (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+        (xor:LASX
+          (match_operand:LASX 1 "register_operand" "f,f,f")
+          (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
   "ISA_HAS_LASX"
   "@
    xvxor.v\t%u0,%u1,%u2
@@ -3147,6 +3147,20 @@ (define_expand "copysign<mode>3"
   operands[5] = gen_reg_rtx (<MODE>mode);
 })
 
+(define_expand "xorsign<mode>3"
+  [(set (match_dup 4)
+        (and:FLASX (match_dup 3)
+                   (match_operand:FLASX 2 "register_operand")))
+   (set (match_operand:FLASX 0 "register_operand")
+        (xor:FLASX (match_dup 4)
+                   (match_operand:FLASX 1 "register_operand")))]
+  "ISA_HAS_LASX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+})
+
 
 (define_insn "absv4df2"
   [(set (match_operand:V4DF 0 "register_operand" "=f")
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index d05743bec87..e4cdbcf0f2d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6687,6 +6687,11 @@ loongarch_can_change_mode_class (machine_mode from, machine_mode to,
   if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
     return true;
 
+  /* Allow conversion between LSX vector mode and scalar fp mode. */
+  if ((LSX_SUPPORTED_MODE_P (from) && SCALAR_FLOAT_MODE_P (to))
+      || ((SCALAR_FLOAT_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))))
+    return true;
+
   return !reg_classes_intersect_p (FP_REGS, rclass);
 }
 
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 22814a3679c..117c0924a85 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1146,6 +1146,23 @@ (define_insn "copysign<mode>3"
   "fcopysign.<fmt>\t%0,%1,%2"
   [(set_attr "type" "fcopysign")
    (set_attr "mode" "<UNITMODE>")])
+
+(define_expand "@xorsign<mode>3"
+  [(match_operand:ANYF 0 "register_operand")
+   (match_operand:ANYF 1 "register_operand")
+   (match_operand:ANYF 2 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  machine_mode lsx_mode
+    = <MODE>mode == SFmode ? V4SFmode : V2DFmode;
+  rtx tmp = gen_reg_rtx (lsx_mode);
+  rtx op1 = lowpart_subreg (lsx_mode, operands[1], <MODE>mode);
+  rtx op2 = lowpart_subreg (lsx_mode, operands[2], <MODE>mode);
+  emit_insn (gen_xorsign3 (lsx_mode, tmp, op1, op2));
+  emit_move_insn (operands[0],
+                  lowpart_subreg (<MODE>mode, tmp, lsx_mode));
+  DONE;
+})
 
 ;;
 ;;  ....................
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 55c7d79a030..40500363dc0 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1027,10 +1027,10 @@ (define_insn "umod<mode>3"
    (set_attr "mode" "<MODE>")])
 
 (define_insn "xor<mode>3"
-  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
-        (xor:ILSX
-          (match_operand:ILSX 1 "register_operand" "f,f,f")
-          (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+        (xor:LSX
+          (match_operand:LSX 1 "register_operand" "f,f,f")
+          (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
   "ISA_HAS_LSX"
   "@
    vxor.v\t%w0,%w1,%w2
@@ -2884,6 +2884,21 @@ (define_expand "copysign<mode>3"
   operands[5] = gen_reg_rtx (<MODE>mode);
 })
 
+(define_expand "@xorsign<mode>3"
+  [(set (match_dup 4)
+        (and:FLSX (match_dup 3)
+                  (match_operand:FLSX 2 "register_operand")))
+   (set (match_operand:FLSX 0 "register_operand")
+        (xor:FLSX (match_dup 4)
+                  (match_operand:FLSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+})
+
+
 (define_insn "absv2df2"
   [(set (match_operand:V2DF 0 "register_operand" "=f")
         (abs:V2DF (match_operand:V2DF 1 "register_operand" "f")))]
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c
new file mode 100644
index 00000000000..f48865b4fdf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -mlasx" } */
+
+#include "lasx-xorsign.c"
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+              -12.5f, -15.6f, -18.7f, -21.8f,
+              24.9f, 27.1f, 30.2f, 33.3f,
+              36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+              -9.0f, 1.0f, -2.0f, 3.0f,
+              -4.0f, -5.0f, 6.0f, 7.0f,
+              -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+double ad[N] = {-0.1d, -3.2d, -6.3d, -9.4d,
+                -12.5d, -15.6d, -18.7d, -21.8d,
+                24.9d, 27.1d, 30.2d, 33.3d,
+                36.4d, 39.5d, 42.6d, 45.7d};
+double bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
+                -9.0d, 1.0d, -2.0d, 3.0d,
+                -4.0d, -5.0d, 6.0d, 7.0d,
+                -8.0d, -9.0d, 10.0d, 11.0d};
+double rd[N];
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsignf (void)
+{
+  for (int i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+}
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsign (void)
+{
+  for (int i = 0; i < N; i++)
+    if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
+      abort ();
+}
+
+int
+main (void)
+{
+  my_xorsignf (r, a, b, N);
+  /* check results: */
+  check_xorsignf ();
+
+  my_xorsign (rd, ad, bd, N);
+  /* check results: */
+  check_xorsign ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c
new file mode 100644
index 00000000000..190a9239b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mlasx" } */
+/* { dg-final { scan-assembler "xvand\\.v" } } */
+/* { dg-final { scan-assembler "xvxor\\.v" } } */
+/* { dg-final { scan-assembler-not "xvfmul" } } */
+
+double
+my_xorsign (double *restrict a, double *restrict b, double *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysign (1.0d, c[i]);
+}
+
+float
+my_xorsignf (float *restrict a, float *restrict b, float *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysignf (1.0f, c[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c
new file mode 100644
index 00000000000..960714d7924
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -mlsx" } */
+
+#include "lsx-xorsign.c"
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+              -12.5f, -15.6f, -18.7f, -21.8f,
+              24.9f, 27.1f, 30.2f, 33.3f,
+              36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+              -9.0f, 1.0f, -2.0f, 3.0f,
+              -4.0f, -5.0f, 6.0f, 7.0f,
+              -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+double ad[N] = {-0.1d, -3.2d, -6.3d, -9.4d,
+                -12.5d, -15.6d, -18.7d, -21.8d,
+                24.9d, 27.1d, 30.2d, 33.3d,
+                36.4d, 39.5d, 42.6d, 45.7d};
+double bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
+                -9.0d, 1.0d, -2.0d, 3.0d,
+                -4.0d, -5.0d, 6.0d, 7.0d,
+                -8.0d, -9.0d, 10.0d, 11.0d};
+double rd[N];
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsignf (void)
+{
+  for (int i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+}
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsign (void)
+{
+  for (int i = 0; i < N; i++)
+    if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
+      abort ();
+}
+
+int
+main (void)
+{
+  my_xorsignf (r, a, b, N);
+  /* check results: */
+  check_xorsignf ();
+
+  my_xorsign (rd, ad, bd, N);
+  /* check results: */
+  check_xorsign ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c
new file mode 100644
index 00000000000..c2694c11e79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mlsx" } */
+/* { dg-final { scan-assembler "vand\\.v" } } */
+/* { dg-final { scan-assembler "vxor\\.v" } } */
+/* { dg-final { scan-assembler-not "vfmul" } } */
+
+double
+my_xorsign (double *restrict a, double *restrict b, double *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysign (1.0d, c[i]);
+}
+
+float
+my_xorsignf (float *restrict a, float *restrict b, float *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysignf (1.0f, c[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/xorsign-run.c
new file mode 100644
index 00000000000..e0987abb311
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/xorsign-run.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mlsx" } */
+
+extern void abort(void);
+
+static double x = 2.0;
+static float y = 2.0;
+
+int main()
+{
+  if ((2.5 * __builtin_copysign(1.0d, x)) != 2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysign(1.0f, y)) != 2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysignf(1.0d, -x)) != -2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysignf(1.0f, -y)) != -2.5)
+    abort();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/xorsign.c b/gcc/testsuite/gcc.target/loongarch/xorsign.c
new file mode 100644
index 00000000000..ca80603d48b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/xorsign.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx" } */
+/* { dg-final { scan-assembler "vand\\.v" } } */
+/* { dg-final { scan-assembler "vxor\\.v" } } */
+/* { dg-final { scan-assembler-not "fcopysign" } } */
+/* { dg-final { scan-assembler-not "fmul" } } */
+
+double
+my_xorsign (double a, double b)
+{
+  return a * __builtin_copysign (1.0d, b);
+}
+
+float
+my_xorsignf (float a, float b)
+{
+  return a * __builtin_copysignf (1.0f, b);
+}
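
Note for readers (illustration only, not part of the patch): the expanders
above replace a * copysign (1.0, b) with an AND against a sign-bit mask
followed by an XOR. A rough scalar C sketch of the same bit-level identity
is given below. The helper name xorsign_bits is hypothetical, and the
sketch assumes IEEE 754 binary64 doubles with the sign in bit 63.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: compute
   a * copysign (1.0, b) using integer bit operations only.
   Assumes IEEE 754 binary64, where bit 63 is the sign bit.  */
static double
xorsign_bits (double a, double b)
{
  uint64_t ua, ub;

  /* Reinterpret the doubles as raw bits without aliasing problems.  */
  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);

  /* Step 1 (the "and" in the expander): isolate the sign bit of b.
     Step 2 (the "xor"): flip the sign of a when that bit is set.  */
  ua ^= ub & UINT64_C (0x8000000000000000);

  memcpy (&a, &ua, sizeof a);
  return a;
}
```

Because only the sign bit of a is touched, the result matches the
multiplication form for ordinary finite inputs while avoiding a
floating-point multiply, which is what the scan-assembler-not checks in
the tests look for.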