[{"id":3188266,"web_url":"http://patchwork.ozlabs.org/comment/3188266/","msgid":"<CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>","list_archive_url":null,"date":"2023-09-27T01:17:24","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":40,"url":"http://patchwork.ozlabs.org/api/people/40/","name":"Andrew Pinski","email":"pinskia@gmail.com"},"content":"On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina <tamar.christina@arm.com> wrote:\n>\n> Hi All,\n>\n> For targets that allow conversion between int and float modes this adds a new\n> optimization transforming fneg (fabs (x)) into x | (1 << signbit(x)).  Such\n> sequences are common in scientific code working with gradients.\n>\n> The transformed instruction if the target has an inclusive-OR that takes an\n> immediate is both shorter an faster.  For those that don't the immediate has\n> to be seperate constructed but this still ends up being faster as the immediate\n> construction is not on the critical path.\n>\n> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n>\n> Ok for master?\n\nI think this should be part of isel instead of match.\nMaybe we could use genmatch to generate the code that does the\ntransformations but this does not belong as part of match really.\n\nThanks,\nAndrew\n\n>\n> Thanks,\n> Tamar\n>\n> gcc/ChangeLog:\n>\n>         PR tree-optimization/109154\n>         * match.pd: Add new neg+abs rule.\n>\n> gcc/testsuite/ChangeLog:\n>\n>         PR tree-optimization/109154\n>         * gcc.target/aarch64/fneg-abs_1.c: New test.\n>         * gcc.target/aarch64/fneg-abs_2.c: New test.\n>         * gcc.target/aarch64/fneg-abs_3.c: New test.\n>         * gcc.target/aarch64/fneg-abs_4.c: New test.\n>         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n>         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n>         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n>         * 
gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n>\n> --- inline copy of patch --\n> diff --git a/gcc/match.pd b/gcc/match.pd\n> index 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d694826cffad0fb17e1136600a 100644\n> --- a/gcc/match.pd\n> +++ b/gcc/match.pd\n> @@ -9476,3 +9476,57 @@ and,\n>        }\n>        (if (full_perm_p)\n>         (vec_perm (op@3 @0 @1) @3 @2))))))\n> +\n> +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> +\n> +(simplify\n> + (negate (abs @0))\n> + (if (FLOAT_TYPE_P (type)\n> +      /* We have to delay this rewriting till after forward prop because otherwise\n> +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> +        matched in one go.  Instead cos (-x) is matched first followed by cos(|x|).\n> +        The bottom op approach makes this rule match first and it's not untill\n> +        fwdprop that we match top down.  There are manu such simplications so we\n> +        delay this optimization till later on.  */\n> +      && canonicalize_math_after_vectorization_p ())\n> +  (with {\n> +    tree itype = unsigned_type_for (type);\n> +    machine_mode mode = TYPE_MODE (type);\n> +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> +   (if (float_fmt\n> +       && float_fmt->signbit_rw >= 0\n> +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> +                                         TYPE_MODE (type), ALL_REGS)\n> +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> +    (with { wide_int wone = wi::one (element_precision (type));\n> +           int sbit = float_fmt->signbit_rw;\n> +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> +     (view_convert:type\n> +      (bit_ior (view_convert:itype @0)\n> +              { build_uniform_cst (itype, sign_bit); } )))))))\n> +\n> +/* Repeat the same but for conditional negate.  */\n> +\n> +(simplify\n> + (IFN_COND_NEG @1 (abs @0) @2)\n> + (if (FLOAT_TYPE_P (type))\n> +  (with {\n> +    tree itype = unsigned_type_for (type);\n> +    machine_mode mode = TYPE_MODE (type);\n> +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> +   (if (float_fmt\n> +       && float_fmt->signbit_rw >= 0\n> +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> +                                         TYPE_MODE (type), ALL_REGS)\n> +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> +    (with { wide_int wone = wi::one (element_precision (type));\n> +           int sbit = float_fmt->signbit_rw;\n> +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> +     (view_convert:type\n> +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> +              { build_uniform_cst (itype, sign_bit); }\n> +              (view_convert:itype @2) )))))))\n> \\ No newline at end of file\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> @@ -0,0 +1,39 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <arm_neon.h>\n> +\n> +/*\n> +** t1:\n> +**     orr     v[0-9]+.2s, #128, lsl #24\n> +**     ret\n> +*/\n> +float32x2_t t1 (float32x2_t a)\n> +{\n> +  return vneg_f32 (vabs_f32 (a));\n> +}\n> +\n> +/*\n> +** t2:\n> +**     orr     v[0-9]+.4s, #128, lsl #24\n> +**     ret\n> +*/\n> +float32x4_t t2 (float32x4_t a)\n> +{\n> +  return vnegq_f32 (vabsq_f32 (a));\n> +}\n> +\n> +/*\n> +** t3:\n> +**     adrp    x0, .LC[0-9]+\n> +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**     ret\n> +*/\n> +float64x2_t t3 (float64x2_t a)\n> +{\n> +  return vnegq_f64 (vabsq_f64 (a));\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> @@ -0,0 +1,31 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> 
+#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float32_t f1 (float32_t a)\n> +{\n> +  return -fabsf (a);\n> +}\n> +\n> +/*\n> +** f2:\n> +**     mov     x0, -9223372036854775808\n> +**     fmov    d[0-9]+, x0\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float64_t f2 (float64_t a)\n> +{\n> +  return -fabs (a);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> @@ -0,0 +1,36 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**     ...\n> +**     ldr     q[0-9]+, \\[x0\\]\n> +**     orr     v[0-9]+.4s, #128, lsl #24\n> +**     str     q[0-9]+, \\[x0\\], 16\n> +**     ...\n> +*/\n> +void f1 (float32_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabsf (a[i]);\n> +}\n> +\n> +/*\n> +** f2:\n> +**     ...\n> +**     ldr     q[0-9]+, \\[x0\\]\n> +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**     str     q[0-9]+, \\[x0\\], 16\n> +**     ...\n> +*/\n> +void f2 (float64_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabs (a[i]);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> @@ -0,0 +1,39 @@\n> +/* { dg-do compile 
} */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <string.h>\n> +\n> +/*\n> +** negabs:\n> +**     mov     x0, -9223372036854775808\n> +**     fmov    d[0-9]+, x0\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +double negabs (double x)\n> +{\n> +   unsigned long long y;\n> +   memcpy (&y, &x, sizeof(double));\n> +   y = y | (1UL << 63);\n> +   memcpy (&x, &y, sizeof(double));\n> +   return x;\n> +}\n> +\n> +/*\n> +** negabsf:\n> +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float negabsf (float x)\n> +{\n> +   unsigned int y;\n> +   memcpy (&y, &x, sizeof(float));\n> +   y = y | (1U << 31);\n> +   memcpy (&x, &y, sizeof(float));\n> +   return x;\n> +}\n> +\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> @@ -0,0 +1,37 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +\n> +/*\n> +** t1:\n> +**     orr     v[0-9]+.2s, #128, lsl #24\n> +**     ret\n> +*/\n> +float32x2_t t1 (float32x2_t a)\n> +{\n> +  return vneg_f32 (vabs_f32 (a));\n> +}\n> +\n> +/*\n> +** t2:\n> +**     orr     v[0-9]+.4s, #128, lsl #24\n> +**     ret\n> +*/\n> +float32x4_t t2 (float32x4_t a)\n> +{\n> +  return vnegq_f32 (vabsq_f32 (a));\n> +}\n> +\n> +/*\n> +** t3:\n> +**     adrp    x0, .LC[0-9]+\n> +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**     ret\n> +*/\n> +float64x2_t t3 (float64x2_t a)\n> +{\n> +  return 
vnegq_f64 (vabsq_f64 (a));\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> @@ -0,0 +1,29 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float32_t f1 (float32_t a)\n> +{\n> +  return -fabsf (a);\n> +}\n> +\n> +/*\n> +** f2:\n> +**     mov     x0, -9223372036854775808\n> +**     fmov    d[0-9]+, x0\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float64_t f2 (float64_t a)\n> +{\n> +  return -fabs (a);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> @@ -0,0 +1,34 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**     ...\n> +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> +**     ...\n> +*/\n> +void f1 (float32_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabsf (a[i]);\n> +}\n> +\n> +/*\n> +** f2:\n> +**     ...\n> +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> +**     orr 
    z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> +**     ...\n> +*/\n> +void f2 (float64_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabs (a[i]);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> @@ -0,0 +1,37 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <string.h>\n> +\n> +/*\n> +** negabs:\n> +**     mov     x0, -9223372036854775808\n> +**     fmov    d[0-9]+, x0\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +double negabs (double x)\n> +{\n> +   unsigned long long y;\n> +   memcpy (&y, &x, sizeof(double));\n> +   y = y | (1UL << 63);\n> +   memcpy (&x, &y, sizeof(double));\n> +   return x;\n> +}\n> +\n> +/*\n> +** negabsf:\n> +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**     ret\n> +*/\n> +float negabsf (float x)\n> +{\n> +   unsigned int y;\n> +   memcpy (&y, &x, sizeof(float));\n> +   y = y | (1U << 31);\n> +   memcpy (&x, &y, sizeof(float));\n> +   return x;\n> +}\n> +\n>\n>\n>\n>\n> --","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20230601 header.b=gcYh9fVb;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) 
smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","sourceware.org; spf=pass smtp.mailfrom=gmail.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwJdM4174z1yp0\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 11:17:57 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 6E6E93857B93\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 01:17:55 +0000 (GMT)","from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com\n [IPv6:2a00:1450:4864:20::42b])\n by sourceware.org (Postfix) with ESMTPS id E21D03858C27\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 01:17:38 +0000 (GMT)","by mail-wr1-x42b.google.com with SMTP id\n ffacd0b85a97d-3231d67aff2so6527003f8f.0\n for <gcc-patches@gcc.gnu.org>; Tue, 26 Sep 2023 18:17:38 -0700 (PDT)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org E21D03858C27","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=gmail.com; s=20230601; t=1695777458; x=1696382258; darn=gcc.gnu.org;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:from:to:cc:subject:date\n :message-id:reply-to;\n bh=n/zKg/PUHHW88XLW2X4Ux/JRzSUVi+rSfoAHCqJkxg0=;\n b=gcYh9fVbqaZ2B8DQEgaQoDY7sdAwXswKXRbtGHHZ7MJGGnJ8L9k9XMbMOINN3l3JdQ\n +29gfm4VoGmHIqu2v5cC+qMOzYXEfj189WVYJpY6J7+Rv6zFRDG3sBA0lrcssGgaYzui\n G2asdlteGGVyH1/pAylwCAsGPkNhP85GBZXM4UxWadnj5XouE0Sma/AtrnN3yds2E0q2\n 
Gfdv+MMGhVywhqwb+C6xynkEmEcXzuUrY9L/7+KB+DHUz8jIFvbRqryo3GMQnTbUX/lg\n xInNn/BG3eMZMj27eokl4GZAwee1InpiMGvw2pLfRQeksfwEG12dqdgAe31D9+G9o/Uo\n UPUg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20230601; t=1695777458; x=1696382258;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc\n :subject:date:message-id:reply-to;\n bh=n/zKg/PUHHW88XLW2X4Ux/JRzSUVi+rSfoAHCqJkxg0=;\n b=d5/e7PQyfJjv+swnD6POYnJmxFad3Bzi51av7BWvwKP2H/le5SXsCIerrTpS+SvVYY\n MrrP08EUQ00cNbkXmAVYuy1iBLGF+KdoF/WFVYw5A+jFalCuaaUmFherdNIeRshiQMJS\n 3MZgrXlRlljexJ4hE5HkE+sAlUcKJM1i23ChOaiiLiW9uJEZVsyh+4D81X+DfuFDQexb\n 1uJPn/IlmK4Aien51C6WjfLQLeSH55y6lt15GjqV8tzFHewr3DfZJy6+VR0Q0tacG6DD\n XZQkkN4oNAJqOPDuXD3kShCf8IMMIXHVU5klgoXGSEl6WkUbFq1nhbpChqECS7OX2E+K\n VjhQ==","X-Gm-Message-State":"AOJu0YzIHkdA03EKDtO2X8qTFInvlIsZev3NRmpAwi8KgoqYJRemWdA6\n HyyCQb1nGzbE3CpeKlu5hJGMWx8/WQ8EJMyeJ01hHCyo8XE=","X-Google-Smtp-Source":"\n AGHT+IGmlAW2HGgZCfliw3XX8ic1a/HBvcD42w1NOnmAy0mmR0hcgLcd8CrXYE8mhqaxnWs278vAFAfX7dcqsb8Q3xI=","X-Received":"by 2002:a5d:5582:0:b0:317:ef6:89a7 with SMTP id\n i2-20020a5d5582000000b003170ef689a7mr278572wrv.27.1695777457385; Tue, 26 Sep\n 2023 18:17:37 -0700 (PDT)","MIME-Version":"1.0","References":"<patch-17718-tamar@arm.com>","In-Reply-To":"<patch-17718-tamar@arm.com>","From":"Andrew Pinski <pinskia@gmail.com>","Date":"Tue, 26 Sep 2023 18:17:24 -0700","Message-ID":"\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","To":"Tamar Christina <tamar.christina@arm.com>","Cc":"gcc-patches@gcc.gnu.org, nd@arm.com, rguenther@suse.de,\n jlaw@ventanamicro.com","Content-Type":"text/plain; charset=\"UTF-8\"","Content-Transfer-Encoding":"quoted-printable","X-Spam-Status":"No, score=-7.9 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, 
DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,\n KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3188289,"web_url":"http://patchwork.ozlabs.org/comment/3188289/","msgid":"<VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","list_archive_url":null,"date":"2023-09-27T02:31:37","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":69689,"url":"http://patchwork.ozlabs.org/api/people/69689/","name":"Tamar Christina","email":"Tamar.Christina@arm.com"},"content":"> -----Original Message-----\n> From: Andrew Pinski <pinskia@gmail.com>\n> Sent: Wednesday, September 27, 2023 2:17 AM\n> To: Tamar Christina <Tamar.Christina@arm.com>\n> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de;\n> jlaw@ventanamicro.com\n> Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n> signbit(x)) [PR109154]\n> \n> On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina <tamar.christina@arm.com>\n> wrote:\n> >\n> > Hi All,\n> >\n> > For targets that allow conversion between int and float modes this\n> > adds a new optimization 
transforming fneg (fabs (x)) into x | (1 <<\n> > signbit(x)).  Such sequences are common in scientific code working with\n> gradients.\n> >\n> > The transformed instruction if the target has an inclusive-OR that\n> > takes an immediate is both shorter an faster.  For those that don't\n> > the immediate has to be seperate constructed but this still ends up\n> > being faster as the immediate construction is not on the critical path.\n> >\n> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> >\n> > Ok for master?\n> \n> I think this should be part of isel instead of match.\n> Maybe we could use genmatch to generate the code that does the\n> transformations but this does not belong as part of match really.\n\nI disagree.. I don't think this belongs in isel. Isel is for structural transformations.\nIf there is a case for something else I'd imagine backwardprop is a better choice.\n\nBut I don't see why it doesn't belong here considering it *is* a mathematical optimization\nand the file has plenty of transformations such as mask optimizations and vector conditional\nrewriting.\n\nRegards,\nTamar\n\n> \n> Thanks,\n> Andrew\n> \n> >\n> > Thanks,\n> > Tamar\n> >\n> > gcc/ChangeLog:\n> >\n> >         PR tree-optimization/109154\n> >         * match.pd: Add new neg+abs rule.\n> >\n> > gcc/testsuite/ChangeLog:\n> >\n> >         PR tree-optimization/109154\n> >         * gcc.target/aarch64/fneg-abs_1.c: New test.\n> >         * gcc.target/aarch64/fneg-abs_2.c: New test.\n> >         * gcc.target/aarch64/fneg-abs_3.c: New test.\n> >         * gcc.target/aarch64/fneg-abs_4.c: New test.\n> >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> >\n> > --- inline copy of patch --\n> > diff --git a/gcc/match.pd b/gcc/match.pd index\n> >\n> 
39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d69482\n> 6cffad0f\n> > b17e1136600a 100644\n> > --- a/gcc/match.pd\n> > +++ b/gcc/match.pd\n> > @@ -9476,3 +9476,57 @@ and,\n> >        }\n> >        (if (full_perm_p)\n> >         (vec_perm (op@3 @0 @1) @3 @2))))))\n> > +\n> > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> > +\n> > +(simplify\n> > + (negate (abs @0))\n> > + (if (FLOAT_TYPE_P (type)\n> > +      /* We have to delay this rewriting till after forward prop because\n> otherwise\n> > +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> > +        matched in one go.  Instead cos (-x) is matched first followed by\n> cos(|x|).\n> > +        The bottom op approach makes this rule match first and it's not untill\n> > +        fwdprop that we match top down.  There are manu such simplications\n> so we\n> > +        delay this optimization till later on.  */\n> > +      && canonicalize_math_after_vectorization_p ())\n> > +  (with {\n> > +    tree itype = unsigned_type_for (type);\n> > +    machine_mode mode = TYPE_MODE (type);\n> > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> > +   (if (float_fmt\n> > +       && float_fmt->signbit_rw >= 0\n> > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > +                                         TYPE_MODE (type), ALL_REGS)\n> > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > +    (with { wide_int wone = wi::one (element_precision (type));\n> > +           int sbit = float_fmt->signbit_rw;\n> > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > +     (view_convert:type\n> > +      (bit_ior (view_convert:itype @0)\n> > +              { build_uniform_cst (itype, sign_bit); } )))))))\n> > +\n> > +/* Repeat the same but for conditional negate.  */\n> > +\n> > +(simplify\n> > + (IFN_COND_NEG @1 (abs @0) @2)\n> > + (if (FLOAT_TYPE_P (type))\n> > +  (with {\n> > +    tree itype = unsigned_type_for (type);\n> > +    machine_mode mode = TYPE_MODE (type);\n> > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> > +   (if (float_fmt\n> > +       && float_fmt->signbit_rw >= 0\n> > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > +                                         TYPE_MODE (type), ALL_REGS)\n> > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > +    (with { wide_int wone = wi::one (element_precision (type));\n> > +           int sbit = float_fmt->signbit_rw;\n> > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > +     (view_convert:type\n> > +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> > +              { build_uniform_cst (itype, sign_bit); }\n> > +              (view_convert:itype @2) )))))))\n> > \\ No newline at end of file\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..f823013c3ddf6b3a266\n> c3abfcbf2\n> > 642fc2a75fa6\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > @@ -0,0 +1,39 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#pragma GCC target \"+nosve\"\n> > +\n> > +#include <arm_neon.h>\n> > +\n> > +/*\n> > +** t1:\n> > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > +**     ret\n> > +*/\n> > +float32x2_t t1 (float32x2_t a)\n> > +{\n> > +  return vneg_f32 (vabs_f32 (a));\n> > +}\n> > +\n> > +/*\n> > +** t2:\n> > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > +**     ret\n> > +*/\n> > +float32x4_t t2 (float32x4_t a)\n> > +{\n> > +  return vnegq_f32 (vabsq_f32 (a));\n> > +}\n> > +\n> > +/*\n> > +** t3:\n> > +**     adrp    x0, .LC[0-9]+\n> > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > +**     ret\n> > +*/\n> > +float64x2_t t3 (float64x2_t a)\n> > +{\n> > +  return vnegq_f64 (vabsq_f64 (a));\n> > +}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..141121176b309e4b2a\n> a413dc5527\n> > 1a6e3c93d5e1\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > @@ -0,0 +1,31 @@\n> > +/* { dg-do compile } 
*/\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#pragma GCC target \"+nosve\"\n> > +\n> > +#include <arm_neon.h>\n> > +#include <math.h>\n> > +\n> > +/*\n> > +** f1:\n> > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float32_t f1 (float32_t a)\n> > +{\n> > +  return -fabsf (a);\n> > +}\n> > +\n> > +/*\n> > +** f2:\n> > +**     mov     x0, -9223372036854775808\n> > +**     fmov    d[0-9]+, x0\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float64_t f2 (float64_t a)\n> > +{\n> > +  return -fabs (a);\n> > +}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..b4652173a95d104ddf\n> a70c497f06\n> > 27a61ea89d3b\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > @@ -0,0 +1,36 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#pragma GCC target \"+nosve\"\n> > +\n> > +#include <arm_neon.h>\n> > +#include <math.h>\n> > +\n> > +/*\n> > +** f1:\n> > +**     ...\n> > +**     ldr     q[0-9]+, \\[x0\\]\n> > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > +**     str     q[0-9]+, \\[x0\\], 16\n> > +**     ...\n> > +*/\n> > +void f1 (float32_t *a, int n)\n> > +{\n> > +  for (int i = 0; i < (n & -8); i++)\n> > +   a[i] = -fabsf (a[i]);\n> > +}\n> > +\n> > +/*\n> > +** f2:\n> > +**     ...\n> > +**     ldr     q[0-9]+, \\[x0\\]\n> > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > +**     str     q[0-9]+, \\[x0\\], 16\n> > +**     ...\n> > +*/\n> > +void f2 (float64_t *a, int n)\n> > +{\n> > +  for (int i = 0; i < (n & -8); i++)\n> > +   a[i] = -fabs (a[i]);\n> > 
+}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..10879dea74462d34b2\n> 6160eeb0bd\n> > 54ead063166b\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > @@ -0,0 +1,39 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#pragma GCC target \"+nosve\"\n> > +\n> > +#include <string.h>\n> > +\n> > +/*\n> > +** negabs:\n> > +**     mov     x0, -9223372036854775808\n> > +**     fmov    d[0-9]+, x0\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +double negabs (double x)\n> > +{\n> > +   unsigned long long y;\n> > +   memcpy (&y, &x, sizeof(double));\n> > +   y = y | (1UL << 63);\n> > +   memcpy (&x, &y, sizeof(double));\n> > +   return x;\n> > +}\n> > +\n> > +/*\n> > +** negabsf:\n> > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float negabsf (float x)\n> > +{\n> > +   unsigned int y;\n> > +   memcpy (&y, &x, sizeof(float));\n> > +   y = y | (1U << 31);\n> > +   memcpy (&x, &y, sizeof(float));\n> > +   return x;\n> > +}\n> > +\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..0c7664e6de77a49768\n> 2952653ffd\n> > 417453854d52\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > @@ -0,0 +1,37 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#include <arm_neon.h>\n> > +\n> > +/*\n> > +** t1:\n> > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > 
+**     ret\n> > +*/\n> > +float32x2_t t1 (float32x2_t a)\n> > +{\n> > +  return vneg_f32 (vabs_f32 (a));\n> > +}\n> > +\n> > +/*\n> > +** t2:\n> > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > +**     ret\n> > +*/\n> > +float32x4_t t2 (float32x4_t a)\n> > +{\n> > +  return vnegq_f32 (vabsq_f32 (a));\n> > +}\n> > +\n> > +/*\n> > +** t3:\n> > +**     adrp    x0, .LC[0-9]+\n> > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > +**     ret\n> > +*/\n> > +float64x2_t t3 (float64x2_t a)\n> > +{\n> > +  return vnegq_f64 (vabsq_f64 (a));\n> > +}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..a60cd31b9294af2dac6\n> 9eed1c93f\n> > 899bd5c78fca\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > @@ -0,0 +1,29 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#include <arm_neon.h>\n> > +#include <math.h>\n> > +\n> > +/*\n> > +** f1:\n> > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float32_t f1 (float32_t a)\n> > +{\n> > +  return -fabsf (a);\n> > +}\n> > +\n> > +/*\n> > +** f2:\n> > +**     mov     x0, -9223372036854775808\n> > +**     fmov    d[0-9]+, x0\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float64_t f2 (float64_t a)\n> > +{\n> > +  return -fabs (a);\n> > +}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..1bf34328d8841de8e6\n> b0a5458562\n> > a9f00e31c275\n> > --- /dev/null\n> > +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > @@ -0,0 +1,34 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#include <arm_neon.h>\n> > +#include <math.h>\n> > +\n> > +/*\n> > +** f1:\n> > +**     ...\n> > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> > +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> > +**     ...\n> > +*/\n> > +void f1 (float32_t *a, int n)\n> > +{\n> > +  for (int i = 0; i < (n & -8); i++)\n> > +   a[i] = -fabsf (a[i]);\n> > +}\n> > +\n> > +/*\n> > +** f2:\n> > +**     ...\n> > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> > +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> > +**     ...\n> > +*/\n> > +void f2 (float64_t *a, int n)\n> > +{\n> > +  for (int i = 0; i < (n & -8); i++)\n> > +   a[i] = -fabs (a[i]);\n> > +}\n> > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > new file mode 100644\n> > index\n> >\n> 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d0\n> 1f6604ca7b\n> > e87e3744d494\n> > --- /dev/null\n> > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > @@ -0,0 +1,37 @@\n> > +/* { dg-do compile } */\n> > +/* { dg-options \"-O3\" } */\n> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > +*/\n> > +\n> > +#include <string.h>\n> > +\n> > +/*\n> > +** negabs:\n> > +**     mov     x0, -9223372036854775808\n> > +**     fmov    d[0-9]+, x0\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +double negabs (double x)\n> > +{\n> > +   unsigned long long y;\n> > +   memcpy (&y, &x, sizeof(double));\n> > +   y = y | (1UL << 63);\n> > +   memcpy (&x, &y, sizeof(double));\n> > +   return 
x;\n> > +}\n> > +\n> > +/*\n> > +** negabsf:\n> > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > +**     ret\n> > +*/\n> > +float negabsf (float x)\n> > +{\n> > +   unsigned int y;\n> > +   memcpy (&y, &x, sizeof(float));\n> > +   y = y | (1U << 31);\n> > +   memcpy (&x, &y, sizeof(float));\n> > +   return x;\n> > +}\n> > +\n> >\n> >\n> >\n> >\n> > --","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com\n header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com\n header.b=m139maCM;\n\tdkim=pass (1024-bit key) header.d=armh.onmicrosoft.com\n header.i=@armh.onmicrosoft.com header.a=rsa-sha256\n header.s=selector2-armh-onmicrosoft-com header.b=m139maCM;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwLH110PYz1yp0\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 12:32:11 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 4F9FB3861809\n\tfor 
<incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 02:32:07 +0000 (GMT)","from EUR01-DB5-obe.outbound.protection.outlook.com\n (mail-db5eur01on2078.outbound.protection.outlook.com [40.107.15.78])\n by sourceware.org (Postfix) with ESMTPS id 28CDE3858C74\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 02:31:52 +0000 (GMT)","from AS8P251CA0008.EURP251.PROD.OUTLOOK.COM (2603:10a6:20b:2f2::26)\n by VI1PR08MB10005.eurprd08.prod.outlook.com (2603:10a6:800:1bf::7)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6813.28; Wed, 27 Sep\n 2023 02:31:47 +0000","from AM7EUR03FT054.eop-EUR03.prod.protection.outlook.com\n (2603:10a6:20b:2f2:cafe::f7) by AS8P251CA0008.outlook.office365.com\n (2603:10a6:20b:2f2::26) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21 via Frontend\n Transport; Wed, 27 Sep 2023 02:31:47 +0000","from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by\n AM7EUR03FT054.mail.protection.outlook.com (100.127.140.133) with\n Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.6838.19 via Frontend Transport; Wed, 27 Sep 2023 02:31:47 +0000","(\"Tessian outbound 5c548696a0e7:v175\");\n Wed, 27 Sep 2023 02:31:47 +0000","from b6c99a67d81d.1\n by 64aa7808-outbound-1.mta.getcheckrecipient.com id\n 66A1BB8F-CECC-45F1-8EE1-CD98DDA0F3CD.1;\n Wed, 27 Sep 2023 02:31:40 +0000","from EUR04-DB3-obe.outbound.protection.outlook.com\n by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id\n b6c99a67d81d.1\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);\n Wed, 27 Sep 2023 02:31:40 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)\n by AS8PR08MB8828.eurprd08.prod.outlook.com (2603:10a6:20b:5b9::5)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21; Wed, 27 Sep\n 2023 02:31:38 +0000","from 
VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1]) by VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1%7]) with mapi id 15.20.6813.027; Wed, 27 Sep 2023\n 02:31:37 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 28CDE3858C74","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=eDH8hgjOtf6aiKJba4Vhh5N3p4uUj8R2rXxsOBDX/64=;\n b=m139maCMl22cDCetRi+ctrS/Hvb7JxJplbOLTEq84vk8X/LPbyYWAM5w6rIpRnRL4rFSSA6ow2lRyRZJoqe0V1BRxFNM8LIfnSyd75+6KJcoy/CH+Rrqz1stxa8sG3X71JfHybixmKP3YzFDl+YCBhfqEr89PlcXA+9lwzv1JFQ=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=eDH8hgjOtf6aiKJba4Vhh5N3p4uUj8R2rXxsOBDX/64=;\n b=m139maCMl22cDCetRi+ctrS/Hvb7JxJplbOLTEq84vk8X/LPbyYWAM5w6rIpRnRL4rFSSA6ow2lRyRZJoqe0V1BRxFNM8LIfnSyd75+6KJcoy/CH+Rrqz1stxa8sG3X71JfHybixmKP3YzFDl+YCBhfqEr89PlcXA+9lwzv1JFQ="],"X-MS-Exchange-Authentication-Results":"spf=pass (sender IP is 63.35.35.123)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;","Received-SPF":"Pass (protection.outlook.com: domain of arm.com designates\n 63.35.35.123 as permitted sender) receiver=protection.outlook.com;\n client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;\n pr=C","X-CheckRecipientChecked":"true","X-CR-MTA-CID":"3647e06408f43704","X-CR-MTA-TID":"64aa7808","ARC-Seal":"i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;\n 
b=W0c5qKTKsVd95hBpiS4L61Fu4cKwz8djhFsclrFR7rDd3zTdyNEHo5ipYZUeLoTJfbckD0l77UnKDwVrxZdhpY6hNW9jhmGF21uuAa4HljZuPVG/pSgrSc3aTF4dKHkNH2Ei6tb73hn1hqP8wQBruxPd+9F5v7LxwmNZg/mS5cDnFYi9W4JITbfas/MI9l+NUIsYRa17UvytD2GspVTcjD/fc9Nr2V1Y/5f4dtLzMmdWNCeUPYuVU2rRHI5WekWxe8gLOf+NycSLmj0DvuyWrhtodHTkeH+kioAKaeHCHNMXp+tZTAiJCs2kCRubu9Mfq7eG0gK/e87Ji1fmIX4zdQ==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector9901;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=eDH8hgjOtf6aiKJba4Vhh5N3p4uUj8R2rXxsOBDX/64=;\n b=REICPbNd7vp7tFCZM4znxw4RL5AAgCXvahIFQJvQBuiF0RKYoOaEswMHAse/2VvUENMDDIA4cUaSXsKNcZWt0/Y5d4tq6y6/seFVtvK5XQg96qMhD70d90tVonscNJHmnzG6FJcMXIDRRgRUoeK05k0/VyiBloSX4CvFsAoFqNUZpie/qhi95N4AoIxoPoYXwRGn9fwhIdHCsnzlCubIc/z5PQ2bFJxu8d2uyqI3iqsh2IUBPZgrjPM5YB9NB+pFUUP4NY4KP2VuhbrQf1cwf8H27bjyTDSDR7Rfhg8XDkW8MOQ83YnCJrUM9+36PQg14KYXuNuo8xjw44ug0uxShA==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass\n header.d=arm.com; arc=none","From":"Tamar Christina <Tamar.Christina@arm.com>","To":"Andrew Pinski <pinskia@gmail.com>","CC":"\"gcc-patches@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"rguenther@suse.de\" <rguenther@suse.de>, \"jlaw@ventanamicro.com\"\n <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Thread-Topic":"[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","Thread-Index":"AQHZ8NyYUy/9C+tZQUuY9xAIpvunf7At3wQAgAAUNTA=","Date":"Wed, 27 Sep 2023 02:31:37 +0000","Message-ID":"\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","References":"<patch-17718-tamar@arm.com>\n 
<CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>","In-Reply-To":"\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","Authentication-Results-Original":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","x-ms-traffictypediagnostic":"\n VI1PR08MB5325:EE_|AS8PR08MB8828:EE_|AM7EUR03FT054:EE_|VI1PR08MB10005:EE_","X-MS-Office365-Filtering-Correlation-Id":"6cf3fe40-b93f-43ad-5e41-08dbbf01e5be","x-checkrecipientrouted":"true","nodisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;","X-Microsoft-Antispam-Message-Info-Original":"\n l89k/TGE92Y4XRyMZrJrb5+XrnKdouzug0KXyj79JZuscK2Ty6xof5zkc83kCk9MZ5jrxaSdPB0mWFGWqmE3FdrB1pJsgRHYd7Zvblkq0Yoe21txY27jIrxZmSqRO6lyUzPVpWs+0kiyvhyHGZ3HUm4w+ULs+ecPs1m1eMIGRBF6Wdg58pGSJZ2R7UQ3L38TscZriB+4l8rYUpVYRFpE/+eQR/me82brLTiSrL2QI5X5hJIOBGsKFNuw3tMiZWK0L4o9ZtYfdt6/+cAEPVota+Y87i2++2btZd5ZIB9d0aJHvt21/Cd68rNQPAr2ALTyrdumrrI/GzInQbv0nN4jjeb7arnXNwd/vMPCDYSxOc0kto8UaCFYbmrk/6riThtII+0V8ROy8WnbeWcgAI6GFmeU+xYDX7w2oOzRl4y6AlaX4HAyVKbYeG0/DXtIxPdsqHwc/6OV8PW/ScXSzFErciswHzIF+CQkV13NIZnN2SHxzgk6H1NXZL5fN2qNDPZCj0EEtvgV+l3mMZ167ltEwlbuAWSy1J8rPkDaF9xbVZO7CuE0Q/nDXBAagjXwU98UO8yLn5qwZVrIS0T0zOLyLWt6YBlu66NHop1zdv/tHs9H7oiRcZz2heTHSCML6ByvWdpOmjZz8RWsU8D57GuoJST/bGmbM61QVWITU5p+fC8=","X-Forefront-Antispam-Report-Untrusted":"CIP:255.255.255.255; CTRY:; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;\n PTR:; CAT:NONE;\n 
SFS:(13230031)(376002)(366004)(39860400002)(396003)(136003)(346002)(230922051799003)(451199024)(1800799009)(186009)(38070700005)(38100700002)(122000001)(33656002)(86362001)(84970400001)(30864003)(66946007)(7696005)(2906002)(478600001)(53546011)(55016003)(6506007)(8936002)(5660300002)(4326008)(52536014)(8676002)(76116006)(41300700001)(71200400001)(9686003)(83380400001)(26005)(66446008)(66556008)(66476007)(316002)(54906003)(6916009)(64756008)(357404004);\n DIR:OUT; SFP:1101;","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","MIME-Version":"1.0","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["AS8PR08MB8828","VI1PR08MB10005"],"Original-Authentication-Results":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","X-EOPAttributedMessage":"0","X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n AM7EUR03FT054.eop-EUR03.prod.protection.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n 6dfec429-8546-48f7-d4ce-08dbbf01dfde","X-Microsoft-Antispam":"BCL:0;","X-Microsoft-Antispam-Message-Info":"\n cvNm5rFOD+RrUvDN7aRF+0E44KEb5PoJU8R6dmNORJ8hLxJx9tMtlMuPWG69DNF0PPH8lyz5MeXgzPtZflRzZuYHOFQuni119iAy/hpoYwRaVN8NhxMOX241E2QDUJWyOA+NDvwPrxL3Twm8pPAjS25LjipcehsShdsZYQTbVHYP3RnqpF2BP6QkL5KqOFujXysOKjIwmIlwHRx/UaytnvhwLZ11Fz7s6dXtZuHbOAw2ZKVapQ5Sr3OSKiWjf+1zUbZnjenQCqp9ZEDhfGrMht12KDEXhmSdKFhex4N3eO7V4CdX6F79FOu+bF9aAzk+TpvntXkx9ZQExQGei2VfFeDEW7atzYFrgBtZanFyOobuM8yl7gi+H+WWABMs+qWEF3twSsH34cfE3EWKm2BcIYi8Ls4rLexmlCdc2KCNSf6b4z1tTFudQ0aHqtvNctlx0LZxVYJX/2pKTR5LQwccjQq7CjZkVrh/4hZebwpcYJE61cq/k+J7fTdxAGNhY0Q1AFfmYndzuefIjA4ImqpgRB1QIv0kth91X6npKKLAMPGmUiAmU3sX5sDt7nQ+lv1LX2VxCTIaTrKSYao0izVock5jbJBJMGxdeMm8iW0kvZ2lslCV0f72s5UGA9LjJkudcRO0pr1QNoPUft50caJpevPxcip/SoWK9P9NC239MQbOjehvMixdE0BPNr8My9X9Jp/eki21d18z8g4+BhbStEX+/sAyYEp8VpICBlgSYWQQyt5cNflH88HtuHs7Odc4/vnt2LSfVnLMfjATOFF76Y1iD6PuZUbuW0wHuMa8RNw=","X-Forefront-Antispam-Report":"CIP:63.35.35.123; 
CTRY:IE; LANG:en; SCL:1; SRV:;\n IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;\n PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;\n SFS:(13230031)(4636009)(136003)(396003)(376002)(39860400002)(346002)(230922051799003)(1800799009)(186009)(82310400011)(451199024)(46966006)(36840700001)(40470700004)(40460700003)(53546011)(7696005)(26005)(356005)(70586007)(82740400003)(107886003)(86362001)(9686003)(70206006)(33656002)(478600001)(47076005)(81166007)(36860700001)(6506007)(54906003)(336012)(55016003)(83380400001)(6862004)(8936002)(2906002)(41300700001)(4326008)(40480700001)(316002)(8676002)(52536014)(5660300002)(30864003)(84970400001)(357404004);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"27 Sep 2023 02:31:47.2606 (UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 6cf3fe40-b93f-43ad-5e41-08dbbf01e5be","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];\n Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n AM7EUR03FT054.eop-EUR03.prod.protection.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-Spam-Status":"No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH,\n KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE,\n TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3188437,"web_url":"http://patchwork.ozlabs.org/comment/3188437/","msgid":"<nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>","list_archive_url":null,"date":"2023-09-27T07:11:37","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard Biener","email":"rguenther@suse.de"},"content":"On Wed, 27 Sep 2023, Tamar Christina wrote:\n\n> > -----Original Message-----\n> > From: Andrew Pinski <pinskia@gmail.com>\n> > Sent: Wednesday, September 27, 2023 2:17 AM\n> > To: Tamar Christina <Tamar.Christina@arm.com>\n> > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de;\n> > jlaw@ventanamicro.com\n> > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n> > signbit(x)) [PR109154]\n> > \n> > On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina <tamar.christina@arm.com>\n> > wrote:\n> > >\n> > > Hi All,\n> > >\n> > > For targets that allow conversion between int and float modes this\n> > > adds a new optimization transforming fneg (fabs (x)) into x | (1 <<\n> > > signbit(x)).  Such sequences are common in scientific code working with\n> > gradients.\n> > >\n> > > The transformed instruction if the target has an inclusive-OR that\n> > > takes an immediate is both shorter an faster.  
For those that don't\n> > > the immediate has to be seperate constructed but this still ends up\n> > > being faster as the immediate construction is not on the critical path.\n> > >\n> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > >\n> > > Ok for master?\n> > \n> > I think this should be part of isel instead of match.\n> > Maybe we could use genmatch to generate the code that does the\n> > transformations but this does not belong as part of match really.\n> \n> I disagree.. I don't think this belongs in isel. Isel is for structural transformations.\n> If there is a case for something else I'd imagine backwardprop is a better choice.\n> \n> But I don't see why it doesn't belong here considering it *is* a mathematical optimization\n> and the file has plenty of transformations such as mask optimizations and vector conditional\n> rewriting.\n\nBut the mathematical transform would more generally be\nfneg (fabs (x)) -> copysign (x, -1.) and that can be optimally expanded\nat RTL expansion time?\n\nRichard.\n\n> Regards,\n> Tamar\n> \n> > \n> > Thanks,\n> > Andrew\n> > \n> > >\n> > > Thanks,\n> > > Tamar\n> > >\n> > > gcc/ChangeLog:\n> > >\n> > >         PR tree-optimization/109154\n> > >         * match.pd: Add new neg+abs rule.\n> > >\n> > > gcc/testsuite/ChangeLog:\n> > >\n> > >         PR tree-optimization/109154\n> > >         * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > >         * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > >         * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > >         * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > >\n> > > --- inline copy of patch --\n> > > diff --git a/gcc/match.pd b/gcc/match.pd index\n> > >\n> > 
39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d69482\n> > 6cffad0f\n> > > b17e1136600a 100644\n> > > --- a/gcc/match.pd\n> > > +++ b/gcc/match.pd\n> > > @@ -9476,3 +9476,57 @@ and,\n> > >        }\n> > >        (if (full_perm_p)\n> > >         (vec_perm (op@3 @0 @1) @3 @2))))))\n> > > +\n> > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> > > +\n> > > +(simplify\n> > > + (negate (abs @0))\n> > > + (if (FLOAT_TYPE_P (type)\n> > > +      /* We have to delay this rewriting till after forward prop because\n> > otherwise\n> > > +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> > > +        matched in one go.  Instead cos (-x) is matched first followed by\n> > cos(|x|).\n> > > +        The bottom op approach makes this rule match first and it's not untill\n> > > +        fwdprop that we match top down.  There are manu such simplications\n> > so we\n> > > +        delay this optimization till later on.  */\n> > > +      && canonicalize_math_after_vectorization_p ())\n> > > +  (with {\n> > > +    tree itype = unsigned_type_for (type);\n> > > +    machine_mode mode = TYPE_MODE (type);\n> > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> > > +   (if (float_fmt\n> > > +       && float_fmt->signbit_rw >= 0\n> > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > +           int sbit = float_fmt->signbit_rw;\n> > > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > +     (view_convert:type\n> > > +      (bit_ior (view_convert:itype @0)\n> > > +              { build_uniform_cst (itype, sign_bit); } )))))))\n> > > +\n> > > +/* Repeat the same but for conditional negate.  */\n> > > +\n> > > +(simplify\n> > > + (IFN_COND_NEG @1 (abs @0) @2)\n> > > + (if (FLOAT_TYPE_P (type))\n> > > +  (with {\n> > > +    tree itype = unsigned_type_for (type);\n> > > +    machine_mode mode = TYPE_MODE (type);\n> > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> > > +   (if (float_fmt\n> > > +       && float_fmt->signbit_rw >= 0\n> > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > +           int sbit = float_fmt->signbit_rw;\n> > > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > +     (view_convert:type\n> > > +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> > > +              { build_uniform_cst (itype, sign_bit); }\n> > > +              (view_convert:itype @2) )))))))\n> > > \\ No newline at end of file\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..f823013c3ddf6b3a266\n> > c3abfcbf2\n> > > 642fc2a75fa6\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > @@ -0,0 +1,39 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#pragma GCC target \"+nosve\"\n> > > +\n> > > +#include <arm_neon.h>\n> > > +\n> > > +/*\n> > > +** t1:\n> > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > +**     ret\n> > > +*/\n> > > +float32x2_t t1 (float32x2_t a)\n> > > +{\n> > > +  return vneg_f32 (vabs_f32 (a));\n> > > +}\n> > > +\n> > > +/*\n> > > +** t2:\n> > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > +**     ret\n> > > +*/\n> > > +float32x4_t t2 (float32x4_t a)\n> > > +{\n> > > +  return vnegq_f32 (vabsq_f32 (a));\n> > > +}\n> > > +\n> > > +/*\n> > > +** t3:\n> > > +**     adrp    x0, .LC[0-9]+\n> > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > +**     ret\n> > > +*/\n> > > +float64x2_t t3 (float64x2_t a)\n> > > +{\n> > > +  return vnegq_f64 (vabsq_f64 (a));\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..141121176b309e4b2a\n> > a413dc5527\n> > > 
1a6e3c93d5e1\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > @@ -0,0 +1,31 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#pragma GCC target \"+nosve\"\n> > > +\n> > > +#include <arm_neon.h>\n> > > +#include <math.h>\n> > > +\n> > > +/*\n> > > +** f1:\n> > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float32_t f1 (float32_t a)\n> > > +{\n> > > +  return -fabsf (a);\n> > > +}\n> > > +\n> > > +/*\n> > > +** f2:\n> > > +**     mov     x0, -9223372036854775808\n> > > +**     fmov    d[0-9]+, x0\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float64_t f2 (float64_t a)\n> > > +{\n> > > +  return -fabs (a);\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..b4652173a95d104ddf\n> > a70c497f06\n> > > 27a61ea89d3b\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > @@ -0,0 +1,36 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#pragma GCC target \"+nosve\"\n> > > +\n> > > +#include <arm_neon.h>\n> > > +#include <math.h>\n> > > +\n> > > +/*\n> > > +** f1:\n> > > +**     ...\n> > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > +**     ...\n> > > +*/\n> > > +void f1 (float32_t *a, int n)\n> > > +{\n> > > +  for (int i = 0; i < (n & -8); i++)\n> > > +   a[i] = -fabsf (a[i]);\n> > > +}\n> > > +\n> > > +/*\n> > > +** f2:\n> > > +**     ...\n> > 
> +**     ldr     q[0-9]+, \\[x0\\]\n> > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > +**     ...\n> > > +*/\n> > > +void f2 (float64_t *a, int n)\n> > > +{\n> > > +  for (int i = 0; i < (n & -8); i++)\n> > > +   a[i] = -fabs (a[i]);\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..10879dea74462d34b2\n> > 6160eeb0bd\n> > > 54ead063166b\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > @@ -0,0 +1,39 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#pragma GCC target \"+nosve\"\n> > > +\n> > > +#include <string.h>\n> > > +\n> > > +/*\n> > > +** negabs:\n> > > +**     mov     x0, -9223372036854775808\n> > > +**     fmov    d[0-9]+, x0\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +double negabs (double x)\n> > > +{\n> > > +   unsigned long long y;\n> > > +   memcpy (&y, &x, sizeof(double));\n> > > +   y = y | (1UL << 63);\n> > > +   memcpy (&x, &y, sizeof(double));\n> > > +   return x;\n> > > +}\n> > > +\n> > > +/*\n> > > +** negabsf:\n> > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float negabsf (float x)\n> > > +{\n> > > +   unsigned int y;\n> > > +   memcpy (&y, &x, sizeof(float));\n> > > +   y = y | (1U << 31);\n> > > +   memcpy (&x, &y, sizeof(float));\n> > > +   return x;\n> > > +}\n> > > +\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 
0000000000000000000000000000000000000000..0c7664e6de77a49768\n> > 2952653ffd\n> > > 417453854d52\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > @@ -0,0 +1,37 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#include <arm_neon.h>\n> > > +\n> > > +/*\n> > > +** t1:\n> > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > +**     ret\n> > > +*/\n> > > +float32x2_t t1 (float32x2_t a)\n> > > +{\n> > > +  return vneg_f32 (vabs_f32 (a));\n> > > +}\n> > > +\n> > > +/*\n> > > +** t2:\n> > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > +**     ret\n> > > +*/\n> > > +float32x4_t t2 (float32x4_t a)\n> > > +{\n> > > +  return vnegq_f32 (vabsq_f32 (a));\n> > > +}\n> > > +\n> > > +/*\n> > > +** t3:\n> > > +**     adrp    x0, .LC[0-9]+\n> > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > +**     ret\n> > > +*/\n> > > +float64x2_t t3 (float64x2_t a)\n> > > +{\n> > > +  return vnegq_f64 (vabsq_f64 (a));\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..a60cd31b9294af2dac6\n> > 9eed1c93f\n> > > 899bd5c78fca\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > @@ -0,0 +1,29 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#include <arm_neon.h>\n> > > +#include <math.h>\n> > > +\n> > > +/*\n> > > +** f1:\n> > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float32_t f1 (float32_t a)\n> > > +{\n> > > +  
return -fabsf (a);\n> > > +}\n> > > +\n> > > +/*\n> > > +** f2:\n> > > +**     mov     x0, -9223372036854775808\n> > > +**     fmov    d[0-9]+, x0\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float64_t f2 (float64_t a)\n> > > +{\n> > > +  return -fabs (a);\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..1bf34328d8841de8e6\n> > b0a5458562\n> > > a9f00e31c275\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > @@ -0,0 +1,34 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#include <arm_neon.h>\n> > > +#include <math.h>\n> > > +\n> > > +/*\n> > > +** f1:\n> > > +**     ...\n> > > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> > > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> > > +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> > > +**     ...\n> > > +*/\n> > > +void f1 (float32_t *a, int n)\n> > > +{\n> > > +  for (int i = 0; i < (n & -8); i++)\n> > > +   a[i] = -fabsf (a[i]);\n> > > +}\n> > > +\n> > > +/*\n> > > +** f2:\n> > > +**     ...\n> > > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> > > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> > > +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> > > +**     ...\n> > > +*/\n> > > +void f2 (float64_t *a, int n)\n> > > +{\n> > > +  for (int i = 0; i < (n & -8); i++)\n> > > +   a[i] = -fabs (a[i]);\n> > > +}\n> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > new file mode 100644\n> > > index\n> > >\n> > 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d0\n> > 
1f6604ca7b\n> > > e87e3744d494\n> > > --- /dev/null\n> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > @@ -0,0 +1,37 @@\n> > > +/* { dg-do compile } */\n> > > +/* { dg-options \"-O3\" } */\n> > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } }\n> > > +*/\n> > > +\n> > > +#include <string.h>\n> > > +\n> > > +/*\n> > > +** negabs:\n> > > +**     mov     x0, -9223372036854775808\n> > > +**     fmov    d[0-9]+, x0\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +double negabs (double x)\n> > > +{\n> > > +   unsigned long long y;\n> > > +   memcpy (&y, &x, sizeof(double));\n> > > +   y = y | (1UL << 63);\n> > > +   memcpy (&x, &y, sizeof(double));\n> > > +   return x;\n> > > +}\n> > > +\n> > > +/*\n> > > +** negabsf:\n> > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > +**     ret\n> > > +*/\n> > > +float negabsf (float x)\n> > > +{\n> > > +   unsigned int y;\n> > > +   memcpy (&y, &x, sizeof(float));\n> > > +   y = y | (1U << 31);\n> > > +   memcpy (&x, &y, sizeof(float));\n> > > +   return x;\n> > > +}\n> > > +\n> > >\n> > >\n> > >\n> > >\n> > > --\n>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=0yPX/kjc;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=2IMuI8Vo;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n 
envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (ip-8-43-85-97.sourceware.org\n [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwSVl225cz1yp8\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 17:12:45 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id D00B0385DC35\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 07:12:43 +0000 (GMT)","from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28])\n by sourceware.org (Postfix) with ESMTPS id 0C864385E019\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 07:11:39 +0000 (GMT)","from relay2.suse.de (relay2.suse.de [149.44.160.134])\n by smtp-out1.suse.de (Postfix) with ESMTP id CB1CB2187A;\n Wed, 27 Sep 2023 07:11:37 +0000 (UTC)","from wotan.suse.de (wotan.suse.de [10.160.0.1])\n (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n (No client certificate requested)\n by relay2.suse.de (Postfix) with ESMTPS id A8E2B2C142;\n Wed, 27 Sep 2023 07:11:37 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 0C864385E019","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1695798697;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=pYDyUgQHqVFVWTbwREno7+nWA/MtFHLg052QNlfGuKY=;\n b=0yPX/kjcVqomAhVlIRcAfh0ryKX8CBODaQqrfd1eBrzWEdN2dw863N7YBofEt8hlvPt9bC\n 6KvfmkSVT+zQzUWx9ki/H8eFrWHfaKVcCSzTwi1Dh/zzaqylkVRYyCMks1r2wbZSxLOjpC\n 
IRI+A1l1A8CnCHgEgG8hz3a7fqbcbCw=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1695798697;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=pYDyUgQHqVFVWTbwREno7+nWA/MtFHLg052QNlfGuKY=;\n b=2IMuI8VoRrt4s2ga35+2AswQg1re68zpespjCWA/lObwVm81Z/X5dEnB2PddAQgrkLFGzV\n uaiPVaafmeMcWaAg=="],"Date":"Wed, 27 Sep 2023 07:11:37 +0000 (UTC)","From":"Richard Biener <rguenther@suse.de>","To":"Tamar Christina <Tamar.Christina@arm.com>","cc":"Andrew Pinski <pinskia@gmail.com>,\n \"gcc-patches@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"jlaw@ventanamicro.com\" <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","In-Reply-To":"\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","Message-ID":"<nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","User-Agent":"Alpine 2.22 (LSU 394 2020-01-19)","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","X-Spam-Status":"No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_LOTSOFHASH,\n KAM_SHORT, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3188480,"web_url":"http://patchwork.ozlabs.org/comment/3188480/","msgid":"<VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","list_archive_url":null,"date":"2023-09-27T07:56:43","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":69689,"url":"http://patchwork.ozlabs.org/api/people/69689/","name":"Tamar Christina","email":"Tamar.Christina@arm.com"},"content":"> -----Original Message-----\n> From: Richard Biener <rguenther@suse.de>\n> Sent: Wednesday, September 27, 2023 8:12 AM\n> To: Tamar Christina <Tamar.Christina@arm.com>\n> Cc: Andrew Pinski <pinskia@gmail.com>; gcc-patches@gcc.gnu.org; nd\n> <nd@arm.com>; jlaw@ventanamicro.com\n> Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n> signbit(x)) [PR109154]\n> \n> On Wed, 27 Sep 2023, Tamar Christina wrote:\n> \n> > > -----Original Message-----\n> > > From: Andrew Pinski <pinskia@gmail.com>\n> > > Sent: Wednesday, September 27, 2023 2:17 AM\n> > > To: Tamar Christina <Tamar.Christina@arm.com>\n> > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de;\n> > > jlaw@ventanamicro.com\n> > > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to\n> > > x | (1 <<\n> > > signbit(x)) [PR109154]\n> > >\n> > > On Tue, Sep 26, 2023 at 5:51?PM Tamar Christina\n> > > <tamar.christina@arm.com>\n> > > wrote:\n> > > >\n> > > > Hi All,\n> > > >\n> > > > For targets that allow conversion between int and float modes this\n> > > > 
adds a new optimization transforming fneg (fabs (x)) into x | (1\n> > > > << signbit(x)).  Such sequences are common in scientific code\n> > > > working with\n> > > gradients.\n> > > >\n> > > > The transformed instruction if the target has an inclusive-OR that\n> > > > takes an immediate is both shorter an faster.  For those that\n> > > > don't the immediate has to be seperate constructed but this still\n> > > > ends up being faster as the immediate construction is not on the critical\n> path.\n> > > >\n> > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > > >\n> > > > Ok for master?\n> > >\n> > > I think this should be part of isel instead of match.\n> > > Maybe we could use genmatch to generate the code that does the\n> > > transformations but this does not belong as part of match really.\n> >\n> > I disagree.. I don't think this belongs in isel. Isel is for structural\n> transformations.\n> > If there is a case for something else I'd imagine backwardprop is a better\n> choice.\n> >\n> > But I don't see why it doesn't belong here considering it *is* a\n> > mathematical optimization and the file has plenty of transformations\n> > such as mask optimizations and vector conditional rewriting.\n> \n> But the mathematical transform would more generally be fneg (fabs (x)) ->\n> copysign (x, -1.) and that can be optimally expanded at RTL expansion time?\n\nAh sure, atm I did copysign (x, -1) -> x | 1 << signbits.  I can do it the other way\naround.  
And I guess since copysign (-x, y), copysign(|x|, y) -> copysign (x, y) that\nshould solve the trigonometry problem too.\n\nCool will do that instead, thanks!\n\nTamar\n\n> \n> Richard.\n> \n> > Regards,\n> > Tamar\n> >\n> > >\n> > > Thanks,\n> > > Andrew\n> > >\n> > > >\n> > > > Thanks,\n> > > > Tamar\n> > > >\n> > > > gcc/ChangeLog:\n> > > >\n> > > >         PR tree-optimization/109154\n> > > >         * match.pd: Add new neg+abs rule.\n> > > >\n> > > > gcc/testsuite/ChangeLog:\n> > > >\n> > > >         PR tree-optimization/109154\n> > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > > >\n> > > > --- inline copy of patch --\n> > > > diff --git a/gcc/match.pd b/gcc/match.pd index\n> > > >\n> > >\n> 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d69482\n> > > 6cffad0f\n> > > > b17e1136600a 100644\n> > > > --- a/gcc/match.pd\n> > > > +++ b/gcc/match.pd\n> > > > @@ -9476,3 +9476,57 @@ and,\n> > > >        }\n> > > >        (if (full_perm_p)\n> > > >         (vec_perm (op@3 @0 @1) @3 @2))))))\n> > > > +\n> > > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> > > > +\n> > > > +(simplify\n> > > > + (negate (abs @0))\n> > > > + (if (FLOAT_TYPE_P (type)\n> > > > +      /* We have to delay this rewriting till after forward prop\n> > > > +because\n> > > otherwise\n> > > > +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> > > > +        matched in one go.  
Instead cos (-x) is matched first\n> > > > + followed by\n> > > cos(|x|).\n> > > > +        The bottom op approach makes this rule match first and it's not\n> untill\n> > > > +        fwdprop that we match top down.  There are manu such\n> > > > + simplications\n> > > so we\n> > > > +        delay this optimization till later on.  */\n> > > > +      && canonicalize_math_after_vectorization_p ())  (with {\n> > > > +    tree itype = unsigned_type_for (type);\n> > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector : optab_default; }\n> > > > +   (if (float_fmt\n> > > > +       && float_fmt->signbit_rw >= 0\n> > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > +           int sbit = float_fmt->signbit_rw;\n> > > > +           auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;\n> > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > +     (view_convert:type\n> > > > +      (bit_ior (view_convert:itype @0)\n> > > > +              { build_uniform_cst (itype, sign_bit); } )))))))\n> > > > +\n> > > > +/* Repeat the same but for conditional negate.  */\n> > > > +\n> > > > +(simplify\n> > > > + (IFN_COND_NEG @1 (abs @0) @2)\n> > > > + (if (FLOAT_TYPE_P (type))\n> > > > +  (with {\n> > > > +    tree itype = unsigned_type_for (type);\n> > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT (mode);\n> > > > +    auto optab = VECTOR_TYPE_P (type) ? 
optab_vector : optab_default; }\n> > > > +   (if (float_fmt\n> > > > +       && float_fmt->signbit_rw >= 0\n> > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > +           int sbit = float_fmt->signbit_rw;\n> > > > +           auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;\n> > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > +     (view_convert:type\n> > > > +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> > > > +              { build_uniform_cst (itype, sign_bit); }\n> > > > +              (view_convert:itype @2) )))))))\n> > > > \\ No newline at end of file\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..f823013c3ddf6b3a266\n> > > c3abfcbf2\n> > > > 642fc2a75fa6\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > @@ -0,0 +1,39 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#pragma GCC target \"+nosve\"\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +\n> > > > +/*\n> > > > +** t1:\n> > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > +**     ret\n> > > > +*/\n> > > > +float32x2_t t1 (float32x2_t a)\n> > > > +{\n> > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > +\n> > > > +/*\n> > > > +** t2:\n> > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > +**     ret\n> > > > +*/\n> > > > +float32x4_t t2 (float32x4_t a)\n> > > > +{\n> > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> 
> > > +\n> > > > +/*\n> > > > +** t3:\n> > > > +**     adrp    x0, .LC[0-9]+\n> > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > +**     ret\n> > > > +*/\n> > > > +float64x2_t t3 (float64x2_t a)\n> > > > +{\n> > > > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..141121176b309e4b2a\n> > > a413dc5527\n> > > > 1a6e3c93d5e1\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > @@ -0,0 +1,31 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#pragma GCC target \"+nosve\"\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +#include <math.h>\n> > > > +\n> > > > +/*\n> > > > +** f1:\n> > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float32_t f1 (float32_t a)\n> > > > +{\n> > > > +  return -fabsf (a);\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** f2:\n> > > > +**     mov     x0, -9223372036854775808\n> > > > +**     fmov    d[0-9]+, x0\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float64_t f2 (float64_t a)\n> > > > +{\n> > > > +  return -fabs (a);\n> > > > +}\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..b4652173a95d104ddf\n> > > a70c497f06\n> > > > 27a61ea89d3b\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > @@ 
-0,0 +1,36 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#pragma GCC target \"+nosve\"\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +#include <math.h>\n> > > > +\n> > > > +/*\n> > > > +** f1:\n> > > > +**     ...\n> > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > +**     ...\n> > > > +*/\n> > > > +void f1 (float32_t *a, int n)\n> > > > +{\n> > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > +   a[i] = -fabsf (a[i]);\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** f2:\n> > > > +**     ...\n> > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > +**     ...\n> > > > +*/\n> > > > +void f2 (float64_t *a, int n)\n> > > > +{\n> > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > +   a[i] = -fabs (a[i]);\n> > > > +}\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..10879dea74462d34b2\n> > > 6160eeb0bd\n> > > > 54ead063166b\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > @@ -0,0 +1,39 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#pragma GCC target \"+nosve\"\n> > > > +\n> > > > +#include <string.h>\n> > > > +\n> > > > +/*\n> > > > +** negabs:\n> > > > +**     mov     x0, -9223372036854775808\n> > > > +**     fmov    d[0-9]+, x0\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +double negabs 
(double x)\n> > > > +{\n> > > > +   unsigned long long y;\n> > > > +   memcpy (&y, &x, sizeof(double));\n> > > > +   y = y | (1UL << 63);\n> > > > +   memcpy (&x, &y, sizeof(double));\n> > > > +   return x;\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** negabsf:\n> > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float negabsf (float x)\n> > > > +{\n> > > > +   unsigned int y;\n> > > > +   memcpy (&y, &x, sizeof(float));\n> > > > +   y = y | (1U << 31);\n> > > > +   memcpy (&x, &y, sizeof(float));\n> > > > +   return x;\n> > > > +}\n> > > > +\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..0c7664e6de77a49768\n> > > 2952653ffd\n> > > > 417453854d52\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > @@ -0,0 +1,37 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +\n> > > > +/*\n> > > > +** t1:\n> > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > +**     ret\n> > > > +*/\n> > > > +float32x2_t t1 (float32x2_t a)\n> > > > +{\n> > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > +\n> > > > +/*\n> > > > +** t2:\n> > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > +**     ret\n> > > > +*/\n> > > > +float32x4_t t2 (float32x4_t a)\n> > > > +{\n> > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> > > > +\n> > > > +/*\n> > > > +** t3:\n> > > > +**     adrp    x0, .LC[0-9]+\n> > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > +**     ret\n> > > > +*/\n> > > > +float64x2_t t3 
(float64x2_t a)\n> > > > +{\n> > > > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..a60cd31b9294af2dac6\n> > > 9eed1c93f\n> > > > 899bd5c78fca\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > @@ -0,0 +1,29 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +#include <math.h>\n> > > > +\n> > > > +/*\n> > > > +** f1:\n> > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float32_t f1 (float32_t a)\n> > > > +{\n> > > > +  return -fabsf (a);\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** f2:\n> > > > +**     mov     x0, -9223372036854775808\n> > > > +**     fmov    d[0-9]+, x0\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float64_t f2 (float64_t a)\n> > > > +{\n> > > > +  return -fabs (a);\n> > > > +}\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..1bf34328d8841de8e6\n> > > b0a5458562\n> > > > a9f00e31c275\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > @@ -0,0 +1,34 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#include <arm_neon.h>\n> > > > +#include <math.h>\n> > > > +\n> > > > 
+/*\n> > > > +** f1:\n> > > > +**     ...\n> > > > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> > > > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> > > > +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> > > > +**     ...\n> > > > +*/\n> > > > +void f1 (float32_t *a, int n)\n> > > > +{\n> > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > +   a[i] = -fabsf (a[i]);\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** f2:\n> > > > +**     ...\n> > > > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> > > > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> > > > +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> > > > +**     ...\n> > > > +*/\n> > > > +void f2 (float64_t *a, int n)\n> > > > +{\n> > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > +   a[i] = -fabs (a[i]);\n> > > > +}\n> > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > new file mode 100644\n> > > > index\n> > > >\n> > >\n> 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d0\n> > > 1f6604ca7b\n> > > > e87e3744d494\n> > > > --- /dev/null\n> > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > @@ -0,0 +1,37 @@\n> > > > +/* { dg-do compile } */\n> > > > +/* { dg-options \"-O3\" } */\n> > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 }\n> > > > +} } */\n> > > > +\n> > > > +#include <string.h>\n> > > > +\n> > > > +/*\n> > > > +** negabs:\n> > > > +**     mov     x0, -9223372036854775808\n> > > > +**     fmov    d[0-9]+, x0\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +double negabs (double x)\n> > > > +{\n> > > > +   unsigned long long y;\n> > > > +   memcpy (&y, &x, sizeof(double));\n> > > > +   y = y | (1UL << 63);\n> > > > +   memcpy (&x, &y, sizeof(double));\n> > > > +   return x;\n> > > > +}\n> > > > +\n> > > > +/*\n> > > > +** negabsf:\n> > > 
> +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > +**     ret\n> > > > +*/\n> > > > +float negabsf (float x)\n> > > > +{\n> > > > +   unsigned int y;\n> > > > +   memcpy (&y, &x, sizeof(float));\n> > > > +   y = y | (1U << 31);\n> > > > +   memcpy (&x, &y, sizeof(float));\n> > > > +   return x;\n> > > > +}\n> > > > +\n> > > >\n> > > >\n> > > >\n> > > >\n> > > > --\n> >\n> \n> --\n> Richard Biener <rguenther@suse.de>\n> SUSE Software Solutions Germany GmbH,\n> Frankenstrasse 146, 90461 Nuernberg, Germany;\n> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG\n> Nuernberg)","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com\n header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com\n header.b=QFPfqp9h;\n\tdkim=pass (1024-bit key) header.d=armh.onmicrosoft.com\n header.i=@armh.onmicrosoft.com header.a=rsa-sha256\n header.s=selector2-armh-onmicrosoft-com header.b=QFPfqp9h;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate 
requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwTV60sCYz1yp8\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 17:57:18 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 312563861900\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 07:57:16 +0000 (GMT)","from EUR04-HE1-obe.outbound.protection.outlook.com\n (mail-he1eur04on2084.outbound.protection.outlook.com [40.107.7.84])\n by sourceware.org (Postfix) with ESMTPS id 3282E38618B8\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 07:57:01 +0000 (GMT)","from AS9PR05CA0053.eurprd05.prod.outlook.com (2603:10a6:20b:489::33)\n by DU2PR08MB9992.eurprd08.prod.outlook.com (2603:10a6:10:490::11)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21; Wed, 27 Sep\n 2023 07:56:55 +0000","from AM7EUR03FT027.eop-EUR03.prod.protection.outlook.com\n (2603:10a6:20b:489:cafe::81) by AS9PR05CA0053.outlook.office365.com\n (2603:10a6:20b:489::33) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21 via Frontend\n Transport; Wed, 27 Sep 2023 07:56:55 +0000","from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by\n AM7EUR03FT027.mail.protection.outlook.com (100.127.140.124) with\n Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.6838.19 via Frontend Transport; Wed, 27 Sep 2023 07:56:54 +0000","(\"Tessian outbound d084e965c4eb:v175\");\n Wed, 27 Sep 2023 07:56:53 +0000","from 7f5efb5f710c.1\n by 64aa7808-outbound-1.mta.getcheckrecipient.com id\n D7F3360B-E959-42DF-98A7-6D07B26A1D09.1;\n Wed, 27 Sep 2023 07:56:46 +0000","from EUR05-DB8-obe.outbound.protection.outlook.com\n by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id\n 7f5efb5f710c.1\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);\n Wed, 27 Sep 2023 07:56:46 +0000","from 
VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)\n by PA4PR08MB6318.eurprd08.prod.outlook.com (2603:10a6:102:e2::20)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6813.28; Wed, 27 Sep\n 2023 07:56:44 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1]) by VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1%7]) with mapi id 15.20.6813.027; Wed, 27 Sep 2023\n 07:56:43 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 3282E38618B8","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=P+pb15xg+csjIHc5DspXBwnV7VV42YbyjlTRjDooMCQ=;\n b=QFPfqp9hknu+nsealsdqC4c1k1e5HetF8zKHar2txTTeifGV28qlLv+aDlTDYlNQBDG9gw9F0Wq+b+V1vZ9N8YQBKGoItJ3V8nqN8bjHvBhqwQM6JBd6Zywl4im0fzMky98IJLf1kWiUGPxbRoPk3gMcxezrhP6HxNIh0MiI2Ck=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=P+pb15xg+csjIHc5DspXBwnV7VV42YbyjlTRjDooMCQ=;\n b=QFPfqp9hknu+nsealsdqC4c1k1e5HetF8zKHar2txTTeifGV28qlLv+aDlTDYlNQBDG9gw9F0Wq+b+V1vZ9N8YQBKGoItJ3V8nqN8bjHvBhqwQM6JBd6Zywl4im0fzMky98IJLf1kWiUGPxbRoPk3gMcxezrhP6HxNIh0MiI2Ck="],"X-MS-Exchange-Authentication-Results":"spf=pass (sender IP is 63.35.35.123)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;","Received-SPF":"Pass (protection.outlook.com: domain of arm.com designates\n 63.35.35.123 as permitted sender) receiver=protection.outlook.com;\n client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;\n pr=C","X-CheckRecipientChecked":"true","X-CR-MTA-CID":"03da9d58414c6005","X-CR-MTA-TID":"64aa7808","ARC-Seal":"i=1; a=rsa-sha256; 
s=arcselector9901; d=microsoft.com; cv=none;\n b=deg7digOIi084GugggdAtX8p0nsJ7rDR4GCvGTuokD8xobCde0l4ydQeXyHx4ygWx5WNXgudOT8soPQj3dpaLqJTnbVGrVqvarQucNU6vVmZTDMIuEV/TUbqt4YF9+A704VsLc7qKuUZIYx8BF9xhDXsSIi1I+2FGdsr3HJ8ZELUlinQdiu73AWD7T5owLgqRQ3l+u/zAPTf7h4cMnPDAkjOl+Ykiwz6QHT9O+sO6z4ZJ8FRCyf4wVvqOZ3bOBuhmb2fvHgEsZ2JD7dGMiJdRr61BcDcwIvJ72UHlHbpTt107tWmcUgSHXPEC7l67BXn4HcPRhHElymArbw0Pi71IA==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector9901;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=P+pb15xg+csjIHc5DspXBwnV7VV42YbyjlTRjDooMCQ=;\n b=n4dT1NspgDJRxKIkR0RUAD8CQcHo1tPVhdl8rqLGYntxqniBGb2LoTtvCz6BEHzA4QzGzJhr8sYJEJC82mNeGtEC7MDKhfSTKQToPSMtBINa8RzXHmXuBwW33Dq8S2Y5gu9oQV8ZHRade/107dOGzO8pXaiascOcpNFlsInW+QtVFSK45O/z5u6BK2wKeGL9+e4cB7+po/Y5xAZGPApfDFVZN6FwQgRgjoUZXpArbSBYisK50+igVMHtuMMR7By4dtcwMIB+1xfSSTpLBiBGHOauWHoR5/hivVU+iRlfhr8AKY2KQ1ifhsAjO8DGyPXtXWi05o2hjBqAJFPRYKW+sw==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass\n header.d=arm.com; arc=none","From":"Tamar Christina <Tamar.Christina@arm.com>","To":"Richard Biener <rguenther@suse.de>","CC":"Andrew Pinski <pinskia@gmail.com>, \"gcc-patches@gcc.gnu.org\"\n <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, \"jlaw@ventanamicro.com\"\n <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Thread-Topic":"[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","Thread-Index":"AQHZ8NyYUy/9C+tZQUuY9xAIpvunf7At3wQAgAAUNTCAAE7DgIAAC3HA","Date":"Wed, 27 Sep 2023 07:56:43 +0000","Message-ID":"\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","References":"<patch-17718-tamar@arm.com>\n 
<CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>","In-Reply-To":"<nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","Authentication-Results-Original":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","x-ms-traffictypediagnostic":"\n VI1PR08MB5325:EE_|PA4PR08MB6318:EE_|AM7EUR03FT027:EE_|DU2PR08MB9992:EE_","X-MS-Office365-Filtering-Correlation-Id":"08447c4b-1ed5-4591-df8a-08dbbf2f50bb","x-checkrecipientrouted":"true","nodisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;","X-Microsoft-Antispam-Message-Info-Original":"\n 6mXN9UT9nxQYVjTrMTDIwpjm38y2a2MEZLy68DpoabUl38ZcPx6aKyUW+3q63BvWoXrjakeOiVsfv6AvL5rfW55xuHKzQcB2HOIoPqjyVCHp9z37llCRL0wcR8wC0KsNai4TO5NkB4yYMT+pRICJ88nRxEthZCCDyDIIySy0RUtjwoTxrM6Fzn/YkNDvBH/sJYIMkV/cn5f6vNbijVV+DxSm1gyqxFSEp2PZaYyRBNE4h2qKpTWv1/BxvBHLM79r/H0OZ0VnhqB3R3jvGDJvMZvdCnYty/hpfg3RWBIjvBJ9AKOdyJSMG8ozq+ubNNrPu5quA9HzkbPxyt5UA8uq2NW3Q/K62hZZtnfpwZX6GS35Saukc9XM8/E2lDOOuJQR0XcSTSkfZB+FDJlVDqX3jkdEOZ4hWAbl0rPhxCSnDv+Ps4mBzLaN1mfVPnl4Vid98syQBUIGpbpyJX/kV8Wsm+Ob5MWRIetFWXZOrQECLyRRpUFfbTV/lzoyR87SfMm8RBwWrzGIFYpYglc1TzANNprUlnuPSRIvSPH/aQbEagTbcMNAjAjZl19Ihd4chEoRdU53bG4Nr/4h8ojQ6ySmOkLgPX7dfPGwA3KxxzM+O5y4RKnIceH1tnIHchyqLFqL4ntc7WN38AvUT0uvGYFdJ+yaIXwGaZQV8tcHugKUkmg=","X-Forefront-Antispam-Report-Untrusted":"CIP:255.255.255.255; CTRY:; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;\n PTR:; CAT:NONE;\n 
SFS:(13230031)(376002)(396003)(136003)(366004)(39860400002)(346002)(230922051799003)(451199024)(186009)(1800799009)(8676002)(8936002)(41300700001)(76116006)(2906002)(55016003)(26005)(84970400001)(4326008)(30864003)(52536014)(478600001)(66476007)(316002)(54906003)(66946007)(5660300002)(6916009)(64756008)(66556008)(66446008)(6506007)(71200400001)(7696005)(53546011)(9686003)(83380400001)(38070700005)(86362001)(122000001)(38100700002)(33656002)(357404004);\n DIR:OUT; SFP:1101;","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"quoted-printable","MIME-Version":"1.0","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["PA4PR08MB6318","DU2PR08MB9992"],"Original-Authentication-Results":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","X-EOPAttributedMessage":"0","X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n AM7EUR03FT027.eop-EUR03.prod.protection.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n 0a18dc64-72cc-4740-ed93-08dbbf2f4a7a","X-Microsoft-Antispam":"BCL:0;","X-Microsoft-Antispam-Message-Info":"\n 
fBTJbeD9YAEvlP/vddqoedJ6uonxGMkK6Lnx1rrze+IJWzf2Q0CPfl0hKdFpc+CnkfsaWHpkDO5l/XMnfNUuf+tVmg8hFtG0CQL0BkBetveYUVZULQWWYK7E8NXiqf2ul3Kuz0wKVqRez0hrGNkLkCEcDWip137joK/FpD5rXTvB7lN8DDhjh2C1BNFzmdREm5e3YaqnD4exD2Fh1S41sCPm2QJtsocb3EfejF56IM9TMrrF7v35nkC+/ScBLUnW553I+kenS2fBi73xXAO9+PQKXIpLMyrY0VULkXp6oU/gNADElNwC4ChcQLoM6k2D4jrbjY0PEhRLryMxybAbzYlyFO5Ehx4XLDiQmWvjkR7NiyYewel+ZxgekRi/Y1bJN2YfzD9HZiFs03XK/yT9/Nl786b8gcTrKZm5WXq6lOH3Qq3OMXtPMLssJPFJDG7xdUgAXCcnQ3Y9GD9cC/VRR/vZxwuOo+DiUHR/fRBL6BejVSJC87lSFcK3Dg16Bxoc6KQQ81xaglQ1fe0hvoUazqX907Rv24QHrAz3gu+caRPxJem7XfTAxLnjWLepmijBjNoI54me1Vb7l+MYQUiWKhQg72btvVUSm6ZwesijSFXfVtNPP9XeaTbQdCFnSqog7p/Q+Q9iMEeeKmi+23ksdOmnTAAzrkOi8GKnzKFKTB0cnBT8sgwlY+yyW4/LY5P/Rw55+OemF0eUJr4e3Qxl3QLws5XtKu1k+AsoOhSmyBOOkTY7dtexZXjlFVqk3VcY7+ZZY281oSYXwKd/1RNmVI7103uMNfXfheGxB5h7Wy0=","X-Forefront-Antispam-Report":"CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;\n IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;\n PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;\n SFS:(13230031)(4636009)(376002)(39850400004)(136003)(396003)(346002)(230922051799003)(451199024)(82310400011)(1800799009)(186009)(36840700001)(46966006)(40470700004)(30864003)(6506007)(83380400001)(84970400001)(2906002)(7696005)(47076005)(478600001)(26005)(6862004)(8676002)(5660300002)(107886003)(70206006)(52536014)(336012)(4326008)(70586007)(8936002)(36860700001)(54906003)(41300700001)(316002)(82740400003)(9686003)(53546011)(356005)(81166007)(86362001)(33656002)(40460700003)(40480700001)(55016003)(357404004);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"27 Sep 2023 07:56:54.1133 (UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 08447c4b-1ed5-4591-df8a-08dbbf2f50bb","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];\n 
Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n AM7EUR03FT027.eop-EUR03.prod.protection.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-Spam-Status":"No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH,\n KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE,\n TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3188591,"web_url":"http://patchwork.ozlabs.org/comment/3188591/","msgid":"<VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","list_archive_url":null,"date":"2023-09-27T09:35:09","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":69689,"url":"http://patchwork.ozlabs.org/api/people/69689/","name":"Tamar Christina","email":"Tamar.Christina@arm.com"},"content":"> -----Original Message-----\n> From: Tamar Christina <Tamar.Christina@arm.com>\n> Sent: Wednesday, September 27, 2023 8:57 AM\n> To: Richard Biener <rguenther@suse.de>\n> Cc: Andrew Pinski 
<pinskia@gmail.com>; gcc-patches@gcc.gnu.org; nd\n> <nd@arm.com>; jlaw@ventanamicro.com\n> Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n> signbit(x)) [PR109154]\n> \n> > -----Original Message-----\n> > From: Richard Biener <rguenther@suse.de>\n> > Sent: Wednesday, September 27, 2023 8:12 AM\n> > To: Tamar Christina <Tamar.Christina@arm.com>\n> > Cc: Andrew Pinski <pinskia@gmail.com>; gcc-patches@gcc.gnu.org; nd\n> > <nd@arm.com>; jlaw@ventanamicro.com\n> > Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x\n> > | (1 <<\n> > signbit(x)) [PR109154]\n> >\n> > On Wed, 27 Sep 2023, Tamar Christina wrote:\n> >\n> > > > -----Original Message-----\n> > > > From: Andrew Pinski <pinskia@gmail.com>\n> > > > Sent: Wednesday, September 27, 2023 2:17 AM\n> > > > To: Tamar Christina <Tamar.Christina@arm.com>\n> > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de;\n> > > > jlaw@ventanamicro.com\n> > > > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x))\n> > > > to x | (1 <<\n> > > > signbit(x)) [PR109154]\n> > > >\n> > > > On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina\n> > > > <tamar.christina@arm.com>\n> > > > wrote:\n> > > > >\n> > > > > Hi All,\n> > > > >\n> > > > > For targets that allow conversion between int and float modes\n> > > > > this adds a new optimization transforming fneg (fabs (x)) into x\n> > > > > | (1 << signbit(x)).  Such sequences are common in scientific\n> > > > > code working with\n> > > > gradients.\n> > > > >\n> > > > > The transformed instruction if the target has an inclusive-OR\n> > > > > that takes an immediate is both shorter and faster.  
For those\n> > > > > that don't the immediate has to be separately constructed but this\n> > > > > still ends up being faster as the immediate construction is not\n> > > > > on the critical\n> > path.\n> > > > >\n> > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > > > >\n> > > > > Ok for master?\n> > > >\n> > > > I think this should be part of isel instead of match.\n> > > > Maybe we could use genmatch to generate the code that does the\n> > > > transformations but this does not belong as part of match really.\n> > >\n> > > I disagree. I don't think this belongs in isel. Isel is for\n> > > structural\n> > transformations.\n> > > If there is a case for something else I'd imagine backwardprop is a\n> > > better\n> > choice.\n> > >\n> > > But I don't see why it doesn't belong here considering it *is* a\n> > > mathematical optimization and the file has plenty of transformations\n> > > such as mask optimizations and vector conditional rewriting.\n> >\n> > But the mathematical transform would more generally be fneg (fabs (x))\n> > -> copysign (x, -1.) and that can be optimally expanded at RTL expansion\n> time?\n> \n> Ah sure, atm I did copysign (x, -1) -> x | 1 << signbits.  I can do it the other way\n> around.  And I guess since copysign (-x, y), copysign(|x|, y) -> copysign (x, y)\n> that should solve the trigonometry problem too.\n> \n> Cool will do that instead, thanks!\n\nHmm this seems to conflict with the pattern\n\n/* copysign(x, CST) -> [-]abs (x).  
*/\n(for copysigns (COPYSIGN_ALL)\n (simplify\n  (copysigns @0 REAL_CST@1)\n  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n   (negate (abs @0))\n   (abs @0))))\n\nWhich does the opposite transformation.\n\nShould I try removing this?\n\nThanks,\nTamar\n\n> \n> Tamar\n> \n> >\n> > Richard.\n> >\n> > > Regards,\n> > > Tamar\n> > >\n> > > >\n> > > > Thanks,\n> > > > Andrew\n> > > >\n> > > > >\n> > > > > Thanks,\n> > > > > Tamar\n> > > > >\n> > > > > gcc/ChangeLog:\n> > > > >\n> > > > >         PR tree-optimization/109154\n> > > > >         * match.pd: Add new neg+abs rule.\n> > > > >\n> > > > > gcc/testsuite/ChangeLog:\n> > > > >\n> > > > >         PR tree-optimization/109154\n> > > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > > > >\n> > > > > --- inline copy of patch --\n> > > > > diff --git a/gcc/match.pd b/gcc/match.pd index\n> > > > >\n> > > >\n> >\n> 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d69482\n> > > > 6cffad0f\n> > > > > b17e1136600a 100644\n> > > > > --- a/gcc/match.pd\n> > > > > +++ b/gcc/match.pd\n> > > > > @@ -9476,3 +9476,57 @@ and,\n> > > > >        }\n> > > > >        (if (full_perm_p)\n> > > > >         (vec_perm (op@3 @0 @1) @3 @2))))))\n> > > > > +\n> > > > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  
*/\n> > > > > +\n> > > > > +(simplify\n> > > > > + (negate (abs @0))\n> > > > > + (if (FLOAT_TYPE_P (type)\n> > > > > +      /* We have to delay this rewriting till after forward\n> > > > > +prop because\n> > > > otherwise\n> > > > > +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is\n> not\n> > > > > +        matched in one go.  Instead cos (-x) is matched first\n> > > > > + followed by\n> > > > cos(|x|).\n> > > > > +        The bottom op approach makes this rule match first and\n> > > > > + it's not\n> > untill\n> > > > > +        fwdprop that we match top down.  There are manu such\n> > > > > + simplications\n> > > > so we\n> > > > > +        delay this optimization till later on.  */\n> > > > > +      && canonicalize_math_after_vectorization_p ())  (with {\n> > > > > +    tree itype = unsigned_type_for (type);\n> > > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT\n> (mode);\n> > > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector :\n> optab_default; }\n> > > > > +   (if (float_fmt\n> > > > > +       && float_fmt->signbit_rw >= 0\n> > > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > > +           int sbit = float_fmt->signbit_rw;\n> > > > > +           auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;\n> > > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > > +     (view_convert:type\n> > > > > +      (bit_ior (view_convert:itype @0)\n> > > > > +              { build_uniform_cst (itype, sign_bit); } )))))))\n> > > > > +\n> > > > > +/* Repeat the same but for conditional negate.  
*/\n> > > > > +\n> > > > > +(simplify\n> > > > > + (IFN_COND_NEG @1 (abs @0) @2)\n> > > > > + (if (FLOAT_TYPE_P (type))\n> > > > > +  (with {\n> > > > > +    tree itype = unsigned_type_for (type);\n> > > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT\n> (mode);\n> > > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector :\n> optab_default; }\n> > > > > +   (if (float_fmt\n> > > > > +       && float_fmt->signbit_rw >= 0\n> > > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > > +           int sbit = float_fmt->signbit_rw;\n> > > > > +           auto stype = VECTOR_TYPE_P (type) ? TREE_TYPE (itype) : itype;\n> > > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > > +     (view_convert:type\n> > > > > +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> > > > > +              { build_uniform_cst (itype, sign_bit); }\n> > > > > +              (view_convert:itype @2) )))))))\n> > > > > \\ No newline at end of file\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..f823013c3ddf6b3a266\n> > > > c3abfcbf2\n> > > > > 642fc2a75fa6\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > @@ -0,0 +1,39 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#pragma GCC target \"+nosve\"\n> > > > > +\n> > > > > +#include 
<arm_neon.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** t1:\n> > > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32x2_t t1 (float32x2_t a)\n> > > > > +{\n> > > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > > +\n> > > > > +/*\n> > > > > +** t2:\n> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32x4_t t2 (float32x4_t a)\n> > > > > +{\n> > > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> > > > > +\n> > > > > +/*\n> > > > > +** t3:\n> > > > > +**     adrp    x0, .LC[0-9]+\n> > > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float64x2_t t3 (float64x2_t a)\n> > > > > +{\n> > > > > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..141121176b309e4b2a\n> > > > a413dc5527\n> > > > > 1a6e3c93d5e1\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > @@ -0,0 +1,31 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#pragma GCC target \"+nosve\"\n> > > > > +\n> > > > > +#include <arm_neon.h>\n> > > > > +#include <math.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** f1:\n> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32_t f1 (float32_t a)\n> > > > > +{\n> > > > > +  return -fabsf (a);\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > +** f2:\n> > > > > +**     mov     x0, -9223372036854775808\n> > > 
> > +**     fmov    d[0-9]+, x0\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float64_t f2 (float64_t a)\n> > > > > +{\n> > > > > +  return -fabs (a);\n> > > > > +}\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..b4652173a95d104ddf\n> > > > a70c497f06\n> > > > > 27a61ea89d3b\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > @@ -0,0 +1,36 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#pragma GCC target \"+nosve\"\n> > > > > +\n> > > > > +#include <arm_neon.h>\n> > > > > +#include <math.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** f1:\n> > > > > +**     ...\n> > > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > > +**     ...\n> > > > > +*/\n> > > > > +void f1 (float32_t *a, int n)\n> > > > > +{\n> > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > +   a[i] = -fabsf (a[i]);\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > +** f2:\n> > > > > +**     ...\n> > > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > > +**     ...\n> > > > > +*/\n> > > > > +void f2 (float64_t *a, int n)\n> > > > > +{\n> > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > +   a[i] = -fabs (a[i]);\n> > > > > +}\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > 
>\n> >\n> 0000000000000000000000000000000000000000..10879dea74462d34b2\n> > > > 6160eeb0bd\n> > > > > 54ead063166b\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > @@ -0,0 +1,39 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#pragma GCC target \"+nosve\"\n> > > > > +\n> > > > > +#include <string.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** negabs:\n> > > > > +**     mov     x0, -9223372036854775808\n> > > > > +**     fmov    d[0-9]+, x0\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +double negabs (double x)\n> > > > > +{\n> > > > > +   unsigned long long y;\n> > > > > +   memcpy (&y, &x, sizeof(double));\n> > > > > +   y = y | (1UL << 63);\n> > > > > +   memcpy (&x, &y, sizeof(double));\n> > > > > +   return x;\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > +** negabsf:\n> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float negabsf (float x)\n> > > > > +{\n> > > > > +   unsigned int y;\n> > > > > +   memcpy (&y, &x, sizeof(float));\n> > > > > +   y = y | (1U << 31);\n> > > > > +   memcpy (&x, &y, sizeof(float));\n> > > > > +   return x;\n> > > > > +}\n> > > > > +\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..0c7664e6de77a49768\n> > > > 2952653ffd\n> > > > > 417453854d52\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > @@ -0,0 +1,37 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > 
> > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#include <arm_neon.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** t1:\n> > > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32x2_t t1 (float32x2_t a)\n> > > > > +{\n> > > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > > +\n> > > > > +/*\n> > > > > +** t2:\n> > > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32x4_t t2 (float32x4_t a)\n> > > > > +{\n> > > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> > > > > +\n> > > > > +/*\n> > > > > +** t3:\n> > > > > +**     adrp    x0, .LC[0-9]+\n> > > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float64x2_t t3 (float64x2_t a)\n> > > > > +{\n> > > > > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..a60cd31b9294af2dac6\n> > > > 9eed1c93f\n> > > > > 899bd5c78fca\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > @@ -0,0 +1,29 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#include <arm_neon.h>\n> > > > > +#include <math.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** f1:\n> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float32_t f1 (float32_t a)\n> > > > > +{\n> > > > > +  return -fabsf (a);\n> > > > > +}\n> > > > > +\n> > > 
> > +/*\n> > > > > +** f2:\n> > > > > +**     mov     x0, -9223372036854775808\n> > > > > +**     fmov    d[0-9]+, x0\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float64_t f2 (float64_t a)\n> > > > > +{\n> > > > > +  return -fabs (a);\n> > > > > +}\n> > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..1bf34328d8841de8e6\n> > > > b0a5458562\n> > > > > a9f00e31c275\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > @@ -0,0 +1,34 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#include <arm_neon.h>\n> > > > > +#include <math.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** f1:\n> > > > > +**     ...\n> > > > > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> > > > > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> > > > > +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> > > > > +**     ...\n> > > > > +*/\n> > > > > +void f1 (float32_t *a, int n)\n> > > > > +{\n> > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > +   a[i] = -fabsf (a[i]);\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > +** f2:\n> > > > > +**     ...\n> > > > > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> > > > > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> > > > > +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> > > > > +**     ...\n> > > > > +*/\n> > > > > +void f2 (float64_t *a, int n)\n> > > > > +{\n> > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > +   a[i] = -fabs (a[i]);\n> > > > > +}\n> > > > > diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > new file mode 100644\n> > > > > index\n> > > > >\n> > > >\n> >\n> 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d0\n> > > > 1f6604ca7b\n> > > > > e87e3744d494\n> > > > > --- /dev/null\n> > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > @@ -0,0 +1,37 @@\n> > > > > +/* { dg-do compile } */\n> > > > > +/* { dg-options \"-O3\" } */\n> > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > +} } } */\n> > > > > +\n> > > > > +#include <string.h>\n> > > > > +\n> > > > > +/*\n> > > > > +** negabs:\n> > > > > +**     mov     x0, -9223372036854775808\n> > > > > +**     fmov    d[0-9]+, x0\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +double negabs (double x)\n> > > > > +{\n> > > > > +   unsigned long long y;\n> > > > > +   memcpy (&y, &x, sizeof(double));\n> > > > > +   y = y | (1UL << 63);\n> > > > > +   memcpy (&x, &y, sizeof(double));\n> > > > > +   return x;\n> > > > > +}\n> > > > > +\n> > > > > +/*\n> > > > > +** negabsf:\n> > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > +**     ret\n> > > > > +*/\n> > > > > +float negabsf (float x)\n> > > > > +{\n> > > > > +   unsigned int y;\n> > > > > +   memcpy (&y, &x, sizeof(float));\n> > > > > +   y = y | (1U << 31);\n> > > > > +   memcpy (&x, &y, sizeof(float));\n> > > > > +   return x;\n> > > > > +}\n> > > > > +\n> > > > >\n> > > > >\n> > > > >\n> > > > >\n> > > > > --\n> > >\n> >\n> > --\n> > Richard Biener <rguenther@suse.de>\n> > SUSE Software Solutions Germany GmbH,\n> > Frankenstrasse 146, 90461 Nuernberg, Germany;\n> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG\n> > 
Nuernberg)","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com\n header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com\n header.b=mbo8aUD9;\n\tdkim=pass (1024-bit key) header.d=armh.onmicrosoft.com\n header.i=@armh.onmicrosoft.com header.a=rsa-sha256\n header.s=selector2-armh-onmicrosoft-com header.b=mbo8aUD9;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwWgh06VVz1yp8\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 19:35:44 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 118C23861871\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 09:35:42 +0000 (GMT)","from EUR02-VI1-obe.outbound.protection.outlook.com\n (mail-vi1eur02on2061.outbound.protection.outlook.com [40.107.241.61])\n by sourceware.org (Postfix) with ESMTPS id 02FFF3858409\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 09:35:25 +0000 (GMT)","from 
DU2PR04CA0045.eurprd04.prod.outlook.com (2603:10a6:10:234::20)\n by AS8PR08MB10363.eurprd08.prod.outlook.com (2603:10a6:20b:56b::13)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21; Wed, 27 Sep\n 2023 09:35:22 +0000","from DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com\n (2603:10a6:10:234:cafe::ee) by DU2PR04CA0045.outlook.office365.com\n (2603:10a6:10:234::20) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.21 via Frontend\n Transport; Wed, 27 Sep 2023 09:35:22 +0000","from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by\n DBAEUR03FT035.mail.protection.outlook.com (100.127.142.136) with\n Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.6838.21 via Frontend Transport; Wed, 27 Sep 2023 09:35:21 +0000","(\"Tessian outbound 5c548696a0e7:v175\");\n Wed, 27 Sep 2023 09:35:21 +0000","from b2fe0a88d170.2\n by 64aa7808-outbound-1.mta.getcheckrecipient.com id\n 1D724873-7F06-4A93-82E2-8EC69FC074F4.1;\n Wed, 27 Sep 2023 09:35:15 +0000","from EUR04-HE1-obe.outbound.protection.outlook.com\n by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id\n b2fe0a88d170.2\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);\n Wed, 27 Sep 2023 09:35:15 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)\n by AS2PR08MB9617.eurprd08.prod.outlook.com (2603:10a6:20b:60a::13)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6813.28; Wed, 27 Sep\n 2023 09:35:12 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1]) by VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1%7]) with mapi id 15.20.6813.027; Wed, 27 Sep 2023\n 09:35:12 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 02FFF3858409","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=wsnis8ugQHUnGo4iK7J4ukgAGDBacYDpIfHbWHRpNFU=;\n b=mbo8aUD97i/ZgNhYRh++cQASrlcI+9nuv5TpMcrcFqmFZSvnq/tCYuaP1DVBzn+wxmlbGBsu7moWtyLj09N8b4VBhq5xFbN0E6vxuK9aNo/p6g9NoQbdXWMFVzTwXMATuDMDbNBY2CXDtqYLApcdFUm8Vs4hGQQEaUj6PiC/OH8=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=wsnis8ugQHUnGo4iK7J4ukgAGDBacYDpIfHbWHRpNFU=;\n b=mbo8aUD97i/ZgNhYRh++cQASrlcI+9nuv5TpMcrcFqmFZSvnq/tCYuaP1DVBzn+wxmlbGBsu7moWtyLj09N8b4VBhq5xFbN0E6vxuK9aNo/p6g9NoQbdXWMFVzTwXMATuDMDbNBY2CXDtqYLApcdFUm8Vs4hGQQEaUj6PiC/OH8="],"X-MS-Exchange-Authentication-Results":"spf=pass (sender IP is 63.35.35.123)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;","Received-SPF":"Pass (protection.outlook.com: domain of arm.com designates\n 63.35.35.123 as permitted sender) receiver=protection.outlook.com;\n client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;\n pr=C","X-CheckRecipientChecked":"true","X-CR-MTA-CID":"ea3ec4284c15a79f","X-CR-MTA-TID":"64aa7808","ARC-Seal":"i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;\n b=kaqoyiEau7tefaSPrgZf7LPlqJo7r8E4cxl9cKvYzt17HHAM9AnPpEfQMEpRtADMwZGmj/y0De2LuVLnGTy6xVJnq6WSC0XYqFVVK//54JUnC3tRImUndj7xIvZP3oTWah/hCikgj717W3ygBtMTqp5538ZAxGFxIjMckHMEv6A6NbR50wOqgrxvd7c7A6JBwkI3X1pi0EmA8uS0QUPiujzhERueEPpMkUETlgT2GiSqu0DOEgIZ4kJtfhYyORVR5LoVJUWMC66Wl7thJKPiRFut7ZB1Q6Nf3cK9EPbLYMtIGBWVsm/akdP06JD8yjSQJCqC7MfpBs1JCIvI+E+lCQ==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector9901;\n 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=wsnis8ugQHUnGo4iK7J4ukgAGDBacYDpIfHbWHRpNFU=;\n b=gZPN/3fQZ3M3UdhxboegZ6d90NEazETbWj48qV3+mPuzJSIsWJXT1dh2vveZzUPFMgx8QsDbMbCJJH/TRgYbOVnsSHO6B9kbXfHWyb1U/dMAsPF28oUBO2MyQgBehQFu7k0nerXHunfIioXlXIkDNfw0zvKFrQvFFQJH9OxvwwsX/YVts05cQKv+zuW8GraYeKx2py4f4O4tl92bb5JaihG3ZLikPgt/QxkJMd145X1dCQcD1WufQRgtp9wkoHaUfaLRVICh7MYic2YAkDW4Xj16oMfjZmbW2efICVtu0ffJC2tafMZYty1vFlBj3RjPnlfIaWbcDZeIP7djkuQeWQ==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass\n header.d=arm.com; arc=none","From":"Tamar Christina <Tamar.Christina@arm.com>","To":"Tamar Christina <Tamar.Christina@arm.com>, Richard Biener\n <rguenther@suse.de>","CC":"Andrew Pinski <pinskia@gmail.com>, \"gcc-patches@gcc.gnu.org\"\n <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, \"jlaw@ventanamicro.com\"\n <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Thread-Topic":"[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","Thread-Index":"AQHZ8NyYUy/9C+tZQUuY9xAIpvunf7At3wQAgAAUNTCAAE7DgIAAC3HAgAAbzuA=","Date":"Wed, 27 Sep 2023 09:35:09 +0000","Message-ID":"\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","In-Reply-To":"\n 
<VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","Authentication-Results-Original":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","x-ms-traffictypediagnostic":"\n VI1PR08MB5325:EE_|AS2PR08MB9617:EE_|DBAEUR03FT035:EE_|AS8PR08MB10363:EE_","X-MS-Office365-Filtering-Correlation-Id":"9b7d6b55-dcda-49ec-b094-08dbbf3d11fe","x-checkrecipientrouted":"true","nodisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;","X-Microsoft-Antispam-Message-Info-Original":"\n mMP/EJapJaKLYVz3AUsYxkaqIm1gh81kfG8B1QeUS0Sq7XBz2qhBD/QcCV3cCkt65ZXH/nt3oXyJWKPFnxgwlZ7UMdlbAzfvoyoWb1fPAOschfl7JUlS+1seEj7pVnfOntC3V4cm2PglLdG4U75hHrQT/Bbw0O2JGDHOdIkyeySYrKoNTVj1zkUl32pxScoAUQ3a3fC/n73kHYOEdKNFghI/ShLVtGhDOrQxipclOWa6lz7emnqtwR91ub2HlOnI8VgxZeTAV+ROTxc5X+5VhEFp1sDAXHr71irtBLUktXpmsM6OAcGu2Se9SwBnJWgm2AldI5qB+bjXVCtxN5hmCT15Q96YNCXvWrmCYhCLvgPDq5V6+YQKuhX6AnShX0RFxK+hbdTL/Pa3M77AbWShXBaglvH3HLC0GLC5DvKGhIQvadJV07wb4nhNbcuRD15SXRP40Xt0zaIEsNovMcRhXp++L6pX4j9ziWUCLP6DJRySpMGgf6VN/JP5D54sXDsnK68Qr7CRi8X/F9bX3U8dy7X5H4yqbk8tKNbRWTgWwnVSt1WYcFOEYNHgw5x4cyovRyRscDhsogwRVpp17Zz3+OgKLVKXhN41M0ZmGiEUsEpE29QVKA7mwZhm/oejF7gYWjGdiqb66dMjO2P78624pNUrGCwJzoEQcBa/HfKrv9M=","X-Forefront-Antispam-Report-Untrusted":"CIP:255.255.255.255; CTRY:; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;\n PTR:; CAT:NONE;\n 
SFS:(13230031)(396003)(366004)(39860400002)(346002)(136003)(376002)(230922051799003)(451199024)(186009)(1800799009)(84970400001)(2940100002)(5660300002)(316002)(54906003)(55016003)(83380400001)(41300700001)(30864003)(52536014)(8676002)(8936002)(66946007)(4326008)(9686003)(33656002)(66446008)(86362001)(6666004)(53546011)(110136005)(7696005)(64756008)(6506007)(71200400001)(66476007)(66556008)(26005)(122000001)(2906002)(76116006)(38100700002)(478600001)(38070700005)(579004)(357404004);\n DIR:OUT; SFP:1101;","Content-Type":"text/plain; charset=\"us-ascii\"","Content-Transfer-Encoding":"quoted-printable","MIME-Version":"1.0","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["AS2PR08MB9617","AS8PR08MB10363"],"Original-Authentication-Results":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","X-EOPAttributedMessage":"0","X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n 300b8a0f-8a83-472d-ee43-08dbbf3d0ac7","X-Microsoft-Antispam":"BCL:0;","X-Microsoft-Antispam-Message-Info":"\n 
peTOK0HuFG4q+EHaohiBf0NmJSWChztsRU6xS5hD8wZwxt1z6EySo7vQA25n2w5zSk6O1Qh9G7k5tfbnROx51yCMVrti+GN87VQvGBzd81KGPRu84u8EPJPdPX214eJ1DfSqVrTWLhlbEJpLJcZ8rCbZm67S4WeSJpz7pQAtVXo8u6LPvL+9iZoZS/7d3sMyT8a1LokxFy2s808htLmClMDTExxVOtzwccOimiTE8QhuocSekiPPDoR71KrDIRj+/SRZSEn1LdneOSnnDYSXSIIx3ebZIFjrgYnU70dmX/q1zh//DURJkAVFEPpz8c5jWTcEUpsp1PTFmyZtgvD6UmBNiRCvj1ttTEWQIqbZgYkdQMfNqwTgtVcrgDzDct121qJ86iFBi6uk87RkLVpbmAYcC5RjHsEf4nBMuAxnlELmesvF3q3Pvj4leETD9Vwyc1yGeA5oGc9b0l84Jt05UnARkcD6JSeWVBQ3Ibrr7mWW0Z56NszV8aPhoCDdB5vOJxtMltCxp/W0pXoSU8SwZXckFi2JO4zFJjN3nN8BRmVgKDQAKWRCz+lxYqv71lumxt9BV38S8wAWlU9mvaZ17kYWvOlVyqU1+P/kQDtv+Xl0j8+t5QbIjqqnF7Cu8D7MeSQaRXyuSR3uTc9qqC8Z76o9rU20Cov1yfapBmeEtzFEBPcBkq9Y2X3ChBrxd/PSrmHiewwC06EIuI04nFbZqVOFWAdYCU/i18Nl8TXFXoR7E4BdQfuyGwhpPfrWTZhdKMjQ1Zx0oCijvP0GB/HmyuyD/cJA4E4QY8YYWE4lBuc=","X-Forefront-Antispam-Report":"CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;\n IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;\n PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;\n SFS:(13230031)(4636009)(39860400002)(396003)(136003)(376002)(346002)(230922051799003)(451199024)(186009)(1800799009)(82310400011)(46966006)(36840700001)(40470700004)(40460700003)(86362001)(33656002)(40480700001)(70586007)(26005)(52536014)(8676002)(107886003)(4326008)(8936002)(70206006)(336012)(5660300002)(41300700001)(54906003)(2940100002)(47076005)(110136005)(316002)(84970400001)(36860700001)(2906002)(30864003)(83380400001)(6666004)(55016003)(7696005)(6506007)(478600001)(9686003)(53546011)(82740400003)(356005)(81166007)(357404004);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"27 Sep 2023 09:35:21.8402 (UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 9b7d6b55-dcda-49ec-b094-08dbbf3d11fe","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];\n 
Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-Spam-Status":"No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH,\n KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE,\n TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3188595,"web_url":"http://patchwork.ozlabs.org/comment/3188595/","msgid":"<nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>","list_archive_url":null,"date":"2023-09-27T09:39:29","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard Biener","email":"rguenther@suse.de"},"content":"On Wed, 27 Sep 2023, Tamar Christina wrote:\n\n> > -----Original Message-----\n> > From: Tamar Christina <Tamar.Christina@arm.com>\n> > Sent: Wednesday, September 27, 2023 8:57 AM\n> > To: Richard Biener <rguenther@suse.de>\n> > Cc: Andrew 
Pinski <pinskia@gmail.com>; gcc-patches@gcc.gnu.org; nd\n> > <nd@arm.com>; jlaw@ventanamicro.com\n> > Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n> > signbit(x)) [PR109154]\n> > \n> > > -----Original Message-----\n> > > From: Richard Biener <rguenther@suse.de>\n> > > Sent: Wednesday, September 27, 2023 8:12 AM\n> > > To: Tamar Christina <Tamar.Christina@arm.com>\n> > > Cc: Andrew Pinski <pinskia@gmail.com>; gcc-patches@gcc.gnu.org; nd\n> > > <nd@arm.com>; jlaw@ventanamicro.com\n> > > Subject: RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x\n> > > | (1 <<\n> > > signbit(x)) [PR109154]\n> > >\n> > > On Wed, 27 Sep 2023, Tamar Christina wrote:\n> > >\n> > > > > -----Original Message-----\n> > > > > From: Andrew Pinski <pinskia@gmail.com>\n> > > > > Sent: Wednesday, September 27, 2023 2:17 AM\n> > > > > To: Tamar Christina <Tamar.Christina@arm.com>\n> > > > > Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; rguenther@suse.de;\n> > > > > jlaw@ventanamicro.com\n> > > > > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x))\n> > > > > to x | (1 <<\n> > > > > signbit(x)) [PR109154]\n> > > > >\n> > > > > On Tue, Sep 26, 2023 at 5:51?PM Tamar Christina\n> > > > > <tamar.christina@arm.com>\n> > > > > wrote:\n> > > > > >\n> > > > > > Hi All,\n> > > > > >\n> > > > > > For targets that allow conversion between int and float modes\n> > > > > > this adds a new optimization transforming fneg (fabs (x)) into x\n> > > > > > | (1 << signbit(x)).  Such sequences are common in scientific\n> > > > > > code working with\n> > > > > gradients.\n> > > > > >\n> > > > > > The transformed instruction if the target has an inclusive-OR\n> > > > > > that takes an immediate is both shorter an faster.  
For those\n> > > > > > that don't the immediate has to be seperate constructed but this\n> > > > > > still ends up being faster as the immediate construction is not\n> > > > > > on the critical\n> > > path.\n> > > > > >\n> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > > > > >\n> > > > > > Ok for master?\n> > > > >\n> > > > > I think this should be part of isel instead of match.\n> > > > > Maybe we could use genmatch to generate the code that does the\n> > > > > transformations but this does not belong as part of match really.\n> > > >\n> > > > I disagree.. I don't think this belongs in isel. Isel is for\n> > > > structural\n> > > transformations.\n> > > > If there is a case for something else I'd imagine backwardprop is a\n> > > > better\n> > > choice.\n> > > >\n> > > > But I don't see why it doesn't belong here considering it *is* a\n> > > > mathematical optimization and the file has plenty of transformations\n> > > > such as mask optimizations and vector conditional rewriting.\n> > >\n> > > But the mathematical transform would more generally be fneg (fabs (x))\n> > > -> copysign (x, -1.) and that can be optimally expanded at RTL expansion\n> > time?\n> > \n> > Ah sure, atm I did copysign (x, -1) -> x | 1 << signbits.  I can do it the other way\n> > around.  And I guess since copysign (-x, y), copysign(|x|, y) -> copysign (x, y)\n> > that should solve the trigonometry problem too.\n> > \n> > Cool will do that instead, thanks!\n> \n> Hmm this seems to conflict with the pattern\n> \n> /* copysign(x, CST) -> [-]abs (x).  
*/\n> (for copysigns (COPYSIGN_ALL)\n>  (simplify\n>   (copysigns @0 REAL_CST@1)\n>   (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n>    (negate (abs @0))\n>    (abs @0))))\n> \n> Which does the opposite transformation.\n\nI suppose the idea is that -abs(x) might be easier to optimize with\nother patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n\nFor abs vs copysign it's a canonicalization, but (negate (abs @0))\nis less canonical than copysign.\n\n> Should I try removing this?\n\nI'd say yes (and put the reverse canonicalization next to this pattern).\n\nRichard.\n\n> Thanks,\n> Tamar\n> \n> > \n> > Tamar\n> > \n> > >\n> > > Richard.\n> > >\n> > > > Regards,\n> > > > Tamar\n> > > >\n> > > > >\n> > > > > Thanks,\n> > > > > Andrew\n> > > > >\n> > > > > >\n> > > > > > Thanks,\n> > > > > > Tamar\n> > > > > >\n> > > > > > gcc/ChangeLog:\n> > > > > >\n> > > > > >         PR tree-optimization/109154\n> > > > > >         * match.pd: Add new neg+abs rule.\n> > > > > >\n> > > > > > gcc/testsuite/ChangeLog:\n> > > > > >\n> > > > > >         PR tree-optimization/109154\n> > > > > >         * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > > > > >         * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > > > > >         * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > > > > >         * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > > > > >         * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > > > > >         * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > > > > >         * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > > > > >         * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > > > > >\n> > > > > > --- inline copy of patch --\n> > > > > > diff --git a/gcc/match.pd b/gcc/match.pd index\n> > > > > >\n> > > > >\n> > >\n> > 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d69482\n> > > > > 6cffad0f\n> > > > > > b17e1136600a 100644\n> > > > > > --- a/gcc/match.pd\n> > > > > > +++ b/gcc/match.pd\n> > > > > > @@ -9476,3 
+9476,57 @@ and,\n> > > > > >        }\n> > > > > >        (if (full_perm_p)\n> > > > > >         (vec_perm (op@3 @0 @1) @3 @2))))))\n> > > > > > +\n> > > > > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> > > > > > +\n> > > > > > +(simplify\n> > > > > > + (negate (abs @0))\n> > > > > > + (if (FLOAT_TYPE_P (type)\n> > > > > > +      /* We have to delay this rewriting till after forward\n> > > > > > +prop because\n> > > > > otherwise\n> > > > > > +        it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is\n> > not\n> > > > > > +        matched in one go.  Instead cos (-x) is matched first\n> > > > > > + followed by\n> > > > > cos(|x|).\n> > > > > > +        The bottom op approach makes this rule match first and\n> > > > > > + it's not\n> > > untill\n> > > > > > +        fwdprop that we match top down.  There are manu such\n> > > > > > + simplications\n> > > > > so we\n> > > > > > +        delay this optimization till later on.  */\n> > > > > > +      && canonicalize_math_after_vectorization_p ())  (with {\n> > > > > > +    tree itype = unsigned_type_for (type);\n> > > > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT\n> > (mode);\n> > > > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector :\n> > optab_default; }\n> > > > > > +   (if (float_fmt\n> > > > > > +       && float_fmt->signbit_rw >= 0\n> > > > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > > > +           int sbit = float_fmt->signbit_rw;\n> > > > > > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > > > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > > > +     (view_convert:type\n> > > > > > +      (bit_ior (view_convert:itype @0)\n> > > > > > +              { build_uniform_cst (itype, sign_bit); } )))))))\n> > > > > > +\n> > > > > > +/* Repeat the same but for conditional negate.  */\n> > > > > > +\n> > > > > > +(simplify\n> > > > > > + (IFN_COND_NEG @1 (abs @0) @2)\n> > > > > > + (if (FLOAT_TYPE_P (type))\n> > > > > > +  (with {\n> > > > > > +    tree itype = unsigned_type_for (type);\n> > > > > > +    machine_mode mode = TYPE_MODE (type);\n> > > > > > +    const struct real_format *float_fmt = FLOAT_MODE_FORMAT\n> > (mode);\n> > > > > > +    auto optab = VECTOR_TYPE_P (type) ? optab_vector :\n> > optab_default; }\n> > > > > > +   (if (float_fmt\n> > > > > > +       && float_fmt->signbit_rw >= 0\n> > > > > > +       && targetm.can_change_mode_class (TYPE_MODE (itype),\n> > > > > > +                                         TYPE_MODE (type), ALL_REGS)\n> > > > > > +        && target_supports_op_p (itype, BIT_IOR_EXPR, optab))\n> > > > > > +    (with { wide_int wone = wi::one (element_precision (type));\n> > > > > > +           int sbit = float_fmt->signbit_rw;\n> > > > > > +           auto stype = VECTOR_TYPE_P (type) ? 
TREE_TYPE (itype) : itype;\n> > > > > > +           tree sign_bit = wide_int_to_tree (stype, wi::lshift (wone, sbit));}\n> > > > > > +     (view_convert:type\n> > > > > > +      (IFN_COND_IOR @1 (view_convert:itype @0)\n> > > > > > +              { build_uniform_cst (itype, sign_bit); }\n> > > > > > +              (view_convert:itype @2) )))))))\n> > > > > > \\ No newline at end of file\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..f823013c3ddf6b3a266\n> > > > > c3abfcbf2\n> > > > > > 642fc2a75fa6\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> > > > > > @@ -0,0 +1,39 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#pragma GCC target \"+nosve\"\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t1:\n> > > > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32x2_t t1 (float32x2_t a)\n> > > > > > +{\n> > > > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t2:\n> > > > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32x4_t t2 (float32x4_t a)\n> > > > > > +{\n> > > > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t3:\n> > > > > > +**     adrp    x0, .LC[0-9]+\n> > > > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float64x2_t t3 (float64x2_t a)\n> > > > > > +{\n> > > > 
> > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..141121176b309e4b2a\n> > > > > a413dc5527\n> > > > > > 1a6e3c93d5e1\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> > > > > > @@ -0,0 +1,31 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#pragma GCC target \"+nosve\"\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +#include <math.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f1:\n> > > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32_t f1 (float32_t a)\n> > > > > > +{\n> > > > > > +  return -fabsf (a);\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f2:\n> > > > > > +**     mov     x0, -9223372036854775808\n> > > > > > +**     fmov    d[0-9]+, x0\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float64_t f2 (float64_t a)\n> > > > > > +{\n> > > > > > +  return -fabs (a);\n> > > > > > +}\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..b4652173a95d104ddf\n> > > > > a70c497f06\n> > > > > > 27a61ea89d3b\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> > > > > > @@ -0,0 +1,36 @@\n> > > > > > +/* { dg-do compile } 
*/\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#pragma GCC target \"+nosve\"\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +#include <math.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f1:\n> > > > > > +**     ...\n> > > > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > > > +**     orr     v[0-9]+.4s, #128, lsl #24\n> > > > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > > > +**     ...\n> > > > > > +*/\n> > > > > > +void f1 (float32_t *a, int n)\n> > > > > > +{\n> > > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > > +   a[i] = -fabsf (a[i]);\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f2:\n> > > > > > +**     ...\n> > > > > > +**     ldr     q[0-9]+, \\[x0\\]\n> > > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > > +**     str     q[0-9]+, \\[x0\\], 16\n> > > > > > +**     ...\n> > > > > > +*/\n> > > > > > +void f2 (float64_t *a, int n)\n> > > > > > +{\n> > > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > > +   a[i] = -fabs (a[i]);\n> > > > > > +}\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..10879dea74462d34b2\n> > > > > 6160eeb0bd\n> > > > > > 54ead063166b\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> > > > > > @@ -0,0 +1,39 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#pragma GCC target \"+nosve\"\n> > > > > > +\n> > > > > > +#include <string.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** negabs:\n> > > > > 
> +**     mov     x0, -9223372036854775808\n> > > > > > +**     fmov    d[0-9]+, x0\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +double negabs (double x)\n> > > > > > +{\n> > > > > > +   unsigned long long y;\n> > > > > > +   memcpy (&y, &x, sizeof(double));\n> > > > > > +   y = y | (1UL << 63);\n> > > > > > +   memcpy (&x, &y, sizeof(double));\n> > > > > > +   return x;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** negabsf:\n> > > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float negabsf (float x)\n> > > > > > +{\n> > > > > > +   unsigned int y;\n> > > > > > +   memcpy (&y, &x, sizeof(float));\n> > > > > > +   y = y | (1U << 31);\n> > > > > > +   memcpy (&x, &y, sizeof(float));\n> > > > > > +   return x;\n> > > > > > +}\n> > > > > > +\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..0c7664e6de77a49768\n> > > > > 2952653ffd\n> > > > > > 417453854d52\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> > > > > > @@ -0,0 +1,37 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t1:\n> > > > > > +**     orr     v[0-9]+.2s, #128, lsl #24\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32x2_t t1 (float32x2_t a)\n> > > > > > +{\n> > > > > > +  return vneg_f32 (vabs_f32 (a)); }\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t2:\n> > > > > > +**    
 orr     v[0-9]+.4s, #128, lsl #24\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32x4_t t2 (float32x4_t a)\n> > > > > > +{\n> > > > > > +  return vnegq_f32 (vabsq_f32 (a)); }\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** t3:\n> > > > > > +**     adrp    x0, .LC[0-9]+\n> > > > > > +**     ldr     q[0-9]+, \\[x0, #:lo12:.LC0\\]\n> > > > > > +**     orr     v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float64x2_t t3 (float64x2_t a)\n> > > > > > +{\n> > > > > > +  return vnegq_f64 (vabsq_f64 (a)); }\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..a60cd31b9294af2dac6\n> > > > > 9eed1c93f\n> > > > > > 899bd5c78fca\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> > > > > > @@ -0,0 +1,29 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +#include <math.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f1:\n> > > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float32_t f1 (float32_t a)\n> > > > > > +{\n> > > > > > +  return -fabsf (a);\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f2:\n> > > > > > +**     mov     x0, -9223372036854775808\n> > > > > > +**     fmov    d[0-9]+, x0\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float64_t f2 (float64_t a)\n> > > > > > +{\n> > > > > > +  return -fabs (a);\n> > > > > > +}\n> > > > > > 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 0000000000000000000000000000000000000000..1bf34328d8841de8e6\n> > > > > b0a5458562\n> > > > > > a9f00e31c275\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> > > > > > @@ -0,0 +1,34 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#include <arm_neon.h>\n> > > > > > +#include <math.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f1:\n> > > > > > +**     ...\n> > > > > > +**     ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> > > > > > +**     orr     z[0-9]+.s, z[0-9]+.s, #0x80000000\n> > > > > > +**     st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> > > > > > +**     ...\n> > > > > > +*/\n> > > > > > +void f1 (float32_t *a, int n)\n> > > > > > +{\n> > > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > > +   a[i] = -fabsf (a[i]);\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** f2:\n> > > > > > +**     ...\n> > > > > > +**     ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> > > > > > +**     orr     z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> > > > > > +**     st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> > > > > > +**     ...\n> > > > > > +*/\n> > > > > > +void f2 (float64_t *a, int n)\n> > > > > > +{\n> > > > > > +  for (int i = 0; i < (n & -8); i++)\n> > > > > > +   a[i] = -fabs (a[i]);\n> > > > > > +}\n> > > > > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > > b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > > new file mode 100644\n> > > > > > index\n> > > > > >\n> > > > >\n> > >\n> > 
0000000000000000000000000000000000000000..21f2a8da2a5d44e3d0\n> > > > > 1f6604ca7b\n> > > > > > e87e3744d494\n> > > > > > --- /dev/null\n> > > > > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> > > > > > @@ -0,0 +1,37 @@\n> > > > > > +/* { dg-do compile } */\n> > > > > > +/* { dg-options \"-O3\" } */\n> > > > > > +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64\n> > > > > > +} } } */\n> > > > > > +\n> > > > > > +#include <string.h>\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** negabs:\n> > > > > > +**     mov     x0, -9223372036854775808\n> > > > > > +**     fmov    d[0-9]+, x0\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +double negabs (double x)\n> > > > > > +{\n> > > > > > +   unsigned long long y;\n> > > > > > +   memcpy (&y, &x, sizeof(double));\n> > > > > > +   y = y | (1UL << 63);\n> > > > > > +   memcpy (&x, &y, sizeof(double));\n> > > > > > +   return x;\n> > > > > > +}\n> > > > > > +\n> > > > > > +/*\n> > > > > > +** negabsf:\n> > > > > > +**     movi    v[0-9]+.2s, 0x80, lsl 24\n> > > > > > +**     orr     v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> > > > > > +**     ret\n> > > > > > +*/\n> > > > > > +float negabsf (float x)\n> > > > > > +{\n> > > > > > +   unsigned int y;\n> > > > > > +   memcpy (&y, &x, sizeof(float));\n> > > > > > +   y = y | (1U << 31);\n> > > > > > +   memcpy (&x, &y, sizeof(float));\n> > > > > > +   return x;\n> > > > > > +}\n> > > > > > +\n> > > > > >\n> > > > > >\n> > > > > >\n> > > > > >\n> > > > > > --\n> > > >\n> > >\n> > > --\n> > > Richard Biener <rguenther@suse.de>\n> > > SUSE Software Solutions Germany GmbH,\n> > > Frankenstrasse 146, 90461 Nuernberg, Germany;\n> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG\n> > > 
Nuernberg)\n>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=EKZ4HTjl;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=PMSmO9pk;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4RwWmN5CCkz1yp8\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 19:39:48 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id BD6B3385CCA3\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 27 Sep 2023 09:39:46 +0000 (GMT)","from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28])\n by sourceware.org (Postfix) with ESMTPS id 09AE6385CC97\n for <gcc-patches@gcc.gnu.org>; Wed, 27 Sep 2023 09:39:30 +0000 (GMT)","from relay2.suse.de (relay2.suse.de [149.44.160.134])\n by smtp-out1.suse.de (Postfix) with ESMTP id 3FD102195A;\n Wed, 27 Sep 2023 09:39:29 +0000 (UTC)","from wotan.suse.de (wotan.suse.de [10.160.0.1])\n 
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n (No client certificate requested)\n by relay2.suse.de (Postfix) with ESMTPS id 1E9BF2C14E;\n Wed, 27 Sep 2023 09:39:29 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 09AE6385CC97","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1695807569;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=46q/PEBHd2ffVzQN6wxfxGF0yMLFSX30nLil3dq2wOU=;\n b=EKZ4HTjlWC2QBkXe2JITo0ijG+7SAvz1MCwylTlQQyA6cUh3V5Jld2GjyOnHTghYhB0ibd\n AoX9mVb8bTKm+GeCieZ/aQk4VwKJ/ZH2Pw+Dr+lDJRylIk2/OTo4IZPF0e0C/qCw540CYQ\n qnyZWz7Z/o67DrqKjcrQhwAX/Gj99yA=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1695807569;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=46q/PEBHd2ffVzQN6wxfxGF0yMLFSX30nLil3dq2wOU=;\n b=PMSmO9pkbyPHK0k9yAQtw49duUis66gIOTPlNJVgKbNy/qCgC7Vlek5c22Fxa85gAkSUMq\n fLHsYzKxDLXql8Cg=="],"Date":"Wed, 27 Sep 2023 09:39:29 +0000 (UTC)","From":"Richard Biener <rguenther@suse.de>","To":"Tamar Christina <Tamar.Christina@arm.com>","cc":"Andrew Pinski <pinskia@gmail.com>,\n \"gcc-patches@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"jlaw@ventanamicro.com\" <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","In-Reply-To":"\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","Message-ID":"<nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n 
<nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>","User-Agent":"Alpine 2.22 (LSU 394 2020-01-19)","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","X-Spam-Status":"No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_LOTSOFHASH,\n KAM_SHORT, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3190222,"web_url":"http://patchwork.ozlabs.org/comment/3190222/","msgid":"<5535287f-43af-4c04-bd3d-47f2075f61ed@gmail.com>","list_archive_url":null,"date":"2023-09-29T15:00:46","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":81263,"url":"http://patchwork.ozlabs.org/api/people/81263/","name":"Jeffrey Law","email":"jeffreyalaw@gmail.com"},"content":"On 9/26/23 18:50, Tamar Christina wrote:\n> Hi All,\n> \n> For targets that allow conversion between int and float modes this adds a new\n> optimization transforming fneg (fabs (x)) into x | (1 << signbit(x)).  
Such\n> sequences are common in scientific code working with gradients.\n> \n> The transformed instruction if the target has an inclusive-OR that takes an\n> immediate is both shorter an faster.  For those that don't the immediate has\n> to be seperate constructed but this still ends up being faster as the immediate\n> construction is not on the critical path.\n> \n> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> \n> Ok for master?\n> \n> Thanks,\n> Tamar\n> \n> gcc/ChangeLog:\n> \n> \tPR tree-optimization/109154\n> \t* match.pd: Add new neg+abs rule.\n> \n> gcc/testsuite/ChangeLog:\n> \n> \tPR tree-optimization/109154\n> \t* gcc.target/aarch64/fneg-abs_1.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_2.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_3.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_4.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> \n> --- inline copy of patch --\n> diff --git a/gcc/match.pd b/gcc/match.pd\n> index 39c7ea1088f25538ed8bd26ee89711566141a71f..8ebde06dcd4b26d694826cffad0fb17e1136600a 100644\n> --- a/gcc/match.pd\n> +++ b/gcc/match.pd\n> @@ -9476,3 +9476,57 @@ and,\n>         }\n>         (if (full_perm_p)\n>   \t(vec_perm (op@3 @0 @1) @3 @2))))))\n> +\n> +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> +\n> +(simplify\n> + (negate (abs @0))\n> + (if (FLOAT_TYPE_P (type)\n> +      /* We have to delay this rewriting till after forward prop because otherwise\n> +\t it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> +\t matched in one go.  Instead cos (-x) is matched first followed by cos(|x|).\n> +\t The bottom op approach makes this rule match first and it's not untill\n> +\t fwdprop that we match top down.  There are manu such simplications so we\nMultiple typos this line.  
fwdprop->fwprop manu->many \nsimplications->simplifications.\n\nOK with the typos fixed.\n\nThanks.  I meant to say hi at the Cauldron, but never seemed to get away \nlong enough to find you..\n\njeff","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20230601 header.b=WzlRHpqq;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","sourceware.org; spf=pass smtp.mailfrom=gmail.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4Rxtp90vhrz1yng\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 30 Sep 2023 01:01:05 +1000 (AEST)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 1E39838323CA\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 29 Sep 2023 15:01:03 +0000 (GMT)","from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com\n [IPv6:2607:f8b0:4864:20::d2e])\n by sourceware.org (Postfix) with ESMTPS id 77B213858C50\n for <gcc-patches@gcc.gnu.org>; Fri, 29 Sep 2023 15:00:49 +0000 (GMT)","by mail-io1-xd2e.google.com with SMTP id\n ca18e2360f4ac-79fa387fb96so435556339f.1\n for 
<gcc-patches@gcc.gnu.org>; Fri, 29 Sep 2023 08:00:49 -0700 (PDT)","from [172.31.0.109] ([136.36.130.248])\n by smtp.gmail.com with ESMTPSA id\n o16-20020a056638125000b00439c3385402sm5239967jas.149.2023.09.29.08.00.47\n (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);\n Fri, 29 Sep 2023 08:00:47 -0700 (PDT)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 77B213858C50","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=gmail.com; s=20230601; t=1695999649; x=1696604449; darn=gcc.gnu.org;\n h=content-transfer-encoding:in-reply-to:from:references:cc:to\n :content-language:subject:user-agent:mime-version:date:message-id\n :from:to:cc:subject:date:message-id:reply-to;\n bh=PUICq6QcQqpkStrLxkXeAvrJholV8NGr3O/zP3k6I/w=;\n b=WzlRHpqqeebveHEDgmyf1qfCrkiitS/VmgrPD1UdXOQETlGYbp1LRraJk5FgGN6fgW\n Mun9Pw6zyVadPaz/RqPgLx65nF2s8o69nbDkOM5tdMKdTS5xcaRX6KOdPR2t0epbA3lI\n jflghczHyOYM91LWSoYLwu6QKBTRhcVXpeLQCVc4q0SVN1V/KiBNqi7YzKkntYSeW7DV\n 97o9ClGt6V5bMimSINnLbWNXb+Q/wpCnANg33NqSVIrt/t2C6TgeGLwC004Bs01nET0T\n 1hwAhWK/tYnh5GM2G9rfPKsyfKr4k2qZin2s/ioJbPHIUeK9zflJIcQvY3cAQJqZ8Al9\n XiCg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20230601; t=1695999649; x=1696604449;\n h=content-transfer-encoding:in-reply-to:from:references:cc:to\n :content-language:subject:user-agent:mime-version:date:message-id\n :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;\n bh=PUICq6QcQqpkStrLxkXeAvrJholV8NGr3O/zP3k6I/w=;\n b=snhWbIq4LUQM0Yxonr+Tr3aVQDlvapOC8whxs4+SGLenGXMvNTULEi8xvK/GdwHnEy\n H2OCRZI6u9Ln+/PvxUGQp7IXvO/RWWS//4m6KZShXdKTsUMuvUzDwAuWAqc7vGBSA+GY\n 9auA+tAo2h9BrY9/o8Pj+4gc8e98eOOTQmQIogyGet1r+rzfmZWckUm7k63hlDFl2Rnf\n Ixo6P57xhyFcLIY+xgD8XNmNelJM6e/lfRCdSnysJqeenSe0F9BeRjEgxEQaqlSuJOXA\n tZBVl+pM1AP4SijloWaEP4GrW+AoMpWVIWUyT0yUVdVUgLPyPYswmCBS+taF2qJwIn53\n fO1A==","X-Gm-Message-State":"AOJu0Yw+mYDAvCeNtWkWr6KTfnl0obYxHn4/QoDw1ZECiboQtK23+7zm\n cDjHtqvxpgIIoDEvSixqoFQ=","X-Google-Smtp-Source":"\n 
AGHT+IGhBJQrlScC/gE9WsCSE+/s6jhKEDr8E/HeCc7YmJgcNMDGQDenVe482ixx0LQ9LAhwDGwyqQ==","X-Received":"by 2002:a6b:e50b:0:b0:79f:a0e9:b8c3 with SMTP id\n y11-20020a6be50b000000b0079fa0e9b8c3mr4841231ioc.20.1695999648531;\n Fri, 29 Sep 2023 08:00:48 -0700 (PDT)","Message-ID":"<5535287f-43af-4c04-bd3d-47f2075f61ed@gmail.com>","Date":"Fri, 29 Sep 2023 09:00:46 -0600","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Content-Language":"en-US","To":"Tamar Christina <tamar.christina@arm.com>, gcc-patches@gcc.gnu.org","Cc":"nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com","References":"<patch-17718-tamar@arm.com>","From":"Jeff Law <jeffreyalaw@gmail.com>","In-Reply-To":"<patch-17718-tamar@arm.com>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","X-Spam-Status":"No, score=-8.4 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,\n KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3193890,"web_url":"http://patchwork.ozlabs.org/comment/3193890/","msgid":"<VI1PR08MB5325B21B06D84B5E1B1D2FDFFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","list_archive_url":null,"date":"2023-10-05T18:09:50","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":69689,"url":"http://patchwork.ozlabs.org/api/people/69689/","name":"Tamar Christina","email":"Tamar.Christina@arm.com"},"content":"> > b17e1136600a 100644\n> > --- a/gcc/match.pd\n> > +++ b/gcc/match.pd\n> > @@ -9476,3 +9476,57 @@ and,\n> >         }\n> >         (if (full_perm_p)\n> >   \t(vec_perm (op@3 @0 @1) @3 @2))))))\n> > +\n> > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X).  */\n> > +\n> > +(simplify\n> > + (negate (abs @0))\n> > + (if (FLOAT_TYPE_P (type)\n> > +      /* We have to delay this rewriting till after forward prop because\n> otherwise\n> > +\t it's harder to do trigonometry optimizations. e.g. cos(-fabs(x)) is not\n> > +\t matched in one go.  Instead cos (-x) is matched first followed by\n> cos(|x|).\n> > +\t The bottom op approach makes this rule match first and it's not untill\n> > +\t fwdprop that we match top down.  There are manu such\n> simplications\n> > +so we\n> Multiple typos this line.  fwdprop->fwprop manu->many\n> simplications->simplifications.\n> \n> OK with the typos fixed.\n\nAh I think you missed the previous emails from Richi whom wanted this canonicalized to\ncopysign instead. I've just finished doing so and will send the updated patch 😊\n\n> \n> Thanks.  
I meant to say hi at the Cauldron, but never seemed to get away long\n> enough to find you..\n\nHehehe Indeed, I think I only saw you once and then *poof* like a ninja you were gone!\n\nNext time 😊\n\nCheers,\nTamar\n\n> \n> jeff","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com\n header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com\n header.b=6/ZOUkpu;\n\tdkim=pass (1024-bit key) header.d=armh.onmicrosoft.com\n header.i=@armh.onmicrosoft.com header.a=rsa-sha256\n header.s=selector2-armh-onmicrosoft-com header.b=6/ZOUkpu;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S1fjq1Fh1z1yng\n\tfor <incoming@patchwork.ozlabs.org>; Fri,  6 Oct 2023 05:10:20 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id CDF55385559B\n\tfor <incoming@patchwork.ozlabs.org>; Thu,  5 Oct 2023 18:10:18 +0000 (GMT)","from EUR05-DB8-obe.outbound.protection.outlook.com\n (mail-db8eur05on2064.outbound.protection.outlook.com 
[40.107.20.64])\n by sourceware.org (Postfix) with ESMTPS id 6FE373857806\n for <gcc-patches@gcc.gnu.org>; Thu,  5 Oct 2023 18:10:03 +0000 (GMT)","from DUZPR01CA0047.eurprd01.prod.exchangelabs.com\n (2603:10a6:10:469::16) by GV1PR08MB8036.eurprd08.prod.outlook.com\n (2603:10a6:150:97::7) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.28; Thu, 5 Oct\n 2023 18:10:00 +0000","from DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com\n (2603:10a6:10:469:cafe::f5) by DUZPR01CA0047.outlook.office365.com\n (2603:10a6:10:469::16) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.33 via Frontend\n Transport; Thu, 5 Oct 2023 18:10:00 +0000","from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by\n DBAEUR03FT027.mail.protection.outlook.com (100.127.142.237) with\n Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.6863.26 via Frontend Transport; Thu, 5 Oct 2023 18:10:00 +0000","(\"Tessian outbound d219f9a4f5c9:v211\");\n Thu, 05 Oct 2023 18:10:00 +0000","from e0adb5576988.1\n by 64aa7808-outbound-1.mta.getcheckrecipient.com id\n 97F8800D-CE06-42DC-B812-085D1618FF40.1;\n Thu, 05 Oct 2023 18:09:53 +0000","from EUR04-DB3-obe.outbound.protection.outlook.com\n by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id\n e0adb5576988.1\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);\n Thu, 05 Oct 2023 18:09:53 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)\n by DB8PR08MB5529.eurprd08.prod.outlook.com (2603:10a6:10:115::22)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.35; Thu, 5 Oct\n 2023 18:09:52 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1]) by VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1%7]) with mapi id 15.20.6838.033; Thu, 5 Oct 2023\n 
18:09:51 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 6FE373857806","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=oi3AfOcxacx3OHrMP4B8w/6zuIRzXlCqqwLH4mIAFoU=;\n b=6/ZOUkpuLRYM6s1FNb/G7N41+MNYxJSOaDrlYBP9FThJ62gNS/itrP4/3m7UlRGVbhuKAbeyDa9UmQVQtkOymg0+spRoP8x3bsOdhBOzbKGFFSkYElkx6Vlvza9lp2RiX39gM5CHUZEcM9m1FH9qGaW0XgrrVFt0uwN5/k2r6wM=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=oi3AfOcxacx3OHrMP4B8w/6zuIRzXlCqqwLH4mIAFoU=;\n b=6/ZOUkpuLRYM6s1FNb/G7N41+MNYxJSOaDrlYBP9FThJ62gNS/itrP4/3m7UlRGVbhuKAbeyDa9UmQVQtkOymg0+spRoP8x3bsOdhBOzbKGFFSkYElkx6Vlvza9lp2RiX39gM5CHUZEcM9m1FH9qGaW0XgrrVFt0uwN5/k2r6wM="],"X-MS-Exchange-Authentication-Results":"spf=pass (sender IP is 63.35.35.123)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;","Received-SPF":"Pass (protection.outlook.com: domain of arm.com designates\n 63.35.35.123 as permitted sender) receiver=protection.outlook.com;\n client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;\n pr=C","X-CheckRecipientChecked":"true","X-CR-MTA-CID":"21a838f604ee4ebf","X-CR-MTA-TID":"64aa7808","ARC-Seal":"i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;\n b=NNP1BAi4YL6plOONJ0aONnEkrcoD1+g7nGBx3oJxDUcT2LJCivg3o/9QYQaaI1b5PYifZ2rAnQ79QgXLoYo52xygxyNy2sj5MhHnuA9qh1KM5hMb0Mcpo2SLaeJvaO2cJFA1T5HKG5aVFKR5UF6IZtGVVHl6BWm6nNl+nX4/cefKEP9/yB6E5QkLXEKIv8LSESzpDZAQ31WdIZyMvx1mGkaXsX4sVI7rZtR/fWKa67t9kSFqxVV5WTQw3e36snQLzRkFtdjePH181+58NW03H1ykKV4DgOZe18wkKUX1RhtB38x7pTAn2Q3ONhCtJ2RVSnbIAFd7N2t/KZpQQF0gFA==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n 
s=arcselector9901;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=oi3AfOcxacx3OHrMP4B8w/6zuIRzXlCqqwLH4mIAFoU=;\n b=QOxK5bNK1ScmxNbxnq1Yk0F/Z2ZlIXmrUcs3KJGZO+NRZMjQbgu+r2fNOAocBHA9f25qVDM/tk3kAaj1XmI1/pMGhtJiT3FCCj+WJOWGxFaBKX63VZJSsT02FtthAQ0WZ3am43FXCRoSo7xgSawAWc8uOQrn0Mb1hHQ22iDPaceTM8MPILhgGwJWlO7OwElbgkKNERPg5GlHEuHCIZLK91d4kc/8NAUwQaXaDa4SioZW4W+blUxGtIMfRnBtOEk2UAIKUp2GFrcK5BxNKNcP5XQBUPGcl1K5pMHQYIwH707MY4PVfg/6XVOQyUormGzzwwA1uIStpFdn16VBkAMuSw==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass\n header.d=arm.com; arc=none","From":"Tamar Christina <Tamar.Christina@arm.com>","To":"Jeff Law <jeffreyalaw@gmail.com>, \"gcc-patches@gcc.gnu.org\"\n <gcc-patches@gcc.gnu.org>","CC":"nd <nd@arm.com>, \"rguenther@suse.de\" <rguenther@suse.de>,\n \"jlaw@ventanamicro.com\" <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Thread-Topic":"[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","Thread-Index":"AQHZ8NyYUy/9C+tZQUuY9xAIpvunf7Ax6boAgAmiR0A=","Date":"Thu, 5 Oct 2023 18:09:50 +0000","Message-ID":"\n <VI1PR08MB5325B21B06D84B5E1B1D2FDFFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","References":"<patch-17718-tamar@arm.com>\n <5535287f-43af-4c04-bd3d-47f2075f61ed@gmail.com>","In-Reply-To":"<5535287f-43af-4c04-bd3d-47f2075f61ed@gmail.com>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"","X-MS-TNEF-Correlator":"","Authentication-Results-Original":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","x-ms-traffictypediagnostic":"\n 
VI1PR08MB5325:EE_|DB8PR08MB5529:EE_|DBAEUR03FT027:EE_|GV1PR08MB8036:EE_","X-MS-Office365-Filtering-Correlation-Id":"40321552-7eb6-40c7-0fef-08dbc5ce4a1f","x-checkrecipientrouted":"true","nodisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;","X-Microsoft-Antispam-Message-Info-Original":"\n ELoGC1lzzHspVqnaKOgMi2fHdceRIGgRgjyO6uuD2TAFByVM4O8xXclhoEE+TqNpLH7dfUV+ATuGKQDidVpZLQca3skgtpslryxdiOjuQ/X6z58r0z7fq32D3gaEZ7efmIfIJdagUe1Rqe7CXc3dvnoZhGGJe4X8/JeSHGkOx86FkayHvKhYwKKymkqtJkixgQc3fiQORtwi4Z2Edp56H6B6tJb3ndbDhCgHR30kS4idHjM6BF5ox8fQbCa+gHspClvhpy6wy0RLv9bf1zbBESjeeax7v+UICYsMYQYbyDKnegcJytDbeAi0INuOF6me7AKaL/yQ+OC1JYB0kaitj5KS4qPSUr2RSJ4Ve2f3f6+AxrCE1rbDM8fqPp5e8LoxII4/Qzyz8yWIrXdWymA2O/gnL/JpyMIooYb7zUeIKhJKoTwm2oi53q5r79px/eC0aNY04k9WhG1dX/7fdHoQme83czSyqos4O08yxxXFm0SCa8aNnqz0VA3HswJqoBJ6UG+C2EdM0T6PfNvnv6ni26iyTY2blC2tGFdk1yju3TSyH+xY27hE477EVKJYryHPHIoKlEqyGS9tNu+bkxys6tfzgGipWuDpc6tgcSk9khSV4O18mCAcBA8pvEOnCoks","X-Forefront-Antispam-Report-Untrusted":"CIP:255.255.255.255; CTRY:; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;\n PTR:; CAT:NONE;\n SFS:(13230031)(376002)(39860400002)(366004)(346002)(136003)(396003)(230922051799003)(1800799009)(186009)(451199024)(64100799003)(6506007)(7696005)(9686003)(478600001)(83380400001)(26005)(71200400001)(316002)(2906002)(66556008)(66446008)(54906003)(64756008)(110136005)(66476007)(76116006)(41300700001)(66946007)(8676002)(8936002)(4326008)(5660300002)(52536014)(33656002)(38100700002)(38070700005)(86362001)(122000001)(55016003);\n DIR:OUT; SFP:1101;","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","MIME-Version":"1.0","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["DB8PR08MB5529","GV1PR08MB8036"],"Original-Authentication-Results":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none 
header.from=arm.com;","X-EOPAttributedMessage":"0","X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n e3e2c9b1-7886-493f-85e8-08dbc5ce44b9","X-Microsoft-Antispam":"BCL:0;","X-Microsoft-Antispam-Message-Info":"\n tIeplqNmJ1P/kBO0uXtSzw2mMztu+FiLdsSr1z0r8pXwgYU00Yrdry0bjpq6xgyOXCbC1/Wl37YW52+QPe3/9PyJ9W/fEqXeYyCtGmH1rFJ7ztpUteGPdDv/a7RTxFTh3DpbPHHwA6PgUdtGKFvIKKVeq3edq6ERieqO2gEzCFy/f5/h7aAp+D9y+S8yU2oh4u0AC2eDt8Lt8FfcS8GiKpO3MkeKKRXDKlnQtVjk9Oryg/R4bqtJeGScskj7hiHs5fvnR2Kfv6L54t43/J382Ar8R4wKFoX3KcTyCQQnvv1YoKDdxs7T0jvtSkalZtDvYus+lEhNVPVK27G9pgmPP3B+hOtG0vnAi0sywEbOY6MUtzaS9N9VUaOwigFgBA7InZ9GZLg1OLlVZp/DMHtIOy7/CZQrfLCFw/c1Wk/y/rZ3/KjLhMPNChj6H4hm1oMI15Nx+1tKW41x2IHH+hTF8c/HWlTLYbhXfd+qFBRQWQT1/wuiRadAIYAtrr+5ieTUzKFGWWXJVWB8Am1gN7totpzc3e+YgR0AloaJfBhMmOlWReU1+mDEZUQLvbnzAK3Fy6Fjb1p8sdalqTAeulEPoGcNRl5xvujwYDk2AqFCd+q4m+sm/YAE/28Z4qKsWk1yjDq8i4haOTexme3ZTqo9qXM+2f8yMVZyMvC8+iOydEqoMTAVPFXEdTAhiyoLHpIv7PB9ZYGPJbT6mSl/yoFPCUWzuhQ+Dh+ZskthHFwUfA/G+gKnw0gBJUyon2SdZIVD","X-Forefront-Antispam-Report":"CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;\n IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;\n PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;\n SFS:(13230031)(4636009)(39860400002)(346002)(396003)(136003)(376002)(230922051799003)(1800799009)(451199024)(64100799003)(82310400011)(186009)(46966006)(36840700001)(40470700004)(70206006)(26005)(110136005)(2906002)(70586007)(4326008)(8676002)(5660300002)(41300700001)(54906003)(8936002)(52536014)(9686003)(478600001)(6506007)(7696005)(36860700001)(336012)(316002)(107886003)(47076005)(83380400001)(40460700003)(356005)(40480700001)(55016003)(81166007)(82740400003)(86362001)(33656002);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"05 Oct 2023 18:10:00.0385 
(UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 40321552-7eb6-40c7-0fef-08dbc5ce4a1f","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];\n Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-Spam-Status":"No, score=-6.4 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, FORGED_SPF_HELO, KAM_DMARC_NONE, RCVD_IN_DNSWL_NONE,\n RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP,\n UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3193893,"web_url":"http://patchwork.ozlabs.org/comment/3193893/","msgid":"<VI1PR08MB532531FEB62DED3A5518BF0AFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","list_archive_url":null,"date":"2023-10-05T18:11:22","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":69689,"url":"http://patchwork.ozlabs.org/api/people/69689/","name":"Tamar 
Christina","email":"Tamar.Christina@arm.com"},"content":"> I suppose the idea is that -abs(x) might be easier to optimize with other\n> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n> \n> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n> canonical than copysign.\n> \n> > Should I try removing this?\n> \n> I'd say yes (and put the reverse canonicalization next to this pattern).\n> \n\nThis patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\ncanonical and allows a target to expand this sequence efficiently.  Such\nsequences are common in scientific code working with gradients.\n\nVarious optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\nwhich means they exclude IFN_COPYSIGN.  COPYSIGN, however, is restricted to only\nthe C99 builtins and so doesn't work for vectors.\n\nThe patch expands these optimizations to work on COPYSIGN_ALL.\n\nThere is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\nwhich I remove since this is a less efficient form.  
The testsuite is also\nupdated in light of this.\n\nBootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n\nOk for master?\n\nThanks,\nTamar\n\ngcc/ChangeLog:\n\n\tPR tree-optimization/109154\n\t* match.pd: Add new neg+abs rule, remove inverse copysign rule and\n\texpand existing copysign optimizations.\n\ngcc/testsuite/ChangeLog:\n\n\tPR tree-optimization/109154\n\t* gcc.dg/fold-copysign-1.c: Updated.\n\t* gcc.dg/pr55152-2.c: Updated.\n\t* gcc.dg/tree-ssa/abs-4.c: Updated.\n\t* gcc.dg/tree-ssa/backprop-6.c: Updated.\n\t* gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n\t* gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n\t* gcc.target/aarch64/fneg-abs_1.c: New test.\n\t* gcc.target/aarch64/fneg-abs_2.c: New test.\n\t* gcc.target/aarch64/fneg-abs_3.c: New test.\n\t* gcc.target/aarch64/fneg-abs_4.c: New test.\n\t* gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n\t* gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n\t* gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n\t* gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n\n--- inline copy of patch ---\n\ndiff --git a/gcc/match.pd b/gcc/match.pd\nindex 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n--- a/gcc/match.pd\n+++ b/gcc/match.pd\n@@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n \n /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n (for coss (COS COSH)\n-     copysigns (COPYSIGN)\n- (simplify\n-  (coss (copysigns @0 @1))\n-   (coss @0)))\n+ (for copysigns (COPYSIGN_ALL)\n+  (simplify\n+   (coss (copysigns @0 @1))\n+    (coss @0))))\n \n /* pow(copysign(x, y), z) -> pow(x, z) if z is an even integer.  
*/\n (for pows (POW)\n-     copysigns (COPYSIGN)\n- (simplify\n-  (pows (copysigns @0 @2) REAL_CST@1)\n-  (with { HOST_WIDE_INT n; }\n-   (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n-    (pows @0 @1)))))\n+ (for copysigns (COPYSIGN_ALL)\n+  (simplify\n+   (pows (copysigns @0 @2) REAL_CST@1)\n+   (with { HOST_WIDE_INT n; }\n+    (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n+     (pows @0 @1))))))\n /* Likewise for powi.  */\n (for pows (POWI)\n-     copysigns (COPYSIGN)\n- (simplify\n-  (pows (copysigns @0 @2) INTEGER_CST@1)\n-  (if ((wi::to_wide (@1) & 1) == 0)\n-   (pows @0 @1))))\n+ (for copysigns (COPYSIGN_ALL)\n+  (simplify\n+   (pows (copysigns @0 @2) INTEGER_CST@1)\n+   (if ((wi::to_wide (@1) & 1) == 0)\n+    (pows @0 @1)))))\n \n (for hypots (HYPOT)\n-     copysigns (COPYSIGN)\n- /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n- (simplify\n-  (hypots (copysigns @0 @1) @2)\n-  (hypots @0 @2))\n- /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n- (simplify\n-  (hypots @0 (copysigns @1 @2))\n-  (hypots @0 @1)))\n+ (for copysigns (COPYSIGN)\n+  /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n+  (simplify\n+   (hypots (copysigns @0 @1) @2)\n+   (hypots @0 @2))\n+  /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n+  (simplify\n+   (hypots @0 (copysigns @1 @2))\n+   (hypots @0 @1))))\n \n-/* copysign(x, CST) -> [-]abs (x).  */\n-(for copysigns (COPYSIGN_ALL)\n- (simplify\n-  (copysigns @0 REAL_CST@1)\n-  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n-   (negate (abs @0))\n-   (abs @0))))\n+/* Transform fneg (fabs (X)) -> copysign (X, -1).  */\n+\n+(simplify\n+ (negate (abs @0))\n+ (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))\n \n /* copysign(copysign(x, y), z) -> copysign(x, z).  
*/\n (for copysigns (COPYSIGN_ALL)\ndiff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c\nindex f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644\n--- a/gcc/testsuite/gcc.dg/fold-copysign-1.c\n+++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n@@ -12,5 +12,5 @@ double bar (double x)\n   return __builtin_copysign (x, minuszero);\n }\n \n-/* { dg-final { scan-tree-dump-times \"= -\" 1 \"cddce1\" } } */\n-/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 2 \"cddce1\" } } */\n+/* { dg-final { scan-tree-dump-times \"__builtin_copysign\" 1 \"cddce1\" } } */\n+/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"cddce1\" } } */\ndiff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c\nindex 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644\n--- a/gcc/testsuite/gcc.dg/pr55152-2.c\n+++ b/gcc/testsuite/gcc.dg/pr55152-2.c\n@@ -10,4 +10,5 @@ int f(int a)\n   return (a<-a)?a:-a;\n }\n \n-/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 2 \"optimized\" } } */\n+/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 1 \"optimized\" } } */\n+/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 1 \"optimized\" } } */\ndiff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\nindex 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644\n--- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n+++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n@@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }\n \n /* __builtin_signbit(x) ? x : -x. 
Should be convert into - ABS_EXP<x> */\n /* { dg-final { scan-tree-dump-not \"signbit\" \"optimized\"} } */\n-/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 3 \"optimized\"} } */\n-/* { dg-final { scan-tree-dump-times \"= -\" 3 \"optimized\"} } */\n+/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"optimized\"} } */\n+/* { dg-final { scan-tree-dump-times \"= -\" 1 \"optimized\"} } */\n+/* { dg-final { scan-tree-dump-times \"= \\.COPYSIGN\" 2 \"optimized\"} } */\ndiff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\nindex 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644\n--- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n+++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n@@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)\n TEST_FUNCTION (double, )\n TEST_FUNCTION (long double, l)\n \n-/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 6 \"backprop\" } } */\n-/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 3 \"backprop\" } } */\n+/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 4 \"backprop\" } } */\n+/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = \\.COPYSIGN} 2 \"backprop\" } } */\n+/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 1 \"backprop\" } } */\ndiff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\nindex de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644\n--- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n+++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n@@ -10,4 +10,5 @@ float f1(float x)\n   float t = __builtin_copysignf (1.0f, -x);\n   return x * t;\n }\n-/* { dg-final { scan-tree-dump-times \"ABS\" 2 \"optimized\"} } */\n+/* { dg-final { scan-tree-dump-times \"ABS\" 1 \"optimized\"} } */\n+/* { dg-final { scan-tree-dump-times \".COPYSIGN\" 1 \"optimized\"} } */\ndiff --git 
a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\nindex a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644\n--- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n+++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n@@ -34,4 +34,5 @@ float i1(float x)\n {\n   return x * (x <= 0.f ? 1.f : -1.f);\n }\n-/* { dg-final { scan-tree-dump-times \"ABS\" 8 \"gimple\"} } */\n+/* { dg-final { scan-tree-dump-times \"ABS\" 4 \"gimple\"} } */\n+/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 4 \"gimple\"} } */\ndiff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n@@ -0,0 +1,39 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#pragma GCC target \"+nosve\"\n+\n+#include <arm_neon.h>\n+\n+/*\n+** t1:\n+**\torr\tv[0-9]+.2s, #128, lsl #24\n+**\tret\n+*/\n+float32x2_t t1 (float32x2_t a)\n+{\n+  return vneg_f32 (vabs_f32 (a));\n+}\n+\n+/*\n+** t2:\n+**\torr\tv[0-9]+.4s, #128, lsl #24\n+**\tret\n+*/\n+float32x4_t t2 (float32x4_t a)\n+{\n+  return vnegq_f32 (vabsq_f32 (a));\n+}\n+\n+/*\n+** t3:\n+**\tadrp\tx0, .LC[0-9]+\n+**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n+**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n+**\tret\n+*/\n+float64x2_t t3 (float64x2_t a)\n+{\n+  return vnegq_f64 (vabsq_f64 (a));\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n@@ -0,0 +1,31 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { 
check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#pragma GCC target \"+nosve\"\n+\n+#include <arm_neon.h>\n+#include <math.h>\n+\n+/*\n+** f1:\n+**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float32_t f1 (float32_t a)\n+{\n+  return -fabsf (a);\n+}\n+\n+/*\n+** f2:\n+**\tmov\tx0, -9223372036854775808\n+**\tfmov\td[0-9]+, x0\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float64_t f2 (float64_t a)\n+{\n+  return -fabs (a);\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n@@ -0,0 +1,36 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#pragma GCC target \"+nosve\"\n+\n+#include <arm_neon.h>\n+#include <math.h>\n+\n+/*\n+** f1:\n+**\t...\n+**\tldr\tq[0-9]+, \\[x0\\]\n+**\torr\tv[0-9]+.4s, #128, lsl #24\n+**\tstr\tq[0-9]+, \\[x0\\], 16\n+**\t...\n+*/\n+void f1 (float32_t *a, int n)\n+{\n+  for (int i = 0; i < (n & -8); i++)\n+   a[i] = -fabsf (a[i]);\n+}\n+\n+/*\n+** f2:\n+**\t...\n+**\tldr\tq[0-9]+, \\[x0\\]\n+**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n+**\tstr\tq[0-9]+, \\[x0\\], 16\n+**\t...\n+*/\n+void f2 (float64_t *a, int n)\n+{\n+  for (int i = 0; i < (n & -8); i++)\n+   a[i] = -fabs (a[i]);\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n@@ -0,0 +1,39 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } 
*/\n+\n+#pragma GCC target \"+nosve\"\n+\n+#include <string.h>\n+\n+/*\n+** negabs:\n+**\tmov\tx0, -9223372036854775808\n+**\tfmov\td[0-9]+, x0\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+double negabs (double x)\n+{\n+   unsigned long long y;\n+   memcpy (&y, &x, sizeof(double));\n+   y = y | (1UL << 63);\n+   memcpy (&x, &y, sizeof(double));\n+   return x;\n+}\n+\n+/*\n+** negabsf:\n+**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float negabsf (float x)\n+{\n+   unsigned int y;\n+   memcpy (&y, &x, sizeof(float));\n+   y = y | (1U << 31);\n+   memcpy (&x, &y, sizeof(float));\n+   return x;\n+}\n+\ndiff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n@@ -0,0 +1,37 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#include <arm_neon.h>\n+\n+/*\n+** t1:\n+**\torr\tv[0-9]+.2s, #128, lsl #24\n+**\tret\n+*/\n+float32x2_t t1 (float32x2_t a)\n+{\n+  return vneg_f32 (vabs_f32 (a));\n+}\n+\n+/*\n+** t2:\n+**\torr\tv[0-9]+.4s, #128, lsl #24\n+**\tret\n+*/\n+float32x4_t t2 (float32x4_t a)\n+{\n+  return vnegq_f32 (vabsq_f32 (a));\n+}\n+\n+/*\n+** t3:\n+**\tadrp\tx0, .LC[0-9]+\n+**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n+**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n+**\tret\n+*/\n+float64x2_t t3 (float64x2_t a)\n+{\n+  return vnegq_f64 (vabsq_f64 (a));\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n@@ -0,0 +1,29 
@@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#include <arm_neon.h>\n+#include <math.h>\n+\n+/*\n+** f1:\n+**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float32_t f1 (float32_t a)\n+{\n+  return -fabsf (a);\n+}\n+\n+/*\n+** f2:\n+**\tmov\tx0, -9223372036854775808\n+**\tfmov\td[0-9]+, x0\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float64_t f2 (float64_t a)\n+{\n+  return -fabs (a);\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n@@ -0,0 +1,34 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#include <arm_neon.h>\n+#include <math.h>\n+\n+/*\n+** f1:\n+**\t...\n+**\tld1w\tz[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n+**\torr\tz[0-9]+.s, z[0-9]+.s, #0x80000000\n+**\tst1w\tz[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n+**\t...\n+*/\n+void f1 (float32_t *a, int n)\n+{\n+  for (int i = 0; i < (n & -8); i++)\n+   a[i] = -fabsf (a[i]);\n+}\n+\n+/*\n+** f2:\n+**\t...\n+**\tld1d\tz[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n+**\torr\tz[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n+**\tst1d\tz[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n+**\t...\n+*/\n+void f2 (float64_t *a, int n)\n+{\n+  for (int i = 0; i < (n & -8); i++)\n+   a[i] = -fabs (a[i]);\n+}\ndiff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\nnew file mode 100644\nindex 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n@@ -0,0 +1,37 @@\n+/* { dg-do 
compile } */\n+/* { dg-options \"-O3\" } */\n+/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n+\n+#include <string.h>\n+\n+/*\n+** negabs:\n+**\tmov\tx0, -9223372036854775808\n+**\tfmov\td[0-9]+, x0\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+double negabs (double x)\n+{\n+   unsigned long long y;\n+   memcpy (&y, &x, sizeof(double));\n+   y = y | (1UL << 63);\n+   memcpy (&x, &y, sizeof(double));\n+   return x;\n+}\n+\n+/*\n+** negabsf:\n+**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n+**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n+**\tret\n+*/\n+float negabsf (float x)\n+{\n+   unsigned int y;\n+   memcpy (&y, &x, sizeof(float));\n+   y = y | (1U << 31);\n+   memcpy (&x, &y, sizeof(float));\n+   return x;\n+}\n+","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com\n header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com\n header.b=xd0PeRus;\n\tdkim=pass (1024-bit key) header.d=armh.onmicrosoft.com\n header.i=@armh.onmicrosoft.com header.a=rsa-sha256\n header.s=selector2-armh-onmicrosoft-com header.b=xd0PeRus;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 
bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S1fm54c1gz1yng\n\tfor <incoming@patchwork.ozlabs.org>; Fri,  6 Oct 2023 05:12:21 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 3A70438555AE\n\tfor <incoming@patchwork.ozlabs.org>; Thu,  5 Oct 2023 18:12:19 +0000 (GMT)","from EUR04-DB3-obe.outbound.protection.outlook.com\n (mail-db3eur04on2088.outbound.protection.outlook.com [40.107.6.88])\n by sourceware.org (Postfix) with ESMTPS id 3790A3858C35\n for <gcc-patches@gcc.gnu.org>; Thu,  5 Oct 2023 18:11:35 +0000 (GMT)","from DU2PR04CA0033.eurprd04.prod.outlook.com (2603:10a6:10:234::8)\n by AM9PR08MB6673.eurprd08.prod.outlook.com (2603:10a6:20b:307::22) with\n Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.35; Thu, 5 Oct\n 2023 18:11:31 +0000","from DBAEUR03FT029.eop-EUR03.prod.protection.outlook.com\n (2603:10a6:10:234:cafe::77) by DU2PR04CA0033.outlook.office365.com\n (2603:10a6:10:234::8) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.33 via Frontend\n Transport; Thu, 5 Oct 2023 18:11:31 +0000","from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by\n DBAEUR03FT029.mail.protection.outlook.com (100.127.142.181) with\n Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.6863.26 via Frontend Transport; Thu, 5 Oct 2023 18:11:31 +0000","(\"Tessian outbound 0ae75d4034ba:v211\");\n Thu, 05 Oct 2023 18:11:31 +0000","from 523abf92567d.1\n by 64aa7808-outbound-1.mta.getcheckrecipient.com id\n 441DB498-329A-45F9-9525-C2856CEFC47F.1;\n Thu, 05 Oct 2023 18:11:25 +0000","from EUR04-DB3-obe.outbound.protection.outlook.com\n by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id\n 523abf92567d.1\n (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384);\n Thu, 05 Oct 2023 18:11:25 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)\n by DB8PR08MB5529.eurprd08.prod.outlook.com (2603:10a6:10:115::22)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6838.35; Thu, 5 Oct\n 2023 18:11:23 +0000","from VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1]) by VI1PR08MB5325.eurprd08.prod.outlook.com\n ([fe80::662f:8e26:1bf8:aaa1%7]) with mapi id 15.20.6838.033; Thu, 5 Oct 2023\n 18:11:22 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 3790A3858C35","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=JlWySLAa+Up14jbtVwqE1mZpG4mLurOWo927gtt8BgE=;\n b=xd0PeRusSCxOS7YwehL3OVpcupypeKaS2lRuRk5oN1Py5jtdBRG5x1MkCGmcU4BQHEfzuXoZvjhk9RYfwwYXd3EIos40kWSXre9TBofRGU18cauRb/E4LGhMMpXXyZFUx8F/9Rfccz1wMsmdn3aetlltTyjPwQWKZqMq68TLyYk=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;\n s=selector2-armh-onmicrosoft-com;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=JlWySLAa+Up14jbtVwqE1mZpG4mLurOWo927gtt8BgE=;\n b=xd0PeRusSCxOS7YwehL3OVpcupypeKaS2lRuRk5oN1Py5jtdBRG5x1MkCGmcU4BQHEfzuXoZvjhk9RYfwwYXd3EIos40kWSXre9TBofRGU18cauRb/E4LGhMMpXXyZFUx8F/9Rfccz1wMsmdn3aetlltTyjPwQWKZqMq68TLyYk="],"X-MS-Exchange-Authentication-Results":"spf=pass (sender IP is 63.35.35.123)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;","Received-SPF":"Pass (protection.outlook.com: domain of arm.com designates\n 63.35.35.123 as permitted sender) receiver=protection.outlook.com;\n client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;\n 
pr=C","X-CheckRecipientChecked":"true","X-CR-MTA-CID":"2d3d68e253e4c4e4","X-CR-MTA-TID":"64aa7808","ARC-Seal":"i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;\n b=YokVwvaCa0xQ4YoRfF2mXPJJ6jvIjdXpKJR4GHBDrZbFlM38jtouwoh1hDgCZI8PdavF+ayFJIm3JSZhzN0q/tCnqa/bPAmDNebUoORAOHIb0aes0nV83tt2vHzQn3FLvuINjp04ksi+OmPfQ7UgVQUHkquqrKUmYMmL4vXfisCfuD/YTboka845tmR9eInolBE1MnZRNUYF4DgFCWm44umJ6T82dxeR0IQTVmhwa5ZaNIFksspE3RtV1fpZpYQ1w898i9acMpbpl5B93E48wAXgUjcxeQ/DF8e3PH7McFW8p7oFm92DHPatlq2OQL12qTmqK46aERRr7AEDHiVa7w==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector9901;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=JlWySLAa+Up14jbtVwqE1mZpG4mLurOWo927gtt8BgE=;\n b=H6kGYQpNigTOd7oVZCHmd5cOsVFBWpYu2u+w5zUw6ioQS/fbTLVdvwIcWB8Xst7FtUGoV14vPrFsEMZDwTGGEtiBxPb0czLCh2Gajf12wu4UEMWNWM7aueq8ZYxvMIPssAvM3/YjYSuLUlgMe6iQ5tPp0jPwcyXy/tmiL7VSDRoRHrfR3l64RzXkQSia7ZlkE89RxHX/w50snX6qPfsUY80D947EcmsS3Qfw3LzBZQJPD05yZYGZsei7oEinNMxYXO/qxEp8xz8Ie+n1vjWPYnbnZsG4h019BrE8gFWuGsgL0vzqfqll74y8bjC6Idf8gvAXDw+MperJdVDmIIqxKQ==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass\n header.d=arm.com; arc=none","From":"Tamar Christina <Tamar.Christina@arm.com>","To":"Richard Biener <rguenther@suse.de>","CC":"Andrew Pinski <pinskia@gmail.com>, \"gcc-patches@gcc.gnu.org\"\n <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>, \"jlaw@ventanamicro.com\"\n <jlaw@ventanamicro.com>","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Thread-Topic":"[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","Thread-Index":"\n AQHZ8NyYUy/9C+tZQUuY9xAIpvunf7At3wQAgAAUNTCAAE7DgIAAC3HAgAAbzuCAAAIRgIANIULQ","Date":"Thu, 5 Oct 2023 18:11:22 
+0000","Message-ID":"\n <VI1PR08MB532531FEB62DED3A5518BF0AFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>","In-Reply-To":"<nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>","Accept-Language":"en-US","Content-Language":"en-US","X-MS-Has-Attach":"yes","X-MS-TNEF-Correlator":"","Authentication-Results-Original":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","x-ms-traffictypediagnostic":"\n VI1PR08MB5325:EE_|DB8PR08MB5529:EE_|DBAEUR03FT029:EE_|AM9PR08MB6673:EE_","X-MS-Office365-Filtering-Correlation-Id":"188487cb-32e3-409e-9bb7-08dbc5ce80c6","x-checkrecipientrouted":"true","nodisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;","X-Microsoft-Antispam-Message-Info-Original":"\n 
XMd49xAnSevAkgeqH0ewdW2598QtavzqDSiJ8y8Wy2feQNj96JhvK1CYAXfT7dxpZVPOL7tWiLNCnhc6yJa0rsDklkhmfGKp3hM11ZkmGsJsdCNtOj8u6Rej5Rse7hkHo/5eqVJnnb9Ez/lJtSghdHGgODk/3o7xSWYtp50kJGvsuY2d7wIKeZ6UOs5rqtGY8iA70qgPmUQ0RIFXyNzqGKUVvlmifG5xvpnSMSk6fIBB7Vd1Gbu5X5paqgvEoE3wS3redQmFEXJwztoFxqboF0VfCyhcHZMH+buuEbcLPykHrb8/L1yb9e5nOW+zOtNv2kWSNKt7cMkJuItGIbjzxIz1aLvRqZGLyRUEXxEVCZ0+EM7iVQR/3XcBclpsyXIz9upbuHHBZicF/E5bs/HyfUl6K5MH+sdpINMMmEw31XrEroiGbnWq/wWZYWM/7ENSwwoVza+D+I2/2pcbK4fwLFTMZ8bMh1qWiomgDAaPXueqoDc4SMfSHoYBDg/xYPMwITiZSnDwyxRCrzvX66e/lqPcVXd7g1slf2kgLhgp7JuocSGiVgNyN8Yp4snhVDOSLZEaXOybxWgqxr46qUBqW2T5cAqi4NIHX+07V6IJOreyt5RbXrjPkjtxBTeMHg7z+Do8ik3w82zk3508kyVlP9dltQLxzoL0k1RHlmwvD2Y=","X-Forefront-Antispam-Report-Untrusted":"CIP:255.255.255.255; CTRY:; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com;\n PTR:; CAT:NONE;\n SFS:(13230031)(346002)(396003)(136003)(366004)(39860400002)(376002)(230922051799003)(1800799009)(64100799003)(186009)(451199024)(84970400001)(55016003)(9686003)(6506007)(7696005)(478600001)(38100700002)(38070700005)(33656002)(122000001)(99936003)(86362001)(30864003)(316002)(6916009)(2906002)(83380400001)(26005)(71200400001)(66946007)(41300700001)(76116006)(8676002)(8936002)(4326008)(66556008)(66476007)(66446008)(64756008)(54906003)(52536014)(5660300002)(357404004);\n DIR:OUT; SFP:1101;","Content-Type":"multipart/mixed;\n boundary=\"_002_VI1PR08MB532531FEB62DED3A5518BF0AFFCAAVI1PR08MB5325eurp_\"","MIME-Version":"1.0","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["DB8PR08MB5529","AM9PR08MB6673"],"Original-Authentication-Results":"dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=arm.com;","X-EOPAttributedMessage":"0","X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n DBAEUR03FT029.eop-EUR03.prod.protection.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n 
1fcb0172-2936-4ea2-7a82-08dbc5ce7b6a","X-Microsoft-Antispam":"BCL:0;","X-Microsoft-Antispam-Message-Info":"\n UtfYGZEpBKkbXRcI3+KldcvrbmN2DmNr5m1/cil0VSSlqJNfV4o/VYx6EOBwmfpmP6tcXvtYYapEOvasJ5m38VL9GLyUQqhdgJ75oYbslRLTx4RBw4usoCOJqpaeWyxG3ITxqVA+IOLpt7CF5OjQ3/iMuoToecRyKg2oRHMOzpt8fHq/KE8luWRIWO57A830d71IVpU9C1sGLRA4G8gWfVSGKsZMze4T3fkUb4msO6WHlTFsaMV8N8tdmU0J9206qbjCeS3ymy/qMoGUO7i/utvNHM2M3EmqwegyUDJMeFWy1NZQ1VW067+JEpYq3uPffe8HlPes7HBNiu7hktK1WQDWx2+3N52pvu+6QvErdmK0pY9HutMQnxKR5cbOd9XxN3g6IkVxaCQXlZ0MJPc/1RlQFjgzjgN3YiAotKuSUtnTnwO6O5r5ptEz8icxY3hbQNac6jhs4MUz+r4EJHKT6hw+SrcODTAEAARIS5mF3sehBzQaWEHEzs0W2pYDpfURD8vDAoiFDfZHzYYHHX742zYb+CQsKXfIXHtGnJnLZS1NMeFN7ErAd4MBCjuUITK8olkJKwbtiS8MX9W/cS7YIdQh1WZ9DhDvn+t5ka268mojspbjTp0o9Y+FzXgvPHYHVPAOyOgBPGwY+wmWoKbPL6Rw8GUqsp2/xHXkEiq4ZNUO2Tvxb2JgabGdFNZtqzHXDgnasaE3JOWBTLK8AtVb36L1C5liNk/1JLjwBS7EIcA9rrfRP1+LcZUDBrHAtr3P4Ip7uqzCFlhIsm52LKG+f8GYuFTnshZpbzitt/+Wgvc=","X-Forefront-Antispam-Report":"CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:;\n IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com;\n PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE;\n SFS:(13230031)(4636009)(346002)(39860400002)(136003)(376002)(396003)(230922051799003)(186009)(82310400011)(1800799009)(451199024)(64100799003)(36840700001)(40470700004)(46966006)(316002)(55016003)(70206006)(40480700001)(54906003)(70586007)(47076005)(40460700003)(36860700001)(5660300002)(52536014)(99936003)(81166007)(2906002)(33656002)(86362001)(235185007)(82740400003)(356005)(30864003)(41300700001)(4326008)(6862004)(8936002)(6506007)(7696005)(84970400001)(9686003)(83380400001)(336012)(26005)(8676002)(107886003)(478600001)(357404004);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"05 Oct 2023 18:11:31.7491 (UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 
188487cb-32e3-409e-9bb7-08dbc5ce80c6","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123];\n Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n DBAEUR03FT029.eop-EUR03.prod.protection.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-Spam-Status":"No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH,\n KAM_SHORT, RCVD_IN_DNSWL_BLOCKED, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, TXREP,\n T_SPF_TEMPERROR, UNPARSEABLE_RELAY,\n URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3194098,"web_url":"http://patchwork.ozlabs.org/comment/3194098/","msgid":"<nycvar.YFH.7.77.849.2310060558560.5561@jbgna.fhfr.qr>","list_archive_url":null,"date":"2023-10-06T06:24:31","subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard 
Biener","email":"rguenther@suse.de"},"content":"On Thu, 5 Oct 2023, Tamar Christina wrote:\n\n> > I suppose the idea is that -abs(x) might be easier to optimize with other\n> > patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n> > \n> > For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n> > canonical than copysign.\n> > \n> > > Should I try removing this?\n> > \n> > I'd say yes (and put the reverse canonicalization next to this pattern).\n> > \n> \n> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n> canonical and allows a target to expand this sequence efficiently.  Such\n> sequences are common in scientific code working with gradients.\n> \n> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n> which means they exclude IFN_COPYSIGN.  COPYSIGN however is restricted to only\n\nThat's not true:\n\n(define_operator_list COPYSIGN\n    BUILT_IN_COPYSIGNF\n    BUILT_IN_COPYSIGN\n    BUILT_IN_COPYSIGNL\n    IFN_COPYSIGN)\n\nbut they miss the extended float builtin variants like\n__builtin_copysignf16.  Also see below\n\n> the C99 builtins and so doesn't work for vectors.\n> \n> The patch expands these optimizations to work on COPYSIGN_ALL.\n> \n> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n> which I remove since this is a less efficient form.  
The testsuite is also\n> updated in light of this.\n> \n> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> \n> Ok for master?\n> \n> Thanks,\n> Tamar\n> \n> gcc/ChangeLog:\n> \n> \tPR tree-optimization/109154\n> \t* match.pd: Add new neg+abs rule, remove inverse copysign rule and\n> \texpand existing copysign optimizations.\n> \n> gcc/testsuite/ChangeLog:\n> \n> \tPR tree-optimization/109154\n> \t* gcc.dg/fold-copysign-1.c: Updated.\n> \t* gcc.dg/pr55152-2.c: Updated.\n> \t* gcc.dg/tree-ssa/abs-4.c: Updated.\n> \t* gcc.dg/tree-ssa/backprop-6.c: Updated.\n> \t* gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n> \t* gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n> \t* gcc.target/aarch64/fneg-abs_1.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_2.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_3.c: New test.\n> \t* gcc.target/aarch64/fneg-abs_4.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> \t* gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> \n> --- inline copy of patch ---\n> \n> diff --git a/gcc/match.pd b/gcc/match.pd\n> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n> --- a/gcc/match.pd\n> +++ b/gcc/match.pd\n> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n>  \n>  /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n>  (for coss (COS COSH)\n> -     copysigns (COPYSIGN)\n> - (simplify\n> -  (coss (copysigns @0 @1))\n> -   (coss @0)))\n> + (for copysigns (COPYSIGN_ALL)\n\nSo this ends up generating for example the match\n(cosf (copysignl ...)) which doesn't make much sense.\n\nThe lock-step iteration did\n(cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\nwhich is leaner but misses the case of\n(cosf (ifn_copysign ..)) - that's probably what you are\nafter with this change.\n\nThat said, there isn't a nice solution (without altering the match.pd\nIL).  
There's the explicit solution, spelling out all combinations.\n\nSo if we want to go with your pragmatic solution changing this\nto use COPYSIGN_ALL isn't necessary, only changing the lock-step\nfor iteration to a cross product for iteration is.\n\nChanging just this pattern to\n\n(for coss (COS COSH)\n (for copysigns (COPYSIGN)\n  (simplify\n   (coss (copysigns @0 @1))\n   (coss @0))))\n\nincreases the total number of gimple-match-x.cc lines from\n234988 to 235324.\n\nThe alternative is to do\n\n(for coss (COS COSH)\n     copysigns (COPYSIGN)\n (simplify\n  (coss (copysigns @0 @1))\n   (coss @0))\n (simplify\n  (coss (IFN_COPYSIGN @0 @1))\n   (coss @0)))\n\nwhich properly will diagnose a duplicate pattern.  There are\ncurrently no operator lists with just builtins defined (that\ncould be fixed, see gencfn-macros.cc), suppose we had\nCOS_C we could do\n\n(for coss (COS_C COSH_C IFN_COS IFN_COSH)\n     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN \nIFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN \nIFN_COPYSIGN)\n (simplify\n  (coss (copysigns @0 @1))\n   (coss @0)))\n\nwhich of course still looks ugly ;) (some syntax extension like\nallowing to specify IFN_COPYSIGN*8 would be nice here and easy\nenough to do)\n\nCan you split out the part changing COPYSIGN to COPYSIGN_ALL,\nre-do it to only split the fors, keeping COPYSIGN and provide\nsome statistics on the gimple-match-* size?  I think this might\nbe the pragmatic solution for now.\n\nRichard - can you think of a clever way to express the desired\niteration?  How do RTL macro iterations address cases like this?\n\nRichard.\n\n> +  (simplify\n> +   (coss (copysigns @0 @1))\n> +    (coss @0))))\n>  \n>  /* pow(copysign(x, y), z) -> pow(x, z) if z is an even integer.  
*/\n>  (for pows (POW)\n> -     copysigns (COPYSIGN)\n> - (simplify\n> -  (pows (copysigns @0 @2) REAL_CST@1)\n> -  (with { HOST_WIDE_INT n; }\n> -   (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n> -    (pows @0 @1)))))\n> + (for copysigns (COPYSIGN_ALL)\n> +  (simplify\n> +   (pows (copysigns @0 @2) REAL_CST@1)\n> +   (with { HOST_WIDE_INT n; }\n> +    (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n> +     (pows @0 @1))))))\n>  /* Likewise for powi.  */\n>  (for pows (POWI)\n> -     copysigns (COPYSIGN)\n> - (simplify\n> -  (pows (copysigns @0 @2) INTEGER_CST@1)\n> -  (if ((wi::to_wide (@1) & 1) == 0)\n> -   (pows @0 @1))))\n> + (for copysigns (COPYSIGN_ALL)\n> +  (simplify\n> +   (pows (copysigns @0 @2) INTEGER_CST@1)\n> +   (if ((wi::to_wide (@1) & 1) == 0)\n> +    (pows @0 @1)))))\n>  \n>  (for hypots (HYPOT)\n> -     copysigns (COPYSIGN)\n> - /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n> - (simplify\n> -  (hypots (copysigns @0 @1) @2)\n> -  (hypots @0 @2))\n> - /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n> - (simplify\n> -  (hypots @0 (copysigns @1 @2))\n> -  (hypots @0 @1)))\n> + (for copysigns (COPYSIGN)\n> +  /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n> +  (simplify\n> +   (hypots (copysigns @0 @1) @2)\n> +   (hypots @0 @2))\n> +  /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n> +  (simplify\n> +   (hypots @0 (copysigns @1 @2))\n> +   (hypots @0 @1))))\n>  \n> -/* copysign(x, CST) -> [-]abs (x).  */\n> -(for copysigns (COPYSIGN_ALL)\n> - (simplify\n> -  (copysigns @0 REAL_CST@1)\n> -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n> -   (negate (abs @0))\n> -   (abs @0))))\n> +/* Transform fneg (fabs (X)) -> copysign (X, -1).  */\n> +\n> +(simplify\n> + (negate (abs @0))\n> + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))\n>  \n>  /* copysign(copysign(x, y), z) -> copysign(x, z).  
*/\n>  (for copysigns (COPYSIGN_ALL)\n> diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n> index f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644\n> --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c\n> +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n> @@ -12,5 +12,5 @@ double bar (double x)\n>    return __builtin_copysign (x, minuszero);\n>  }\n>  \n> -/* { dg-final { scan-tree-dump-times \"= -\" 1 \"cddce1\" } } */\n> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 2 \"cddce1\" } } */\n> +/* { dg-final { scan-tree-dump-times \"__builtin_copysign\" 1 \"cddce1\" } } */\n> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"cddce1\" } } */\n> diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c\n> index 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644\n> --- a/gcc/testsuite/gcc.dg/pr55152-2.c\n> +++ b/gcc/testsuite/gcc.dg/pr55152-2.c\n> @@ -10,4 +10,5 @@ int f(int a)\n>    return (a<-a)?a:-a;\n>  }\n>  \n> -/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 2 \"optimized\" } } */\n> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 1 \"optimized\" } } */\n> +/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 1 \"optimized\" } } */\n> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n> index 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644\n> --- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n> +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n> @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }\n>  \n>  /* __builtin_signbit(x) ? x : -x. 
Should be convert into - ABS_EXP<x> */\n>  /* { dg-final { scan-tree-dump-not \"signbit\" \"optimized\"} } */\n> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 3 \"optimized\"} } */\n> -/* { dg-final { scan-tree-dump-times \"= -\" 3 \"optimized\"} } */\n> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"optimized\"} } */\n> +/* { dg-final { scan-tree-dump-times \"= -\" 1 \"optimized\"} } */\n> +/* { dg-final { scan-tree-dump-times \"= \\.COPYSIGN\" 2 \"optimized\"} } */\n> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n> index 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644\n> --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n> +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n> @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)\n>  TEST_FUNCTION (double, )\n>  TEST_FUNCTION (long double, l)\n>  \n> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 6 \"backprop\" } } */\n> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 3 \"backprop\" } } */\n> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 4 \"backprop\" } } */\n> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = \\.COPYSIGN} 2 \"backprop\" } } */\n> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 1 \"backprop\" } } */\n> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n> index de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644\n> --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n> +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n> @@ -10,4 +10,5 @@ float f1(float x)\n>    float t = __builtin_copysignf (1.0f, -x);\n>    return x * t;\n>  }\n> -/* { dg-final { scan-tree-dump-times \"ABS\" 2 \"optimized\"} } */\n> +/* { dg-final { scan-tree-dump-times \"ABS\" 1 \"optimized\"} } */\n> +/* { dg-final { scan-tree-dump-times \".COPYSIGN\" 1 
\"optimized\"} } */\n> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n> index a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644\n> --- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n> +++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n> @@ -34,4 +34,5 @@ float i1(float x)\n>  {\n>    return x * (x <= 0.f ? 1.f : -1.f);\n>  }\n> -/* { dg-final { scan-tree-dump-times \"ABS\" 8 \"gimple\"} } */\n> +/* { dg-final { scan-tree-dump-times \"ABS\" 4 \"gimple\"} } */\n> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 4 \"gimple\"} } */\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n> @@ -0,0 +1,39 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <arm_neon.h>\n> +\n> +/*\n> +** t1:\n> +**\torr\tv[0-9]+.2s, #128, lsl #24\n> +**\tret\n> +*/\n> +float32x2_t t1 (float32x2_t a)\n> +{\n> +  return vneg_f32 (vabs_f32 (a));\n> +}\n> +\n> +/*\n> +** t2:\n> +**\torr\tv[0-9]+.4s, #128, lsl #24\n> +**\tret\n> +*/\n> +float32x4_t t2 (float32x4_t a)\n> +{\n> +  return vnegq_f32 (vabsq_f32 (a));\n> +}\n> +\n> +/*\n> +** t3:\n> +**\tadrp\tx0, .LC[0-9]+\n> +**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n> +**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**\tret\n> +*/\n> +float64x2_t t3 (float64x2_t a)\n> +{\n> +  return vnegq_f64 (vabsq_f64 (a));\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1\n> --- /dev/null\n> +++ 
b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n> @@ -0,0 +1,31 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float32_t f1 (float32_t a)\n> +{\n> +  return -fabsf (a);\n> +}\n> +\n> +/*\n> +** f2:\n> +**\tmov\tx0, -9223372036854775808\n> +**\tfmov\td[0-9]+, x0\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float64_t f2 (float64_t a)\n> +{\n> +  return -fabs (a);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n> @@ -0,0 +1,36 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**\t...\n> +**\tldr\tq[0-9]+, \\[x0\\]\n> +**\torr\tv[0-9]+.4s, #128, lsl #24\n> +**\tstr\tq[0-9]+, \\[x0\\], 16\n> +**\t...\n> +*/\n> +void f1 (float32_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabsf (a[i]);\n> +}\n> +\n> +/*\n> +** f2:\n> +**\t...\n> +**\tldr\tq[0-9]+, \\[x0\\]\n> +**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**\tstr\tq[0-9]+, \\[x0\\], 16\n> +**\t...\n> +*/\n> +void f2 (float64_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabs (a[i]);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> new file mode 100644\n> index 
0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n> @@ -0,0 +1,39 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#pragma GCC target \"+nosve\"\n> +\n> +#include <string.h>\n> +\n> +/*\n> +** negabs:\n> +**\tmov\tx0, -9223372036854775808\n> +**\tfmov\td[0-9]+, x0\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +double negabs (double x)\n> +{\n> +   unsigned long long y;\n> +   memcpy (&y, &x, sizeof(double));\n> +   y = y | (1UL << 63);\n> +   memcpy (&x, &y, sizeof(double));\n> +   return x;\n> +}\n> +\n> +/*\n> +** negabsf:\n> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float negabsf (float x)\n> +{\n> +   unsigned int y;\n> +   memcpy (&y, &x, sizeof(float));\n> +   y = y | (1U << 31);\n> +   memcpy (&x, &y, sizeof(float));\n> +   return x;\n> +}\n> +\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n> @@ -0,0 +1,37 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +\n> +/*\n> +** t1:\n> +**\torr\tv[0-9]+.2s, #128, lsl #24\n> +**\tret\n> +*/\n> +float32x2_t t1 (float32x2_t a)\n> +{\n> +  return vneg_f32 (vabs_f32 (a));\n> +}\n> +\n> +/*\n> +** t2:\n> +**\torr\tv[0-9]+.4s, #128, lsl #24\n> +**\tret\n> +*/\n> +float32x4_t t2 (float32x4_t a)\n> +{\n> +  return vnegq_f32 (vabsq_f32 (a));\n> +}\n> +\n> +/*\n> +** t3:\n> +**\tadrp\tx0, .LC[0-9]+\n> +**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n> 
+**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n> +**\tret\n> +*/\n> +float64x2_t t3 (float64x2_t a)\n> +{\n> +  return vnegq_f64 (vabsq_f64 (a));\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n> @@ -0,0 +1,29 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float32_t f1 (float32_t a)\n> +{\n> +  return -fabsf (a);\n> +}\n> +\n> +/*\n> +** f2:\n> +**\tmov\tx0, -9223372036854775808\n> +**\tfmov\td[0-9]+, x0\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float64_t f2 (float64_t a)\n> +{\n> +  return -fabs (a);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n> @@ -0,0 +1,34 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <arm_neon.h>\n> +#include <math.h>\n> +\n> +/*\n> +** f1:\n> +**\t...\n> +**\tld1w\tz[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n> +**\torr\tz[0-9]+.s, z[0-9]+.s, #0x80000000\n> +**\tst1w\tz[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n> +**\t...\n> +*/\n> +void f1 (float32_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabsf (a[i]);\n> +}\n> +\n> +/*\n> +** f2:\n> +**\t...\n> 
+**\tld1d\tz[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n> +**\torr\tz[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n> +**\tst1d\tz[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n> +**\t...\n> +*/\n> +void f2 (float64_t *a, int n)\n> +{\n> +  for (int i = 0; i < (n & -8); i++)\n> +   a[i] = -fabs (a[i]);\n> +}\n> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> new file mode 100644\n> index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494\n> --- /dev/null\n> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n> @@ -0,0 +1,37 @@\n> +/* { dg-do compile } */\n> +/* { dg-options \"-O3\" } */\n> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n> +\n> +#include <string.h>\n> +\n> +/*\n> +** negabs:\n> +**\tmov\tx0, -9223372036854775808\n> +**\tfmov\td[0-9]+, x0\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +double negabs (double x)\n> +{\n> +   unsigned long long y;\n> +   memcpy (&y, &x, sizeof(double));\n> +   y = y | (1UL << 63);\n> +   memcpy (&x, &y, sizeof(double));\n> +   return x;\n> +}\n> +\n> +/*\n> +** negabsf:\n> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n> +**\tret\n> +*/\n> +float negabsf (float x)\n> +{\n> +   unsigned int y;\n> +   memcpy (&y, &x, sizeof(float));\n> +   y = y | (1U << 31);\n> +   memcpy (&x, &y, sizeof(float));\n> +   return x;\n> +}\n> +\n>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=AVb15K7x;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n 
header.s=susede2_ed25519 header.b=zAhUOaDm;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (ip-8-43-85-97.sourceware.org\n [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S1z1F3ywdz1yqF\n\tfor <incoming@patchwork.ozlabs.org>; Fri,  6 Oct 2023 17:24:47 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 1E320385735A\n\tfor <incoming@patchwork.ozlabs.org>; Fri,  6 Oct 2023 06:24:45 +0000 (GMT)","from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])\n by sourceware.org (Postfix) with ESMTPS id 1A6D43858D3C\n for <gcc-patches@gcc.gnu.org>; Fri,  6 Oct 2023 06:24:32 +0000 (GMT)","from relay2.suse.de (relay2.suse.de [149.44.160.134])\n by smtp-out2.suse.de (Postfix) with ESMTP id 501A21F74C;\n Fri,  6 Oct 2023 06:24:31 +0000 (UTC)","from wotan.suse.de (wotan.suse.de [10.160.0.1])\n (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n (No client certificate requested)\n by relay2.suse.de (Postfix) with ESMTPS id 1E77F2C142;\n Fri,  6 Oct 2023 06:24:31 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 1A6D43858D3C","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1696573471;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n 
bh=kZmAsuACwih3LrwnamrhWQf3MuvJVHM8ILH9MkInn7s=;\n b=AVb15K7xVx0sSYkrI94llt6+vGqk80sIPT62Co5jVKzdqx9rY4dDPC9HHNddRKib0FZfyb\n 3u0h9bG7jo/lv6LRfb6pP2BQnAZtUmQfsG6fc7cwp7bKr8LNAsztlg9sJVfokIndXziyN2\n LTqY3T3ZFXWmAYIXlu0wucXhREgRCJM=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1696573471;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=kZmAsuACwih3LrwnamrhWQf3MuvJVHM8ILH9MkInn7s=;\n b=zAhUOaDmzzBM8fwIfrdm/t8LbbH3nVp1zgR4c4GRm1VohkxLJri477QXQupC6o2XCe40/E\n q4hc09L84sWC0YBQ=="],"Date":"Fri, 6 Oct 2023 06:24:31 +0000 (UTC)","From":"Richard Biener <rguenther@suse.de>","To":"Tamar Christina <Tamar.Christina@arm.com>","cc":"Andrew Pinski <pinskia@gmail.com>,\n \"gcc-patches@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"jlaw@ventanamicro.com\" <jlaw@ventanamicro.com>, richard.sandiford@arm.com","Subject":"RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","In-Reply-To":"\n <VI1PR08MB532531FEB62DED3A5518BF0AFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","Message-ID":"<nycvar.YFH.7.77.849.2310060558560.5561@jbgna.fhfr.qr>","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>\n <VI1PR08MB532531FEB62DED3A5518BF0AFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>","User-Agent":"Alpine 2.22 (LSU 394 2020-01-19)","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","X-Spam-Status":"No, score=-10.9 required=5.0 tests=BAYES_00, 
DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_LOTSOFHASH,\n KAM_SHORT, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3194772,"web_url":"http://patchwork.ozlabs.org/comment/3194772/","msgid":"<mpt1qe6lrbr.fsf@arm.com>","list_archive_url":null,"date":"2023-10-07T09:22:48","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":64746,"url":"http://patchwork.ozlabs.org/api/people/64746/","name":"Richard Sandiford","email":"richard.sandiford@arm.com"},"content":"Richard Biener <rguenther@suse.de> writes:\n> On Thu, 5 Oct 2023, Tamar Christina wrote:\n>\n>> > I suppose the idea is that -abs(x) might be easier to optimize with other\n>> > patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n>> > \n>> > For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n>> > canonical than copysign.\n>> > \n>> > > Should I try removing this?\n>> > \n>> > I'd say yes (and put the reverse canonicalization next to this pattern).\n>> > \n>> \n>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n>> canonical and allows a target to expand this 
sequence efficiently.  Such\n>> sequences are common in scientific code working with gradients.\n>> \n>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n>> which means they exclude IFN_COPYSIGN.  COPYSIGN however is restricted to only\n>\n> That's not true:\n>\n> (define_operator_list COPYSIGN\n>     BUILT_IN_COPYSIGNF\n>     BUILT_IN_COPYSIGN\n>     BUILT_IN_COPYSIGNL\n>     IFN_COPYSIGN)\n>\n> but they miss the extended float builtin variants like\n> __builtin_copysignf16.  Also see below\n>\n>> the C99 builtins and so doesn't work for vectors.\n>> \n>> The patch expands these optimizations to work on COPYSIGN_ALL.\n>> \n>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n>> which I remove since this is a less efficient form.  The testsuite is also\n>> updated in light of this.\n>> \n>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n>> \n>> Ok for master?\n>> \n>> Thanks,\n>> Tamar\n>> \n>> gcc/ChangeLog:\n>> \n>> \tPR tree-optimization/109154\n>> \t* match.pd: Add new neg+abs rule, remove inverse copysign rule and\n>> \texpand existing copysign optimizations.\n>> \n>> gcc/testsuite/ChangeLog:\n>> \n>> \tPR tree-optimization/109154\n>> \t* gcc.dg/fold-copysign-1.c: Updated.\n>> \t* gcc.dg/pr55152-2.c: Updated.\n>> \t* gcc.dg/tree-ssa/abs-4.c: Updated.\n>> \t* gcc.dg/tree-ssa/backprop-6.c: Updated.\n>> \t* gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n>> \t* gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n>> \t* gcc.target/aarch64/fneg-abs_1.c: New test.\n>> \t* gcc.target/aarch64/fneg-abs_2.c: New test.\n>> \t* gcc.target/aarch64/fneg-abs_3.c: New test.\n>> \t* gcc.target/aarch64/fneg-abs_4.c: New test.\n>> \t* gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n>> \t* gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n>> \t* gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n>> \t* gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n>> \n>> --- inline copy of patch ---\n>> \n>> diff --git a/gcc/match.pd 
b/gcc/match.pd\n>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n>> --- a/gcc/match.pd\n>> +++ b/gcc/match.pd\n>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n>>  \n>>  /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n>>  (for coss (COS COSH)\n>> -     copysigns (COPYSIGN)\n>> - (simplify\n>> -  (coss (copysigns @0 @1))\n>> -   (coss @0)))\n>> + (for copysigns (COPYSIGN_ALL)\n>\n> So this ends up generating for example the match\n> (cosf (copysignl ...)) which doesn't make much sense.\n>\n> The lock-step iteration did\n> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n> which is leaner but misses the case of\n> (cosf (ifn_copysign ..)) - that's probably what you are\n> after with this change.\n>\n> That said, there isn't a nice solution (without altering the match.pd\n> IL).  There's the explicit solution, spelling out all combinations.\n>\n> So if we want to go with your pragmatic solution changing this\n> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n> for iteration to a cross product for iteration is.\n>\n> Changing just this pattern to\n>\n> (for coss (COS COSH)\n>  (for copysigns (COPYSIGN)\n>   (simplify\n>    (coss (copysigns @0 @1))\n>    (coss @0))))\n>\n> increases the total number of gimple-match-x.cc lines from\n> 234988 to 235324.\n\nI guess the difference between this and the later suggestions is that\nthis one allows builtin copysign to be paired with ifn cos, which would\nbe potentially useful in other situations.  (It isn't here because\nifn_cos is rarely provided.)  
How much of the growth is due to that,\nand how much of it is from nonsensical combinations like\n(builtin_cosf (builtin_copysignl ...))?\n\nIf it's mostly from nonsensical combinations then would it be possible\nto make genmatch drop them?\n\n> The alternative is to do\n>\n> (for coss (COS COSH)\n>      copysigns (COPYSIGN)\n>  (simplify\n>   (coss (copysigns @0 @1))\n>    (coss @0))\n>  (simplify\n>   (coss (IFN_COPYSIGN @0 @1))\n>    (coss @0)))\n>\n> which properly will diagnose a duplicate pattern.  There are\n> currently no operator lists with just builtins defined (that\n> could be fixed, see gencfn-macros.cc), supposed we'd have\n> COS_C we could do\n>\n> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n>      copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN \n> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN \n> IFN_COPYSIGN)\n>  (simplify\n>   (coss (copysigns @0 @1))\n>    (coss @0)))\n>\n> which of course still looks ugly ;) (some syntax extension like\n> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n> enough to do)\n>\n> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n> re-do it to only split the fors, keeping COPYSIGN and provide\n> some statistics on the gimple-match-* size?  I think this might\n> be the pragmatic solution for now.\n>\n> Richard - can you think of a clever way to express the desired\n> iteration?  How do RTL macro iterations address cases like this?\n\nI don't think .md files have an equivalent construct, unfortunately.\n(I also regret some of the choices I made for .md iterators, but that's\nanother story.)\n\nPerhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\nwith the \"...\" meaning \"fill to match the longest operator list\nin the loop\".\n\nThanks,\nRichard\n\n> Richard.\n>\n>> +  (simplify\n>> +   (coss (copysigns @0 @1))\n>> +    (coss @0))))\n>>  \n>>  /* pow(copysign(x, y), z) -> pow(x, z) if z is an even integer.  
*/\n>>  (for pows (POW)\n>> -     copysigns (COPYSIGN)\n>> - (simplify\n>> -  (pows (copysigns @0 @2) REAL_CST@1)\n>> -  (with { HOST_WIDE_INT n; }\n>> -   (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n>> -    (pows @0 @1)))))\n>> + (for copysigns (COPYSIGN_ALL)\n>> +  (simplify\n>> +   (pows (copysigns @0 @2) REAL_CST@1)\n>> +   (with { HOST_WIDE_INT n; }\n>> +    (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n>> +     (pows @0 @1))))))\n>>  /* Likewise for powi.  */\n>>  (for pows (POWI)\n>> -     copysigns (COPYSIGN)\n>> - (simplify\n>> -  (pows (copysigns @0 @2) INTEGER_CST@1)\n>> -  (if ((wi::to_wide (@1) & 1) == 0)\n>> -   (pows @0 @1))))\n>> + (for copysigns (COPYSIGN_ALL)\n>> +  (simplify\n>> +   (pows (copysigns @0 @2) INTEGER_CST@1)\n>> +   (if ((wi::to_wide (@1) & 1) == 0)\n>> +    (pows @0 @1)))))\n>>  \n>>  (for hypots (HYPOT)\n>> -     copysigns (COPYSIGN)\n>> - /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n>> - (simplify\n>> -  (hypots (copysigns @0 @1) @2)\n>> -  (hypots @0 @2))\n>> - /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n>> - (simplify\n>> -  (hypots @0 (copysigns @1 @2))\n>> -  (hypots @0 @1)))\n>> + (for copysigns (COPYSIGN)\n>> +  /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n>> +  (simplify\n>> +   (hypots (copysigns @0 @1) @2)\n>> +   (hypots @0 @2))\n>> +  /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n>> +  (simplify\n>> +   (hypots @0 (copysigns @1 @2))\n>> +   (hypots @0 @1))))\n>>  \n>> -/* copysign(x, CST) -> [-]abs (x).  */\n>> -(for copysigns (COPYSIGN_ALL)\n>> - (simplify\n>> -  (copysigns @0 REAL_CST@1)\n>> -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n>> -   (negate (abs @0))\n>> -   (abs @0))))\n>> +/* Transform fneg (fabs (X)) -> copysign (X, -1).  */\n>> +\n>> +(simplify\n>> + (negate (abs @0))\n>> + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))\n>>  \n>>  /* copysign(copysign(x, y), z) -> copysign(x, z).  
*/\n>>  (for copysigns (COPYSIGN_ALL)\n>> diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>> index f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644\n>> --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>> +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>> @@ -12,5 +12,5 @@ double bar (double x)\n>>    return __builtin_copysign (x, minuszero);\n>>  }\n>>  \n>> -/* { dg-final { scan-tree-dump-times \"= -\" 1 \"cddce1\" } } */\n>> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 2 \"cddce1\" } } */\n>> +/* { dg-final { scan-tree-dump-times \"__builtin_copysign\" 1 \"cddce1\" } } */\n>> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"cddce1\" } } */\n>> diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c\n>> index 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644\n>> --- a/gcc/testsuite/gcc.dg/pr55152-2.c\n>> +++ b/gcc/testsuite/gcc.dg/pr55152-2.c\n>> @@ -10,4 +10,5 @@ int f(int a)\n>>    return (a<-a)?a:-a;\n>>  }\n>>  \n>> -/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 2 \"optimized\" } } */\n>> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 1 \"optimized\" } } */\n>> +/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 1 \"optimized\" } } */\n>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>> index 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644\n>> --- a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>> @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }\n>>  \n>>  /* __builtin_signbit(x) ? x : -x. 
Should be convert into - ABS_EXP<x> */\n>>  /* { dg-final { scan-tree-dump-not \"signbit\" \"optimized\"} } */\n>> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 3 \"optimized\"} } */\n>> -/* { dg-final { scan-tree-dump-times \"= -\" 3 \"optimized\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"optimized\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"= -\" 1 \"optimized\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"= \\.COPYSIGN\" 2 \"optimized\"} } */\n>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>> index 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644\n>> --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>> @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)\n>>  TEST_FUNCTION (double, )\n>>  TEST_FUNCTION (long double, l)\n>>  \n>> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 6 \"backprop\" } } */\n>> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 3 \"backprop\" } } */\n>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 4 \"backprop\" } } */\n>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = \\.COPYSIGN} 2 \"backprop\" } } */\n>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 1 \"backprop\" } } */\n>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>> index de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644\n>> --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>> @@ -10,4 +10,5 @@ float f1(float x)\n>>    float t = __builtin_copysignf (1.0f, -x);\n>>    return x * t;\n>>  }\n>> -/* { dg-final { scan-tree-dump-times \"ABS\" 2 \"optimized\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"ABS\" 1 \"optimized\"} } */\n>> +/* { dg-final { 
scan-tree-dump-times \".COPYSIGN\" 1 \"optimized\"} } */\n>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>> index a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644\n>> --- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>> @@ -34,4 +34,5 @@ float i1(float x)\n>>  {\n>>    return x * (x <= 0.f ? 1.f : -1.f);\n>>  }\n>> -/* { dg-final { scan-tree-dump-times \"ABS\" 8 \"gimple\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"ABS\" 4 \"gimple\"} } */\n>> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 4 \"gimple\"} } */\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n>> @@ -0,0 +1,39 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#pragma GCC target \"+nosve\"\n>> +\n>> +#include <arm_neon.h>\n>> +\n>> +/*\n>> +** t1:\n>> +**\torr\tv[0-9]+.2s, #128, lsl #24\n>> +**\tret\n>> +*/\n>> +float32x2_t t1 (float32x2_t a)\n>> +{\n>> +  return vneg_f32 (vabs_f32 (a));\n>> +}\n>> +\n>> +/*\n>> +** t2:\n>> +**\torr\tv[0-9]+.4s, #128, lsl #24\n>> +**\tret\n>> +*/\n>> +float32x4_t t2 (float32x4_t a)\n>> +{\n>> +  return vnegq_f32 (vabsq_f32 (a));\n>> +}\n>> +\n>> +/*\n>> +** t3:\n>> +**\tadrp\tx0, .LC[0-9]+\n>> +**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n>> +**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>> +**\tret\n>> +*/\n>> +float64x2_t t3 (float64x2_t a)\n>> +{\n>> +  return vnegq_f64 (vabsq_f64 (a));\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n>> new file mode 100644\n>> index 
0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n>> @@ -0,0 +1,31 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#pragma GCC target \"+nosve\"\n>> +\n>> +#include <arm_neon.h>\n>> +#include <math.h>\n>> +\n>> +/*\n>> +** f1:\n>> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float32_t f1 (float32_t a)\n>> +{\n>> +  return -fabsf (a);\n>> +}\n>> +\n>> +/*\n>> +** f2:\n>> +**\tmov\tx0, -9223372036854775808\n>> +**\tfmov\td[0-9]+, x0\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float64_t f2 (float64_t a)\n>> +{\n>> +  return -fabs (a);\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n>> @@ -0,0 +1,36 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#pragma GCC target \"+nosve\"\n>> +\n>> +#include <arm_neon.h>\n>> +#include <math.h>\n>> +\n>> +/*\n>> +** f1:\n>> +**\t...\n>> +**\tldr\tq[0-9]+, \\[x0\\]\n>> +**\torr\tv[0-9]+.4s, #128, lsl #24\n>> +**\tstr\tq[0-9]+, \\[x0\\], 16\n>> +**\t...\n>> +*/\n>> +void f1 (float32_t *a, int n)\n>> +{\n>> +  for (int i = 0; i < (n & -8); i++)\n>> +   a[i] = -fabsf (a[i]);\n>> +}\n>> +\n>> +/*\n>> +** f2:\n>> +**\t...\n>> +**\tldr\tq[0-9]+, \\[x0\\]\n>> +**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>> +**\tstr\tq[0-9]+, \\[x0\\], 16\n>> +**\t...\n>> +*/\n>> +void f2 (float64_t *a, int n)\n>> +{\n>> +  for (int i = 0; i < (n & -8); i++)\n>> +   a[i] = 
-fabs (a[i]);\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n>> @@ -0,0 +1,39 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#pragma GCC target \"+nosve\"\n>> +\n>> +#include <string.h>\n>> +\n>> +/*\n>> +** negabs:\n>> +**\tmov\tx0, -9223372036854775808\n>> +**\tfmov\td[0-9]+, x0\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +double negabs (double x)\n>> +{\n>> +   unsigned long long y;\n>> +   memcpy (&y, &x, sizeof(double));\n>> +   y = y | (1UL << 63);\n>> +   memcpy (&x, &y, sizeof(double));\n>> +   return x;\n>> +}\n>> +\n>> +/*\n>> +** negabsf:\n>> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float negabsf (float x)\n>> +{\n>> +   unsigned int y;\n>> +   memcpy (&y, &x, sizeof(float));\n>> +   y = y | (1U << 31);\n>> +   memcpy (&x, &y, sizeof(float));\n>> +   return x;\n>> +}\n>> +\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n>> @@ -0,0 +1,37 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#include <arm_neon.h>\n>> +\n>> +/*\n>> +** t1:\n>> +**\torr\tv[0-9]+.2s, #128, lsl #24\n>> +**\tret\n>> +*/\n>> +float32x2_t t1 (float32x2_t a)\n>> +{\n>> +  return vneg_f32 (vabs_f32 (a));\n>> +}\n>> +\n>> +/*\n>> +** t2:\n>> 
+**\torr\tv[0-9]+.4s, #128, lsl #24\n>> +**\tret\n>> +*/\n>> +float32x4_t t2 (float32x4_t a)\n>> +{\n>> +  return vnegq_f32 (vabsq_f32 (a));\n>> +}\n>> +\n>> +/*\n>> +** t3:\n>> +**\tadrp\tx0, .LC[0-9]+\n>> +**\tldr\tq[0-9]+, \\[x0, #:lo12:.LC0\\]\n>> +**\torr\tv[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>> +**\tret\n>> +*/\n>> +float64x2_t t3 (float64x2_t a)\n>> +{\n>> +  return vnegq_f64 (vabsq_f64 (a));\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n>> @@ -0,0 +1,29 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#include <arm_neon.h>\n>> +#include <math.h>\n>> +\n>> +/*\n>> +** f1:\n>> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float32_t f1 (float32_t a)\n>> +{\n>> +  return -fabsf (a);\n>> +}\n>> +\n>> +/*\n>> +** f2:\n>> +**\tmov\tx0, -9223372036854775808\n>> +**\tfmov\td[0-9]+, x0\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float64_t f2 (float64_t a)\n>> +{\n>> +  return -fabs (a);\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n>> @@ -0,0 +1,34 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#include <arm_neon.h>\n>> +#include <math.h>\n>> +\n>> +/*\n>> +** f1:\n>> +**\t...\n>> +**\tld1w\tz[0-9]+.s, 
p[0-9]+/z, \\[x0, x2, lsl 2\\]\n>> +**\torr\tz[0-9]+.s, z[0-9]+.s, #0x80000000\n>> +**\tst1w\tz[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n>> +**\t...\n>> +*/\n>> +void f1 (float32_t *a, int n)\n>> +{\n>> +  for (int i = 0; i < (n & -8); i++)\n>> +   a[i] = -fabsf (a[i]);\n>> +}\n>> +\n>> +/*\n>> +** f2:\n>> +**\t...\n>> +**\tld1d\tz[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n>> +**\torr\tz[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n>> +**\tst1d\tz[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n>> +**\t...\n>> +*/\n>> +void f2 (float64_t *a, int n)\n>> +{\n>> +  for (int i = 0; i < (n & -8); i++)\n>> +   a[i] = -fabs (a[i]);\n>> +}\n>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n>> new file mode 100644\n>> index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494\n>> --- /dev/null\n>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n>> @@ -0,0 +1,37 @@\n>> +/* { dg-do compile } */\n>> +/* { dg-options \"-O3\" } */\n>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>> +\n>> +#include <string.h>\n>> +\n>> +/*\n>> +** negabs:\n>> +**\tmov\tx0, -9223372036854775808\n>> +**\tfmov\td[0-9]+, x0\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +double negabs (double x)\n>> +{\n>> +   unsigned long long y;\n>> +   memcpy (&y, &x, sizeof(double));\n>> +   y = y | (1UL << 63);\n>> +   memcpy (&x, &y, sizeof(double));\n>> +   return x;\n>> +}\n>> +\n>> +/*\n>> +** negabsf:\n>> +**\tmovi\tv[0-9]+.2s, 0x80, lsl 24\n>> +**\torr\tv[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>> +**\tret\n>> +*/\n>> +float negabsf (float x)\n>> +{\n>> +   unsigned int y;\n>> +   memcpy (&y, &x, sizeof(float));\n>> +   y = y | (1U << 31);\n>> +   memcpy (&x, &y, sizeof(float));\n>> +   return x;\n>> +}\n>> 
+\n>>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S2fwY5s5Rz1yq7\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 20:23:07 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id BB4873856DC6\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 09:23:05 +0000 (GMT)","from foss.arm.com (foss.arm.com [217.140.110.172])\n by sourceware.org (Postfix) with ESMTP id 36C813858D37\n for <gcc-patches@gcc.gnu.org>; Sat,  7 Oct 2023 09:22:51 +0000 (GMT)","from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])\n by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 48851C15;\n Sat,  7 Oct 2023 02:23:30 -0700 (PDT)","from localhost (unknown [10.32.110.65])\n by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AB1523F762;\n Sat,  7 Oct 2023 02:22:49 -0700 (PDT)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 36C813858D37","From":"Richard Sandiford <richard.sandiford@arm.com>","To":"Richard Biener 
<rguenther@suse.de>","Mail-Followup-To":"Richard Biener <rguenther@suse.de>,\n Tamar Christina <Tamar.Christina@arm.com>, Andrew Pinski <pinskia@gmail.com>,\n \"gcc-patches\\@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"jlaw\\@ventanamicro.com\" <jlaw@ventanamicro.com>, richard.sandiford@arm.com","Cc":"Tamar Christina <Tamar.Christina@arm.com>,\n Andrew Pinski <pinskia@gmail.com>,\n \"gcc-patches\\@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>, nd <nd@arm.com>,\n \"jlaw\\@ventanamicro.com\" <jlaw@ventanamicro.com>","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","References":"<patch-17718-tamar@arm.com>\n <CA+=Sn1kbO1OkC_1oMJi8uH8bajmGn07A+F6nvb6dGKBRcR8S3Q@mail.gmail.com>\n <VI1PR08MB5325CC904A863DB87F17CB88FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270710120.5561@jbgna.fhfr.qr>\n <VI1PR08MB532509805D977DE375DE3618FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <VI1PR08MB5325EF37073EFC87B0574525FFC2A@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2309270936320.5561@jbgna.fhfr.qr>\n <VI1PR08MB532531FEB62DED3A5518BF0AFFCAA@VI1PR08MB5325.eurprd08.prod.outlook.com>\n <nycvar.YFH.7.77.849.2310060558560.5561@jbgna.fhfr.qr>","Date":"Sat, 07 Oct 2023 10:22:48 +0100","In-Reply-To":"<nycvar.YFH.7.77.849.2310060558560.5561@jbgna.fhfr.qr> (Richard\n Biener's message of \"Fri, 6 Oct 2023 06:24:31 +0000 (UTC)\")","Message-ID":"<mpt1qe6lrbr.fsf@arm.com>","User-Agent":"Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)","MIME-Version":"1.0","Content-Type":"text/plain","X-Spam-Status":"No, score=-24.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,\n KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH,\n KAM_SHORT, SPF_HELO_NONE, SPF_NONE,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n 
server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3194789,"web_url":"http://patchwork.ozlabs.org/comment/3194789/","msgid":"<E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de>","list_archive_url":null,"date":"2023-10-07T10:34:43","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard Biener","email":"rguenther@suse.de"},"content":"> Am 07.10.2023 um 11:23 schrieb Richard Sandiford <richard.sandiford@arm.com>:\n> \n> ﻿Richard Biener <rguenther@suse.de> writes:\n>> On Thu, 5 Oct 2023, Tamar Christina wrote:\n>> \n>>>> I suppose the idea is that -abs(x) might be easier to optimize with other\n>>>> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n>>>> \n>>>> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n>>>> canonical than copysign.\n>>>> \n>>>>> Should I try removing this?\n>>>> \n>>>> I'd say yes (and put the reverse canonicalization next to this pattern).\n>>>> \n>>> \n>>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n>>> canonical and allows a target to expand this sequence efficiently.  
Such\n>>> sequences are common in scientific code working with gradients.\n>>> \n>>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n>>> which means they exclude IFN_COPYSIGN.  COPYSIGN however is restricted to only\n>> \n>> That's not true:\n>> \n>> (define_operator_list COPYSIGN\n>>    BUILT_IN_COPYSIGNF\n>>    BUILT_IN_COPYSIGN\n>>    BUILT_IN_COPYSIGNL\n>>    IFN_COPYSIGN)\n>> \n>> but they miss the extended float builtin variants like\n>> __builtin_copysignf16.  Also see below\n>> \n>>> the C99 builtins and so doesn't work for vectors.\n>>> \n>>> The patch expands these optimizations to work on COPYSIGN_ALL.\n>>> \n>>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n>>> which I remove since this is a less efficient form.  The testsuite is also\n>>> updated in light of this.\n>>> \n>>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n>>> \n>>> Ok for master?\n>>> \n>>> Thanks,\n>>> Tamar\n>>> \n>>> gcc/ChangeLog:\n>>> \n>>>    PR tree-optimization/109154\n>>>    * match.pd: Add new neg+abs rule, remove inverse copysign rule and\n>>>    expand existing copysign optimizations.\n>>> \n>>> gcc/testsuite/ChangeLog:\n>>> \n>>>    PR tree-optimization/109154\n>>>    * gcc.dg/fold-copysign-1.c: Updated.\n>>>    * gcc.dg/pr55152-2.c: Updated.\n>>>    * gcc.dg/tree-ssa/abs-4.c: Updated.\n>>>    * gcc.dg/tree-ssa/backprop-6.c: Updated.\n>>>    * gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n>>>    * gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n>>>    * gcc.target/aarch64/fneg-abs_1.c: New test.\n>>>    * gcc.target/aarch64/fneg-abs_2.c: New test.\n>>>    * gcc.target/aarch64/fneg-abs_3.c: New test.\n>>>    * gcc.target/aarch64/fneg-abs_4.c: New test.\n>>>    * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n>>>    * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n>>>    * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n>>>    * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n>>> \n>>> --- inline copy of 
patch ---\n>>> \n>>> diff --git a/gcc/match.pd b/gcc/match.pd\n>>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n>>> --- a/gcc/match.pd\n>>> +++ b/gcc/match.pd\n>>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n>>> \n>>> /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n>>> (for coss (COS COSH)\n>>> -     copysigns (COPYSIGN)\n>>> - (simplify\n>>> -  (coss (copysigns @0 @1))\n>>> -   (coss @0)))\n>>> + (for copysigns (COPYSIGN_ALL)\n>> \n>> So this ends up generating for example the match\n>> (cosf (copysignl ...)) which doesn't make much sense.\n>> \n>> The lock-step iteration did\n>> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n>> which is leaner but misses the case of\n>> (cosf (ifn_copysign ..)) - that's probably what you are\n>> after with this change.\n>> \n>> That said, there isn't a nice solution (without altering the match.pd\n>> IL).  There's the explicit solution, spelling out all combinations.\n>> \n>> So if we want to go with your pragmatic solution changing this\n>> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n>> for iteration to a cross product for iteration is.\n>> \n>> Changing just this pattern to\n>> \n>> (for coss (COS COSH)\n>> (for copysigns (COPYSIGN)\n>>  (simplify\n>>   (coss (copysigns @0 @1))\n>>   (coss @0))))\n>> \n>> increases the total number of gimple-match-x.cc lines from\n>> 234988 to 235324.\n> \n> I guess the difference between this and the later suggestions is that\n> this one allows builtin copysign to be paired with ifn cos, which would\n> be potentially useful in other situations.  (It isn't here because\n> ifn_cos is rarely provided.)  
How much of the growth is due to that,\n> and how much of it is from nonsensical combinations like\n> (builtin_cosf (builtin_copysignl ...))?\n> \n> If it's mostly from nonsensical combinations then would it be possible\n> to make genmatch drop them?\n> \n>> The alternative is to do\n>> \n>> (for coss (COS COSH)\n>>     copysigns (COPYSIGN)\n>> (simplify\n>>  (coss (copysigns @0 @1))\n>>   (coss @0))\n>> (simplify\n>>  (coss (IFN_COPYSIGN @0 @1))\n>>   (coss @0)))\n>> \n>> which properly will diagnose a duplicate pattern.  There are\n>> currently no operator lists with just builtins defined (that\n>> could be fixed, see gencfn-macros.cc), supposed we'd have\n>> COS_C we could do\n>> \n>> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n>>     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN \n>> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN \n>> IFN_COPYSIGN)\n>> (simplify\n>>  (coss (copysigns @0 @1))\n>>   (coss @0)))\n>> \n>> which of course still looks ugly ;) (some syntax extension like\n>> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n>> enough to do)\n>> \n>> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n>> re-do it to only split the fors, keeping COPYSIGN and provide\n>> some statistics on the gimple-match-* size?  I think this might\n>> be the pragmatic solution for now.\n>> \n>> Richard - can you think of a clever way to express the desired\n>> iteration?  How do RTL macro iterations address cases like this?\n> \n> I don't think .md files have an equivalent construct, unfortunately.\n> (I also regret some of the choices I made for .md iterators, but that's\n> another story.)\n> \n> Perhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\n> with the \"...\" meaning \"fill to match the longest operator list\n> in the loop\".\n\nHm, I’ll think about this.  
It would be useful to have a function like\n\nInternal_fn ifn_for (combined_fn);\n\nSo we can indirectly match all builtins with a switch on the ifn code.\n\nRichard \n\n> Thanks,\n> Richard\n> \n>> Richard.\n>> \n>>> +  (simplify\n>>> +   (coss (copysigns @0 @1))\n>>> +    (coss @0))))\n>>> \n>>> /* pow(copysign(x, y), z) -> pow(x, z) if z is an even integer.  */\n>>> (for pows (POW)\n>>> -     copysigns (COPYSIGN)\n>>> - (simplify\n>>> -  (pows (copysigns @0 @2) REAL_CST@1)\n>>> -  (with { HOST_WIDE_INT n; }\n>>> -   (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n>>> -    (pows @0 @1)))))\n>>> + (for copysigns (COPYSIGN_ALL)\n>>> +  (simplify\n>>> +   (pows (copysigns @0 @2) REAL_CST@1)\n>>> +   (with { HOST_WIDE_INT n; }\n>>> +    (if (real_isinteger (&TREE_REAL_CST (@1), &n) && (n & 1) == 0)\n>>> +     (pows @0 @1))))))\n>>> /* Likewise for powi.  */\n>>> (for pows (POWI)\n>>> -     copysigns (COPYSIGN)\n>>> - (simplify\n>>> -  (pows (copysigns @0 @2) INTEGER_CST@1)\n>>> -  (if ((wi::to_wide (@1) & 1) == 0)\n>>> -   (pows @0 @1))))\n>>> + (for copysigns (COPYSIGN_ALL)\n>>> +  (simplify\n>>> +   (pows (copysigns @0 @2) INTEGER_CST@1)\n>>> +   (if ((wi::to_wide (@1) & 1) == 0)\n>>> +    (pows @0 @1)))))\n>>> \n>>> (for hypots (HYPOT)\n>>> -     copysigns (COPYSIGN)\n>>> - /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n>>> - (simplify\n>>> -  (hypots (copysigns @0 @1) @2)\n>>> -  (hypots @0 @2))\n>>> - /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n>>> - (simplify\n>>> -  (hypots @0 (copysigns @1 @2))\n>>> -  (hypots @0 @1)))\n>>> + (for copysigns (COPYSIGN)\n>>> +  /* hypot(copysign(x, y), z) -> hypot(x, z).  */\n>>> +  (simplify\n>>> +   (hypots (copysigns @0 @1) @2)\n>>> +   (hypots @0 @2))\n>>> +  /* hypot(x, copysign(y, z)) -> hypot(x, y).  */\n>>> +  (simplify\n>>> +   (hypots @0 (copysigns @1 @2))\n>>> +   (hypots @0 @1))))\n>>> \n>>> -/* copysign(x, CST) -> [-]abs (x).  
*/\n>>> -(for copysigns (COPYSIGN_ALL)\n>>> - (simplify\n>>> -  (copysigns @0 REAL_CST@1)\n>>> -  (if (REAL_VALUE_NEGATIVE (TREE_REAL_CST (@1)))\n>>> -   (negate (abs @0))\n>>> -   (abs @0))))\n>>> +/* Transform fneg (fabs (X)) -> copysign (X, -1).  */\n>>> +\n>>> +(simplify\n>>> + (negate (abs @0))\n>>> + (IFN_COPYSIGN @0 { build_minus_one_cst (type); }))\n>>> \n>>> /* copysign(copysign(x, y), z) -> copysign(x, z).  */\n>>> (for copysigns (COPYSIGN_ALL)\n>>> diff --git a/gcc/testsuite/gcc.dg/fold-copysign-1.c b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>>> index f17d65c24ee4dca9867827d040fe0a404c515e7b..f9cafd14ab05f5e8ab2f6f68e62801d21c2df6a6 100644\n>>> --- a/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>>> +++ b/gcc/testsuite/gcc.dg/fold-copysign-1.c\n>>> @@ -12,5 +12,5 @@ double bar (double x)\n>>>   return __builtin_copysign (x, minuszero);\n>>> }\n>>> \n>>> -/* { dg-final { scan-tree-dump-times \"= -\" 1 \"cddce1\" } } */\n>>> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 2 \"cddce1\" } } */\n>>> +/* { dg-final { scan-tree-dump-times \"__builtin_copysign\" 1 \"cddce1\" } } */\n>>> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"cddce1\" } } */\n>>> diff --git a/gcc/testsuite/gcc.dg/pr55152-2.c b/gcc/testsuite/gcc.dg/pr55152-2.c\n>>> index 54db0f2062da105a829d6690ac8ed9891fe2b588..605f202ed6bc7aa8fe921457b02ff0b88cc63ce6 100644\n>>> --- a/gcc/testsuite/gcc.dg/pr55152-2.c\n>>> +++ b/gcc/testsuite/gcc.dg/pr55152-2.c\n>>> @@ -10,4 +10,5 @@ int f(int a)\n>>>   return (a<-a)?a:-a;\n>>> }\n>>> \n>>> -/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 2 \"optimized\" } } */\n>>> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 1 \"optimized\" } } */\n>>> +/* { dg-final { scan-tree-dump-times \"ABS_EXPR\" 1 \"optimized\" } } */\n>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>>> index 6197519faf7b55aed7bc162cd0a14dd2145210ca..e1b825f37f69ac3c4666b3a52d733368805ad31d 100644\n>>> --- 
a/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/abs-4.c\n>>> @@ -9,5 +9,6 @@ long double abs_ld(long double x) { return __builtin_signbit(x) ? x : -x; }\n>>> \n>>> /* __builtin_signbit(x) ? x : -x. Should be convert into - ABS_EXP<x> */\n>>> /* { dg-final { scan-tree-dump-not \"signbit\" \"optimized\"} } */\n>>> -/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 3 \"optimized\"} } */\n>>> -/* { dg-final { scan-tree-dump-times \"= -\" 3 \"optimized\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"= ABS_EXPR\" 1 \"optimized\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"= -\" 1 \"optimized\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"= \\.COPYSIGN\" 2 \"optimized\"} } */\n>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>>> index 31f05716f1498dc709cac95fa20fb5796642c77e..c3a138642d6ff7be984e91fa1343cb2718db7ae1 100644\n>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c\n>>> @@ -26,5 +26,6 @@ TEST_FUNCTION (float, f)\n>>> TEST_FUNCTION (double, )\n>>> TEST_FUNCTION (long double, l)\n>>> \n>>> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 6 \"backprop\" } } */\n>>> -/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 3 \"backprop\" } } */\n>>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = -} 4 \"backprop\" } } */\n>>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = \\.COPYSIGN} 2 \"backprop\" } } */\n>>> +/* { dg-final { scan-tree-dump-times {Deleting[^\\n]* = ABS_EXPR <} 1 \"backprop\" } } */\n>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>>> index de52c5f7c8062958353d91f5031193defc9f3f91..e5d565c4b9832c00106588ef411fbd8c292a5cad 100644\n>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-sign-2.c\n>>> @@ -10,4 +10,5 @@ float f1(float x)\n>>>   
float t = __builtin_copysignf (1.0f, -x);\n>>>   return x * t;\n>>> }\n>>> -/* { dg-final { scan-tree-dump-times \"ABS\" 2 \"optimized\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"ABS\" 1 \"optimized\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \".COPYSIGN\" 1 \"optimized\"} } */\n>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>>> index a41f1baf25669a4fd301a586a49ba5e3c5b966ab..a22896b21c8b5a4d5d8e28bd8ae0db896e63ade0 100644\n>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/mult-abs-2.c\n>>> @@ -34,4 +34,5 @@ float i1(float x)\n>>> {\n>>>   return x * (x <= 0.f ? 1.f : -1.f);\n>>> }\n>>> -/* { dg-final { scan-tree-dump-times \"ABS\" 8 \"gimple\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"ABS\" 4 \"gimple\"} } */\n>>> +/* { dg-final { scan-tree-dump-times \"\\.COPYSIGN\" 4 \"gimple\"} } */\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..f823013c3ddf6b3a266c3abfcbf2642fc2a75fa6\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_1.c\n>>> @@ -0,0 +1,39 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#pragma GCC target \"+nosve\"\n>>> +\n>>> +#include <arm_neon.h>\n>>> +\n>>> +/*\n>>> +** t1:\n>>> +**    orr    v[0-9]+.2s, #128, lsl #24\n>>> +**    ret\n>>> +*/\n>>> +float32x2_t t1 (float32x2_t a)\n>>> +{\n>>> +  return vneg_f32 (vabs_f32 (a));\n>>> +}\n>>> +\n>>> +/*\n>>> +** t2:\n>>> +**    orr    v[0-9]+.4s, #128, lsl #24\n>>> +**    ret\n>>> +*/\n>>> +float32x4_t t2 (float32x4_t a)\n>>> +{\n>>> +  return vnegq_f32 (vabsq_f32 (a));\n>>> +}\n>>> +\n>>> +/*\n>>> +** t3:\n>>> +**    adrp    x0, .LC[0-9]+\n>>> +**    ldr    q[0-9]+, \\[x0, #:lo12:.LC0\\]\n>>> +**    orr   
 v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>>> +**    ret\n>>> +*/\n>>> +float64x2_t t3 (float64x2_t a)\n>>> +{\n>>> +  return vnegq_f64 (vabsq_f64 (a));\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..141121176b309e4b2aa413dc55271a6e3c93d5e1\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_2.c\n>>> @@ -0,0 +1,31 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#pragma GCC target \"+nosve\"\n>>> +\n>>> +#include <arm_neon.h>\n>>> +#include <math.h>\n>>> +\n>>> +/*\n>>> +** f1:\n>>> +**    movi    v[0-9]+.2s, 0x80, lsl 24\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +float32_t f1 (float32_t a)\n>>> +{\n>>> +  return -fabsf (a);\n>>> +}\n>>> +\n>>> +/*\n>>> +** f2:\n>>> +**    mov    x0, -9223372036854775808\n>>> +**    fmov    d[0-9]+, x0\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +float64_t f2 (float64_t a)\n>>> +{\n>>> +  return -fabs (a);\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..b4652173a95d104ddfa70c497f0627a61ea89d3b\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_3.c\n>>> @@ -0,0 +1,36 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#pragma GCC target \"+nosve\"\n>>> +\n>>> +#include <arm_neon.h>\n>>> +#include <math.h>\n>>> +\n>>> +/*\n>>> +** f1:\n>>> +**    ...\n>>> +**    ldr    q[0-9]+, \\[x0\\]\n>>> +**    orr    v[0-9]+.4s, #128, lsl #24\n>>> +**    str    q[0-9]+, \\[x0\\], 16\n>>> +**    ...\n>>> 
+*/\n>>> +void f1 (float32_t *a, int n)\n>>> +{\n>>> +  for (int i = 0; i < (n & -8); i++)\n>>> +   a[i] = -fabsf (a[i]);\n>>> +}\n>>> +\n>>> +/*\n>>> +** f2:\n>>> +**    ...\n>>> +**    ldr    q[0-9]+, \\[x0\\]\n>>> +**    orr    v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>>> +**    str    q[0-9]+, \\[x0\\], 16\n>>> +**    ...\n>>> +*/\n>>> +void f2 (float64_t *a, int n)\n>>> +{\n>>> +  for (int i = 0; i < (n & -8); i++)\n>>> +   a[i] = -fabs (a[i]);\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..10879dea74462d34b26160eeb0bd54ead063166b\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/fneg-abs_4.c\n>>> @@ -0,0 +1,39 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#pragma GCC target \"+nosve\"\n>>> +\n>>> +#include <string.h>\n>>> +\n>>> +/*\n>>> +** negabs:\n>>> +**    mov    x0, -9223372036854775808\n>>> +**    fmov    d[0-9]+, x0\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +double negabs (double x)\n>>> +{\n>>> +   unsigned long long y;\n>>> +   memcpy (&y, &x, sizeof(double));\n>>> +   y = y | (1UL << 63);\n>>> +   memcpy (&x, &y, sizeof(double));\n>>> +   return x;\n>>> +}\n>>> +\n>>> +/*\n>>> +** negabsf:\n>>> +**    movi    v[0-9]+.2s, 0x80, lsl 24\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +float negabsf (float x)\n>>> +{\n>>> +   unsigned int y;\n>>> +   memcpy (&y, &x, sizeof(float));\n>>> +   y = y | (1U << 31);\n>>> +   memcpy (&x, &y, sizeof(float));\n>>> +   return x;\n>>> +}\n>>> +\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n>>> new file mode 100644\n>>> index 
0000000000000000000000000000000000000000..0c7664e6de77a497682952653ffd417453854d52\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_1.c\n>>> @@ -0,0 +1,37 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#include <arm_neon.h>\n>>> +\n>>> +/*\n>>> +** t1:\n>>> +**    orr    v[0-9]+.2s, #128, lsl #24\n>>> +**    ret\n>>> +*/\n>>> +float32x2_t t1 (float32x2_t a)\n>>> +{\n>>> +  return vneg_f32 (vabs_f32 (a));\n>>> +}\n>>> +\n>>> +/*\n>>> +** t2:\n>>> +**    orr    v[0-9]+.4s, #128, lsl #24\n>>> +**    ret\n>>> +*/\n>>> +float32x4_t t2 (float32x4_t a)\n>>> +{\n>>> +  return vnegq_f32 (vabsq_f32 (a));\n>>> +}\n>>> +\n>>> +/*\n>>> +** t3:\n>>> +**    adrp    x0, .LC[0-9]+\n>>> +**    ldr    q[0-9]+, \\[x0, #:lo12:.LC0\\]\n>>> +**    orr    v[0-9]+.16b, v[0-9]+.16b, v[0-9]+.16b\n>>> +**    ret\n>>> +*/\n>>> +float64x2_t t3 (float64x2_t a)\n>>> +{\n>>> +  return vnegq_f64 (vabsq_f64 (a));\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..a60cd31b9294af2dac69eed1c93f899bd5c78fca\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_2.c\n>>> @@ -0,0 +1,29 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#include <arm_neon.h>\n>>> +#include <math.h>\n>>> +\n>>> +/*\n>>> +** f1:\n>>> +**    movi    v[0-9]+.2s, 0x80, lsl 24\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +float32_t f1 (float32_t a)\n>>> +{\n>>> +  return -fabsf (a);\n>>> +}\n>>> +\n>>> +/*\n>>> +** f2:\n>>> +**    mov    x0, -9223372036854775808\n>>> +**    fmov    d[0-9]+, x0\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    
ret\n>>> +*/\n>>> +float64_t f2 (float64_t a)\n>>> +{\n>>> +  return -fabs (a);\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..1bf34328d8841de8e6b0a5458562a9f00e31c275\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_3.c\n>>> @@ -0,0 +1,34 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#include <arm_neon.h>\n>>> +#include <math.h>\n>>> +\n>>> +/*\n>>> +** f1:\n>>> +**    ...\n>>> +**    ld1w    z[0-9]+.s, p[0-9]+/z, \\[x0, x2, lsl 2\\]\n>>> +**    orr    z[0-9]+.s, z[0-9]+.s, #0x80000000\n>>> +**    st1w    z[0-9]+.s, p[0-9]+, \\[x0, x2, lsl 2\\]\n>>> +**    ...\n>>> +*/\n>>> +void f1 (float32_t *a, int n)\n>>> +{\n>>> +  for (int i = 0; i < (n & -8); i++)\n>>> +   a[i] = -fabsf (a[i]);\n>>> +}\n>>> +\n>>> +/*\n>>> +** f2:\n>>> +**    ...\n>>> +**    ld1d    z[0-9]+.d, p[0-9]+/z, \\[x0, x2, lsl 3\\]\n>>> +**    orr    z[0-9]+.d, z[0-9]+.d, #0x8000000000000000\n>>> +**    st1d    z[0-9]+.d, p[0-9]+, \\[x0, x2, lsl 3\\]\n>>> +**    ...\n>>> +*/\n>>> +void f2 (float64_t *a, int n)\n>>> +{\n>>> +  for (int i = 0; i < (n & -8); i++)\n>>> +   a[i] = -fabs (a[i]);\n>>> +}\n>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n>>> new file mode 100644\n>>> index 0000000000000000000000000000000000000000..21f2a8da2a5d44e3d01f6604ca7be87e3744d494\n>>> --- /dev/null\n>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/fneg-abs_4.c\n>>> @@ -0,0 +1,37 @@\n>>> +/* { dg-do compile } */\n>>> +/* { dg-options \"-O3\" } */\n>>> +/* { dg-final { check-function-bodies \"**\" \"\" \"\" { target lp64 } } } */\n>>> +\n>>> +#include <string.h>\n>>> +\n>>> +/*\n>>> +** negabs:\n>>> +**    mov    x0, -9223372036854775808\n>>> +**    
fmov    d[0-9]+, x0\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +double negabs (double x)\n>>> +{\n>>> +   unsigned long long y;\n>>> +   memcpy (&y, &x, sizeof(double));\n>>> +   y = y | (1UL << 63);\n>>> +   memcpy (&x, &y, sizeof(double));\n>>> +   return x;\n>>> +}\n>>> +\n>>> +/*\n>>> +** negabsf:\n>>> +**    movi    v[0-9]+.2s, 0x80, lsl 24\n>>> +**    orr    v[0-9]+.8b, v[0-9]+.8b, v[0-9]+.8b\n>>> +**    ret\n>>> +*/\n>>> +float negabsf (float x)\n>>> +{\n>>> +   unsigned int y;\n>>> +   memcpy (&y, &x, sizeof(float));\n>>> +   y = y | (1U << 31);\n>>> +   memcpy (&x, &y, sizeof(float));\n>>> +   return x;\n>>> +}\n>>> +\n>>>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=UvLLfYmF;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=mqfN7w+v;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 
4S2hWj2qfTz1yq7\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 21:35:11 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 7CFF6385772D\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 10:35:09 +0000 (GMT)","from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28])\n by sourceware.org (Postfix) with ESMTPS id 468643858D37\n for <gcc-patches@gcc.gnu.org>; Sat,  7 Oct 2023 10:34:55 +0000 (GMT)","from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de\n [192.168.254.74])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)\n (No client certificate requested)\n by smtp-out1.suse.de (Postfix) with ESMTPS id 362492187E;\n Sat,  7 Oct 2023 10:34:54 +0000 (UTC)","from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de\n [192.168.254.74])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)\n (No client certificate requested)\n by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 249A11391E;\n Sat,  7 Oct 2023 10:34:54 +0000 (UTC)","from dovecot-director2.suse.de ([192.168.254.65])\n by imap2.suse-dmz.suse.de with ESMTPSA id cNbrCE40IWXiPQAAMHmgww\n (envelope-from <rguenther@suse.de>); Sat, 07 Oct 2023 10:34:54 +0000"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 468643858D37","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1696674894;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n content-transfer-encoding:content-transfer-encoding:\n in-reply-to:in-reply-to:references:references;\n bh=7J03tJTFpc7wcZMZy7LllieLOrnBMP9XbMqSBh9XchI=;\n b=UvLLfYmFw73x+PTvvH3gWkiaRyJbU/JZHjKFP9KNScv18TJPnyTbtZzYP+F9fncR/3iQV1\n ke8UNm/X0zGOItaQoOdrf4hTL12yXa+75Kf5V66pvmnV9KbjnJ13n/xCBorDMb+8I8NRGl\n 
GB4MC4ratZYR5k08Hk1gaJJa3Fqe1E8=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1696674894;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n content-transfer-encoding:content-transfer-encoding:\n in-reply-to:in-reply-to:references:references;\n bh=7J03tJTFpc7wcZMZy7LllieLOrnBMP9XbMqSBh9XchI=;\n b=mqfN7w+vJlhaXH0Vwk4zg4wXTe1CKmQEzj8fuNwAhkToZLXeZGpdyy/+2zqHxHEpyL+EGy\n TRmD4dsHk6MnEXBw=="],"Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"quoted-printable","From":"Richard Biener <rguenther@suse.de>","Mime-Version":"1.0 (1.0)","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","Date":"Sat, 7 Oct 2023 12:34:43 +0200","Message-Id":"<E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de>","References":"<mpt1qe6lrbr.fsf@arm.com>","Cc":"Tamar Christina <tamar.christina@arm.com>,\n Andrew Pinski <pinskia@gmail.com>, gcc-patches@gcc.gnu.org, nd <nd@arm.com>,\n jlaw@ventanamicro.com","In-Reply-To":"<mpt1qe6lrbr.fsf@arm.com>","To":"Richard Sandiford <richard.sandiford@arm.com>","X-Mailer":"iPhone Mail (20H19)","X-Spam-Status":"No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_LOTSOFHASH,\n KAM_SHORT, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3194800,"web_url":"http://patchwork.ozlabs.org/comment/3194800/","msgid":"<mptbkdak6oa.fsf@arm.com>","list_archive_url":null,"date":"2023-10-07T11:34:13","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":64746,"url":"http://patchwork.ozlabs.org/api/people/64746/","name":"Richard Sandiford","email":"richard.sandiford@arm.com"},"content":"Richard Biener <rguenther@suse.de> writes:\n>> Am 07.10.2023 um 11:23 schrieb Richard Sandiford <richard.sandiford@arm.com>>> ﻿Richard Biener <rguenther@suse.de> writes:\n>>> On Thu, 5 Oct 2023, Tamar Christina wrote:\n>>> \n>>>>> I suppose the idea is that -abs(x) might be easier to optimize with other\n>>>>> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n>>>>> \n>>>>> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n>>>>> canonical than copysign.\n>>>>> \n>>>>>> Should I try removing this?\n>>>>> \n>>>>> I'd say yes (and put the reverse canonicalization next to this pattern).\n>>>>> \n>>>> \n>>>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n>>>> canonical and allows a target to expand this sequence efficiently.  Such\n>>>> sequences are common in scientific code working with gradients.\n>>>> \n>>>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n>>>> which means they exclude IFN_COPYSIGN.  
COPYSIGN however is restricted to only\n>>> \n>>> That's not true:\n>>> \n>>> (define_operator_list COPYSIGN\n>>>    BUILT_IN_COPYSIGNF\n>>>    BUILT_IN_COPYSIGN\n>>>    BUILT_IN_COPYSIGNL\n>>>    IFN_COPYSIGN)\n>>> \n>>> but they miss the extended float builtin variants like\n>>> __builtin_copysignf16.  Also see below\n>>> \n>>>> the C99 builtins and so doesn't work for vectors.\n>>>> \n>>>> The patch expands these optimizations to work on COPYSIGN_ALL.\n>>>> \n>>>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n>>>> which I remove since this is a less efficient form.  The testsuite is also\n>>>> updated in light of this.\n>>>> \n>>>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n>>>> \n>>>> Ok for master?\n>>>> \n>>>> Thanks,\n>>>> Tamar\n>>>> \n>>>> gcc/ChangeLog:\n>>>> \n>>>>    PR tree-optimization/109154\n>>>>    * match.pd: Add new neg+abs rule, remove inverse copysign rule and\n>>>>    expand existing copysign optimizations.\n>>>> \n>>>> gcc/testsuite/ChangeLog:\n>>>> \n>>>>    PR tree-optimization/109154\n>>>>    * gcc.dg/fold-copysign-1.c: Updated.\n>>>>    * gcc.dg/pr55152-2.c: Updated.\n>>>>    * gcc.dg/tree-ssa/abs-4.c: Updated.\n>>>>    * gcc.dg/tree-ssa/backprop-6.c: Updated.\n>>>>    * gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n>>>>    * gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n>>>>    * gcc.target/aarch64/fneg-abs_1.c: New test.\n>>>>    * gcc.target/aarch64/fneg-abs_2.c: New test.\n>>>>    * gcc.target/aarch64/fneg-abs_3.c: New test.\n>>>>    * gcc.target/aarch64/fneg-abs_4.c: New test.\n>>>>    * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n>>>>    * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n>>>>    * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n>>>>    * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n>>>> \n>>>> --- inline copy of patch ---\n>>>> \n>>>> diff --git a/gcc/match.pd b/gcc/match.pd\n>>>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 
100644\n>>>> --- a/gcc/match.pd\n>>>> +++ b/gcc/match.pd\n>>>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n>>>> \n>>>> /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n>>>> (for coss (COS COSH)\n>>>> -     copysigns (COPYSIGN)\n>>>> - (simplify\n>>>> -  (coss (copysigns @0 @1))\n>>>> -   (coss @0)))\n>>>> + (for copysigns (COPYSIGN_ALL)\n>>> \n>>> So this ends up generating for example the match\n>>> (cosf (copysignl ...)) which doesn't make much sense.\n>>> \n>>> The lock-step iteration did\n>>> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n>>> which is leaner but misses the case of\n>>> (cosf (ifn_copysign ..)) - that's probably what you are\n>>> after with this change.\n>>> \n>>> That said, there isn't a nice solution (without altering the match.pd\n>>> IL).  There's the explicit solution, spelling out all combinations.\n>>> \n>>> So if we want to go with yout pragmatic solution changing this\n>>> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n>>> for iteration to a cross product for iteration is.\n>>> \n>>> Changing just this pattern to\n>>> \n>>> (for coss (COS COSH)\n>>> (for copysigns (COPYSIGN)\n>>>  (simplify\n>>>   (coss (copysigns @0 @1))\n>>>   (coss @0))))\n>>> \n>>> increases the total number of gimple-match-x.cc lines from\n>>> 234988 to 235324.\n>> \n>> I guess the difference between this and the later suggestions is that\n>> this one allows builtin copysign to be paired with ifn cos, which would\n>> be potentially useful in other situations.  (It isn't here because\n>> ifn_cos is rarely provided.)  
How much of the growth is due to that,\n>> and much of it is from nonsensical combinations like\n>> (builtin_cosf (builtin_copysignl ...))?\n>> \n>> If it's mostly from nonsensical combinations then would it be possible\n>> to make genmatch drop them?\n>> \n>>> The alternative is to do\n>>> \n>>> (for coss (COS COSH)\n>>>     copysigns (COPYSIGN)\n>>> (simplify\n>>>  (coss (copysigns @0 @1))\n>>>   (coss @0))\n>>> (simplify\n>>>  (coss (IFN_COPYSIGN @0 @1))\n>>>   (coss @0)))\n>>> \n>>> which properly will diagnose a duplicate pattern.  Ther are\n>>> currently no operator lists with just builtins defined (that\n>>> could be fixed, see gencfn-macros.cc), supposed we'd have\n>>> COS_C we could do\n>>> \n>>> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n>>>     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN \n>>> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN \n>>> IFN_COPYSIGN)\n>>> (simplify\n>>>  (coss (copysigns @0 @1))\n>>>   (coss @0)))\n>>> \n>>> which of course still looks ugly ;) (some syntax extension like\n>>> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n>>> enough to do)\n>>> \n>>> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n>>> re-do it to only split the fors, keeping COPYSIGN and provide\n>>> some statistics on the gimple-match-* size?  I think this might\n>>> be the pragmatic solution for now.\n>>> \n>>> Richard - can you think of a clever way to express the desired\n>>> iteration?  How do RTL macro iterations address cases like this?\n>> \n>> I don't think .md files have an equivalent construct, unfortunately.\n>> (I also regret some of the choices I made for .md iterators, but that's\n>> another story.)\n>> \n>> Perhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\n>> with the \"...\" meaning \"fill to match the longest operator list\n>> in the loop\".\n>\n> Hm, I’ll think about this.  
It would be useful to have a function like\n>\n> Internal_fn ifn_for (combined_fn);\n>\n> So we can indirectly match all builtins with a switch on the ifn code.\n\nThere's:\n\nextern internal_fn associated_internal_fn (combined_fn, tree);\nextern internal_fn associated_internal_fn (tree);\nextern internal_fn replacement_internal_fn (gcall *);\n\nwhere the first one requires the return type, and the second one\noperates on CALL_EXPRs.\n\nThanks,\nRichard","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S2jr83Vytz1yq7\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 22:34:32 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 06A7C385701C\n\tfor <incoming@patchwork.ozlabs.org>; Sat,  7 Oct 2023 11:34:29 +0000 (GMT)","from foss.arm.com (foss.arm.com [217.140.110.172])\n by sourceware.org (Postfix) with ESMTP id 8EA6B385840A\n for <gcc-patches@gcc.gnu.org>; Sat,  7 Oct 2023 11:34:16 +0000 (GMT)","from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])\n by 
usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B60B2C15;\n Sat,  7 Oct 2023 04:34:55 -0700 (PDT)","from localhost (unknown [10.32.110.65])\n by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 34E7E3F5A1;\n Sat,  7 Oct 2023 04:34:15 -0700 (PDT)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 8EA6B385840A","From":"Richard Sandiford <richard.sandiford@arm.com>","To":"Richard Biener <rguenther@suse.de>","Mail-Followup-To":"Richard Biener <rguenther@suse.de>,\n Tamar Christina <tamar.christina@arm.com>, Andrew Pinski <pinskia@gmail.com>,\n gcc-patches@gcc.gnu.org, nd <nd@arm.com>, jlaw@ventanamicro.com,\n richard.sandiford@arm.com","Cc":"Tamar Christina <tamar.christina@arm.com>,\n Andrew Pinski <pinskia@gmail.com>, gcc-patches@gcc.gnu.org, nd <nd@arm.com>,\n jlaw@ventanamicro.com","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","References":"<mpt1qe6lrbr.fsf@arm.com>\n <E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de>","Date":"Sat, 07 Oct 2023 12:34:13 +0100","In-Reply-To":"<E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de> (Richard Biener's\n message of \"Sat, 7 Oct 2023 12:34:43 +0200\")","Message-ID":"<mptbkdak6oa.fsf@arm.com>","User-Agent":"Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"quoted-printable","X-Spam-Status":"No, score=-24.4 required=5.0 tests=BAYES_00, GIT_PATCH_0,\n KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT,\n SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3195180,"web_url":"http://patchwork.ozlabs.org/comment/3195180/","msgid":"<nycvar.YFH.7.77.849.2310090711380.5561@jbgna.fhfr.qr>","list_archive_url":null,"date":"2023-10-09T07:20:34","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard Biener","email":"rguenther@suse.de"},"content":"On Sat, 7 Oct 2023, Richard Sandiford wrote:\n\n> Richard Biener <rguenther@suse.de> writes:\n> >> Am 07.10.2023 um 11:23 schrieb Richard Sandiford <richard.sandiford@arm.com>>> ﻿Richard Biener <rguenther@suse.de> writes:\n> >>> On Thu, 5 Oct 2023, Tamar Christina wrote:\n> >>> \n> >>>>> I suppose the idea is that -abs(x) might be easier to optimize with other\n> >>>>> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n> >>>>> \n> >>>>> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n> >>>>> canonical than copysign.\n> >>>>> \n> >>>>>> Should I try removing this?\n> >>>>> \n> >>>>> I'd say yes (and put the reverse canonicalization next to this pattern).\n> >>>>> \n> >>>> \n> >>>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n> >>>> canonical and allows a target to expand this sequence efficiently.  Such\n> >>>> sequences are common in scientific code working with gradients.\n> >>>> \n> >>>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n> >>>> which means they exclude IFN_COPYSIGN.  
COPYSIGN however is restricted to only\n> >>> \n> >>> That's not true:\n> >>> \n> >>> (define_operator_list COPYSIGN\n> >>>    BUILT_IN_COPYSIGNF\n> >>>    BUILT_IN_COPYSIGN\n> >>>    BUILT_IN_COPYSIGNL\n> >>>    IFN_COPYSIGN)\n> >>> \n> >>> but they miss the extended float builtin variants like\n> >>> __builtin_copysignf16.  Also see below\n> >>> \n> >>>> the C99 builtins and so doesn't work for vectors.\n> >>>> \n> >>>> The patch expands these optimizations to work on COPYSIGN_ALL.\n> >>>> \n> >>>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n> >>>> which I remove since this is a less efficient form.  The testsuite is also\n> >>>> updated in light of this.\n> >>>> \n> >>>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> >>>> \n> >>>> Ok for master?\n> >>>> \n> >>>> Thanks,\n> >>>> Tamar\n> >>>> \n> >>>> gcc/ChangeLog:\n> >>>> \n> >>>>    PR tree-optimization/109154\n> >>>>    * match.pd: Add new neg+abs rule, remove inverse copysign rule and\n> >>>>    expand existing copysign optimizations.\n> >>>> \n> >>>> gcc/testsuite/ChangeLog:\n> >>>> \n> >>>>    PR tree-optimization/109154\n> >>>>    * gcc.dg/fold-copysign-1.c: Updated.\n> >>>>    * gcc.dg/pr55152-2.c: Updated.\n> >>>>    * gcc.dg/tree-ssa/abs-4.c: Updated.\n> >>>>    * gcc.dg/tree-ssa/backprop-6.c: Updated.\n> >>>>    * gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n> >>>>    * gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n> >>>>    * gcc.target/aarch64/fneg-abs_1.c: New test.\n> >>>>    * gcc.target/aarch64/fneg-abs_2.c: New test.\n> >>>>    * gcc.target/aarch64/fneg-abs_3.c: New test.\n> >>>>    * gcc.target/aarch64/fneg-abs_4.c: New test.\n> >>>>    * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> >>>>    * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> >>>>    * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> >>>>    * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> >>>> \n> >>>> --- inline copy of patch ---\n> >>>> \n> >>>> diff --git a/gcc/match.pd 
b/gcc/match.pd\n> >>>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n> >>>> --- a/gcc/match.pd\n> >>>> +++ b/gcc/match.pd\n> >>>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n> >>>> \n> >>>> /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n> >>>> (for coss (COS COSH)\n> >>>> -     copysigns (COPYSIGN)\n> >>>> - (simplify\n> >>>> -  (coss (copysigns @0 @1))\n> >>>> -   (coss @0)))\n> >>>> + (for copysigns (COPYSIGN_ALL)\n> >>> \n> >>> So this ends up generating for example the match\n> >>> (cosf (copysignl ...)) which doesn't make much sense.\n> >>> \n> >>> The lock-step iteration did\n> >>> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n> >>> which is leaner but misses the case of\n> >>> (cosf (ifn_copysign ..)) - that's probably what you are\n> >>> after with this change.\n> >>> \n> >>> That said, there isn't a nice solution (without altering the match.pd\n> >>> IL).  There's the explicit solution, spelling out all combinations.\n> >>> \n> >>> So if we want to go with yout pragmatic solution changing this\n> >>> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n> >>> for iteration to a cross product for iteration is.\n> >>> \n> >>> Changing just this pattern to\n> >>> \n> >>> (for coss (COS COSH)\n> >>> (for copysigns (COPYSIGN)\n> >>>  (simplify\n> >>>   (coss (copysigns @0 @1))\n> >>>   (coss @0))))\n> >>> \n> >>> increases the total number of gimple-match-x.cc lines from\n> >>> 234988 to 235324.\n> >> \n> >> I guess the difference between this and the later suggestions is that\n> >> this one allows builtin copysign to be paired with ifn cos, which would\n> >> be potentially useful in other situations.  (It isn't here because\n> >> ifn_cos is rarely provided.)  
How much of the growth is due to that,\n> >> and much of it is from nonsensical combinations like\n> >> (builtin_cosf (builtin_copysignl ...))?\n> >> \n> >> If it's mostly from nonsensical combinations then would it be possible\n> >> to make genmatch drop them?\n> >> \n> >>> The alternative is to do\n> >>> \n> >>> (for coss (COS COSH)\n> >>>     copysigns (COPYSIGN)\n> >>> (simplify\n> >>>  (coss (copysigns @0 @1))\n> >>>   (coss @0))\n> >>> (simplify\n> >>>  (coss (IFN_COPYSIGN @0 @1))\n> >>>   (coss @0)))\n> >>> \n> >>> which properly will diagnose a duplicate pattern.  Ther are\n> >>> currently no operator lists with just builtins defined (that\n> >>> could be fixed, see gencfn-macros.cc), supposed we'd have\n> >>> COS_C we could do\n> >>> \n> >>> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n> >>>     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN \n> >>> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN \n> >>> IFN_COPYSIGN)\n> >>> (simplify\n> >>>  (coss (copysigns @0 @1))\n> >>>   (coss @0)))\n> >>> \n> >>> which of course still looks ugly ;) (some syntax extension like\n> >>> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n> >>> enough to do)\n> >>> \n> >>> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n> >>> re-do it to only split the fors, keeping COPYSIGN and provide\n> >>> some statistics on the gimple-match-* size?  I think this might\n> >>> be the pragmatic solution for now.\n> >>> \n> >>> Richard - can you think of a clever way to express the desired\n> >>> iteration?  
How do RTL macro iterations address cases like this?\n> >> \n> >> I don't think .md files have an equivalent construct, unfortunately.\n> >> (I also regret some of the choices I made for .md iterators, but that's\n> >> another story.)\n> >> \n> >> Perhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\n> >> with the \"...\" meaning \"fill to match the longest operator list\n> >> in the loop\".\n> >\n> > Hm, I’ll think about this.  It would be useful to have a function like\n> >\n> > Internal_fn ifn_for (combined_fn);\n> >\n> > So we can indirectly match all builtins with a switch on the ifn code.\n> \n> There's:\n> \n> extern internal_fn associated_internal_fn (combined_fn, tree);\n> extern internal_fn associated_internal_fn (tree);\n> extern internal_fn replacement_internal_fn (gcall *);\n> \n> where the first one requires the return type, and the second one\n> operates on CALL_EXPRs.\n\nHmm, for full generality the way we code-generate would need to change\nquite a bit.  Instead I've come up with the following quite limited\napproach.  You can write\n\n(for coss (COS COSH)\n (simplify\n  (coss (ANY_COPYSIGN @0 @1))\n  (coss @0)))\n\nwith it.  For each internal function the following patch adds an\nANY_<name> identifier.  The use is somewhat limited - you cannot\nuse it as the outermost operation in the match part and you cannot\nuse it in the replacement part at all.  
The nice thing is there's\nno \"iteration\" at all, the ANY_COPYSIGN doesn't cause any pattern\nduplication, instead we match it via CASE_CFN_<name> so it will\nhappily match mis-matched (typewise) calls (but those shouldn't\nbe there...).\n\nThe patch doesn't contain any defensiveness in the parser for the\nuse restriction, but you should get compile failures for misuses\nat least.\n\nIt should match quite some of the copysign cases, I suspect its\nof no use for most of the operators so maybe less general handling\nand only specifically introducing ANY_COPYSIGN would be better.\nAt least I cannot think of any other functions that are matched\nbut disappear in the resulting replacement?\n\nRichard.\n\ndiff --git a/gcc/genmatch.cc b/gcc/genmatch.cc\nindex 03d325efdf6..f7d3f51c013 100644\n--- a/gcc/genmatch.cc\n+++ b/gcc/genmatch.cc\n@@ -524,10 +524,14 @@ class fn_id : public id_base\n {\n public:\n   fn_id (enum built_in_function fn_, const char *id_)\n-      : id_base (id_base::FN, id_), fn (fn_) {}\n+      : id_base (id_base::FN, id_), fn (fn_), case_macro (nullptr) {}\n   fn_id (enum internal_fn fn_, const char *id_)\n-      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)) {}\n+      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)),\n+    case_macro (nullptr) {}\n+  fn_id (const char *case_macro_, const char *id_)\n+      : id_base (id_base::FN, id_), fn (-1U), case_macro (case_macro_) {}\n   unsigned int fn;\n+  const char *case_macro;\n };\n \n class simplify;\n@@ -3262,6 +3266,10 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n \t      if (user_id *u = dyn_cast <user_id *> (e->operation))\n \t\tfor (auto id : u->substitutes)\n \t\t  fprintf_indent (f, indent, \"case %s:\\n\", id->id);\n+\t      else if (is_a <fn_id *> (e->operation)\n+\t\t       && as_a <fn_id *> (e->operation)->case_macro)\n+\t\tfprintf_indent (f, indent, \"%s:\\n\",\n+\t\t\t\tas_a <fn_id *> (e->operation)->case_macro);\n \t      else\n 
\t\tfprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n \t      /* We need to be defensive against bogus prototypes allowing\n@@ -3337,9 +3345,12 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n       for (unsigned j = 0; j < generic_fns.length (); ++j)\n \t{\n \t  expr *e = as_a <expr *>(generic_fns[j]->op);\n-\t  gcc_assert (e->operation->kind == id_base::FN);\n+\t  fn_id *oper = as_a <fn_id *> (e->operation);\n \n-\t  fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n+\t  if (oper->case_macro)\n+\t    fprintf_indent (f, indent, \"%s:\\n\", oper->case_macro);\n+\t  else\n+\t    fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n \t  fprintf_indent (f, indent, \"  if (call_expr_nargs (%s) == %d)\\n\"\n \t\t\t\t     \"    {\\n\", kid_opname, e->ops.length ());\n \t  generic_fns[j]->gen (f, indent + 6, false, depth);\n@@ -5496,7 +5507,8 @@ main (int argc, char **argv)\n #include \"builtins.def\"\n \n #define DEF_INTERNAL_FN(CODE, NAME, FNSPEC) \\\n-  add_function (IFN_##CODE, \"CFN_\" #CODE);\n+  add_function (IFN_##CODE, \"CFN_\" #CODE); \\\n+  add_function (\"CASE_CFN_\" # CODE, \"ANY_\" # CODE);\n #include \"internal-fn.def\"\n \n   /* Parse ahead!  
*/","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=AUcTNxp3;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=ajiN7xwl;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (server2.sourceware.org [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S3r6c0MWlz1yq1\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 18:20:54 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 442873858C2C\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 07:20:48 +0000 (GMT)","from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])\n by sourceware.org (Postfix) with ESMTPS id 924703858D35\n for <gcc-patches@gcc.gnu.org>; Mon,  9 Oct 2023 07:20:35 +0000 (GMT)","from relay2.suse.de (relay2.suse.de [149.44.160.134])\n by smtp-out2.suse.de (Postfix) with ESMTP id 8B1CA1F381;\n Mon,  9 Oct 2023 07:20:34 +0000 (UTC)","from wotan.suse.de (wotan.suse.de [10.160.0.1])\n (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n (No client certificate requested)\n by relay2.suse.de (Postfix) with ESMTPS id 565132C142;\n Mon,  9 Oct 2023 07:20:34 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 924703858D35","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1696836034;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=QD7oLIrqgSND5p7EHqThhaZC312vJpphG7sASvH3PBo=;\n b=AUcTNxp3kDy70wx3zu2S9O7AixFmFouJE3av58qGH4ohcA+YSwNymZryUqa9DYW9IKAi3o\n 0qOVA3X59g0l3Rbz1fvlvr+zd02UPKo6D7tPCwkmVBAXhTYko9nwxM2ZoxYgYTUStkcw/1\n BzZyfHCInsoGZbjici78lb7zn8tgMd0=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1696836034;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=QD7oLIrqgSND5p7EHqThhaZC312vJpphG7sASvH3PBo=;\n b=ajiN7xwlSgsS20SM4WEZVzVd7seeGrLVujz24LmfA6BmCDDRteR5hnbQ7AhjZ/3QdLhaSZ\n XHGYUnNvh5CERIBA=="],"Date":"Mon, 9 Oct 2023 07:20:34 +0000 (UTC)","From":"Richard Biener <rguenther@suse.de>","To":"Richard Sandiford <richard.sandiford@arm.com>","cc":"Tamar Christina <tamar.christina@arm.com>,\n Andrew Pinski <pinskia@gmail.com>, gcc-patches@gcc.gnu.org,\n nd <nd@arm.com>, jlaw@ventanamicro.com","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","In-Reply-To":"<mptbkdak6oa.fsf@arm.com>","Message-ID":"<nycvar.YFH.7.77.849.2310090711380.5561@jbgna.fhfr.qr>","References":"<mpt1qe6lrbr.fsf@arm.com>\n <E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de> <mptbkdak6oa.fsf@arm.com>","User-Agent":"Alpine 2.22 (LSU 394 2020-01-19)","MIME-Version":"1.0","Content-Type":"multipart/mixed;\n boundary=\"-1609957120-1656513207-1696836034=:5561\"","X-Spam-Status":"No, score=-11.0 
required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT,\n SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3195193,"web_url":"http://patchwork.ozlabs.org/comment/3195193/","msgid":"<CA+=Sn1nToQ0HM_r3n_a85zgGB5bx6kVHOWehjZXiLTDbSJUUmw@mail.gmail.com>","list_archive_url":null,"date":"2023-10-09T07:36:30","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","submitter":{"id":40,"url":"http://patchwork.ozlabs.org/api/people/40/","name":"Andrew Pinski","email":"pinskia@gmail.com"},"content":"On Mon, Oct 9, 2023 at 12:20 AM Richard Biener <rguenther@suse.de> wrote:\n>\n> On Sat, 7 Oct 2023, Richard Sandiford wrote:\n>\n> > Richard Biener <rguenther@suse.de> writes:\n> > >> Am 07.10.2023 um 11:23 schrieb Richard Sandiford <richard.sandiford@arm.com>>> ﻿Richard Biener <rguenther@suse.de> writes:\n> > >>> On Thu, 5 Oct 2023, Tamar Christina wrote:\n> > >>>\n> > >>>>> I suppose the idea is that -abs(x) might be easier to optimize with other\n> > >>>>> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n> > >>>>>\n> > >>>>> For abs vs copysign it's a canonicalization, but 
(negate (abs @0)) is less\n> > >>>>> canonical than copysign.\n> > >>>>>\n> > >>>>>> Should I try removing this?\n> > >>>>>\n> > >>>>> I'd say yes (and put the reverse canonicalization next to this pattern).\n> > >>>>>\n> > >>>>\n> > >>>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n> > >>>> canonical and allows a target to expand this sequence efficiently.  Such\n> > >>>> sequences are common in scientific code working with gradients.\n> > >>>>\n> > >>>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n> > >>>> which means they exclude IFN_COPYSIGN.  COPYSIGN however is restricted to only\n> > >>>\n> > >>> That's not true:\n> > >>>\n> > >>> (define_operator_list COPYSIGN\n> > >>>    BUILT_IN_COPYSIGNF\n> > >>>    BUILT_IN_COPYSIGN\n> > >>>    BUILT_IN_COPYSIGNL\n> > >>>    IFN_COPYSIGN)\n> > >>>\n> > >>> but they miss the extended float builtin variants like\n> > >>> __builtin_copysignf16.  Also see below\n> > >>>\n> > >>>> the C99 builtins and so doesn't work for vectors.\n> > >>>>\n> > >>>> The patch expands these optimizations to work on COPYSIGN_ALL.\n> > >>>>\n> > >>>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n> > >>>> which I remove since this is a less efficient form.  
The testsuite is also\n> > >>>> updated in light of this.\n> > >>>>\n> > >>>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > >>>>\n> > >>>> Ok for master?\n> > >>>>\n> > >>>> Thanks,\n> > >>>> Tamar\n> > >>>>\n> > >>>> gcc/ChangeLog:\n> > >>>>\n> > >>>>    PR tree-optimization/109154\n> > >>>>    * match.pd: Add new neg+abs rule, remove inverse copysign rule and\n> > >>>>    expand existing copysign optimizations.\n> > >>>>\n> > >>>> gcc/testsuite/ChangeLog:\n> > >>>>\n> > >>>>    PR tree-optimization/109154\n> > >>>>    * gcc.dg/fold-copysign-1.c: Updated.\n> > >>>>    * gcc.dg/pr55152-2.c: Updated.\n> > >>>>    * gcc.dg/tree-ssa/abs-4.c: Updated.\n> > >>>>    * gcc.dg/tree-ssa/backprop-6.c: Updated.\n> > >>>>    * gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n> > >>>>    * gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n> > >>>>    * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > >>>>    * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > >>>>    * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > >>>>    * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > >>>>    * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > >>>>    * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > >>>>    * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > >>>>    * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > >>>>\n> > >>>> --- inline copy of patch ---\n> > >>>>\n> > >>>> diff --git a/gcc/match.pd b/gcc/match.pd\n> > >>>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n> > >>>> --- a/gcc/match.pd\n> > >>>> +++ b/gcc/match.pd\n> > >>>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n> > >>>>\n> > >>>> /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  
*/\n> > >>>> (for coss (COS COSH)\n> > >>>> -     copysigns (COPYSIGN)\n> > >>>> - (simplify\n> > >>>> -  (coss (copysigns @0 @1))\n> > >>>> -   (coss @0)))\n> > >>>> + (for copysigns (COPYSIGN_ALL)\n> > >>>\n> > >>> So this ends up generating for example the match\n> > >>> (cosf (copysignl ...)) which doesn't make much sense.\n> > >>>\n> > >>> The lock-step iteration did\n> > >>> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n> > >>> which is leaner but misses the case of\n> > >>> (cosf (ifn_copysign ..)) - that's probably what you are\n> > >>> after with this change.\n> > >>>\n> > >>> That said, there isn't a nice solution (without altering the match.pd\n> > >>> IL).  There's the explicit solution, spelling out all combinations.\n> > >>>\n> > >>> So if we want to go with yout pragmatic solution changing this\n> > >>> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n> > >>> for iteration to a cross product for iteration is.\n> > >>>\n> > >>> Changing just this pattern to\n> > >>>\n> > >>> (for coss (COS COSH)\n> > >>> (for copysigns (COPYSIGN)\n> > >>>  (simplify\n> > >>>   (coss (copysigns @0 @1))\n> > >>>   (coss @0))))\n> > >>>\n> > >>> increases the total number of gimple-match-x.cc lines from\n> > >>> 234988 to 235324.\n> > >>\n> > >> I guess the difference between this and the later suggestions is that\n> > >> this one allows builtin copysign to be paired with ifn cos, which would\n> > >> be potentially useful in other situations.  (It isn't here because\n> > >> ifn_cos is rarely provided.)  
How much of the growth is due to that,\n> > >> and much of it is from nonsensical combinations like\n> > >> (builtin_cosf (builtin_copysignl ...))?\n> > >>\n> > >> If it's mostly from nonsensical combinations then would it be possible\n> > >> to make genmatch drop them?\n> > >>\n> > >>> The alternative is to do\n> > >>>\n> > >>> (for coss (COS COSH)\n> > >>>     copysigns (COPYSIGN)\n> > >>> (simplify\n> > >>>  (coss (copysigns @0 @1))\n> > >>>   (coss @0))\n> > >>> (simplify\n> > >>>  (coss (IFN_COPYSIGN @0 @1))\n> > >>>   (coss @0)))\n> > >>>\n> > >>> which properly will diagnose a duplicate pattern.  Ther are\n> > >>> currently no operator lists with just builtins defined (that\n> > >>> could be fixed, see gencfn-macros.cc), supposed we'd have\n> > >>> COS_C we could do\n> > >>>\n> > >>> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n> > >>>     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN\n> > >>> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN\n> > >>> IFN_COPYSIGN)\n> > >>> (simplify\n> > >>>  (coss (copysigns @0 @1))\n> > >>>   (coss @0)))\n> > >>>\n> > >>> which of course still looks ugly ;) (some syntax extension like\n> > >>> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n> > >>> enough to do)\n> > >>>\n> > >>> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n> > >>> re-do it to only split the fors, keeping COPYSIGN and provide\n> > >>> some statistics on the gimple-match-* size?  I think this might\n> > >>> be the pragmatic solution for now.\n> > >>>\n> > >>> Richard - can you think of a clever way to express the desired\n> > >>> iteration?  
How do RTL macro iterations address cases like this?\n> > >>\n> > >> I don't think .md files have an equivalent construct, unfortunately.\n> > >> (I also regret some of the choices I made for .md iterators, but that's\n> > >> another story.)\n> > >>\n> > >> Perhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\n> > >> with the \"...\" meaning \"fill to match the longest operator list\n> > >> in the loop\".\n> > >\n> > > Hm, I'll think about this.  It would be useful to have a function like\n> > >\n> > > Internal_fn ifn_for (combined_fn);\n> > >\n> > > So we can indirectly match all builtins with a switch on the ifn code.\n> >\n> > There's:\n> >\n> > extern internal_fn associated_internal_fn (combined_fn, tree);\n> > extern internal_fn associated_internal_fn (tree);\n> > extern internal_fn replacement_internal_fn (gcall *);\n> >\n> > where the first one requires the return type, and the second one\n> > operates on CALL_EXPRs.\n>\n> Hmm, for full generality the way we code-generate would need to change\n> quite a bit.  Instead I've come up with the following quite limited\n> approach.  You can write\n>\n> (for coss (COS COSH)\n>  (simplify\n>   (coss (ANY_COPYSIGN @0 @1))\n>   (coss @0))))\n\nThis optimization is also handled by backprop (gimple-ssa-backprop.cc)\nin a better way than the match code handles it.\nSo maybe we don't really need to extend match-and-simplify here.\nRight now backprop is only run once, early after inlining. Maybe running it\nonce more late would help?\n\nThanks,\nAndrew\n\n\n>\n> with it.  For each internal function the following patch adds a\n> ANY_<name> identifier.  The use is somewhat limited - you cannot\n> use it as the outermost operation in the match part and you cannot\n> use it in the replacement part at all.  
The nice thing is there's\n> no \"iteration\" at all, the ANY_COPYSIGN doesn't cause any pattern\n> duplication, instead we match it via CASE_CFN_<name> so it will\n> happily match mis-matched (typewise) calls (but those shouldn't\n> be there...).\n>\n> The patch doesn't contain any defensiveness in the parser for the\n> use restriction, but you should get compile failures for misuses\n> at least.\n>\n> It should match quite some of the copysign cases, I suspect its\n> of no use for most of the operators so maybe less general handling\n> and only specifically introducing ANY_COPYSIGN would be better.\n> At least I cannot think of any other functions that are matched\n> but disappear in the resulting replacement?\n>\n> Richard.\n>\n> diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc\n> index 03d325efdf6..f7d3f51c013 100644\n> --- a/gcc/genmatch.cc\n> +++ b/gcc/genmatch.cc\n> @@ -524,10 +524,14 @@ class fn_id : public id_base\n>  {\n>  public:\n>    fn_id (enum built_in_function fn_, const char *id_)\n> -      : id_base (id_base::FN, id_), fn (fn_) {}\n> +      : id_base (id_base::FN, id_), fn (fn_), case_macro (nullptr) {}\n>    fn_id (enum internal_fn fn_, const char *id_)\n> -      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)) {}\n> +      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)),\n> +    case_macro (nullptr) {}\n> +  fn_id (const char *case_macro_, const char *id_)\n> +      : id_base (id_base::FN, id_), fn (-1U), case_macro (case_macro_) {}\n>    unsigned int fn;\n> +  const char *case_macro;\n>  };\n>\n>  class simplify;\n> @@ -3262,6 +3266,10 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n>               if (user_id *u = dyn_cast <user_id *> (e->operation))\n>                 for (auto id : u->substitutes)\n>                   fprintf_indent (f, indent, \"case %s:\\n\", id->id);\n> +             else if (is_a <fn_id *> (e->operation)\n> +                      && as_a <fn_id *> 
(e->operation)->case_macro)\n> +               fprintf_indent (f, indent, \"%s:\\n\",\n> +                               as_a <fn_id *> (e->operation)->case_macro);\n>               else\n>                 fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n>               /* We need to be defensive against bogus prototypes allowing\n> @@ -3337,9 +3345,12 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n>        for (unsigned j = 0; j < generic_fns.length (); ++j)\n>         {\n>           expr *e = as_a <expr *>(generic_fns[j]->op);\n> -         gcc_assert (e->operation->kind == id_base::FN);\n> +         fn_id *oper = as_a <fn_id *> (e->operation);\n>\n> -         fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n> +         if (oper->case_macro)\n> +           fprintf_indent (f, indent, \"%s:\\n\", oper->case_macro);\n> +         else\n> +           fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n>           fprintf_indent (f, indent, \"  if (call_expr_nargs (%s) == %d)\\n\"\n>                                      \"    {\\n\", kid_opname, e->ops.length ());\n>           generic_fns[j]->gen (f, indent + 6, false, depth);\n> @@ -5496,7 +5507,8 @@ main (int argc, char **argv)\n>  #include \"builtins.def\"\n>\n>  #define DEF_INTERNAL_FN(CODE, NAME, FNSPEC) \\\n> -  add_function (IFN_##CODE, \"CFN_\" #CODE);\n> +  add_function (IFN_##CODE, \"CFN_\" #CODE); \\\n> +  add_function (\"CASE_CFN_\" # CODE, \"ANY_\" # CODE);\n>  #include \"internal-fn.def\"\n>\n>    /* Parse ahead!  
*/","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20230601 header.b=jvIEQye+;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=8.43.85.97; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","sourceware.org; spf=pass smtp.mailfrom=gmail.com"],"Received":["from server2.sourceware.org (ip-8-43-85-97.sourceware.org\n [8.43.85.97])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S3rT82gHyz1yq1\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 18:37:00 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id 49355385696B\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 07:36:58 +0000 (GMT)","from mail-pg1-x529.google.com (mail-pg1-x529.google.com\n [IPv6:2607:f8b0:4864:20::529])\n by sourceware.org (Postfix) with ESMTPS id BC4AA385772B\n for <gcc-patches@gcc.gnu.org>; Mon,  9 Oct 2023 07:36:43 +0000 (GMT)","by mail-pg1-x529.google.com with SMTP id\n 41be03b00d2f7-578b407045bso3424152a12.0\n for <gcc-patches@gcc.gnu.org>; Mon, 09 Oct 2023 00:36:43 -0700 (PDT)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org BC4AA385772B","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=gmail.com; s=20230601; t=1696837002; 
x=1697441802; darn=gcc.gnu.org;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:from:to:cc:subject:date\n :message-id:reply-to;\n bh=lHq1GnbFGTu6XO3h+VQuknG0WHNieFHtT2l5KqAjqVk=;\n b=jvIEQye+NbJ6SZjTnSCd3Blw0Kt4AJ7KotBOUNmiE7krCgcQ6f1WujouxtOLko4vHs\n p/AVGZ/yk+vEfEy412LANjfnR0049MUDPpD0vJArpmwXkDmLfyD0fXHo9u4wrfcqp+rR\n hE+qboa4PRj61F0BXn1PGwAuELlX5XOeKe0ZoVTwuQnmtkMeHkCPGycwgfqUpUEe5pRR\n 2BLI6JZIbRIdb2qHQYGZ3FxiOCKxMWXmqYhDEh7aOMyHw3m8iIXji42aWUSyZckdSIYJ\n mfGM+Aq/gx36MrC1PL+DvNMvo+aZZlt9evVNYgUXBtDuN4FowRe5/6hCa8mJUUh0Fn2T\n AbSw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20230601; t=1696837002; x=1697441802;\n h=content-transfer-encoding:cc:to:subject:message-id:date:from\n :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc\n :subject:date:message-id:reply-to;\n bh=lHq1GnbFGTu6XO3h+VQuknG0WHNieFHtT2l5KqAjqVk=;\n b=lCRSqUjshjOW1KkGq9fjA+DzzWlWImKIpI0Jny4rIS350sCstZOFOif8+fyXQN363G\n +HhBzoNIzNTGteCr88huwY4Int8UNwHvniGi8e/wLygMvP5puUKP3AGIEegZjgwmmvFy\n gRRefOATuhwWTDzcTcWg9O25WbkhHf4vncqGxBcBrTYFo9a1328G0OEgNv8vX6GT7cpb\n 7OG6oyjwE5W9hQe5xWzyXfNJCI8suDjOTYLV+5q3TIHfT2V3tzp79JeNUQo2ufFC3UOB\n SIVltgIpb2WrptFRwng/60CR67avR7rQ5RMUxaDlWd215VD75ZcmW/gAwUtLmzxK1GKZ\n psGg==","X-Gm-Message-State":"AOJu0YxQ20LMBIo52JtvuNQ1wIrdcZZh4sHN9JTuxTqCf5FwKn9PsVze\n jQOeZCG+zSNkMhCTjjA0BwFAkPskJeey2rYvtMs=","X-Google-Smtp-Source":"\n AGHT+IGR3SrvZqXibAvBvU4NsnTuQYmYg6NbZ5FQGT3VQyP8SgScbrB05Mf/gC+tSEjogvQ69vfbaoQDYu0JPbeEegM=","X-Received":"by 2002:a17:90b:3a84:b0:274:4fb:360a with SMTP id\n om4-20020a17090b3a8400b0027404fb360amr12841361pjb.16.1696837002396; Mon, 09\n Oct 2023 00:36:42 -0700 (PDT)","MIME-Version":"1.0","References":"<mpt1qe6lrbr.fsf@arm.com>\n <E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de>\n <mptbkdak6oa.fsf@arm.com>\n 
<nycvar.YFH.7.77.849.2310090711380.5561@jbgna.fhfr.qr>","In-Reply-To":"<nycvar.YFH.7.77.849.2310090711380.5561@jbgna.fhfr.qr>","From":"Andrew Pinski <pinskia@gmail.com>","Date":"Mon, 9 Oct 2023 00:36:30 -0700","Message-ID":"\n <CA+=Sn1nToQ0HM_r3n_a85zgGB5bx6kVHOWehjZXiLTDbSJUUmw@mail.gmail.com>","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 <<\n signbit(x)) [PR109154]","To":"Richard Biener <rguenther@suse.de>","Cc":"Richard Sandiford <richard.sandiford@arm.com>,\n Tamar Christina <tamar.christina@arm.com>,\n gcc-patches@gcc.gnu.org, nd <nd@arm.com>, jlaw@ventanamicro.com","Content-Type":"text/plain; charset=\"UTF-8\"","Content-Transfer-Encoding":"quoted-printable","X-Spam-Status":"No, score=-8.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,\n KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,\n TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3195259,"web_url":"http://patchwork.ozlabs.org/comment/3195259/","msgid":"<nycvar.YFH.7.77.849.2310090828360.5561@jbgna.fhfr.qr>","list_archive_url":null,"date":"2023-10-09T09:06:00","subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << 
signbit(x)) [PR109154]","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/people/4338/","name":"Richard Biener","email":"rguenther@suse.de"},"content":"On Mon, 9 Oct 2023, Andrew Pinski wrote:\n\n> On Mon, Oct 9, 2023 at 12:20?AM Richard Biener <rguenther@suse.de> wrote:\n> >\n> > On Sat, 7 Oct 2023, Richard Sandiford wrote:\n> >\n> > > Richard Biener <rguenther@suse.de> writes:\n> > > >> Am 07.10.2023 um 11:23 schrieb Richard Sandiford <richard.sandiford@arm.com>>> ﻿Richard Biener <rguenther@suse.de> writes:\n> > > >>> On Thu, 5 Oct 2023, Tamar Christina wrote:\n> > > >>>\n> > > >>>>> I suppose the idea is that -abs(x) might be easier to optimize with other\n> > > >>>>> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).\n> > > >>>>>\n> > > >>>>> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less\n> > > >>>>> canonical than copysign.\n> > > >>>>>\n> > > >>>>>> Should I try removing this?\n> > > >>>>>\n> > > >>>>> I'd say yes (and put the reverse canonicalization next to this pattern).\n> > > >>>>>\n> > > >>>>\n> > > >>>> This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more\n> > > >>>> canonical and allows a target to expand this sequence efficiently.  Such\n> > > >>>> sequences are common in scientific code working with gradients.\n> > > >>>>\n> > > >>>> various optimizations in match.pd only happened on COPYSIGN but not COPYSIGN_ALL\n> > > >>>> which means they exclude IFN_COPYSIGN.  COPYSIGN however is restricted to only\n> > > >>>\n> > > >>> That's not true:\n> > > >>>\n> > > >>> (define_operator_list COPYSIGN\n> > > >>>    BUILT_IN_COPYSIGNF\n> > > >>>    BUILT_IN_COPYSIGN\n> > > >>>    BUILT_IN_COPYSIGNL\n> > > >>>    IFN_COPYSIGN)\n> > > >>>\n> > > >>> but they miss the extended float builtin variants like\n> > > >>> __builtin_copysignf16.  
Also see below\n> > > >>>\n> > > >>>> the C99 builtins and so doesn't work for vectors.\n> > > >>>>\n> > > >>>> The patch expands these optimizations to work on COPYSIGN_ALL.\n> > > >>>>\n> > > >>>> There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))\n> > > >>>> which I remove since this is a less efficient form.  The testsuite is also\n> > > >>>> updated in light of this.\n> > > >>>>\n> > > >>>> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.\n> > > >>>>\n> > > >>>> Ok for master?\n> > > >>>>\n> > > >>>> Thanks,\n> > > >>>> Tamar\n> > > >>>>\n> > > >>>> gcc/ChangeLog:\n> > > >>>>\n> > > >>>>    PR tree-optimization/109154\n> > > >>>>    * match.pd: Add new neg+abs rule, remove inverse copysign rule and\n> > > >>>>    expand existing copysign optimizations.\n> > > >>>>\n> > > >>>> gcc/testsuite/ChangeLog:\n> > > >>>>\n> > > >>>>    PR tree-optimization/109154\n> > > >>>>    * gcc.dg/fold-copysign-1.c: Updated.\n> > > >>>>    * gcc.dg/pr55152-2.c: Updated.\n> > > >>>>    * gcc.dg/tree-ssa/abs-4.c: Updated.\n> > > >>>>    * gcc.dg/tree-ssa/backprop-6.c: Updated.\n> > > >>>>    * gcc.dg/tree-ssa/copy-sign-2.c: Updated.\n> > > >>>>    * gcc.dg/tree-ssa/mult-abs-2.c: Updated.\n> > > >>>>    * gcc.target/aarch64/fneg-abs_1.c: New test.\n> > > >>>>    * gcc.target/aarch64/fneg-abs_2.c: New test.\n> > > >>>>    * gcc.target/aarch64/fneg-abs_3.c: New test.\n> > > >>>>    * gcc.target/aarch64/fneg-abs_4.c: New test.\n> > > >>>>    * gcc.target/aarch64/sve/fneg-abs_1.c: New test.\n> > > >>>>    * gcc.target/aarch64/sve/fneg-abs_2.c: New test.\n> > > >>>>    * gcc.target/aarch64/sve/fneg-abs_3.c: New test.\n> > > >>>>    * gcc.target/aarch64/sve/fneg-abs_4.c: New test.\n> > > >>>>\n> > > >>>> --- inline copy of patch ---\n> > > >>>>\n> > > >>>> diff --git a/gcc/match.pd b/gcc/match.pd\n> > > >>>> index 4bdd83e6e061b16dbdb2845b9398fcfb8a6c9739..bd6599d36021e119f51a4928354f580ffe82c6e2 100644\n> > > >>>> --- a/gcc/match.pd\n> > > >>>> +++ 
b/gcc/match.pd\n> > > >>>> @@ -1074,45 +1074,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)\n> > > >>>>\n> > > >>>> /* cos(copysign(x, y)) -> cos(x).  Similarly for cosh.  */\n> > > >>>> (for coss (COS COSH)\n> > > >>>> -     copysigns (COPYSIGN)\n> > > >>>> - (simplify\n> > > >>>> -  (coss (copysigns @0 @1))\n> > > >>>> -   (coss @0)))\n> > > >>>> + (for copysigns (COPYSIGN_ALL)\n> > > >>>\n> > > >>> So this ends up generating for example the match\n> > > >>> (cosf (copysignl ...)) which doesn't make much sense.\n> > > >>>\n> > > >>> The lock-step iteration did\n> > > >>> (cosf (copysignf ..)) ... (ifn_cos (ifn_copysign ...))\n> > > >>> which is leaner but misses the case of\n> > > >>> (cosf (ifn_copysign ..)) - that's probably what you are\n> > > >>> after with this change.\n> > > >>>\n> > > >>> That said, there isn't a nice solution (without altering the match.pd\n> > > >>> IL).  There's the explicit solution, spelling out all combinations.\n> > > >>>\n> > > >>> So if we want to go with yout pragmatic solution changing this\n> > > >>> to use COPYSIGN_ALL isn't necessary, only changing the lock-step\n> > > >>> for iteration to a cross product for iteration is.\n> > > >>>\n> > > >>> Changing just this pattern to\n> > > >>>\n> > > >>> (for coss (COS COSH)\n> > > >>> (for copysigns (COPYSIGN)\n> > > >>>  (simplify\n> > > >>>   (coss (copysigns @0 @1))\n> > > >>>   (coss @0))))\n> > > >>>\n> > > >>> increases the total number of gimple-match-x.cc lines from\n> > > >>> 234988 to 235324.\n> > > >>\n> > > >> I guess the difference between this and the later suggestions is that\n> > > >> this one allows builtin copysign to be paired with ifn cos, which would\n> > > >> be potentially useful in other situations.  (It isn't here because\n> > > >> ifn_cos is rarely provided.)  
How much of the growth is due to that,\n> > > >> and much of it is from nonsensical combinations like\n> > > >> (builtin_cosf (builtin_copysignl ...))?\n> > > >>\n> > > >> If it's mostly from nonsensical combinations then would it be possible\n> > > >> to make genmatch drop them?\n> > > >>\n> > > >>> The alternative is to do\n> > > >>>\n> > > >>> (for coss (COS COSH)\n> > > >>>     copysigns (COPYSIGN)\n> > > >>> (simplify\n> > > >>>  (coss (copysigns @0 @1))\n> > > >>>   (coss @0))\n> > > >>> (simplify\n> > > >>>  (coss (IFN_COPYSIGN @0 @1))\n> > > >>>   (coss @0)))\n> > > >>>\n> > > >>> which properly will diagnose a duplicate pattern.  Ther are\n> > > >>> currently no operator lists with just builtins defined (that\n> > > >>> could be fixed, see gencfn-macros.cc), supposed we'd have\n> > > >>> COS_C we could do\n> > > >>>\n> > > >>> (for coss (COS_C COSH_C IFN_COS IFN_COSH)\n> > > >>>     copysigns (COPYSIGN_C COPYSIGN_C IFN_COPYSIGN IFN_COPYSIGN\n> > > >>> IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN IFN_COPYSIGN\n> > > >>> IFN_COPYSIGN)\n> > > >>> (simplify\n> > > >>>  (coss (copysigns @0 @1))\n> > > >>>   (coss @0)))\n> > > >>>\n> > > >>> which of course still looks ugly ;) (some syntax extension like\n> > > >>> allowing to specify IFN_COPYSIGN*8 would be nice here and easy\n> > > >>> enough to do)\n> > > >>>\n> > > >>> Can you split out the part changing COPYSIGN to COPYSIGN_ALL,\n> > > >>> re-do it to only split the fors, keeping COPYSIGN and provide\n> > > >>> some statistics on the gimple-match-* size?  I think this might\n> > > >>> be the pragmatic solution for now.\n> > > >>>\n> > > >>> Richard - can you think of a clever way to express the desired\n> > > >>> iteration?  
How do RTL macro iterations address cases like this?\n> > > >>\n> > > >> I don't think .md files have an equivalent construct, unfortunately.\n> > > >> (I also regret some of the choices I made for .md iterators, but that's\n> > > >> another story.)\n> > > >>\n> > > >> Perhaps an alternative to the *8 thing would be \"IFN_COPYSIGN...\",\n> > > >> with the \"...\" meaning \"fill to match the longest operator list\n> > > >> in the loop\".\n> > > >\n> > > > Hm, I'll think about this.  It would be useful to have a function like\n> > > >\n> > > > Internal_fn ifn_for (combined_fn);\n> > > >\n> > > > So we can indirectly match all builtins with a switch on the ifn code.\n> > >\n> > > There's:\n> > >\n> > > extern internal_fn associated_internal_fn (combined_fn, tree);\n> > > extern internal_fn associated_internal_fn (tree);\n> > > extern internal_fn replacement_internal_fn (gcall *);\n> > >\n> > > where the first one requires the return type, and the second one\n> > > operates on CALL_EXPRs.\n> >\n> > Hmm, for full generality the way we code-generate would need to change\n> > quite a bit.  Instead I've come up with the following quite limited\n> > approach.  You can write\n> >\n> > (for coss (COS COSH)\n> >  (simplify\n> >   (coss (ANY_COPYSIGN @0 @1))\n> >   (coss @0))))\n> \n> This optimization is also handled by backprop (gimple-ssa-backprop.cc)\n> in a better way than the match code handle.\n> So maybe we don't really need to extend match-and-simplify here.\n> Right now backprop is only ran once early after inlining. Maybe run it\n> once more late would help?\n\nI think it generally makes sense to clean up simple things during\nbuilding/folding and not wait for specialized passes.\n\nThe question here is mostly whether we are fine with some bloat\nin {generic,gimple}-match-?.cc or not.  
The change proposed likely\ndoesn't make a big impact as it's going to be of limited use.\n\nAny change exposing semantics of the builtins to genmatch so it\ncan rule out, say, combining BUILT_IN_SINF and BUILT_IN_COS is going\nto be quite difficult if you consider\n\n(for coss (BUILT_IN_COS BUILT_IN_COSL)\n     sins (BUILT_IN_SINF BUILT_IN_SIN)\n (simplify\n  (coss (convert (sins @0)))\n...\n\nor so.  The current operator-list handling treats them as ordered;\nwe might want to introduce a semantically different operator-set\niteration, treating them unordered.  That's basically how\n\n (for list1 (...)\n  (for list2 (...)\n\nworks.  There would be the opportunity to change code generation\nfor such a case to a catch-all case, but the way we generate the\ndecision tree makes this difficult I think.  I've filed PR111732\nto track this genmatch optimization opportunity.\n\nRichard.\n\n\n\n> Thanks,\n> Andrew\n> \n> \n> >\n> > with it.  For each internal function the following patch adds a\n> > ANY_<name> identifier.  The use is somewhat limited - you cannot\n> > use it as the outermost operation in the match part and you cannot\n> > use it in the replacement part at all.  
The nice thing is there's\n> > no \"iteration\" at all, the ANY_COPYSIGN doesn't cause any pattern\n> > duplication, instead we match it via CASE_CFN_<name> so it will\n> > happily match mis-matched (typewise) calls (but those shouldn't\n> > be there...).\n> >\n> > The patch doesn't contain any defensiveness in the parser for the\n> > use restriction, but you should get compile failures for misuses\n> > at least.\n> >\n> > It should match quite some of the copysign cases, I suspect its\n> > of no use for most of the operators so maybe less general handling\n> > and only specifically introducing ANY_COPYSIGN would be better.\n> > At least I cannot think of any other functions that are matched\n> > but disappear in the resulting replacement?\n> >\n> > Richard.\n> >\n> > diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc\n> > index 03d325efdf6..f7d3f51c013 100644\n> > --- a/gcc/genmatch.cc\n> > +++ b/gcc/genmatch.cc\n> > @@ -524,10 +524,14 @@ class fn_id : public id_base\n> >  {\n> >  public:\n> >    fn_id (enum built_in_function fn_, const char *id_)\n> > -      : id_base (id_base::FN, id_), fn (fn_) {}\n> > +      : id_base (id_base::FN, id_), fn (fn_), case_macro (nullptr) {}\n> >    fn_id (enum internal_fn fn_, const char *id_)\n> > -      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)) {}\n> > +      : id_base (id_base::FN, id_), fn (int (END_BUILTINS) + int (fn_)),\n> > +    case_macro (nullptr) {}\n> > +  fn_id (const char *case_macro_, const char *id_)\n> > +      : id_base (id_base::FN, id_), fn (-1U), case_macro (case_macro_) {}\n> >    unsigned int fn;\n> > +  const char *case_macro;\n> >  };\n> >\n> >  class simplify;\n> > @@ -3262,6 +3266,10 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n> >               if (user_id *u = dyn_cast <user_id *> (e->operation))\n> >                 for (auto id : u->substitutes)\n> >                   fprintf_indent (f, indent, \"case %s:\\n\", id->id);\n> > +             else if (is_a 
<fn_id *> (e->operation)\n> > +                      && as_a <fn_id *> (e->operation)->case_macro)\n> > +               fprintf_indent (f, indent, \"%s:\\n\",\n> > +                               as_a <fn_id *> (e->operation)->case_macro);\n> >               else\n> >                 fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n> >               /* We need to be defensive against bogus prototypes allowing\n> > @@ -3337,9 +3345,12 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, int depth,\n> >        for (unsigned j = 0; j < generic_fns.length (); ++j)\n> >         {\n> >           expr *e = as_a <expr *>(generic_fns[j]->op);\n> > -         gcc_assert (e->operation->kind == id_base::FN);\n> > +         fn_id *oper = as_a <fn_id *> (e->operation);\n> >\n> > -         fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n> > +         if (oper->case_macro)\n> > +           fprintf_indent (f, indent, \"%s:\\n\", oper->case_macro);\n> > +         else\n> > +           fprintf_indent (f, indent, \"case %s:\\n\", e->operation->id);\n> >           fprintf_indent (f, indent, \"  if (call_expr_nargs (%s) == %d)\\n\"\n> >                                      \"    {\\n\", kid_opname, e->ops.length ());\n> >           generic_fns[j]->gen (f, indent + 6, false, depth);\n> > @@ -5496,7 +5507,8 @@ main (int argc, char **argv)\n> >  #include \"builtins.def\"\n> >\n> >  #define DEF_INTERNAL_FN(CODE, NAME, FNSPEC) \\\n> > -  add_function (IFN_##CODE, \"CFN_\" #CODE);\n> > +  add_function (IFN_##CODE, \"CFN_\" #CODE); \\\n> > +  add_function (\"CASE_CFN_\" # CODE, \"ANY_\" # CODE);\n> >  #include \"internal-fn.def\"\n> >\n> >    /* Parse ahead!  
*/\n>","headers":{"Return-Path":"<gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=0cJuuhTu;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=47K+PTWZ;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;\n envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de"],"Received":["from server2.sourceware.org (server2.sourceware.org\n [IPv6:2620:52:3:1:0:246e:9693:128c])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4S3tSC6B8Yz1yq1\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 20:06:19 +1100 (AEDT)","from server2.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id D59283858434\n\tfor <incoming@patchwork.ozlabs.org>; Mon,  9 Oct 2023 09:06:17 +0000 (GMT)","from smtp-out2.suse.de (smtp-out2.suse.de\n [IPv6:2001:67c:2178:6::1d])\n by sourceware.org (Postfix) with ESMTPS id ABD723858D1E\n for <gcc-patches@gcc.gnu.org>; Mon,  9 Oct 2023 09:06:01 +0000 (GMT)","from relay2.suse.de (relay2.suse.de [149.44.160.134])\n by smtp-out2.suse.de (Postfix) with ESMTP id 9B1631F749;\n Mon,  9 Oct 2023 09:06:00 +0000 (UTC)","from wotan.suse.de (wotan.suse.de 
[10.160.0.1])\n (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n (No client certificate requested)\n by relay2.suse.de (Postfix) with ESMTPS id 590402C143;\n Mon,  9 Oct 2023 09:06:00 +0000 (UTC)"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org ABD723858D1E","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1696842360;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=16M9GNTreHpDISEBbaOxZXFpvmPPSrQ1nFvILecMW0Y=;\n b=0cJuuhTupLV659zmFGYMjKT12sodoPbTBGVpN9kO0QksRXGz5wtzZkM5JWeLth7Qm5WVTF\n qx1tekPGs3rb/INHtb0HOB4Ubc/WUTDkFhtvz8TOhBtLofQF4mGZXr4hqgXYH4rlV/IdJ/\n SuEPwBagMM+d4cU0sGQhDICstopgH/I=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1696842360;\n h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc:\n mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=16M9GNTreHpDISEBbaOxZXFpvmPPSrQ1nFvILecMW0Y=;\n b=47K+PTWZzoFdi0t1ocBVLzDIZS622awvNBEd36Y6gFX6LzW6jTZqHfrJsFbqPjpGZn6VN+\n X2S70SlBrHKPPODg=="],"Date":"Mon, 9 Oct 2023 09:06:00 +0000 (UTC)","From":"Richard Biener <rguenther@suse.de>","To":"Andrew Pinski <pinskia@gmail.com>","cc":"Richard Sandiford <richard.sandiford@arm.com>,\n Tamar Christina <tamar.christina@arm.com>, gcc-patches@gcc.gnu.org,\n nd <nd@arm.com>, jlaw@ventanamicro.com","Subject":"Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1\n << signbit(x)) [PR109154]","In-Reply-To":"\n <CA+=Sn1nToQ0HM_r3n_a85zgGB5bx6kVHOWehjZXiLTDbSJUUmw@mail.gmail.com>","Message-ID":"<nycvar.YFH.7.77.849.2310090828360.5561@jbgna.fhfr.qr>","References":"<mpt1qe6lrbr.fsf@arm.com>\n <E1BD3E05-ACB9-4484-8979-D0B71759A3FC@suse.de> <mptbkdak6oa.fsf@arm.com>\n <nycvar.YFH.7.77.849.2310090711380.5561@jbgna.fhfr.qr>\n 
<CA+=Sn1nToQ0HM_r3n_a85zgGB5bx6kVHOWehjZXiLTDbSJUUmw@mail.gmail.com>","User-Agent":"Alpine 2.22 (LSU 394 2020-01-19)","MIME-Version":"1.0","Content-Type":"multipart/mixed;\n boundary=\"-1609957120-471027652-1696842360=:5561\"","X-Spam-Status":"No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED,\n DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT,\n SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6","X-Spam-Checker-Version":"SpamAssassin 3.4.6 (2021-04-09) on\n server2.sourceware.org","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org"}}]