{"id":2226259,"url":"http://patchwork.ozlabs.org/api/1.2/patches/2226259/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.5@forge-stage.sourceware.org/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/1.2/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.5@forge-stage.sourceware.org>","list_archive_url":null,"date":"2026-04-22T10:29:50","name":"[v1,05/11] Popcount V2HI support","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"080a1e64c34642adb80ec77e8d78fe457dbda1bc","submitter":{"id":93219,"url":"http://patchwork.ozlabs.org/api/1.2/people/93219/?format=json","name":"Andrew Pinski via Sourceware Forge","email":"forge-bot+pinskia@forge-stage.sourceware.org"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.5@forge-stage.sourceware.org/mbox/","series":[{"id":500972,"url":"http://patchwork.ozlabs.org/api/1.2/series/500972/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=500972","date":"2026-04-22T10:29:49","name":"WIP: v2hiv4qi","version":1,"mbox":"http://patchwork.ozlabs.org/series/500972/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2226259/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2226259/checks/","tags":{},"related":[],"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org; dmarc=none (p=none dis=none)\n header.from=forge-stage.sourceware.org","sourceware.org;\n spf=pass smtp.mailfrom=forge-stage.sourceware.org","server2.sourceware.org;\n arc=none smtp.remote-ip=38.145.34.39"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g0wwn1SRvz1yD5\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 20:50:17 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id BF47841F9E34\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 22 Apr 2026 10:50:14 +0000 (GMT)","from forge-stage.sourceware.org (vm08.sourceware.org [38.145.34.39])\n by sourceware.org (Postfix) with ESMTPS id 2724648FDB04\n for <gcc-patches@gcc.gnu.org>; Wed, 22 Apr 2026 10:31:09 +0000 (GMT)","from forge-stage.sourceware.org (localhost [IPv6:::1])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange x25519 server-signature ECDSA (prime256v1) server-digest SHA256)\n (No client certificate requested)\n by forge-stage.sourceware.org (Postfix) with ESMTPS id 4E2F442608\n for <gcc-patches@gcc.gnu.org>; Wed, 22 Apr 2026 10:31:06 +0000 (UTC)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org BF47841F9E34","OpenDKIM Filter v2.11.0 sourceware.org 2724648FDB04"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 2724648FDB04","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org 2724648FDB04","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1776853869; cv=none;\n b=eC6IddVrC4Sgr1eM9qJm4U6q4KIn1LNUxJuGoh2DYC4Q8FjjZnl+qVkqhARDkYkSIJp+s7c/+kUAawS1weBbt5cR980ya7r1+qWvUlM0QSUASRXrGC3adb8uYY8lIxAUyFdGD7gNAV3ZIJ8xgCX7rhJYvRgk1GWHK5pYJ4fIhYw=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1776853869; c=relaxed/simple;\n bh=viX1VjCqdgobvFpO8nIuWT0j7dhdx3HjSdXpIXJO4K4=;\n h=From:Date:Subject:To:Message-ID;\n b=ih1gUDv8piSDrvlXL4Gbt1tjCo0PiKhUE/In22n2tpDKg+WOnXEhveRDaMtGz95FaW+EMWpLKpGEHuPT6iTcxAMtAxO0Tg2/SpEJWGxxA/0no9tgRaT2pfD/PHIbuhLH9uriKtkGzW4QzOUIdOKlZyozSf44Y655ifO8VqQjGVs=","ARC-Authentication-Results":"i=1; server2.sourceware.org","From":"Andrew Pinski via Sourceware Forge\n <forge-bot+pinskia@forge-stage.sourceware.org>","Date":"Wed, 22 Apr 2026 10:29:50 +0000","Subject":"[PATCH v1 05/11] Popcount V2HI support","To":"gcc-patches mailing list <gcc-patches@gcc.gnu.org>","Message-ID":"\n <bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.5@forge-stage.sourceware.org>","X-Mailer":"batrachomyomachia","X-Pull-Request-Organization":"gcc","X-Pull-Request-Repository":"gcc-TEST","X-Pull-Request":"https://forge.sourceware.org/gcc/gcc-TEST/pulls/20","References":"\n <bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.0@forge-stage.sourceware.org>","In-Reply-To":"\n <bmm.hhubd62wbc.gcc.gcc-TEST.pinskia.20.1.0@forge-stage.sourceware.org>","X-Patch-URL":"\n https://forge.sourceware.org/pinskia/gcc-TEST/commit/4440d67e07dc9e7b4d99d337333c8e957b00f3aa","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Reply-To":"gcc-patches mailing list <gcc-patches@gcc.gnu.org>,\n pinskia@gcc.gnu.org","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"From: Andrew Pinski <quic_apinski@quicinc.com>\n\nThis adds V2HI support for popcount.\nSince V4HI support was added afterwards, this adds it\nin a secondary patch also.\n\ngcc/ChangeLog:\n\n\t* config/aarch64/aarch64-simd.md:\n\t* config/aarch64/iterators.md (128):\n\t(V8HI):\n\ngcc/testsuite/ChangeLog:\n\n\t* gcc.target/aarch64/popcnt-vec-1.c: New test.\n\nSigned-off-by: Andrew Pinski <quic_apinski@quicinc.com>\n---\n gcc/config/aarch64/aarch64-simd.md            | 19 ++++++++----\n gcc/config/aarch64/iterators.md               | 20 +++++++++++--\n .../gcc.target/aarch64/popcnt-vec-1.c         | 30 +++++++++++++++++++\n 3 files changed, 60 insertions(+), 9 deletions(-)\n create mode 100644 gcc/testsuite/gcc.target/aarch64/popcnt-vec-1.c","diff":"diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md\nindex 13e632550cd0..1cee004664ca 100644\n--- a/gcc/config/aarch64/aarch64-simd.md\n+++ b/gcc/config/aarch64/aarch64-simd.md\n@@ -1012,7 +1012,7 @@\n \t  (plus:<VDBLW>\n \t    (vec_select:<VDBLW>\n \t      (ANY_EXTEND:<V2XWIDE>\n-\t\t(match_operand:VDQV_L 2 \"register_operand\"))\n+\t\t(match_operand:VDQSV_L 2 \"register_operand\"))\n \t      (match_dup 3))\n \t    (vec_select:<VDBLW> (ANY_EXTEND:<V2XWIDE> (match_dup 2))\n \t      (match_dup 4)))\n@@ -1031,7 +1031,7 @@\n \t  (plus:<VDBLW>\n \t    (vec_select:<VDBLW>\n \t      (ANY_EXTEND:<V2XWIDE>\n-\t\t(match_operand:VDQV_L 2 \"register_operand\" \"w\"))\n+\t\t(match_operand:VDQSV_L 2 \"register_operand\" \"w\"))\n \t      (match_operand:<V2XWIDE> 3 \"vect_par_cnst_even_or_odd_half\" \"\"))\n \t    (vec_select:<VDBLW> (ANY_EXTEND:<V2XWIDE> (match_dup 2))\n \t      (match_operand:<V2XWIDE> 4 \"vect_par_cnst_even_or_odd_half\" \"\")))\n@@ -3498,7 +3498,7 @@\n \t(plus:<VDBLW>\n \t  (vec_select:<VDBLW>\n \t    (ANY_EXTEND:<V2XWIDE>\n-\t      (match_operand:VDQV_L 1 \"register_operand\"))\n+\t      (match_operand:VDQSV_L 1 \"register_operand\"))\n \t    (match_dup 2))\n \t  (vec_select:<VDBLW> (ANY_EXTEND:<V2XWIDE> (match_dup 1))\n \t    (match_dup 3))))]\n@@ -3515,7 +3515,7 @@\n \t(plus:<VDBLW>\n \t  (vec_select:<VDBLW>\n \t    (ANY_EXTEND:<V2XWIDE>\n-\t      (match_operand:VDQV_L 1 \"register_operand\" \"w\"))\n+\t      (match_operand:VDQSV_L 1 \"register_operand\" \"w\"))\n \t    (match_operand:<V2XWIDE> 2 \"vect_par_cnst_even_or_odd_half\"))\n \t  (vec_select:<VDBLW> (ANY_EXTEND:<V2XWIDE> (match_dup 1))\n \t    (match_operand:<V2XWIDE> 3 \"vect_par_cnst_even_or_odd_half\"))))]\n@@ -3573,8 +3573,15 @@\n       }\n \n     /* Generate a byte popcount.  */\n-    machine_mode mode = <bitsize> == 64 ? V8QImode : V16QImode;\n-    machine_mode mode2 = <bitsize> == 64 ? V2SImode : V4SImode;\n+    machine_mode mode;\n+    machine_mode mode2;\n+    /* V2HI and V4QI does need mode2.*/\n+    if (<real_bitsize> == 32)\n+      mode = V4QImode, mode2 = BLKmode;\n+    else if (<real_bitsize> == 64)\n+      mode = V8QImode, mode2 = V2SImode;\n+    else \n+      mode = V16QImode, mode2 = V4SImode;\n     rtx tmp = gen_reg_rtx (mode);\n     auto icode = optab_handler (popcount_optab, mode);\n     emit_insn (GEN_FCN (icode) (tmp, gen_lowpart (mode, operands[1])));\ndiff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md\nindex 719e5e91168c..c042da62a870 100644\n--- a/gcc/config/aarch64/iterators.md\n+++ b/gcc/config/aarch64/iterators.md\n@@ -263,6 +263,10 @@\n ;; Advanced SIMD modes for Integer widening reduction across lanes.\n (define_mode_iterator VDQV_L [V8QI V16QI V4HI V8HI V4SI V2SI])\n \n+;; Advanced SIMD modes for Integer widening reduction across lanes.\n+;; Plus 32bit modes\n+(define_mode_iterator VDQSV_L [V4QI V8QI V16QI V2HI V4HI V8HI V4SI V2SI])\n+\n ;; All double integer narrow-able modes.\n (define_mode_iterator VDN [V4HI V2SI DI])\n \n@@ -318,7 +322,10 @@\n (define_mode_iterator VDQHS [V4HI V8HI V2SI V4SI])\n \n ;; Advanced SIMD modes for H, S and D types.\n-(define_mode_iterator VDQHSD [V4HI V8HI V2SI V4SI V2DI])\n+;; Note this is used only for bswap and popcount patterns\n+;; so including V2HI is fine as the aarch64 intrinsics\n+;; are not used for them\n+(define_mode_iterator VDQHSD [V2HI V4HI V8HI V2SI V4SI V2DI])\n \n (define_mode_iterator VDQHSD_V1DI [VDQHSD V1DI])\n \n@@ -1278,6 +1285,13 @@\n \t\t\t   (V2SI \"64\") (V4SI \"128\")\n \t\t\t   (V1DI \"64\") (V2DI \"128\")])\n \n+;; Map a mode to the number of bits in it, if the size of the mode\n+;; is constant. Unlike the above V4QI and V2HI are 32bit\n+(define_mode_attr real_bitsize [(V4QI \"32\") (V8QI \"64\") (V16QI \"128\")\n+\t\t\t\t(V2HI \"32\") (V4HI \"64\") (V8HI \"128\")\n+\t\t\t\t\t    (V2SI \"64\") (V4SI \"128\")\n+\t\t\t\t\t    (V1DI \"64\") (V2DI \"128\")])\n+\n ;; Map a floating point or integer mode to the appropriate register name prefix\n (define_mode_attr s [(HF \"h\") (SF \"s\") (DF \"d\") (SI \"s\") (DI \"d\")])\n \n@@ -1693,8 +1707,8 @@\n \t\t\t(DF   \"v2df\")])\n \n ;; Modes with double-width elements.\n-(define_mode_attr VDBLW [(V8QI \"V4HI\") (V16QI \"V8HI\")\n-                  (V4HI \"V2SI\") (V8HI \"V4SI\")\n+(define_mode_attr VDBLW [(V4QI \"V2HI\") (V8QI \"V4HI\") (V16QI \"V8HI\")\n+                  (V2HI \"SI\") (V4HI \"V2SI\") (V8HI \"V4SI\")\n                   (V2SI \"DI\")   (V4SI \"V2DI\")])\n \n (define_mode_attr VQUADW [(V8QI \"V4SI\") (V16QI \"V8SI\")\ndiff --git a/gcc/testsuite/gcc.target/aarch64/popcnt-vec-1.c b/gcc/testsuite/gcc.target/aarch64/popcnt-vec-1.c\nnew file mode 100644\nindex 000000000000..4f400eef977f\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/aarch64/popcnt-vec-1.c\n@@ -0,0 +1,30 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-O2 -fno-vect-cost-model\" } */\n+\n+\n+/* SLP\n+   This function should produce cnt v.8b and uaddlp (Add Long Pairwise).  */\n+void\n+pop_short (unsigned short *__restrict b, unsigned short *__restrict d)\n+{\n+  d[0] = __builtin_popcount (b[0]);\n+  d[1] = __builtin_popcount (b[1]);\n+}\n+\n+/* SLP\n+   This function should produce cnt v.8b.  */\n+void\n+pop_char (unsigned char *__restrict b, unsigned char *__restrict d)\n+{\n+  d[0] = __builtin_popcount (b[0]);\n+  d[1] = __builtin_popcount (b[1]);\n+  d[2] = __builtin_popcount (b[2]);\n+  d[3] = __builtin_popcount (b[3]);\n+}\n+\n+\n+/* { dg-final { scan-assembler-not {\\tbl\\tpopcount} } } */\n+/* { dg-final { scan-assembler-times {cnt\\t} 2 } } */\n+/* { dg-final { scan-assembler-times {uaddlp\\t} 1 } } */\n+/* { dg-final { scan-assembler-times {ldr\\ts} 2 } } */\n+/* { dg-final { scan-assembler-times {str\\ts} 2 } } */\n","prefixes":["v1","05/11"]}