{"id":2215403,"url":"http://patchwork.ozlabs.org/api/patches/2215403/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/20260324140939.2390596-1-vijay@linux.ibm.com/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20260324140939.2390596-1-vijay@linux.ibm.com>","list_archive_url":null,"date":"2026-03-24T14:09:39","name":"[v5] rs6000: Reduces a multi-step comparison sequence to a single vcmpnez instruction [PR116004]","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"9b8e4d1e149d58442a676994ff3063cda4cef380","submitter":{"id":91942,"url":"http://patchwork.ozlabs.org/api/people/91942/?format=json","name":"Vijay shankar telidevulapalli","email":"vijay@linux.ibm.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/20260324140939.2390596-1-vijay@linux.ibm.com/mbox/","series":[{"id":497298,"url":"http://patchwork.ozlabs.org/api/series/497298/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=497298","date":"2026-03-24T14:09:39","name":"[v5] rs6000: Reduces a multi-step comparison sequence to a single vcmpnez instruction [PR116004]","version":5,"mbox":"http://patchwork.ozlabs.org/series/497298/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2215403/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2215403/checks/","tags":{},"related":[],"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit 
key;\n unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256\n header.s=pp1 header.b=Y5Ze9M+W;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=ibm.com header.i=@ibm.com header.a=rsa-sha256\n header.s=pp1 header.b=Y5Ze9M+W","sourceware.org;\n dmarc=none (p=none dis=none) header.from=linux.ibm.com","sourceware.org;\n spf=none smtp.mailfrom=kubota.pok.stglabs.ibm.com","server2.sourceware.org;\n arc=none smtp.remote-ip=148.163.158.5"],"Received":["from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fgBly48pTz1y1g\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 25 Mar 2026 01:11:10 +1100 (AEDT)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 988B24BA23DF\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 24 Mar 2026 14:11:08 +0000 (GMT)","from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n [148.163.158.5])\n by sourceware.org (Postfix) with ESMTPS id B20CA4BA23CD\n for <gcc-patches@gcc.gnu.org>; Tue, 24 Mar 2026 14:09:45 +0000 (GMT)","from pps.filterd (m0353725.ppops.net [127.0.0.1])\n by mx0a-001b2d01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id\n 62O468iw3640057; Tue, 24 Mar 2026 14:09:44 GMT","from ppma11.dal12v.mail.ibm.com\n (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219])\n by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4d1ky037a4-1\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);\n Tue, 24 Mar 2026 14:09:43 +0000 (GMT)","from 
pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1])\n by ppma11.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id\n 62ODCnmS004387;\n Tue, 24 Mar 2026 14:09:43 GMT","from smtprelay04.dal12v.mail.ibm.com ([172.16.1.6])\n by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 4d28c21wx9-1\n (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT);\n Tue, 24 Mar 2026 14:09:43 +0000","from smtpav03.dal12v.mail.ibm.com (smtpav03.dal12v.mail.ibm.com\n [10.241.53.102])\n by smtprelay04.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n 62OE9gq829491898\n (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);\n Tue, 24 Mar 2026 14:09:42 GMT","from smtpav03.dal12v.mail.ibm.com (unknown [127.0.0.1])\n by IMSVA (Postfix) with ESMTP id 2907C58060;\n Tue, 24 Mar 2026 14:09:42 +0000 (GMT)","from smtpav03.dal12v.mail.ibm.com (unknown [127.0.0.1])\n by IMSVA (Postfix) with ESMTP id 1359C5803F;\n Tue, 24 Mar 2026 14:09:42 +0000 (GMT)","from kubota.pok.stglabs.ibm.com (unknown [9.114.39.181])\n by smtpav03.dal12v.mail.ibm.com (Postfix) with ESMTPS;\n Tue, 24 Mar 2026 14:09:42 +0000 (GMT)","by kubota.pok.stglabs.ibm.com (Postfix, from userid 19539)\n id 5DA1A80B9D4B; Tue, 24 Mar 2026 09:09:41 -0500 (EST)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 988B24BA23DF","OpenDKIM Filter v2.11.0 sourceware.org B20CA4BA23CD"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org B20CA4BA23CD","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org B20CA4BA23CD","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1774361385; cv=none;\n b=cTiiovb+xbOsPZ0S4L8tjEiDPrj7elDtUJbU/FM191LfvAEDLydUEPW6VE3Xg0pPFk5Sul4xetL+R+MRAcDcYj4GXkTCeLJhvUlaNuQEWWBDl90ZC69K4zG5h09C2+VesBfqof761w9DpVCMlXBhf2W4X1PkpG8iTd9KUeLzvLI=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1774361385; c=relaxed/simple;\n bh=o55bUDTwxDwyZGQhJmmuQmyB7fZayXFEGK52YMteXUI=;\n h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;\n 
b=FtFn0kLRxIojw3EKL5rdgEOnNyVPfbg2wgNCClm6dcGmztw+2eNrkLPjPy2xvWQu7rOHgU2P6wyvafpZLcRXbI+QMdWLx7DQ4Q5BbEa91safKMCh0RUrRuF5eXXvxpsufssZUHMAD4EE5xevZCLWR5hDsSIgjikxDI0I01BZNGA=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc\n :content-transfer-encoding:date:from:message-id:mime-version\n :subject:to; s=pp1; bh=boxWV0rTPL1/xVI+zJ2xh7LTcK0RiSyfGDfWUlrbK\n 0w=; b=Y5Ze9M+WqA2oNIB+PnyCPpxV57f4GlzdGd5WeyyoopYlqzg+GdPQE+RCD\n NyV5LwRwN6TFQk9vAOsWttEclcuz+YZo63NzAcFaBB49w/tYOrj7YE2k5+QoVfH2\n AjeTd8XdJ/kJBqUjm9z81r0TLqI2FL3pnRn0rEUUzdCXWsTBlD8ff/19jo5EwRWz\n a3ELsUGhmbj1ONnhMAPXm/mAUAKiyDawG4vhLUGn4VAlccRYOv39tp0d37ms4qmF\n VtkRus3JJ/WSYA3K6LQS4SNQS58Kb4E20OLeT8qXsFzJ2VPwrnWjsUOGEDuDl9GJ\n MtkvIEaEBOCqMxZ8U2FxulsmgEG/A==","From":"Vijay Shankar <vijay@linux.ibm.com>","To":"gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,\n jskumari@linux.ibm.com","Cc":"meissner@linux.ibm.com, kishan@linux.ibm.com, avinashd@linux.ibm.com,\n Vijay Shankar <vijay@linux.ibm.com>","Subject":"[PATCH v5] rs6000: Reduces a multi-step comparison sequence to a\n single vcmpnez instruction [PR116004]","Date":"Tue, 24 Mar 2026 09:09:39 -0500","Message-ID":"<20260324140939.2390596-1-vijay@linux.ibm.com>","X-Mailer":"git-send-email 2.47.3","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","X-TM-AS-GCONF":"00","X-Proofpoint-Spam-Details-Enc":"AW1haW4tMjYwMzI0MDEwOSBTYWx0ZWRfX7SDtFTHgRTS+\n /XnxTHZPmJgWCRrLc7nxGldjCrOdMz1fVMc/Fu4dz7LkEmNp9uBHt8a6eO/O6uBxWjBGkNfbD6j\n WMyV7X5ti2Kji/ae4MDDRRsCH+cf4LlTvyTQUmYDgC4aQd60qRAbaPBeIPJVSV7HPGgF5GViNmj\n LADmIhjBEVlIMoEULrC67USH5QlX16LVyp+FODj5ml98EPxV3iZiZmEais/NDclbp8MMLAUshnH\n EbTruaJEqMq6sScWHq2bYERnckj2SNgOxeXM9GsesoyVHalpR1FfS66mYOtq/g0yW1jkF42C3dc\n 1dLloqYuXGUKKBcbr72chEiKpaDR9eeIA3H47927Oppz1S1H+8y7vRfPguSmq3rnIIsv/DLBW+B\n Qfo9m6PXOtxQYHkUS5iyKX2hDXxY6WtG23WOp7rTshNGTYZGdGc6T44/UdYsgVnbu3UAl2hNWId\n taJy/KF75L5GovvzaHA==","X-Authority-Analysis":"v=2.4 cv=JK42csKb 
c=1 sm=1 tr=0 ts=69c29b28 cx=c_pps\n a=aDMHemPKRhS1OARIsFnwRA==:117 a=aDMHemPKRhS1OARIsFnwRA==:17\n a=Yq5XynenixoA:10 a=VkNPw1HP01LnGYTKEx00:22 a=RnoormkPH1_aCDwRdu11:22\n a=V8glGbnc2Ofi9Qvn3v5h:22 a=VnNF1IyMAAAA:8 a=0cw-c5L_uOa28hNVWYAA:9","X-Proofpoint-ORIG-GUID":"MgNIh2qLoGmPSFdqkTM0bw5FUJt0ZuRM","X-Proofpoint-GUID":"MgNIh2qLoGmPSFdqkTM0bw5FUJt0ZuRM","X-Proofpoint-Virus-Version":"vendor=baseguard\n engine=ICAP:2.0.293,Aquarius:18.0.1143,Hydra:6.1.51,FMLib:17.12.100.49\n definitions=2026-03-24_03,2026-03-23_02,2025-10-01_01","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n impostorscore=0 clxscore=1015 priorityscore=1501 malwarescore=0 adultscore=0\n spamscore=0 suspectscore=0 phishscore=0 lowpriorityscore=0 bulkscore=0\n classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0\n reason=mlx scancount=1 engine=8.22.0-2603050001 definitions=main-2603240109","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"Changes from v4:\n\t* Update to use \\m...\\M in\nChanges from v3:\n\t* Added new testcase.\nChanges from v2:\n\t* Some formatting.\nChanges from v1:\n\t* Added more info to commit message; fixed indentation.\n\nThis patch removes redundant vector compare instructions and logic\nfrom the vec_first_mismatch_or_eos_index intrinsic.\nCurrently, GCC generates extra vcmpneb instructions and 
additional\nmasking logic (xxland, xxlorc) to handle EOS and mismatch comparisons.\nHowever, a single vcmpnezb instruction already suffices, as it covers\nboth cases. By eliminating the redundant comparisons (vcmpneb) and the\nassociated logic (xxland/xxlorc), we produce shorter,\nmore efficient code.\n\nBootstrapped and tested on powerpc64le-linux-gnu with no regressions.\n\n2025-10-22  Vijay Shankar  <vijay@linux.ibm.com>\n\ngcc/ChangeLog:\n\tPR target/116004\n\t* config/rs6000/vsx.md (first_mismatch_or_eos_index): Remove redundant\n\temit_insns.\n\ngcc/testsuite/ChangeLog:\n\tPR target/116004\n\t* gcc.target/powerpc/pr116004.c: New test.\n---\n gcc/config/rs6000/vsx.md                    | 22 ++------\n gcc/testsuite/gcc.target/powerpc/pr116004.c | 58 +++++++++++++++++++++\n 2 files changed, 61 insertions(+), 19 deletions(-)\n create mode 100644 gcc/testsuite/gcc.target/powerpc/pr116004.c","diff":"diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md\nindex cfad9b8c6..3c2319a53 100644\n--- a/gcc/config/rs6000/vsx.md\n+++ b/gcc/config/rs6000/vsx.md\n@@ -5668,29 +5668,13 @@\n   \"TARGET_P9_VECTOR\"\n {\n   int sh;\n-  rtx cmpz1_result = gen_reg_rtx (<MODE>mode);\n-  rtx cmpz2_result = gen_reg_rtx (<MODE>mode);\n-  rtx cmpz_result = gen_reg_rtx (<MODE>mode);\n-  rtx not_cmpz_result = gen_reg_rtx (<MODE>mode);\n-  rtx and_result = gen_reg_rtx (<MODE>mode);\n   rtx result = gen_reg_rtx (<MODE>mode);\n-  rtx vzero = gen_reg_rtx (<MODE>mode);\n-\n-  /* Vector with zeros in elements that correspond to zeros in operands.  */\n-  emit_move_insn (vzero, CONST0_RTX (<MODE>mode));\n \n-  emit_insn (gen_vcmpne<VSX_EXTRACT_WIDTH> (cmpz1_result, operands[1], vzero));\n-  emit_insn (gen_vcmpne<VSX_EXTRACT_WIDTH> (cmpz2_result, operands[2], vzero));\n-  emit_insn (gen_and<mode>3 (and_result, cmpz1_result, cmpz2_result));\n+  /* Vector with ones in elements that do not match or elements corresponding\n+     to zeros in operands.  
*/\n \n-  /* Vector with ones in elments that match.  */\n-  emit_insn (gen_vcmpnez<VSX_EXTRACT_WIDTH> (cmpz_result, operands[1],\n+  emit_insn (gen_vcmpnez<VSX_EXTRACT_WIDTH> (result, operands[1],\n                                              operands[2]));\n-  emit_insn (gen_one_cmpl<mode>2 (not_cmpz_result, cmpz_result));\n-\n-  /* Create vector with ones in elements where there was a zero in one of\n-     the source elements or the elements did not match.  */\n-  emit_insn (gen_nand<mode>3 (result, and_result, not_cmpz_result));\n   sh = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode)) / 2;\n \n   if (<MODE>mode == V16QImode)\ndiff --git a/gcc/testsuite/gcc.target/powerpc/pr116004.c b/gcc/testsuite/gcc.target/powerpc/pr116004.c\nnew file mode 100644\nindex 000000000..3f240c1c4\n--- /dev/null\n+++ b/gcc/testsuite/gcc.target/powerpc/pr116004.c\n@@ -0,0 +1,58 @@\n+/* { dg-do compile } */\n+/* { dg-options \"-mdejagnu-cpu=power9 -O2\" } */\n+/* { dg-final { scan-assembler-times {\\mvcmpnezb\\M} 2 } } */\n+/* { dg-final { scan-assembler-times {\\mvcmpnezh\\M} 2 } } */\n+/* { dg-final { scan-assembler-times {\\mvcmpnezw\\M} 2 } } */\n+/* { dg-final { scan-assembler-not {\\mvcmpneb\\M} } } */\n+/* { dg-final { scan-assembler-not {\\mvcmpneh\\M} } } */\n+/* { dg-final { scan-assembler-not {\\mvcmpnew\\M} } } */\n+\n+#include <altivec.h>\n+#include <stdint.h>\n+\n+int main(void) {\n+  vector signed char char_src1, char_src2;\n+  vector unsigned char uchar_src1, uchar_src2;\n+  vector signed short short_src1, short_src2;\n+  vector unsigned short ushort_src1, ushort_src2;\n+  vector signed int int_src1, int_src2;\n+  vector unsigned int uint_src1, uint_src2;\n+\n+  volatile unsigned int r1, r2, r3, r4, r5, r6;\n+\n+  /* signed char */\n+  char_src1 = (vector signed char) {-1, 2, 3, 0, -5, 6, 7, 8,\n+                                    9, 10, 11, 12, 13, 14, 15, 16};\n+  char_src2 = (vector signed char) {2, 3, 20, 0, -5, 6, 7, 8,\n+                                    9, 10, 
11, 12, 13, 14, 15, 16};\n+  r1 = vec_first_mismatch_or_eos_index(char_src1, char_src2);\n+\n+  /* unsigned char */\n+  uchar_src1 = (vector unsigned char) {1, 2, 3, 4, 5, 6, 7, 8,\n+                                       9, 10, 11, 12, 13, 14, 15, 16};\n+  uchar_src2 = (vector unsigned char) {1, 0, 3, 4, 5, 6, 7, 8,\n+                                       9, 10, 11, 12, 13, 14, 15, 16};\n+  r2 = vec_first_mismatch_or_eos_index(uchar_src1, uchar_src2);\n+\n+  /* signed short */\n+  short_src1 = (vector signed short) {-10, -20, 30, 40, 50, 60, 70, 80};\n+  short_src2 = (vector signed short) {-10, 20, 30, 40, 50, 60, 70, 80};\n+  r3 = vec_first_mismatch_or_eos_index(short_src1, short_src2);\n+\n+  /* unsigned short */\n+  ushort_src1 = (vector unsigned short) {10, 20, 30, 40, 50, 60, 70, 0};\n+  ushort_src2 = (vector unsigned short) {10, 20, 30, 40, 50, 60, 70, 80};\n+  r4 = vec_first_mismatch_or_eos_index(ushort_src1, ushort_src2);\n+\n+  /* signed int */\n+  int_src1 = (vector signed int) {1, 2, 3, 4};\n+  int_src2 = (vector signed int) {1, 20, 3, 4};\n+  r5 = vec_first_mismatch_or_eos_index(int_src1, int_src2);\n+\n+  /* unsigned int */\n+  uint_src1 = (vector unsigned int) {1, 2, 3, 0};\n+  uint_src2 = (vector unsigned int) {1, 2, 3, 0};\n+  r6 = vec_first_mismatch_or_eos_index(uint_src1, uint_src2);\n+\n+  return 0;\n+}\n","prefixes":["v5"]}