{"id":2175750,"url":"http://patchwork.ozlabs.org/api/1.0/covers/2175750/?format=json","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/1.0/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null},"msgid":"<20251218211109.3562-1-chris.bazley@arm.com>","date":"2025-12-18T21:10:58","name":"[v7,00/10] Extend BB SLP vectorization to use predicated tails","submitter":{"id":89471,"url":"http://patchwork.ozlabs.org/api/1.0/people/89471/?format=json","name":"Christopher Bazley","email":"Chris.Bazley@arm.com"},"series":[{"id":485915,"url":"http://patchwork.ozlabs.org/api/1.0/series/485915/?format=json","date":"2025-12-18T21:10:58","name":"Extend BB SLP vectorization to use predicated tails","version":7,"mbox":"http://patchwork.ozlabs.org/series/485915/mbox/"}],"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256\n header.s=selector1 header.b=qmPjl3Qr;\n\tdkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com\n header.a=rsa-sha256 header.s=selector1 header.b=qmPjl3Qr;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256\n header.s=selector1 header.b=qmPjl3Qr;\n\tdkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com\n header.a=rsa-sha256 
header.s=selector1 header.b=qmPjl3Qr","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=arm.com","sourceware.org; spf=pass smtp.mailfrom=arm.com","server2.sourceware.org;\n arc=pass smtp.remote-ip=52.101.66.7"],"Received":["from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4dXNgR2tsgz1y2r\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 19 Dec 2025 08:13:22 +1100 (AEDT)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 127034BA2E06\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 18 Dec 2025 21:13:20 +0000 (GMT)","from DUZPR83CU001.outbound.protection.outlook.com\n (mail-northeuropeazon11012007.outbound.protection.outlook.com [52.101.66.7])\n by sourceware.org (Postfix) with ESMTPS id E0EC84BA2E06\n for <gcc-patches@gcc.gnu.org>; Thu, 18 Dec 2025 21:12:19 +0000 (GMT)","from AM0P190CA0011.EURP190.PROD.OUTLOOK.COM (2603:10a6:208:190::21)\n by PAWPR08MB11331.eurprd08.prod.outlook.com (2603:10a6:102:50b::9)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9434.8; Thu, 18 Dec\n 2025 21:12:15 +0000","from AMS0EPF0000019D.eurprd05.prod.outlook.com\n (2603:10a6:208:190:cafe::a1) by AM0P190CA0011.outlook.office365.com\n (2603:10a6:208:190::21) with Microsoft SMTP Server (version=TLS1_3,\n cipher=TLS_AES_256_GCM_SHA384) id 15.20.9434.7 via Frontend Transport; Thu,\n 18 Dec 2025 21:12:03 +0000","from outbound-uk1.az.dlp.m.darktrace.com (4.158.2.129) by\n AMS0EPF0000019D.mail.protection.outlook.com (10.167.16.249) with Microsoft\n SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9434.6\n via Frontend Transport; Thu, 18 Dec 2025 21:12:14 +0000","from DB9PR06CA0006.eurprd06.prod.outlook.com (2603:10a6:10:1db::11)\n 
by VI0PR08MB11749.eurprd08.prod.outlook.com (2603:10a6:800:313::8)\n with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9434.8; Thu, 18 Dec\n 2025 21:11:09 +0000","from DB1PEPF00039231.eurprd03.prod.outlook.com\n (2603:10a6:10:1db:cafe::d2) by DB9PR06CA0006.outlook.office365.com\n (2603:10a6:10:1db::11) with Microsoft SMTP Server (version=TLS1_3,\n cipher=TLS_AES_256_GCM_SHA384) id 15.20.9434.7 via Frontend Transport; Thu,\n 18 Dec 2025 21:10:52 +0000","from nebula.arm.com (172.205.89.229) by\n DB1PEPF00039231.mail.protection.outlook.com (10.167.8.104) with Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.9412.4 via Frontend Transport; Thu, 18 Dec 2025 21:11:09 +0000","from AZ-NEU-EXJ02.Arm.com (10.240.25.139) by AZ-NEU-EX03.Arm.com\n (10.240.25.137) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.29; Thu, 18 Dec\n 2025 21:11:09 +0000","from AZ-NEU-EX04.Arm.com (10.240.25.138) by AZ-NEU-EXJ02.Arm.com\n (10.240.25.139) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.29; Thu, 18 Dec\n 2025 21:11:09 +0000","from ip-10-248-139-165.eu-west-1.compute.internal (10.248.139.165)\n by mail.arm.com (10.240.25.138) with Microsoft SMTP Server id 15.2.2562.29\n via Frontend Transport; Thu, 18 Dec 2025 21:11:09 +0000"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 127034BA2E06","OpenDKIM Filter v2.11.0 sourceware.org E0EC84BA2E06"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org E0EC84BA2E06","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org E0EC84BA2E06","ARC-Seal":["i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1766092340; cv=pass;\n b=qu3gBY4kfai4htiQo2PIOJ7sn5w9Spd2uz76AMuXR/11jsZFu7fosh9mYmNOiOUHzzCM7pRYnp4ShFeUsfUjsHNBkWFCyPgbKIQIjEOx+OmICG9Z0hVu0V/pZ84lhtSDFylLv2fiD54S+V+ldFKLSws9jp+J5bo16CezGatmSKM=","i=2; a=rsa-sha256; s=arcselector10001; 
d=microsoft.com; cv=pass;\n b=tiJq1diIQ7ANDimsl/4aHSSpN7QYG7nie8BO7i2BBZ6QbGx+XDbxFookb8fDcO4UAclUOQ89nqCcGsHVSoMfGNlPm7zoBetxF7rSnj9FW4bdM5jEbrm3/P5i2ECMM14sQgKgq/PzwLXmuQuxOakYr1X6BwJmKeOtbNFekUA6tOAygPW8ctNu2RWAsDyRZ1p3PDLKcTS4K7hNNQWAh/g+kRvzl6wI1sNSlet5ArnZiz6cuayWXTBVrsiXy0JSiSELKaUmvCKjjH3juhjhkqbcYlaHBhdX/JH1DjCRH7uuWPnkzyAcgj9crI2mP5sjOwquKwdQK96Tsg17+L+vyl2vxA==","i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none;\n b=W+5pikh60QJUjJQcKo5NcxuvAckSES+RJOlC6UYdmVyCZIRnFONJLsWIBUQNk7+sk86DDhA1+lF5gAMYXZ29iLVvmqo1ut5yqyPxLZ1n54CamGyEs6UggumHzDTJP3wc/9+bgQOHvO6dMDd8bMh29epyH9RXhQoB5RBsRbG2ZMPqg64p1UKYVeuu+xhOfn4MxdWvNsM/Nlkzjh6SqeBOw8Iv7RTzMeNUZ/eD8HOBShCcFwMwfLXWNyoF3e84nDimkYPw5X2amRhUyBmOoqzO+MG/JeSJYMd+MsXKRZWP8wEScZSWS3m9zF4rI4y/9DsVd/36ee0EleJyxIOasdamNA=="],"ARC-Message-Signature":["i=3; a=rsa-sha256; d=sourceware.org; s=key;\n t=1766092340; c=relaxed/simple;\n bh=O21t7DVyZjfcl++rqXmzy8ME2eyu+e2gqlqq0mi5gfw=;\n h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID:\n MIME-Version;\n b=maNcy4/mF1yb4PrddsUC5S4eicLVy6jgYxJKOD/cPtq55If0ZhJ7k9ZzTCGK5Eu32sNaSbNsCvXPpMASRZGRrAUXbqQVoE9YSmnZsJn1ql8mRyEasOoBgKIpI6Gzb9QujQJkDniPvP8y6kJP6LZW7FzWb7H5dRNOhVX78zNB/fM=","i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector10001;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=ddNqFz5RssoRW3QuRCXzq9ipa5gxc+ku2B6aJOdbl+Y=;\n b=lN00KXePQ2o8vwKSGH6BdSBx2XI+d4h4t/xB5okDgi9wCSmRfQjkEWv99TCqGT3TR6ULPOuCjUUoQUCK/pGQ5345Wr3bhjnQ3w5lHZCpA7JqUeAiGdDJ28SYD4iGp+h9JLsKoBQTEe1Ty70pDje30PyMgDY/YvP8u3ITi6x4/ueh1kUTPeBJUaH+7pfTEjHm6fhLb80AcMdQt8kg2Aaw81IOX7LexZ/5awwdLlk+LHPOJ9Sf7DkO3Eapp/LpA7xX8GioBMNL/20zoI8qrcIl2+unnfqnLoR5EhbCbH0GNCtgO0e3VTMzLrtVa213cB/UVvknGlAb1N3P1gG9aLpXHw==","i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector10001;\n 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=ddNqFz5RssoRW3QuRCXzq9ipa5gxc+ku2B6aJOdbl+Y=;\n b=cwP6eJKSNjBn8XqNqYjt4UDt7o27r0CLGyU3yCiosRGRbVyN2r/lleCoChmHHZGulFCh4QLNv48QuYMWMgBcrJtA4fi/aHuwporDi2NXCAEKlYaDvXQlrFU6TPyaKuqajf8zQAkwFo2BlVJ+jN0gU43enEHy+agihR+N3MEYjwb8XF9xQzuE+8hsWYr7voTWVyRiTw1TyG2wEO+cNZ3HhaBAtKtLw7ldwNOEgOz0RKMoYUpJ8ByU/r/PL6QHrklJe94N8z7bTnYU/KQ41LHHEnZqPKzUzQIRgkc0Mj6jQV8QXAjv81+w/k0/EM/PKFd7Jl1jaBWeK9JpKYp8pbIR/g=="],"ARC-Authentication-Results":["i=3; server2.sourceware.org","i=2; mx.microsoft.com 1; spf=pass (sender ip is\n 4.158.2.129) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass\n (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass\n (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1\n spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com])","i=1; mx.microsoft.com 1; spf=pass (sender ip is\n 172.205.89.229) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com;\n dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com;\n dkim=none (message not signed); arc=none (0)"],"DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=ddNqFz5RssoRW3QuRCXzq9ipa5gxc+ku2B6aJOdbl+Y=;\n b=qmPjl3Qry1HJCGNn+XxwhwAXaEUdzeQAF3KMfdX9GceJ/ZBLQJBySqIIihmtfTJbo+KLkxbkaNqNBVrXDUrq6S8nGZcGNd0JsPM1zimtR/ZwOWl7G2STNCLQ5eTH0adt8xfwKdrcjSNfSi67i9MiHK7NvFHHDOJmpa1XctnId9g=","v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=ddNqFz5RssoRW3QuRCXzq9ipa5gxc+ku2B6aJOdbl+Y=;\n 
b=qmPjl3Qry1HJCGNn+XxwhwAXaEUdzeQAF3KMfdX9GceJ/ZBLQJBySqIIihmtfTJbo+KLkxbkaNqNBVrXDUrq6S8nGZcGNd0JsPM1zimtR/ZwOWl7G2STNCLQ5eTH0adt8xfwKdrcjSNfSi67i9MiHK7NvFHHDOJmpa1XctnId9g="],"X-MS-Exchange-Authentication-Results":["spf=pass (sender IP is 4.158.2.129)\n smtp.mailfrom=arm.com; dkim=pass (signature was verified)\n header.d=arm.com;dmarc=pass action=none header.from=arm.com;","spf=pass (sender IP is 172.205.89.229)\n smtp.mailfrom=arm.com; dkim=none (message not signed)\n header.d=none;dmarc=pass action=none header.from=arm.com;"],"Received-SPF":["Pass (protection.outlook.com: domain of arm.com designates\n 4.158.2.129 as permitted sender) receiver=protection.outlook.com;\n client-ip=4.158.2.129; helo=outbound-uk1.az.dlp.m.darktrace.com; pr=C","Pass (protection.outlook.com: domain of arm.com designates\n 172.205.89.229 as permitted sender) receiver=protection.outlook.com;\n client-ip=172.205.89.229; helo=nebula.arm.com; pr=C"],"From":"Christopher Bazley <chris.bazley@arm.com>","To":"<gcc-patches@gcc.gnu.org>","Subject":"[PATCH v7 00/10] Extend BB SLP vectorization to use predicated tails","Date":"Thu, 18 Dec 2025 21:10:58 +0000","Message-ID":"<20251218211109.3562-1-chris.bazley@arm.com>","X-Mailer":"git-send-email 2.43.0","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","Content-Type":"text/plain","X-EOPAttributedMessage":"1","X-MS-TrafficTypeDiagnostic":"\n DB1PEPF00039231:EE_|VI0PR08MB11749:EE_|AMS0EPF0000019D:EE_|PAWPR08MB11331:EE_","X-MS-Office365-Filtering-Correlation-Id":"10675417-f14f-4d5d-4431-08de3e7a1ddc","x-checkrecipientrouted":"true","NoDisclaimer":"true","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam-Untrusted":"BCL:0;\n ARA:13230040|36860700013|1800799024|376014|82310400026|13003099007;","X-Microsoft-Antispam-Message-Info-Original":"\n 
cXbtcU4mWAbqs8WDOSGKT6xzO2HbHC/cVYlHIwyAXQafp9OSwmnkyJM5tM4dT9NuSB3CYKtIBwK4ml5pE7UDPHckvigiDGB8BJI/kOniofT2eGaKmQo0wjQOBb2hCKO+2LyE9p4QUmUFKJdixLQ3Eu1/UcmbmV4NPBVOBxPmDPw1ebIIS00oS4DlhiJshO814GawdaxLfg/cSAOajcLBEiPEchWT/hd3gvGos14NBsFEDR6kdNL5zneqwtkhjK3c4snvbh6yuUJGCBrxMHHZCCtfZ+nPkQt8uP3vBzSwIFsKi4MzQVyOMbFnH+6MBHFR7+DriUo3F3hpPv9w8vUzSYBvPsaf5VE8D8P5DkIUd4yWhgQr+VMpwBZ2+VjqTuWgpmTMYlDm28Ld7MveATmlb22DBX1yxQtDSy4yhwqAxkqzFPKFeHbGSNndNqCnkDd0InvWbt3tEIxMbBCi+z/lW766ziKaqEaMxof7qvkuzhEFQ0l0EhX3FUss7fS7WDBfTcD4FHnuQGS3UZzm2NYdzPTIfggaAIuNYVHu7k258yQ2m2HzueKoBEFs1IWBWoYSZ287hzKzex791VM0ggs/iV7yLRbaJuyk0OKzkoHrmVXTAYWZh6pAAEx6utCzIx27cop5S7GDmzfE2mE300Kd41CO90PWTPAcMTMU+IAxPH2V8p//SPX5X5SkibVisl4hA2Zaodh4QQjO4Mdt51dQjIaYX+i78KbacIiuBPdgy451GCzEhmUwHE1MPdcUEB+OvHUBHvWF85pAwSdgfYA9ciKkFCMUBT7y8SN8D8Ll4SyznLHbLfcqREqvSEUHwA18xBMhe78yIHMWfcWGiYHt4hWEUDxeSK/ySae6VnEDjYB4WWqqUKh7sgtbsOETz/wAjbPf8DktX+q/ehwyECfrWnU6uM/efcSytRrUe1oxlYtG0gGCf7yAGS4ASmLo8oF+pgEQp9OQcUSq5m5TTOs/OKWGNgMtpsfNpZElmer4vxIda89oTqQJuob0KEk2lSLzCkq6k168+jqae20PO/sszUltY8W5e3VQSD1zZnaVneRm/M7UeTich1BhVd3Ev94DonPzaNGNXhBNXz63j1k7KzXSGOZDfYrZK5KLToOVG19haBUA3Q/URpoAwHxWjb0jGCLhpfDQFF6BcVh2M4Ah19PlXe/0ArjGIt6yC3iteGMuYdN7tk44OSb1ehJyKumWJUhnPqGMIUKySUCkoEhvhKh5yICeB2g9aIGJxRFsHO1iv6SA+O8vGNMnQooH5ik4snOi+mIZItUqnXasC8ljXeunNu4VHAM08JWPwoCmAA0WRke92Z+v6u2z40BPazw1BFELzNASVE+XsMFMlITSIpqVEXhkgMQ4ziwMehI126+oHeu15Xi0lPmXVc0SpkJ68jKMDkarfXZ03apIt/U/Bd/c04ZZ83RhYe8woWMY+7i7QxgwbqVO+HpSNItcjyavGOwb839wQwsuH0xh/V5zrOlt6/1evsOilMPlxZ9EDaAPoYApH9f7vnSxIVqXtOqR","X-Forefront-Antispam-Report-Untrusted":"CIP:172.205.89.229; CTRY:IE; LANG:en;\n SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent;\n CAT:NONE;\n SFS:(13230040)(36860700013)(1800799024)(376014)(82310400026)(13003099007);\n DIR:OUT; SFP:1101;","X-MS-Exchange-Transport-CrossTenantHeadersStamped":["VI0PR08MB11749","PAWPR08MB11331"],"X-MS-Exchange-Transport-CrossTenantHeadersStripped":"\n 
AMS0EPF0000019D.eurprd05.prod.outlook.com","X-MS-PublicTrafficType":"Email","X-MS-Office365-Filtering-Correlation-Id-Prvs":"\n 8c9b0bd1-133d-490b-ee65-08de3e79f761","X-Microsoft-Antispam":"BCL:0;\n ARA:13230040|35042699022|1800799024|36860700013|82310400026|14060799003|376014|13003099007;","X-Microsoft-Antispam-Message-Info":"\n 6K/v5qaP6G4S+23Z0INc26cUKxHT9eEqJd8XLu4+wbWz0EiHJWulqlCSyCvsq+9QldFv5CleARffyACfLu8FEDZuQoVKMV7k0PZvsBaG+MX+Yj4Nmq+vTKRMg4qV0YeGoKLTjm1eHdVKUK7oqkSQ/sxA3tPpl6AkWlB+7oQb1Bn/mAmD1XzfRnu0xo2DP6fwUKS0wL5chAIBJ2hCXUuoOh8Fm/jfvKCN6Tkt127Vpo9JWSR+zPbpJOZUMgqQSNyqs1jn57/rcE8XqGEWWqNY67aB7t4BDx9O45S1HMXozvNPwpcXmaZKdrRz6iuESDU2GSiZg/nP6Y0163nY4PVQdKGiRaXJo2Q5/l1aCKApXjIPfwj74/RwKis6VJFWAE51S4AlSkR3S3DgRJsaT351dXsCV41V0llBP6BAPLTxCKAc7RZz+kdQs9IKWxy0emLBssggfv5Ke2NhZ7y4k8qhZfeowUwiwlzJID9BbW2V4f6aH6T+6QNizGE8s0muSkzE6bVdhWzPmaAIQ0Gl++xZ+vWNOCmHzAb+bO9gHRscDQiODC3yBbUgnx42uKsdBTb0RD3mSGf4yLG2ltPj+YbXnkjrFQMdtXODRh+sHAONRHzYyAdbA4QDkk5622fsTy0jqDH6jLSBqa4jD6NeAXiOasdAj58XNl4ofwvMEvmY1O6d3Xd0EoeIAMH8AVV1QLGROeCnlJsQlFAx80TywvJ2yPV6xswxb14IGgpNwN983gAdS3H4tZ3SKHyh3jAPTu2yEZ0guvsjqQK7d8fNfEotOfd7GO1By3O2sI2fyPNPFHXwT2ivrJh2n/7DC1JhLa5g3MVVufQ5PWWs0lXpgs1T4T3ki67ySnHVB/8bLUp4Pu21ppMeGg7p0YZuipML0nfxdxmbul5LDp+YqCWz2seZhD6va7zLFLba/ItaEl2QJoFqBNWbc1rLqcQ7lNQNt9vCbz2Qwt4248n/XbXh3BJfD8PC+ptAq5qY6YHW++us2n0nzluEIYaLo7HjFuKspe28e75fim0zQb1hw6dqmjuDALL3+dHyE3YfkPjjqrCN8d9bbJOy3loqUcz9q9Plu6HgScJIJmMZupR5+HTRO+GOCZ7VBWn8KFGFm3Nb+h5G3p+HXelyc7N4l0m0sejB1BTQlhaxDnGzCf86e14E7aT6VCYnmuhbKVbWEr4gP7zXqmiLxXXESxnPr9biHsLRBC7MpQjt3VPkc5P5G6+IijgDZH1ySMthYvzRiWeLtSaHhs1AFUqeW38JS4lMlR4MQQ0EOR2fUauKWOGHLIEsMGTAd/sS9hCu5ghBgm9qnSN0kwdnT+CdvZaudjsjPwcUjZdcO7CvYReFDwnb9DlFoupZYolH4V+XVlShXk/IH3qcx6ddVkjPCKo3WW548FsunLM4Vg4bCqh21YDWQsgq2lOzSURXyCSMwl59mz7ZAIt1gisiyndqf9rVZ+LmQpCYaDfdvOUqbarQgQ7bUvtbaEZLWiwwZgz8+2npqZrZvFwy03u6EZKmOig4/diMqwOS/LuI","X-Forefront-Antispam-Report":"CIP:4.158.2.129; CTRY:GB; LANG:en; SCL:1; SRV:;\n IPV:NLI; SFV:NSPM; 
H:outbound-uk1.az.dlp.m.darktrace.com;\n PTR:InfoDomainNonexistent; CAT:NONE;\n SFS:(13230040)(35042699022)(1800799024)(36860700013)(82310400026)(14060799003)(376014)(13003099007);\n DIR:OUT; SFP:1101;","X-OriginatorOrg":"arm.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"18 Dec 2025 21:12:14.0712 (UTC)","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 10675417-f14f-4d5d-4431-08de3e7a1ddc","X-MS-Exchange-CrossTenant-Id":"f34e5979-57d9-4aaa-ad4d-b122a662184d","X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp":"\n TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[4.158.2.129];\n Helo=[outbound-uk1.az.dlp.m.darktrace.com]","X-MS-Exchange-CrossTenant-AuthSource":"\n AMS0EPF0000019D.eurprd05.prod.outlook.com","X-MS-Exchange-CrossTenant-AuthAs":"Anonymous","X-MS-Exchange-CrossTenant-FromEntityHeader":"HybridOnPrem","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"},"content":"GCC already supports fully-predicated vectorisation for loops, both\nusing \"traditional\" loop vectorisation and loop-aware SLP\n(superword-level parallelism). 
For example, GCC can vectorise:\n\nvoid\nfoo (char *x)\n{\n  for (int i = 0; i < 6; i += 2)\n    {\n      x[i] += 1;\n      x[i + 1] += 2;\n    }\n}\n\nfrom which it generates the following assembly code (with -O2\n-ftree-vectorize -march=armv9-a+sve -msve-vector-bits=scalable):\n\nfoo:\n        ptrue   p7.b, vl6\n        mov     w1, 513\n        ld1b    z31.b, p7/z, [x0]\n        mov     z30.h, w1\n        add     z30.b, z31.b, z30.b\n        st1b    z30.b, p7, [x0]\n        ret\n\nHowever, GCC cannot yet vectorise the unrolled form of the same\nfunction, even though it is semantically equivalent:\n\nvoid\nfoo (char *x)\n{\n  x[0] += 1;\n  x[1] += 2;\n  x[2] += 1;\n  x[3] += 2;\n  x[4] += 1;\n  x[5] += 2;\n}\n\nThese patches implement support for vectorising the unrolled form of\nthe above function by enabling use of a predicate mask or length\nlimit for basic block SLP. For example, it can now be vectorised to\nthe following assembly code (using the same options as above):\n\nfoo:\n\tptrue\tp7.b, vl6\n\tptrue\tp6.b, all\n\tld1b\tz31.b, p7/z, [x0]\n\tadrp\tx1, .LC0\n\tadd\tx1, x1, :lo12:.LC0\n\tld1rqb\tz30.b, p6/z, [x1]\n\tadd\tz30.b, z31.b, z30.b\n\tst1b\tz30.b, p7, [x0]\n\tret\n\nPredication is only used for groups whose size is not neatly divisible\ninto vectors of lengths that can be supported directly by the target.\n\nBootstrapped and tested on aarch64-linux-gnu.\nBased on commit 630c1bfbb5bc3ba9fafca5e97096263ab8f0a04b.\n\nOK for trunk?\n\nChanges in v2:\n - Updated regexes used by the vect-over-widen-*.c tests so that they\n   do not accidentally match text that is now part of the dump but\n   was not previously.\n - Updated regexes used by the aarch64/popcnt-sve.c test so that it\n   expects 'all' as the operand of 'ptrue', instead of some specific\n   number of active elements.\n - Removed a dump_printf from vect_get_num_copies because the\n   gcc.dg/vect/vect-shift-5.c test relies on a fixed dump layout\n   spanning multiple lines.\n - Fixed a bug in 
vect_get_vector_types_for_stmt, which had\n   accidentally been modified to unconditionally set group_size to\n   zero (even for basic block vectorization).\n - Relaxed an overzealous new check in\n   vect_maybe_update_slp_op_vectype, which now considers the\n   vectorization factor when checking the number of lanes of external\n   definitions during loop vectorization. e.g., lanes=11 is not\n   divisible by subparts=8 with vf=1 but vf*lanes=88 is divisible by\n   subparts=8 with vf=11.\n - Removed the stmts vector ownership changes relating to mishandling\n   of failure of the vect_analyze_slp_instance function (to be fixed\n   separately).\n - A check in get_vectype_for_scalar_type for whether the natural\n   choice of vector type satisfies the group size was too simplistic.\n   Instead of choosing a narrower vector type if the natural vector\n   type could be long enough but not definitely (variable length, by\n   proxy), get_len_load_store_mode is now used to explicitly query\n   whether the target supports mask- or length-limited loads and\n   stores. With the previous version, GCC preferred the natural vector\n   type if it was known to be long enough; sometimes that resulted in\n   better output than a narrower type, but it also caused some\n   bb-slp-* tests to fail.\n - Shuffled dump_printfs around into separate patches.\n - An assertion is now used to protect my change to use lower bound\n   of number of subparts in gimple_build_vector.\n\nChanges in v3:\n - Wrote changelog entries.\n - Created the gimple_build_vector_with_zero_padding function and\n   used it in place of the gimple_build_vector function.\n - Reverted my change to use constant_lower_bound of subparts in the\n   gimple_build_vector function.\n - Fixed a check for 'partial vectors required' in vect_analyze_stmt\n   to include cases in which the minimum bound of the length of a\n   variable-length vector type equals the number of active lanes in an\n   SLP tree node. 
(Maybe less than, instead of known to be less than.)\n - Reverted my change to regexes used by the aarch64/popcnt-sve.c test\n   because it is not needed after the above fix.\n - Replaced SLP_TREE_CAN_USE_MASK_P and SLP_TREE_CAN_USE_LEN_P with\n   SLP_TREE_PARTIAL_VECTORS_STYLE.\n - Titivated the documentation of vect_record_nunits.\n - Renamed local variable max_nunits to nunits in the\n   vect_analyze_slp_reduc_chain function.\n - Added an assertion in vect_create_constant_vectors to verify a\n   claimed relationship between group size and number of subparts.\n - Fixed a mistake in the description of vect_slp_get_bb_len.\n - Clarified the descriptions of vect_record_mask and vect_record_len\n   (from 'would be required' to 'could be used ... if required').\n - Renamed the vect_can_use_mask_p function as vect_fully_masked_p\n   and vect_can_use_len_p as vect_fully_with_length_p.\n - Tightened assertions in vect_record_mask and vect_record_len:\n   instead of just 'the other function must not have been called', we\n   now assert that no partial vectors style has already been set.\n - Updated the documentation of check_load_store_for_partial_vectors\n   to describe its role in BB SLP vectorization.\n - Renamed masked_loop_p and len_loop_p as masked_p and len_p in\n   contexts where they may now be used for BB SLP vectorization.\n - Guard against a potential null pointer dereference in\n   LOOP_VINFO_PARTIAL_LOAD_STORE_BIAS when used by the\n   vectorizable_operation and vectorizable_call functions for BB SLP\n   vectorization (instead of assuming !vect_fully_with_length_p).\n - Guard against calling dump_printf_loc with null instead of\n   a vector type in vectorizable_comparison_1.\n - Clear any partial vectors style that might have been set by callees\n   of vect_analyze_stmt if it finds that partial vectors aren't needed.\n - Revert a change to make the vect_is_simple_use function more robust\n   when the requested SLP tree child node is not an internal def and\n 
  has no scalar operand to return.\n - Revert a change to reserve slots for the definitions to be pushed\n   for narrow FLOAT_EXPR vectorization before using quick_push in the\n   vectorizable_conversion function.\n\nChanges in v4:\n - Resolved code generation regressions in\n   gcc.target/aarch64/sve/slp_6.c and\n   gcc.target/aarch64/sve/slp_7_costly.c by adding code to\n   handle variable-length vector types in store_constructor.\n   (Still had to update expectations for slp_6.c though.)\n - Removed a default constructor from the definition of\n   struct slp_tree_nunits in order to fix a compilation error in\n   the vect_update_nunits function.\n - Renamed the gimple_build_vector_with_zero_padding function as\n   gimple_build_vector_from_elems.\n - Changed gimple_build_vector_from_elems to take a vector type\n   and list of element values instead of a tree_vector_builder.\n - Rewrote the description of gimple_build_vector_from_elems.\n - Removed redundant masking from gimple_build_vector_from_elems. It's\n   now almost equivalent to gimple_build_vector in use cases where\n   some element value is not constant.\n - Assert that the constant lower bound of the number of subparts\n   in a vector type is a power of 2 instead of merely a multiple of 2\n   in vect_build_slp_tree_1.\n\nChanges in v5:\n - Removed the patch to track the minimum and maximum number of lanes\n   for BB SLP.\n - Added a new function, vect_slp_tree_min_nunits, which is invoked by\n   vect_analyze_slp_instance to compute the minimum number of subparts\n   for all of the vector types used in an SLP tree for which an SLP\n   instance is about to be created.\n\nChanges in v6:\n - Moved the BB SLP implementations of vect_record_mask and\n   vect_record_len into tree-vect-slp.cc.\n - Relaxed (failing) assertions: the style of an SLP node is not always\n   vect_partial_vectors_none on entry to vect_record_mask and\n   vect_record_len. 
Allow the same style to be set multiple times.\n - Updated an existing pattern named 'Simplify vector extracts' to\n   guard against invalid invocations of tree_to_uhwi when a\n   BIT_FIELD_REF has an unsuitable polynomial offset or size.\n - Rebased on 43afcb3a83c3648141285d80cd3d8a562047fb43.\n\nChanges in v7:\n - Fixed the vectorizable_live_operation function so that it no\n   longer generates invalid offsets such as BIT_FIELD_REF\n   <_251, 32, POLY_INT_CST [96, 128]> for BB SLP with a\n   variable-length vector type.\n - Revert changes to the existing pattern named 'Simplify vector\n   extracts' because those changes to make the pattern more robust\n   are now redundant.\n - Rebased on 630c1bfbb5bc3ba9fafca5e97096263ab8f0a04b.\n\nLink to v1:\nhttps://inbox.sourceware.org/gcc-patches/20251028102242.3951273-1-chris.bazley@arm.com/\n\nLink to v2:\nhttps://inbox.sourceware.org/gcc-patches/20251110131328.15262-1-chris.bazley@arm.com/\n\nLink to v3:\nhttps://inbox.sourceware.org/gcc-patches/20251124185104.751328-1-chris.bazley@arm.com/\n\nLink to v4:\nhttps://inbox.sourceware.org/gcc-patches/20251128224555.5095-1-chris.bazley@arm.com/\n\nLink to v5:\nhttps://inbox.sourceware.org/gcc-patches/20251202181237.2463894-1-chris.bazley@arm.com/\n\nLink to v6:\nhttps://inbox.sourceware.org/gcc-patches/20251206165518.5449-1-chris.bazley@arm.com/\n\nChristopher Bazley (10):\n  Preparation to support predicated vector tails for BB SLP\n  Implement recording/getting of mask/length for BB SLP\n  Update constant creation for BB SLP with predicated tails\n  Refactor check_load_store_for_partial_vectors\n  New parameter for vect_maybe_update_slp_op_vectype\n  Handle variable-length vector types in store_constructor\n  AArch64/SVE: Relax the expectations of the popcnt-sve test\n  Extend BB SLP vectorization to use predicated tails\n  AArch64/SVE: Tests for use of predicated vector tails for BB SLP\n  Add extra conditional dump output to the vectorizer\n\n gcc/expr.cc                  
                 |   7 +-\n gcc/gimple-fold.cc                            |  54 ++\n gcc/gimple-fold.h                             |  14 +\n .../gcc.dg/vect/vect-over-widen-10.c          |   2 +-\n .../gcc.dg/vect/vect-over-widen-13.c          |   2 +-\n .../gcc.dg/vect/vect-over-widen-14.c          |   2 +-\n .../gcc.dg/vect/vect-over-widen-17.c          |   2 +-\n .../gcc.dg/vect/vect-over-widen-18.c          |   2 +-\n gcc/testsuite/gcc.dg/vect/vect-over-widen-5.c |   2 +-\n gcc/testsuite/gcc.dg/vect/vect-over-widen-6.c |   2 +-\n gcc/testsuite/gcc.dg/vect/vect-over-widen-7.c |   2 +-\n gcc/testsuite/gcc.dg/vect/vect-over-widen-8.c |   2 +-\n gcc/testsuite/gcc.dg/vect/vect-over-widen-9.c |   2 +-\n gcc/testsuite/gcc.target/aarch64/popcnt-sve.c |  10 +-\n gcc/testsuite/gcc.target/aarch64/sve/slp_6.c  |   3 -\n .../gcc.target/aarch64/sve/slp_pred_1.c       |  33 +\n .../gcc.target/aarch64/sve/slp_pred_1_run.c   |   6 +\n .../gcc.target/aarch64/sve/slp_pred_2.c       |  33 +\n .../gcc.target/aarch64/sve/slp_pred_3.c       |  33 +\n .../gcc.target/aarch64/sve/slp_pred_3_run.c   |   6 +\n .../gcc.target/aarch64/sve/slp_pred_4.c       |  33 +\n .../gcc.target/aarch64/sve/slp_pred_5.c       |  36 +\n .../gcc.target/aarch64/sve/slp_pred_6.c       |  39 +\n .../gcc.target/aarch64/sve/slp_pred_6_run.c   |   6 +\n .../gcc.target/aarch64/sve/slp_pred_7.c       |  38 +\n .../gcc.target/aarch64/sve/slp_pred_harness.h |  28 +\n gcc/tree-vect-loop.cc                         |  44 +-\n gcc/tree-vect-slp.cc                          | 312 ++++++-\n gcc/tree-vect-stmts.cc                        | 786 +++++++++++-------\n gcc/tree-vectorizer.h                         | 109 ++-\n 30 files changed, 1287 insertions(+), 363 deletions(-)\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_1.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_1_run.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_2.c\n create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve/slp_pred_3.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_3_run.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_4.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_5.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_6.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_6_run.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_7.c\n create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/slp_pred_harness.h"}