{"id":2197809,"url":"http://patchwork.ozlabs.org/api/1.0/patches/2197809/?format=json","project":{"id":21,"url":"http://patchwork.ozlabs.org/api/1.0/projects/21/?format=json","name":"Linux Tegra Development","link_name":"linux-tegra","list_id":"linux-tegra.vger.kernel.org","list_email":"linux-tegra@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null},"msgid":"<20260218145809.1622856-7-bwicaksono@nvidia.com>","date":"2026-02-18T14:58:07","name":"[v2,6/8] perf: add NVIDIA Tegra410 CPU Memory Latency PMU","commit_ref":null,"pull_url":null,"state":"handled-elsewhere","archived":false,"hash":"9422bd0a07345ac7101fa3eb050523c5a01a75ae","submitter":{"id":83903,"url":"http://patchwork.ozlabs.org/api/1.0/people/83903/?format=json","name":"Besar Wicaksono","email":"bwicaksono@nvidia.com"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/linux-tegra/patch/20260218145809.1622856-7-bwicaksono@nvidia.com/mbox/","series":[{"id":492565,"url":"http://patchwork.ozlabs.org/api/1.0/series/492565/?format=json","date":"2026-02-18T14:58:01","name":"perf: add NVIDIA Tegra410 Uncore PMU support","version":2,"mbox":"http://patchwork.ozlabs.org/series/492565/mbox/"}],"check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2197809/checks/","tags":{},"headers":{"Return-Path":"\n <linux-tegra+bounces-12058-incoming=patchwork.ozlabs.org@vger.kernel.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-tegra@vger.kernel.org"],"Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=Nvidia.com header.i=@Nvidia.com header.a=rsa-sha256\n header.s=selector2 header.b=MglpCu4Y;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c0a:e001:db::12fc:5321; helo=sea.lore.kernel.org;\n envelope-from=linux-tegra+bounces-12058-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com\n header.b=\"MglpCu4Y\"","smtp.subspace.kernel.org;\n arc=fail smtp.client-ip=52.101.56.30","smtp.subspace.kernel.org;\n dmarc=pass (p=reject dis=none) header.from=nvidia.com","smtp.subspace.kernel.org;\n spf=fail smtp.mailfrom=nvidia.com"],"Received":["from sea.lore.kernel.org (sea.lore.kernel.org\n [IPv6:2600:3c0a:e001:db::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fGKTL4898z1xvq\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 19 Feb 2026 02:01:10 +1100 (AEDT)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id 3B36E302FABA\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 18 Feb 2026 14:59:23 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 1E99833E36A;\n\tWed, 18 Feb 2026 14:59:23 +0000 (UTC)","from BN1PR04CU002.outbound.protection.outlook.com\n (mail-eastus2azon11010030.outbound.protection.outlook.com [52.101.56.30])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id C7A7033EB10;\n\tWed, 18 Feb 2026 14:59:20 +0000 (UTC)","from 
BL1PR13CA0324.namprd13.prod.outlook.com (2603:10b6:208:2c1::29)\n by IA0PPFB67404FBA.namprd12.prod.outlook.com (2603:10b6:20f:fc04::be2) with\n Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9611.16; Wed, 18 Feb\n 2026 14:59:10 +0000","from BL02EPF00021F68.namprd02.prod.outlook.com\n (2603:10b6:208:2c1:cafe::4) by BL1PR13CA0324.outlook.office365.com\n (2603:10b6:208:2c1::29) with Microsoft SMTP Server (version=TLS1_3,\n cipher=TLS_AES_256_GCM_SHA384) id 15.20.9632.13 via Frontend Transport; Wed,\n 18 Feb 2026 14:59:06 +0000","from mail.nvidia.com (216.228.117.161) by\n BL02EPF00021F68.mail.protection.outlook.com (10.167.249.4) with Microsoft\n SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id\n 15.20.9632.12 via Frontend Transport; Wed, 18 Feb 2026 14:59:10 +0000","from rnnvmail202.nvidia.com (10.129.68.7) by mail.nvidia.com\n (10.129.200.67) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.20; Wed, 18 Feb\n 2026 06:58:52 -0800","from rnnvmail201.nvidia.com (10.129.68.8) by rnnvmail202.nvidia.com\n (10.129.68.7) with Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.20; Wed, 18 Feb\n 2026 06:58:52 -0800","from build-bwicaksono-noble-20251018.internal (10.127.8.11) by\n mail.nvidia.com (10.129.68.8) with Microsoft SMTP Server id 15.2.2562.20 via\n Frontend Transport; Wed, 18 Feb 2026 06:58:50 -0800"],"ARC-Seal":["i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1771426762; cv=fail;\n b=QoscvO/u1XRzpCnuzFFTFijFP3gGNTjZUPFOy/MDw8Qbv8Ezc2Lf9OMSHsWRW+ArCFEPqHtxhk1RXvFuyE+1F8BeVaZmoS93bpijviKtbpvlO7VKTJ5q+Ms8vReBUjXtSZWNvTH6sgDqNaQgVYAZ85r9f8Ssq7z72EuBLtBM8FY=","i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none;\n b=o6kNS6OrcMsQEeJxbe+AQ+dbiyxCD5TbabKBz6rY4EyixfkuYiXgsZ7IJhV8vlLelO7MzPCfqzcPUGW95mawGBkdObOVwXB3b7loWuEBfSgErNRUgDWcqSOD4EB4PmDfFwlQv9lX8I/9JowHBdIagwj7sy4AJHxJye2x83ibRJlQS2QI+/DfqLC56aGvWKR1rH0/F2KTfRz6VWmJEt/09gLFoHryEmjzhFfRAm1oeWYFa7KdarHLMnKip82jtEtRCBtgpAzY2RSq59W4Vr648SUO+OBsZrNQYnWg63rUKczizBoRCDmQC6LAfX4QBQETfXEmrHvUpWmrLcqem0ij/A=="],"ARC-Message-Signature":["i=2; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1771426762; c=relaxed/simple;\n\tbh=4Zo8G2GQc+ynfprhsRlvKdXblVQagSZkucYk3iOGjYg=;\n\th=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References:\n\t MIME-Version:Content-Type;\n b=fnsD5oFenoZuNljUy/pLI5yxFP1j7xP1EENVZO4YqSlClpo5z/UyXLznJZphLQxM9M2YCenQJLDWwtGiMEj8tcOWypN9icLc+sqSI9CA2P2rG2TjOtEPfSNBucgKHaoFXnIQtALIgoSUDUqapYphtOSaIudTyCzB4zlSJhl7N4Y=","i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector10001;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=qV2kCcc12PtTJ9uINu++HCQcd9CrIBuD0QBzJqAYm1g=;\n b=yTys9FBjezbO7LtWcz87eUA0519Jp3sV1h+sm/zpRjzP8cmmDZcyLBXpV+G99VYorLjign28qa0KB6wLYLE1lTW+7hYbNPIpKjJUGaLGoGV2U44hSuJWjvMcfcxc0pLwqTJNYYnn32zp9IwFRUYveo4eFfxROAm27bo/ZddrvVlTXnv5d7E6Rup3ogeiSGeaUjlUjek1J5gu3j2Mu5riHTHR9nKeVZqxgTxj2k1mS7lEDsHgvJAlqpqN/atiVTnmW/QewYGGDt52NUYwT4VC5BMxktPgdLdDEdNLFOHz0F92rfKAu9tgUHDgh20UqW0WdDSBFherDrRxeJ/XjqjqAg=="],"ARC-Authentication-Results":["i=2; smtp.subspace.kernel.org;\n dmarc=pass (p=reject dis=none) header.from=nvidia.com;\n spf=fail smtp.mailfrom=nvidia.com;\n dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com\n header.b=MglpCu4Y; arc=fail 
From: Besar Wicaksono <bwicaksono@nvidia.com>
To: <will@kernel.org>, <suzuki.poulose@arm.com>, <robin.murphy@arm.com>,
	<ilkka@os.amperecomputing.com>
Cc: <linux-arm-kernel@lists.infradead.org>, <linux-kernel@vger.kernel.org>,
	<linux-tegra@vger.kernel.org>, <mark.rutland@arm.com>, <treding@nvidia.com>,
	<jonathanh@nvidia.com>, <vsethi@nvidia.com>, <rwiley@nvidia.com>,
	<sdonthineni@nvidia.com>, <skelley@nvidia.com>, <ywan@nvidia.com>,
	<mochs@nvidia.com>, <nirmoyd@nvidia.com>,
	Besar Wicaksono <bwicaksono@nvidia.com>
Subject: [PATCH v2 6/8] perf: add NVIDIA Tegra410 CPU Memory Latency PMU
Date: Wed, 18 Feb 2026 14:58:07 +0000
Message-ID: <20260218145809.1622856-7-bwicaksono@nvidia.com>
In-Reply-To: <20260218145809.1622856-1-bwicaksono@nvidia.com>
References: <20260218145809.1622856-1-bwicaksono@nvidia.com>

Add CPU Memory (CMEM) Latency PMU support to the Tegra410 SoC.
The PMU measures the latency of memory read requests from the edge
of the Unified Coherence Fabric to the local system DRAM.

Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com>
---
 .../admin-guide/perf/nvidia-tegra410-pmu.rst  |  25 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/nvidia_t410_cmem_latency_pmu.c   | 727 ++++++++++++++++++
 4 files changed, 760 insertions(+)
 create mode 100644 drivers/perf/nvidia_t410_cmem_latency_pmu.c
diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
index 07dc447eead7..c8fbc289d12c 100644
--- a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
+++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
@@ -8,6 +8,7 @@ metrics like memory bandwidth, latency, and utilization:
 * Unified Coherence Fabric (UCF)
 * PCIE
 * PCIE-TGT
+* CPU Memory (CMEM) Latency
 
 PMU Driver
 ----------
@@ -342,3 +343,27 @@ Example usage:
   0x10000 to 0x100FF on socket 0's PCIE RC-1::
 
     perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/
+
+CPU Memory (CMEM) Latency PMU
+-----------------------------
+
+This PMU monitors latency events of memory read requests from the edge of the
+Unified Coherence Fabric (UCF) to local CPU DRAM:
+
+  * RD_REQ counters: count read requests (32B per request).
+  * RD_CUM_OUTS counters: accumulated outstanding request counters, which
+    track how many cycles the read requests are in flight.
+  * CYCLES counter: counts the number of elapsed cycles.
+
+The average latency is calculated as::
+
+   FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
+   AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
+   AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
+
+The events and configuration options of this PMU device are described in sysfs,
+see /sys/bus/event_source/devices/nvidia_cmem_latency_pmu_<socket-id>.
+
+Example usage::
+
+  perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}'
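
To make the latency formula concrete, here is a minimal userspace sketch
(illustrative only, not part of this patch) that applies it to made-up counter
values from a 2.0 s counting window. The three inputs are exactly what the
grouped perf stat example above reports.

/* Illustrative only: apply the CMEM latency formula to made-up counts. */
#include <stdio.h>
#include <stdint.h>

static double avg_latency_ns(uint64_t cycles, uint64_t rd_req,
			     uint64_t rd_cum_outs, double elapsed_ns)
{
	double freq_ghz = (double)cycles / elapsed_ns;	/* cycles per ns */
	double lat_cycles = (double)rd_cum_outs / (double)rd_req;

	return lat_cycles / freq_ghz;
}

int main(void)
{
	/*
	 * Hypothetical values: 2.4e9 cycles, 1e6 reads and 2.5e8 outstanding
	 * cycles over a 2.0 s window -> 250 cycles at 1.2 GHz, ~208.3 ns.
	 */
	printf("%.1f ns\n",
	       avg_latency_ns(2400000000ULL, 1000000ULL, 250000000ULL, 2.0e9));
	return 0;
}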
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 638321fc9800..26e86067d8f9 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -311,4 +311,11 @@ config MARVELL_PEM_PMU
 	  Enable support for PCIe Interface performance monitoring
 	  on Marvell platform.
 
+config NVIDIA_TEGRA410_CMEM_LATENCY_PMU
+	tristate "NVIDIA Tegra410 CPU Memory Latency PMU"
+	depends on ARM64 && ACPI
+	help
+	  Enable perf support for CPU memory latency counter monitoring on
+	  the NVIDIA Tegra410 SoC.
+
 endmenu
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index ea52711a87e3..4aa6aad393c2 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -35,3 +35,4 @@ obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
 obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
 obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
+obj-$(CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU) += nvidia_t410_cmem_latency_pmu.o
diff --git a/drivers/perf/nvidia_t410_cmem_latency_pmu.c b/drivers/perf/nvidia_t410_cmem_latency_pmu.c
new file mode 100644
index 000000000000..9b466581c8fc
--- /dev/null
+++ b/drivers/perf/nvidia_t410_cmem_latency_pmu.c
@@ -0,0 +1,727 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVIDIA Tegra410 CPU Memory (CMEM) Latency PMU driver.
+ *
+ * Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ */
+
+#include <linux/acpi.h>
+#include <linux/bitops.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+
+#define NUM_INSTANCES    14
+#define BCAST(pmu) pmu->base[NUM_INSTANCES]
+
+/* Register offsets. */
+#define CG_CTRL         0x800
+#define CTRL            0x808
+#define STATUS          0x810
+#define CYCLE_CNTR      0x818
+#define MC0_REQ_CNTR    0x820
+#define MC0_AOR_CNTR    0x830
+#define MC1_REQ_CNTR    0x838
+#define MC1_AOR_CNTR    0x848
+#define MC2_REQ_CNTR    0x850
+#define MC2_AOR_CNTR    0x860
+
+/* CTRL values. */
+#define CTRL_DISABLE    0x0ULL
+#define CTRL_ENABLE     0x1ULL
+#define CTRL_CLR        0x2ULL
+
+/* CG_CTRL values. */
+#define CG_CTRL_DISABLE    0x0ULL
+#define CG_CTRL_ENABLE     0x1ULL
+
+/* STATUS register field. */
+#define STATUS_CYCLE_OVF      BIT(0)
+#define STATUS_MC0_AOR_OVF    BIT(1)
+#define STATUS_MC0_REQ_OVF    BIT(3)
+#define STATUS_MC1_AOR_OVF    BIT(4)
+#define STATUS_MC1_REQ_OVF    BIT(6)
+#define STATUS_MC2_AOR_OVF    BIT(7)
+#define STATUS_MC2_REQ_OVF    BIT(9)
+
+/* Events. */
+#define EVENT_CYCLES    0x0
+#define EVENT_REQ       0x1
+#define EVENT_AOR       0x2
+
+#define NUM_EVENTS           0x3
+#define MASK_EVENT           0x3
+#define MAX_ACTIVE_EVENTS    32
+
+#define ACTIVE_CPU_MASK        0x0
+#define ASSOCIATED_CPU_MASK    0x1
+
+static unsigned long cmem_lat_pmu_cpuhp_state;
+
+struct cmem_lat_pmu_hw_events {
+	struct perf_event *events[MAX_ACTIVE_EVENTS];
+	DECLARE_BITMAP(used_ctrs, MAX_ACTIVE_EVENTS);
+};
+
+struct cmem_lat_pmu {
+	struct pmu pmu;
+	struct device *dev;
+	const char *name;
+	const char *identifier;
+	void __iomem *base[NUM_INSTANCES + 1];
+	cpumask_t associated_cpus;
+	cpumask_t active_cpu;
+	struct hlist_node node;
+	struct cmem_lat_pmu_hw_events hw_events;
+};
+
+#define to_cmem_lat_pmu(p) \
+	container_of(p, struct cmem_lat_pmu, pmu)
+
+/* Get event type from perf_event. */
+static inline u32 get_event_type(struct perf_event *event)
+{
+	return (event->attr.config) & MASK_EVENT;
+}
+
+/* PMU operations. */
+static int cmem_lat_pmu_get_event_idx(struct cmem_lat_pmu_hw_events *hw_events,
+				struct perf_event *event)
+{
+	unsigned int idx;
+
+	idx = find_first_zero_bit(hw_events->used_ctrs, MAX_ACTIVE_EVENTS);
+	if (idx >= MAX_ACTIVE_EVENTS)
+		return -EAGAIN;
+
+	set_bit(idx, hw_events->used_ctrs);
+
+	return idx;
+}
+
+static bool cmem_lat_pmu_validate_event(struct pmu *pmu,
+				 struct cmem_lat_pmu_hw_events *hw_events,
+				 struct perf_event *event)
+{
+	if (is_software_event(event))
+		return true;
+
+	/* Reject groups spanning multiple HW PMUs. */
+	if (event->pmu != pmu)
+		return false;
+
+	return (cmem_lat_pmu_get_event_idx(hw_events, event) >= 0);
+}
+
+/*
+ * Make sure the group of events can be scheduled at once
+ * on the PMU.
+ */
+static bool cmem_lat_pmu_validate_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+	struct cmem_lat_pmu_hw_events fake_hw_events;
+
+	if (event->group_leader == event)
+		return true;
+
+	memset(&fake_hw_events, 0, sizeof(fake_hw_events));
+
+	if (!cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events, leader))
+		return false;
+
+	for_each_sibling_event(sibling, leader) {
+		if (!cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events,
+						sibling))
+			return false;
+	}
+
+	return cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events, event);
+}
+
+static int cmem_lat_pmu_event_init(struct perf_event *event)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	u32 event_type = get_event_type(event);
+
+	if (event->attr.type != event->pmu->type ||
+	    event_type >= NUM_EVENTS)
+		return -ENOENT;
+
+	/*
+	 * Following other "uncore" PMUs, we do not support sampling mode or
+	 * attach to a task (per-process mode).
+	 */
+	if (is_sampling_event(event)) {
+		dev_dbg(cmem_lat_pmu->pmu.dev,
+			"Can't support sampling events\n");
+		return -EOPNOTSUPP;
+	}
+
+	if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) {
+		dev_dbg(cmem_lat_pmu->pmu.dev,
+			"Can't support per-task counters\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Make sure the CPU assignment is on one of the CPUs associated with
+	 * this PMU.
+	 */
+	if (!cpumask_test_cpu(event->cpu, &cmem_lat_pmu->associated_cpus)) {
+		dev_dbg(cmem_lat_pmu->pmu.dev,
+			"Requested cpu is not associated with the PMU\n");
+		return -EINVAL;
+	}
+
+	/* Enforce the current active CPU to handle the events in this PMU. */
+	event->cpu = cpumask_first(&cmem_lat_pmu->active_cpu);
+	if (event->cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	if (!cmem_lat_pmu_validate_group(event))
+		return -EINVAL;
+
+	hwc->idx = -1;
+	hwc->config = event_type;
+
+	return 0;
+}
+
+static u64 cmem_lat_pmu_read_status(struct cmem_lat_pmu *cmem_lat_pmu,
+				   unsigned int inst)
+{
+	return readq(cmem_lat_pmu->base[inst] + STATUS);
+}
+
+static u64 cmem_lat_pmu_read_cycle_counter(struct perf_event *event)
+{
+	const unsigned int instance = 0;
+	u64 status;
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct device *dev = cmem_lat_pmu->dev;
+
+	/*
+	 * Use the reading from first instance since all instances are
+	 * identical.
+	 */
+	status = cmem_lat_pmu_read_status(cmem_lat_pmu, instance);
+	if (status & STATUS_CYCLE_OVF)
+		dev_warn(dev, "Cycle counter overflow\n");
+
+	return readq(cmem_lat_pmu->base[instance] + CYCLE_CNTR);
+}
+
+static u64 cmem_lat_pmu_read_req_counter(struct perf_event *event)
+{
+	unsigned int i;
+	u64 status, val = 0;
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct device *dev = cmem_lat_pmu->dev;
+
+	/* Sum up the counts from all instances. */
+	for (i = 0; i < NUM_INSTANCES; i++) {
+		status = cmem_lat_pmu_read_status(cmem_lat_pmu, i);
+		if (status & STATUS_MC0_REQ_OVF)
+			dev_warn(dev, "MC0 request counter overflow\n");
+		if (status & STATUS_MC1_REQ_OVF)
+			dev_warn(dev, "MC1 request counter overflow\n");
+		if (status & STATUS_MC2_REQ_OVF)
+			dev_warn(dev, "MC2 request counter overflow\n");
+
+		val += readq(cmem_lat_pmu->base[i] + MC0_REQ_CNTR);
+		val += readq(cmem_lat_pmu->base[i] + MC1_REQ_CNTR);
+		val += readq(cmem_lat_pmu->base[i] + MC2_REQ_CNTR);
+	}
+
+	return val;
+}
+
+static u64 cmem_lat_pmu_read_aor_counter(struct perf_event *event)
+{
+	unsigned int i;
+	u64 status, val = 0;
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct device *dev = cmem_lat_pmu->dev;
+
+	/* Sum up the counts from all instances. */
+	for (i = 0; i < NUM_INSTANCES; i++) {
+		status = cmem_lat_pmu_read_status(cmem_lat_pmu, i);
+		if (status & STATUS_MC0_AOR_OVF)
+			dev_warn(dev, "MC0 AOR counter overflow\n");
+		if (status & STATUS_MC1_AOR_OVF)
+			dev_warn(dev, "MC1 AOR counter overflow\n");
+		if (status & STATUS_MC2_AOR_OVF)
+			dev_warn(dev, "MC2 AOR counter overflow\n");
+
+		val += readq(cmem_lat_pmu->base[i] + MC0_AOR_CNTR);
+		val += readq(cmem_lat_pmu->base[i] + MC1_AOR_CNTR);
+		val += readq(cmem_lat_pmu->base[i] + MC2_AOR_CNTR);
+	}
+
+	return val;
+}
+
+static u64 (*read_counter_fn[NUM_EVENTS])(struct perf_event *) = {
+	[EVENT_CYCLES] = cmem_lat_pmu_read_cycle_counter,
+	[EVENT_REQ] = cmem_lat_pmu_read_req_counter,
+	[EVENT_AOR] = cmem_lat_pmu_read_aor_counter,
+};
+
+static void cmem_lat_pmu_event_update(struct perf_event *event)
+{
+	u32 event_type;
+	u64 prev, now;
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (hwc->state & PERF_HES_STOPPED)
+		return;
+
+	event_type = hwc->config;
+
+	do {
+		prev = local64_read(&hwc->prev_count);
+		now = read_counter_fn[event_type](event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev);
+
+	local64_add(now - prev, &event->count);
+
+	hwc->state |= PERF_HES_UPTODATE;
+}
+
+static void cmem_lat_pmu_start(struct perf_event *event, int pmu_flags)
+{
+	event->hw.state = 0;
+}
+
+static void cmem_lat_pmu_stop(struct perf_event *event, int pmu_flags)
+{
+	event->hw.state |= PERF_HES_STOPPED;
+}
+
+static int cmem_lat_pmu_add(struct perf_event *event, int flags)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct cmem_lat_pmu_hw_events *hw_events = &cmem_lat_pmu->hw_events;
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+
+	if (WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
+					   &cmem_lat_pmu->associated_cpus)))
+		return -ENOENT;
+
+	idx = cmem_lat_pmu_get_event_idx(hw_events, event);
+	if (idx < 0)
+		return idx;
+
+	hw_events->events[idx] = event;
+	hwc->idx = idx;
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		cmem_lat_pmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+	return 0;
+}
+
+static void cmem_lat_pmu_del(struct perf_event *event, int flags)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(event->pmu);
+	struct cmem_lat_pmu_hw_events *hw_events = &cmem_lat_pmu->hw_events;
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	cmem_lat_pmu_stop(event, PERF_EF_UPDATE);
+
+	hw_events->events[idx] = NULL;
+
+	clear_bit(idx, hw_events->used_ctrs);
+
+	perf_event_update_userpage(event);
+}
+
+static void cmem_lat_pmu_read(struct perf_event *event)
+{
+	cmem_lat_pmu_event_update(event);
+}
+
+static inline void cmem_lat_pmu_cg_ctrl(struct cmem_lat_pmu *cmem_lat_pmu, u64 val)
+{
+	writeq(val, BCAST(cmem_lat_pmu) + CG_CTRL);
+}
+
+static inline void cmem_lat_pmu_ctrl(struct cmem_lat_pmu *cmem_lat_pmu, u64 val)
+{
+	writeq(val, BCAST(cmem_lat_pmu) + CTRL);
+}
+
+static void cmem_lat_pmu_enable(struct pmu *pmu)
+{
+	bool disabled;
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(pmu);
+
+	disabled = bitmap_empty(
+		cmem_lat_pmu->hw_events.used_ctrs, MAX_ACTIVE_EVENTS);
+
+	if (disabled)
+		return;
+
+	/* Enable all the counters. */
+	cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_ENABLE);
+	cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_ENABLE);
+}
+
+static void cmem_lat_pmu_disable(struct pmu *pmu)
+{
+	int idx;
+	struct perf_event *event;
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(pmu);
+
+	/* Disable all the counters. */
+	cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_DISABLE);
+
+	/*
+	 * The counters will start from 0 again on restart.
+	 * Update the events immediately to avoid losing the counts.
+	 */
+	for_each_set_bit(
+		idx, cmem_lat_pmu->hw_events.used_ctrs, MAX_ACTIVE_EVENTS) {
+		event = cmem_lat_pmu->hw_events.events[idx];
+
+		if (!event)
+			continue;
+
+		cmem_lat_pmu_event_update(event);
+
+		local64_set(&event->hw.prev_count, 0ULL);
+	}
+
+	cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_CLR);
+	cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_DISABLE);
+}
+
+/* PMU identifier attribute. */
+
+static ssize_t cmem_lat_pmu_identifier_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *page)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(dev_get_drvdata(dev));
+
+	return sysfs_emit(page, "%s\n", cmem_lat_pmu->identifier);
+}
+
+static struct device_attribute cmem_lat_pmu_identifier_attr =
+	__ATTR(identifier, 0444, cmem_lat_pmu_identifier_show, NULL);
+
+static struct attribute *cmem_lat_pmu_identifier_attrs[] = {
+	&cmem_lat_pmu_identifier_attr.attr,
+	NULL,
+};
+
+static struct attribute_group cmem_lat_pmu_identifier_attr_group = {
+	.attrs = cmem_lat_pmu_identifier_attrs,
+};
+
+/* Format attributes. */
+
+#define NV_PMU_EXT_ATTR(_name, _func, _config)			\
+	(&((struct dev_ext_attribute[]){				\
+		{							\
+			.attr = __ATTR(_name, 0444, _func, NULL),	\
+			.var = (void *)_config				\
+		}							\
+	})[0].attr.attr)
+
+static struct attribute *cmem_lat_pmu_formats[] = {
+	NV_PMU_EXT_ATTR(event, device_show_string, "config:0-1"),
+	NULL,
+};
+
+static const struct attribute_group cmem_lat_pmu_format_group = {
+	.name = "format",
+	.attrs = cmem_lat_pmu_formats,
+};
+
+/* Event attributes. */
+
+static ssize_t cmem_lat_pmu_sysfs_event_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct perf_pmu_events_attr *pmu_attr;
+
+	pmu_attr = container_of(attr, typeof(*pmu_attr), attr);
+	return sysfs_emit(buf, "event=0x%llx\n", pmu_attr->id);
+}
+
+#define NV_PMU_EVENT_ATTR(_name, _config)	\
+	PMU_EVENT_ATTR_ID(_name, cmem_lat_pmu_sysfs_event_show, _config)
+
+static struct attribute *cmem_lat_pmu_events[] = {
+	NV_PMU_EVENT_ATTR(cycles, EVENT_CYCLES),
+	NV_PMU_EVENT_ATTR(rd_req, EVENT_REQ),
+	NV_PMU_EVENT_ATTR(rd_cum_outs, EVENT_AOR),
+	NULL
+};
+
+static const struct attribute_group cmem_lat_pmu_events_group = {
+	.name = "events",
+	.attrs = cmem_lat_pmu_events,
+};
+
+/* Cpumask attributes. */
+
+static ssize_t cmem_lat_pmu_cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct cmem_lat_pmu *cmem_lat_pmu = to_cmem_lat_pmu(pmu);
+	struct dev_ext_attribute *eattr =
+		container_of(attr, struct dev_ext_attribute, attr);
+	unsigned long mask_id = (unsigned long)eattr->var;
+	const cpumask_t *cpumask;
+
+	switch (mask_id) {
+	case ACTIVE_CPU_MASK:
+		cpumask = &cmem_lat_pmu->active_cpu;
+		break;
+	case ASSOCIATED_CPU_MASK:
+		cpumask = &cmem_lat_pmu->associated_cpus;
+		break;
+	default:
+		return 0;
+	}
+	return cpumap_print_to_pagebuf(true, buf, cpumask);
+}
+
+#define NV_PMU_CPUMASK_ATTR(_name, _config)			\
+	NV_PMU_EXT_ATTR(_name, cmem_lat_pmu_cpumask_show,	\
+				(unsigned long)_config)
+
+static struct attribute *cmem_lat_pmu_cpumask_attrs[] = {
+	NV_PMU_CPUMASK_ATTR(cpumask, ACTIVE_CPU_MASK),
+	NV_PMU_CPUMASK_ATTR(associated_cpus, ASSOCIATED_CPU_MASK),
+	NULL,
+};
+
+static const struct attribute_group cmem_lat_pmu_cpumask_attr_group = {
+	.attrs = cmem_lat_pmu_cpumask_attrs,
+};
+
+/* Per PMU device attribute groups. */
+
+static const struct attribute_group *cmem_lat_pmu_attr_groups[] = {
+	&cmem_lat_pmu_identifier_attr_group,
+	&cmem_lat_pmu_format_group,
+	&cmem_lat_pmu_events_group,
+	&cmem_lat_pmu_cpumask_attr_group,
+	NULL,
+};
+
+static int cmem_lat_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu =
+		hlist_entry_safe(node, struct cmem_lat_pmu, node);
+
+	if (!cpumask_test_cpu(cpu, &cmem_lat_pmu->associated_cpus))
+		return 0;
+
+	/* If the PMU is already managed, there is nothing to do */
+	if (!cpumask_empty(&cmem_lat_pmu->active_cpu))
+		return 0;
+
+	/* Use this CPU for event counting */
+	cpumask_set_cpu(cpu, &cmem_lat_pmu->active_cpu);
+
+	return 0;
+}
+
+static int cmem_lat_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
+{
+	unsigned int dst;
+
+	struct cmem_lat_pmu *cmem_lat_pmu =
+		hlist_entry_safe(node, struct cmem_lat_pmu, node);
+
+	/* Nothing to do if this CPU doesn't own the PMU */
+	if (!cpumask_test_and_clear_cpu(cpu, &cmem_lat_pmu->active_cpu))
+		return 0;
+
+	/* Choose a new CPU to migrate ownership of the PMU to */
+	dst = cpumask_any_and_but(&cmem_lat_pmu->associated_cpus,
+				  cpu_online_mask, cpu);
+	if (dst >= nr_cpu_ids)
+		return 0;
+
+	/* Use this CPU for event counting */
+	perf_pmu_migrate_context(&cmem_lat_pmu->pmu, cpu, dst);
+	cpumask_set_cpu(dst, &cmem_lat_pmu->active_cpu);
+
+	return 0;
+}
+
+static int cmem_lat_pmu_get_cpus(struct cmem_lat_pmu *cmem_lat_pmu,
+				unsigned int socket)
+{
+	int ret = 0, cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (cpu_to_node(cpu) == socket)
+			cpumask_set_cpu(cpu, &cmem_lat_pmu->associated_cpus);
+	}
+
+	if (cpumask_empty(&cmem_lat_pmu->associated_cpus)) {
+		dev_dbg(cmem_lat_pmu->dev,
+			"No cpu associated with PMU socket-%u\n", socket);
+		ret = -ENODEV;
+	}
+
+	return ret;
+}
+
+static int cmem_lat_pmu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct acpi_device *acpi_dev;
+	struct cmem_lat_pmu *cmem_lat_pmu;
+	char *name, *uid_str;
+	int ret, i;
+	u32 socket;
+
+	acpi_dev = ACPI_COMPANION(dev);
+	if (!acpi_dev)
+		return -ENODEV;
+
+	uid_str = acpi_device_uid(acpi_dev);
+	if (!uid_str)
+		return -ENODEV;
+
+	ret = kstrtou32(uid_str, 0, &socket);
+	if (ret)
+		return ret;
+
+	cmem_lat_pmu = devm_kzalloc(dev, sizeof(*cmem_lat_pmu), GFP_KERNEL);
+	name = devm_kasprintf(dev, GFP_KERNEL, "nvidia_cmem_latency_pmu_%u", socket);
+	if (!cmem_lat_pmu || !name)
+		return -ENOMEM;
+
+	cmem_lat_pmu->dev = dev;
+	cmem_lat_pmu->name = name;
+	cmem_lat_pmu->identifier = acpi_device_hid(acpi_dev);
+	platform_set_drvdata(pdev, cmem_lat_pmu);
+
+	cmem_lat_pmu->pmu = (struct pmu) {
+		.parent		= &pdev->dev,
+		.task_ctx_nr	= perf_invalid_context,
+		.pmu_enable	= cmem_lat_pmu_enable,
+		.pmu_disable	= cmem_lat_pmu_disable,
+		.event_init	= cmem_lat_pmu_event_init,
+		.add		= cmem_lat_pmu_add,
+		.del		= cmem_lat_pmu_del,
+		.start		= cmem_lat_pmu_start,
+		.stop		= cmem_lat_pmu_stop,
+		.read		= cmem_lat_pmu_read,
+		.attr_groups	= cmem_lat_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE |
+					PERF_PMU_CAP_NO_INTERRUPT,
+	};
+
+	/* Map the address of all the instances plus one for the broadcast. */
+	for (i = 0; i < NUM_INSTANCES + 1; i++) {
+		cmem_lat_pmu->base[i] = devm_platform_ioremap_resource(pdev, i);
+		if (IS_ERR(cmem_lat_pmu->base[i])) {
+			dev_err(dev, "Failed to map address for instance %d\n", i);
+			return PTR_ERR(cmem_lat_pmu->base[i]);
+		}
+	}
+
+	ret = cmem_lat_pmu_get_cpus(cmem_lat_pmu, socket);
+	if (ret)
+		return ret;
+
+	ret = cpuhp_state_add_instance(cmem_lat_pmu_cpuhp_state,
+				       &cmem_lat_pmu->node);
+	if (ret) {
+		dev_err(&pdev->dev, "Error %d registering hotplug\n", ret);
+		return ret;
+	}
+
+	cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_ENABLE);
+	cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_CLR);
+	cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_DISABLE);
+
+	ret = perf_pmu_register(&cmem_lat_pmu->pmu, name, -1);
+	if (ret) {
+		dev_err(&pdev->dev, "Failed to register PMU: %d\n", ret);
+		cpuhp_state_remove_instance(cmem_lat_pmu_cpuhp_state,
+					    &cmem_lat_pmu->node);
+		return ret;
+	}
+
+	dev_dbg(&pdev->dev, "Registered %s PMU\n", name);
+
+	return 0;
+}
+
+static void cmem_lat_pmu_device_remove(struct platform_device *pdev)
+{
+	struct cmem_lat_pmu *cmem_lat_pmu = platform_get_drvdata(pdev);
+
+	perf_pmu_unregister(&cmem_lat_pmu->pmu);
+	cpuhp_state_remove_instance(cmem_lat_pmu_cpuhp_state,
+				    &cmem_lat_pmu->node);
+}
+
+static const struct acpi_device_id cmem_lat_pmu_acpi_match[] = {
+	{ "NVDA2021", },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, cmem_lat_pmu_acpi_match);
+
+static struct platform_driver cmem_lat_pmu_driver = {
+	.driver = {
+		.name = "nvidia-t410-cmem-latency-pmu",
+		.acpi_match_table = ACPI_PTR(cmem_lat_pmu_acpi_match),
+		.suppress_bind_attrs = true,
+	},
+	.probe = cmem_lat_pmu_probe,
+	.remove = cmem_lat_pmu_device_remove,
+};
+
+static int __init cmem_lat_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/nvidia/cmem_latency:online",
+				      cmem_lat_pmu_cpu_online,
+				      cmem_lat_pmu_cpu_teardown);
+	if (ret < 0)
+		return ret;
+
+	cmem_lat_pmu_cpuhp_state = ret;
+
+	return platform_driver_register(&cmem_lat_pmu_driver);
+}
+
+static void __exit cmem_lat_pmu_exit(void)
+{
+	platform_driver_unregister(&cmem_lat_pmu_driver);
+	cpuhp_remove_multi_state(cmem_lat_pmu_cpuhp_state);
+}
+
+module_init(cmem_lat_pmu_init);
+module_exit(cmem_lat_pmu_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("NVIDIA Tegra410 CPU Memory Latency PMU driver");
+MODULE_AUTHOR("Besar Wicaksono <bwicaksono@nvidia.com>");
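
The grouped perf stat example in the documentation can also be reproduced
programmatically. Below is a minimal sketch, not part of the patch, that opens
the PMU's three events as one group with perf_event_open(2). The type value
(42) is a placeholder for the number in the PMU's sysfs "type" file, CPU 0 is
assumed to be in the PMU's cpumask, and error handling is omitted for brevity.
The event numbers 0x0/0x1/0x2 match the cycles, rd_req, and rd_cum_outs event
attributes defined by the driver.

/*
 * Illustrative only: open the PMU's cycles/rd_req/rd_cum_outs events as one
 * group via perf_event_open(2).
 */
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_event(uint32_t pmu_type, uint64_t config, int group_fd, int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;	/* dynamic PMU type from sysfs */
	attr.config = config;	/* event number, format "config:0-1" */

	/* pid == -1, cpu >= 0: system-wide counting, as event_init() requires. */
	return syscall(SYS_perf_event_open, &attr, -1, cpu, group_fd, 0);
}

int main(void)
{
	uint32_t type = 42;	/* placeholder: read from the sysfs "type" file */
	int cpu = 0;		/* placeholder: must be in the PMU's cpumask */
	uint64_t count;
	int fd[3], i;

	fd[0] = open_event(type, 0x0, -1, cpu);		/* cycles (leader) */
	fd[1] = open_event(type, 0x1, fd[0], cpu);	/* rd_req */
	fd[2] = open_event(type, 0x2, fd[0], cpu);	/* rd_cum_outs */

	ioctl(fd[0], PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
	ioctl(fd[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
	sleep(2);
	ioctl(fd[0], PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

	for (i = 0; i < 3; i++) {
		read(fd[i], &count, sizeof(count));
		printf("event 0x%x: %llu\n", i, (unsigned long long)count);
		close(fd[i]);
	}

	return 0;
}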