{"id":2175198,"url":"http://patchwork.ozlabs.org/api/1.0/covers/2175198/?format=json","project":{"id":28,"url":"http://patchwork.ozlabs.org/api/1.0/projects/28/?format=json","name":"Linux PCI development","link_name":"linux-pci","list_id":"linux-pci.vger.kernel.org","list_email":"linux-pci@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null},"msgid":"<20251217151609.3162665-1-den@valinux.co.jp>","date":"2025-12-17T15:15:34","name":"[RFC,v3,00/35] NTB transport backed by endpoint DW eDMA","submitter":{"id":91573,"url":"http://patchwork.ozlabs.org/api/1.0/people/91573/?format=json","name":"Koichiro Den","email":"den@valinux.co.jp"},"series":[{"id":485709,"url":"http://patchwork.ozlabs.org/api/1.0/series/485709/?format=json","date":"2025-12-17T15:15:53","name":"NTB transport backed by endpoint DW eDMA","version":3,"mbox":"http://patchwork.ozlabs.org/series/485709/mbox/"}],"headers":{"Return-Path":"\n <linux-pci+bounces-43175-incoming=patchwork.ozlabs.org@vger.kernel.org>","X-Original-To":["incoming@patchwork.ozlabs.org","linux-pci@vger.kernel.org"],"Delivered-To":"patchwork-incoming@legolas.ozlabs.org","Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=valinux.co.jp header.i=@valinux.co.jp\n header.a=rsa-sha256 header.s=selector1 header.b=jgQP6t3e;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org\n (client-ip=2600:3c0a:e001:db::12fc:5321; helo=sea.lore.kernel.org;\n envelope-from=linux-pci+bounces-43175-incoming=patchwork.ozlabs.org@vger.kernel.org;\n receiver=patchwork.ozlabs.org)","smtp.subspace.kernel.org;\n\tdkim=pass (1024-bit key) header.d=valinux.co.jp header.i=@valinux.co.jp\n header.b=\"jgQP6t3e\"","smtp.subspace.kernel.org;\n arc=fail smtp.client-ip=40.107.74.52","smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=valinux.co.jp","smtp.subspace.kernel.org;\n spf=pass smtp.mailfrom=valinux.co.jp","dkim=none (message not signed)\n header.d=none;dmarc=none action=none header.from=valinux.co.jp;"],"Received":["from sea.lore.kernel.org (sea.lore.kernel.org\n [IPv6:2600:3c0a:e001:db::12fc:5321])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4dWcy11P2fz1xty\n\tfor <incoming@patchwork.ozlabs.org>; Thu, 18 Dec 2025 02:23:21 +1100 (AEDT)","from smtp.subspace.kernel.org (conduit.subspace.kernel.org\n [100.90.174.1])\n\tby sea.lore.kernel.org (Postfix) with ESMTP id 1CCBA30DA91E\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 17 Dec 2025 15:16:32 +0000 (UTC)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby smtp.subspace.kernel.org (Postfix) with ESMTP id 6128B33984F;\n\tWed, 17 Dec 2025 15:16:23 +0000 (UTC)","from OS0P286CU010.outbound.protection.outlook.com\n (mail-japanwestazon11011052.outbound.protection.outlook.com [40.107.74.52])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby smtp.subspace.kernel.org (Postfix) with ESMTPS id F1F4630FC04;\n\tWed, 17 Dec 2025 15:16:20 +0000 (UTC)","from TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM (2603:1096:400:24c::11)\n by OS9P286MB4633.JPNP286.PROD.OUTLOOK.COM (2603:1096:604:2fc::12) with\n Microsoft SMTP Server (version=TLS1_2,\n cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9434.6; Wed, 17 Dec\n 2025 15:16:15 +0000","from TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM\n ([fe80::fb7e:f4ed:a580:9d03]) by TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM\n ([fe80::fb7e:f4ed:a580:9d03%5]) with mapi id 15.20.9434.001; Wed, 17 Dec 2025\n 15:16:13 +0000"],"ARC-Seal":["i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;\n\tt=1765984583; cv=fail;\n b=eZKlF4LpIpNhbjit+B0zQ8+ktzkvJwiNj4q2LE+vhfbpRrN1BqsTyItQ0J/5WMb4CKBxAc9W9mdAc0dSGD9aypjuDqX5W9FDxWJuIzmOr8MU9rR2TGeGP6+lZZgD843aQt9f34eteU7poLhq1Zt9+6xcORMkzZW0D3IQS5HHDs4=","i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none;\n b=vQFaIyVDmQG0DlFCVrn6zTD519b+MlEPxaL9wLZF4Mnkqy4ol3XDSSrFh7z5lRo55nWgey9y9Fa92zfYjnYg4Ms7qmTvYuirrit/pCoAufIpDq2tDqpucU7YsXRr3Dr4zIJuZSexT8k1O6Ei1R25Dxwmetojg1Pm6/xnQ/2pXzzKUZ3vwIIGmN9XIXDyGyAWaHj/tlCQPuzfTeNAtWELpZ1KYmm30H7K+Sb4/QSsAV9dAUZ+7xOlstfLIhD2oT3AZHWjAXdFtzUCLB0EYmOOQsTtlb9jK2ZLLOMbRYVgVatrxdwQC/RPSNPF1vgO6wuydIxF5tveJx6auvmHL2jU2Q=="],"ARC-Message-Signature":["i=2; a=rsa-sha256; d=subspace.kernel.org;\n\ts=arc-20240116; t=1765984583; c=relaxed/simple;\n\tbh=ES5VNqj4y9m2z1wuiXKQyDYDTfhbeVZVHq2VTYBeHxE=;\n\th=From:To:Cc:Subject:Date:Message-ID:Content-Type:MIME-Version;\n b=C80InA9M+eZreuL4YI2qfVzDFRMAoVEk9OmUGfIsvwEds/d6AfTnOIkC3GDvhNSkEsmCPn5fnlRK5vKiGlLZkozvY9WQZYEovhziZkKVIZAP+EnqR8gVwuZqcELArMz4i7WUwL5X3w8kUiCb8ZMVfG4yujLWFToecyQ1+ydmAJE=","i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n s=arcselector10001;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n bh=o+Yv3ZJsp0yBp1FNd1VCpkcIhRC1lCmig+XExZ1PAKg=;\n b=dwQondfBd6TF8gItNbNkcwi1eI0uph+7hBPpDCwp6EySpncwtGRP9z0xsrW5e7SAm6u4w40XB3nOhGww7Wq2MMAREwPo6Of4pvBkdJCjiJRds9zAvcco5zHdXNjTo7iEqkgcpVu0nXJnovUI3e3ATEzcU2bbT+SZM7IE0t8VmkqPLiCYESRN/6w98+uuIRRmfVJq7CYpcJtP2Eo+Nqm2+v+4VbqzJ4z9YmEzy/SCTprL7UYnzrJY93XIDcN/1OdxH2zxEfKA6FC5T/mggK3t7LyRSaagXUfYiMpWi6Cr8807pDKzTqe6wtW84ihiHVya9H5lDCS0F4Q+Ah+gkkuPaQ=="],"ARC-Authentication-Results":["i=2; smtp.subspace.kernel.org;\n dmarc=pass (p=none dis=none) header.from=valinux.co.jp;\n spf=pass smtp.mailfrom=valinux.co.jp;\n dkim=pass (1024-bit key) header.d=valinux.co.jp header.i=@valinux.co.jp\n header.b=jgQP6t3e; arc=fail smtp.client-ip=40.107.74.52","i=1; mx.microsoft.com 1; spf=pass\n smtp.mailfrom=valinux.co.jp; dmarc=pass action=none\n header.from=valinux.co.jp; dkim=pass header.d=valinux.co.jp; arc=none"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=valinux.co.jp;\n s=selector1;\n h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;\n bh=o+Yv3ZJsp0yBp1FNd1VCpkcIhRC1lCmig+XExZ1PAKg=;\n b=jgQP6t3emVnq4tsjuBARsyXQOPsVfoRn6n6nK/h2sdh4Zetof319YnQy1x+NzX1p4Lr56AdEYF6FkhcwEFw4vHEdyQWuyv1uVj8VWG8QiEAyTwIimunUclYQp8G7SfH1kfSL9FWDUR5nExhTSgEsM65z8IQHuE6gSnUn4JSks/M=","From":"Koichiro Den <den@valinux.co.jp>","To":"Frank.Li@nxp.com,\n\tdave.jiang@intel.com,\n\tntb@lists.linux.dev,\n\tlinux-pci@vger.kernel.org,\n\tdmaengine@vger.kernel.org,\n\tlinux-renesas-soc@vger.kernel.org,\n\tnetdev@vger.kernel.org,\n\tlinux-kernel@vger.kernel.org","Cc":"mani@kernel.org,\n\tkwilczynski@kernel.org,\n\tkishon@kernel.org,\n\tbhelgaas@google.com,\n\tcorbet@lwn.net,\n\tgeert+renesas@glider.be,\n\tmagnus.damm@gmail.com,\n\trobh@kernel.org,\n\tkrzk+dt@kernel.org,\n\tconor+dt@kernel.org,\n\tvkoul@kernel.org,\n\tjoro@8bytes.org,\n\twill@kernel.org,\n\trobin.murphy@arm.com,\n\tjdmason@kudzu.us,\n\tallenbh@gmail.com,\n\tandrew+netdev@lunn.ch,\n\tdavem@davemloft.net,\n\tedumazet@google.com,\n\tkuba@kernel.org,\n\tpabeni@redhat.com,\n\tBasavaraj.Natikar@amd.com,\n\tShyam-sundar.S-k@amd.com,\n\tkurt.schwemmer@microsemi.com,\n\tlogang@deltatee.com,\n\tjingoohan1@gmail.com,\n\tlpieralisi@kernel.org,\n\tutkarsh02t@gmail.com,\n\tjbrunet@baylibre.com,\n\tdlemoal@kernel.org,\n\tarnd@arndb.de,\n\telfring@users.sourceforge.net,\n\tden@valinux.co.jp","Subject":"[RFC PATCH v3 00/35] NTB transport backed by endpoint DW eDMA","Date":"Thu, 18 Dec 2025 00:15:34 +0900","Message-ID":"<20251217151609.3162665-1-den@valinux.co.jp>","X-Mailer":"git-send-email 2.51.0","Content-Transfer-Encoding":"8bit","Content-Type":"text/plain","X-ClientProxiedBy":"TYCPR01CA0127.jpnprd01.prod.outlook.com\n (2603:1096:400:26d::12) To TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM\n (2603:1096:400:24c::11)","Precedence":"bulk","X-Mailing-List":"linux-pci@vger.kernel.org","List-Id":"<linux-pci.vger.kernel.org>","List-Subscribe":"<mailto:linux-pci+subscribe@vger.kernel.org>","List-Unsubscribe":"<mailto:linux-pci+unsubscribe@vger.kernel.org>","MIME-Version":"1.0","X-MS-PublicTrafficType":"Email","X-MS-TrafficTypeDiagnostic":"TYWP286MB2697:EE_|OS9P286MB4633:EE_","X-MS-Office365-Filtering-Correlation-Id":"8dc9d895-7bd6-473d-5a0c-08de3d7f3737","X-MS-Exchange-SenderADCheck":"1","X-MS-Exchange-AntiSpam-Relay":"0","X-Microsoft-Antispam":"\n\tBCL:0;ARA:13230040|10070799003|376014|7416014|1800799024|366016|13003099007;","X-Microsoft-Antispam-Message-Info":"\n P38+Uwro2/KL+jif89boGEamqkMkbtYlTsAzLNDVb4nsHDA77hVvwsLdaW5yFQb3vHqoykUA6rJAWwOW09Z1odoNN2RdDsD0Ij8E/PPhOOf0SkAmyDKPa0/VaJrZ3emm05iv+ypeuH9Nh/9jkHN48zsyISVG+6AKuz++ySVRUeTXD8ZgseNuEMAmfBvxgAMkEa5IGmi3gF1y7QPOnb+C3wgj11FyvikvG7jGX//ltXa514S0lmtLYqVFLxBDP10XhGKopWnpthQONJqhGN6Jy0DljqaKNYhTpbppJMvrJDD7w8AR9DYm/cdlLPITUTOoh0tLtPPVe0PvvxdUcklNiNjr8SP8Vz2+NqHg+CLqe2yXv2KXuFlAPPSxe50ekUkBNyUOYfYj7CVjLI3KAXTsINsuLI9lHlkrgXuOjzC1IT0erTLwjKNfWTKUJRUAV/VPlISSxVHq5PrqtUU6g0RKJ/Nec6VMPBKEgbYat76HV43H/KgpowTQpqu/kv8u2JfelteDUqv+txVEJgASXWoDP8okrvPZWaf3ZB5yXMxQAonLSotN2lj0jQmwiYrWar/w1EzTjXe/cstloqgzm0+jGEQKM6fjjSMDHUPBZ8OxOwmA/jGoqamnF1NHYBv4bgQi6x75CxVJwIpSOTUSkodVubvnkNbNv8DPI+of4UlEPo8pSRcWiLZlM2FBfZ5Jg8nsYfIsq7or2IXz/iJp2upgZcUbdjmuB+ap3KIRB/a7hjttiI90rHT3ZIoAAHL7OfyHMDvZWg0fxfvnh+0QoJfoJGivpuGacyFn8eMtCvTocxK6qJjbxp9O35URKFvr+ODzHgw5/9BcQy0qWwB/Yo26r/ZPpozaH2hqqBRgpJ5YpdgG+dNKlkAz0w1svv3Q83U8XuzJarRHCkCp71oNaa4Wg88IniLRvlhBvrhpLGm1OfSQjD62hfEFA0JYogWphKxN5u+dzhhN+p9gZ1zg1yvD4Vdi6OcgGEDKwVtF7IdwrrFjoA0tqNYV+nRGFx6Qe5o9U72mYY4TMaVXycJG5VVlNLZpVkIzoPQel0RumWb11Y9SkTTp/ZoIBU2l44Vhb256aLuA1qdnDZUw9KRlR4QUm//udXNRes4r0Qd9zt9F19BmpzH/FOkZc85B1r1MGrX/pF0VrttylWewvZP3962zqcqbhlU3EnnDyYptwcKZ/VYQTA8mwDjQdSS0S4urH4K6y20xtS6ARuiU+A888Dtx1Nhn1GHopV7vidhgr4QLBE1feIFvIhjyzSIgpz0tbdpvpPZU+QpemAuEVrrmAdc2TfcPVFnt1Q1LylbZx2vc3p24NYsooIaL74sQi7IjyUI9Lb5+pXmVr9pSySqgKCYrH4XwMfln/4p7BYEw8dxg70ZBsqgXw4m3lEGsYX8Wzkz26HriCErk4Nty+cov6s9TYbHYyvJBl582Oh6uqLgLi+XavbKC9nJiRsQ4UfbDtUkt","X-Forefront-Antispam-Report":"\n\tCIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM;PTR:;CAT:NONE;SFS:(13230040)(10070799003)(376014)(7416014)(1800799024)(366016)(13003099007);DIR:OUT;SFP:1101;","X-MS-Exchange-AntiSpam-MessageData-ChunkCount":"1","X-MS-Exchange-AntiSpam-MessageData-0":"\n Uvv6Sd75i3CpLRwXLVlv8ia3kvd5ck3RBQ8fdqa+FHBgnxW7WJZsv/SJ7AdlTblXegMghxenSI+OK9Ei+IbdcW7XGcZzY0KjqYWpQ33D+IEpBMSbXhYtQmldgDnf8TssgboAAI2BlllCT+35ClhWetuP+j715zS1+FayEkuXGQL/s4KRN0iSPK4vDhVtK2x8uyH55FZm7y1LFiqYJJ2lzQhfk3Oloj0kXVfEE2dSuqZbZi1QBlBjXGHahDb5R1pw5btKRtU5voZ1Yiq6oaKClKB+17r0i8Q9mTYo1UCMAPIE9yshCFOroTxrdPU4CdVlkjH268j2UWgyNXKkvo7V9BCE2IDhAD7snxR3XYdrweIG3O1LEIm4FykwSPcNagzy3S8h9JsOxwwza4AzSF01OTiQBAZJLmiYnBcdRpwCc76u4Wop4NC2LfjZ/ZOIxUK8GL7a+lhirJumSFMN/lsQImC2c6sexXUia9lmLXB0QbibBiP6G+a+m/dvntpwNT/cD1BpVt7+NMLw/1gWKzjFF6eN+4pFG339LG27Pmr9WNUR0mPuM1VD88Rds1ZGOZeEeBllXboItfugAIqPKYOT1T/gjeTjIvo4LeEcvCf9f+bwlHGsf2E6otRq6zYQSI3mX9l1sMgqAokoaLoBxuAHcAuats/a8OYdPrMHZdwTPGYAcp3mQ/WlimnlZVSKRUUnTtIrbj6NBDYbRMVwMw8eXFeXOYxafdRhSZE7AEol96tIF+Svb0zWK4ZVDzJUlUdXqUoLojrmx2AyHbQK+fy/yhcJl/d1mXhmD93GU+cNVUjfXX0+jnVwdmOSFUlNKVQtdQTwgZeAodIO0OOZ0pJe5bS+l2KaVluu41uF6VIqsIq6/TO+OilvXB85juIosMdB66c6XWXGlgftse3tcU8EowZH9pK498Qctsof7OuGQ+P4bycqjh0xTsOjU/viHn39MklJY3R7TcT1jzCB/Yl6WfhcoPPDj2/VFNdBdgg1ZeZOeu+nBA43z5KcJnG/VQENTJmF4efZe5XPTtm228tnltQY7rJAqSDsXdFclcsMOCt8jzdDT6zaUFDNrfMKqX7huH3FkFzKvf+puLSMBFMsOhKgnC/exQtWu79ZQdrd5OleG/Ie40s2BT8UDcb+0izHz7Hnj1dsa4XnzJ54ECUgT1aTZErASck/jDyMuVYyryxOf+9vnDH6ONhqQXajG7YQsFNyaDZdNp7ZAUdE5IFd0j5qSTCHaJsC6zWDfV9ZnXjPv41R+m3HBLLZSJTWcYvwgo0thEK3oQrIdViEO0t09MkQpUMIWeF0OzGJAeyC8igBdqg1NnGMiNipdHDr3EKjYp02X0HpXZQCZiLVISNsolv505SGuTbmAmFDVg3es0SOV1LJzyymWAa5I0WL5Y7qMHbT6+2af4zEqnFVfdHtwG40Pu2hzJGA/X9Zc2AU651AnOrqfXckxeTWuyNEPb1oL1Ds9tXnHn8teEv127EKLPw2SspYQ7KwnJmZsWnSbKEhZKiNzDXNJh0i2kNPZWsFtx1Pfe/+EqkaZ3teUIfb0zXO5LytMkt1kh5YMQy82Z/c4AKhQFWy/H8FVExaUrBYcoYPw5bS6speSCPzCnhV8FPKSy8by9ULQA0bIo4og8o=","X-OriginatorOrg":"valinux.co.jp","X-MS-Exchange-CrossTenant-Network-Message-Id":"\n 8dc9d895-7bd6-473d-5a0c-08de3d7f3737","X-MS-Exchange-CrossTenant-AuthSource":"TYWP286MB2697.JPNP286.PROD.OUTLOOK.COM","X-MS-Exchange-CrossTenant-AuthAs":"Internal","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"17 Dec 2025 15:16:13.1366\n (UTC)","X-MS-Exchange-CrossTenant-FromEntityHeader":"Hosted","X-MS-Exchange-CrossTenant-Id":"7a57bee8-f73d-4c5f-a4f7-d72c91c8c111","X-MS-Exchange-CrossTenant-MailboxType":"HOSTED","X-MS-Exchange-CrossTenant-UserPrincipalName":"\n PpPnAT22euMZdYkFSqP3ESzHvO6uXTSF4GlqaF1YrAURsYOZLCK3yJzQr3ptPjkUjcGKP2PtcjhtaQ/5tU8Q4w==","X-MS-Exchange-Transport-CrossTenantHeadersStamped":"OS9P286MB4633"},"content":"Hi,\n\nThis is RFC v3 of the NTB/PCI series that introduces NTB transport backed\nby DesignWare PCIe integrated eDMA.\n\n  RFC v2: https://lore.kernel.org/all/20251129160405.2568284-1-den@valinux.co.jp/\n  RFC v1: https://lore.kernel.org/all/20251023071916.901355-1-den@valinux.co.jp/\n\nThe goal is to improve performance between a host and an endpoint over\nntb_transport (typically with ntb_netdev on top). On R-Car S4, preliminary\niperf3 results show 10~20x throughput improvement. Latency improvements are\nalso observed.\n\nIn this approach, payload is transferred by DMA directly between host and\nendpoint address spaces, and the NTB Memory Window is primarily used as a\ncontrol/metadata window (and to expose the eDMA register/LL regions).\nCompared to the memcpy-based transport, this avoids extra copies and\nenables deeper rings and scales out to multiple queue pairs.\n\nCompared to RFC v2, data plane works in a symmetric manner in both\ndirections (host-to-endpoint and endpoint-to-host). The host side drives\nremote read channels for its TX transfer while the endpoint drives local\nwrite channels.\n\nAgain, I recognize that this is quite a large series. Sorry for the volume,\nbut for the RFC stage I believe presenting the full picture in a single set\nhelps with reviewing the overall architecture (Of course detail feedback\nwould be appreciated as well). Once the direction is agreed, I will respin\nit split by subsystem and topic.\n\nMany thanks for all the reviews and feedback from multiple perspectives.\n\n\nData flow overview\n==================\n\n    Figure 1. RC->EP traffic via ntb_netdev+ntb_transport\n                     backed by Remote eDMA\n\n          EP                                   RC\n       phys addr                            phys addr\n         space                                space\n          +-+                                  +-+\n          | |                                  | |\n          | |                ||                | |\n          +-+-----.          ||                | |\n EDMA REG | |      \\    [A]  ||                | |\n          +-+----.  '---+-+  ||                | |\n          | |     \\     | |<---------[0-a]----------\n          +-+-----------| |<----------[2]----------.\n  EDMA LL | |           | |  ||                | | :\n          | |           | |  ||                | | :\n          +-+-----------+-+  ||  [B]           | | :\n          | |                ||  ++            | | :\n       ---------[0-b]----------->||----------------'\n          | |            ++  ||  ||            | |\n          | |            ||  ||  ++            | |\n          | |            ||<----------[4]-----------\n          | |            ++  ||                | |\n          | |           [C]  ||                | |\n       .--|#|<------------------------[3]------|#|<-.\n       :  |#|                ||                |#|  :\n      [5] | |                ||                | | [1]\n       :  | |                ||                | |  :\n       '->|#|                                  |#|--'\n          |#|                                  |#|\n          | |                                  | |\n\n\n    Figure 2. EP->RC traffic via ntb_netdev+ntb_transport\n                     backed by EP-Local eDMA\n\n          EP                                   RC\n       phys addr                            phys addr\n         space                                space\n          +-+                                  +-+\n          | |                                  | |\n          | |                ||                | |\n          +-+                ||                | |\n EDMA REG | |                ||                | |\n          +-+                ||                | |\n^         | |                ||                | |\n:         +-+                ||                | |\n: EDMA LL | |                ||                | |\n:         | |                ||                | |\n:         +-+                ||  [C]           | |\n:         | |                ||  ++            | |\n:      -----------[4]----------->||            | |\n:         | |            ++  ||  ||            | |\n:         | |            ||  ||  ++            | |\n'----------------[2]-----||<--------[0-b]-----------\n          | |            ++  ||                | |\n          | |           [B]  ||                | |\n       .->|#|--------[3]---------------------->|#|--.\n       :  |#|                ||                |#|  :\n      [1] | |                ||                | | [5]\n       :  | |                ||                | |  :\n       '--|#|                                  |#|<-'\n          |#|                                  |#|\n          | |                                  | |\n\n\n      0-a. configure Remote eDMA\n      0-b. DMA-map and produce DAR\n      1.   memcpy while building skb in ntb_netdev case\n      2.   consume DAR, DMA-map SAR and kick DMA read transfer\n      3.   DMA transfer\n      4.   consume (commit)\n      5.   memcpy to application side\n\n      [A]: MemoryWindow that aggregates eDMA regs and LL.\n           IB iATU translations (Address Match Mode).\n      [B]: Control plane ring buffer (for \"produce\")\n      [C]: Control plane ring buffer (for \"consume\")\n\n  Note:\n    - Figure 1 is unchanged from RFC v2.\n    - Figure 2 differs from the one depicted in RFC v2 cover letter.\n\n\nChanges since RFC v2\n====================\n\nRFCv2->RFCv3 changes:\n  - Architecture\n    - Have EP side use its local write channels, while leaving RC side to\n      use remote read channels.\n    - Abstraction/HW-specific stuff encapsulation improved.\n  - Added control/config region versioning for the vNTB/EPF control region\n    so that mismatched RC/EP kernels fail early instead of silently using an\n    incompatible layout.\n  - Reworked BAR subrange / multi-region mapping support:\n    - Dropped the v2 approach that added new inbound mapping ops in the EPC\n      core.\n    - Introduced `struct pci_epf_bar.submap` and extended DesignWare EP to\n      support BAR subrange inbound mapping via Address Match Mode IB iATU.\n    - pci-epf-vntb now provides a subrange mapping hint to the EPC driver\n      when offsets are used.\n  - Changed .get_pci_epc() to .get_private_data()\n  - Dropped two commits from RFC v2 that should be submitted separately:\n    (1) ntb_transport debugfs seq_file conversion\n    (2) DWC EP outbound iATU MSI mapping/cache fix (will be re-posted separately)\n  - Added documentation updates.\n  - Addressed assorted review nits from the RFC v2 thread (naming/structure).\n\nRFCv1->RFCv2 changes:\n  - Architecture\n    - Drop the generic interrupt backend + DW eDMA test-interrupt backend\n      approach and instead adopt the remote eDMA-backed ntb_transport mode\n      proposed by Frank Li. The BAR-sharing / mwN_offset / inbound\n      mapping (Address Match Mode) infrastructure from RFC v1 is largely\n      kept, with only minor refinements and code motion where necessary\n      to fit the new transport-mode design.\n  - For Patch 01\n    - Rework the array_index_nospec() conversion to address review\n      comments on \"[RFC PATCH 01/25]\".\n\nRFCv2: https://lore.kernel.org/all/20251129160405.2568284-1-den@valinux.co.jp/\nRFCv1: https://lore.kernel.org/all/20251023071916.901355-1-den@valinux.co.jp/\n\n\nPatch layout\n============\n\n  Patch 01-25 : preparation for Patch 26\n                - 01-07: support multiple MWs in a BAR\n\t\t- 08-25: other misc preparations\n  Patch 26    : main and most important patch, adds eDMA-backed transport\n  Patch 27-28 : multi-queue use, thanks to the remote eDMA, performance\n                scales\n  Patch 29-33 : handle several SoC-specific issues so that remote eDMA\n                mode ntb_transport works on R-Car S4\n  Patch 34-35 : kernel doc updates\n\n\nTested on\n=========\n\n* 2x Renesas R-Car S4 Spider (RC<->EP connected with OcuLink cable)\n* Kernel base: next-20251216 + [1] + [2] + [3]\n\n  [1]: https://lore.kernel.org/all/20251210071358.2267494-2-cassel@kernel.org/\n       (this is a spin-out patch from\n        https://lore.kernel.org/linux-pci/20251129160405.2568284-20-den@valinux.co.jp/)\n  [2]: https://lore.kernel.org/all/20251208-dma_prep_config-v1-0-53490c5e1e2a@nxp.com/\n       (while it appears to still be under active discussion)\n  [3]: https://lore.kernel.org/all/20251217081955.3137163-1-den@valinux.co.jp/\n       (this is a spin-out patch from\n        https://lore.kernel.org/all/20251129160405.2568284-14-den@valinux.co.jp/)\n\n\nPerformance measurement\n=======================\n\nNo serious measurements yet, because:\n  * For \"before the change\", even use_dma/use_msi does not work on the\n    upstream kernel unless we apply some patches for R-Car S4. With some\n    unmerged patch series I had posted earlier (but superseded by this RFC\n    attempt), it was observed that we can achieve about 7 Gbps for the\n    RC->EP direction. Pure upstream kernel can achieve around 500 Mbps\n    though.\n  * For \"after the change\", measurements are not mature because this\n    RFC v3 patch series is not yet performance-optimized at this stage.\n\nHere are the rough measurements showing the achievable performance on\nthe R-Car S4:\n\n- Before this change:\n\n  * ping\n    64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=12.3 ms\n    64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=6.58 ms\n    64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=1.26 ms\n    64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=7.43 ms\n    64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.39 ms\n    64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=7.38 ms\n    64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=1.42 ms\n    64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=7.41 ms\n\n  * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 2`)\n    [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams\n    [  5]   0.00-10.01  sec   344 MBytes   288 Mbits/sec  3.483 ms  51/5555 (0.92%)  receiver\n    [  6]   0.00-10.01  sec   342 MBytes   287 Mbits/sec  3.814 ms  38/5517 (0.69%)  receiver\n    [SUM]   0.00-10.01  sec   686 MBytes   575 Mbits/sec  3.648 ms  89/11072 (0.8%)  receiver\n\n  * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 2`)\n    [  5]   0.00-10.03  sec   334 MBytes   279 Mbits/sec  3.164 ms  390/5731 (6.8%)  receiver\n    [  6]   0.00-10.03  sec   334 MBytes   279 Mbits/sec  2.416 ms  396/5741 (6.9%)  receiver\n    [SUM]   0.00-10.03  sec   667 MBytes   558 Mbits/sec  2.790 ms  786/11472 (6.9%)  receiver\n\n    Note: with `-P 2`, the best total bitrate (receiver side) was achieved.\n\n- After this change (use_remote_edma=1):\n\n  * ping\n    64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=1.42 ms\n    64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=1.38 ms\n    64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=1.21 ms\n    64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=1.02 ms\n    64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.06 ms\n    64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=0.995 ms\n    64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=0.964 ms\n    64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=1.49 ms\n\n  * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 4`)\n    [  5]   0.00-10.02  sec  3.00 GBytes  2.58 Gbits/sec  0.437 ms  33053/82329 (40%)  receiver\n    [  6]   0.00-10.02  sec  3.00 GBytes  2.58 Gbits/sec  0.174 ms  46379/95655 (48%)  receiver\n    [  9]   0.00-10.02  sec  2.88 GBytes  2.47 Gbits/sec  0.106 ms  47672/94924 (50%)  receiver\n    [ 11]   0.00-10.02  sec  2.87 GBytes  2.46 Gbits/sec  0.364 ms  23694/70817 (33%)  receiver\n    [SUM]   0.00-10.02  sec  11.8 GBytes  10.1 Gbits/sec  0.270 ms  150798/343725 (44%)  receiver\n\n  * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 4`)\n    [  5]   0.00-10.01  sec  3.28 GBytes  2.82 Gbits/sec  0.380 ms  38578/92355 (42%)  receiver\n    [  6]   0.00-10.01  sec  3.24 GBytes  2.78 Gbits/sec  0.430 ms  14268/67340 (21%)  receiver\n    [  9]   0.00-10.01  sec  2.92 GBytes  2.51 Gbits/sec  0.074 ms  0/47890 (0%)  receiver\n    [ 11]   0.00-10.01  sec  4.76 GBytes  4.09 Gbits/sec  0.037 ms  0/78073 (0%)  receiver\n    [SUM]   0.00-10.01  sec  14.2 GBytes  12.2 Gbits/sec  0.230 ms  52846/285658 (18%)  receiver\n\n  * configfs settings:\n      # modprobe pci_epf_vntb\n      # cd /sys/kernel/config/pci_ep/\n      # mkdir functions/pci_epf_vntb/func1\n      # echo 0x1912 >   functions/pci_epf_vntb/func1/vendorid\n      # echo 0x0030 >   functions/pci_epf_vntb/func1/deviceid\n      # echo 32 >       functions/pci_epf_vntb/func1/msi_interrupts\n      # echo 16 >       functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_count\n      # echo 128 >      functions/pci_epf_vntb/func1/pci_epf_vntb.0/spad_count\n      # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws\n      # echo 0xe0000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1\n      # echo 0x20000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2\n      # echo 0xe0000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_offset\n      # echo 0x1912 >   functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid\n      # echo 0x0030 >   functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid\n      # echo 0x10 >     functions/pci_epf_vntb/func1/pci_epf_vntb.0/vbus_number\n      # echo 0 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/ctrl_bar\n      # echo 4 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_bar\n      # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1_bar\n      # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_bar\n      # ln -s controllers/e65d0000.pcie-ep functions/pci_epf_vntb/func1/primary/\n      # echo 1 > controllers/e65d0000.pcie-ep/start\n\n\n\nThank you for reviewing,\n\n\nKoichiro Den (35):\n  PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]\n    access\n  NTB: epf: Add mwN_offset support and config region versioning\n  PCI: dwc: ep: Support BAR subrange inbound mapping via address match\n    iATU\n  NTB: Add offset parameter to MW translation APIs\n  PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when\n    present\n  NTB: ntb_transport: Support partial memory windows with offsets\n  PCI: endpoint: pci-epf-vntb: Hint subrange mapping preference to EPC\n    driver\n  NTB: core: Add .get_private_data() to ntb_dev_ops\n  NTB: epf: vntb: Implement .get_private_data() callback\n  dmaengine: dw-edma: Fix MSI data values for multi-vector IMWr\n    interrupts\n  NTB: ntb_transport: Move TX memory window setup into setup_qp_mw()\n  NTB: ntb_transport: Dynamically determine qp count\n  NTB: ntb_transport: Introduce get_dma_dev() helper\n  NTB: epf: Reserve a subset of MSI vectors for non-NTB users\n  NTB: ntb_transport: Move internal types to ntb_transport_internal.h\n  NTB: ntb_transport: Introduce ntb_transport_backend_ops\n  dmaengine: dw-edma: Add helper func to retrieve register base and size\n  dmaengine: dw-edma: Add per-channel interrupt routing mode\n  dmaengine: dw-edma: Poll completion when local IRQ handling is\n    disabled\n  dmaengine: dw-edma: Add notify-only channels support\n  dmaengine: dw-edma: Add a helper to retrieve LL (Linked List) region\n  dmaengine: dw-edma: Serialize RMW on shared interrupt registers\n  NTB: ntb_transport: Split core into ntb_transport_core.c\n  NTB: ntb_transport: Add additional hooks for DW eDMA backend\n  NTB: hw: Introduce DesignWare eDMA helper\n  NTB: ntb_transport: Introduce DW eDMA backed transport mode\n  NTB: epf: Provide db_vector_count/db_vector_mask callbacks\n  ntb_netdev: Multi-queue support\n  NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car)\n  iommu: ipmmu-vmsa: Add PCIe ch0 to devices_allowlist\n  iommu: ipmmu-vmsa: Add support for reserved regions\n  arm64: dts: renesas: Add Spider RC/EP DTs for NTB with remote DW PCIe\n    eDMA\n  NTB: epf: Add an additional memory window (MW2) barno mapping on\n    Renesas R-Car\n  Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset\n    usage\n  Documentation: driver-api: ntb: Document remote eDMA transport backend\n\n Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-\n Documentation/driver-api/ntb.rst              |  58 +\n arch/arm64/boot/dts/renesas/Makefile          |   2 +\n .../boot/dts/renesas/r8a779f0-spider-ep.dts   |  37 +\n .../boot/dts/renesas/r8a779f0-spider-rc.dts   |  52 +\n drivers/dma/dw-edma/dw-edma-core.c            | 233 ++++-\n drivers/dma/dw-edma/dw-edma-core.h            |  13 +-\n drivers/dma/dw-edma/dw-edma-v0-core.c         |  39 +-\n drivers/iommu/ipmmu-vmsa.c                    |   7 +-\n drivers/net/ntb_netdev.c                      | 341 ++++--\n drivers/ntb/Kconfig                           |  12 +\n drivers/ntb/Makefile                          |   4 +\n drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-\n drivers/ntb/hw/edma/ntb_hw_edma.c             | 754 +++++++++++++\n drivers/ntb/hw/edma/ntb_hw_edma.h             |  76 ++\n drivers/ntb/hw/epf/ntb_hw_epf.c               | 187 +++-\n drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-\n drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-\n drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-\n drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-\n drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-\n drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-\n drivers/ntb/msi.c                             |   6 +-\n .../{ntb_transport.c => ntb_transport_core.c} | 482 ++++-----\n drivers/ntb/ntb_transport_edma.c              | 987 ++++++++++++++++++\n drivers/ntb/ntb_transport_internal.h          | 220 ++++\n drivers/ntb/test/ntb_perf.c                   |   4 +-\n drivers/ntb/test/ntb_tool.c                   |   6 +-\n .../pci/controller/dwc/pcie-designware-ep.c   | 198 +++-\n drivers/pci/controller/dwc/pcie-designware.c  |  25 +\n drivers/pci/controller/dwc/pcie-designware.h  |   2 +\n drivers/pci/endpoint/functions/pci-epf-vntb.c | 246 ++++-\n drivers/pci/endpoint/pci-epc-core.c           |   2 +-\n include/linux/dma/edma.h                      | 106 ++\n include/linux/ntb.h                           |  38 +-\n include/linux/ntb_transport.h                 |   5 +\n include/linux/pci-epf.h                       |  27 +\n 37 files changed, 3716 insertions(+), 501 deletions(-)\n create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-ep.dts\n create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-rc.dts\n create mode 100644 drivers/ntb/hw/edma/ntb_hw_edma.c\n create mode 100644 drivers/ntb/hw/edma/ntb_hw_edma.h\n rename drivers/ntb/{ntb_transport.c => ntb_transport_core.c} (91%)\n create mode 100644 drivers/ntb/ntb_transport_edma.c\n create mode 100644 drivers/ntb/ntb_transport_internal.h"}