diff mbox series

[v5,4/9] PCI/P2PDMA: Export pci_p2pdma_map_type() function

Message ID 0fa715706e1adf5e26199dc3eaa3b1ff3b14db67.1760368250.git.leon@kernel.org
State New
Headers show
Series vfio/pci: Allow MMIO regions to be exported through dma-buf | expand

Commit Message

Leon Romanovsky Oct. 13, 2025, 3:26 p.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

Export the pci_p2pdma_map_type() function to allow external modules
and subsystems to determine the appropriate mapping type for P2PDMA
transfers between a provider and target device.

The function determines whether peer-to-peer DMA transfers can be
done directly through PCI switches (PCI_P2PDMA_MAP_BUS_ADDR) or
must go through the host bridge (PCI_P2PDMA_MAP_THRU_HOST_BRIDGE),
or if the transfer is not supported at all.

This export enables subsystems like VFIO to properly handle P2PDMA
operations by querying the mapping type before attempting transfers,
ensuring correct DMA address programming and error handling.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/pci/p2pdma.c       | 15 ++++++-
 include/linux/pci-p2pdma.h | 85 +++++++++++++++++++++-----------------
 2 files changed, 59 insertions(+), 41 deletions(-)

Comments

Christoph Hellwig Oct. 17, 2025, 6:31 a.m. UTC | #1
Nacked-by: Christoph Hellwig <hch@lst.de>

As explained to you multiple times, pci_p2pdma_map_type is a low-level
helper that absolutely MUST be wrapper in proper accessors.  It is
dangerous when used incorrectly and requires too much boiler plate.
There is no way this can be directly exported, and you really need to
stop resending this.
Jason Gunthorpe Oct. 17, 2025, 12:14 p.m. UTC | #2
On Thu, Oct 16, 2025 at 11:31:53PM -0700, Christoph Hellwig wrote:
> 
> Nacked-by: Christoph Hellwig <hch@lst.de>
> 
> As explained to you multiple times, pci_p2pdma_map_type is a low-level
> helper that absolutely MUST be wrapper in proper accessors. 

You never responded to the discussion:

https://lore.kernel.org/all/20250727190252.GF7551@nvidia.com/

What is the plan here? Is the new DMA API unusable by modules? That
seems a little challenging.

> It is dangerous when used incorrectly and requires too much boiler
> plate.  There is no way this can be directly exported, and you
> really need to stop resending this.

Yeah, I don't like the boilerplate at all either.

It looks like there is a simple enough solution here. I wanted to
tackle this after, but maybe it is small enough to do it now.

dmabuf should gain some helpers like BIO has to manage its map/unmap
flows, so lets put a start of some helpers in
drivers/dma/dma-mapping.c (or whatever). dmabuf is a built in so it
can call the function without exporting it just like block and hmm are
doing.

The same code as in this vfio patch will get moved into the helper and
vfio will call it under its dmabuf map/unmap ops.

The goal would be to make it much easier for other dmabuf exporters to
switch from dma_map_resource() to this new dmabuf api which is safe
for P2P.

Jason
Christoph Hellwig Oct. 20, 2025, 12:29 p.m. UTC | #3
On Fri, Oct 17, 2025 at 09:14:47AM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 16, 2025 at 11:31:53PM -0700, Christoph Hellwig wrote:
> > 
> > Nacked-by: Christoph Hellwig <hch@lst.de>
> > 
> > As explained to you multiple times, pci_p2pdma_map_type is a low-level
> > helper that absolutely MUST be wrapper in proper accessors. 
> 
> You never responded to the discussion:
> 
> https://lore.kernel.org/all/20250727190252.GF7551@nvidia.com/
> 
> What is the plan here? Is the new DMA API unusable by modules? That
> seems a little challenging.

Yes.  These are only intended to be wrapped by subsystems.

> It looks like there is a simple enough solution here. I wanted to
> tackle this after, but maybe it is small enough to do it now.
> 
> dmabuf should gain some helpers like BIO has to manage its map/unmap
> flows, so lets put a start of some helpers in
> drivers/dma/dma-mapping.c (or whatever). dmabuf is a built in so it
> can call the function without exporting it just like block and hmm are
> doing.

Yes, that sounds much better.  And dmabuf in general could use some
deduplicating of their dma mapping patterns (and eventual fixing).
Jason Gunthorpe Oct. 20, 2025, 1:14 p.m. UTC | #4
On Mon, Oct 20, 2025 at 05:29:06AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 09:14:47AM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 16, 2025 at 11:31:53PM -0700, Christoph Hellwig wrote:
> > > 
> > > Nacked-by: Christoph Hellwig <hch@lst.de>
> > > 
> > > As explained to you multiple times, pci_p2pdma_map_type is a low-level
> > > helper that absolutely MUST be wrapper in proper accessors. 
> > 
> > You never responded to the discussion:
> > 
> > https://lore.kernel.org/all/20250727190252.GF7551@nvidia.com/
> > 
> > What is the plan here? Is the new DMA API unusable by modules? That
> > seems a little challenging.
> 
> Yes.  These are only intended to be wrapped by subsystems.

Sure, but many subsystems are fully modular too.. RDMA for example.

Well, lets see what comes in the future..

> Yes, that sounds much better.  And dmabuf in general could use some
> deduplicating of their dma mapping patterns (and eventual fixing).

Yes, it certainly could, but I wanted to tackle this later..

I think adding some 'dmabuf create a map for this list of phys on this
provider' is a good simplifed primitive. Simple drivers like VFIO that
only want to expose MMIO can just call it directly.

Once this is settled I want to have RDMA wrap some of its MMIO VMAs in
DMABUF as well, so I can see at least two users of the helper.

Jason
diff mbox series

Patch

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index a2ec7e93fd71..bdbbc49f46ee 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -1048,8 +1048,18 @@  void pci_p2pmem_publish(struct pci_dev *pdev, bool publish)
 }
 EXPORT_SYMBOL_GPL(pci_p2pmem_publish);
 
-static enum pci_p2pdma_map_type
-pci_p2pdma_map_type(struct p2pdma_provider *provider, struct device *dev)
+/**
+ * pci_p2pdma_map_type - Determine the mapping type for P2PDMA transfers
+ * @provider: P2PDMA provider structure
+ * @dev: Target device for the transfer
+ *
+ * Determines how peer-to-peer DMA transfers should be mapped between
+ * the provider and the target device. The mapping type indicates whether
+ * the transfer can be done directly through PCI switches or must go
+ * through the host bridge.
+ */
+enum pci_p2pdma_map_type pci_p2pdma_map_type(struct p2pdma_provider *provider,
+					     struct device *dev)
 {
 	enum pci_p2pdma_map_type type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	struct pci_dev *pdev = to_pci_dev(provider->owner);
@@ -1078,6 +1088,7 @@  pci_p2pdma_map_type(struct p2pdma_provider *provider, struct device *dev)
 
 	return type;
 }
+EXPORT_SYMBOL_GPL(pci_p2pdma_map_type);
 
 void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
 		struct device *dev, struct page *page)
diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
index e307c9380d46..1e499a8e0099 100644
--- a/include/linux/pci-p2pdma.h
+++ b/include/linux/pci-p2pdma.h
@@ -26,6 +26,45 @@  struct p2pdma_provider {
 	u64 bus_offset;
 };
 
+enum pci_p2pdma_map_type {
+	/*
+	 * PCI_P2PDMA_MAP_UNKNOWN: Used internally as an initial state before
+	 * the mapping type has been calculated. Exported routines for the API
+	 * will never return this value.
+	 */
+	PCI_P2PDMA_MAP_UNKNOWN = 0,
+
+	/*
+	 * Not a PCI P2PDMA transfer.
+	 */
+	PCI_P2PDMA_MAP_NONE,
+
+	/*
+	 * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will
+	 * traverse the host bridge and the host bridge is not in the
+	 * allowlist. DMA Mapping routines should return an error when
+	 * this is returned.
+	 */
+	PCI_P2PDMA_MAP_NOT_SUPPORTED,
+
+	/*
+	 * PCI_P2PDMA_MAP_BUS_ADDR: Indicates that two devices can talk to
+	 * each other directly through a PCI switch and the transaction will
+	 * not traverse the host bridge. Such a mapping should program
+	 * the DMA engine with PCI bus addresses.
+	 */
+	PCI_P2PDMA_MAP_BUS_ADDR,
+
+	/*
+	 * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: Indicates two devices can talk
+	 * to each other, but the transaction traverses a host bridge on the
+	 * allowlist. In this case, a normal mapping either with CPU physical
+	 * addresses (in the case of dma-direct) or IOVA addresses (in the
+	 * case of IOMMUs) should be used to program the DMA engine.
+	 */
+	PCI_P2PDMA_MAP_THRU_HOST_BRIDGE,
+};
+
 #ifdef CONFIG_PCI_P2PDMA
 int pcim_p2pdma_init(struct pci_dev *pdev);
 struct p2pdma_provider *pcim_p2pdma_provider(struct pci_dev *pdev, int bar);
@@ -45,6 +84,8 @@  int pci_p2pdma_enable_store(const char *page, struct pci_dev **p2p_dev,
 			    bool *use_p2pdma);
 ssize_t pci_p2pdma_enable_show(char *page, struct pci_dev *p2p_dev,
 			       bool use_p2pdma);
+enum pci_p2pdma_map_type pci_p2pdma_map_type(struct p2pdma_provider *provider,
+					     struct device *dev);
 #else /* CONFIG_PCI_P2PDMA */
 static inline int pcim_p2pdma_init(struct pci_dev *pdev)
 {
@@ -106,6 +147,11 @@  static inline ssize_t pci_p2pdma_enable_show(char *page,
 {
 	return sprintf(page, "none\n");
 }
+static inline enum pci_p2pdma_map_type
+pci_p2pdma_map_type(struct p2pdma_provider *provider, struct device *dev)
+{
+	return PCI_P2PDMA_MAP_NOT_SUPPORTED;
+}
 #endif /* CONFIG_PCI_P2PDMA */
 
 
@@ -120,45 +166,6 @@  static inline struct pci_dev *pci_p2pmem_find(struct device *client)
 	return pci_p2pmem_find_many(&client, 1);
 }
 
-enum pci_p2pdma_map_type {
-	/*
-	 * PCI_P2PDMA_MAP_UNKNOWN: Used internally as an initial state before
-	 * the mapping type has been calculated. Exported routines for the API
-	 * will never return this value.
-	 */
-	PCI_P2PDMA_MAP_UNKNOWN = 0,
-
-	/*
-	 * Not a PCI P2PDMA transfer.
-	 */
-	PCI_P2PDMA_MAP_NONE,
-
-	/*
-	 * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will
-	 * traverse the host bridge and the host bridge is not in the
-	 * allowlist. DMA Mapping routines should return an error when
-	 * this is returned.
-	 */
-	PCI_P2PDMA_MAP_NOT_SUPPORTED,
-
-	/*
-	 * PCI_P2PDMA_MAP_BUS_ADDR: Indicates that two devices can talk to
-	 * each other directly through a PCI switch and the transaction will
-	 * not traverse the host bridge. Such a mapping should program
-	 * the DMA engine with PCI bus addresses.
-	 */
-	PCI_P2PDMA_MAP_BUS_ADDR,
-
-	/*
-	 * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: Indicates two devices can talk
-	 * to each other, but the transaction traverses a host bridge on the
-	 * allowlist. In this case, a normal mapping either with CPU physical
-	 * addresses (in the case of dma-direct) or IOVA addresses (in the
-	 * case of IOMMUs) should be used to program the DMA engine.
-	 */
-	PCI_P2PDMA_MAP_THRU_HOST_BRIDGE,
-};
-
 struct pci_p2pdma_map_state {
 	struct p2pdma_provider *mem;
 	enum pci_p2pdma_map_type map;