[v5,1/9] PCI/P2PDMA: Separate the mmap() support from the core logic

Message ID 1044f7aa09836d63de964d4eb6e646b3071c1fdb.1760368250.git.leon@kernel.org
State New
Series vfio/pci: Allow MMIO regions to be exported through dma-buf

Commit Message

Leon Romanovsky Oct. 13, 2025, 3:26 p.m. UTC
From: Leon Romanovsky <leonro@nvidia.com>

Currently the P2PDMA code requires a pgmap and a struct page to
function. This served three important purposes:

 - DMA API compatibility, where scatterlist required a struct page as
   input

 - Life cycle management, the percpu_ref is used to prevent UAF during
   device hot unplug

 - A way to get the P2P provider data through the pci_p2pdma_pagemap

The DMA API now has a new flow, and has gained phys_addr_t support, so
it no longer needs struct pages to perform P2P mapping.

Lifecycle management can be delegated to the user, DMABUF for instance
has a suitable invalidation protocol that does not require struct page.

Finding the P2P provider data can also be managed by the caller
without needing to look it up from the phys_addr.

Split the P2PDMA code into two layers. The optional upper layer,
effectively, provides a way to mmap() P2P memory into a VMA by
providing struct page, pgmap, a genalloc and sysfs.

The lower layer provides the actual P2P infrastructure and is wrapped
up in a new struct p2pdma_provider. Rework the mmap layer to use new
p2pdma_provider based APIs.

Drivers that do not want to put P2P memory into VMAs can allocate a
struct p2pdma_provider after probe() starts and free it before
remove() completes. When DMA mapping the driver must convey the struct
p2pdma_provider to the DMA mapping code along with a phys_addr of the
MMIO BAR slice to map. The driver must ensure that no DMA mapping
outlives the lifetime of the struct p2pdma_provider.

The intended target of this new API layer is DMABUF. There is usually
only a single p2pdma_provider for a DMABUF exporter. Most drivers can
establish the p2pdma_provider during probe, access the single instance
during DMABUF attach and use that to drive the DMA mapping.

DMABUF provides an invalidation mechanism that can guarantee all DMA
is halted and the DMA mappings are undone prior to destroying the
struct p2pdma_provider. This ensures there is no UAF through DMABUFs
that are lingering past driver removal.

The new p2pdma_provider layer cannot be used to create P2P memory that
can be mapped into VMAs, used with pin_user_pages() or O_DIRECT, and
so on. These use cases must still use the mmap() layer. The
p2pdma_provider layer is principally for DMABUF-like use cases where
DMABUF natively manages the life cycle and access instead of
vmas/pin_user_pages()/struct page.

In addition, remove the bus_off field from pci_p2pdma_map_state since
it duplicates information already available in the pgmap structure.
The bus_offset is only used in one location (pci_p2pdma_bus_addr_map)
and is always identical to pgmap->bus_offset.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/pci/p2pdma.c       | 43 ++++++++++++++++++++------------------
 include/linux/pci-p2pdma.h | 19 ++++++++++++-----
 2 files changed, 37 insertions(+), 25 deletions(-)
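
As a rough illustration of what the diff does with bus_offset and the
cached map state, here is a minimal userspace sketch. It is not kernel
code: the typedefs, struct device, enum map_type, struct map_state and
lookup_map_type() are stand-ins invented for this example. Only the two
struct p2pdma_provider fields and the "skip the lookup when the provider
is unchanged" check are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types (assumptions, not kernel definitions). */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

struct device { const char *name; };

/* Same two fields as the struct added by the patch. */
struct p2pdma_provider {
	struct device *owner;
	uint64_t bus_offset;
};

/* Hypothetical, reduced version of the map type enum. */
enum map_type { MAP_UNKNOWN = 0, MAP_BUS_ADDR, MAP_THRU_HOST_BRIDGE };

/* Models pci_p2pdma_map_state after the patch: provider pointer + map type. */
struct map_state {
	struct p2pdma_provider *mem;
	enum map_type map;
};

static int lookups; /* counts how often the slow lookup runs */

/* Hypothetical stand-in for the provider/client map type lookup. */
static enum map_type lookup_map_type(struct p2pdma_provider *p)
{
	(void)p;
	lookups++;
	return MAP_BUS_ADDR;
}

/* Only redo the lookup when the provider changes, as in the patch. */
static void update_state(struct map_state *state, struct p2pdma_provider *p)
{
	if (state->mem == p)
		return;
	state->mem = p;
	state->map = lookup_map_type(p);
}

/* The bus address is derived from the provider, not from a cached copy. */
static dma_addr_t bus_addr_map(const struct map_state *state, phys_addr_t paddr)
{
	assert(state->map == MAP_BUS_ADDR);
	return paddr + state->mem->bus_offset;
}
```

Calling update_state() twice with the same provider runs the lookup
once, and bus_addr_map() then adds that provider's bus_offset to the
physical address.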

Comments

Christoph Hellwig Oct. 17, 2025, 6:30 a.m. UTC | #1
On Mon, Oct 13, 2025 at 06:26:03PM +0300, Leon Romanovsky wrote:
> The DMA API now has a new flow, and has gained phys_addr_t support, so
> it no longer needs struct pages to perform P2P mapping.

That's news to me.  All the pci_p2pdma_map_state machinery is still
based on pgmaps and thus pages.

> Lifecycle management can be delegated to the user, DMABUF for instance
> has a suitable invalidation protocol that does not require struct page.

How?
Jason Gunthorpe Oct. 17, 2025, 11:53 a.m. UTC | #2
On Thu, Oct 16, 2025 at 11:30:06PM -0700, Christoph Hellwig wrote:
> On Mon, Oct 13, 2025 at 06:26:03PM +0300, Leon Romanovsky wrote:
> > The DMA API now has a new flow, and has gained phys_addr_t support, so
> > it no longer needs struct pages to perform P2P mapping.
> 
> That's news to me.  All the pci_p2pdma_map_state machinery is still
> based on pgmaps and thus pages.

We had this discussion already three months ago:

https://lore.kernel.org/all/20250729131502.GJ36037@nvidia.com/

These couple patches make the core pci_p2pdma_map_state machinery work
on struct p2pdma_provider, and pgmap is just one way to get a
p2pdma_provider *

The struct page paths through pgmap go page->pgmap->mem to get
p2pdma_provider.

The non-struct page paths just have a p2pdma_provider * without a
pgmap. In this series VFIO uses

+	*provider = pcim_p2pdma_provider(pdev, bar);

To get the provider for a specific BAR.

> > Lifecycle management can be delegated to the user, DMABUF for instance
> > has a suitable invalidation protocol that does not require struct page.
> 
> How?

I think I've answered this three times now - for DMABUF the DMABUF
invalidation scheme is used to control the lifetime and no DMA mapping
outlives the provider, and the provider doesn't outlive the driver.

Hotplug works fine. VFIO gets the driver removal callback, it
invalidates all the DMABUFs, refuses to re-validate them, destroys the
P2P provider, and ends its driver. There is no lifetime issue.

Obviously you cannot use the new p2provider mechanism without some
kind of protection against use after hot unplug, but it doesn't have
to be struct page based.

For VFIO the invalidation scheme is linked to dma_buf_move_notify(),
for instance the hotunplug case goes:

static const struct vfio_device_ops vfio_pci_ops = {
   .close_device	= vfio_pci_core_close_device,

	vfio_pci_dma_buf_cleanup(vdev);

		dma_buf_move_notify(priv->dmabuf);

And then if we follow that into an importer like RDMA:

static struct dma_buf_attach_ops mlx5_ib_dmabuf_attach_ops = {
   .move_notify = mlx5_ib_dmabuf_invalidate_cb,

	mlx5r_umr_update_mr_pas(mr, MLX5_IB_UPD_XLT_ZAP);
	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
	
	    dma_buf_unmap_attachment(umem_dmabuf->attach, umem_dmabuf->sgt,
				 DMA_BIDIRECTIONAL);
               vfio_pci_dma_buf_unmap()

XLT_ZAP tells the HW to stop doing DMA and the unmap_pages -> 
unmap_attachment -> vfio_pci_dma_buf_unmap()
flow will tear down the DMA API mapping and remove it from the
IOMMU. All of this happens before device_driver remove completes.

There is no lifecycle issue here and we don't need pgmap to solve a
lifecycle problem or to help find the p2pdma_provider.
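
The call chain above can be modelled in a few lines of userspace C. This
is only a sketch under stated assumptions: struct exporter, struct
attachment, struct attach_ops and every function here are hypothetical
stand-ins, not the dma-buf, VFIO or mlx5 APIs. It shows the property
being claimed: once the exporter revokes and notifies each importer, the
importer first stops its hardware translation and then drops the
mapping, so no mapping survives the cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct attachment;

/* Hypothetical importer callbacks, shaped like a move_notify hook. */
struct attach_ops {
	void (*move_notify)(struct attachment *att);
};

/* Hypothetical exporter state: a revoked flag and a mapping count. */
struct exporter {
	bool revoked;
	int live_mappings;
};

struct attachment {
	struct exporter *exp;
	const struct attach_ops *ops;
	bool hw_translation_valid; /* importer HW may issue DMA */
	bool mapped;               /* exporter-side DMA mapping exists */
};

/* Mapping is refused once the exporter has been revoked. */
static int exporter_map(struct attachment *att)
{
	if (att->exp->revoked)
		return -1;
	if (!att->mapped) {
		att->mapped = true;
		att->exp->live_mappings++;
	}
	att->hw_translation_valid = true;
	return 0;
}

static void exporter_unmap(struct attachment *att)
{
	if (att->mapped) {
		att->mapped = false;
		att->exp->live_mappings--;
	}
}

/* Importer side: stop the HW first, then tear down the mapping. */
static void importer_invalidate(struct attachment *att)
{
	att->hw_translation_valid = false;
	exporter_unmap(att);
}

static const struct attach_ops importer_ops = {
	.move_notify = importer_invalidate,
};

/* Exporter side: revoke, then synchronously notify every importer. */
static void exporter_cleanup(struct exporter *exp,
			     struct attachment **atts, size_t n)
{
	size_t i;

	exp->revoked = true;
	for (i = 0; i < n; i++)
		atts[i]->ops->move_notify(atts[i]);
	/* Nothing may still be mapped when cleanup returns. */
	assert(exp->live_mappings == 0);
}
```

Because the notification is synchronous in this model, the exporter can
check live_mappings == 0 before it returns, and a later exporter_map()
on a revoked exporter fails.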

Jason
Christoph Hellwig Oct. 20, 2025, 12:27 p.m. UTC | #3
On Fri, Oct 17, 2025 at 08:53:20AM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 16, 2025 at 11:30:06PM -0700, Christoph Hellwig wrote:
> > On Mon, Oct 13, 2025 at 06:26:03PM +0300, Leon Romanovsky wrote:
> > > The DMA API now has a new flow, and has gained phys_addr_t support, so
> > > it no longer needs struct pages to perform P2P mapping.
> > 
> > That's news to me.  All the pci_p2pdma_map_state machinery is still
> > based on pgmaps and thus pages.
> 
> We had this discussion already three months ago:
> 
> https://lore.kernel.org/all/20250729131502.GJ36037@nvidia.com/
> 
> These couple patches make the core pci_p2pdma_map_state machinery work
> on struct p2pdma_provider, and pgmap is just one way to get a
> p2pdma_provider *
> 
> The struct page paths through pgmap go page->pgmap->mem to get
> p2pdma_provider.
> 
> The non-struct page paths just have a p2pdma_provider * without a
> pgmap. In this series VFIO uses
> 
> +	*provider = pcim_p2pdma_provider(pdev, bar);
> 
> To get the provider for a specific BAR.

And what protects that life time?  I've not seen anyone actually
building the proper lifetime management.  And if someone did the patches
need to clearly point to that.

> I think I've answered this three times now - for DMABUF the DMABUF
> invalidation scheme is used to control the lifetime and no DMA mapping
> outlives the provider, and the provider doesn't outlive the driver.

How?

> Obviously you cannot use the new p2provider mechanism without some
> kind of protection against use after hot unplug, but it doesn't have
> to be struct page based.

And how does this interact with everyone else expecting pgmap based
lifetime management?
Jason Gunthorpe Oct. 20, 2025, 12:58 p.m. UTC | #4
On Mon, Oct 20, 2025 at 05:27:02AM -0700, Christoph Hellwig wrote:
> On Fri, Oct 17, 2025 at 08:53:20AM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 16, 2025 at 11:30:06PM -0700, Christoph Hellwig wrote:
> > > On Mon, Oct 13, 2025 at 06:26:03PM +0300, Leon Romanovsky wrote:
> > > > The DMA API now has a new flow, and has gained phys_addr_t support, so
> > > > it no longer needs struct pages to perform P2P mapping.
> > > 
> > > That's news to me.  All the pci_p2pdma_map_state machinery is still
> > > based on pgmaps and thus pages.
> > 
> > We had this discussion already three months ago:
> > 
> > https://lore.kernel.org/all/20250729131502.GJ36037@nvidia.com/
> > 
> > These couple patches make the core pci_p2pdma_map_state machinery work
> > on struct p2pdma_provider, and pgmap is just one way to get a
> > p2pdma_provider *
> > 
> > The struct page paths through pgmap go page->pgmap->mem to get
> > p2pdma_provider.
> > 
> > The non-struct page paths just have a p2pdma_provider * without a
> > pgmap. In this series VFIO uses
> > 
> > +	*provider = pcim_p2pdma_provider(pdev, bar);
> > 
> > To get the provider for a specific BAR.
> 
> And what protects that life time?  I've not seen anyone actually
> building the proper lifetime management.  And if someone did the patches
> need to clearly point to that.

It is this series!

The above API gives a lifetime that is driver bound. The calling
driver must ensure it stops using provider and stops doing DMA with it
before remove() completes.

This VFIO series does that through the move_notify callchain I showed
in the previous email. This callchain is always triggered before
remove() of the VFIO PCI driver is completed.

> > I think I've answered this three times now - for DMABUF the DMABUF
> > invalidation scheme is used to control the lifetime and no DMA mapping
> > outlives the provider, and the provider doesn't outlive the driver.
> 
> How?

I explained it in detail in the message you are replying to. If
something is not clear can you please be more specific??

Is it the mmap in VFIO perhaps that is causing these questions?

VFIO uses a PFNMAP VMA, so you can't pin_user_page() it. It uses
unmap_mapping_range() during its remove() path to get rid of the VMA
PTEs.

The DMA activity doesn't use the mmap *at all*. It isn't like NVMe
which relies on the ZONE_DEVICE pages and VMAs to link drivers
together.

Instead the DMABUF FD is used to pass the MMIO pages between VFIO and
another driver. DMABUF has a built in invalidation mechanism that VFIO
triggers before remove(). The invalidation removes access from the
other driver.

This is different than NVMe which has no invalidation. NVMe does
unmap_mapping_range() on the VMA and waits for all the short lived
pgmap references to clear. We don't need anything like that because
DMABUF invalidation is synchronous.

The full picture for VFIO is something like:

[startup]
  MMIO is acquired from the pci_resource
  p2p_providers are setup

[runtime]
  MMIO is mapped into PFNMAP VMAs
  MMIO is linked to a DMABUF FD
  DMABUF FD gets DMA mapped using the p2p_provider

[unplug]
  unmap_mapping_range() is called so all VMAs are emptied out and the
  fault handler prevents new PTEs 
    ** No access to the MMIO through VMAs is possible**

  vfio_pci_dma_buf_cleanup() is called which prevents new DMABUF
  mappings from starting, and does dma_buf_move_notify() on all the
  open DMABUF FDs to invalidate other drivers. Other drivers stop
  doing DMA and we need to free the IOVA from the IOMMU/etc.
    ** No DMA access from other drivers is possible now**

  Any still open DMABUF FD will fail inside VFIO immediately due to
  the priv->revoked checks.
    **No code touches the p2p_provider anymore**

  The p2p_provider is destroyed by devm.
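
The ordering in the [unplug] steps can be written down as a small
userspace model. Everything in it is an assumption made for
illustration: struct model_dev, its counters and the model_*() helpers
are hypothetical, and the two flags only loosely mirror the fault
handler check and priv->revoked mentioned above. The point is the
invariant that the provider is freed only after both the VMA PTEs and
the DMA mappings are gone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical provider stand-in; only its lifetime matters here. */
struct provider { unsigned long bus_offset; };

struct model_dev {
	struct provider *provider;
	bool mmio_disabled; /* fault handler refuses new PTEs */
	bool revoked;       /* DMABUF mapping attempts are refused */
	int vma_ptes;       /* PTEs pointing at the MMIO */
	int dma_mappings;   /* DMA mappings made through the provider */
};

/* [startup]: the provider is set up during probe. */
static int model_probe(struct model_dev *d)
{
	d->provider = calloc(1, sizeof(*d->provider));
	if (!d->provider)
		return -1;
	d->mmio_disabled = false;
	d->revoked = false;
	d->vma_ptes = 0;
	d->dma_mappings = 0;
	return 0;
}

/* [runtime]: a page fault on the PFNMAP VMA installs a PTE. */
static int model_fault(struct model_dev *d)
{
	if (d->mmio_disabled)
		return -1;
	d->vma_ptes++;
	return 0;
}

/* [runtime]: an importer maps the DMABUF using the provider. */
static int model_dma_map(struct model_dev *d)
{
	if (d->revoked || !d->provider)
		return -1;
	d->dma_mappings++;
	return 0;
}

/* [unplug] step 1: empty the VMAs and block new faults. */
static void zap_vmas(struct model_dev *d)
{
	d->mmio_disabled = true;
	d->vma_ptes = 0;
}

/* [unplug] step 2: revoke, importers unmap synchronously. */
static void revoke_dmabufs(struct model_dev *d)
{
	d->revoked = true;
	d->dma_mappings = 0;
}

/* [unplug] step 3: only now may the provider go away. */
static void destroy_provider(struct model_dev *d)
{
	assert(d->vma_ptes == 0);
	assert(d->dma_mappings == 0);
	free(d->provider);
	d->provider = NULL;
}

static void model_remove(struct model_dev *d)
{
	zap_vmas(d);
	revoke_dmabufs(d);
	destroy_provider(d);
}
```

After model_remove() both model_fault() and model_dma_map() fail, which
corresponds to the "no code touches the p2p_provider anymore" state.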

> > Obviously you cannot use the new p2provider mechanism without some
> > kind of protection against use after hot unplug, but it doesn't have
> > to be struct page based.
> 
> > And how does this interact with everyone else expecting pgmap based
> > lifetime management?

They continue to use pgmap and nothing changes for them.

The pgmap path always waited until nothing was using the pgmap and
thus provider before allowing device driver remove() to complete.

The refactoring doesn't change the lifecycle model, it just provides
entry points to access the driver bound lifetime model directly
instead of being forced to use pgmap.

Leon, can you add some remarks to the comments about what the rules
are to call pcim_p2pdma_provider() ?

Jason
Leon Romanovsky Oct. 20, 2025, 3:04 p.m. UTC | #5
On Mon, Oct 20, 2025 at 09:58:54AM -0300, Jason Gunthorpe wrote:
> On Mon, Oct 20, 2025 at 05:27:02AM -0700, Christoph Hellwig wrote:
> > On Fri, Oct 17, 2025 at 08:53:20AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Oct 16, 2025 at 11:30:06PM -0700, Christoph Hellwig wrote:
> > > > On Mon, Oct 13, 2025 at 06:26:03PM +0300, Leon Romanovsky wrote:
> > > > > The DMA API now has a new flow, and has gained phys_addr_t support, so
> > > > > it no longer needs struct pages to perform P2P mapping.
> > > > 
> > > > That's news to me.  All the pci_p2pdma_map_state machinery is still
> > > > based on pgmaps and thus pages.
> > > 
> > > We had this discussion already three months ago:
> > > 
> > > https://lore.kernel.org/all/20250729131502.GJ36037@nvidia.com/
> > > 
> > > These couple patches make the core pci_p2pdma_map_state machinery work
> > > on struct p2pdma_provider, and pgmap is just one way to get a
> > > p2pdma_provider *
> > > 
> > > The struct page paths through pgmap go page->pgmap->mem to get
> > > p2pdma_provider.
> > > 
> > > The non-struct page paths just have a p2pdma_provider * without a
> > > pgmap. In this series VFIO uses
> > > 
> > > +	*provider = pcim_p2pdma_provider(pdev, bar);
> > > 
> > > To get the provider for a specific BAR.
> > 
> > And what protects that life time?  I've not seen anyone actually
> > building the proper lifetime management.  And if someone did the patches
> > need to clearly point to that.
> 
> It is this series!
> 
> The above API gives a lifetime that is driver bound. The calling
> driver must ensure it stops using provider and stops doing DMA with it
> before remove() completes.
> 
> This VFIO series does that through the move_notify callchain I showed
> in the previous email. This callchain is always triggered before
> remove() of the VFIO PCI driver is completed.
> 
> > > I think I've answered this three times now - for DMABUF the DMABUF
> > > invalidation scheme is used to control the lifetime and no DMA mapping
> > > outlives the provider, and the provider doesn't outlive the driver.
> > 
> > How?
> 
> I explained it in detail in the message you are replying to. If
> something is not clear can you please be more specific??
> 
> Is it the mmap in VFIO perhaps that is causing these questions?
> 
> VFIO uses a PFNMAP VMA, so you can't pin_user_page() it. It uses
> unmap_mapping_range() during its remove() path to get rid of the VMA
> PTEs.
> 
> The DMA activity doesn't use the mmap *at all*. It isn't like NVMe
> which relies on the ZONE_DEVICE pages and VMAs to link drivers
> together.
> 
> Instead the DMABUF FD is used to pass the MMIO pages between VFIO and
> another driver. DMABUF has a built in invalidation mechanism that VFIO
> triggers before remove(). The invalidation removes access from the
> other driver.
> 
> This is different than NVMe which has no invalidation. NVMe does
> unmap_mapping_range() on the VMA and waits for all the short lived
> pgmap references to clear. We don't need anything like that because
> DMABUF invalidation is synchronous.
> 
> The full picture for VFIO is something like:
> 
> [startup]
>   MMIO is acquired from the pci_resource
>   p2p_providers are setup
> 
> [runtime]
>   MMIO is mapped into PFNMAP VMAs
>   MMIO is linked to a DMABUF FD
>   DMABUF FD gets DMA mapped using the p2p_provider
> 
> [unplug]
>   unmap_mapping_range() is called so all VMAs are emptied out and the
>   fault handler prevents new PTEs 
>     ** No access to the MMIO through VMAs is possible**
> 
>   vfio_pci_dma_buf_cleanup() is called which prevents new DMABUF
>   mappings from starting, and does dma_buf_move_notify() on all the
>   open DMABUF FDs to invalidate other drivers. Other drivers stop
>   doing DMA and we need to free the IOVA from the IOMMU/etc.
>     ** No DMA access from other drivers is possible now**
> 
>   Any still open DMABUF FD will fail inside VFIO immediately due to
>   the priv->revoked checks.
>     **No code touches the p2p_provider anymore**
> 
>   The p2p_provider is destroyed by devm.
> 
> > > Obviously you cannot use the new p2provider mechanism without some
> > > kind of protection against use after hot unplug, but it doesn't have
> > > to be struct page based.
> > 
> > > And how does this interact with everyone else expecting pgmap based
> > > lifetime management?
> 
> They continue to use pgmap and nothing changes for them.
> 
> The pgmap path always waited until nothing was using the pgmap and
> thus provider before allowing device driver remove() to complete.
> 
> The refactoring doesn't change the lifecycle model, it just provides
> entry points to access the driver bound lifetime model directly
> instead of being forced to use pgmap.
> 
> Leon, can you add some remarks to the comments about what the rules
> are to call pcim_p2pdma_provider() ?

Yes, sure.

Thanks

> 
> Jason
Christoph Hellwig Oct. 22, 2025, 7:10 a.m. UTC | #6
On Mon, Oct 20, 2025 at 09:58:54AM -0300, Jason Gunthorpe wrote:
> I explained it in detail in the message you are replying to. If
> something is not clear can you please be more specific??
> 
> Is it the mmap in VFIO perhaps that is causing these questions?
> 
> VFIO uses a PFNMAP VMA, so you can't pin_user_page() it. It uses
> unmap_mapping_range() during its remove() path to get rid of the VMA
> PTEs.

This all needs to go into the explanation.

> Instead the DMABUF FD is used to pass the MMIO pages between VFIO and
> another driver. DMABUF has a built in invalidation mechanism that VFIO
> triggers before remove(). The invalidation removes access from the
> other driver.
> 
> This is different than NVMe which has no invalidation. NVMe does
> unmap_mapping_range() on the VMA and waits for all the short lived
> pgmap references to clear. We don't need anything like that because
> DMABUF invalidation is synchronous.

Please add documentation for this model to the source tree.
Jason Gunthorpe Oct. 22, 2025, 11:43 a.m. UTC | #7
On Wed, Oct 22, 2025 at 12:10:35AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 20, 2025 at 09:58:54AM -0300, Jason Gunthorpe wrote:
> > I explained it in detail in the message you are replying to. If
> > something is not clear can you please be more specific??
> > 
> > Is it the mmap in VFIO perhaps that is causing these questions?
> > 
> > VFIO uses a PFNMAP VMA, so you can't pin_user_page() it. It uses
> > unmap_mapping_range() during its remove() path to get rid of the VMA
> > PTEs.
> 
> > This all needs to go into the explanation.
> 
> > Instead the DMABUF FD is used to pass the MMIO pages between VFIO and
> > another driver. DMABUF has a built in invalidation mechanism that VFIO
> > triggers before remove(). The invalidation removes access from the
> > other driver.
> > 
> > This is different than NVMe which has no invalidation. NVMe does
> > unmap_mapping_range() on the VMA and waits for all the short lived
> > pgmap references to clear. We don't need anything like that because
> > DMABUF invalidation is synchronous.
> 
> Please add documentation for this model to the source tree.

Okay, let's see what we can come up with. I think explaining the dmabuf
model with respect to the p2p provider in the new common dmabuf
mapping API code would make sense.

Jason
Patch

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 78e108e47254..59cd6fb40e83 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -28,9 +28,8 @@  struct pci_p2pdma {
 };
 
 struct pci_p2pdma_pagemap {
-	struct pci_dev *provider;
-	u64 bus_offset;
 	struct dev_pagemap pgmap;
+	struct p2pdma_provider mem;
 };
 
 static struct pci_p2pdma_pagemap *to_p2p_pgmap(struct dev_pagemap *pgmap)
@@ -204,8 +203,8 @@  static void p2pdma_page_free(struct page *page)
 {
 	struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page_pgmap(page));
 	/* safe to dereference while a reference is held to the percpu ref */
-	struct pci_p2pdma *p2pdma =
-		rcu_dereference_protected(pgmap->provider->p2pdma, 1);
+	struct pci_p2pdma *p2pdma = rcu_dereference_protected(
+		to_pci_dev(pgmap->mem.owner)->p2pdma, 1);
 	struct percpu_ref *ref;
 
 	gen_pool_free_owner(p2pdma->pool, (uintptr_t)page_to_virt(page),
@@ -270,14 +269,15 @@  static int pci_p2pdma_setup(struct pci_dev *pdev)
 
 static void pci_p2pdma_unmap_mappings(void *data)
 {
-	struct pci_dev *pdev = data;
+	struct pci_p2pdma_pagemap *p2p_pgmap = data;
 
 	/*
 	 * Removing the alloc attribute from sysfs will call
 	 * unmap_mapping_range() on the inode, teardown any existing userspace
 	 * mappings and prevent new ones from being created.
 	 */
-	sysfs_remove_file_from_group(&pdev->dev.kobj, &p2pmem_alloc_attr.attr,
+	sysfs_remove_file_from_group(&p2p_pgmap->mem.owner->kobj,
+				     &p2pmem_alloc_attr.attr,
 				     p2pmem_group.name);
 }
 
@@ -328,10 +328,9 @@  int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 	pgmap->nr_range = 1;
 	pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
 	pgmap->ops = &p2pdma_pgmap_ops;
-
-	p2p_pgmap->provider = pdev;
-	p2p_pgmap->bus_offset = pci_bus_address(pdev, bar) -
-		pci_resource_start(pdev, bar);
+	p2p_pgmap->mem.owner = &pdev->dev;
+	p2p_pgmap->mem.bus_offset =
+		pci_bus_address(pdev, bar) - pci_resource_start(pdev, bar);
 
 	addr = devm_memremap_pages(&pdev->dev, pgmap);
 	if (IS_ERR(addr)) {
@@ -340,7 +339,7 @@  int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 	}
 
 	error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_unmap_mappings,
-					 pdev);
+					 p2p_pgmap);
 	if (error)
 		goto pages_free;
 
@@ -972,16 +971,16 @@  void pci_p2pmem_publish(struct pci_dev *pdev, bool publish)
 }
 EXPORT_SYMBOL_GPL(pci_p2pmem_publish);
 
-static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
-						    struct device *dev)
+static enum pci_p2pdma_map_type
+pci_p2pdma_map_type(struct p2pdma_provider *provider, struct device *dev)
 {
 	enum pci_p2pdma_map_type type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
-	struct pci_dev *provider = to_p2p_pgmap(pgmap)->provider;
+	struct pci_dev *pdev = to_pci_dev(provider->owner);
 	struct pci_dev *client;
 	struct pci_p2pdma *p2pdma;
 	int dist;
 
-	if (!provider->p2pdma)
+	if (!pdev->p2pdma)
 		return PCI_P2PDMA_MAP_NOT_SUPPORTED;
 
 	if (!dev_is_pci(dev))
@@ -990,7 +989,7 @@  static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
 	client = to_pci_dev(dev);
 
 	rcu_read_lock();
-	p2pdma = rcu_dereference(provider->p2pdma);
+	p2pdma = rcu_dereference(pdev->p2pdma);
 
 	if (p2pdma)
 		type = xa_to_value(xa_load(&p2pdma->map_types,
@@ -998,7 +997,7 @@  static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
 	rcu_read_unlock();
 
 	if (type == PCI_P2PDMA_MAP_UNKNOWN)
-		return calc_map_type_and_dist(provider, client, &dist, true);
+		return calc_map_type_and_dist(pdev, client, &dist, true);
 
 	return type;
 }
@@ -1006,9 +1005,13 @@  static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
 void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
 		struct device *dev, struct page *page)
 {
-	state->pgmap = page_pgmap(page);
-	state->map = pci_p2pdma_map_type(state->pgmap, dev);
-	state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset;
+	struct pci_p2pdma_pagemap *p2p_pgmap = to_p2p_pgmap(page_pgmap(page));
+
+	if (state->mem == &p2p_pgmap->mem)
+		return;
+
+	state->mem = &p2p_pgmap->mem;
+	state->map = pci_p2pdma_map_type(&p2p_pgmap->mem, dev);
 }
 
 /**
diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
index 951f81a38f3a..1400f3ad4299 100644
--- a/include/linux/pci-p2pdma.h
+++ b/include/linux/pci-p2pdma.h
@@ -16,6 +16,16 @@ 
 struct block_device;
 struct scatterlist;
 
+/**
+ * struct p2pdma_provider
+ *
+ * A p2pdma provider is a range of MMIO address space available to the CPU.
+ */
+struct p2pdma_provider {
+	struct device *owner;
+	u64 bus_offset;
+};
+
 #ifdef CONFIG_PCI_P2PDMA
 int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
 		u64 offset);
@@ -139,11 +149,11 @@  enum pci_p2pdma_map_type {
 };
 
 struct pci_p2pdma_map_state {
-	struct dev_pagemap *pgmap;
+	struct p2pdma_provider *mem;
 	enum pci_p2pdma_map_type map;
-	u64 bus_off;
 };
 
+
 /* helper for pci_p2pdma_state(), do not use directly */
 void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
 		struct device *dev, struct page *page);
@@ -162,8 +172,7 @@  pci_p2pdma_state(struct pci_p2pdma_map_state *state, struct device *dev,
 		struct page *page)
 {
 	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && is_pci_p2pdma_page(page)) {
-		if (state->pgmap != page_pgmap(page))
-			__pci_p2pdma_update_state(state, dev, page);
+		__pci_p2pdma_update_state(state, dev, page);
 		return state->map;
 	}
 	return PCI_P2PDMA_MAP_NONE;
@@ -181,7 +190,7 @@  static inline dma_addr_t
 pci_p2pdma_bus_addr_map(struct pci_p2pdma_map_state *state, phys_addr_t paddr)
 {
 	WARN_ON_ONCE(state->map != PCI_P2PDMA_MAP_BUS_ADDR);
-	return paddr + state->bus_off;
+	return paddr + state->mem->bus_offset;
 }
 
 #endif /* _LINUX_PCI_P2P_H */