
[1/4] dma-mapping: add a new dma_need_sync API

Message ID 20200629130359.2690853-2-hch@lst.de
State Accepted
Delegated to: BPF Maintainers
Series: [1/4] dma-mapping: add a new dma_need_sync API

Commit Message

Christoph Hellwig June 29, 2020, 1:03 p.m. UTC
Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
required for a given DMA streaming mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 Documentation/core-api/dma-api.rst |  8 ++++++++
 include/linux/dma-direct.h         |  1 +
 include/linux/dma-mapping.h        |  5 +++++
 kernel/dma/direct.c                |  6 ++++++
 kernel/dma/mapping.c               | 10 ++++++++++
 5 files changed, 30 insertions(+)

Comments

Jonathan Lemon July 6, 2020, 7:42 p.m. UTC | #1
On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> required for a given DMA streaming mapping.
> 
> +::
> +
> +	bool
> +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> +
> +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> +transfer memory ownership.  Returns %false if those calls can be skipped.

Hi Christoph -

The call above is for a specific dma_addr.  For correctness, would I
need to check every addr, or can I assume that for a specific memory
type (pages returned from malloc) the answer would be identical?
Christoph Hellwig July 7, 2020, 6:47 a.m. UTC | #2
On Mon, Jul 06, 2020 at 12:42:27PM -0700, Jonathan Lemon wrote:
> On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> > Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> > required for a given DMA streaming mapping.
> > 
> > +::
> > +
> > +	bool
> > +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> > +
> > +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> > +transfer memory ownership.  Returns %false if those calls can be skipped.
> 
> Hi Christoph -
> 
> The call above is for a specific dma_addr.  For correctness, would I
> need to check every addr, or can I assume that for a specific memory
> type (pages returned from malloc) the answer would be identical?

You need to check every mapping.  E.g. this API pairs with a
dma_map_single/page call.  For S/G mappings you'd need to call it for
each entry, although if you have a use case for that we really should
add a dma_sg_need_sync helper instead of open-coding the scatterlist walk.
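
As an illustration of the helper mentioned above — not part of this series —
an open-coded per-entry walk over an already-mapped scatterlist might look
roughly like this (sketch only; dma_sg_need_sync does not exist in the tree):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/*
 * Illustrative only: return true if any entry of a scatterlist that was
 * mapped with dma_map_sg() needs dma_sync_single_for_{device,cpu} calls.
 * @nents is the mapped entry count returned by dma_map_sg().
 */
static bool dma_sg_need_sync(struct device *dev, struct scatterlist *sgl,
			     int nents)
{
	struct scatterlist *sg;
	int i;

	for_each_sg(sgl, sg, nents, i)
		if (dma_need_sync(dev, sg_dma_address(sg)))
			return true;
	return false;
}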
Jonathan Lemon July 7, 2020, 3:11 p.m. UTC | #3
On Tue, Jul 07, 2020 at 08:47:30AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 06, 2020 at 12:42:27PM -0700, Jonathan Lemon wrote:
> > On Mon, Jun 29, 2020 at 03:03:56PM +0200, Christoph Hellwig wrote:
> > > Add a new API to check if calls to dma_sync_single_for_{device,cpu} are
> > > required for a given DMA streaming mapping.
> > > 
> > > +::
> > > +
> > > +	bool
> > > +	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
> > > +
> > > +Returns %true if dma_sync_single_for_{device,cpu} calls are required to
> > > +transfer memory ownership.  Returns %false if those calls can be skipped.
> > 
> > Hi Christoph -
> > 
> > The call above is for a specific dma_addr.  For correctness, would I
> > need to check every addr, or can I assume that for a specific memory
> > type (pages returned from malloc) the answer would be identical?
> 
> You need to check every mapping.  E.g. this API pairs with a
> dma_map_single/page call.  For S/G mappings you'd need to call it for
> each entry, although if you have a use case for that we really should
> add a dma_sg_need_sync helper instead of open-coding the scatterlist walk.

My use case is setting up a pinned memory area, and caching the dma
mappings.  I'd like to bypass storing the DMA addresses if they aren't
needed.  For example:

setup()
{
    if (dma_need_sync(dev, addr)) {
        kvmalloc_array(...);
        cache_dma_mappings(...);
    }
}

dev_get_dma(page)
{
    if (!cache)
        return page_to_phys(page);

    return dma_cache_lookup(...);
}

The reason for doing it this way is that the page in question may be
backed by either system memory, or device memory such as a GPU.  For the
latter, the GPU provides a table of DMA addresses where data may be
accessed, so I'm unable to use the dma_map_page() API.
Christoph Hellwig July 7, 2020, 3:14 p.m. UTC | #4
On Tue, Jul 07, 2020 at 08:11:09AM -0700, Jonathan Lemon wrote:
> > You need to check every mapping.  E.g. this API pairs with a
> > dma_map_single/page call.  For S/G mappings you'd need to call it for
> > each entry, although if you have a use case for that we really should
> > add a dma_sg_need_sync helper instead of open-coding the scatterlist walk.
> 
> My use case is setting up a pinned memory area, and caching the dma
> mappings.  I'd like to bypass storing the DMA addresses if they aren't
> needed.  For example:
> 
> setup()
> {
>     if (dma_need_sync(dev, addr)) {
>         kvmalloc_array(...);
>         cache_dma_mappings(...);
>     }
> }
> 
> dev_get_dma(page)
> {
>     if (!cache)
>         return page_to_phys(page);
> 
>     return dma_cache_lookup(...);
> }
> 
> The reason for doing it this way is that the page in question may be
> backed by either system memory, or device memory such as a GPU.  For the
> latter, the GPU provides a table of DMA addresses where data may be
> accessed, so I'm unable to use the dma_map_page() API.

dma_need_sync doesn't tell you whether the unmap needs the dma_addr_t.
I've been thinking about replacing CONFIG_NEED_DMA_MAP_STATE with a runtime
check for a while, which would give you exactly what you need.  For now it
isn't very useful as there are very few configs left that do not have
CONFIG_NEED_DMA_MAP_STATE set.
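
For context, CONFIG_NEED_DMA_MAP_STATE is the compile-time switch behind the
existing dma_unmap_addr()/dma_unmap_len() helpers, which let a driver avoid
storing the dma_addr_t and length for unmap when they are not needed.  A
minimal sketch of that pattern follows; the rx_buf structure and function
names are made up for illustration:

#include <linux/dma-mapping.h>

struct rx_buf {
	struct page *page;
	/* Both fields compile away if CONFIG_NEED_DMA_MAP_STATE is not set. */
	DEFINE_DMA_UNMAP_ADDR(dma);
	DEFINE_DMA_UNMAP_LEN(len);
};

static int rx_buf_map(struct device *dev, struct rx_buf *buf)
{
	dma_addr_t addr = dma_map_page(dev, buf->page, 0, PAGE_SIZE,
				       DMA_FROM_DEVICE);

	if (dma_mapping_error(dev, addr))
		return -ENOMEM;
	dma_unmap_addr_set(buf, dma, addr);
	dma_unmap_len_set(buf, len, PAGE_SIZE);
	return 0;
}

static void rx_buf_unmap(struct device *dev, struct rx_buf *buf)
{
	dma_unmap_page(dev, dma_unmap_addr(buf, dma),
		       dma_unmap_len(buf, len), DMA_FROM_DEVICE);
}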

Patch

diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst
index 2d8d2fed731720..f41620439ef349 100644
--- a/Documentation/core-api/dma-api.rst
+++ b/Documentation/core-api/dma-api.rst
@@ -204,6 +204,14 @@  Returns the maximum size of a mapping for the device. The size parameter
 of the mapping functions like dma_map_single(), dma_map_page() and
 others should not be larger than the returned value.
 
+::
+
+	bool
+	dma_need_sync(struct device *dev, dma_addr_t dma_addr);
+
+Returns %true if dma_sync_single_for_{device,cpu} calls are required to
+transfer memory ownership.  Returns %false if those calls can be skipped.
+
 ::
 
 	unsigned long
diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
index cdfa400f89b3d3..5184735a0fe8eb 100644
--- a/include/linux/dma-direct.h
+++ b/include/linux/dma-direct.h
@@ -85,4 +85,5 @@  int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
 		void *cpu_addr, dma_addr_t dma_addr, size_t size,
 		unsigned long attrs);
 int dma_direct_supported(struct device *dev, u64 mask);
+bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr);
 #endif /* _LINUX_DMA_DIRECT_H */
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 78f677cf45ab69..a33ed3954ed465 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -461,6 +461,7 @@  int dma_set_mask(struct device *dev, u64 mask);
 int dma_set_coherent_mask(struct device *dev, u64 mask);
 u64 dma_get_required_mask(struct device *dev);
 size_t dma_max_mapping_size(struct device *dev);
+bool dma_need_sync(struct device *dev, dma_addr_t dma_addr);
 unsigned long dma_get_merge_boundary(struct device *dev);
 #else /* CONFIG_HAS_DMA */
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
@@ -571,6 +572,10 @@  static inline size_t dma_max_mapping_size(struct device *dev)
 {
 	return 0;
 }
+static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	return false;
+}
 static inline unsigned long dma_get_merge_boundary(struct device *dev)
 {
 	return 0;
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 93f578a8e613ba..95866b64758100 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -539,3 +539,9 @@  size_t dma_direct_max_mapping_size(struct device *dev)
 		return swiotlb_max_mapping_size(dev);
 	return SIZE_MAX;
 }
+
+bool dma_direct_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	return !dev_is_dma_coherent(dev) ||
+		is_swiotlb_buffer(dma_to_phys(dev, dma_addr));
+}
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 98e3d873792ea4..a8c18c9a796fdc 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -397,6 +397,16 @@  size_t dma_max_mapping_size(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dma_max_mapping_size);
 
+bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
+{
+	const struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (dma_is_direct(ops))
+		return dma_direct_need_sync(dev, dma_addr);
+	return ops->sync_single_for_cpu || ops->sync_single_for_device;
+}
+EXPORT_SYMBOL_GPL(dma_need_sync);
+
 unsigned long dma_get_merge_boundary(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);
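
For reference, a minimal driver-side sketch of how the new helper might be
used; the ring-entry structure and function names below are invented for
illustration and are not part of this series:

#include <linux/dma-mapping.h>

struct rx_ring_entry {
	dma_addr_t dma;
	bool need_sync;		/* cached result of dma_need_sync() */
};

static int rx_entry_map(struct device *dev, struct rx_ring_entry *e,
			struct page *page)
{
	e->dma = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, e->dma))
		return -ENOMEM;
	/* Decide once per mapping whether dma_sync_* calls are required. */
	e->need_sync = dma_need_sync(dev, e->dma);
	return 0;
}

static void rx_entry_complete(struct device *dev, struct rx_ring_entry *e)
{
	/* Hot path: skip the ownership transfer when it is a no-op. */
	if (e->need_sync)
		dma_sync_single_for_cpu(dev, e->dma, PAGE_SIZE,
					DMA_FROM_DEVICE);
}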