[net-next,RFC,19/26] arch/sparc: Add option to skip DMA sync as a part of map and unmap

Message ID 20161024120607.16276.5989.stgit@ahduyck-blue-test.jf.intel.com
State RFC, archived
Delegated to: David Miller

Commit Message

Duyck, Alexander H Oct. 24, 2016, 12:06 p.m. UTC
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/sparc/kernel/iommu.c  |    4 ++--
 arch/sparc/kernel/ioport.c |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

Comments

David Miller Oct. 24, 2016, 6:27 p.m. UTC | #1
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Mon, 24 Oct 2016 08:06:07 -0400

> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> via a sync_for_cpu or sync_for_device call.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: sparclinux@vger.kernel.org
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

This is fine for avoiding the flush for performance reasons, but the
chip isn't going to write anything back unless the device wrote into
the area.

Alexander H Duyck Oct. 24, 2016, 7:24 p.m. UTC | #2
On Mon, Oct 24, 2016 at 11:27 AM, David Miller <davem@davemloft.net> wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> Date: Mon, 24 Oct 2016 08:06:07 -0400
>
>> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
>> avoid invoking cache line invalidation if the driver will just handle it
>> via a sync_for_cpu or sync_for_device call.
>>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: sparclinux@vger.kernel.org
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This is fine for avoiding the flush for performance reasons, but the
> chip isn't going to write anything back unless the device wrote into
> the area.

That is mostly what I am doing here.  The original implementation was
mostly for performance.  I am trying to take the attribute that was
already in place for ARM and apply it to all the other architectures.
So what happens now is that we call the map function with this
attribute set, use the sync functions to hand the buffer over to the
device, and then pull the mapping later.

The idea is that if Jesper does his page pool stuff it would be
calling the map/unmap functions and then the drivers would be doing
the sync_for_cpu/sync_for_device.  I want to make sure the map is
cheap, and we will have to call sync_for_cpu from the drivers anyway
since there is no guarantee whether we will have a new page or be
reusing an existing one.

- Alex
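
The page-reuse flow described in the reply above can be modeled in user space to show which flush the attribute removes. This is only a sketch, not kernel code: `MODEL_ATTR_SKIP_CPU_SYNC` (with an arbitrary bit value), `struct model_dev`, `model_unmap_page`, `model_sync_for_cpu`, and `model_recycle` are hypothetical stand-ins. The only part taken from the patch is the shape of the guard in `dma_4u_unmap_page()`; the sync helper assumes an explicit sync flushes whenever streaming buffers are enabled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC; the bit value is arbitrary. */
#define MODEL_ATTR_SKIP_CPU_SYNC (1UL << 0)

struct model_dev {
	bool strbuf_enabled;   /* mirrors strbuf->strbuf_enabled in iommu.c */
	unsigned int flushes;  /* counts simulated strbuf_flush() calls */
};

/* Same guard shape as the patched dma_4u_unmap_page(): flush unless asked to skip. */
static void model_unmap_page(struct model_dev *dev, unsigned long attrs)
{
	if (dev->strbuf_enabled && !(attrs & MODEL_ATTR_SKIP_CPU_SYNC))
		dev->flushes++;
}

/* Explicit sync issued by the driver (assumed to flush when streaming buffers are on). */
static void model_sync_for_cpu(struct model_dev *dev)
{
	if (dev->strbuf_enabled)
		dev->flushes++;
}

/*
 * One mapping reused for `cycles` receive cycles, then unmapped.
 * Returns the total number of simulated flushes.
 */
unsigned int model_recycle(unsigned int cycles, unsigned long attrs)
{
	struct model_dev dev = { .strbuf_enabled = true, .flushes = 0 };
	unsigned int i;

	for (i = 0; i < cycles; i++)
		model_sync_for_cpu(&dev);   /* driver syncs before reading each buffer */

	model_unmap_page(&dev, attrs);      /* flush here is redundant after the syncs */
	return dev.flushes;
}
```

With three receive cycles, `model_recycle(3, 0)` counts four flushes and `model_recycle(3, MODEL_ATTR_SKIP_CPU_SYNC)` counts three: the only flush saved is the one at unmap, which the driver's own sync calls have already covered.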

Patch

diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 5c615ab..8fda4e4 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -415,7 +415,7 @@  static void dma_4u_unmap_page(struct device *dev, dma_addr_t bus_addr,
 		ctx = (iopte_val(*base) & IOPTE_CONTEXT) >> 47UL;
 
 	/* Step 1: Kick data out of streaming buffers if necessary. */
-	if (strbuf->strbuf_enabled)
+	if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		strbuf_flush(strbuf, iommu, bus_addr, ctx,
 			     npages, direction);
 
@@ -640,7 +640,7 @@  static void dma_4u_unmap_sg(struct device *dev, struct scatterlist *sglist,
 		base = iommu->page_table + entry;
 
 		dma_handle &= IO_PAGE_MASK;
-		if (strbuf->strbuf_enabled)
+		if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 			strbuf_flush(strbuf, iommu, dma_handle, ctx,
 				     npages, direction);
 
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 2344103..6ffaec4 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -527,7 +527,7 @@  static dma_addr_t pci32_map_page(struct device *dev, struct page *page,
 static void pci32_unmap_page(struct device *dev, dma_addr_t ba, size_t size,
 			     enum dma_data_direction dir, unsigned long attrs)
 {
-	if (dir != PCI_DMA_TODEVICE)
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		dma_make_coherent(ba, PAGE_ALIGN(size));
 }
 
@@ -572,7 +572,7 @@  static void pci32_unmap_sg(struct device *dev, struct scatterlist *sgl,
 	struct scatterlist *sg;
 	int n;
 
-	if (dir != PCI_DMA_TODEVICE) {
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		for_each_sg(sgl, sg, nents, n) {
 			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
 		}