Message ID | 20161024120607.16276.5989.stgit@ahduyck-blue-test.jf.intel.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
From: Alexander Duyck <alexander.h.duyck@intel.com>
Date: Mon, 24 Oct 2016 08:06:07 -0400

> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
> avoid invoking cache line invalidation if the driver will just handle it
> via a sync_for_cpu or sync_for_device call.
>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: sparclinux@vger.kernel.org
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

This is fine for avoiding the flush for performance reasons, but the
chip isn't going to write anything back unless the device wrote into
the area.
On Mon, Oct 24, 2016 at 11:27 AM, David Miller <davem@davemloft.net> wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> Date: Mon, 24 Oct 2016 08:06:07 -0400
>
>> This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
>> avoid invoking cache line invalidation if the driver will just handle it
>> via a sync_for_cpu or sync_for_device call.
>>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: sparclinux@vger.kernel.org
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This is fine for avoiding the flush for performance reasons, but the
> chip isn't going to write anything back unless the device wrote into
> the area.

That is mostly what I am doing here. The original implementation was
mostly for performance. I am trying to take the attribute that was
already in place for ARM and apply it to all the other architectures.
So what will be happening now is that we call the map function with
this attribute set and then use the sync functions to map it to the
device and then pull the mapping later.

The idea is that if Jesper does his page pool stuff it would be
calling the map/unmap functions and then the drivers would be doing
the sync_for_cpu/sync_for_device. I want to make sure the map is
cheap and we will have to call sync_for_cpu from the drivers anyway
since there is no guarantee if we will have a new page or be reusing
an existing one.

- Alex
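The division of labour described in the reply above can be sketched in plain userspace C. This is only a toy model, not kernel code: `mock_dma_dev`, `mock_flush`, `mock_unmap_page`, `mock_sync_for_cpu`, and `MOCK_ATTR_SKIP_CPU_SYNC` are hypothetical names invented for illustration, and the bit value chosen for the attribute is arbitrary. The only part taken from the patch is the shape of the guard, `enabled && !(attrs & SKIP)`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC; the bit value is
 * arbitrary and is NOT claimed to match the kernel's definition. */
#define MOCK_ATTR_SKIP_CPU_SYNC (1UL << 0)

/* Toy device state: models strbuf->strbuf_enabled from the patch and
 * counts how many (simulated) expensive flushes were performed. */
struct mock_dma_dev {
	bool strbuf_enabled;
	unsigned int flushes;
};

/* Simulated expensive operation, standing in for strbuf_flush() or
 * dma_make_coherent() in the real sparc code. */
static void mock_flush(struct mock_dma_dev *dev)
{
	dev->flushes++;
}

/* Unmap with the same guard shape the patch adds: skip the flush when
 * the caller passes the skip-sync attribute. */
static void mock_unmap_page(struct mock_dma_dev *dev, unsigned long attrs)
{
	if (dev->strbuf_enabled && !(attrs & MOCK_ATTR_SKIP_CPU_SYNC))
		mock_flush(dev);
}

/* Explicit sync that the driver issues itself when it needs to read
 * the buffer; this is where the flush cost is paid instead. */
static void mock_sync_for_cpu(struct mock_dma_dev *dev)
{
	if (dev->strbuf_enabled)
		mock_flush(dev);
}
```

In this model, a driver that calls `mock_sync_for_cpu()` and then `mock_unmap_page()` with the attribute set pays for one flush, whereas unmapping without the attribute after an explicit sync would pay for two.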
diff --git a/arch/sparc/kernel/iommu.c b/arch/sparc/kernel/iommu.c
index 5c615ab..8fda4e4 100644
--- a/arch/sparc/kernel/iommu.c
+++ b/arch/sparc/kernel/iommu.c
@@ -415,7 +415,7 @@ static void dma_4u_unmap_page(struct device *dev, dma_addr_t bus_addr,
 		ctx = (iopte_val(*base) & IOPTE_CONTEXT) >> 47UL;
 	/* Step 1: Kick data out of streaming buffers if necessary. */
-	if (strbuf->strbuf_enabled)
+	if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		strbuf_flush(strbuf, iommu, bus_addr, ctx, npages, direction);
@@ -640,7 +640,7 @@ static void dma_4u_unmap_sg(struct device *dev, struct scatterlist *sglist,
 		base = iommu->page_table + entry;
 		dma_handle &= IO_PAGE_MASK;
-		if (strbuf->strbuf_enabled)
+		if (strbuf->strbuf_enabled && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 			strbuf_flush(strbuf, iommu, dma_handle, ctx, npages, direction);
diff --git a/arch/sparc/kernel/ioport.c b/arch/sparc/kernel/ioport.c
index 2344103..6ffaec4 100644
--- a/arch/sparc/kernel/ioport.c
+++ b/arch/sparc/kernel/ioport.c
@@ -527,7 +527,7 @@ static dma_addr_t pci32_map_page(struct device *dev, struct page *page,
 static void pci32_unmap_page(struct device *dev, dma_addr_t ba, size_t size,
 			     enum dma_data_direction dir, unsigned long attrs)
 {
-	if (dir != PCI_DMA_TODEVICE)
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
 		dma_make_coherent(ba, PAGE_ALIGN(size));
 }
@@ -572,7 +572,7 @@ static void pci32_unmap_sg(struct device *dev, struct scatterlist *sgl,
 	struct scatterlist *sg;
 	int n;
-	if (dir != PCI_DMA_TODEVICE) {
+	if (dir != PCI_DMA_TODEVICE && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
 		for_each_sg(sgl, sg, nents, n) {
 			dma_make_coherent(sg_phys(sg), PAGE_ALIGN(sg->length));
 		}
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 arch/sparc/kernel/iommu.c  | 4 ++--
 arch/sparc/kernel/ioport.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)