ARC: fix broken noncoherent cache ops

Message ID 20180724141302.4305-1-Eugeniy.Paltsev@synopsys.com
State New

Commit Message

Eugeniy Paltsev July 24, 2018, 2:13 p.m.
All DMA devices on ARC have been broken with SW cache control
since commit a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page").
This happens because the new implementation doesn't check the
direction argument at all. Fix that.

Fixes: a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page")
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
---
NOTE:
 * This patch was stress tested on HSDK with bonnie++ (usb and sdio)
   with IOC disabled. Ethernet wasn't tested because it doesn't
   work with SW cache control as of today (see STAR 9001336019)

 arch/arc/mm/dma.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

Comments

Vineet Gupta July 24, 2018, 6:34 p.m. | #1
On 07/24/2018 07:13 AM, Eugeniy Paltsev wrote:
> All DMA devices on ARC have been broken with SW cache control
> since commit a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page").
> This happens because the new implementation doesn't check the
> direction argument at all. Fix that.

Good find and I presume painful to debug.

Interesting, though, how the error finally trickled through, as the root cause
was that arc_dma_sync_single*() were broken to begin with.

Prior to the common ops rework, arc_dma_sync_single_for_device() would
unconditionally do a cache wback, independent of the direction (by calling the
_dma_cache_sync helper with TO_DEVICE). In 713a74624bba ("arc: simplify
arc_dma_sync_single_for_{cpu,device}") Christoph changed this to skip the helper.
Then in a8eb92d02dd7 the usage of these routines proliferated to the more common
kernel API dma_*map_page(), and that is where the original deficiency showed up.
I'll add this bit of history to the changelog to remember this better.


> Fixes: a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page")
> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
> ---
> NOTE:
>  * This patch was stress tested on HSDK with bonnie++ (usb and sdio)
>    with IOC disabled. Ethernet wasn't tested because it doesn't
>    work with SW cache control as of today (see STAR 9001336019)
>
>  arch/arc/mm/dma.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 44 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
> index 8c1071840979..cefb776a99ff 100644
> --- a/arch/arc/mm/dma.c
> +++ b/arch/arc/mm/dma.c
> @@ -129,14 +129,56 @@ int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
>  	return ret;
>  }
>  
> +/*
> + * Cache operations depending on function and direction argument, inspired by
> + * https://lkml.org/lkml/2018/5/18/979
> + * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
> + * dma-mapping: provide a generic dma-noncoherent implementation)"
> + *
> + *          |   map          ==  for_device     |   unmap     ==  for_cpu
> + *          |----------------------------------------------------------------
> + * TO_DEV   |   writeback        writeback      |   none          none
> + * FROM_DEV |   invalidate       invalidate     |   invalidate    invalidate
> + * BIDIR    |   writeback+inv    writeback+inv  |   invalidate    invalidate
> + *
> + * NOTE: we don't check the validity of direction argument as it is done in
> + * upper layer functions (in include/linux/dma-mapping.h)
> + */
> +

Very nice !  Added to for-curr

Thx,
-Vineet

>  void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
>  		size_t size, enum dma_data_direction dir)
>  {
> -	dma_cache_wback(paddr, size);
> +	switch (dir) {
> +	case DMA_TO_DEVICE:
> +		dma_cache_wback(paddr, size);
> +		break;
> +
> +	case DMA_FROM_DEVICE:
> +		dma_cache_inv(paddr, size);
> +		break;
> +
> +	case DMA_BIDIRECTIONAL:
> +		dma_cache_wback_inv(paddr, size);
> +		break;
> +
> +	default:
> +		break;
> +	}
>  }
>  
>  void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
>  		size_t size, enum dma_data_direction dir)
>  {
> -	dma_cache_inv(paddr, size);
> +	switch (dir) {
> +	case DMA_TO_DEVICE:
> +		break;
> +
> +	case DMA_FROM_DEVICE:
> +	case DMA_BIDIRECTIONAL:
> +		dma_cache_inv(paddr, size);
> +		break;
> +
> +	default:
> +		break;
> +	}
>  }
Christoph Hellwig July 26, 2018, 9:11 a.m. | #2
On Tue, Jul 24, 2018 at 05:13:02PM +0300, Eugeniy Paltsev wrote:
> All DMA devices on ARC have been broken with SW cache control
> since commit a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page").
> This happens because the new implementation doesn't check the
> direction argument at all. Fix that.
> 
> Fixes: a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page")
> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

Looks sensible.  Might be worth explaining that ARC can speculate
into the areas under DMA, which is why this is required.
Vineet Gupta July 26, 2018, 7 p.m. | #3
On 07/26/2018 02:08 AM, Christoph Hellwig wrote:
> On Tue, Jul 24, 2018 at 05:13:02PM +0300, Eugeniy Paltsev wrote:
>> All DMA devices on ARC have been broken with SW cache control
>> since commit a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page").
>> This happens because the new implementation doesn't check the
>> direction argument at all. Fix that.
>>
>> Fixes: a8eb92d02dd7 ("arc: fix arc_dma_{map,unmap}_page")
>> Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
> Looks sensible.  Might be worth explaining that ARC can speculate
> into the areas under DMA, which is why this is required.
>

ARC CPUs do prefetch, but I doubt they do so that aggressively, especially since
the region around DMA buffers is unlikely to be used for normal LD/ST that could
bleed into the DMA buffers. The issue here seems to be less technical and more of
a snafu in the implementation details.

1. Originally
    dma_map_single(@dir)  => honored @dir, and did inv, wback or both depending on it

    sync_for_device(@dir) => forced @dir DMA_TO_DEV   => cache wback
    sync_for_cpu(@dir)    => forced @dir DMA_FROM_DEV => cache inv

2. After commit a8eb92d02dd7, dma_map_single() started calling sync_for_device(),
which as noted above didn't respect @dir and only did a cache wback. It would thus
fail for the DMA_FROM_DEV/BIDIR cases, where the cpu needs to read from the buffer
and a cache inv is required as well. Likewise dma_unmap_single() would
unconditionally do a cache inv, given its usage of sync_for_cpu(), which would be
wrong for the TO_DEVICE case.

Too bad I didn't spot this in the code review myself at the time.

-Vineet
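
Editorial sketch: the two states Vineet describes can be compared in a small
userspace model. Everything here is hypothetical illustration, not kernel code:
the model_* and OP_* names only mirror the kernel's direction and cache-op
names, and the "broken" functions are a reading of the description above
(direction ignored, always wback for_device, always inv for_cpu).

```c
#include <assert.h>

/* Hypothetical stand-ins for enum dma_data_direction and the cache helpers. */
enum model_dir { MODEL_BIDIRECTIONAL, MODEL_TO_DEVICE, MODEL_FROM_DEVICE };
enum model_op { OP_NONE, OP_WBACK, OP_INV, OP_WBACK_INV };

/* After a8eb92d02dd7, before this patch: the direction is ignored. */
static enum model_op broken_for_device(enum model_dir dir)
{
	(void)dir;
	return OP_WBACK;
}

static enum model_op broken_for_cpu(enum model_dir dir)
{
	(void)dir;
	return OP_INV;
}

/* With this patch: follows the table in the patch comment. */
static enum model_op fixed_for_device(enum model_dir dir)
{
	switch (dir) {
	case MODEL_TO_DEVICE:     return OP_WBACK;
	case MODEL_FROM_DEVICE:   return OP_INV;
	case MODEL_BIDIRECTIONAL: return OP_WBACK_INV;
	}
	return OP_NONE;
}

static enum model_op fixed_for_cpu(enum model_dir dir)
{
	switch (dir) {
	case MODEL_TO_DEVICE:     return OP_NONE;
	case MODEL_FROM_DEVICE:
	case MODEL_BIDIRECTIONAL: return OP_INV;
	}
	return OP_NONE;
}

/* Count (function, direction) pairs where the broken op differs from the table. */
static int count_mismatches(void)
{
	int n = 0;

	for (int d = MODEL_BIDIRECTIONAL; d <= MODEL_FROM_DEVICE; d++) {
		enum model_dir dir = (enum model_dir)d;

		n += broken_for_device(dir) != fixed_for_device(dir);
		n += broken_for_cpu(dir) != fixed_for_cpu(dir);
	}
	return n;
}

/* Sanity check: the TO_DEVICE map path was the only for_device case already right. */
void model_self_check(void)
{
	assert(broken_for_device(MODEL_TO_DEVICE) == fixed_for_device(MODEL_TO_DEVICE));
	assert(count_mismatches() > 0);
}
```

In this model the mismatching pairs are for_device with FROM_DEV and BIDIR,
and for_cpu with TO_DEV, which matches the failure cases listed above.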

Patch

diff --git a/arch/arc/mm/dma.c b/arch/arc/mm/dma.c
index 8c1071840979..cefb776a99ff 100644
--- a/arch/arc/mm/dma.c
+++ b/arch/arc/mm/dma.c
@@ -129,14 +129,56 @@  int arch_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 	return ret;
 }
 
+/*
+ * Cache operations depending on function and direction argument, inspired by
+ * https://lkml.org/lkml/2018/5/18/979
+ * "dma_sync_*_for_cpu and direction=TO_DEVICE (was Re: [PATCH 02/20]
+ * dma-mapping: provide a generic dma-noncoherent implementation)"
+ *
+ *          |   map          ==  for_device     |   unmap     ==  for_cpu
+ *          |----------------------------------------------------------------
+ * TO_DEV   |   writeback        writeback      |   none          none
+ * FROM_DEV |   invalidate       invalidate     |   invalidate    invalidate
+ * BIDIR    |   writeback+inv    writeback+inv  |   invalidate    invalidate
+ *
+ * NOTE: we don't check the validity of direction argument as it is done in
+ * upper layer functions (in include/linux/dma-mapping.h)
+ */
+
 void arch_sync_dma_for_device(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir)
 {
-	dma_cache_wback(paddr, size);
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		dma_cache_wback(paddr, size);
+		break;
+
+	case DMA_FROM_DEVICE:
+		dma_cache_inv(paddr, size);
+		break;
+
+	case DMA_BIDIRECTIONAL:
+		dma_cache_wback_inv(paddr, size);
+		break;
+
+	default:
+		break;
+	}
 }
 
 void arch_sync_dma_for_cpu(struct device *dev, phys_addr_t paddr,
 		size_t size, enum dma_data_direction dir)
 {
-	dma_cache_inv(paddr, size);
+	switch (dir) {
+	case DMA_TO_DEVICE:
+		break;
+
+	case DMA_FROM_DEVICE:
+	case DMA_BIDIRECTIONAL:
+		dma_cache_inv(paddr, size);
+		break;
+
+	default:
+		break;
+	}
 }
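
Editorial sketch: why the direction matters can be seen with a toy model of a
write-back cache in front of memory that a device accesses directly. This is a
minimal userspace illustration under stated assumptions, not ARC or kernel
code: struct toy_sys and all toy_* helpers are hypothetical, the cache is
modelled per word rather than per line, and eviction is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_WORDS 4

/* Hypothetical model: memory plus a write-back cache with valid/dirty bits. */
struct toy_sys {
	int mem[TOY_WORDS];
	int cache[TOY_WORDS];
	bool valid[TOY_WORDS];
	bool dirty[TOY_WORDS];
};

static void toy_init(struct toy_sys *s)
{
	memset(s, 0, sizeof(*s));
}

/* CPU read: served from the cache, filled from memory on a miss. */
static int toy_cpu_read(struct toy_sys *s, int i)
{
	assert(i >= 0 && i < TOY_WORDS);
	if (!s->valid[i]) {
		s->cache[i] = s->mem[i];
		s->valid[i] = true;
	}
	return s->cache[i];
}

/* CPU write: lands in the cache only and marks the word dirty. */
static void toy_cpu_write(struct toy_sys *s, int i, int v)
{
	assert(i >= 0 && i < TOY_WORDS);
	s->cache[i] = v;
	s->valid[i] = true;
	s->dirty[i] = true;
}

/* Device accesses bypass the cache (no IOC in this model). */
static void toy_dev_write(struct toy_sys *s, int i, int v) { s->mem[i] = v; }
static int toy_dev_read(const struct toy_sys *s, int i) { return s->mem[i]; }

/* Writeback: push dirty words to memory. */
static void toy_wback(struct toy_sys *s)
{
	for (int i = 0; i < TOY_WORDS; i++) {
		if (s->valid[i] && s->dirty[i]) {
			s->mem[i] = s->cache[i];
			s->dirty[i] = false;
		}
	}
}

/* Invalidate: drop cached copies without writing them back. */
static void toy_inv(struct toy_sys *s)
{
	memset(s->valid, 0, sizeof(s->valid));
	memset(s->dirty, 0, sizeof(s->dirty));
}

/* FROM_DEVICE: the device fills the buffer, then the CPU reads it. */
int toy_from_device(bool do_inv)
{
	struct toy_sys s;

	toy_init(&s);
	(void)toy_cpu_read(&s, 0);	/* cache now holds the old value 0 */
	toy_dev_write(&s, 0, 42);	/* DMA writes fresh data to memory */
	if (do_inv)
		toy_inv(&s);
	return toy_cpu_read(&s, 0);	/* stale 0 unless invalidated */
}

/* TO_DEVICE: the CPU fills the buffer, then the device reads it. */
int toy_to_device(bool do_wback)
{
	struct toy_sys s;

	toy_init(&s);
	toy_cpu_write(&s, 0, 7);	/* new data sits dirty in the cache */
	if (do_wback)
		toy_wback(&s);
	return toy_dev_read(&s, 0);	/* old 0 unless written back */
}
```

With a writeback alone, as the pre-patch for_device path did, the FROM_DEVICE
scenario still returns the stale cached value, which is consistent with the
breakage reported in the commit message.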