[v4,6/7] mtd: nand: omap2: Fix high memory dma prefetch transfer

Message ID 1457654203-20856-7-git-send-email-fcooper@ti.com
State Superseded

Commit Message

Franklin S Cooper Jr March 10, 2016, 11:56 p.m. UTC
Based on DMA documentation and testing, using a high memory buffer for
DMA transfers can lead to various issues, including kernel panics.

To work around this, simply use CPU copy. High memory buffers are very
uncommon, so no noticeable performance hit should be seen.

Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
---
 drivers/mtd/nand/omap2.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

Comments

Boris Brezillon March 21, 2016, 3:04 p.m. UTC | #1
Hi Franklin,

On Thu, 10 Mar 2016 17:56:42 -0600
Franklin S Cooper Jr <fcooper@ti.com> wrote:

> Based on DMA documentation and testing using high memory buffer when
> doing dma transfers can lead to various issues including kernel
> panics.

I guess it all comes from the vmalloced buffer case: such buffers are
not guaranteed to be physically contiguous (one of the DMA requirements,
unless you have an IOMMU).

> 
> To workaround this simply use cpu copy. The amount of high memory
> buffers used are very uncommon so no noticeable performance hit should
> be seen.

Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
using vmalloc (vmalloced buffers fall in the high_memory region), and
those are likely to be discontiguous if you have NANDs with pages > 4k.
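
To illustrate the page-size point, here is a minimal userspace sketch,
assuming 4 KiB pages, that counts how many pages a buffer touches. The
names SKETCH_PAGE_SIZE, sketch_pages_spanned() and sketch_crosses_page()
are hypothetical and not part of the driver. The crossing test follows
the same idea as the PAGE_MASK comparison in the code this patch
removes. A vmalloced buffer that touches more than one page may be
backed by physically scattered pages, so a single-entry scatterlist
cannot describe it.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The real PAGE_SIZE is platform specific. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * Overflow of addr + len is not handled in this sketch.
 */
static size_t sketch_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + len - 1) / SKETCH_PAGE_SIZE;

	return (size_t)(last - first + 1);
}

/* Non-zero when the range does not fit inside a single page. */
static int sketch_crosses_page(uintptr_t addr, size_t len)
{
	return sketch_pages_spanned(addr, len) > 1;
}
```

With these assumptions, a page-aligned 8 KiB NAND page buffer spans two
pages, so it would fail the old single-page test and fall back to the
CPU copy path.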

I recently posted patches to ease sg_table creation from any kind of
virtual address [1][2]. Can you try them and let me know if it fixes
your problem?

Thanks,

Boris

[1]https://lkml.org/lkml/2016/3/8/276
[2]https://lkml.org/lkml/2016/3/8/277
Franklin S Cooper Jr April 13, 2016, 8:08 p.m. UTC | #2
On 03/21/2016 10:04 AM, Boris Brezillon wrote:
> Hi Franklin,
> 
> On Thu, 10 Mar 2016 17:56:42 -0600
> Franklin S Cooper Jr <fcooper@ti.com> wrote:
> 
>> Based on DMA documentation and testing using high memory buffer when
>> doing dma transfers can lead to various issues including kernel
>> panics.
> 
> I guess it all comes from the vmalloced buffer case, which are not
> guaranteed to be physically contiguous (one of the DMA requirement,
> unless you have an iommu).
> 
>>
>> To workaround this simply use cpu copy. The amount of high memory
>> buffers used are very uncommon so no noticeable performance hit should
>> be seen.
> 
> Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
> using vmalloc (vmalloced buffers fall in the high_memory region), and
> those are likely to be dis-contiguous if you have NANDs with pages > 4k.
> 
> I recently posted patches to ease sg_table creation from any kind of
> virtual address [1][2]. Can you try them and let me know if it fixes
> your problem?

It looks like you won't be going forward with your patchset based on
this thread [1]. I can probably reword the patch description to avoid
implying that it is uncommon to run into high mem buffers. Also, DMA with
NAND prefetch suffers from reduced performance compared to CPU
polling with prefetch. This is largely due to the significant overhead
required to read such a small amount of data at a time. The
optimizations I've worked on all revolve around reducing the cycles
spent before executing the DMA request. Trying to make a high memory
buffer usable by the DMA adds a significant number of cycles, and
you're better off just using the CPU for performance reasons.

[1]https://lkml.org/lkml/2016/4/4/346
> 
> Thanks,
> 
> Boris
> 
> [1]https://lkml.org/lkml/2016/3/8/276
> [2]https://lkml.org/lkml/2016/3/8/277
> 
>
Boris Brezillon April 13, 2016, 8:24 p.m. UTC | #3
Hi Franklin,

On Wed, 13 Apr 2016 15:08:12 -0500
"Franklin S Cooper Jr." <fcooper@ti.com> wrote:

> 
> 
> On 03/21/2016 10:04 AM, Boris Brezillon wrote:
> > Hi Franklin,
> > 
> > On Thu, 10 Mar 2016 17:56:42 -0600
> > Franklin S Cooper Jr <fcooper@ti.com> wrote:
> > 
> >> Based on DMA documentation and testing using high memory buffer when
> >> doing dma transfers can lead to various issues including kernel
> >> panics.
> > 
> > I guess it all comes from the vmalloced buffer case, which are not
> > guaranteed to be physically contiguous (one of the DMA requirement,
> > unless you have an iommu).
> > 
> >>
> >> To workaround this simply use cpu copy. The amount of high memory
> >> buffers used are very uncommon so no noticeable performance hit should
> >> be seen.
> > 
> > Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
> > using vmalloc (vmalloced buffers fall in the high_memory region), and
> > those are likely to be dis-contiguous if you have NANDs with pages > 4k.
> > 
> > I recently posted patches to ease sg_table creation from any kind of
> > virtual address [1][2]. Can you try them and let me know if it fixes
> > your problem?
> 
> It looks like you won't be going forward with your patchset based on
> this thread [1].

Nope. According to Russell it's unsafe to do that.

> I can probably reword the patch description to avoid
> implying that it is uncommon to run into high mem buffers. Also DMA with
> NAND prefetch suffers from a reduction of performance compared to CPU
> polling with prefetch. This is largely due to the significant over head
> required to read such a small amount of data at a time. The
> optimizations I've worked on all revolved around reducing the cycles
> spent before executing the DMA request. Trying to make a high memory
> buffer able to be used by the DMA adds significant amount of cycles and
> your better off just using the cpu for performance reasons.

Okay.
One comment though: why not use virt_addr_valid() instead of
addr >= high_memory here?
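
To make the difference concrete, here is a rough userspace model of the
two checks. It assumes a 32-bit layout where the linear (lowmem) mapping
covers [PAGE_OFFSET, high_memory). The constants and the model_*() names
are hypothetical, and the real virt_addr_valid() is architecture
specific and may check more than this (for example that the page frame
is valid).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; real values differ per platform. */
#define MODEL_PAGE_OFFSET 0xC0000000u
#define MODEL_HIGH_MEMORY 0xEF800000u

/* Models the existing test: DMA unless addr >= high_memory. */
static bool model_dma_ok_high_memory(uint32_t addr)
{
	return addr < MODEL_HIGH_MEMORY;
}

/*
 * Models a virt_addr_valid()-style test: DMA only when the address
 * lies inside the assumed linear mapping.
 */
static bool model_dma_ok_virt_addr_valid(uint32_t addr)
{
	return addr >= MODEL_PAGE_OFFSET && addr < MODEL_HIGH_MEMORY;
}
```

Under this model, both checks accept a lowmem address and reject a
vmalloc address, but only the second one also rejects an address below
PAGE_OFFSET. In the driver, the check would presumably become
"if (!virt_addr_valid(addr)) goto out_copy;".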

Best Regards,

Boris
Franklin S Cooper Jr April 13, 2016, 9:11 p.m. UTC | #4
On 04/13/2016 03:24 PM, Boris Brezillon wrote:
> Hi Franklin,
> 
> On Wed, 13 Apr 2016 15:08:12 -0500
> "Franklin S Cooper Jr." <fcooper@ti.com> wrote:
> 
>>
>>
>> On 03/21/2016 10:04 AM, Boris Brezillon wrote:
>>> Hi Franklin,
>>>
>>> On Thu, 10 Mar 2016 17:56:42 -0600
>>> Franklin S Cooper Jr <fcooper@ti.com> wrote:
>>>
>>>> Based on DMA documentation and testing using high memory buffer when
>>>> doing dma transfers can lead to various issues including kernel
>>>> panics.
>>>
>>> I guess it all comes from the vmalloced buffer case, which are not
>>> guaranteed to be physically contiguous (one of the DMA requirement,
>>> unless you have an iommu).
>>>
>>>>
>>>> To workaround this simply use cpu copy. The amount of high memory
>>>> buffers used are very uncommon so no noticeable performance hit should
>>>> be seen.
>>>
>>> Hm, that's not necessarily true. UBI and UBIFS allocate their buffers
>>> using vmalloc (vmalloced buffers fall in the high_memory region), and
>>> those are likely to be dis-contiguous if you have NANDs with pages > 4k.
>>>
>>> I recently posted patches to ease sg_table creation from any kind of
>>> virtual address [1][2]. Can you try them and let me know if it fixes
>>> your problem?
>>
>> It looks like you won't be going forward with your patchset based on
>> this thread [1].
> 
> Nope. According to Russell it's unsafe to do that.
> 
>> I can probably reword the patch description to avoid
>> implying that it is uncommon to run into high mem buffers. Also DMA with
>> NAND prefetch suffers from a reduction of performance compared to CPU
>> polling with prefetch. This is largely due to the significant over head
>> required to read such a small amount of data at a time. The
>> optimizations I've worked on all revolved around reducing the cycles
>> spent before executing the DMA request. Trying to make a high memory
>> buffer able to be used by the DMA adds significant amount of cycles and
>> your better off just using the cpu for performance reasons.
> 
> Okay.
> One comment though, why not using virt_addr_valid() instead of
> addr >= high_memory here?


I had no reason other than simply using the approach already used in the
driver. virt_addr_valid() looks like it will work, so I'll make the switch
after testing it.
> 
> Best Regards,
> 
> Boris
> 
>

Patch

diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index 0863a83..22b0112 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -467,17 +467,8 @@  static inline int omap_nand_dma_transfer(struct mtd_info *mtd, void *addr,
 	int ret;
 	u32 val;
 
-	if (addr >= high_memory) {
-		struct page *p1;
-
-		if (((size_t)addr & PAGE_MASK) !=
-			((size_t)(addr + len - 1) & PAGE_MASK))
-			goto out_copy;
-		p1 = vmalloc_to_page(addr);
-		if (!p1)
-			goto out_copy;
-		addr = page_address(p1) + ((size_t)addr & ~PAGE_MASK);
-	}
+	if (addr >= high_memory)
+		goto out_copy;
 
 	sg_init_one(&sg, addr, len);
 	n = dma_map_sg(info->dma->device->dev, &sg, 1, dir);
@@ -534,6 +525,7 @@  out_copy:
 	else
 		is_write == 0 ? omap_read_buf8(mtd, (u_char *) addr, len)
 			: omap_write_buf8(mtd, (u_char *) addr, len);
+
 	return 0;
 }