Message ID | 4D8AFE45.8050609@atmel.com
---|---
State | RFC
On Thu, Mar 24, 2011 at 04:18:13PM +0800, Nicolas Ferre wrote:
> diff --git a/drivers/spi/atmel_spi.c b/drivers/spi/atmel_spi.c
> --- a/drivers/spi/atmel_spi.c
> +++ b/drivers/spi/atmel_spi.c
> @@ -647,6 +647,22 @@ static void atmel_spi_next_message(struct spi_master *master)
>  	atmel_spi_next_xfer(master, msg);
>  }
>  
> +static void *adjust_buffer_location(struct device *dev, void *buf)
> +{
> +	if (likely(buf < high_memory)) {
> +		return buf;
> +	} else {
> +		struct page *pg;
> +
> +		pg = vmalloc_to_page(buf);
> +		if (pg == 0) {
> +			dev_err(dev, "failed to vmalloc_to_page\n");
> +			return NULL;
> +		}
> +		return page_address(pg) + ((size_t)buf & ~PAGE_MASK);
> +	}
> +}
> +

This really doesn't fix the problem. If the page is read or written via
the vmalloc mapping, you'll have stale data.

DMA to vmalloc areas is dodgy at best.
On Thu, 2011-03-24 at 08:25 +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 24, 2011 at 04:18:13PM +0800, Nicolas Ferre wrote:
> > [adjust_buffer_location() patch snipped]
>
> This really doesn't fix the problem. If the page is read or written via
> the vmalloc mapping, you'll have stale data.
>
> DMA to vmalloc areas is dodgy at best.

This topic pops up often. So what is the right fix? And sorry for my
ignorance.
On Thu, Mar 24, 2011 at 10:36:18AM +0200, Artem Bityutskiy wrote:
> On Thu, 2011-03-24 at 08:25 +0000, Russell King - ARM Linux wrote:
> > [adjust_buffer_location() patch snipped]
> >
> > This really doesn't fix the problem. If the page is read or written via
> > the vmalloc mapping, you'll have stale data.
> >
> > DMA to vmalloc areas is dodgy at best.
>
> This topic pops up often. So what is the right fix? And sorry for my
> ignorance.

I have no idea. The problem comes down to MTD's interfaces being designed
only for PIO - it's expected that the CPU places data into the virtual
address being passed. This virtual address can be a vmalloc address, a
kmalloc'd address or a page address.

As long as that happens, everything works because there are no cache
aliasing problems to consider: the buffers are always accessed through the
same mappings.

As soon as there's any attempt to move away from PIO, things get really
sticky because, with *any* DMA-noncoherent architecture (it's not limited
to ARM), you run into issues with cache coherency causing data corruption.

When cache maintenance ends up having to deal with virtual addresses for
one level of cache and physical addresses for subsequent levels, the
problems are compounded even more.
The only real answer I can give is: if you want to deal with DMA, you
absolutely must conform to the restrictions on DMA, which means that you
can't pass vmalloc addresses to the DMA API.

You can't work around that by converting a vmalloc address to a physical
address and then to its corresponding virtual page address, as that means
the DMA API will flush the wrong virtual address range and you'll again
have cache incoherency. That goes for *all* of the DMA API interfaces.
On Thu, Mar 24, 2011 at 09:27:32AM +0000, Russell King - ARM Linux wrote:
> You can't work around that by converting a vmalloc address to a physical
> address and then its corresponding virtual page address as that means
> the DMA API will flush the wrong virtual address range and you'll again
> have cache incoherency. That goes for *all* of the DMA API interfaces.

I should add - from asm-generic/dma-mapping-common.h:

	static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
						      size_t size,
						      enum dma_data_direction dir,
						      struct dma_attrs *attrs)
	{
		struct dma_map_ops *ops = get_dma_ops(dev);
		dma_addr_t addr;

		kmemcheck_mark_initialized(ptr, size);
		BUG_ON(!valid_dma_direction(dir));
		addr = ops->map_page(dev, virt_to_page(ptr),
				     (unsigned long)ptr & ~PAGE_MASK,
				     size, dir, attrs);

And note that virt_to_page() only works for addresses in the kernel's
direct-mapped memory region (iow, kmalloc et al., not vmalloc).
The above will probably go wrong silently (or maybe oops if you're lucky) rather than BUG. The difference is that on ARM we ensure that such bad cases are always caught.
On Thu, 2011-03-24 at 09:27 +0000, Russell King - ARM Linux wrote:
> The only real answer I can give is: if you want to deal with DMA, you
> absolutely must conform to the restrictions on DMA which means that you
> can't pass vmalloc addresses to the DMA API.

I see, thanks.

Well, I do not see any issues with changing MTD-related software (JFFS2,
UBI, UBIFS, etc.) to use kmalloc. But this would require some non-trivial
effort, although I believe it is doable.

Basically, we often have to work with entire eraseblocks, which may be
128KiB or 256KiB or even 512KiB nowadays, and we vmalloc the
eraseblock-sized buffers in these cases. We could instead work with arrays
of pages (or multiple pages), and then do things like readv/writev. But
this requires a brave knight who'd come and just implement this, or a
company who'd fund someone to work on it.
diff --git a/drivers/spi/atmel_spi.c b/drivers/spi/atmel_spi.c
--- a/drivers/spi/atmel_spi.c
+++ b/drivers/spi/atmel_spi.c
@@ -647,6 +647,22 @@ static void atmel_spi_next_message(struct spi_master *master)
 	atmel_spi_next_xfer(master, msg);
 }
 
+static void *adjust_buffer_location(struct device *dev, void *buf)
+{
+	if (likely(buf < high_memory)) {
+		return buf;
+	} else {
+		struct page *pg;
+
+		pg = vmalloc_to_page(buf);
+		if (pg == 0) {
+			dev_err(dev, "failed to vmalloc_to_page\n");
+			return NULL;
+		}
+		return page_address(pg) + ((size_t)buf & ~PAGE_MASK);
+	}
+}
+
 /*
  * For DMA, tx_buf/tx_dma have the same relationship as rx_buf/rx_dma:
  * - The buffer is either valid for CPU access, else NULL
@@ -658,23 +674,24 @@ static int atmel_spi_dma_map_xfer(struct atmel_spi *as,
 				struct spi_transfer *xfer)
 {
 	struct device	*dev = &as->pdev->dev;
+	void		*ptr;
 
 	xfer->tx_dma = xfer->rx_dma = INVALID_DMA_ADDRESS;
 	if (xfer->tx_buf) {
-		/* tx_buf is a const void* where we need a void * for the dma
-		 * mapping */
-		void *nonconst_tx = (void *)xfer->tx_buf;
-
-		xfer->tx_dma = dma_map_single(dev,
-				nonconst_tx, xfer->len,
-				DMA_TO_DEVICE);
+		ptr = adjust_buffer_location(dev, (void *)xfer->tx_buf);
+		if (ptr)
+			xfer->tx_dma = dma_map_single(dev,
+					ptr, xfer->len,
+					DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, xfer->tx_dma))
 			return -ENOMEM;
 	}
 	if (xfer->rx_buf) {
-		xfer->rx_dma = dma_map_single(dev,
-				xfer->rx_buf, xfer->len,
-				DMA_FROM_DEVICE);
+		ptr = adjust_buffer_location(dev, xfer->rx_buf);
+		if (ptr)
+			xfer->rx_dma = dma_map_single(dev,
+					ptr, xfer->len,
+					DMA_FROM_DEVICE);
 		if (dma_mapping_error(dev, xfer->rx_dma)) {
 			if (xfer->tx_buf)
 				dma_unmap_single(dev,
diff --git a/arch/arm/mach-at91/board-