Patchwork [01/11] async_tx: don't use src_list argument of async_xor() for dma addresses

Submitter Ilya Yanok
Date Nov. 13, 2008, 3:15 p.m.
Message ID <1226589364-5619-2-git-send-email-yanok@emcraft.com>
Permalink /patch/8578/
State Superseded, archived

Comments

Ilya Yanok - Nov. 13, 2008, 3:15 p.m.
Using the src_list argument of async_xor() as storage for dma addresses
implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
is not always true.

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
---
 crypto/async_tx/async_xor.c |   14 ++------------
 1 files changed, 2 insertions(+), 12 deletions(-)
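The restriction the commit message refers to can be illustrated with a small
user-space sketch (the types and helper below are illustrative stand-ins,
not kernel code): with CONFIG_HIGHMEM64G on 32-bit x86, dma_addr_t is 64-bit
while struct page * is 32-bit, so a dma address no longer fits in a src_list
slot.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch (user space, not kernel code): reusing the
 * struct page * array in place as dma_addr_t storage is only safe
 * when a dma address fits in a pointer-sized slot.
 */
static int inplace_reuse_is_safe(size_t sizeof_dma_addr,
                                 size_t sizeof_page_ptr)
{
    return sizeof_dma_addr <= sizeof_page_ptr;
}
```

On a 64-bit kernel both sizes are 8, so the trick works; on 32-bit with
64-bit DMA addressing it does not, which is the case the BUILD_BUG_ON
removed by this patch was guarding against.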
Dan Williams - Nov. 15, 2008, 12:42 a.m.
On Thu, Nov 13, 2008 at 8:15 AM, Ilya Yanok <yanok@emcraft.com> wrote:
> Using the src_list argument of async_xor() as storage for dma addresses
> implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
> is not always true.
>
> Signed-off-by: Ilya Yanok <yanok@emcraft.com>
> ---

I don't like the stack space implications of this change.  Especially
for large arrays we will be carrying two 'src_cnt'-sized arrays on the
stack, one from MD and one from async_tx.  However, I think the
current scheme of overwriting input parameters is pretty ugly.  So, I
want to benchmark the performance implications of adding a GFP_NOIO
allocation here, with the idea being that if the allocation fails we
can still fall back to the synchronous code path.

--
Dan
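The scheme Dan describes could look roughly like the following user-space
sketch, with malloc() standing in for kmalloc(..., GFP_NOIO); the function
names here are hypothetical, not the kernel's actual symbols:

```c
#include <stdlib.h>

/* Hypothetical sketch of the allocation scheme described above:
 * try to allocate the dma address array; on failure, tell the
 * caller to take the synchronous (CPU xor) path rather than
 * failing the operation outright.
 */
typedef unsigned long long demo_dma_addr_t;  /* stand-in for dma_addr_t */

static demo_dma_addr_t *alloc_dma_src(unsigned int src_cnt)
{
    /* in the kernel: kmalloc(src_cnt * sizeof(dma_addr_t), GFP_NOIO) */
    return malloc(src_cnt * sizeof(demo_dma_addr_t));
}

static int do_xor(unsigned int src_cnt)
{
    demo_dma_addr_t *dma_src = alloc_dma_src(src_cnt);

    if (!dma_src)
        return 1;   /* allocation failed: fall back to the sync path */

    /* ...map the pages into dma_src[] and submit the DMA xor... */
    free(dma_src);
    return 0;       /* hardware path taken */
}
```

The appeal of this shape is that memory pressure degrades performance
(CPU xor) rather than correctness, at the cost of an allocation on the
hot path.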
Benjamin Herrenschmidt - Nov. 15, 2008, 7:12 a.m.
On Fri, 2008-11-14 at 17:42 -0700, Dan Williams wrote:
> I don't like the stack space implications of this change.  Especially
> for large arrays we will be carrying two 'src_cnt' size arrays on the
> stack, one from MD and one from async_tx.  However, I think the
> current scheme of overwriting input parameters is pretty ugly. 

Well, it's also broken :-) On a number of architectures, dma_addr_t can
be 64-bit while struct page * is 32-bit.

>  So, I
> want to benchmark the performance implications of adding a GFP_NOIO
> allocation here, with the idea being that if the allocation fails we
> can still fallback to the synchronous code path.

Patch

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index c029d3e..00c74c5 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -42,7 +42,7 @@  do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 	     dma_async_tx_callback cb_fn, void *cb_param)
 {
 	struct dma_device *dma = chan->device;
-	dma_addr_t *dma_src = (dma_addr_t *) src_list;
+	dma_addr_t dma_src[src_cnt];
 	struct dma_async_tx_descriptor *tx = NULL;
 	int src_off = 0;
 	int i;
@@ -247,7 +247,7 @@  async_xor_zero_sum(struct page *dest, struct page **src_list,
 	BUG_ON(src_cnt <= 1);
 
 	if (device && src_cnt <= device->max_xor) {
-		dma_addr_t *dma_src = (dma_addr_t *) src_list;
+		dma_addr_t dma_src[src_cnt];
 		unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
 		int i;
 
@@ -296,16 +296,6 @@  EXPORT_SYMBOL_GPL(async_xor_zero_sum);
 
 static int __init async_xor_init(void)
 {
-	#ifdef CONFIG_DMA_ENGINE
-	/* To conserve stack space the input src_list (array of page pointers)
-	 * is reused to hold the array of dma addresses passed to the driver.
-	 * This conversion is only possible when dma_addr_t is less than the
-	 * the size of a pointer.  HIGHMEM64G is known to violate this
-	 * assumption.
-	 */
-	BUILD_BUG_ON(sizeof(dma_addr_t) > sizeof(struct page *));
-	#endif
-
 	return 0;
 }