Patchwork [01/11] async_tx: don't use src_list argument of async_xor() for dma addresses

Submitter Yuri Tikhonov
Date Dec. 8, 2008, 9:55 p.m.
Message ID <200812090055.26721.yur@emcraft.com>
Permalink /patch/12853/
State Not Applicable

Comments

Yuri Tikhonov - Dec. 8, 2008, 9:55 p.m.
Using the src_list argument of async_xor() as storage for DMA addresses
implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
does not always hold (e.g. on ppc440spe).

Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
---
 crypto/async_tx/async_xor.c |   14 ++------------
 1 files changed, 2 insertions(+), 12 deletions(-)
Dan Williams - Dec. 9, 2008, 12:31 a.m.
On Mon, Dec 8, 2008 at 2:55 PM, Yuri Tikhonov <yur@emcraft.com> wrote:
> Using the src_list argument of async_xor() as storage for DMA addresses
> implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
> does not always hold (e.g. on ppc440spe).
>

ppc440spe runs with CONFIG_PHYS_64BIT?

If we do this, then we also need to change md to limit the number of
allowed disks based on the kernel stack size: with 256 disks, a single
call to async_pq can consume a 4K stack ((256 sources in raid5.c + 256
sources in async_pq.c) * 8 bytes per source on 64-bit).

Regards,
Dan
Yuri Tikhonov - Dec. 9, 2008, 12:41 a.m.
On Tuesday, December 9, 2008 you wrote:

> On Mon, Dec 8, 2008 at 2:55 PM, Yuri Tikhonov <yur@emcraft.com> wrote:
>> Using the src_list argument of async_xor() as storage for DMA addresses
>> implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
>> does not always hold (e.g. on ppc440spe).
>>

> ppc440spe runs with CONFIG_PHYS_64BIT?

 Yep. It uses 36-bit addressing, so this CONFIG is turned on.

> If we do this, then we also need to change md to limit the number of
> allowed disks based on the kernel stack size: with 256 disks, a single
> call to async_pq can consume a 4K stack ((256 sources in raid5.c + 256
> sources in async_pq.c) * 8 bytes per source on 64-bit).

 On ppc440spe we have an 8KB stack, so things are no worse than on
32-bit arches with a 4KB stack. Thus, I guess no changes to md are
required because of this patch. Right?

 Regards, Yuri

 --
 Yuri Tikhonov, Senior Software Engineer
 Emcraft Systems, www.emcraft.com
Dan Williams - Dec. 10, 2008, 1:08 a.m.
On Mon, Dec 8, 2008 at 5:41 PM, Yuri Tikhonov <yur@emcraft.com> wrote:
> On Tuesday, December 9, 2008 you wrote:
>
>> On Mon, Dec 8, 2008 at 2:55 PM, Yuri Tikhonov <yur@emcraft.com> wrote:
>>> Using the src_list argument of async_xor() as storage for DMA addresses
>>> implies the restriction sizeof(dma_addr_t) <= sizeof(struct page *), which
>>> does not always hold (e.g. on ppc440spe).
>>>
>
>> ppc440spe runs with CONFIG_PHYS_64BIT?
>
>  Yep. It uses 36-bit addressing, so this CONFIG is turned on.
>
>> If we do this, then we also need to change md to limit the number of
>> allowed disks based on the kernel stack size: with 256 disks, a single
>> call to async_pq can consume a 4K stack ((256 sources in raid5.c + 256
>> sources in async_pq.c) * 8 bytes per source on 64-bit).
>
>  On ppc440spe we have an 8KB stack, so things are no worse than on
> 32-bit arches with a 4KB stack. Thus, I guess no changes to md are
> required because of this patch. Right?

8K stacks do make this less of an issue, *provided* handle_stripe()
remains called only from raid5d.  We used to share some stripe
handling work with the requester's process context, where the stack is
much more crowded.  So, we would now be more strongly tied to the
raid5d-only approach... maybe that is not enough to deny this change.
Neil, what do you think of the async_{xor,pq,etc} APIs allocating
'src_cnt'-sized arrays on the stack?

Thanks,
Dan

Patch

diff --git a/crypto/async_tx/async_xor.c b/crypto/async_tx/async_xor.c
index c029d3e..00c74c5 100644
--- a/crypto/async_tx/async_xor.c
+++ b/crypto/async_tx/async_xor.c
@@ -42,7 +42,7 @@  do_async_xor(struct dma_chan *chan, struct page *dest, struct page **src_list,
 	     dma_async_tx_callback cb_fn, void *cb_param)
 {
 	struct dma_device *dma = chan->device;
-	dma_addr_t *dma_src = (dma_addr_t *) src_list;
+	dma_addr_t dma_src[src_cnt];
 	struct dma_async_tx_descriptor *tx = NULL;
 	int src_off = 0;
 	int i;
@@ -247,7 +247,7 @@  async_xor_zero_sum(struct page *dest, struct page **src_list,
 	BUG_ON(src_cnt <= 1);
 
 	if (device && src_cnt <= device->max_xor) {
-		dma_addr_t *dma_src = (dma_addr_t *) src_list;
+		dma_addr_t dma_src[src_cnt];
 		unsigned long dma_prep_flags = cb_fn ? DMA_PREP_INTERRUPT : 0;
 		int i;
 
@@ -296,16 +296,6 @@  EXPORT_SYMBOL_GPL(async_xor_zero_sum);
 
 static int __init async_xor_init(void)
 {
-	#ifdef CONFIG_DMA_ENGINE
-	/* To conserve stack space the input src_list (array of page pointers)
-	 * is reused to hold the array of dma addresses passed to the driver.
-	 * This conversion is only possible when dma_addr_t is less than the
-	 * the size of a pointer.  HIGHMEM64G is known to violate this
-	 * assumption.
-	 */
-	BUILD_BUG_ON(sizeof(dma_addr_t) > sizeof(struct page *));
-	#endif
-
 	return 0;
 }