Message ID | 20100924212443.GA24654@ovro.caltech.edu (mailing list archive)
State | Superseded
On Fri, 2010-09-24 at 14:24 -0700, Ira W. Snyder wrote:
> On Fri, Sep 24, 2010 at 01:40:56PM -0700, Dan Williams wrote:
> > I don't think any dma channels gracefully handle descriptors that were
> > prepped but not submitted. You would probably need to submit the
> > backlog, poll for completion, and then return the error.
> > Alternatively, the expectation is that descriptor allocations are
> > transient, i.e. once previously submitted transactions are completed
> > the descriptors will return to the available pool. So you could do
> > what async_tx routines do and just poll for a descriptor.
> >
>
> Can you give me an example? Even some pseudocode would help.

Here is one from do_async_gen_syndrome() in crypto/async_tx/async_pq.c:

        /* Since we have clobbered the src_list we are committed
         * to doing this asynchronously.  Drivers force forward
         * progress in case they can not provide a descriptor
         */
        for (;;) {
                tx = dma->device_prep_dma_pq(chan, dma_dest,
                                             &dma_src[src_off],
                                             pq_src_cnt,
                                             &coefs[src_off], len,
                                             dma_flags);
                if (likely(tx))
                        break;
                async_tx_quiesce(&submit->depend_tx);
                dma_async_issue_pending(chan);
        }

> The other DMAEngine functions (dma_async_memcpy_*()) don't do anything
> with the descriptor if submit fails. Take for example
> dma_async_memcpy_buf_to_buf(). If tx->tx_submit(tx); fails, any code
> using it has no way to return the descriptor to the free pool.
>
> Does tx_submit() implicitly return descriptors to the free pool if it
> fails?

No, submit() failures are a holdover from when the ioatdma driver used
to perform additional descriptor allocation at ->submit() time. After
prep() the expectation is that the engine is just waiting to be told
"go" and can't fail. The only reason ->submit() retains a return code
is to support the "cookie" based method for polling for operation
completion. A dma driver should handle all descriptor submission
failure scenarios at prep time.

> Ok, I thought the list was clearer, but this is equally easy. How about
> the following change that does away with the list completely. Then
> things should work on ioatdma as well.
>
> From d59569ff48a89ef5411af3cf2995af7b742c5cd3 Mon Sep 17 00:00:00 2001
> From: Ira W. Snyder <iws@ovro.caltech.edu>
> Date: Fri, 24 Sep 2010 14:18:09 -0700
> Subject: [PATCH] dma: improve scatterlist to scatterlist transfer
>
> This is an improved algorithm to better support the Intel I/OAT
> driver.
>
> Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
> ---
>  drivers/dma/dmaengine.c   |   52 +++++++++++++++++++++-----------------------
>  include/linux/dmaengine.h |    3 --
>  2 files changed, 25 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> index 57ec1e5..cde775c 100644
> --- a/drivers/dma/dmaengine.c
> +++ b/drivers/dma/dmaengine.c
> @@ -983,10 +983,13 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
>  	struct dma_async_tx_descriptor *tx;
>  	dma_cookie_t cookie = -ENOMEM;
>  	size_t dst_avail, src_avail;
> -	struct list_head tx_list;
> +	struct scatterlist *sg;
>  	size_t transferred = 0;
> +	size_t dst_total = 0;
> +	size_t src_total = 0;
>  	dma_addr_t dst, src;
>  	size_t len;
> +	int i;
>
>  	if (dst_nents == 0 || src_nents == 0)
>  		return -EINVAL;
>
> @@ -994,12 +997,17 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
>  	if (dst_sg == NULL || src_sg == NULL)
>  		return -EINVAL;
>
> +	/* get the total count of bytes in each scatterlist */
> +	for_each_sg(dst_sg, sg, dst_nents, i)
> +		dst_total += sg_dma_len(sg);
> +
> +	for_each_sg(src_sg, sg, src_nents, i)
> +		src_total += sg_dma_len(sg);
> +

What about overrun or underrun? Do we not care if src_total != dst_total?

Otherwise looks ok.
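(If a mismatch between the two totals were to be treated as a hard
error, a check along these lines could follow the two totalling loops.
This is a sketch only, not part of the posted patch; as the reply below
explains, the function instead copies exactly min(src_total, dst_total):)

        /* Sketch: optional strictness check, rejecting the transfer up
         * front when the scatterlists disagree about the byte count.
         */
        if (dst_total != src_total)
                return -EINVAL;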
On Fri, Sep 24, 2010 at 02:53:14PM -0700, Dan Williams wrote:
> On Fri, 2010-09-24 at 14:24 -0700, Ira W. Snyder wrote:
> > On Fri, Sep 24, 2010 at 01:40:56PM -0700, Dan Williams wrote:
> > > I don't think any dma channels gracefully handle descriptors that were
> > > prepped but not submitted. You would probably need to submit the
> > > backlog, poll for completion, and then return the error.
> > > Alternatively, the expectation is that descriptor allocations are
> > > transient, i.e. once previously submitted transactions are completed
> > > the descriptors will return to the available pool. So you could do
> > > what async_tx routines do and just poll for a descriptor.
> > >
> >
> > Can you give me an example? Even some pseudocode would help.
>
> Here is one from do_async_gen_syndrome() in crypto/async_tx/async_pq.c:
>
>         /* Since we have clobbered the src_list we are committed
>          * to doing this asynchronously.  Drivers force forward
>          * progress in case they can not provide a descriptor
>          */
>         for (;;) {
>                 tx = dma->device_prep_dma_pq(chan, dma_dest,
>                                              &dma_src[src_off],
>                                              pq_src_cnt,
>                                              &coefs[src_off], len,
>                                              dma_flags);
>                 if (likely(tx))
>                         break;
>                 async_tx_quiesce(&submit->depend_tx);
>                 dma_async_issue_pending(chan);
>         }
>
> > The other DMAEngine functions (dma_async_memcpy_*()) don't do anything
> > with the descriptor if submit fails. Take for example
> > dma_async_memcpy_buf_to_buf(). If tx->tx_submit(tx); fails, any code
> > using it has no way to return the descriptor to the free pool.
> >
> > Does tx_submit() implicitly return descriptors to the free pool if it
> > fails?
>
> No, submit() failures are a holdover from when the ioatdma driver used
> to perform additional descriptor allocation at ->submit() time. After
> prep() the expectation is that the engine is just waiting to be told
> "go" and can't fail. The only reason ->submit() retains a return code
> is to support the "cookie" based method for polling for operation
> completion. A dma driver should handle all descriptor submission
> failure scenarios at prep time.
>

Ok, that's more like what I expected. So we still need the try-forever
code similar to the above. I can add that for the next version.

> > Ok, I thought the list was clearer, but this is equally easy. How about
> > the following change that does away with the list completely. Then
> > things should work on ioatdma as well.
> >
> > From d59569ff48a89ef5411af3cf2995af7b742c5cd3 Mon Sep 17 00:00:00 2001
> > From: Ira W. Snyder <iws@ovro.caltech.edu>
> > Date: Fri, 24 Sep 2010 14:18:09 -0700
> > Subject: [PATCH] dma: improve scatterlist to scatterlist transfer
> >
> > This is an improved algorithm to better support the Intel I/OAT
> > driver.
> >
> > Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
> > ---
> >  drivers/dma/dmaengine.c   |   52 +++++++++++++++++++++-----------------------
> >  include/linux/dmaengine.h |    3 --
> >  2 files changed, 25 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
> > index 57ec1e5..cde775c 100644
> > --- a/drivers/dma/dmaengine.c
> > +++ b/drivers/dma/dmaengine.c
> > @@ -983,10 +983,13 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
> >  	struct dma_async_tx_descriptor *tx;
> >  	dma_cookie_t cookie = -ENOMEM;
> >  	size_t dst_avail, src_avail;
> > -	struct list_head tx_list;
> > +	struct scatterlist *sg;
> >  	size_t transferred = 0;
> > +	size_t dst_total = 0;
> > +	size_t src_total = 0;
> >  	dma_addr_t dst, src;
> >  	size_t len;
> > +	int i;
> >
> >  	if (dst_nents == 0 || src_nents == 0)
> >  		return -EINVAL;
> >
> > @@ -994,12 +997,17 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
> >  	if (dst_sg == NULL || src_sg == NULL)
> >  		return -EINVAL;
> >
> > +	/* get the total count of bytes in each scatterlist */
> > +	for_each_sg(dst_sg, sg, dst_nents, i)
> > +		dst_total += sg_dma_len(sg);
> > +
> > +	for_each_sg(src_sg, sg, src_nents, i)
> > +		src_total += sg_dma_len(sg);
> > +
>
> What about overrun or underrun? Do we not care if src_total != dst_total?
>
> Otherwise looks ok.
>

I don't know if we should care about that. The algorithm handles that
case just fine. It copies the maximum amount it can, which is exactly
min(src_total, dst_total). Whichever scatterlist runs out of entries
first is the shortest.

As a real world example, my driver verifies that both scatterlists have
exactly the right number of bytes available before trying to program
the hardware.

Ira
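(For reference, the try-forever pattern under discussion, transplanted
into the prep step of dma_async_memcpy_sg_to_sg(), would look roughly
like the sketch below. The local names, dev, chan, dst, src, len, tx,
are the ones used in the patch; this is not code from the posted
series, and the dma_async_issue_pending() call has a side effect that
a later message in this thread calls out:)

        /* Sketch only: poll for a descriptor instead of failing with
         * -ENOMEM.  If prep cannot provide one, kick the engine so that
         * previously submitted transactions complete and return their
         * descriptors to the pool, then retry.
         */
        for (;;) {
                tx = dev->device_prep_dma_memcpy(chan, dst, src, len, 0);
                if (tx)
                        break;

                /* force forward progress */
                dma_async_issue_pending(chan);
        }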
On Fri, 2010-09-24 at 15:04 -0700, Ira W. Snyder wrote:
> On Fri, Sep 24, 2010 at 02:53:14PM -0700, Dan Williams wrote:
> > What about overrun or underrun? Do we not care if src_total != dst_total?
> >
> > Otherwise looks ok.
> >
>
> I don't know if we should care about that. The algorithm handles that
> case just fine. It copies the maximum amount it can, which is exactly
> min(src_total, dst_total). Whichever scatterlist runs out of entries
> first is the shortest.
>
> As a real world example, my driver verifies that both scatterlists have
> exactly the right number of bytes available before trying to program
> the hardware.

Ok, just handle the prep failure and I think we are good to go.

--
Dan
On Fri, Sep 24, 2010 at 03:04:19PM -0700, Ira W. Snyder wrote:
> On Fri, Sep 24, 2010 at 02:53:14PM -0700, Dan Williams wrote:
> > On Fri, 2010-09-24 at 14:24 -0700, Ira W. Snyder wrote:
> > > On Fri, Sep 24, 2010 at 01:40:56PM -0700, Dan Williams wrote:
> > > > I don't think any dma channels gracefully handle descriptors that were
> > > > prepped but not submitted. You would probably need to submit the
> > > > backlog, poll for completion, and then return the error.
> > > > Alternatively, the expectation is that descriptor allocations are
> > > > transient, i.e. once previously submitted transactions are completed
> > > > the descriptors will return to the available pool. So you could do
> > > > what async_tx routines do and just poll for a descriptor.
> > > >
> > >
> > > Can you give me an example? Even some pseudocode would help.
> >
> > Here is one from do_async_gen_syndrome() in crypto/async_tx/async_pq.c:
> >
> >         /* Since we have clobbered the src_list we are committed
> >          * to doing this asynchronously.  Drivers force forward
> >          * progress in case they can not provide a descriptor
> >          */
> >         for (;;) {
> >                 tx = dma->device_prep_dma_pq(chan, dma_dest,
> >                                              &dma_src[src_off],
> >                                              pq_src_cnt,
> >                                              &coefs[src_off], len,
> >                                              dma_flags);
> >                 if (likely(tx))
> >                         break;
> >                 async_tx_quiesce(&submit->depend_tx);
> >                 dma_async_issue_pending(chan);
> >         }
> >
> > > The other DMAEngine functions (dma_async_memcpy_*()) don't do anything
> > > with the descriptor if submit fails. Take for example
> > > dma_async_memcpy_buf_to_buf(). If tx->tx_submit(tx); fails, any code
> > > using it has no way to return the descriptor to the free pool.
> > >
> > > Does tx_submit() implicitly return descriptors to the free pool if it
> > > fails?
> >
> > No, submit() failures are a holdover from when the ioatdma driver used
> > to perform additional descriptor allocation at ->submit() time. After
> > prep() the expectation is that the engine is just waiting to be told
> > "go" and can't fail. The only reason ->submit() retains a return code
> > is to support the "cookie" based method for polling for operation
> > completion. A dma driver should handle all descriptor submission
> > failure scenarios at prep time.
> >
>
> Ok, that's more like what I expected. So we still need the try-forever
> code similar to the above. I can add that for the next version.
>

When coding this change, I've noticed one problem that would break my
driver. I cannot issue dma_async_issue_pending() on the channel while
creating the descriptors, since this will start transferring the
previously submitted DMA descriptors. This breaks the external hardware
control requirement.

Imagine this scenario:

1) device is not yet set up for external control (nothing is pulsing
   the pins)
2) dma_async_memcpy_sg_to_sg()
   - this hits an allocation failure, which calls dma_async_issue_pending()
   - this causes the DMA engine to start transferring to a device which
     is not ready yet
   - memory pressure stops, and allocation succeeds again
   - some descriptors have been transferred, but not the ones since the
     alloc failure
   - now the first half of the descriptors (pre alloc failure) have been
     transferred
   - the second half of the descriptors (post alloc failure) are still
     pending
   - dma_async_memcpy_sg_to_sg() returns success: all tx_submit() calls
     succeeded
3) device_control() - set up external control mode
4) dma_async_issue_pending() - start the externally controlled transfer
5) tell the external agent to start controlling the DMA transaction
   - now there isn't enough data left, and the external agent fails to
     program the FPGAs

I don't mind adding the retry code, since I have enough memory that I
don't ever see allocation failures. It is an embedded system, and we've
been careful not to overcommit memory. I think for all other users it
would be the appropriate thing to do. Most people don't care if the
scatterlist is copied in two chunks with a time gap in the middle.

An alternative implementation would be a device_prep_sg_to_sg() that
returned a single struct dma_async_tx_descriptor, which could then be
used as normal by higher layers. This would allow the driver to
allocate / clean up all descriptors in one shot, making it completely
robust against this error situation.

Is there one solution you'd prefer over the other? They're both similar
in the amount of code, though duplication would probably increase in
the device_prep_sg_to_sg() case if any other driver implements it.

Thanks,
Ira
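(The prep-callback alternative would add a new hook to struct
dma_device. A plausible shape, with a hypothetical name since no such
callback exists at the time of this thread:)

        /* Hypothetical prep callback for the alternative described
         * above: a single prep call covers the whole sg-to-sg copy, so
         * a driver can allocate every descriptor up front (and free
         * them all if any allocation fails), handing back one tx that
         * goes through the normal tx_submit() /
         * dma_async_issue_pending() flow.
         */
        struct dma_async_tx_descriptor *(*device_prep_dma_sg)(
                        struct dma_chan *chan,
                        struct scatterlist *dst_sg, unsigned int dst_nents,
                        struct scatterlist *src_sg, unsigned int src_nents,
                        unsigned long flags);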
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index 57ec1e5..cde775c 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -983,10 +983,13 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
 	struct dma_async_tx_descriptor *tx;
 	dma_cookie_t cookie = -ENOMEM;
 	size_t dst_avail, src_avail;
-	struct list_head tx_list;
+	struct scatterlist *sg;
 	size_t transferred = 0;
+	size_t dst_total = 0;
+	size_t src_total = 0;
 	dma_addr_t dst, src;
 	size_t len;
+	int i;
 
 	if (dst_nents == 0 || src_nents == 0)
 		return -EINVAL;
 
@@ -994,12 +997,17 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
 	if (dst_sg == NULL || src_sg == NULL)
 		return -EINVAL;
 
+	/* get the total count of bytes in each scatterlist */
+	for_each_sg(dst_sg, sg, dst_nents, i)
+		dst_total += sg_dma_len(sg);
+
+	for_each_sg(src_sg, sg, src_nents, i)
+		src_total += sg_dma_len(sg);
+
 	/* get prepared for the loop */
 	dst_avail = sg_dma_len(dst_sg);
 	src_avail = sg_dma_len(src_sg);
 
-	INIT_LIST_HEAD(&tx_list);
-
 	/* run until we are out of descriptors */
 	while (true) {
 
@@ -1018,14 +1026,24 @@ dma_async_memcpy_sg_to_sg(struct dma_chan *chan,
 			return -ENOMEM;
 		}
 
-		/* keep track of the tx for later */
-		list_add_tail(&tx->entry, &tx_list);
-
 		/* update metadata */
 		transferred += len;
 		dst_avail -= len;
 		src_avail -= len;
 
+		/* if this is the last transfer, setup the callback */
+		if (dst_total == transferred || src_total == transferred) {
+			tx->callback = cb;
+			tx->callback_param = cb_param;
+		}
+
+		/* submit the transaction */
+		cookie = tx->tx_submit(tx);
+		if (dma_submit_error(cookie)) {
+			dev_err(dev->dev, "failed to submit desc\n");
+			return cookie;
+		}
+
 fetch:
 		/* fetch the next dst scatterlist entry */
 		if (dst_avail == 0) {
@@ -1060,30 +1078,13 @@ fetch:
 		}
 	}
 
-	/* loop through the list of descriptors and submit them */
-	list_for_each_entry(tx, &tx_list, entry) {
-
-		/* this is the last descriptor: add the callback */
-		if (list_is_last(&tx->entry, &tx_list)) {
-			tx->callback = cb;
-			tx->callback_param = cb_param;
-		}
-
-		/* submit the transaction */
-		cookie = tx->tx_submit(tx);
-		if (dma_submit_error(cookie)) {
-			dev_err(dev->dev, "failed to submit desc\n");
-			return cookie;
-		}
-	}
-
 	/* update counters */
 	preempt_disable();
 	__this_cpu_add(chan->local->bytes_transferred, transferred);
 	__this_cpu_inc(chan->local->memcpy_count);
 	preempt_enable();
 
-	return cookie;
+	return 0;
 }
 EXPORT_SYMBOL(dma_async_memcpy_sg_to_sg);
 #endif
@@ -1092,9 +1093,6 @@ void dma_async_tx_descriptor_init(struct dma_async_tx_descriptor *tx,
 	struct dma_chan *chan)
 {
 	tx->chan = chan;
-	#ifdef CONFIG_DMAENGINE_SG_TO_SG
-	INIT_LIST_HEAD(&tx->entry);
-	#endif
 	#ifndef CONFIG_ASYNC_TX_DISABLE_CHANNEL_SWITCH
 	spin_lock_init(&tx->lock);
 	#endif
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 5507f4c..26f2712 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -317,9 +317,6 @@ struct dma_async_tx_descriptor {
 	dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
 	dma_async_tx_callback callback;
 	void *callback_param;
-#ifdef CONFIG_DMAENGINE_SG_TO_SG
-	struct list_head entry;
-#endif
 #ifndef CONFIG_ASYNC_TX_DISABLE_CHANNEL_SWITCH
 	struct dma_async_tx_descriptor *next;
 	struct dma_async_tx_descriptor *parent;
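(For context, a caller of the function as modified by this patch might
look like the sketch below. The helper names are hypothetical, the
parameter order is assumed from the patch, and both scatterlists are
assumed to have been mapped with dma_map_sg() already:)

        /* Sketch: copy one mapped scatterlist to another and wait for
         * the completion callback installed on the final descriptor.
         */
        static void sg_copy_done(void *arg)
        {
                complete(arg);  /* wake the waiter below */
        }

        static int sg_copy_and_wait(struct dma_chan *chan,
                                    struct scatterlist *dst_sg,
                                    unsigned int dst_nents,
                                    struct scatterlist *src_sg,
                                    unsigned int src_nents)
        {
                DECLARE_COMPLETION_ONSTACK(done);
                dma_cookie_t cookie;

                cookie = dma_async_memcpy_sg_to_sg(chan, dst_sg, dst_nents,
                                                   src_sg, src_nents,
                                                   sg_copy_done, &done);
                if (dma_submit_error(cookie))
                        return cookie;

                /* all descriptors are submitted; tell the hardware to go */
                dma_async_issue_pending(chan);
                wait_for_completion(&done);
                return 0;
        }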