[05/10] dma-helpers: add dma_buf_read and dma_buf_write

Message ID 1312478089-806-6-git-send-email-pbonzini@redhat.com
State New

Commit Message

Paolo Bonzini Aug. 4, 2011, 5:14 p.m. UTC
These helpers do a full transfer from an in-memory buffer to
target memory, with full support for MMIO areas.  They will be used to store
the reply of an emulated command into a QEMUSGList provided by the
adapter.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cutils.c      |    8 +++---
 dma-helpers.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 dma.h         |    5 ++++
 3 files changed, 72 insertions(+), 4 deletions(-)

Comments

Stefan Hajnoczi Aug. 11, 2011, 7:58 a.m. UTC | #1
On Thu, Aug 04, 2011 at 07:14:43PM +0200, Paolo Bonzini wrote:
> These helpers do a full transfer from an in-memory buffer to
> target memory, with full support for MMIO areas.  It will be used to store
> the reply of an emulated command into a QEMUSGList provided by the
> adapter.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  cutils.c      |    8 +++---
>  dma-helpers.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  dma.h         |    5 ++++
>  3 files changed, 72 insertions(+), 4 deletions(-)

I don't understand this patch.  If we have a memory buffer that needs to
be transferred to target memory, then there is no need for bounce
buffers or cpu_physical_memory_map().

Can we use cpu_physical_memory_rw() on each sglist element instead?  No
-EAGAIN necessary because the memory buffer already acts as the local
bounce buffer.

Stefan
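
A minimal sketch of the loop Stefan suggests, for reference.  The function
name is illustrative, and is_write follows the convention of the patch's
dma_copy(), where a non-zero value means data flows from guest memory into
the buffer:

static void dma_buf_rw_sketch(uint8_t *ptr, int32_t len, QEMUSGList *sg,
                              int is_write)
{
    int i;

    for (i = 0; i < sg->nsg && len > 0; i++) {
        int32_t xfer = MIN(len, sg->sg[i].len);

        /* cpu_physical_memory_rw() bounces MMIO internally, so the
         * in-memory buffer is the only "bounce buffer" needed and no
         * -EAGAIN/map-failure path is required.  Its last argument is
         * "write to guest memory", hence the inversion.  */
        cpu_physical_memory_rw(sg->sg[i].base, ptr, xfer, !is_write);
        ptr += xfer;
        len -= xfer;
    }
}
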
Paolo Bonzini Aug. 11, 2011, 12:10 p.m. UTC | #2
On 08/11/2011 09:58 AM, Stefan Hajnoczi wrote:
> On Thu, Aug 04, 2011 at 07:14:43PM +0200, Paolo Bonzini wrote:
>> These helpers do a full transfer from an in-memory buffer to
>> target memory, with full support for MMIO areas.  It will be used to store
>> the reply of an emulated command into a QEMUSGList provided by the
>> adapter.
>>
>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>> ---
>>   cutils.c      |    8 +++---
>>   dma-helpers.c |   63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   dma.h         |    5 ++++
>>   3 files changed, 72 insertions(+), 4 deletions(-)
>
> I don't understand this patch.  If we have a memory buffer that needs to
> be transferred to target memory, then there is no need for bounce
> buffers or cpu_physical_memory_map().
>
> Can we use cpu_physical_memory_rw() on each sglist element instead?  No
> -EAGAIN necessary because the memory buffer already acts as the local
> bounce buffer.

Doh, you're obviously right.  I don't know what I was thinking. :)

What do you think about passing the residual bytes for short transfers?
Should I look into updating BlockDriverCompletionFunc, or is the
approach of patch 2 okay?  If I have an excuse to learn more about
Coccinelle, that can be fun. :)

Paolo
Stefan Hajnoczi Aug. 11, 2011, 1:29 p.m. UTC | #3
On Thu, Aug 11, 2011 at 1:10 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 08/11/2011 09:58 AM, Stefan Hajnoczi wrote:
>>
>> On Thu, Aug 04, 2011 at 07:14:43PM +0200, Paolo Bonzini wrote:
>>>
>>> These helpers do a full transfer from an in-memory buffer to
>>> target memory, with full support for MMIO areas.  It will be used to
>>> store
>>> the reply of an emulated command into a QEMUSGList provided by the
>>> adapter.
>>>
>>> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
>>> ---
>>>  cutils.c      |    8 +++---
>>>  dma-helpers.c |   63
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  dma.h         |    5 ++++
>>>  3 files changed, 72 insertions(+), 4 deletions(-)
>>
>> I don't understand this patch.  If we have a memory buffer that needs to
>> be transferred to target memory, then there is no need for bounce
>> buffers or cpu_physical_memory_map().
>>
>> Can we use cpu_physical_memory_rw() on each sglist element instead?  No
>> -EAGAIN necessary because the memory buffer already acts as the local
>> bounce buffer.
>
> Doh, you're obviously right.  I don't know what I was thinking. :)
>
> What do you think about passing the residual bytes for short transfers?
>  Should I look into updating BlockDriverCompletionFunc, or is the approach
> of patch 2 okay?  If I have an excuse to learn more about Coccinelle, that
> can be fun. :)

The bdrv_aio_readv() and bdrv_aio_writev() functions don't have the
concept of residual bytes.  They only work on fully completed I/O
operations.  If there is an error they pass -errno.  Therefore I don't
think BlockDriverCompletionFunc is the right type to add residual
bytes to.

It seems that residual bytes are a SCSI layer concept that the block
layer does not deal with.

Stefan
Paolo Bonzini Aug. 11, 2011, 2:24 p.m. UTC | #4
On 08/11/2011 03:29 PM, Stefan Hajnoczi wrote:
>> >
>> >  What do you think about passing the residual bytes for short transfers?
>> >    Should I look into updating BlockDriverCompletionFunc, or is the approach
>> >  of patch 2 okay?  If I have an excuse to learn more about Coccinelle, that
>> >  can be fun.:)
> The bdrv_aio_readv() and bdrv_aio_writev() functions don't have the
> concept of residual bytes.  They only work on fully completed I/O
> operations.  If there is an error they pass -errno.

But if a transfer was split due to failure of cpu_physical_memory_map, 
and only the second part fails, you can have a short transfer and you 
need to pass residual bytes back.  The only way out of this is to make a 
bounce buffer as big as all the unmappable parts of the S/G list, which 
is undesirable of course.  So the residual bytes are a general DMA 
concept, not specific to SCSI.

> Therefore I don't think BlockDriverCompletionFunc is the right type
> to add residual bytes to.

Right, I would rather update BlockDriverCompletionFunc to pass the AIOCB 
as a third parameter, and store the residual bytes in the DMAAIOCB (with 
a getter that the completion function can use).

Paolo
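
A sketch of what that could look like.  dma_aio_residual() is a hypothetical
name, the extended typedef is the proposed change to the existing
BlockDriverCompletionFunc, and sg->resid is the field used elsewhere in this
series:

/* Proposed extension: pass the AIOCB so the caller can query it.  */
typedef void BlockDriverCompletionFunc(void *opaque, int ret,
                                       BlockDriverAIOCB *acb);

/* Hypothetical getter exposed by dma-helpers.c.  */
target_phys_addr_t dma_aio_residual(BlockDriverAIOCB *acb)
{
    DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);

    return dbs->sg->resid;
}
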
Kevin Wolf Aug. 11, 2011, 2:37 p.m. UTC | #5
On 11.08.2011 16:24, Paolo Bonzini wrote:
> On 08/11/2011 03:29 PM, Stefan Hajnoczi wrote:
>>>>
>>>>  What do you think about passing the residual bytes for short transfers?
>>>>    Should I look into updating BlockDriverCompletionFunc, or is the approach
>>>>  of patch 2 okay?  If I have an excuse to learn more about Coccinelle, that
>>>>  can be fun.:)
>> The bdrv_aio_readv() and bdrv_aio_writev() functions don't have the
>> concept of residual bytes.  They only work on fully completed I/O
>> operations.  If there is an error they pass -errno.
> 
> But if a transfer was split due to failure of cpu_physical_memory_map, 
> and only the second part fails, you can have a short transfer and you 
> need to pass residual bytes back.  The only way out of this is to make a 
> bounce buffer as big as all the unmappable parts of the S/G list, which 
> is undesirable of course.  So the residual bytes are a general DMA 
> concept, not specific to SCSI.
> 
>> Therefore I don't think BlockDriverCompletionFunc is the right type
>> to add residual bytes to.
> 
> Right, I would rather update BlockDriverCompletionFunc to pass the AIOCB 
> as a third parameter, and store the residual bytes in the DMAAIOCB (with 
> a getter that the completion function can use).

Isn't the DMAAIOCB already passed as opaque to the callback?

Kevin
Paolo Bonzini Aug. 11, 2011, 3:05 p.m. UTC | #6
On 08/11/2011 04:37 PM, Kevin Wolf wrote:
> >  Right, I would rather update BlockDriverCompletionFunc to pass the AIOCB
> >  as a third parameter, and store the residual bytes in the DMAAIOCB (with
> >  a getter that the completion function can use).
>
> Isn't the DMAAIOCB already passed as opaque to the callback?

It is passed to dma_bdrv_cb, but not to the caller-provided 
callback.  If the operation completes before dma_bdrv_{read,write} 
returns, the AIOCB is not stored anywhere and the asynchronous callback 
has no access to it.  Usually the callback has no use for the AIOCB, but 
in this case it could use it to fetch the residual.

Another possibility is always completing DMA in a bottom half.  This 
ensures that the callback can access the AIOCB, but it exposes an 
implementation detail to the caller, so I don't like it.

Paolo
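
For reference, a sketch of the bottom-half variant mentioned above.  The
function names are illustrative; it reuses the bh field that DMAAIOCB
already has:

static void dma_complete_bh(void *opaque)
{
    DMAAIOCB *dbs = opaque;

    qemu_bh_delete(dbs->bh);
    dbs->bh = NULL;
    /* The bottom half runs only after dma_bdrv_{read,write} has
     * returned, so the caller has had a chance to store the AIOCB.
     * ret is hard-coded to 0 (success) for brevity.  */
    dbs->common.cb(dbs->common.opaque, 0);
    qemu_aio_release(dbs);
}

static void dma_schedule_complete(DMAAIOCB *dbs)
{
    dbs->bh = qemu_bh_new(dma_complete_bh, dbs);
    qemu_bh_schedule(dbs->bh);
}
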
Kevin Wolf Aug. 11, 2011, 3:12 p.m. UTC | #7
On 11.08.2011 17:05, Paolo Bonzini wrote:
> On 08/11/2011 04:37 PM, Kevin Wolf wrote:
>>>  Right, I would rather update BlockDriverCompletionFunc to pass the AIOCB
>>>  as a third parameter, and store the residual bytes in the DMAAIOCB (with
>>>  a getter that the completion function can use).
>>
>> Isn't the DMAAIOCB already passed as opaque to the callback?
> 
> It is passed to the dma_bdrv_cb, but not to the caller-provided 
> callback.  If the operation completes before dma_bdrv_{read,write} 
> returns, the AIOCB is not stored anywhere and the asynchronous callback 
> does not have access to it.  Usually it does not have anything to do 
> with it, but in this case it could get the residual.
> 
> Another possibility is always completing DMA in a bottom half.  This 
> ensures that the callback can access the AIOCB, but it exposes an 
> implementation detail to the caller, so I don't like it.

At least in the block layer, AIO callbacks may never be called before
the submission function has returned.  I think this means the DMA helpers
already provide the same behaviour.

But I'm not sure whether the definition of the AIOCB struct is private to
the block layer.

Kevin
Paolo Bonzini Aug. 11, 2011, 3:27 p.m. UTC | #8
On 08/11/2011 05:12 PM, Kevin Wolf wrote:
>> >  Another possibility is always completing DMA in a bottom half.  This
>> >  ensures that the callback can access the AIOCB, but it exposes an
>> >  implementation detail to the caller, so I don't like it.
>
> At least in the block layer, AIO callbacks may never be called before
> the submission function has returned. I think this makes the DMA helpers
> provide the same behaviour.
>
> But I'm not sure if the definition of the AIOCB struct isn't private to
> the block layer.

Yes, it is; I would add a getter that is specific to the DMAAIOCB.

Paolo
Stefan Hajnoczi Aug. 11, 2011, 8:06 p.m. UTC | #9
On Thu, Aug 11, 2011 at 4:27 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 08/11/2011 05:12 PM, Kevin Wolf wrote:
>>>
>>> >  Another possibility is always completing DMA in a bottom half.  This
>>> >  ensures that the callback can access the AIOCB, but it exposes an
>>> >  implementation detail to the caller, so I don't like it.
>>
>> At least in the block layer, AIO callbacks may never be called before
>> the submission function has returned. I think this makes the DMA helpers
>> provide the same behaviour.
>>
>> But I'm not sure if the definition of the AIOCB struct isn't private to
>> the block layer.
>
> Yes, it is; I would add a getter that is specific to the DMAAIOCB.

You don't need to make the dma_bdrv_io() cb function a
BlockDriverCompletionFunc.  Instead define a DMACompletionFunc:

typedef void DMACompletionFunc(void *opaque, int ret,
                               target_phys_addr_t residual);

The only one invoking dbs->common.cb is dma-helpers.c so you can just
avoid using common.cb and instead use a DMACompletionFunc cb.

Perhaps the AIOCB concept can be made generic where BlockDriverAIOCB
inherits from it.  I'm not even so sure that keeping a
BlockDriverState pointer around is that useful.  At least
dma-helpers.c seems to think it's too much typing and it just
duplicates the bs field into DMAAIOCB directly :).  The question then
becomes how to abstract the typed callback function nicely.

Stefan
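
A sketch of how dma-helpers.c could carry such a callback, assuming
hypothetical dma_cb/dma_opaque fields in DMAAIOCB instead of reusing
common.cb:

static void dma_complete(DMAAIOCB *dbs, int ret)
{
    /* dma-helpers.c is the only caller of the completion function,
     * so the residual can be handed over directly and no getter on
     * the AIOCB is needed.  */
    dbs->dma_cb(dbs->dma_opaque, ret, dbs->sg->resid);
    qemu_aio_release(dbs);
}
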

Patch

diff --git a/cutils.c b/cutils.c
index f9a7e36..969fd2e 100644
--- a/cutils.c
+++ b/cutils.c
@@ -215,14 +215,14 @@  void qemu_iovec_concat(QEMUIOVector *dst, QEMUIOVector *src, size_t size)
 
 void qemu_iovec_destroy(QEMUIOVector *qiov)
 {
-    assert(qiov->nalloc != -1);
-
-    qemu_free(qiov->iov);
+    if (qiov->nalloc != -1) {
+        qemu_free(qiov->iov);
+    }
 }
 
 void qemu_iovec_reset(QEMUIOVector *qiov)
 {
-    assert(qiov->nalloc != -1);
+    assert(qiov->nalloc != -1 || qiov->niov == 0);
 
     qiov->niov = 0;
     qiov->size = 0;
diff --git a/dma-helpers.c b/dma-helpers.c
index 4469ce2..10dc8ca 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -84,6 +84,7 @@  struct DMAAIOCB {
     BlockDriverAIOCB common;
     union {
         BlockDriverState *bs;
+        uint8_t *ptr;
     } u;
     BlockDriverAIOCB *acb;
     QEMUSGList *sg;
@@ -245,3 +246,65 @@  BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
 {
     return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque, 1);
 }
+
+
+static int dma_copy_cb(DMAAIOCB *dbs, int ret)
+{
+    void *mem;
+    target_phys_addr_t cur_len;
+
+    assert(ret == 0);
+
+    /* sector_num is the residual number of bytes to copy.  */
+    while (dbs->sector_num > 0 &&
+           (mem = qemu_sglist_map_segment(dbs->sg, &cur_len,
+                                          !dbs->is_write)) != NULL) {
+        if (dbs->is_write) {
+            memcpy(dbs->u.ptr, mem, cur_len);
+        } else {
+            memcpy(mem, dbs->u.ptr, cur_len);
+        }
+        cpu_physical_memory_unmap(mem, cur_len, !dbs->is_write, cur_len);
+        dbs->u.ptr += cur_len;
+        dbs->sector_num -= cur_len;
+    }
+
+    if (dbs->sg->resid > 0) {
+        cpu_register_map_client(dbs, continue_after_map_failure);
+        return -EAGAIN;
+    }
+
+    return 0;
+}
+
+static BlockDriverAIOCB *dma_copy(
+    uint8_t *ptr, int32_t len, QEMUSGList *sg,
+    void (*cb)(void *opaque, int ret), void *opaque, int is_write)
+{
+    DMAAIOCB *dbs =  qemu_aio_get(&dma_aio_pool, NULL, cb, opaque);
+
+    dbs->acb = NULL;
+    dbs->u.ptr = ptr;
+    dbs->sg = sg;
+    dbs->sector_num = MIN(len, sg->size);
+    dbs->is_write = is_write;
+    dbs->bh = NULL;
+    dbs->cb = dma_copy_cb;
+    qemu_iovec_init_external(&dbs->iov, NULL, 0);
+    qemu_sglist_rewind(sg);
+    return dma_continue(dbs, 0);
+}
+
+BlockDriverAIOCB *dma_buf_read(
+    uint8_t *ptr, int32_t len, QEMUSGList *sg,
+    void (*cb)(void *opaque, int ret), void *opaque)
+{
+    return dma_copy(ptr, len, sg, cb, opaque, 0);
+}
+
+BlockDriverAIOCB *dma_buf_write(
+    uint8_t *ptr, int32_t len, QEMUSGList *sg,
+    void (*cb)(void *opaque, int ret), void *opaque)
+{
+    return dma_copy(ptr, len, sg, cb, opaque, 1);
+}
diff --git a/dma.h b/dma.h
index 363e932..6bf51ee 100644
--- a/dma.h
+++ b/dma.h
@@ -58,4 +58,9 @@  BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
 BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
                                  QEMUSGList *sg, uint64_t sector,
                                  BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *dma_buf_read(uint8_t *ptr, int32_t len, QEMUSGList *sg,
+                               BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *dma_buf_write(uint8_t *ptr, int32_t len, QEMUSGList *sg,
+                                BlockDriverCompletionFunc *cb, void *opaque);
+
 #endif