Patchwork [13/15] scsi: Implement alloc_req_iov callback

login
register
mail settings
Submitter Hannes Reinecke
Date Nov. 24, 2010, 11:16 a.m.
Message ID <1290597370-21365-14-git-send-email-hare@suse.de>
Download mbox | patch
Permalink /patch/72842/
State New
Headers show

Comments

Hannes Reinecke - Nov. 24, 2010, 11:16 a.m.
Add callback to create a request with a predefined iovec.
This is required for drivers which can use the iovec
of a command directly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 hw/scsi-disk.c    |   25 +++++++++++++++++++++----
 hw/scsi-generic.c |   44 +++++++++++++++++++++++++++++++++-----------
 hw/scsi.h         |    2 ++
 3 files changed, 56 insertions(+), 15 deletions(-)
Christoph Hellwig - Nov. 24, 2010, 4:52 p.m.
On Wed, Nov 24, 2010 at 12:16:08PM +0100, Hannes Reinecke wrote:
> Add callback to create a request with a predefined iovec.
> This is required for drivers which can use the iovec
> of a command directly.

What happend to my comment that the iov and non-iov case should share
code?  Also what happened to the other comment about not naming the
method implementation different from the method name.
Hannes Reinecke - Nov. 25, 2010, 8:53 a.m.
On 11/24/2010 05:52 PM, Christoph Hellwig wrote:
> On Wed, Nov 24, 2010 at 12:16:08PM +0100, Hannes Reinecke wrote:
>> Add callback to create a request with a predefined iovec.
>> This is required for drivers which can use the iovec
>> of a command directly.
> 
> What happend to my comment that the iov and non-iov case should share
> code?  Also what happened to the other comment about not naming the
> method implementation different from the method name.
> 
Looked into it.
Sure I could be doing it for scsi-disk; for scsi-generic it won't
work, though. And it's not much of a code-share to have from it;
you'll end up with something like:

static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag,
        uint32_t lun)
{
    struct iovec *iov;
    uint8_t *buf;
    SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
    SCSIRequest *req;
    SCSIDiskReq *r;

    buf = qemu_blockalign(s->bs, SCSI_DMA_BUF_SIZE);
    iov = qemu_mallocz(sizeof(struct iovec));
    iov[0].iov_base = buf;
    req = scsi_new_request_iovec(d, tag, lun, iov, 1);
    r = DO_UPCAST(SCSIDiskReq, req, req);
    r->iov_buf = buf;
    return req;
}

which doesn't look better than the original:

static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag,
        uint32_t lun)
{
    SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, d);
    SCSIRequest *req;
    SCSIDiskReq *r;

    req = scsi_req_alloc(sizeof(SCSIDiskReq), d, tag, lun);
    r = DO_UPCAST(SCSIDiskReq, req, req);
    r->iov_buf = qemu_blockalign(s->bs, SCSI_DMA_BUF_SIZE);
    r->iov = qemu_mallocz(sizeof(struct iovec));
    r->iov[0].iov_base = r->iov_buf;
    r->iov_num = 1;
    return req;
}

But I'm open to suggestions.

And for the naming:
The SCSI stack is using 'req' for every function accepting
SCSIRequest as an argument:

hw/scsi.h:
SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d,
  uint32_t tag, uint32_t lun);
void scsi_req_free(SCSIRequest *req);

int scsi_req_parse(SCSIRequest *req, uint8_t *buf);
void scsi_req_print(SCSIRequest *req);
void scsi_req_complete(SCSIRequest *req);

So using 'alloc_req' and 'free_req' is in line with this naming
scheme. Using 'alloc_request' and 'free_request' really looked odd
in the light of the general usage.

Hence I didn't do it.
But again, I'm open to suggestions here.

Cheers,

Hannes
Christoph Hellwig - Nov. 25, 2010, 3:29 p.m.
On Thu, Nov 25, 2010 at 09:53:25AM +0100, Hannes Reinecke wrote:
> Looked into it.
> Sure I could be doing it for scsi-disk; for scsi-generic it won't
> work, though. And it's not much of a code-share to have from it;
> you'll end up with something like:

Yes, and that is a good start to completely get rid of the non-iovec
version.  Keeping two parallel APIs around that have slighly mismatching
semantics is a bad idea.  And for scsi-generic in the version in your
the difference is even worse and more subtile.

I think the only way to get this interface future proof is to unify
it.  That is always pass an iovec from the HBA driver, and make the
length chunking an explicitly flag passed to ->alloc_req instead of
an implicit one when using the old interface.  Then refactor the code
currently resetting.

Talking about scsi-generic, how is the auto request-sense in 
scsi_read_data and the mode select snooping in scsi_write_complete
supposed to for for the iovec interface?

> And for the naming:
> The SCSI stack is using 'req' for every function accepting
> SCSIRequest as an argument:
> 
> hw/scsi.h:
> SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d,
>   uint32_t tag, uint32_t lun);
> void scsi_req_free(SCSIRequest *req);
> 
> int scsi_req_parse(SCSIRequest *req, uint8_t *buf);
> void scsi_req_print(SCSIRequest *req);
> void scsi_req_complete(SCSIRequest *req);
> 
> So using 'alloc_req' and 'free_req' is in line with this naming
> scheme. Using 'alloc_request' and 'free_request' really looked odd
> in the light of the general usage.

Keeping the method names is fine, but please name the implementations
matching it, e.g.

scsi_alloc_req and co.

And yes, using the scsi_ prefix is a bit confusing with the generic
code also using it, eventually the disk driver should use scsi_disk
and the generic driver scsi_generic prefixes.

And in general it would be nice if you could simplify answer to reviews.
If something that the reviewer requests doesn't make sense state so in
reply instead of silently ignoring it.
Hannes Reinecke - Nov. 25, 2010, 4:21 p.m.
On 11/25/2010 04:29 PM, Christoph Hellwig wrote:
> On Thu, Nov 25, 2010 at 09:53:25AM +0100, Hannes Reinecke wrote:
>> Looked into it.
>> Sure I could be doing it for scsi-disk; for scsi-generic it won't
>> work, though. And it's not much of a code-share to have from it;
>> you'll end up with something like:
> 
> Yes, and that is a good start to completely get rid of the non-iovec
> version.  Keeping two parallel APIs around that have slighly mismatching
> semantics is a bad idea.  And for scsi-generic in the version in your
> the difference is even worse and more subtile.
> 
> I think the only way to get this interface future proof is to unify
> it.  That is always pass an iovec from the HBA driver, and make the
> length chunking an explicitly flag passed to ->alloc_req instead of
> an implicit one when using the old interface.  Then refactor the code
> currently resetting.
> 
I don't think that'll work. It's only in send_command() when the CDB
is parsed, and only then we do know how much data we should be
expecting.
If you have a iovec passed in this doesn't matter as you can't
really enlarge it. But in the other case you really need the size if
you want to allocate a buffer large enough to hold the data.

I must say I'd like to get rid of the chunking transfer in scsi-disk.
To have it for scsi-disk only is really pointless, as you can
potentially send exactly the same commands via scsi-generic.
So for scsi-generic to work properly qemu need to be able to
allocate the _entire_ data buffer. And hence qemu _must_ be able to
allocate the same buffer for the scsi-disk emulation.
So any malloc space arguments don't really work out here.

By the same reasoning we could remove the chunking altogether;
any HBA _must_ be capable of issuing the entire data (if issued via
scsi-generic) even today. So I don't really buy the argument of
chunking being required for large I/Os.

But then, I've been down that road already.
With no large success.

> Talking about scsi-generic, how is the auto request-sense in 
> scsi_read_data and the mode select snooping in scsi_write_complete
> supposed to for for the iovec interface?
> 
The sense code is actually a property of the device, no the command.
And a REQUEST_SENSE command will just return the status of the last
command. So the sense buffer is always stored with the device
in SCSIGenericState and can be retrieved at will.

But you are correct, the MODE SELECT snooping needs to be modified.
No big deal, though.

>> And for the naming:
>> The SCSI stack is using 'req' for every function accepting
>> SCSIRequest as an argument:
>>
>> hw/scsi.h:
>> SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d,
>>   uint32_t tag, uint32_t lun);
>> void scsi_req_free(SCSIRequest *req);
>>
>> int scsi_req_parse(SCSIRequest *req, uint8_t *buf);
>> void scsi_req_print(SCSIRequest *req);
>> void scsi_req_complete(SCSIRequest *req);
>>
>> So using 'alloc_req' and 'free_req' is in line with this naming
>> scheme. Using 'alloc_request' and 'free_request' really looked odd
>> in the light of the general usage.
> 
> Keeping the method names is fine, but please name the implementations
> matching it, e.g.
> 
> scsi_alloc_req and co.
> 
> And yes, using the scsi_ prefix is a bit confusing with the generic
> code also using it, eventually the disk driver should use scsi_disk
> and the generic driver scsi_generic prefixes.
> 
Ah, now I think I see what you mean.
You were think along these lines:

static SCSIDeviceInfo scsi_generic_info = {
    .qdev.name    = "scsi-generic",
    .qdev.desc    = "pass through generic scsi device (/dev/sg*)",
    .qdev.size    = sizeof(SCSIGenericState),
    .qdev.reset   = scsi_generic_reset,
    .init         = scsi_generic_initfn,
    .destroy      = scsi_destroy,
    .alloc_req    = scsi_generic_alloc_req,
    .alloc_req_iov  = scsi_generic_alloc_req_iovec,
    .free_req     = scsi_generic_free_req,
    .send_command = scsi_send_command,
    .read_data    = scsi_read_data,

correct?
Yeah, that's easy to do.

> And in general it would be nice if you could simplify answer to reviews.
> If something that the reviewer requests doesn't make sense state so in
> reply instead of silently ignoring it.

Will do in the future.

Cheers,

Hannes
Paul Brook - Nov. 26, 2010, 12:06 a.m.
> I must say I'd like to get rid of the chunking transfer in scsi-disk.
> To have it for scsi-disk only is really pointless, as you can
> potentially send exactly the same commands via scsi-generic.
> So for scsi-generic to work properly qemu need to be able to
> allocate the _entire_ data buffer. And hence qemu _must_ be able to
> allocate the same buffer for the scsi-disk emulation.
> So any malloc space arguments don't really work out here.
> 
> By the same reasoning we could remove the chunking altogether;
> any HBA _must_ be capable of issuing the entire data (if issued via
> scsi-generic) even today. So I don't really buy the argument of
> chunking being required for large I/Os.

We've been over this before. Your logic is fundamentally flawed.

In many cases The HBA simply doesn't know where the data is going to go until 
after the command has been issued.  Issuing the command (which may fail) and 
setting up the buffer to receive the data are separate operations, and there 
may be guest visible state in between. Even if you assume the initial command 
is accepted successfully, the DMA buffer may be submitted in several parts. If 
the device response does not full all these parts then we should not be 
accessing the unused ones. And this is assuming your HBA is DMA capable to 
start with - a USB mass storage device almost certainly isn't.

Even if you do know the full DMA buffer ahead of time, there's no guarantee 
that you'll actually be able to map it all at once. You have to assume that 
bounce buffers are required, and only a small chunk of the ram can be mapped 
at any point.

Combine this with the fact that the guest may submit arbitrarily large 
requests and you need some form of chunking. IIRC the passthrough support 
ignores the linux interface already restricts you to relatively small 
requests.

Paul

Patch

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 68b8667..67f93a5 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -96,14 +96,30 @@  static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag,
     return req;
 }
 
+static SCSIRequest *scsi_new_request_iovec(SCSIDevice *d, uint32_t tag,
+        uint32_t lun, struct iovec *iov, int iov_num)
+{
+    SCSIRequest *req;
+    SCSIDiskReq *r;
+
+    req = scsi_req_alloc(sizeof(SCSIDiskReq), d, tag, lun);
+    r = DO_UPCAST(SCSIDiskReq, req, req);
+    r->iov = iov;
+    r->iov_num = iov_num;
+    r->iov_buf = NULL;
+    return req;
+}
+
 static void scsi_remove_request(SCSIRequest *req)
 {
     SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
 
-    qemu_free(r->iov);
-    r->iov = NULL;
-    qemu_vfree(r->iov_buf);
-    r->iov_buf = NULL;
+    if (r->iov_buf) {
+        qemu_free(r->iov);
+        r->iov = NULL;
+        qemu_vfree(r->iov_buf);
+        r->iov_buf = NULL;
+    }
     scsi_req_free(&r->req);
 }
 
@@ -1220,6 +1236,7 @@  static SCSIDeviceInfo scsi_disk_info = {
     .init         = scsi_disk_initfn,
     .destroy      = scsi_destroy,
     .alloc_req    = scsi_new_request,
+    .alloc_req_iov = scsi_new_request_iovec,
     .free_req     = scsi_remove_request,
     .send_command = scsi_send_command,
     .read_data    = scsi_read_data,
diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c
index 949b4cc..8c99e9e 100644
--- a/hw/scsi-generic.c
+++ b/hw/scsi-generic.c
@@ -108,6 +108,25 @@  static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun)
     return req;
 }
 
+static SCSIRequest *scsi_new_request_iovec(SCSIDevice *d, uint32_t tag,
+                                           uint32_t lun, struct iovec *iov, int iov_num)
+{
+    SCSIRequest *req;
+    SCSIGenericReq *r;
+    int i;
+
+    req = scsi_req_alloc(sizeof(SCSIGenericReq), d, tag, lun);
+    r = DO_UPCAST(SCSIGenericReq, req, req);
+    r->io_header.dxferp = iov;
+    r->io_header.iovec_count = iov_num;
+    r->io_header.dxfer_len = 0;
+    for (i = 0; i < iov_num; i++)
+        r->io_header.dxfer_len += iov[i].iov_len;
+    r->buf = NULL;
+    r->buflen = 0;
+    return req;
+}
+
 static void scsi_remove_request(SCSIRequest *req)
 {
     SCSIGenericReq *r = DO_UPCAST(SCSIGenericReq, req, req);
@@ -180,8 +199,10 @@  static int execute_command(BlockDriverState *bdrv,
 
     r->io_header.interface_id = 'S';
     r->io_header.dxfer_direction = direction;
-    r->io_header.dxferp = r->buf;
-    r->io_header.dxfer_len = r->buflen;
+    if (r->buf) {
+        r->io_header.dxferp = r->buf;
+        r->io_header.dxfer_len = r->buflen;
+    }
     r->io_header.cmdp = r->req.cmd.buf;
     r->io_header.cmd_len = r->req.cmd.len;
     r->io_header.mx_sb_len = sizeof(s->sensebuf);
@@ -287,7 +308,7 @@  static int scsi_write_data(SCSIRequest *req)
 
     DPRINTF("scsi_write_data 0x%x\n", req->tag);
 
-    if (r->len == 0) {
+    if (r->len == 0 && r->io_header.dxfer_len == 0) {
         r->len = r->buflen;
         r->req.bus->complete(&r->req, SCSI_REASON_DATA, r->len);
         return 0;
@@ -369,8 +390,7 @@  static int32_t scsi_send_command(SCSIRequest *req, uint8_t *cmd)
 #endif
 
     if (r->req.cmd.xfer == 0) {
-        if (r->buf != NULL)
-            qemu_free(r->buf);
+        qemu_free(r->buf);
         r->buflen = 0;
         r->buf = NULL;
         ret = execute_command(s->bs, r, SG_DXFER_NONE, scsi_command_complete);
@@ -381,14 +401,15 @@  static int32_t scsi_send_command(SCSIRequest *req, uint8_t *cmd)
         return 0;
     }
 
-    if (r->buflen != r->req.cmd.xfer) {
-        if (r->buf != NULL)
+    if (!r->io_header.iovec_count) {
+        if (r->buflen != r->req.cmd.xfer) {
             qemu_free(r->buf);
-        r->buf = qemu_malloc(r->req.cmd.xfer);
-        r->buflen = r->req.cmd.xfer;
-    }
+            r->buf = qemu_malloc(r->req.cmd.xfer);
+            r->buflen = r->req.cmd.xfer;
+        }
 
-    memset(r->buf, 0, r->buflen);
+        memset(r->buf, 0, r->buflen);
+    }
     r->len = r->req.cmd.xfer;
     if (r->req.cmd.mode == SCSI_XFER_TO_DEV) {
         r->len = 0;
@@ -561,6 +582,7 @@  static SCSIDeviceInfo scsi_generic_info = {
     .init         = scsi_generic_initfn,
     .destroy      = scsi_destroy,
     .alloc_req    = scsi_new_request,
+    .alloc_req_iov  = scsi_new_request_iovec,
     .free_req     = scsi_remove_request,
     .send_command = scsi_send_command,
     .read_data    = scsi_read_data,
diff --git a/hw/scsi.h b/hw/scsi.h
index cc96f85..063154d 100644
--- a/hw/scsi.h
+++ b/hw/scsi.h
@@ -72,6 +72,8 @@  struct SCSIDeviceInfo {
     scsi_qdev_initfn init;
     void (*destroy)(SCSIDevice *s);
     SCSIRequest *(*alloc_req)(SCSIDevice *s, uint32_t tag, uint32_t lun);
+    SCSIRequest *(*alloc_req_iov)(SCSIDevice *s, uint32_t tag, uint32_t lun,
+                                  struct iovec *iov, int iov_num);
     void (*free_req)(SCSIRequest *req);
     int32_t (*send_command)(SCSIRequest *req, uint8_t *buf);
     void (*read_data)(SCSIRequest *req);