Message ID | 1332949439-6781-2-git-send-email-stefanha@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On 28.03.2012 19:43, Stefan Hajnoczi wrote: > void ide_sector_read(IDEState *s) > { [] > + s->iov.iov_base = s->io_buffer; > + s->iov.iov_len = n * BDRV_SECTOR_SIZE; > + qemu_iovec_init_external(&s->qiov, &s->iov, 1); > + > + bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ); > + bdrv_aio_readv(s->bs, sector_num, &s->qiov, n, > + ide_sector_read_cb, s); > } Shouldn't this function be returning something and check the return value of bdrv_aio_readv() ? I'm not sure if bdrv_aio_readv() is _supposed_ to never return any errors. In practice it is definitely a bit more complex. If bdrv_aio_readv() didn't do anything useful (ie, didn't queue the callback), we'll be busy forever. Again, I've no idea if it supposed to never return error. A block device without proper "media" present may - at least in theory - do so. If it must not, I think this should be documented somewhere that bdrv_aio_readv() does never return error, by design. That was one of the reasons I digged into the block layer in the first place - in order to understand, simplify and _document_ what each interface does. Thanks, /mjt
On 28.03.2012 19:43, Stefan Hajnoczi wrote: ... > void ide_sector_read(IDEState *s) > { ... > + s->iov.iov_base = s->io_buffer; > + s->iov.iov_len = n * BDRV_SECTOR_SIZE; > + qemu_iovec_init_external(&s->qiov, &s->iov, 1); > + bdrv_aio_readv(s->bs, sector_num, &s->qiov, n, > + ide_sector_read_cb, s); > } > > @@ -383,6 +383,8 @@ struct IDEState { > int cd_sector_size; > int atapi_dma; /* true if dma is requested for the packet cmd */ > BlockAcctCookie acct; > + struct iovec iov; > + QEMUIOVector qiov; > /* ATA DMA state */ > int io_buffer_size; > QEMUSGList sg; You don't actually need iov here, it can be a local variable just as well. The same applies to ide_sector_write() in the next patch. The state structure usually holds stuff which is actually needed to be keept across several calls, iov is not one of them. Thanks, /mjt
On 29.03.2012 10:46, Michael Tokarev wrote: > On 28.03.2012 19:43, Stefan Hajnoczi wrote: > ... > >> void ide_sector_read(IDEState *s) >> { > ... >> + s->iov.iov_base = s->io_buffer; >> + s->iov.iov_len = n * BDRV_SECTOR_SIZE; >> + qemu_iovec_init_external(&s->qiov, &s->iov, 1); >> + bdrv_aio_readv(s->bs, sector_num, &s->qiov, n, >> + ide_sector_read_cb, s); >> } >> >> @@ -383,6 +383,8 @@ struct IDEState { >> int cd_sector_size; >> int atapi_dma; /* true if dma is requested for the packet cmd */ >> BlockAcctCookie acct; >> + struct iovec iov; >> + QEMUIOVector qiov; >> /* ATA DMA state */ >> int io_buffer_size; >> QEMUSGList sg; > > You don't actually need iov here, it can be a local variable > just as well. The same applies to ide_sector_write() in the > next patch. The state structure usually holds stuff which > is actually needed to be keept across several calls, iov is > not one of them. Please ignore this one ;) For qemu_iovec_init_external(), the real iov is ofcourse needed. Thanks, /mjt
On Thu, Mar 29, 2012 at 7:42 AM, Michael Tokarev <mjt@tls.msk.ru> wrote: > On 28.03.2012 19:43, Stefan Hajnoczi wrote: >> void ide_sector_read(IDEState *s) >> { > [] >> + s->iov.iov_base = s->io_buffer; >> + s->iov.iov_len = n * BDRV_SECTOR_SIZE; >> + qemu_iovec_init_external(&s->qiov, &s->iov, 1); >> + >> + bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ); >> + bdrv_aio_readv(s->bs, sector_num, &s->qiov, n, >> + ide_sector_read_cb, s); >> } > > Shouldn't this function be returning something and > check the return value of bdrv_aio_readv() ? I think you are right that something is missing here. If the IDE controller is reset then we need to cancel the pending aio request. Otherwise we could end up with a dangling request across reset that overwrites io_buffer. About checking the return value: it used to be the case that the return value needed to be checked. This was kind of a wart in the interface because it created 2 error handling paths: one immediate return and one callback with error. I think Paolo and Kevin were the ones to address this. You can check out this patch for starters but there are a few more related ones: ad54ae80c7 ("block: bdrv_aio_* do not return NULL") Today the return value does not need to be checked for NULL. Stefan
Il 29/03/2012 10:29, Stefan Hajnoczi ha scritto: > This was kind of a wart in the interface because it created 2 error > handling paths: one immediate return and one callback with error. I > think Paolo and Kevin were the ones to address this. I think you did that when adding the coroutines. I just cleaned up the result. Paolo
diff --git a/hw/ide/core.c b/hw/ide/core.c index 4d568ac..e188157 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -457,40 +457,67 @@ static void ide_rw_error(IDEState *s) { ide_set_irq(s->bus); } +static void ide_sector_read_cb(void *opaque, int ret) +{ + IDEState *s = opaque; + int n; + + s->status &= ~BUSY_STAT; + + bdrv_acct_done(s->bs, &s->acct); + if (ret != 0) { + if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY | + BM_STATUS_RETRY_READ)) { + return; + } + } + + n = s->nsector; + if (n > s->req_nb_sectors) { + n = s->req_nb_sectors; + } + + /* Allow the guest to read the io_buffer */ + ide_transfer_start(s, s->io_buffer, n * BDRV_SECTOR_SIZE, ide_sector_read); + + ide_set_irq(s->bus); + + ide_set_sector(s, ide_get_sector(s) + n); + s->nsector -= n; +} + void ide_sector_read(IDEState *s) { int64_t sector_num; - int ret, n; + int n; s->status = READY_STAT | SEEK_STAT; s->error = 0; /* not needed by IDE spec, but needed by Windows */ sector_num = ide_get_sector(s); n = s->nsector; + if (n == 0) { - /* no more sector to read from disk */ ide_transfer_stop(s); - } else { + return; + } + + s->status |= BUSY_STAT; + + if (n > s->req_nb_sectors) { + n = s->req_nb_sectors; + } + #if defined(DEBUG_IDE) - printf("read sector=%" PRId64 "\n", sector_num); + printf("sector=%" PRId64 "\n", sector_num); #endif - if (n > s->req_nb_sectors) - n = s->req_nb_sectors; - bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ); - ret = bdrv_read(s->bs, sector_num, s->io_buffer, n); - bdrv_acct_done(s->bs, &s->acct); - if (ret != 0) { - if (ide_handle_rw_error(s, -ret, - BM_STATUS_PIO_RETRY | BM_STATUS_RETRY_READ)) - { - return; - } - } - ide_transfer_start(s, s->io_buffer, 512 * n, ide_sector_read); - ide_set_irq(s->bus); - ide_set_sector(s, sector_num + n); - s->nsector -= n; - } + s->iov.iov_base = s->io_buffer; + s->iov.iov_len = n * BDRV_SECTOR_SIZE; + qemu_iovec_init_external(&s->qiov, &s->iov, 1); + + bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ); + bdrv_aio_readv(s->bs, sector_num, &s->qiov, n, + ide_sector_read_cb, s); } static void dma_buf_commit(IDEState *s) diff --git a/hw/ide/internal.h b/hw/ide/internal.h index c808a0d..37d7307 100644 --- a/hw/ide/internal.h +++ b/hw/ide/internal.h @@ -383,6 +383,8 @@ struct IDEState { int cd_sector_size; int atapi_dma; /* true if dma is requested for the packet cmd */ BlockAcctCookie acct; + struct iovec iov; + QEMUIOVector qiov; /* ATA DMA state */ int io_buffer_size; QEMUSGList sg;
The IDE PIO interface currently uses bdrv_read() to perform reads synchronously. Synchronous I/O in the vcpu thread is bad because it prevents the guest from executing code - it makes the guest unresponsive. This patch converts IDE PIO to use bdrv_aio_readv(). We simply need to use the BUSY_STAT status so the guest knows to wait while we are busy. The only external user of ide_sector_read() is restart behavior on I/O errors and it is not affected by this change. We still need to restart I/O in the same way. Migration is also unaffected if I understand the code correctly. We continue to use the same transfer function and the BUSY_STAT status should never be migrated since we flush I/O before migrating device state. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> --- hw/ide/core.c | 69 ++++++++++++++++++++++++++++++++++++---------------- hw/ide/internal.h | 2 + 2 files changed, 50 insertions(+), 21 deletions(-)