Message ID | 1278072457-8156-2-git-send-email-kwolf@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf <kwolf@redhat.com> wrote: > bdrv_aio_writev may call the callback immediately (and it will commonly do so > in error cases). Current code doesn't consider this. For details see the > comment added by this patch. > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > --- > block.c | 35 +++++++++++++++++++++++++++++------ > 1 files changed, 29 insertions(+), 6 deletions(-) > > diff --git a/block.c b/block.c > index 9176dec..e65971c 100644 > --- a/block.c > +++ b/block.c > @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) > // Check for mergable requests > num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); > > - // Run the aio requests > + /* > + * Run the aio requests. As soon as one request can't be submitted > + * successfully, fail all requests that are not yet submitted (we must > + * return failure for all requests anyway) > + * > + * num_requests cannot be set to the right value immediately: If > + * bdrv_aio_writev fails for some request, num_requests would be too high > + * and therefore multiwrite_cb() would never recognize the multiwrite > + * request as completed. We also cannot use the loop variable i to set it > + * when the first request fails because the callback may already have been > + * called for previously submitted requests. Thus, num_requests must be > + * incremented for each request that is submitted. > + * > + * The problem that callbacks may be called early also means that we need > + * to take care that num_requests doesn't become 0 before all requests are > + * submitted - multiwrite_cb() would consider the multiwrite request > + * completed. A dummy request that is "completed" by a manual call to > + * multiwrite_cb() takes care of this. > + */ > + mcb->num_requests = 1; > + > for (i = 0; i < num_reqs; i++) { > + mcb->num_requests++; > acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, > reqs[i].nb_sectors, multiwrite_cb, mcb); > > @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) > // We can only fail the whole thing if no request has been > // submitted yet. Otherwise we'll wait for the submitted AIOs to > // complete and report the error in the callback. > - if (mcb->num_requests == 0) { > - reqs[i].error = -EIO; > + if (i == 0) { > goto fail; > } else { > - mcb->num_requests++; > multiwrite_cb(mcb, -EIO); When bdrv_aio_writev() fails we don't know if the callback has been invoked by the block driver. Qcow2 will invoke the callback in some cases. This is a problem because num_requests will be decremented twice if we unconditionally call it here. Stefan
Am 02.07.2010 15:18, schrieb Stefan Hajnoczi: > On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf <kwolf@redhat.com> wrote: >> bdrv_aio_writev may call the callback immediately (and it will commonly do so >> in error cases). Current code doesn't consider this. For details see the >> comment added by this patch. >> >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >> --- >> block.c | 35 +++++++++++++++++++++++++++++------ >> 1 files changed, 29 insertions(+), 6 deletions(-) >> >> diff --git a/block.c b/block.c >> index 9176dec..e65971c 100644 >> --- a/block.c >> +++ b/block.c >> @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) >> // Check for mergable requests >> num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); >> >> - // Run the aio requests >> + /* >> + * Run the aio requests. As soon as one request can't be submitted >> + * successfully, fail all requests that are not yet submitted (we must >> + * return failure for all requests anyway) >> + * >> + * num_requests cannot be set to the right value immediately: If >> + * bdrv_aio_writev fails for some request, num_requests would be too high >> + * and therefore multiwrite_cb() would never recognize the multiwrite >> + * request as completed. We also cannot use the loop variable i to set it >> + * when the first request fails because the callback may already have been >> + * called for previously submitted requests. Thus, num_requests must be >> + * incremented for each request that is submitted. >> + * >> + * The problem that callbacks may be called early also means that we need >> + * to take care that num_requests doesn't become 0 before all requests are >> + * submitted - multiwrite_cb() would consider the multiwrite request >> + * completed. A dummy request that is "completed" by a manual call to >> + * multiwrite_cb() takes care of this. >> + */ >> + mcb->num_requests = 1; >> + >> for (i = 0; i < num_reqs; i++) { >> + mcb->num_requests++; >> acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, >> reqs[i].nb_sectors, multiwrite_cb, mcb); >> >> @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) >> // We can only fail the whole thing if no request has been >> // submitted yet. Otherwise we'll wait for the submitted AIOs to >> // complete and report the error in the callback. >> - if (mcb->num_requests == 0) { >> - reqs[i].error = -EIO; >> + if (i == 0) { >> goto fail; >> } else { >> - mcb->num_requests++; >> multiwrite_cb(mcb, -EIO); > > When bdrv_aio_writev() fails we don't know if the callback has been > invoked by the block driver. Qcow2 will invoke the callback in some > cases. This is a problem because num_requests will be decremented > twice if we unconditionally call it here. Talked to Stefan on IRC and we came to the conclusion that it's not a problem in fact: qcow_aio_writev() either returns NULL or calls a callback, but it never does both. If a block driver returned NULL and called a callback for the same request that would be a bug in the block driver. Kevin
diff --git a/block.c b/block.c index 9176dec..e65971c 100644 --- a/block.c +++ b/block.c @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); - // Run the aio requests + /* + * Run the aio requests. As soon as one request can't be submitted + * successfully, fail all requests that are not yet submitted (we must + * return failure for all requests anyway) + * + * num_requests cannot be set to the right value immediately: If + * bdrv_aio_writev fails for some request, num_requests would be too high + * and therefore multiwrite_cb() would never recognize the multiwrite + * request as completed. We also cannot use the loop variable i to set it + * when the first request fails because the callback may already have been + * called for previously submitted requests. Thus, num_requests must be + * incremented for each request that is submitted. + * + * The problem that callbacks may be called early also means that we need + * to take care that num_requests doesn't become 0 before all requests are + * submitted - multiwrite_cb() would consider the multiwrite request + * completed. A dummy request that is "completed" by a manual call to + * multiwrite_cb() takes care of this. + */ + mcb->num_requests = 1; + for (i = 0; i < num_reqs; i++) { + mcb->num_requests++; acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. - if (mcb->num_requests == 0) { - reqs[i].error = -EIO; + if (i == 0) { goto fail; } else { - mcb->num_requests++; multiwrite_cb(mcb, -EIO); break; } - } else { - mcb->num_requests++; } } + /* Complete the dummy request */ + multiwrite_cb(mcb, 0); + return 0; fail: + for (i = 0; i < mcb->num_callbacks; i++) { + reqs[i].error = -EIO; + } qemu_free(mcb); return -1; }
bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf <kwolf@redhat.com> --- block.c | 35 +++++++++++++++++++++++++++++------ 1 files changed, 29 insertions(+), 6 deletions(-)