Message ID | 20210322061951.186748-3-its@irrelevant.dk |
---|---|
State | New |
Headers | show |
Series | hw/block/nvme: coverity fixes | expand |
On 22.03.21 07:19, Klaus Jensen wrote: > From: Klaus Jensen <k.jensen@samsung.com> > > In nvme_format_ns(), if the namespace is of zero size (which might be > useless, but not invalid), the `count` variable will leak. Fix this by > returning early in that case. When looking at the Coverity report, something else caught my eye: As far as I’m aware, blk_aio_pwrite_zeroes() may invoke the CB before returning (if blk_do_pwritev_part() returns without yielding). I don’t think that will happen with real hardware (who knows, though), but it should be possible to see with the null-co block driver. nvme_format_ns() doesn’t quite look like it takes that into account. For example, because *count starts at 1 and is decremented after the while (len) loop, all nvme_aio_format_cb() invocations (if they are invoked before their blk_aio_pwrite_zeroes() returns) will see *count == 2, and thus not free it, or call nvme_enqueue_req_completion(). I don’t know whether the latter is problematic, but not freeing `count` doesn’t seem right. Perhaps this could be addressed by adding a condition to the `(*count)--` to see whether `(*count)-- == 1` (or rather `--(*count) == 0`), which would indicate that there are no AIO functions still in flight? Max > Reported-by: Coverity (CID 1451082) > Fixes: dc04d25e2f3f ("hw/block/nvme: add support for the format nvm command") > Signed-off-by: Klaus Jensen <k.jensen@samsung.com> > --- > hw/block/nvme.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/hw/block/nvme.c b/hw/block/nvme.c > index 6842b01ab58b..dad275971a84 100644 > --- a/hw/block/nvme.c > +++ b/hw/block/nvme.c > @@ -4984,6 +4984,11 @@ static uint16_t nvme_format_ns(NvmeCtrl *n, NvmeNamespace *ns, uint8_t lbaf, > ns->status = NVME_FORMAT_IN_PROGRESS; > > len = ns->size; > + > + if (!len) { > + return NVME_SUCCESS; > + } > + > offset = 0; > > count = g_new(int, 1); >
On Mar 22 11:02, Max Reitz wrote: > On 22.03.21 07:19, Klaus Jensen wrote: > > From: Klaus Jensen <k.jensen@samsung.com> > > > > In nvme_format_ns(), if the namespace is of zero size (which might be > > useless, but not invalid), the `count` variable will leak. Fix this by > > returning early in that case. > > When looking at the Coverity report, something else caught my eye: As far as > I’m aware, blk_aio_pwrite_zeroes() may invoke the CB before returning (if > blk_do_pwritev_part() returns without yielding). I don’t think that will > happen with real hardware (who knows, though), but it should be possible to > see with the null-co block driver. > > nvme_format_ns() doesn’t quite look like it takes that into account. For > example, because *count starts at 1 and is decremented after the while (len) > loop, all nvme_aio_format_cb() invocations (if they are invoked before their > blk_aio_pwrite_zeroes() returns) will see > *count == 2, and thus not free it, or call nvme_enqueue_req_completion(). > > I don’t know whether the latter is problematic, but not freeing `count` > doesn’t seem right. Perhaps this could be addressed by adding a condition > to the `(*count)--` to see whether `(*count)-- == 1` (or rather `--(*count) > == 0`), which would indicate that there are no AIO functions still in > flight? Hi Max, Thanks for glossing over this. And, yeah, you are right, nice catch. The reference counting is exactly to guard against pwrite_zeroes invoking the CB before returning, but if it happens it is not correctly handling it anyway :( This stuff is exactly why I proposed converting all this into the "proper" BlockAIOCB approach (see [1]), but it need a little more work. I'll v2 this with a fix for this! Thanks! [1]: https://lore.kernel.org/qemu-devel/20210302111040.289244-1-its@irrelevant.dk/
On 22.03.21 11:48, Klaus Jensen wrote: > On Mar 22 11:02, Max Reitz wrote: >> On 22.03.21 07:19, Klaus Jensen wrote: >>> From: Klaus Jensen <k.jensen@samsung.com> >>> >>> In nvme_format_ns(), if the namespace is of zero size (which might be >>> useless, but not invalid), the `count` variable will leak. Fix this by >>> returning early in that case. >> >> When looking at the Coverity report, something else caught my eye: As far as >> I’m aware, blk_aio_pwrite_zeroes() may invoke the CB before returning (if >> blk_do_pwritev_part() returns without yielding). I don’t think that will >> happen with real hardware (who knows, though), but it should be possible to >> see with the null-co block driver. >> >> nvme_format_ns() doesn’t quite look like it takes that into account. For >> example, because *count starts at 1 and is decremented after the while (len) >> loop, all nvme_aio_format_cb() invocations (if they are invoked before their >> blk_aio_pwrite_zeroes() returns) will see >> *count == 2, and thus not free it, or call nvme_enqueue_req_completion(). >> >> I don’t know whether the latter is problematic, but not freeing `count` >> doesn’t seem right. Perhaps this could be addressed by adding a condition >> to the `(*count)--` to see whether `(*count)-- == 1` (or rather `--(*count) >> == 0`), which would indicate that there are no AIO functions still in >> flight? > > Hi Max, > > Thanks for glossing over this. > > And, yeah, you are right, nice catch. The reference counting is exactly > to guard against pwrite_zeroes invoking the CB before returning, but if > it happens it is not correctly handling it anyway :( > > This stuff is exactly why I proposed converting all this into the > "proper" BlockAIOCB approach (see [1]), but it need a little more work. > > I'll v2 this with a fix for this! Thanks! > > > [1]: https://lore.kernel.org/qemu-devel/20210302111040.289244-1-its@irrelevant.dk/ OK, thanks! :) That rewrite does look more in-line with how AIO is handled elsewhere, yes. Max
diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 6842b01ab58b..dad275971a84 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -4984,6 +4984,11 @@ static uint16_t nvme_format_ns(NvmeCtrl *n, NvmeNamespace *ns, uint8_t lbaf, ns->status = NVME_FORMAT_IN_PROGRESS; len = ns->size; + + if (!len) { + return NVME_SUCCESS; + } + offset = 0; count = g_new(int, 1);