diff mbox

block: check BlockDriverState object before dereference

Message ID 20170711170821.24669-1-ppandit@redhat.com
State New
Headers show

Commit Message

Prasad Pandit July 11, 2017, 5:08 p.m. UTC
From: Prasad J Pandit <pjp@fedoraproject.org>

When processing ATA_CACHE_FLUSH ide controller command,
BlockDriverState object could be null. Add check to avoid
null pointer dereference.

Reported-by: Kieron Shorrock <kshorrock@paloaltonetworks.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
---
 block/block-backend.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

John Snow July 17, 2017, 3:56 p.m. UTC | #1
On 07/11/2017 01:08 PM, P J P wrote:
> From: Prasad J Pandit <pjp@fedoraproject.org>
> 
> When processing ATA_CACHE_FLUSH ide controller command,
> BlockDriverState object could be null. Add check to avoid
> null pointer dereference.
> 

This happens through 0xE7, what QEMU calls WIN_CACHE_FLUSH and described
in ATA8 ACS3 as "FLUSH CACHE"

The core of the matter here is that any ATA device associated with a
BlockBackend that has no media inserted will accept the command and call
blk_aio_flush, which will later fail because blk_aio_prwv assumes that
the BlockBackend it was given definitely has a BlockDriverState attached.

The associated bdrv_inc_in_flight causes the null pointer dereference.

It's not immediately clear to me what the right fix is here:

(1) Should blk_ functions not assume they will have a BlockDriverState?
(i.e. should an attempted blk_flush on an empty blk just succeed
quietly, or is that inherently an error?)

(2) Should ATA reject such commands entirely?

(3) Both?

ATA says that CDROM drives /may/ accept flush cache, but it doesn't
really say what it has to do about if there's no media. This is for
flushing write cache, after all, and our media are RO for CDROMs.

We could possibly make flush cache a NOP for CDROMs entirely, or I can
remove our support for the command, as it is "optional" and I can't find
any information on what, if anything, it should actually do.

Grumble Grumble, ATA specs.

> Reported-by: Kieron Shorrock <kshorrock@paloaltonetworks.com>

Thank you for your report.

> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> ---
>  block/block-backend.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 0df3457..6a543a4 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1168,8 +1168,13 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
>  {
>      BlkAioEmAIOCB *acb;
>      Coroutine *co;
> +    BlockDriverState *bs = blk_bs(blk);
>  
> -    bdrv_inc_in_flight(blk_bs(blk));
> +    if (!bs) {
> +        return NULL;
> +    }
> +

This hotfix as posted is inappropriate I think, because this path has
never been able to return NULL previously and many callers don't check
to see if their command was registered successfully, and rely on the
callback to be executed.

> +    bdrv_inc_in_flight(bs);
>      acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
>      acb->rwco = (BlkRwCo) {
>          .blk    = blk,
> @@ -1182,7 +1187,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
>      acb->has_returned = false;
>  
>      co = qemu_coroutine_create(co_entry, acb);
> -    bdrv_coroutine_enter(blk_bs(blk), co);
> +    bdrv_coroutine_enter(bs, co);
>  
>      acb->has_returned = true;
>      if (acb->rwco.ret != NOT_DONE) {
>
Stefan Hajnoczi July 21, 2017, 3:47 p.m. UTC | #2
On Mon, Jul 17, 2017 at 11:56:48AM -0400, John Snow wrote:
> On 07/11/2017 01:08 PM, P J P wrote:
> > From: Prasad J Pandit <pjp@fedoraproject.org>
> > 
> > When processing ATA_CACHE_FLUSH ide controller command,
> > BlockDriverState object could be null. Add check to avoid
> > null pointer dereference.
> > 
> 
> This happens through 0xE7, what QEMU calls WIN_CACHE_FLUSH and described
> in ATA8 ACS3 as "FLUSH CACHE"
> 
> The core of the matter here is that any ATA device associated with a
> BlockBackend that has no media inserted will accept the command and call
> blk_aio_flush, which will later fail because blk_aio_prwv assumes that
> the BlockBackend it was given definitely has a BlockDriverState attached.
> 
> The associated bdrv_inc_in_flight causes the null pointer dereference.
> 
> It's not immediately clear to me what the right fix is here:
> 
> (1) Should blk_ functions not assume they will have a BlockDriverState?
> (i.e. should an attempted blk_flush on an empty blk just succeed
> quietly, or is that inherently an error?)

Hmm...BlockDriverState still has bdrv_is_inserted() even though
BlockBackend->root can be NULL?  CCing Markus in case he has thoughts on
the BB/BDS split.

I find it weird that block-backend.c calls bdrv_inc_in_flight() and then
bdrv_co_flush() will call it again.  Perhaps we need to do this so that
blk_drain() works correctly but it's odd that they share the same
counter variable.
Paolo Bonzini July 23, 2017, 2:36 p.m. UTC | #3
On 21/07/2017 17:47, Stefan Hajnoczi wrote:
> Hmm...BlockDriverState still has bdrv_is_inserted() even though
> BlockBackend->root can be NULL?  CCing Markus in case he has thoughts on
> the BB/BDS split.
> 
> I find it weird that block-backend.c calls bdrv_inc_in_flight() and then
> bdrv_co_flush() will call it again.  Perhaps we need to do this so that
> blk_drain() works correctly but it's odd that they share the same
> counter variable.

See here:
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02305.html

> I need to count requests at the BB level because the blk_aio_*
> operations have a separate bottom half that is invoked if either 1) they
> never reach BDS (because of an error); or 2) the bdrv_co_* call
> completes without yielding.  The count must be >0 when blk_aio_*
> returns, or bdrv_drain (and thus blk_drain) won't loop.  Because
> bdrv_drain and blk_drain are conflated, the counter must be the BDS one.
> 
> In turn, the BDS counter is needed because of the lack of isolated
> sections.  The right design would be for blk_isolate_begin to call
> blk_drain on *other* BlockBackends reachable in a child-to-parent visit.
>  Instead, until that is implemented, we have bdrv_drained_begin that
> emulates that through the same-named callback, followed by a
> parent-to-child bdrv_drain that is almost always unnecessary.

The full explanation of the long-term plans and what "isolated section"
means is at
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02016.html.
(Unfortunately, since then we've reintroduced the "double aio_poll" hack
in BDRV_POLL_WHILE...).

Paolo
John Snow Aug. 1, 2017, 12:14 a.m. UTC | #4
On 07/23/2017 10:36 AM, Paolo Bonzini wrote:
> On 21/07/2017 17:47, Stefan Hajnoczi wrote:
>> Hmm...BlockDriverState still has bdrv_is_inserted() even though
>> BlockBackend->root can be NULL?  CCing Markus in case he has thoughts on
>> the BB/BDS split.
>>
>> I find it weird that block-backend.c calls bdrv_inc_in_flight() and then
>> bdrv_co_flush() will call it again.  Perhaps we need to do this so that
>> blk_drain() works correctly but it's odd that they share the same
>> counter variable.
> 
> See here:
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02305.html
> 

>> I need to count requests at the BB level because the blk_aio_*
>> operations have a separate bottom half that is invoked if either 1) they
>> never reach BDS (because of an error); or 2) the bdrv_co_* call
>> completes without yielding.  The count must be >0 when blk_aio_*
>> returns, or bdrv_drain (and thus blk_drain) won't loop.  Because
>> bdrv_drain and blk_drain are conflated, the counter must be the BDS one.
>>
>> In turn, the BDS counter is needed because of the lack of isolated
>> sections.  The right design would be for blk_isolate_begin to call
>> blk_drain on *other* BlockBackends reachable in a child-to-parent visit.
>>  Instead, until that is implemented, we have bdrv_drained_begin that
>> emulates that through the same-named callback, followed by a
>> parent-to-child bdrv_drain that is almost always unnecessary.

OK, so you have some justifications for why it needs to be at the BB
level...

> 
> The full explanation of the long-term plans and what "isolated section"
> means is at
> https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg02016.html.
> (Unfortunately, since then we've reintroduced the "double aio_poll" hack
> in BDRV_POLL_WHILE...).
> 
> Paolo
> 

I may need some nudging towards understanding what the right solution
here is, though. Should the blk_aio_flush assume that there always is a
root BDS? should it not assume that?

It's difficult for me to understand right now if the bug is in the
expectation for the blk_ functions and the caller should be amended, or
if you need changes to the way the blk_ functions are trying to
increment a counter that doesn't exist.

I can handle the former fairly easily; if it's the latter, I'm afraid
it's stuck in the middle of some of your changes and I'd need a stronger
hint.

John
Paolo Bonzini Aug. 1, 2017, 8:35 a.m. UTC | #5
On 01/08/2017 02:14, John Snow wrote:
> I may need some nudging towards understanding what the right solution
> here is, though. Should the blk_aio_flush assume that there always is a
> root BDS? should it not assume that?

I think blk_aio_flush is not special.  If there is no root BDS, either
you return -ENOMEDIUM, or you crash.  But all functions should be doing
the same.

The former makes sense, but right now blk_prwv for one are crashing if
there is no root BDS so the minimum patch would fix the caller rather
than blk_aio_flush.

Paolo

> It's difficult for me to understand right now if the bug is in the
> expectation for the blk_ functions and the caller should be amended, or
> if you need changes to the way the blk_ functions are trying to
> increment a counter that doesn't exist.
> 
> I can handle the former fairly easily; if it's the latter, I'm afraid
> it's stuck in the middle of some of your changes and I'd need a stronger
> hint.
Kevin Wolf Aug. 1, 2017, 1:40 p.m. UTC | #6
Am 01.08.2017 um 10:35 hat Paolo Bonzini geschrieben:
> On 01/08/2017 02:14, John Snow wrote:
> > I may need some nudging towards understanding what the right solution
> > here is, though. Should the blk_aio_flush assume that there always is a
> > root BDS? should it not assume that?
> 
> I think blk_aio_flush is not special.  If there is no root BDS, either
> you return -ENOMEDIUM, or you crash.  But all functions should be doing
> the same.

The intended semantics is that they return -ENOMEDIUM (or fail at
least). This is how things have always worked, and that it crashes now
because of the bdrv_inc_in_flight() was not an intentional change, but
simply a bug in the patch.

> The former makes sense, but right now blk_prwv for one are crashing if
> there is no root BDS so the minimum patch would fix the caller rather
> than blk_aio_flush.

The synchronous versions don't crash, and bdrv_aio_prwv() would fix all
cases if bdrv_inc_in_flight() were moved inside the coroutine; probably
right before blk_aio_complete(). This would be more consistent with how
the synchronous versions work, too, increasing the in-flight count only
by 1 rather than 2 for an AIO request.

Kevin
diff mbox

Patch

diff --git a/block/block-backend.c b/block/block-backend.c
index 0df3457..6a543a4 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1168,8 +1168,13 @@  static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
 {
     BlkAioEmAIOCB *acb;
     Coroutine *co;
+    BlockDriverState *bs = blk_bs(blk);
 
-    bdrv_inc_in_flight(blk_bs(blk));
+    if (!bs) {
+        return NULL;
+    }
+
+    bdrv_inc_in_flight(bs);
     acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
     acb->rwco = (BlkRwCo) {
         .blk    = blk,
@@ -1182,7 +1187,7 @@  static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
     acb->has_returned = false;
 
     co = qemu_coroutine_create(co_entry, acb);
-    bdrv_coroutine_enter(blk_bs(blk), co);
+    bdrv_coroutine_enter(bs, co);
 
     acb->has_returned = true;
     if (acb->rwco.ret != NOT_DONE) {