diff mbox series

[1/4] block: Expect graph changes in bdrv_parent_drained_begin/end

Message ID 20171129102513.9153-2-pbonzini@redhat.com
State New
Headers show
Series blockjob: reinstate busy=false/busy=true in block_job_sleep_ns | expand

Commit Message

Paolo Bonzini Nov. 29, 2017, 10:25 a.m. UTC
From: Kevin Wolf <kwolf@redhat.com>

The .drained_begin/end callbacks can (directly or indirectly via
aio_poll()) cause block nodes to be removed or the current BdrvChild to
point to a different child node.

Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
BlockDriverStates or accidentally continue iterating the parents of the
new child node instead of the node we actually came from.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
---
 block/io.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Alberto Garcia Nov. 29, 2017, 12:27 p.m. UTC | #1
On Wed 29 Nov 2017 11:25:10 AM CET, Paolo Bonzini wrote:
> From: Kevin Wolf <kwolf@redhat.com>
>
> The .drained_begin/end callbacks can (directly or indirectly via
> aio_poll()) cause block nodes to be removed or the current BdrvChild to
> point to a different child node.
>
> Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
> BlockDriverStates or accidentally continue iterating the parents of the
> new child node instead of the node we actually came from.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Tested-by: Jeff Cody <jcody@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Jeff Cody <jcody@redhat.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto
Stefan Hajnoczi Nov. 29, 2017, 1:44 p.m. UTC | #2
On Wed, Nov 29, 2017 at 11:25:10AM +0100, Paolo Bonzini wrote:
> From: Kevin Wolf <kwolf@redhat.com>
> 
> The .drained_begin/end callbacks can (directly or indirectly via
> aio_poll()) cause block nodes to be removed or the current BdrvChild to
> point to a different child node.
> 
> Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
> BlockDriverStates or accidentally continue iterating the parents of the
> new child node instead of the node we actually came from.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Tested-by: Jeff Cody <jcody@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Jeff Cody <jcody@redhat.com>
> ---
>  block/io.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 4fdf93a014..6773926fc1 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -42,9 +42,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>  
>  void bdrv_parent_drained_begin(BlockDriverState *bs)
>  {
> -    BdrvChild *c;
> +    BdrvChild *c, *next;
>  
> -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
>          if (c->role->drained_begin) {
>              c->role->drained_begin(c);
>          }
> @@ -53,9 +53,9 @@ void bdrv_parent_drained_begin(BlockDriverState *bs)
>  
>  void bdrv_parent_drained_end(BlockDriverState *bs)
>  {
> -    BdrvChild *c;
> +    BdrvChild *c, *next;
>  
> -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
>          if (c->role->drained_end) {
>              c->role->drained_end(c);
>          }

Hmm...not sure if this patch is enough.

Thinking about this more, if this sub-graph changes then
.drained_begin() callbacks are not necessarily paired with
.drained_end() callbacks.

In other words, all of the following are possible:

1. Only .drained_begin() is called
2. .drained_begin() is called, then .drained_end()
3. Only .drained_end() is called

It makes me wonder if we need to either:

Take references and ensure .drained_end() gets called if and only if
.drained_begin() was also called.

OR

Document that .drained_begin/end() calls may not be paired and audit
existing code to check that this is safe.

Stefan
Kevin Wolf Nov. 29, 2017, 1:55 p.m. UTC | #3
Am 29.11.2017 um 14:44 hat Stefan Hajnoczi geschrieben:
> On Wed, Nov 29, 2017 at 11:25:10AM +0100, Paolo Bonzini wrote:
> > From: Kevin Wolf <kwolf@redhat.com>
> > 
> > The .drained_begin/end callbacks can (directly or indirectly via
> > aio_poll()) cause block nodes to be removed or the current BdrvChild to
> > point to a different child node.
> > 
> > Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
> > BlockDriverStates or accidentally continue iterating the parents of the
> > new child node instead of the node we actually came from.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > Tested-by: Jeff Cody <jcody@redhat.com>
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > Reviewed-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  block/io.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/block/io.c b/block/io.c
> > index 4fdf93a014..6773926fc1 100644
> > --- a/block/io.c
> > +++ b/block/io.c
> > @@ -42,9 +42,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> >  
> >  void bdrv_parent_drained_begin(BlockDriverState *bs)
> >  {
> > -    BdrvChild *c;
> > +    BdrvChild *c, *next;
> >  
> > -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> > +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
> >          if (c->role->drained_begin) {
> >              c->role->drained_begin(c);
> >          }
> > @@ -53,9 +53,9 @@ void bdrv_parent_drained_begin(BlockDriverState *bs)
> >  
> >  void bdrv_parent_drained_end(BlockDriverState *bs)
> >  {
> > -    BdrvChild *c;
> > +    BdrvChild *c, *next;
> >  
> > -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> > +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
> >          if (c->role->drained_end) {
> >              c->role->drained_end(c);
> >          }
> 
> Hmm...not sure if this patch is enough.

Depends on what you have in mind, but generally, no, it's certainly not
enough to fix all bad things that can happen during drain...

> Thinking about this more, if this sub-graph changes then
> .drained_begin() callbacks are not necessarily paired with
> .drained_end() callbacks.
> 
> In other words, all of the following are possible:
> 
> 1. Only .drained_begin() is called
> 2. .drained_begin() is called, then .drained_end()
> 3. Only .drained_end() is called

bdrv_replace_child_noperm() makes sure to call .drained_end() and
.drained_begin() depending on whether the old and new child node are
currently drained. So I think this might actually work.

> It makes me wonder if we need to either:
> 
> Take references and ensure .drained_end() gets called if and only if
> .drained_begin() was also called.
> 
> OR
> 
> Document that .drained_begin/end() calls may not be paired and audit
> existing code to check that this is safe.

How would the latter even possibly work? If the .drained_end is missing,
the parent would never know that it can send new requests again. A
missing .drained_begin could be ignored in the .drained_end callback,
but it wouldn't be safe because the parent wouldn't actually stop to
send new requests.

Kevin
Stefan Hajnoczi Dec. 1, 2017, 10:13 a.m. UTC | #4
On Wed, Nov 29, 2017 at 02:55:28PM +0100, Kevin Wolf wrote:
> Am 29.11.2017 um 14:44 hat Stefan Hajnoczi geschrieben:
> > On Wed, Nov 29, 2017 at 11:25:10AM +0100, Paolo Bonzini wrote:
> > > From: Kevin Wolf <kwolf@redhat.com>
> > > 
> > > The .drained_begin/end callbacks can (directly or indirectly via
> > > aio_poll()) cause block nodes to be removed or the current BdrvChild to
> > > point to a different child node.
> > > 
> > > Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
> > > BlockDriverStates or accidentally continue iterating the parents of the
> > > new child node instead of the node we actually came from.
> > > 
> > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > Tested-by: Jeff Cody <jcody@redhat.com>
> > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > Reviewed-by: Jeff Cody <jcody@redhat.com>
> > > ---
> > >  block/io.c | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/block/io.c b/block/io.c
> > > index 4fdf93a014..6773926fc1 100644
> > > --- a/block/io.c
> > > +++ b/block/io.c
> > > @@ -42,9 +42,9 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
> > >  
> > >  void bdrv_parent_drained_begin(BlockDriverState *bs)
> > >  {
> > > -    BdrvChild *c;
> > > +    BdrvChild *c, *next;
> > >  
> > > -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> > > +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
> > >          if (c->role->drained_begin) {
> > >              c->role->drained_begin(c);
> > >          }
> > > @@ -53,9 +53,9 @@ void bdrv_parent_drained_begin(BlockDriverState *bs)
> > >  
> > >  void bdrv_parent_drained_end(BlockDriverState *bs)
> > >  {
> > > -    BdrvChild *c;
> > > +    BdrvChild *c, *next;
> > >  
> > > -    QLIST_FOREACH(c, &bs->parents, next_parent) {
> > > +    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
> > >          if (c->role->drained_end) {
> > >              c->role->drained_end(c);
> > >          }
> > 
> > Hmm...not sure if this patch is enough.
> 
> Depends on what you have in mind, but generally, no, it's certainly not
> enough to fix all bad things that can happen during drain...
> 
> > Thinking about this more, if this sub-graph changes then
> > .drained_begin() callbacks are not necessarily paired with
> > .drained_end() callbacks.
> > 
> > In other words, all of the following are possible:
> > 
> > 1. Only .drained_begin() is called
> > 2. .drained_begin() is called, then .drained_end()
> > 3. Only .drained_end() is called
> 
> bdrv_replace_child_noperm() makes sure to call .drained_end() and
> .drained_begin() depending on whether the old and new child node are
> currently drained. So I think this might actually work.
> 
> > It makes me wonder if we need to either:
> > 
> > Take references and ensure .drained_end() gets called if and only if
> > .drained_begin() was also called.
> > 
> > OR
> > 
> > Document that .drained_begin/end() calls may not be paired and audit
> > existing code to check that this is safe.
> 
> How would the latter even possibly work? If the .drained_end is missing,
> the parent would never know that it can send new requests again. A
> missing .drained_begin could be ignored in the .drained_end callback,
> but it wouldn't be safe because the parent wouldn't actually stop to
> send new requests.

Yesterday I looked at a similar issue in Fam's new RFC that moves
.drained_begin()/.drained_end() to AioContext.  I came to the conclusion
that while it's easy to implement unmatched begin/end calls (as we do
today) it's not possible to get useful semantics out of them :).  So we
have to ensure the the begin/end calls match.
diff mbox series

Patch

diff --git a/block/io.c b/block/io.c
index 4fdf93a014..6773926fc1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -42,9 +42,9 @@  static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 
 void bdrv_parent_drained_begin(BlockDriverState *bs)
 {
-    BdrvChild *c;
+    BdrvChild *c, *next;
 
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
+    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
         if (c->role->drained_begin) {
             c->role->drained_begin(c);
         }
@@ -53,9 +53,9 @@  void bdrv_parent_drained_begin(BlockDriverState *bs)
 
 void bdrv_parent_drained_end(BlockDriverState *bs)
 {
-    BdrvChild *c;
+    BdrvChild *c, *next;
 
-    QLIST_FOREACH(c, &bs->parents, next_parent) {
+    QLIST_FOREACH_SAFE(c, &bs->parents, next_parent, next) {
         if (c->role->drained_end) {
             c->role->drained_end(c);
         }