Patchwork [2/2] block: move bdrv_dev_change_media_cb() to callers that really need it

login
register
mail settings
Submitter Luiz Capitulino
Date April 19, 2013, 5:47 p.m.
Message ID <1366393639-20651-3-git-send-email-lcapitulino@redhat.com>
Download mbox | patch
Permalink /patch/238078/
State New
Headers show

Comments

Luiz Capitulino - April 19, 2013, 5:47 p.m.
Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
this added a regression where spurious DEVICE_TRAY_MOVED events
are emitted at shutdown.

To fix that this commit moves the bdrv_dev_change_media_cb() calls
to the callers that really need to report a media change, which
are eject_device() and do_drive_del(). This fixes the problem
commit 9ca1115 intended to fix, plus the spurious events.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 block.c    | 2 --
 blockdev.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)
Eric Blake - April 25, 2013, 1:59 p.m.
On 04/19/2013 11:47 AM, Luiz Capitulino wrote:
> Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
> bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
> this added a regression where spurious DEVICE_TRAY_MOVED events
> are emitted at shutdown.

Libvirt can tolerate the extra events, but I agree that targeting 1.5 as
the release to clean this back up is reasonable.

> 
> To fix that this commit moves the bdrv_dev_change_media_cb() calls
> to the callers that really need to report a media change, which
> are eject_device() and do_drive_del(). This fixes the problem
> commit 9ca1115 intended to fix, plus the spurious events.
> 
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  block.c    | 2 --
>  blockdev.c | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>
Markus Armbruster - April 25, 2013, 6:18 p.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
> bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
> this added a regression where spurious DEVICE_TRAY_MOVED events
> are emitted at shutdown.
>
> To fix that this commit moves the bdrv_dev_change_media_cb() calls
> to the callers that really need to report a media change, which
> are eject_device() and do_drive_del(). This fixes the problem
> commit 9ca1115 intended to fix, plus the spurious events.
>
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> ---
>  block.c    | 2 --
>  blockdev.c | 2 ++
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block.c b/block.c
> index 90d0ed1..7fc3014 100644
> --- a/block.c
> +++ b/block.c
> @@ -1342,8 +1342,6 @@ void bdrv_close(BlockDriverState *bs)
>          }
>      }
>  
> -    bdrv_dev_change_media_cb(bs, false);
> -
>      /*throttling disk I/O limits*/
>      if (bs->io_limits_enabled) {
>          bdrv_io_limits_disable(bs);
> diff --git a/blockdev.c b/blockdev.c
> index 8a1652b..f1f3b6e 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -950,6 +950,7 @@ static void eject_device(BlockDriverState *bs, int force, Error **errp)
>      }
>  
>      bdrv_close(bs);
> +    bdrv_dev_change_media_cb(bs, false);
>  }
>  
>  void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
> @@ -1100,6 +1101,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
>      bdrv_drain_all();
>      bdrv_flush(bs);
>      bdrv_close(bs);
> +    bdrv_dev_change_media_cb(bs, false);
>  
>      /* if we have a device attached to this BlockDriverState
>       * then we need to make the drive anonymous until the device

Invariant: callback does nothing unless a device model with removable
media is connected (dev_ops->change_media_cb set).

Before 9ca1115: Callback runs on any bdrv_close() that actually ejects a
medium.

Since 9ca1115: Callback runs on any bdrv_close()

With this patch applied: Callback runs in eject_device() and
do_drive_del().  No change, except it now runs after
bdrv_io_limits_disable(), which shouldn't matter.  This is the trivial
part of the review.

Now the non-trivial part.  Callback no longer runs in

* bdrv_open() when it fails because it can't open the backing file

* bdrv_open() when it fails because the block driver doesn't consume the
  all options

* bdrv_delete()
  - bdrv_file_open() error path
  - bdrv_open_backing_file() error path
  - bdrv_open() snapshot=on path
  - bdrv_open(), purpose not obvious, perhaps related to format probing
  - bdrv_open() some error paths
  - bdrv_close(), bs->backing_hd
  - bdrv_close(), bs->file
  - bdrv_drop_intermediate()
  - bdrv_snapshot_goto() error path
  - bdrv_img_create()
  - drive_uninit()
  - drive_init() error path
  - qmp_transaction() error path
  - qmp_drive_mirror() some error paths
  - qemu_img.c many places
  - qemu_io.c many places
  - blkverify_open() error path
  - blkverify_close()
  - cow_create()
  - mirror_run()
  - qcow_create()
  - qcow2_create2()
  - qed_create()
  - sheepdog.c's sd_prealloc(), sd_create()
  - close_unused_images()
  - vmdk_free_extents(), vmdk_parse_extents(), vmdk_create()
  - vvfat.c's write_target_close()

* qemu-nbd.c's main()

* mirror_run()

* qcow2_create2()

* pci_piix3_xen_ide_unplug()

* bdrv_close_all()
  - qemu-nbd.c, from atexit()  Same as above.
  - vl.c's main().

For each of them, I'd like to see an argument why the not running the
callback is okay.  A good one is "no device model can be attached", say
because the BDS isn't a root (device models only attach to roots), or
because the BDS hasn't escaped its constructor, yet.

Not wanting to do all this work is exactly why I refrained from
attempting to fix the problem commit 9ca1115 attempts to fix (I
suggested to declare it a feature instead).  Didn't help, because I also
refrained from NAKing that fix, and here we are.

I'll do what I can, as time permits, but help is certainly appreciated.
Luiz Capitulino - April 25, 2013, 9:16 p.m.
On Thu, 25 Apr 2013 20:18:35 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Luiz Capitulino <lcapitulino@redhat.com> writes:
> 
> > Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
> > bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
> > this added a regression where spurious DEVICE_TRAY_MOVED events
> > are emitted at shutdown.
> >
> > To fix that this commit moves the bdrv_dev_change_media_cb() calls
> > to the callers that really need to report a media change, which
> > are eject_device() and do_drive_del(). This fixes the problem
> > commit 9ca1115 intended to fix, plus the spurious events.
> >
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> > ---
> >  block.c    | 2 --
> >  blockdev.c | 2 ++
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/block.c b/block.c
> > index 90d0ed1..7fc3014 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1342,8 +1342,6 @@ void bdrv_close(BlockDriverState *bs)
> >          }
> >      }
> >  
> > -    bdrv_dev_change_media_cb(bs, false);
> > -
> >      /*throttling disk I/O limits*/
> >      if (bs->io_limits_enabled) {
> >          bdrv_io_limits_disable(bs);
> > diff --git a/blockdev.c b/blockdev.c
> > index 8a1652b..f1f3b6e 100644
> > --- a/blockdev.c
> > +++ b/blockdev.c
> > @@ -950,6 +950,7 @@ static void eject_device(BlockDriverState *bs, int force, Error **errp)
> >      }
> >  
> >      bdrv_close(bs);
> > +    bdrv_dev_change_media_cb(bs, false);
> >  }
> >  
> >  void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
> > @@ -1100,6 +1101,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
> >      bdrv_drain_all();
> >      bdrv_flush(bs);
> >      bdrv_close(bs);
> > +    bdrv_dev_change_media_cb(bs, false);
> >  
> >      /* if we have a device attached to this BlockDriverState
> >       * then we need to make the drive anonymous until the device
> 
> Invariant: callback does nothing unless a device model with removable
> media is connected (dev_ops->change_media_cb set).
> 
> Before 9ca1115: Callback runs on any bdrv_close() that actually ejects a
> medium.
> 
> Since 9ca1115: Callback runs on any bdrv_close()
> 
> With this patch applied: Callback runs in eject_device() and
> do_drive_del().  No change, except it now runs after
> bdrv_io_limits_disable(), which shouldn't matter.  This is the trivial
> part of the review.
> 
> Now the non-trivial part.  Callback no longer runs in
> 
> * bdrv_open() when it fails because it can't open the backing file
> 
> * bdrv_open() when it fails because the block driver doesn't consume the
>   all options
> 
> * bdrv_delete()
>   - bdrv_file_open() error path
>   - bdrv_open_backing_file() error path
>   - bdrv_open() snapshot=on path
>   - bdrv_open(), purpose not obvious, perhaps related to format probing
>   - bdrv_open() some error paths
>   - bdrv_close(), bs->backing_hd
>   - bdrv_close(), bs->file
>   - bdrv_drop_intermediate()
>   - bdrv_snapshot_goto() error path
>   - bdrv_img_create()
>   - drive_uninit()
>   - drive_init() error path
>   - qmp_transaction() error path
>   - qmp_drive_mirror() some error paths
>   - qemu_img.c many places
>   - qemu_io.c many places
>   - blkverify_open() error path
>   - blkverify_close()
>   - cow_create()
>   - mirror_run()
>   - qcow_create()
>   - qcow2_create2()
>   - qed_create()
>   - sheepdog.c's sd_prealloc(), sd_create()
>   - close_unused_images()
>   - vmdk_free_extents(), vmdk_parse_extents(), vmdk_create()
>   - vvfat.c's write_target_close()
> 
> * qemu-nbd.c's main()
> 
> * mirror_run()
> 
> * qcow2_create2()
> 
> * pci_piix3_xen_ide_unplug()
> 
> * bdrv_close_all()
>   - qemu-nbd.c, from atexit()  Same as above.
>   - vl.c's main().
> 
> For each of them, I'd like to see an argument why the not running the
> callback is okay.

Very encouraging :( I wonder if such a comprehensive analysis was done
when 1 adding the callback in the first place and 2 on 9ca1115.

> A good one is "no device model can be attached", say
> because the BDS isn't a root (device models only attach to roots), or
> because the BDS hasn't escaped its constructor, yet.
> 
> Not wanting to do all this work is exactly why I refrained from
> attempting to fix the problem commit 9ca1115 attempts to fix (I
> suggested to declare it a feature instead).  Didn't help, because I also
> refrained from NAKing that fix, and here we are.
> 
> I'll do what I can, as time permits, but help is certainly appreciated.

The bug I'm fixing is not a serious one, but this makes me think that
reverting 9ca1115 is the best thing to do.
Markus Armbruster - April 26, 2013, 11:35 a.m.
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Thu, 25 Apr 2013 20:18:35 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Luiz Capitulino <lcapitulino@redhat.com> writes:
>> 
>> > Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
>> > bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
>> > this added a regression where spurious DEVICE_TRAY_MOVED events
>> > are emitted at shutdown.
>> >
>> > To fix that this commit moves the bdrv_dev_change_media_cb() calls
>> > to the callers that really need to report a media change, which
>> > are eject_device() and do_drive_del(). This fixes the problem
>> > commit 9ca1115 intended to fix, plus the spurious events.
>> >
>> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
>> > ---
>> >  block.c    | 2 --
>> >  blockdev.c | 2 ++
>> >  2 files changed, 2 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/block.c b/block.c
>> > index 90d0ed1..7fc3014 100644
>> > --- a/block.c
>> > +++ b/block.c
>> > @@ -1342,8 +1342,6 @@ void bdrv_close(BlockDriverState *bs)
>> >          }
>> >      }
>> >  
>> > -    bdrv_dev_change_media_cb(bs, false);
>> > -
>> >      /*throttling disk I/O limits*/
>> >      if (bs->io_limits_enabled) {
>> >          bdrv_io_limits_disable(bs);
>> > diff --git a/blockdev.c b/blockdev.c
>> > index 8a1652b..f1f3b6e 100644
>> > --- a/blockdev.c
>> > +++ b/blockdev.c
>> > @@ -950,6 +950,7 @@ static void eject_device(BlockDriverState *bs, int force, Error **errp)
>> >      }
>> >  
>> >      bdrv_close(bs);
>> > +    bdrv_dev_change_media_cb(bs, false);
>> >  }
>> >  
>> >  void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
>> > @@ -1100,6 +1101,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
>> >      bdrv_drain_all();
>> >      bdrv_flush(bs);
>> >      bdrv_close(bs);
>> > +    bdrv_dev_change_media_cb(bs, false);
>> >  
>> >      /* if we have a device attached to this BlockDriverState
>> >       * then we need to make the drive anonymous until the device
>> 
>> Invariant: callback does nothing unless a device model with removable
>> media is connected (dev_ops->change_media_cb set).
>> 
>> Before 9ca1115: Callback runs on any bdrv_close() that actually ejects a
>> medium.
>> 
>> Since 9ca1115: Callback runs on any bdrv_close()
>> 
>> With this patch applied: Callback runs in eject_device() and
>> do_drive_del().  No change, except it now runs after
>> bdrv_io_limits_disable(), which shouldn't matter.  This is the trivial
>> part of the review.
>> 
>> Now the non-trivial part.  Callback no longer runs in
>> 
>> * bdrv_open() when it fails because it can't open the backing file
>> 
>> * bdrv_open() when it fails because the block driver doesn't consume the
>>   all options
>> 
>> * bdrv_delete()
>>   - bdrv_file_open() error path
>>   - bdrv_open_backing_file() error path
>>   - bdrv_open() snapshot=on path
>>   - bdrv_open(), purpose not obvious, perhaps related to format probing
>>   - bdrv_open() some error paths
>>   - bdrv_close(), bs->backing_hd
>>   - bdrv_close(), bs->file
>>   - bdrv_drop_intermediate()
>>   - bdrv_snapshot_goto() error path
>>   - bdrv_img_create()
>>   - drive_uninit()
>>   - drive_init() error path
>>   - qmp_transaction() error path
>>   - qmp_drive_mirror() some error paths
>>   - qemu_img.c many places
>>   - qemu_io.c many places
>>   - blkverify_open() error path
>>   - blkverify_close()
>>   - cow_create()
>>   - mirror_run()
>>   - qcow_create()
>>   - qcow2_create2()
>>   - qed_create()
>>   - sheepdog.c's sd_prealloc(), sd_create()
>>   - close_unused_images()
>>   - vmdk_free_extents(), vmdk_parse_extents(), vmdk_create()
>>   - vvfat.c's write_target_close()
>> 
>> * qemu-nbd.c's main()
>> 
>> * mirror_run()
>> 
>> * qcow2_create2()
>> 
>> * pci_piix3_xen_ide_unplug()
>> 
>> * bdrv_close_all()
>>   - qemu-nbd.c, from atexit()  Same as above.
>>   - vl.c's main().
>> 
>> For each of them, I'd like to see an argument why the not running the
>> callback is okay.
>
> Very encouraging :( I wonder if such a comprehensive analysis was done
> when 1 adding the callback in the first place and 2 on 9ca1115.

Sorry about that.

It's not quite as bad as it looks, by the way.  I believe many (most?)
places can be ticked off relatively quickly, because no device model can
be attached there.

I'm pretty sure 9ca1115 went in without such a review.  But it's flawed.
Let's not repeat the mistake.

>> A good one is "no device model can be attached", say
>> because the BDS isn't a root (device models only attach to roots), or
>> because the BDS hasn't escaped its constructor, yet.
>> 
>> Not wanting to do all this work is exactly why I refrained from
>> attempting to fix the problem commit 9ca1115 attempts to fix (I
>> suggested to declare it a feature instead).  Didn't help, because I also
>> refrained from NAKing that fix, and here we are.
>> 
>> I'll do what I can, as time permits, but help is certainly appreciated.
>
> The bug I'm fixing is not a serious one, but this makes me think that
> reverting 9ca1115 is the best thing to do.

Reverting it goes back from the minor bug we have now (spurious events)
to the minor bug we used to have for ages (no event on "ejecting" an
empty tray).  Can't say I care much either way.

I'd like to have the bug fixed.  I outlined proper review above, but I'm
not sure it's worth the effort.  Whether that means we should stop
whining and do the review work, or apply the patch and pray, or do
nothing, or revert 9ca1115, I don't know.  Up to Kevin & Stefan.
Pavel Hrdina - May 21, 2013, 3:54 p.m.
On 25.4.2013 20:18, Markus Armbruster wrote:
> Luiz Capitulino <lcapitulino@redhat.com> writes:
>
>> Commit 9ca111544c64b5abed2e79cf52e19a8f227b347b moved the call to
>> bdrv_dev_change_media_cb() outside the media check in bdrv_close(),
>> this added a regression where spurious DEVICE_TRAY_MOVED events
>> are emitted at shutdown.
>>
>> To fix that this commit moves the bdrv_dev_change_media_cb() calls
>> to the callers that really need to report a media change, which
>> are eject_device() and do_drive_del(). This fixes the problem
>> commit 9ca1115 intended to fix, plus the spurious events.
>>
>> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
>> ---
>>   block.c    | 2 --
>>   blockdev.c | 2 ++
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 90d0ed1..7fc3014 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1342,8 +1342,6 @@ void bdrv_close(BlockDriverState *bs)
>>           }
>>       }
>>
>> -    bdrv_dev_change_media_cb(bs, false);
>> -
>>       /*throttling disk I/O limits*/
>>       if (bs->io_limits_enabled) {
>>           bdrv_io_limits_disable(bs);
>> diff --git a/blockdev.c b/blockdev.c
>> index 8a1652b..f1f3b6e 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -950,6 +950,7 @@ static void eject_device(BlockDriverState *bs, int force, Error **errp)
>>       }
>>
>>       bdrv_close(bs);
>> +    bdrv_dev_change_media_cb(bs, false);
>>   }
>>
>>   void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
>> @@ -1100,6 +1101,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
>>       bdrv_drain_all();
>>       bdrv_flush(bs);
>>       bdrv_close(bs);
>> +    bdrv_dev_change_media_cb(bs, false);
>>
>>       /* if we have a device attached to this BlockDriverState
>>        * then we need to make the drive anonymous until the device
>
> Invariant: callback does nothing unless a device model with removable
> media is connected (dev_ops->change_media_cb set).
>
> Before 9ca1115: Callback runs on any bdrv_close() that actually ejects a
> medium.
>
> Since 9ca1115: Callback runs on any bdrv_close()
>
> With this patch applied: Callback runs in eject_device() and
> do_drive_del().  No change, except it now runs after
> bdrv_io_limits_disable(), which shouldn't matter.  This is the trivial
> part of the review.
>
> Now the non-trivial part.  Callback no longer runs in
>
> * bdrv_open() when it fails because it can't open the backing file
>
> * bdrv_open() when it fails because the block driver doesn't consume the
>    all options
>
> * bdrv_delete()
>    - bdrv_file_open() error path
>    - bdrv_open_backing_file() error path
>    - bdrv_open() snapshot=on path
>    - bdrv_open(), purpose not obvious, perhaps related to format probing
>    - bdrv_open() some error paths
>    - bdrv_close(), bs->backing_hd
>    - bdrv_close(), bs->file
>    - bdrv_drop_intermediate()
>    - bdrv_snapshot_goto() error path
>    - bdrv_img_create()
>    - drive_uninit()
>    - drive_init() error path
>    - qmp_transaction() error path
>    - qmp_drive_mirror() some error paths
>    - qemu_img.c many places
>    - qemu_io.c many places
>    - blkverify_open() error path
>    - blkverify_close()
>    - cow_create()
>    - mirror_run()
>    - qcow_create()
>    - qcow2_create2()
>    - qed_create()
>    - sheepdog.c's sd_prealloc(), sd_create()
>    - close_unused_images()
>    - vmdk_free_extents(), vmdk_parse_extents(), vmdk_create()
>    - vvfat.c's write_target_close()
>
> * qemu-nbd.c's main()
>
> * mirror_run()
>
> * qcow2_create2()
>
> * pci_piix3_xen_ide_unplug()
>
> * bdrv_close_all()
>    - qemu-nbd.c, from atexit()  Same as above.
>    - vl.c's main().
>
> For each of them, I'd like to see an argument why the not running the
> callback is okay.  A good one is "no device model can be attached", say
> because the BDS isn't a root (device models only attach to roots), or
> because the BDS hasn't escaped its constructor, yet.
>
> Not wanting to do all this work is exactly why I refrained from
> attempting to fix the problem commit 9ca1115 attempts to fix (I
> suggested to declare it a feature instead).  Didn't help, because I also
> refrained from NAKing that fix, and here we are.
>
> I'll do what I can, as time permits, but help is certainly appreciated.
>

I'm think that we should call the bdrv_dev_change_media_cb() only in 
that cases where it would be called on bare-metal. That means only if we 
are ejecting or changing the device's media.

For example the case in this patch, that this function is called in 
do_drive_del() is not right. On real machine if you want to remove the 
whole drive you don't open the tray.

So the only callers of this function should be the eject_device() and 
the qmp_bdrv_open_encrypted() according to current logic:
     - call only if the open is successful and key is not required
     - or if the open is successful and key is required and correct

Events from guest side are handled from different place.

This is my opinion so any comments or corrections are welcomed.

Pavel

Patch

diff --git a/block.c b/block.c
index 90d0ed1..7fc3014 100644
--- a/block.c
+++ b/block.c
@@ -1342,8 +1342,6 @@  void bdrv_close(BlockDriverState *bs)
         }
     }
 
-    bdrv_dev_change_media_cb(bs, false);
-
     /*throttling disk I/O limits*/
     if (bs->io_limits_enabled) {
         bdrv_io_limits_disable(bs);
diff --git a/blockdev.c b/blockdev.c
index 8a1652b..f1f3b6e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -950,6 +950,7 @@  static void eject_device(BlockDriverState *bs, int force, Error **errp)
     }
 
     bdrv_close(bs);
+    bdrv_dev_change_media_cb(bs, false);
 }
 
 void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
@@ -1100,6 +1101,7 @@  int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
     bdrv_drain_all();
     bdrv_flush(bs);
     bdrv_close(bs);
+    bdrv_dev_change_media_cb(bs, false);
 
     /* if we have a device attached to this BlockDriverState
      * then we need to make the drive anonymous until the device