diff mbox

migration: qemu-coroutine-lock.c:141: qemu_co_mutex_unlock: Assertion `mutex->locked == 1' failed

Message ID 54192E4A.8020401@ozlabs.ru
State New
Headers show

Commit Message

Alexey Kardashevskiy Sept. 17, 2014, 6:46 a.m. UTC
On 09/16/2014 10:34 PM, Kevin Wolf wrote:
> Am 16.09.2014 um 14:10 hat Paolo Bonzini geschrieben:
>> Il 16/09/2014 14:02, Alexey Kardashevskiy ha scritto:
>>> I am having problems when migrate a guest via libvirt like this:
>>>
>>> virsh migrate --live --persistent --undefinesource --copy-storage-all
>>> --verbose --desturi qemu+ssh://legkvm/system --domain chig1
>>>
>>> The XML used to create the guest is at the end of this mail.
>>>
>>> I see NBD FLUSH command after the destination QEMU received EOF for
>>> migration stream and this produces a crash in qcow2_co_flush_to_os() as
>>> s->lock is false or s->l2_table_cache is NULL.
>>>
>>
>> Max, Kevin, could the fix be something like this?
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index 0daf25c..e7459ea 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -1442,6 +1442,7 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
>>          memcpy(&aes_decrypt_key, &s->aes_decrypt_key, sizeof(aes_decrypt_key));
>>      }
>>  
>> +    qemu_co_mutex_lock(&s->lock);
>>      qcow2_close(bs);
>>  
>>      bdrv_invalidate_cache(bs->file, &local_err);
>> @@ -1455,6 +1456,7 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
>>  
>>      ret = qcow2_open(bs, options, flags, &local_err);
>>      QDECREF(options);
>> +    qemu_co_mutex_unlock(&s->lock);
>>      if (local_err) {
>>          error_setg(errp, "Could not reopen qcow2 layer: %s",
>>                     error_get_pretty(local_err));
>>
>> On top of this, *_invalidate_cache needs to be marked as coroutine_fn.
> 
> I think bdrv_invalidate_cache() really needs to call bdrv_drain_all()
> before starting to reopen stuff. There could be requests in flight
> without holding the lock and if you can indeed reopen their BDS under
> their feet without breaking things (I doubt it), that would be pure
> luck.


I tried the patch below and it did not help. So I assume I did it wrong,
could you please explain more? Thanks!


         return;
diff mbox

Patch

diff --git a/block.c b/block.c
index 2df600e..ecc876d 100644
--- a/block.c
+++ b/block.c
@@ -5038,11 +5038,16 @@  void bdrv_invalidate_cache(BlockDriverState *bs,
Error **errp)
         return;
     }

+    bdrv_drain_all();
+
     if (bs->drv->bdrv_invalidate_cache) {
         bs->drv->bdrv_invalidate_cache(bs, &local_err);
     } else if (bs->file) {
         bdrv_invalidate_cache(bs->file, &local_err);
     }
+
+    bdrv_drain_all();
+
     if (local_err) {
         error_propagate(errp, local_err);