diff mbox series

[RFC,v2,1/3] vfio: Move the saving of the config space to the right place in VFIO migration

Message ID 20201209080919.156-2-lushenming@huawei.com
State New
Headers show
Series vfio: Some fixes and optimizations for VFIO migration | expand

Commit Message

Shenming Lu Dec. 9, 2020, 8:09 a.m. UTC
On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
setup, if the restoring of the VFIO PCI device config space is
before the VGIC, an error might occur in the kernel.

So we move the saving of the config space to the non-iterable
process, so that it will be called after the VGIC according to
their priorities.

As for the possible dependence of the device specific migration
data on it's config space, we can let the vendor driver to
include any config info it needs in its own data stream.
(Should we note this in the header file linux-headers/linux/vfio.h?)

Signed-off-by: Shenming Lu <lushenming@huawei.com>
---
 hw/vfio/migration.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

Comments

Cornelia Huck Dec. 9, 2020, 12:29 p.m. UTC | #1
On Wed, 9 Dec 2020 16:09:17 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
> setup, if the restoring of the VFIO PCI device config space is
> before the VGIC, an error might occur in the kernel.
> 
> So we move the saving of the config space to the non-iterable
> process, so that it will be called after the VGIC according to
> their priorities.
> 
> As for the possible dependence of the device specific migration
> data on it's config space, we can let the vendor driver to
> include any config info it needs in its own data stream.
> (Should we note this in the header file linux-headers/linux/vfio.h?)

Given that the header is our primary source about how this interface
should act, we need to properly document expectations about what will
be saved/restored when there (well, in the source file in the kernel.)
That goes in both directions: what a userspace must implement, and what
a vendor driver can rely on.

[Related, but not a todo for you: I think we're still missing proper
documentation of the whole migration feature.]

> 
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> ---
>  hw/vfio/migration.c | 25 +++++++++++++++----------
>  1 file changed, 15 insertions(+), 10 deletions(-)
Alex Williamson Dec. 9, 2020, 6:34 p.m. UTC | #2
On Wed, 9 Dec 2020 13:29:47 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Wed, 9 Dec 2020 16:09:17 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
> > On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
> > setup, if the restoring of the VFIO PCI device config space is
> > before the VGIC, an error might occur in the kernel.
> > 
> > So we move the saving of the config space to the non-iterable
> > process, so that it will be called after the VGIC according to
> > their priorities.
> > 
> > As for the possible dependence of the device specific migration
> > data on it's config space, we can let the vendor driver to
> > include any config info it needs in its own data stream.
> > (Should we note this in the header file linux-headers/linux/vfio.h?)  
> 
> Given that the header is our primary source about how this interface
> should act, we need to properly document expectations about what will
> be saved/restored when there (well, in the source file in the kernel.)
> That goes in both directions: what a userspace must implement, and what
> a vendor driver can rely on.
> 
> [Related, but not a todo for you: I think we're still missing proper
> documentation of the whole migration feature.]

Yes, we never saw anything past v1 of the documentation patch.  Thanks,

Alex


> > Signed-off-by: Shenming Lu <lushenming@huawei.com>
> > ---
> >  hw/vfio/migration.c | 25 +++++++++++++++----------
> >  1 file changed, 15 insertions(+), 10 deletions(-)
Shenming Lu Dec. 10, 2020, 2:21 a.m. UTC | #3
On 2020/12/10 2:34, Alex Williamson wrote:
> On Wed, 9 Dec 2020 13:29:47 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
>> On Wed, 9 Dec 2020 16:09:17 +0800
>> Shenming Lu <lushenming@huawei.com> wrote:
>>
>>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
>>> setup, if the restoring of the VFIO PCI device config space is
>>> before the VGIC, an error might occur in the kernel.
>>>
>>> So we move the saving of the config space to the non-iterable
>>> process, so that it will be called after the VGIC according to
>>> their priorities.
>>>
>>> As for the possible dependence of the device specific migration
>>> data on it's config space, we can let the vendor driver to
>>> include any config info it needs in its own data stream.
>>> (Should we note this in the header file linux-headers/linux/vfio.h?)  
>>
>> Given that the header is our primary source about how this interface
>> should act, we need to properly document expectations about what will
>> be saved/restored when there (well, in the source file in the kernel.)
>> That goes in both directions: what a userspace must implement, and what
>> a vendor driver can rely on.

Yeah, in order to make the vendor driver and QEMU cooperate better, we might
need to document some expectations about the data section in the migration
region...

>>
>> [Related, but not a todo for you: I think we're still missing proper
>> documentation of the whole migration feature.]
> 
> Yes, we never saw anything past v1 of the documentation patch.  Thanks,
> 

By the way, is there anything unproper with this patch? Wish your suggestion. :-)

> Alex
> 
> 
>>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>>> ---
>>>  hw/vfio/migration.c | 25 +++++++++++++++----------
>>>  1 file changed, 15 insertions(+), 10 deletions(-)  
> 
> .
>
Alex Williamson Jan. 26, 2021, 9:36 p.m. UTC | #4
On Thu, 10 Dec 2020 10:21:21 +0800
Shenming Lu <lushenming@huawei.com> wrote:

> On 2020/12/10 2:34, Alex Williamson wrote:
> > On Wed, 9 Dec 2020 13:29:47 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >   
> >> On Wed, 9 Dec 2020 16:09:17 +0800
> >> Shenming Lu <lushenming@huawei.com> wrote:
> >>  
> >>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
> >>> setup, if the restoring of the VFIO PCI device config space is
> >>> before the VGIC, an error might occur in the kernel.
> >>>
> >>> So we move the saving of the config space to the non-iterable
> >>> process, so that it will be called after the VGIC according to
> >>> their priorities.
> >>>
> >>> As for the possible dependence of the device specific migration
> >>> data on it's config space, we can let the vendor driver to
> >>> include any config info it needs in its own data stream.
> >>> (Should we note this in the header file linux-headers/linux/vfio.h?)    
> >>
> >> Given that the header is our primary source about how this interface
> >> should act, we need to properly document expectations about what will
> >> be saved/restored when there (well, in the source file in the kernel.)
> >> That goes in both directions: what a userspace must implement, and what
> >> a vendor driver can rely on.  
> 
> Yeah, in order to make the vendor driver and QEMU cooperate better, we might
> need to document some expectations about the data section in the migration
> region...
> >>
> >> [Related, but not a todo for you: I think we're still missing proper
> >> documentation of the whole migration feature.]  
> > 
> > Yes, we never saw anything past v1 of the documentation patch.  Thanks,
> >   
> 
> By the way, is there anything unproper with this patch? Wish your suggestion. :-)

I'm really hoping for some feedback from Kirti, I understand the NVIDIA
vGPU driver to have some dependency on this.  Thanks,

Alex

> >>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> >>> ---
> >>>  hw/vfio/migration.c | 25 +++++++++++++++----------
> >>>  1 file changed, 15 insertions(+), 10 deletions(-)    
> > 
> > .
> >   
>
Kirti Wankhede Jan. 27, 2021, 9:52 p.m. UTC | #5
On 1/27/2021 3:06 AM, Alex Williamson wrote:
> On Thu, 10 Dec 2020 10:21:21 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> On 2020/12/10 2:34, Alex Williamson wrote:
>>> On Wed, 9 Dec 2020 13:29:47 +0100
>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>    
>>>> On Wed, 9 Dec 2020 16:09:17 +0800
>>>> Shenming Lu <lushenming@huawei.com> wrote:
>>>>   
>>>>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
>>>>> setup, if the restoring of the VFIO PCI device config space is
>>>>> before the VGIC, an error might occur in the kernel.
>>>>>
>>>>> So we move the saving of the config space to the non-iterable
>>>>> process, so that it will be called after the VGIC according to
>>>>> their priorities.
>>>>>
>>>>> As for the possible dependence of the device specific migration
>>>>> data on it's config space, we can let the vendor driver to
>>>>> include any config info it needs in its own data stream.
>>>>> (Should we note this in the header file linux-headers/linux/vfio.h?)
>>>>
>>>> Given that the header is our primary source about how this interface
>>>> should act, we need to properly document expectations about what will
>>>> be saved/restored when there (well, in the source file in the kernel.)
>>>> That goes in both directions: what a userspace must implement, and what
>>>> a vendor driver can rely on.
>>
>> Yeah, in order to make the vendor driver and QEMU cooperate better, we might
>> need to document some expectations about the data section in the migration
>> region...
>>>>
>>>> [Related, but not a todo for you: I think we're still missing proper
>>>> documentation of the whole migration feature.]
>>>
>>> Yes, we never saw anything past v1 of the documentation patch.  Thanks,
>>>    
>>
>> By the way, is there anything unproper with this patch? Wish your suggestion. :-)
> 
> I'm really hoping for some feedback from Kirti, I understand the NVIDIA
> vGPU driver to have some dependency on this.  Thanks,
> 

I need to verify this patch. Spare me a day to verify this.

Thanks,
Kirti


> Alex
> 
>>>>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>>>>> ---
>>>>>   hw/vfio/migration.c | 25 +++++++++++++++----------
>>>>>   1 file changed, 15 insertions(+), 10 deletions(-)
>>>
>>> .
>>>    
>>
>
Kirti Wankhede Feb. 18, 2021, 2:42 p.m. UTC | #6
On 12/9/2020 1:39 PM, Shenming Lu wrote:
> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
> setup, if the restoring of the VFIO PCI device config space is
> before the VGIC, an error might occur in the kernel.
> 
> So we move the saving of the config space to the non-iterable
> process, so that it will be called after the VGIC according to
> their priorities.
> 
> As for the possible dependence of the device specific migration
> data on it's config space, we can let the vendor driver to
> include any config info it needs in its own data stream.
> (Should we note this in the header file linux-headers/linux/vfio.h?)
> 
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> ---
>   hw/vfio/migration.c | 25 +++++++++++++++----------
>   1 file changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 00daa50ed8..3b9de1353a 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -575,11 +575,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>           return ret;
>       }
>   
> -    ret = vfio_save_device_config_state(f, opaque);
> -    if (ret) {
> -        return ret;
> -    }
> -
>       ret = vfio_update_pending(vbasedev);
>       if (ret) {
>           return ret;
> @@ -620,6 +615,19 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>       return ret;
>   }
>   
> +static void vfio_save_state(QEMUFile *f, void *opaque)
> +{
> +    VFIODevice *vbasedev = opaque;
> +    int ret;
> +
> +    /* The device specific data is migrated in the iterable process. */
> +    ret = vfio_save_device_config_state(f, opaque);
> +    if (ret) {
> +        error_report("%s: Failed to save device config space",
> +                     vbasedev->name);
> +    }
> +}
> +

Since error is not propagated, set error in migration stream for 
migration to fail, use qemu_file_set_error() on error.

Thanks,
Kirti

>   static int vfio_load_setup(QEMUFile *f, void *opaque)
>   {
>       VFIODevice *vbasedev = opaque;
> @@ -670,11 +678,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
>           switch (data) {
>           case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
>           {
> -            ret = vfio_load_device_config_state(f, opaque);
> -            if (ret) {
> -                return ret;
> -            }
> -            break;
> +            return vfio_load_device_config_state(f, opaque);
>           }
>           case VFIO_MIG_FLAG_DEV_SETUP_STATE:
>           {
> @@ -720,6 +724,7 @@ static SaveVMHandlers savevm_vfio_handlers = {
>       .save_live_pending = vfio_save_pending,
>       .save_live_iterate = vfio_save_iterate,
>       .save_live_complete_precopy = vfio_save_complete_precopy,
> +    .save_state = vfio_save_state,
>       .load_setup = vfio_load_setup,
>       .load_cleanup = vfio_load_cleanup,
>       .load_state = vfio_load_state,
>
Kirti Wankhede Feb. 18, 2021, 2:45 p.m. UTC | #7
On 1/27/2021 3:06 AM, Alex Williamson wrote:
> On Thu, 10 Dec 2020 10:21:21 +0800
> Shenming Lu <lushenming@huawei.com> wrote:
> 
>> On 2020/12/10 2:34, Alex Williamson wrote:
>>> On Wed, 9 Dec 2020 13:29:47 +0100
>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>    
>>>> On Wed, 9 Dec 2020 16:09:17 +0800
>>>> Shenming Lu <lushenming@huawei.com> wrote:
>>>>   
>>>>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
>>>>> setup, if the restoring of the VFIO PCI device config space is
>>>>> before the VGIC, an error might occur in the kernel.
>>>>>
>>>>> So we move the saving of the config space to the non-iterable
>>>>> process, so that it will be called after the VGIC according to
>>>>> their priorities.
>>>>>
>>>>> As for the possible dependence of the device specific migration
>>>>> data on it's config space, we can let the vendor driver to
>>>>> include any config info it needs in its own data stream.
>>>>> (Should we note this in the header file linux-headers/linux/vfio.h?)
>>>>
>>>> Given that the header is our primary source about how this interface
>>>> should act, we need to properly document expectations about what will
>>>> be saved/restored when there (well, in the source file in the kernel.)
>>>> That goes in both directions: what a userspace must implement, and what
>>>> a vendor driver can rely on.
>>
>> Yeah, in order to make the vendor driver and QEMU cooperate better, we might
>> need to document some expectations about the data section in the migration
>> region...
>>>>
>>>> [Related, but not a todo for you: I think we're still missing proper
>>>> documentation of the whole migration feature.]
>>>
>>> Yes, we never saw anything past v1 of the documentation patch.  Thanks,
>>>

I'll get back on this and send next version soon.

>>
>> By the way, is there anything unproper with this patch? Wish your suggestion. :-)
> 
> I'm really hoping for some feedback from Kirti, I understand the NVIDIA
> vGPU driver to have some dependency on this.  Thanks,

NVIDIA driver doesn't use device config space value/information during 
device data restoration, so we are good with this change.

Thanks,
Kirti


> 
> Alex
> 
>>>>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>>>>> ---
>>>>>   hw/vfio/migration.c | 25 +++++++++++++++----------
>>>>>   1 file changed, 15 insertions(+), 10 deletions(-)
>>>
>>> .
>>>    
>>
>
Shenming Lu Feb. 19, 2021, 10:33 a.m. UTC | #8
On 2021/2/18 22:42, Kirti Wankhede wrote:
> 
> 
> On 12/9/2020 1:39 PM, Shenming Lu wrote:
>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt
>> setup, if the restoring of the VFIO PCI device config space is
>> before the VGIC, an error might occur in the kernel.
>>
>> So we move the saving of the config space to the non-iterable
>> process, so that it will be called after the VGIC according to
>> their priorities.
>>
>> As for the possible dependence of the device specific migration
>> data on it's config space, we can let the vendor driver to
>> include any config info it needs in its own data stream.
>> (Should we note this in the header file linux-headers/linux/vfio.h?)
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> ---
>>   hw/vfio/migration.c | 25 +++++++++++++++----------
>>   1 file changed, 15 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 00daa50ed8..3b9de1353a 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -575,11 +575,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>>           return ret;
>>       }
>>   -    ret = vfio_save_device_config_state(f, opaque);
>> -    if (ret) {
>> -        return ret;
>> -    }
>> -
>>       ret = vfio_update_pending(vbasedev);
>>       if (ret) {
>>           return ret;
>> @@ -620,6 +615,19 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>>       return ret;
>>   }
>>   +static void vfio_save_state(QEMUFile *f, void *opaque)
>> +{
>> +    VFIODevice *vbasedev = opaque;
>> +    int ret;
>> +
>> +    /* The device specific data is migrated in the iterable process. */
>> +    ret = vfio_save_device_config_state(f, opaque);
>> +    if (ret) {
>> +        error_report("%s: Failed to save device config space",
>> +                     vbasedev->name);
>> +    }
>> +}
>> +
> 
> Since error is not propagated, set error in migration stream for migration to fail, use qemu_file_set_error() on error.

Makes sense. I will send a v3 soon.	Thanks,

Shenming

> 
> Thanks,
> Kirti
> 
>>   static int vfio_load_setup(QEMUFile *f, void *opaque)
>>   {
>>       VFIODevice *vbasedev = opaque;
>> @@ -670,11 +678,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
>>           switch (data) {
>>           case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
>>           {
>> -            ret = vfio_load_device_config_state(f, opaque);
>> -            if (ret) {
>> -                return ret;
>> -            }
>> -            break;
>> +            return vfio_load_device_config_state(f, opaque);
>>           }
>>           case VFIO_MIG_FLAG_DEV_SETUP_STATE:
>>           {
>> @@ -720,6 +724,7 @@ static SaveVMHandlers savevm_vfio_handlers = {
>>       .save_live_pending = vfio_save_pending,
>>       .save_live_iterate = vfio_save_iterate,
>>       .save_live_complete_precopy = vfio_save_complete_precopy,
>> +    .save_state = vfio_save_state,
>>       .load_setup = vfio_load_setup,
>>       .load_cleanup = vfio_load_cleanup,
>>       .load_state = vfio_load_state,
>>
> .
diff mbox series

Patch

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 00daa50ed8..3b9de1353a 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -575,11 +575,6 @@  static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
         return ret;
     }
 
-    ret = vfio_save_device_config_state(f, opaque);
-    if (ret) {
-        return ret;
-    }
-
     ret = vfio_update_pending(vbasedev);
     if (ret) {
         return ret;
@@ -620,6 +615,19 @@  static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     return ret;
 }
 
+static void vfio_save_state(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    int ret;
+
+    /* The device specific data is migrated in the iterable process. */
+    ret = vfio_save_device_config_state(f, opaque);
+    if (ret) {
+        error_report("%s: Failed to save device config space",
+                     vbasedev->name);
+    }
+}
+
 static int vfio_load_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -670,11 +678,7 @@  static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
         switch (data) {
         case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
         {
-            ret = vfio_load_device_config_state(f, opaque);
-            if (ret) {
-                return ret;
-            }
-            break;
+            return vfio_load_device_config_state(f, opaque);
         }
         case VFIO_MIG_FLAG_DEV_SETUP_STATE:
         {
@@ -720,6 +724,7 @@  static SaveVMHandlers savevm_vfio_handlers = {
     .save_live_pending = vfio_save_pending,
     .save_live_iterate = vfio_save_iterate,
     .save_live_complete_precopy = vfio_save_complete_precopy,
+    .save_state = vfio_save_state,
     .load_setup = vfio_load_setup,
     .load_cleanup = vfio_load_cleanup,
     .load_state = vfio_load_state,