Message ID | 20201209080919.156-2-lushenming@huawei.com |
---|---|
State | New |
Headers | show |
Series | vfio: Some fixes and optimizations for VFIO migration | expand |
On Wed, 9 Dec 2020 16:09:17 +0800 Shenming Lu <lushenming@huawei.com> wrote: > On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt > setup, if the restoring of the VFIO PCI device config space is > before the VGIC, an error might occur in the kernel. > > So we move the saving of the config space to the non-iterable > process, so that it will be called after the VGIC according to > their priorities. > > As for the possible dependence of the device specific migration > data on it's config space, we can let the vendor driver to > include any config info it needs in its own data stream. > (Should we note this in the header file linux-headers/linux/vfio.h?) Given that the header is our primary source about how this interface should act, we need to properly document expectations about what will be saved/restored when there (well, in the source file in the kernel.) That goes in both directions: what a userspace must implement, and what a vendor driver can rely on. [Related, but not a todo for you: I think we're still missing proper documentation of the whole migration feature.] > > Signed-off-by: Shenming Lu <lushenming@huawei.com> > --- > hw/vfio/migration.c | 25 +++++++++++++++---------- > 1 file changed, 15 insertions(+), 10 deletions(-)
On Wed, 9 Dec 2020 13:29:47 +0100 Cornelia Huck <cohuck@redhat.com> wrote: > On Wed, 9 Dec 2020 16:09:17 +0800 > Shenming Lu <lushenming@huawei.com> wrote: > > > On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt > > setup, if the restoring of the VFIO PCI device config space is > > before the VGIC, an error might occur in the kernel. > > > > So we move the saving of the config space to the non-iterable > > process, so that it will be called after the VGIC according to > > their priorities. > > > > As for the possible dependence of the device specific migration > > data on it's config space, we can let the vendor driver to > > include any config info it needs in its own data stream. > > (Should we note this in the header file linux-headers/linux/vfio.h?) > > Given that the header is our primary source about how this interface > should act, we need to properly document expectations about what will > be saved/restored when there (well, in the source file in the kernel.) > That goes in both directions: what a userspace must implement, and what > a vendor driver can rely on. > > [Related, but not a todo for you: I think we're still missing proper > documentation of the whole migration feature.] Yes, we never saw anything past v1 of the documentation patch. Thanks, Alex > > Signed-off-by: Shenming Lu <lushenming@huawei.com> > > --- > > hw/vfio/migration.c | 25 +++++++++++++++---------- > > 1 file changed, 15 insertions(+), 10 deletions(-)
On 2020/12/10 2:34, Alex Williamson wrote: > On Wed, 9 Dec 2020 13:29:47 +0100 > Cornelia Huck <cohuck@redhat.com> wrote: > >> On Wed, 9 Dec 2020 16:09:17 +0800 >> Shenming Lu <lushenming@huawei.com> wrote: >> >>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt >>> setup, if the restoring of the VFIO PCI device config space is >>> before the VGIC, an error might occur in the kernel. >>> >>> So we move the saving of the config space to the non-iterable >>> process, so that it will be called after the VGIC according to >>> their priorities. >>> >>> As for the possible dependence of the device specific migration >>> data on it's config space, we can let the vendor driver to >>> include any config info it needs in its own data stream. >>> (Should we note this in the header file linux-headers/linux/vfio.h?) >> >> Given that the header is our primary source about how this interface >> should act, we need to properly document expectations about what will >> be saved/restored when there (well, in the source file in the kernel.) >> That goes in both directions: what a userspace must implement, and what >> a vendor driver can rely on. Yeah, in order to make the vendor driver and QEMU cooperate better, we might need to document some expectations about the data section in the migration region... >> >> [Related, but not a todo for you: I think we're still missing proper >> documentation of the whole migration feature.] > > Yes, we never saw anything past v1 of the documentation patch. Thanks, > By the way, is there anything unproper with this patch? Wish your suggestion. :-) > Alex > > >>> Signed-off-by: Shenming Lu <lushenming@huawei.com> >>> --- >>> hw/vfio/migration.c | 25 +++++++++++++++---------- >>> 1 file changed, 15 insertions(+), 10 deletions(-) > > . >
On Thu, 10 Dec 2020 10:21:21 +0800 Shenming Lu <lushenming@huawei.com> wrote: > On 2020/12/10 2:34, Alex Williamson wrote: > > On Wed, 9 Dec 2020 13:29:47 +0100 > > Cornelia Huck <cohuck@redhat.com> wrote: > > > >> On Wed, 9 Dec 2020 16:09:17 +0800 > >> Shenming Lu <lushenming@huawei.com> wrote: > >> > >>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt > >>> setup, if the restoring of the VFIO PCI device config space is > >>> before the VGIC, an error might occur in the kernel. > >>> > >>> So we move the saving of the config space to the non-iterable > >>> process, so that it will be called after the VGIC according to > >>> their priorities. > >>> > >>> As for the possible dependence of the device specific migration > >>> data on it's config space, we can let the vendor driver to > >>> include any config info it needs in its own data stream. > >>> (Should we note this in the header file linux-headers/linux/vfio.h?) > >> > >> Given that the header is our primary source about how this interface > >> should act, we need to properly document expectations about what will > >> be saved/restored when there (well, in the source file in the kernel.) > >> That goes in both directions: what a userspace must implement, and what > >> a vendor driver can rely on. > > Yeah, in order to make the vendor driver and QEMU cooperate better, we might > need to document some expectations about the data section in the migration > region... > >> > >> [Related, but not a todo for you: I think we're still missing proper > >> documentation of the whole migration feature.] > > > > Yes, we never saw anything past v1 of the documentation patch. Thanks, > > > > By the way, is there anything unproper with this patch? Wish your suggestion. :-) I'm really hoping for some feedback from Kirti, I understand the NVIDIA vGPU driver to have some dependency on this. Thanks, Alex > >>> Signed-off-by: Shenming Lu <lushenming@huawei.com> > >>> --- > >>> hw/vfio/migration.c | 25 +++++++++++++++---------- > >>> 1 file changed, 15 insertions(+), 10 deletions(-) > > > > . > > >
On 1/27/2021 3:06 AM, Alex Williamson wrote: > On Thu, 10 Dec 2020 10:21:21 +0800 > Shenming Lu <lushenming@huawei.com> wrote: > >> On 2020/12/10 2:34, Alex Williamson wrote: >>> On Wed, 9 Dec 2020 13:29:47 +0100 >>> Cornelia Huck <cohuck@redhat.com> wrote: >>> >>>> On Wed, 9 Dec 2020 16:09:17 +0800 >>>> Shenming Lu <lushenming@huawei.com> wrote: >>>> >>>>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt >>>>> setup, if the restoring of the VFIO PCI device config space is >>>>> before the VGIC, an error might occur in the kernel. >>>>> >>>>> So we move the saving of the config space to the non-iterable >>>>> process, so that it will be called after the VGIC according to >>>>> their priorities. >>>>> >>>>> As for the possible dependence of the device specific migration >>>>> data on it's config space, we can let the vendor driver to >>>>> include any config info it needs in its own data stream. >>>>> (Should we note this in the header file linux-headers/linux/vfio.h?) >>>> >>>> Given that the header is our primary source about how this interface >>>> should act, we need to properly document expectations about what will >>>> be saved/restored when there (well, in the source file in the kernel.) >>>> That goes in both directions: what a userspace must implement, and what >>>> a vendor driver can rely on. >> >> Yeah, in order to make the vendor driver and QEMU cooperate better, we might >> need to document some expectations about the data section in the migration >> region... >>>> >>>> [Related, but not a todo for you: I think we're still missing proper >>>> documentation of the whole migration feature.] >>> >>> Yes, we never saw anything past v1 of the documentation patch. Thanks, >>> >> >> By the way, is there anything unproper with this patch? Wish your suggestion. :-) > > I'm really hoping for some feedback from Kirti, I understand the NVIDIA > vGPU driver to have some dependency on this. Thanks, > I need to verify this patch. Spare me a day to verify this. Thanks, Kirti > Alex > >>>>> Signed-off-by: Shenming Lu <lushenming@huawei.com> >>>>> --- >>>>> hw/vfio/migration.c | 25 +++++++++++++++---------- >>>>> 1 file changed, 15 insertions(+), 10 deletions(-) >>> >>> . >>> >> >
On 12/9/2020 1:39 PM, Shenming Lu wrote: > On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt > setup, if the restoring of the VFIO PCI device config space is > before the VGIC, an error might occur in the kernel. > > So we move the saving of the config space to the non-iterable > process, so that it will be called after the VGIC according to > their priorities. > > As for the possible dependence of the device specific migration > data on it's config space, we can let the vendor driver to > include any config info it needs in its own data stream. > (Should we note this in the header file linux-headers/linux/vfio.h?) > > Signed-off-by: Shenming Lu <lushenming@huawei.com> > --- > hw/vfio/migration.c | 25 +++++++++++++++---------- > 1 file changed, 15 insertions(+), 10 deletions(-) > > diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c > index 00daa50ed8..3b9de1353a 100644 > --- a/hw/vfio/migration.c > +++ b/hw/vfio/migration.c > @@ -575,11 +575,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) > return ret; > } > > - ret = vfio_save_device_config_state(f, opaque); > - if (ret) { > - return ret; > - } > - > ret = vfio_update_pending(vbasedev); > if (ret) { > return ret; > @@ -620,6 +615,19 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) > return ret; > } > > +static void vfio_save_state(QEMUFile *f, void *opaque) > +{ > + VFIODevice *vbasedev = opaque; > + int ret; > + > + /* The device specific data is migrated in the iterable process. */ > + ret = vfio_save_device_config_state(f, opaque); > + if (ret) { > + error_report("%s: Failed to save device config space", > + vbasedev->name); > + } > +} > + Since error is not propagated, set error in migration stream for migration to fail, use qemu_file_set_error() on error. Thanks, Kirti > static int vfio_load_setup(QEMUFile *f, void *opaque) > { > VFIODevice *vbasedev = opaque; > @@ -670,11 +678,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) > switch (data) { > case VFIO_MIG_FLAG_DEV_CONFIG_STATE: > { > - ret = vfio_load_device_config_state(f, opaque); > - if (ret) { > - return ret; > - } > - break; > + return vfio_load_device_config_state(f, opaque); > } > case VFIO_MIG_FLAG_DEV_SETUP_STATE: > { > @@ -720,6 +724,7 @@ static SaveVMHandlers savevm_vfio_handlers = { > .save_live_pending = vfio_save_pending, > .save_live_iterate = vfio_save_iterate, > .save_live_complete_precopy = vfio_save_complete_precopy, > + .save_state = vfio_save_state, > .load_setup = vfio_load_setup, > .load_cleanup = vfio_load_cleanup, > .load_state = vfio_load_state, >
On 1/27/2021 3:06 AM, Alex Williamson wrote: > On Thu, 10 Dec 2020 10:21:21 +0800 > Shenming Lu <lushenming@huawei.com> wrote: > >> On 2020/12/10 2:34, Alex Williamson wrote: >>> On Wed, 9 Dec 2020 13:29:47 +0100 >>> Cornelia Huck <cohuck@redhat.com> wrote: >>> >>>> On Wed, 9 Dec 2020 16:09:17 +0800 >>>> Shenming Lu <lushenming@huawei.com> wrote: >>>> >>>>> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt >>>>> setup, if the restoring of the VFIO PCI device config space is >>>>> before the VGIC, an error might occur in the kernel. >>>>> >>>>> So we move the saving of the config space to the non-iterable >>>>> process, so that it will be called after the VGIC according to >>>>> their priorities. >>>>> >>>>> As for the possible dependence of the device specific migration >>>>> data on it's config space, we can let the vendor driver to >>>>> include any config info it needs in its own data stream. >>>>> (Should we note this in the header file linux-headers/linux/vfio.h?) >>>> >>>> Given that the header is our primary source about how this interface >>>> should act, we need to properly document expectations about what will >>>> be saved/restored when there (well, in the source file in the kernel.) >>>> That goes in both directions: what a userspace must implement, and what >>>> a vendor driver can rely on. >> >> Yeah, in order to make the vendor driver and QEMU cooperate better, we might >> need to document some expectations about the data section in the migration >> region... >>>> >>>> [Related, but not a todo for you: I think we're still missing proper >>>> documentation of the whole migration feature.] >>> >>> Yes, we never saw anything past v1 of the documentation patch. Thanks, >>> I'll get back on this and send next version soon. >> >> By the way, is there anything unproper with this patch? Wish your suggestion. :-) > > I'm really hoping for some feedback from Kirti, I understand the NVIDIA > vGPU driver to have some dependency on this. Thanks, NVIDIA driver doesn't use device config space value/information during device data restoration, so we are good with this change. Thanks, Kirti > > Alex > >>>>> Signed-off-by: Shenming Lu <lushenming@huawei.com> >>>>> --- >>>>> hw/vfio/migration.c | 25 +++++++++++++++---------- >>>>> 1 file changed, 15 insertions(+), 10 deletions(-) >>> >>> . >>> >> >
On 2021/2/18 22:42, Kirti Wankhede wrote: > > > On 12/9/2020 1:39 PM, Shenming Lu wrote: >> On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt >> setup, if the restoring of the VFIO PCI device config space is >> before the VGIC, an error might occur in the kernel. >> >> So we move the saving of the config space to the non-iterable >> process, so that it will be called after the VGIC according to >> their priorities. >> >> As for the possible dependence of the device specific migration >> data on it's config space, we can let the vendor driver to >> include any config info it needs in its own data stream. >> (Should we note this in the header file linux-headers/linux/vfio.h?) >> >> Signed-off-by: Shenming Lu <lushenming@huawei.com> >> --- >> hw/vfio/migration.c | 25 +++++++++++++++---------- >> 1 file changed, 15 insertions(+), 10 deletions(-) >> >> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c >> index 00daa50ed8..3b9de1353a 100644 >> --- a/hw/vfio/migration.c >> +++ b/hw/vfio/migration.c >> @@ -575,11 +575,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) >> return ret; >> } >> - ret = vfio_save_device_config_state(f, opaque); >> - if (ret) { >> - return ret; >> - } >> - >> ret = vfio_update_pending(vbasedev); >> if (ret) { >> return ret; >> @@ -620,6 +615,19 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) >> return ret; >> } >> +static void vfio_save_state(QEMUFile *f, void *opaque) >> +{ >> + VFIODevice *vbasedev = opaque; >> + int ret; >> + >> + /* The device specific data is migrated in the iterable process. */ >> + ret = vfio_save_device_config_state(f, opaque); >> + if (ret) { >> + error_report("%s: Failed to save device config space", >> + vbasedev->name); >> + } >> +} >> + > > Since error is not propagated, set error in migration stream for migration to fail, use qemu_file_set_error() on error. Makes sense. I will send a v3 soon. Thanks, Shenming > > Thanks, > Kirti > >> static int vfio_load_setup(QEMUFile *f, void *opaque) >> { >> VFIODevice *vbasedev = opaque; >> @@ -670,11 +678,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) >> switch (data) { >> case VFIO_MIG_FLAG_DEV_CONFIG_STATE: >> { >> - ret = vfio_load_device_config_state(f, opaque); >> - if (ret) { >> - return ret; >> - } >> - break; >> + return vfio_load_device_config_state(f, opaque); >> } >> case VFIO_MIG_FLAG_DEV_SETUP_STATE: >> { >> @@ -720,6 +724,7 @@ static SaveVMHandlers savevm_vfio_handlers = { >> .save_live_pending = vfio_save_pending, >> .save_live_iterate = vfio_save_iterate, >> .save_live_complete_precopy = vfio_save_complete_precopy, >> + .save_state = vfio_save_state, >> .load_setup = vfio_load_setup, >> .load_cleanup = vfio_load_cleanup, >> .load_state = vfio_load_state, >> > .
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 00daa50ed8..3b9de1353a 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -575,11 +575,6 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) return ret; } - ret = vfio_save_device_config_state(f, opaque); - if (ret) { - return ret; - } - ret = vfio_update_pending(vbasedev); if (ret) { return ret; @@ -620,6 +615,19 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) return ret; } +static void vfio_save_state(QEMUFile *f, void *opaque) +{ + VFIODevice *vbasedev = opaque; + int ret; + + /* The device specific data is migrated in the iterable process. */ + ret = vfio_save_device_config_state(f, opaque); + if (ret) { + error_report("%s: Failed to save device config space", + vbasedev->name); + } +} + static int vfio_load_setup(QEMUFile *f, void *opaque) { VFIODevice *vbasedev = opaque; @@ -670,11 +678,7 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) switch (data) { case VFIO_MIG_FLAG_DEV_CONFIG_STATE: { - ret = vfio_load_device_config_state(f, opaque); - if (ret) { - return ret; - } - break; + return vfio_load_device_config_state(f, opaque); } case VFIO_MIG_FLAG_DEV_SETUP_STATE: { @@ -720,6 +724,7 @@ static SaveVMHandlers savevm_vfio_handlers = { .save_live_pending = vfio_save_pending, .save_live_iterate = vfio_save_iterate, .save_live_complete_precopy = vfio_save_complete_precopy, + .save_state = vfio_save_state, .load_setup = vfio_load_setup, .load_cleanup = vfio_load_cleanup, .load_state = vfio_load_state,
On ARM64 the VFIO SET_IRQS ioctl is dependent on the VM interrupt setup, if the restoring of the VFIO PCI device config space is before the VGIC, an error might occur in the kernel. So we move the saving of the config space to the non-iterable process, so that it will be called after the VGIC according to their priorities. As for the possible dependence of the device specific migration data on it's config space, we can let the vendor driver to include any config info it needs in its own data stream. (Should we note this in the header file linux-headers/linux/vfio.h?) Signed-off-by: Shenming Lu <lushenming@huawei.com> --- hw/vfio/migration.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-)