[v3,13/13] docs/devel: Document VFIO device dirty page tracking

Message ID 20230304014343.33646-14-joao.m.martins@oracle.com
State New
Series vfio/migration: Device dirty page tracking

Commit Message

Joao Martins March 4, 2023, 1:43 a.m. UTC
From: Avihai Horon <avihaih@nvidia.com>

Adjust the VFIO dirty page tracking documentation and add a section to
describe device dirty page tracking.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 15 deletions(-)

Comments

Cédric Le Goater March 6, 2023, 5:15 p.m. UTC | #1
On 3/4/23 02:43, Joao Martins wrote:
> From: Avihai Horon <avihaih@nvidia.com>
> 
> Adjust the VFIO dirty page tracking documentation and add a section to
> describe device dirty page tracking.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> ---
>   docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------
>   1 file changed, 31 insertions(+), 15 deletions(-)
> 
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index c214c73e2818..1b68ccf11529 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -59,22 +59,37 @@ System memory dirty pages tracking
>   ----------------------------------
>   
>   A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
> -memory listener callback marks those system memory pages as dirty which are
> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
> -container. All pages pinned by the vendor driver through external APIs have to
> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
> -tracking can identify dirtied pages, but any page pinned by the vendor driver
> -can also be written by the device. There is currently no device or IOMMU
> -support for dirty page tracking in hardware.
> +the VFIO dirty tracking module to start and stop dirty page tracking. A
> +``log_sync`` memory listener callback queries the dirty page bitmap from the
> +dirty tracking module and marks system memory pages which were DMA-ed by the
> +VFIO device as dirty. The dirty page bitmap is queried per container.
> +
> +Currently there are two ways dirty page tracking can be done:
> +(1) Device dirty tracking:
> +In this method the device is responsible to log and report its DMAs. This
> +method can be used only if the device is capable of tracking its DMAs.
> +Discovering device capability, starting and stopping dirty tracking, and
> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
> +More info about the uAPI can be found in the comments of the
> +``vfio_device_feature_dma_logging_control`` and
> +``vfio_device_feature_dma_logging_report`` structures in the header file
> +linux-headers/linux/vfio.h.
> +
> +(2) VFIO IOMMU module:
> +In this method dirty tracking is done by IOMMU. However, there is currently no
> +IOMMU support for dirty page tracking. For this reason, all pages are
> +perpetually marked dirty, unless the device driver pins pages through external
> +APIs in which case only those pinned pages are perpetually marked dirty.
> +
> +If the above two methods are not supported, all pages are perpetually marked
> +dirty by QEMU.
>   
>   By default, dirty pages are tracked during pre-copy as well as stop-and-copy
> -phase. So, a page pinned by the vendor driver will be copied to the destination
> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
> -finding dirty pages continuously, then it understands that even in stop-and-copy
> -phase, it is likely to find dirty pages and can predict the downtime
> -accordingly.
> +phase. So, a page marked as dirty will be copied to the destination in both
> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
> +dirty pages continuously, then it understands that even in stop-and-copy phase,
> +it is likely to find dirty pages and can predict the downtime accordingly.
>   
>   QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
>   which disables querying the dirty bitmap during pre-copy phase. If it is set to
> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns any dirty pages in
>   that range and QEMU reports corresponding guest physical pages dirty. During
>   stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
>   pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
> -mapped ranges.
> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
> +will be blocked.

There is a limitation with multiple devices also.

Thanks,

C.

>   
>   Flow of state changes during Live migration
>   ===========================================
Joao Martins March 6, 2023, 5:18 p.m. UTC | #2
On 06/03/2023 17:15, Cédric Le Goater wrote:
> On 3/4/23 02:43, Joao Martins wrote:
>> From: Avihai Horon <avihaih@nvidia.com>
>>
>> Adjust the VFIO dirty page tracking documentation and add a section to
>> describe device dirty page tracking.
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> ---
>>   docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------
>>   1 file changed, 31 insertions(+), 15 deletions(-)
>>
>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
>> index c214c73e2818..1b68ccf11529 100644
>> --- a/docs/devel/vfio-migration.rst
>> +++ b/docs/devel/vfio-migration.rst
>> @@ -59,22 +59,37 @@ System memory dirty pages tracking
>>   ----------------------------------
>>     A ``log_global_start`` and ``log_global_stop`` memory listener callback
>> informs
>> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
>> -memory listener callback marks those system memory pages as dirty which are
>> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
>> -container. All pages pinned by the vendor driver through external APIs have to
>> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
>> -tracking can identify dirtied pages, but any page pinned by the vendor driver
>> -can also be written by the device. There is currently no device or IOMMU
>> -support for dirty page tracking in hardware.
>> +the VFIO dirty tracking module to start and stop dirty page tracking. A
>> +``log_sync`` memory listener callback queries the dirty page bitmap from the
>> +dirty tracking module and marks system memory pages which were DMA-ed by the
>> +VFIO device as dirty. The dirty page bitmap is queried per container.
>> +
>> +Currently there are two ways dirty page tracking can be done:
>> +(1) Device dirty tracking:
>> +In this method the device is responsible to log and report its DMAs. This
>> +method can be used only if the device is capable of tracking its DMAs.
>> +Discovering device capability, starting and stopping dirty tracking, and
>> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
>> +More info about the uAPI can be found in the comments of the
>> +``vfio_device_feature_dma_logging_control`` and
>> +``vfio_device_feature_dma_logging_report`` structures in the header file
>> +linux-headers/linux/vfio.h.
>> +
>> +(2) VFIO IOMMU module:
>> +In this method dirty tracking is done by IOMMU. However, there is currently no
>> +IOMMU support for dirty page tracking. For this reason, all pages are
>> +perpetually marked dirty, unless the device driver pins pages through external
>> +APIs in which case only those pinned pages are perpetually marked dirty.
>> +
>> +If the above two methods are not supported, all pages are perpetually marked
>> +dirty by QEMU.
>>     By default, dirty pages are tracked during pre-copy as well as stop-and-copy
>> -phase. So, a page pinned by the vendor driver will be copied to the destination
>> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
>> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
>> -finding dirty pages continuously, then it understands that even in stop-and-copy
>> -phase, it is likely to find dirty pages and can predict the downtime
>> -accordingly.
>> +phase. So, a page marked as dirty will be copied to the destination in both
>> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
>> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
>> +dirty pages continuously, then it understands that even in stop-and-copy phase,
>> +it is likely to find dirty pages and can predict the downtime accordingly.
>>     QEMU also provides a per device opt-out option
>> ``pre-copy-dirty-page-tracking``
>>   which disables querying the dirty bitmap during pre-copy phase. If it is set to
>> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns
>> any dirty pages in
>>   that range and QEMU reports corresponding guest physical pages dirty. During
>>   stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
>>   pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
>> -mapped ranges.
>> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
>> +will be blocked.
> 
> There is a limitation with multiple devices also.
> 
I'm aware. I just didn't write it because the section I am changing is specific
to vIOMMU.

> Thanks,
> 
> C.
> 
>>     Flow of state changes during Live migration
>>   ===========================================
>
Joao Martins March 6, 2023, 5:21 p.m. UTC | #3
On 06/03/2023 17:18, Joao Martins wrote:
> On 06/03/2023 17:15, Cédric Le Goater wrote:
>> On 3/4/23 02:43, Joao Martins wrote:
>>> From: Avihai Horon <avihaih@nvidia.com>
>>>
>>> Adjust the VFIO dirty page tracking documentation and add a section to
>>> describe device dirty page tracking.
>>>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> ---
>>>   docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------
>>>   1 file changed, 31 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
>>> index c214c73e2818..1b68ccf11529 100644
>>> --- a/docs/devel/vfio-migration.rst
>>> +++ b/docs/devel/vfio-migration.rst
>>> @@ -59,22 +59,37 @@ System memory dirty pages tracking
>>>   ----------------------------------
>>>     A ``log_global_start`` and ``log_global_stop`` memory listener callback
>>> informs
>>> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
>>> -memory listener callback marks those system memory pages as dirty which are
>>> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
>>> -container. All pages pinned by the vendor driver through external APIs have to
>>> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
>>> -tracking can identify dirtied pages, but any page pinned by the vendor driver
>>> -can also be written by the device. There is currently no device or IOMMU
>>> -support for dirty page tracking in hardware.
>>> +the VFIO dirty tracking module to start and stop dirty page tracking. A
>>> +``log_sync`` memory listener callback queries the dirty page bitmap from the
>>> +dirty tracking module and marks system memory pages which were DMA-ed by the
>>> +VFIO device as dirty. The dirty page bitmap is queried per container.
>>> +
>>> +Currently there are two ways dirty page tracking can be done:
>>> +(1) Device dirty tracking:
>>> +In this method the device is responsible to log and report its DMAs. This
>>> +method can be used only if the device is capable of tracking its DMAs.
>>> +Discovering device capability, starting and stopping dirty tracking, and
>>> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
>>> +More info about the uAPI can be found in the comments of the
>>> +``vfio_device_feature_dma_logging_control`` and
>>> +``vfio_device_feature_dma_logging_report`` structures in the header file
>>> +linux-headers/linux/vfio.h.
>>> +
>>> +(2) VFIO IOMMU module:
>>> +In this method dirty tracking is done by IOMMU. However, there is currently no
>>> +IOMMU support for dirty page tracking. For this reason, all pages are
>>> +perpetually marked dirty, unless the device driver pins pages through external
>>> +APIs in which case only those pinned pages are perpetually marked dirty.
>>> +
>>> +If the above two methods are not supported, all pages are perpetually marked
>>> +dirty by QEMU.
>>>     By default, dirty pages are tracked during pre-copy as well as stop-and-copy
>>> -phase. So, a page pinned by the vendor driver will be copied to the destination
>>> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
>>> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
>>> -finding dirty pages continuously, then it understands that even in stop-and-copy
>>> -phase, it is likely to find dirty pages and can predict the downtime
>>> -accordingly.
>>> +phase. So, a page marked as dirty will be copied to the destination in both
>>> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
>>> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
>>> +dirty pages continuously, then it understands that even in stop-and-copy phase,
>>> +it is likely to find dirty pages and can predict the downtime accordingly.
>>>     QEMU also provides a per device opt-out option
>>> ``pre-copy-dirty-page-tracking``
>>>   which disables querying the dirty bitmap during pre-copy phase. If it is set to
>>> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns
>>> any dirty pages in
>>>   that range and QEMU reports corresponding guest physical pages dirty. During
>>>   stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
>>>   pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
>>> -mapped ranges.
>>> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
>>> +will be blocked.
>>
>> There is a limitation with multiple devices also.
>>
> I'm aware. I just didn't write it because the section I am changing is specific
> to vIOMMU.
> 
... and this patch is covering device dirty tracking
Cédric Le Goater March 6, 2023, 5:21 p.m. UTC | #4
On 3/6/23 18:18, Joao Martins wrote:
> 
> 
> On 06/03/2023 17:15, Cédric Le Goater wrote:
>> On 3/4/23 02:43, Joao Martins wrote:
>>> From: Avihai Horon <avihaih@nvidia.com>
>>>
>>> Adjust the VFIO dirty page tracking documentation and add a section to
>>> describe device dirty page tracking.
>>>
>>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> ---
>>>    docs/devel/vfio-migration.rst | 46 +++++++++++++++++++++++------------
>>>    1 file changed, 31 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
>>> index c214c73e2818..1b68ccf11529 100644
>>> --- a/docs/devel/vfio-migration.rst
>>> +++ b/docs/devel/vfio-migration.rst
>>> @@ -59,22 +59,37 @@ System memory dirty pages tracking
>>>    ----------------------------------
>>>      A ``log_global_start`` and ``log_global_stop`` memory listener callback
>>> informs
>>> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
>>> -memory listener callback marks those system memory pages as dirty which are
>>> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
>>> -container. All pages pinned by the vendor driver through external APIs have to
>>> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
>>> -tracking can identify dirtied pages, but any page pinned by the vendor driver
>>> -can also be written by the device. There is currently no device or IOMMU
>>> -support for dirty page tracking in hardware.
>>> +the VFIO dirty tracking module to start and stop dirty page tracking. A
>>> +``log_sync`` memory listener callback queries the dirty page bitmap from the
>>> +dirty tracking module and marks system memory pages which were DMA-ed by the
>>> +VFIO device as dirty. The dirty page bitmap is queried per container.
>>> +
>>> +Currently there are two ways dirty page tracking can be done:
>>> +(1) Device dirty tracking:
>>> +In this method the device is responsible to log and report its DMAs. This
>>> +method can be used only if the device is capable of tracking its DMAs.
>>> +Discovering device capability, starting and stopping dirty tracking, and
>>> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
>>> +More info about the uAPI can be found in the comments of the
>>> +``vfio_device_feature_dma_logging_control`` and
>>> +``vfio_device_feature_dma_logging_report`` structures in the header file
>>> +linux-headers/linux/vfio.h.
>>> +
>>> +(2) VFIO IOMMU module:
>>> +In this method dirty tracking is done by IOMMU. However, there is currently no
>>> +IOMMU support for dirty page tracking. For this reason, all pages are
>>> +perpetually marked dirty, unless the device driver pins pages through external
>>> +APIs in which case only those pinned pages are perpetually marked dirty.
>>> +
>>> +If the above two methods are not supported, all pages are perpetually marked
>>> +dirty by QEMU.
>>>      By default, dirty pages are tracked during pre-copy as well as stop-and-copy
>>> -phase. So, a page pinned by the vendor driver will be copied to the destination
>>> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
>>> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
>>> -finding dirty pages continuously, then it understands that even in stop-and-copy
>>> -phase, it is likely to find dirty pages and can predict the downtime
>>> -accordingly.
>>> +phase. So, a page marked as dirty will be copied to the destination in both
>>> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
>>> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
>>> +dirty pages continuously, then it understands that even in stop-and-copy phase,
>>> +it is likely to find dirty pages and can predict the downtime accordingly.
>>>      QEMU also provides a per device opt-out option
>>> ``pre-copy-dirty-page-tracking``
>>>    which disables querying the dirty bitmap during pre-copy phase. If it is set to
>>> @@ -89,7 +104,8 @@ phase of migration. In that case, the unmap ioctl returns
>>> any dirty pages in
>>>    that range and QEMU reports corresponding guest physical pages dirty. During
>>>    stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
>>>    pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
>>> -mapped ranges.
>>> +mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
>>> +will be blocked.
>>
>> There is a limitation with multiple devices also.
>>
> I'm aware. I just didn't write it because the section I am changing is specific
> to vIOMMU.


Ah OK. I didn't check, sorry.

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.
Patch

diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
index c214c73e2818..1b68ccf11529 100644
--- a/docs/devel/vfio-migration.rst
+++ b/docs/devel/vfio-migration.rst
@@ -59,22 +59,37 @@  System memory dirty pages tracking
 ----------------------------------
 
 A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
-the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
-memory listener callback marks those system memory pages as dirty which are
-used for DMA by the VFIO device. The dirty pages bitmap is queried per
-container. All pages pinned by the vendor driver through external APIs have to
-be marked as dirty during migration. When there are CPU writes, CPU dirty page
-tracking can identify dirtied pages, but any page pinned by the vendor driver
-can also be written by the device. There is currently no device or IOMMU
-support for dirty page tracking in hardware.
+the VFIO dirty tracking module to start and stop dirty page tracking. A
+``log_sync`` memory listener callback queries the dirty page bitmap from the
+dirty tracking module and marks system memory pages which were DMA-ed by the
+VFIO device as dirty. The dirty page bitmap is queried per container.
+
+Currently there are two ways dirty page tracking can be done:
+(1) Device dirty tracking:
+In this method the device is responsible for logging and reporting its DMAs.
+This method can be used only if the device is capable of tracking its DMAs.
+Discovering device capability, starting and stopping dirty tracking, and
+syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
+More info about the uAPI can be found in the comments of the
+``vfio_device_feature_dma_logging_control`` and
+``vfio_device_feature_dma_logging_report`` structures in the header file
+linux-headers/linux/vfio.h.
+
+(2) VFIO IOMMU module:
+In this method dirty tracking is done by the IOMMU. However, there is currently
+no IOMMU support for dirty page tracking. For this reason, all pages are
+perpetually marked dirty, unless the device driver pins pages through external
+APIs, in which case only those pinned pages are perpetually marked dirty.
+
+If the above two methods are not supported, all pages are perpetually marked
+dirty by QEMU.
 
 By default, dirty pages are tracked during pre-copy as well as stop-and-copy
-phase. So, a page pinned by the vendor driver will be copied to the destination
-in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
-it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
-finding dirty pages continuously, then it understands that even in stop-and-copy
-phase, it is likely to find dirty pages and can predict the downtime
-accordingly.
+phase. So, a page marked as dirty will be copied to the destination in both
+phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
+achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
+dirty pages continuously, then it understands that even in stop-and-copy phase,
+it is likely to find dirty pages and can predict the downtime accordingly.
 
 QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
 which disables querying the dirty bitmap during pre-copy phase. If it is set to
@@ -89,7 +104,8 @@  phase of migration. In that case, the unmap ioctl returns any dirty pages in
 that range and QEMU reports corresponding guest physical pages dirty. During
 stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
 pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
-mapped ranges.
+mapped ranges. If device dirty tracking is enabled with vIOMMU, live migration
+will be blocked.
 
 Flow of state changes during Live migration
 ===========================================
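
Illustrative note (not part of the patch): the tracking methods and fallback described in the documentation above can be modelled roughly as follows. This is a minimal Python sketch, not QEMU code. All names (``DirtyTracking``, ``select_dirty_tracking``, ``dirty_pages``, ``migration_blocked``) are hypothetical, and the preference for device tracking over the IOMMU module is an assumption; the patch text only states that both methods exist and that all pages are marked dirty when neither is supported.

```python
from enum import Enum


class DirtyTracking(Enum):
    """Hypothetical labels for the methods named in the documentation."""
    DEVICE = "device"        # the device logs and reports its own DMAs
    IOMMU = "iommu"          # the VFIO IOMMU module provides the bitmap
    ALL_DIRTY = "all-dirty"  # neither is supported: QEMU marks every page dirty


def select_dirty_tracking(device_can_log_dma, iommu_can_track):
    # Assumption: device tracking is preferred when both are available.
    if device_can_log_dma:
        return DirtyTracking.DEVICE
    if iommu_can_track:
        return DirtyTracking.IOMMU
    return DirtyTracking.ALL_DIRTY


def dirty_pages(method, all_pages, device_reported=frozenset(), pinned=frozenset()):
    """Return the set of pages treated as dirty on one sync."""
    if method is DirtyTracking.DEVICE:
        # Only the pages the device reported as written by DMA.
        return set(device_reported)
    if method is DirtyTracking.IOMMU:
        # No IOMMU hardware support: pinned pages if the driver pins any,
        # otherwise every page is perpetually dirty.
        return set(pinned) if pinned else set(all_pages)
    return set(all_pages)


def migration_blocked(method, viommu_enabled):
    # Per the patch text: device dirty tracking with vIOMMU blocks migration.
    return method is DirtyTracking.DEVICE and viommu_enabled
```

For example, with pages ``{0, 1, 2}`` and a driver that pins only page 1, the IOMMU method treats just page 1 as dirty on every sync, while the all-dirty fallback copies all three pages each time.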