Message ID: 20210121110540.33704-1-david@redhat.com
Series: virtio-mem: vfio support
On Thu, Jan 21, 2021 at 12:05:29PM +0100, David Hildenbrand wrote:
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU,
> mapped into system memory address space. Before the guest is allowed to
> use memory blocks, it must coordinate with the hypervisor (plug blocks).
> After a reboot, all memory is usually unplugged - when the guest comes
> up, it detects the virtio-mem device and selects memory blocks to plug
> (based on resize requests from the hypervisor).
>
> Memory hot(un)plug consists of (un)plugging memory blocks via a
> virtio-mem device (triggered by the guest). When unplugging blocks, we
> discard the memory - similar to memory balloon inflation. In contrast
> to memory ballooning, we always know which memory blocks a guest may
> actually use - especially during a reboot, after a crash, or after
> kexec (and during hibernation as well). Guests agree not to access
> unplugged memory again, especially not via DMA.
>
> The issue with vfio is that it cannot deal with random discards - for
> this reason, virtio-mem and vfio are currently mutually exclusive. In
> particular, vfio would currently map the whole memory region (with
> possibly only few or no plugged blocks), resulting in all pages getting
> pinned and therefore in a higher memory consumption than expected
> (rendering virtio-mem basically useless in these environments).
>
> To make vfio work nicely with virtio-mem, we have to map only the
> plugged blocks, and map/unmap properly when plugging/unplugging blocks
> (including discarding of RAM when unplugging). We achieve that by using
> a new notifier mechanism that communicates changes.

series

Acked-by: Michael S. Tsirkin <mst@redhat.com>

virtio bits

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

This needs to go through the vfio tree, I assume.

> It's important to map memory at the granularity in which we could see
> unmaps again (-> the virtio-mem block size) - so when, e.g., plugging
> 100 MB of consecutive memory with a block size of 2 MB, we need 50
> mappings. When unmapping, we can use a single vfio_unmap call for the
> applicable range. We expect that, when used with vfio, the block size
> of virtio-mem devices will be configured fairly large by the user
> (e.g., 128 MB, 1 GB, ...) - to not run out of mappings and to improve
> hot(un)plug performance - but it will depend on the setup.
>
> More info regarding virtio-mem can be found at:
> https://virtio-mem.gitlab.io/
>
> v5 is located at:
> git@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v5
>
> v4 -> v5:
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Added more assertions for granularity vs. IOMMU-supported page size
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Fix accounting of mappings
> - "vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus"
> -- Fence off SPAPR and add some comments regarding future support
> -- Tweak patch description
> - Rebase and retest
>
> v3 -> v4:
> - "vfio: Query and store the maximum number of DMA mappings"
> -- Limit the patch to querying and storing only
> -- Renamed to "vfio: Query and store the maximum number of possible DMA
>    mappings"
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Remove sanity checks / warning the user
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Perform sanity checks by looking at the number of memslots and all
>    registered RamDiscardMgr sections
> - Rebase and retest
> - Reshuffled the patches slightly
>
> v2 -> v3:
> - Rebased + retested
> - Fixed some typos
> - Added RBs
>
> v1 -> v2:
> - "memory: Introduce RamDiscardMgr for RAM memory regions"
> -- Fix some errors in the documentation
> -- Make register_listener() notify about populated parts and
>    unregister_listener() notify about discarding populated parts, to
>    simplify future locking inside virtio-mem when handling requests via
>    a separate thread
> - "vfio: Query and store the maximum number of DMA mappings"
> -- Query number of mappings and track mappings (except for vIOMMU)
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Adapt to RamDiscardMgr changes and warn via generic DMA reservation
> - "vfio: Support for RamDiscardMgr in the vIOMMU case"
> -- Use vmstate priority to handle migration dependencies
>
> RFC -> v1:
> - VFIO migration code. Due to missing kernel support, I cannot really
>   test whether that part works.
> - Understand/test/document vIOMMU implications, also regarding migration
> - Nicer ram_block_discard_disable/require handling.
> - s/SparseRAMHandler/RamDiscardMgr/, refactorings, cleanups,
>   documentation, testing, ...
>
> David Hildenbrand (11):
>   memory: Introduce RamDiscardMgr for RAM memory regions
>   virtio-mem: Factor out traversing unplugged ranges
>   virtio-mem: Implement RamDiscardMgr interface
>   vfio: Support for RamDiscardMgr in the !vIOMMU case
>   vfio: Query and store the maximum number of possible DMA mappings
>   vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr
>   vfio: Support for RamDiscardMgr in the vIOMMU case
>   softmmu/physmem: Don't use atomic operations in
>     ram_block_discard_(disable|require)
>   softmmu/physmem: Extend ram_block_discard_(require|disable) by two
>     discard types
>   virtio-mem: Require only coordinated discards
>   vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
>
>  hw/vfio/common.c               | 348 +++++++++++++++++++++++++++++++--
>  hw/virtio/virtio-mem.c         | 347 ++++++++++++++++++++++++++++----
>  include/exec/memory.h          | 249 ++++++++++++++++++++++-
>  include/hw/vfio/vfio-common.h  |  13 ++
>  include/hw/virtio/virtio-mem.h |   3 +
>  include/migration/vmstate.h    |   1 +
>  softmmu/memory.c               |  22 +++
>  softmmu/physmem.c              | 108 +++++++---
>  8 files changed, 1007 insertions(+), 84 deletions(-)
>
> --
> 2.29.2
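The granularity point in the cover letter is straightforward arithmetic: plugging 100 MB at a 2 MB block size costs 100 MB / 2 MB = 50 separate DMA mappings, one per block, so that each block can later be unmapped on its own. A minimal userspace sketch against the VFIO type1 UAPI illustrates the pattern (the helper names and the container-fd plumbing are illustrative assumptions, not code from this series):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

#define VIRTIO_MEM_BLOCK_SIZE (2ULL * 1024 * 1024) /* example block size */

/*
 * Map each plugged block separately: 100 MB plugged at a 2 MB block
 * size -> 50 VFIO_IOMMU_MAP_DMA calls, one mapping per block, so each
 * block can be unmapped individually later.
 */
static int map_plugged_range(int container_fd, uint64_t iova,
                             uint64_t vaddr, uint64_t size)
{
    uint64_t off;

    for (off = 0; off < size; off += VIRTIO_MEM_BLOCK_SIZE) {
        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = vaddr + off,
            .iova  = iova + off,
            .size  = VIRTIO_MEM_BLOCK_SIZE,
        };

        if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map)) {
            return -1;
        }
    }
    return 0;
}

/*
 * Unmapping needs no per-block calls: a single VFIO_IOMMU_UNMAP_DMA
 * covering the range removes all mappings fully contained in it.
 */
static int unmap_range(int container_fd, uint64_t iova, uint64_t size)
{
    struct vfio_iommu_type1_dma_unmap unmap = {
        .argsz = sizeof(unmap),
        .iova  = iova,
        .size  = size,
    };

    return ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap);
}

With a larger configured block size (e.g., 128 MB), the same plug loop needs proportionally fewer mappings - one reason the cover letter expects fairly large block sizes when vfio is involved.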
On 27.01.21 13:45, Michael S. Tsirkin wrote:
> On Thu, Jan 21, 2021 at 12:05:29PM +0100, David Hildenbrand wrote:
>> A virtio-mem device manages a memory region in guest physical address
>> space, represented as a single (currently large) memory region in
>> QEMU, mapped into system memory address space. [...]
>>
>> To make vfio work nicely with virtio-mem, we have to map only the
>> plugged blocks, and map/unmap properly when plugging/unplugging blocks
>> (including discarding of RAM when unplugging). We achieve that by
>> using a new notifier mechanism that communicates changes.
>
> series
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> virtio bits
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
> This needs to go through the vfio tree, I assume.

Thanks Michael.

@Alex, what are your suggestions?
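The "new notifier mechanism" being acked here pairs a provider (the virtio-mem device) with listeners (such as vfio) on a memory region. A simplified sketch of its shape, with hypothetical names that only loosely follow the RamDiscardMgr interface from patch #1 (the actual QEMU API differs in detail):

#include <stdint.h>

/* Hypothetical, simplified take on the RamDiscardMgr idea. */
typedef struct RamDiscardListener RamDiscardListener;

struct RamDiscardListener {
    /* Called when [offset, offset + size) becomes populated (plugged);
     * vfio reacts by mapping exactly that range. */
    int (*notify_populate)(RamDiscardListener *rdl,
                           uint64_t offset, uint64_t size);
    /* Called when [offset, offset + size) is discarded (unplugged);
     * vfio reacts by unmapping that range. */
    void (*notify_discard)(RamDiscardListener *rdl,
                           uint64_t offset, uint64_t size);
    RamDiscardListener *next;
};

typedef struct RamDiscardProvider {
    RamDiscardListener *listeners;
    /* ... provider state tracking which blocks are currently plugged */
} RamDiscardProvider;

/*
 * Per the v1->v2 changelog above: registering a listener notifies it
 * about all currently populated parts, and unregistering notifies it
 * about discarding them, so a listener always sees a consistent view.
 */
static void ram_discard_register_listener(RamDiscardProvider *rdp,
                                          RamDiscardListener *rdl)
{
    rdl->next = rdp->listeners;
    rdp->listeners = rdl;
    /* ... replay rdl->notify_populate() for each plugged block here */
}

virtio-mem would issue these callbacks at its block granularity, which is exactly why the per-block mapping count discussed earlier matters.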
On 08.02.21 09:28, David Hildenbrand wrote:
> On 27.01.21 13:45, Michael S. Tsirkin wrote:
>> [...]
>>
>> This needs to go through the vfio tree, I assume.
>
> Thanks Michael.
>
> @Alex, what are your suggestions?

Gentle ping.
On Mon, 15 Feb 2021 15:03:43 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 08.02.21 09:28, David Hildenbrand wrote:
>> [...]
>>
>> @Alex, what are your suggestions?
>
> Gentle ping.

Sorry for the delay. It looks to me like patches 1, 8, and 9 are Memory
API patches that are still missing an Ack from Paolo. I'll toss in my
A-b + R-b for patches 6 and 7. I don't see that this necessarily needs
to go in through vfio; I'm more than happy if someone else wants to
grab it. Thanks,

Alex
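The sanity check in patch 6 that Alex acks here relies on knowing how many concurrent DMA mappings the kernel allows (the type1 IOMMU's dma_entry_limit, 65535 by default). Since Linux 5.10, VFIO_IOMMU_GET_INFO reports the remaining count through the VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL capability; a sketch of querying it (error handling trimmed, and the helper name is illustrative):

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Returns the number of DMA mappings still available, or -1 if the
 * kernel does not report the capability (pre-5.10). */
static int query_dma_avail(int container_fd)
{
    struct vfio_iommu_type1_info *info;
    uint32_t argsz = sizeof(*info);
    int avail = -1;

    /* The first call reports the size needed for the capability chain. */
    info = calloc(1, argsz);
    info->argsz = argsz;
    if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, info)) {
        goto out;
    }
    if (info->argsz > argsz) {
        argsz = info->argsz;
        info = realloc(info, argsz);
        info->argsz = argsz;
        if (ioctl(container_fd, VFIO_IOMMU_GET_INFO, info)) {
            goto out;
        }
    }

    /* Walk the capability chain looking for the DMA-available counter. */
    if (info->flags & VFIO_IOMMU_INFO_CAPS) {
        uint32_t off = info->cap_offset;

        while (off) {
            struct vfio_info_cap_header *hdr =
                (struct vfio_info_cap_header *)((char *)info + off);

            if (hdr->id == VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL) {
                avail = ((struct vfio_iommu_type1_info_dma_avail *)hdr)->avail;
                break;
            }
            off = hdr->next;
        }
    }
out:
    free(info);
    return avail;
}

Per the v3->v4 changelog, the series compares such a value against the number of memslots plus all registered RamDiscardMgr sections to warn about configurations that could run out of mappings.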
On 16.02.21 19:33, Alex Williamson wrote:
> [...]
>
> Sorry for the delay. It looks to me like patches 1, 8, and 9 are
> Memory API patches that are still missing an Ack from Paolo. I'll toss
> in my A-b + R-b for patches 6 and 7. I don't see that this necessarily
> needs to go in through vfio; I'm more than happy if someone else wants
> to grab it. Thanks,

Thanks, I assume patch #11 is fine with you as well?

@Paolo, it would be great if I could get your feedback on patch #1. I
have more stuff coming up that will reuse RamDiscardMgr (e.g., for
better migration handling and better guest memory dump handling).
On Tue, 16 Feb 2021 19:49:09 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 16.02.21 19:33, Alex Williamson wrote:
>> [...]
>>
>> Sorry for the delay. It looks to me like patches 1, 8, and 9 are
>> Memory API patches that are still missing an Ack from Paolo. I'll
>> toss in my A-b + R-b for patches 6 and 7. I don't see that this
>> necessarily needs to go in through vfio; I'm more than happy if
>> someone else wants to grab it. Thanks,
>
> Thanks, I assume patch #11 is fine with you as well?

Yep, sent my acks for it as well. Thanks,

Alex

> @Paolo, it would be great if I could get your feedback on patch #1. I
> have more stuff coming up that will reuse RamDiscardMgr (e.g., for
> better migration handling and better guest memory dump handling).