diff mbox series

[RFC,1/5] virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate

Message ID 20181012032431.32693-2-david@gibson.dropbear.id.au
State New
Headers show
Series Improve balloon handling of pagesizes other than 4kiB | expand

Commit Message

David Gibson Oct. 12, 2018, 3:24 a.m. UTC
When the balloon is inflated, we discard memory place in it using madvise()
with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
sounds like it makes sense but is actually unnecessary.

The misleadingly named MADV_DONTNEED just discards the memory in question,
it doesn't set any persistent state on it in-kernel; all that's necessary
to bring the memory back is to touch it.  MADV_WILLNEED in contrast
specifically says that the memory will be used soon and faults it in.

Memory that's being given back to the guest by deflating the balloon
*might* be used soon, but it equally could just sit around in the guest's
pools until it actually needs it.  And, over the general timescale that
memory ballooning operates, it seems unlikely that one extra fault for the
guest will be a vast performance issue.

So, simplify the balloon's operation by dropping the madvise() on deflate.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/virtio/virtio-balloon.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

David Hildenbrand Oct. 12, 2018, 7:40 a.m. UTC | #1
On 12/10/2018 05:24, David Gibson wrote:
> When the balloon is inflated, we discard memory place in it using madvise()
> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> sounds like it makes sense but is actually unnecessary.
> 
> The misleadingly named MADV_DONTNEED just discards the memory in question,
> it doesn't set any persistent state on it in-kernel; all that's necessary
> to bring the memory back is to touch it.  MADV_WILLNEED in contrast
> specifically says that the memory will be used soon and faults it in.
> 
> Memory that's being given back to the guest by deflating the balloon
> *might* be used soon, but it equally could just sit around in the guest's
> pools until it actually needs it.  And, over the general timescale that
> memory ballooning operates, it seems unlikely that one extra fault for the
> guest will be a vast performance issue.
> 
> So, simplify the balloon's operation by dropping the madvise() on deflate.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/virtio/virtio-balloon.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 1728e4f83a..6ec4bcf4e1 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -35,9 +35,8 @@
>  
>  static void balloon_page(void *addr, int deflate)
>  {
> -    if (!qemu_balloon_is_inhibited()) {
> -        qemu_madvise(addr, BALLOON_PAGE_SIZE,
> -                deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> +    if (!qemu_balloon_is_inhibited() && !deflate) {
> +        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
>      }
>  }
>  
> 

Care to rename the function and drop the deflate parameter?
(call it only when actually inflating)

apart from that

Reviewed-by: David Hildenbrand <david@redhat.com>
Richard Henderson Oct. 12, 2018, 5:41 p.m. UTC | #2
On 10/11/18 8:24 PM, David Gibson wrote:
> When the balloon is inflated, we discard memory place in it using madvise()
> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> sounds like it makes sense but is actually unnecessary.
> 
> The misleadingly named MADV_DONTNEED just discards the memory in question,
> it doesn't set any persistent state on it in-kernel; all that's necessary
> to bring the memory back is to touch it.


Isn't the point of deflate to free up host memory, for use by other guests or
whatever?

If you do nothing upon deflate, then that doesn't actually happen.  It seems to
me that this is backward and you should use DONTNEED on deflate and then
inflate should do nothing.


r~
Eric Blake Oct. 12, 2018, 5:59 p.m. UTC | #3
On 10/12/18 12:41 PM, Richard Henderson wrote:
> On 10/11/18 8:24 PM, David Gibson wrote:
>> When the balloon is inflated, we discard memory place in it using madvise()
>> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
>> sounds like it makes sense but is actually unnecessary.
>>
>> The misleadingly named MADV_DONTNEED just discards the memory in question,
>> it doesn't set any persistent state on it in-kernel; all that's necessary
>> to bring the memory back is to touch it.
> 
> 
> Isn't the point of deflate to free up host memory, for use by other guests or
> whatever?

If I remember correctly, deflate says to make the balloon smaller (that 
is, the balloon gets out of the way so that the guest can use more 
memory); inflate says to make the balloon bigger (that is, the guest has 
less memory available to use because the inflated balloon is occupying 
that memory instead - where memory occupied by the balloon is really 
memory handed back to the host).

> 
> If you do nothing upon deflate, then that doesn't actually happen.  It seems to
> me that this is backward and you should use DONTNEED on deflate and then
> inflate should do nothing.

Or rather, it sounds like your definition of deflate is opposite the one 
I recall (you are trying to argue that deflate means the guest uses less 
memory and is therefore deflating in size; while I thought deflate meant 
that the balloon shrinks in size making more memory available to the guest).
Michael S. Tsirkin Oct. 12, 2018, 6:05 p.m. UTC | #4
On Fri, Oct 12, 2018 at 02:24:27PM +1100, David Gibson wrote:
> When the balloon is inflated, we discard memory place in it using madvise()
> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> sounds like it makes sense but is actually unnecessary.
> 
> The misleadingly named MADV_DONTNEED just discards the memory in question,
> it doesn't set any persistent state on it in-kernel; all that's necessary
> to bring the memory back is to touch it.  MADV_WILLNEED in contrast
> specifically says that the memory will be used soon and faults it in.
> 
> Memory that's being given back to the guest by deflating the balloon
> *might* be used soon, but it equally could just sit around in the guest's
> pools until it actually needs it.  And, over the general timescale that
> memory ballooning operates, it seems unlikely that one extra fault for the
> guest will be a vast performance issue.

Thinking about it, it might be for RT guests.

So I suspect if you want to drop MADV_WILLNEED you need a flag
telling qemu that's not the usecase.




> So, simplify the balloon's operation by dropping the madvise() on deflate.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/virtio/virtio-balloon.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 1728e4f83a..6ec4bcf4e1 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -35,9 +35,8 @@
>  
>  static void balloon_page(void *addr, int deflate)
>  {
> -    if (!qemu_balloon_is_inhibited()) {
> -        qemu_madvise(addr, BALLOON_PAGE_SIZE,
> -                deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> +    if (!qemu_balloon_is_inhibited() && !deflate) {
> +        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
>      }
>  }
>  
> -- 
> 2.17.1
David Gibson Oct. 13, 2018, 6:23 a.m. UTC | #5
On Fri, Oct 12, 2018 at 10:41:33AM -0700, Richard Henderson wrote:
> On 10/11/18 8:24 PM, David Gibson wrote:
> > When the balloon is inflated, we discard memory place in it using madvise()
> > with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> > sounds like it makes sense but is actually unnecessary.
> > 
> > The misleadingly named MADV_DONTNEED just discards the memory in question,
> > it doesn't set any persistent state on it in-kernel; all that's necessary
> > to bring the memory back is to touch it.
> 
> Isn't the point of deflate to free up host memory, for use by other guests or
> whatever?

As Eric notes, I think you have inflate and deflate the wrong way around.

> If you do nothing upon deflate, then that doesn't actually happen.  It seems to
> me that this is backward and you should use DONTNEED on deflate and then
> inflate should do nothing.
> 
> 
> r~
>
David Gibson Oct. 13, 2018, 6:26 a.m. UTC | #6
On Fri, Oct 12, 2018 at 09:40:24AM +0200, David Hildenbrand wrote:
> On 12/10/2018 05:24, David Gibson wrote:
> > When the balloon is inflated, we discard memory place in it using madvise()
> > with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> > sounds like it makes sense but is actually unnecessary.
> > 
> > The misleadingly named MADV_DONTNEED just discards the memory in question,
> > it doesn't set any persistent state on it in-kernel; all that's necessary
> > to bring the memory back is to touch it.  MADV_WILLNEED in contrast
> > specifically says that the memory will be used soon and faults it in.
> > 
> > Memory that's being given back to the guest by deflating the balloon
> > *might* be used soon, but it equally could just sit around in the guest's
> > pools until it actually needs it.  And, over the general timescale that
> > memory ballooning operates, it seems unlikely that one extra fault for the
> > guest will be a vast performance issue.
> > 
> > So, simplify the balloon's operation by dropping the madvise() on deflate.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/virtio/virtio-balloon.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 1728e4f83a..6ec4bcf4e1 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> > @@ -35,9 +35,8 @@
> >  
> >  static void balloon_page(void *addr, int deflate)
> >  {
> > -    if (!qemu_balloon_is_inhibited()) {
> > -        qemu_madvise(addr, BALLOON_PAGE_SIZE,
> > -                deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> > +    if (!qemu_balloon_is_inhibited() && !deflate) {
> > +        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
> >      }
> >  }
> >  
> > 
> 
> Care to rename the function and drop the deflate parameter?
> (call it only when actually inflating)

As you probably found, I do that in a later patch.

> apart from that
> 
> Reviewed-by: David Hildenbrand <david@redhat.com>
>
David Hildenbrand Oct. 15, 2018, 6:54 a.m. UTC | #7
On 12/10/2018 20:05, Michael S. Tsirkin wrote:
> On Fri, Oct 12, 2018 at 02:24:27PM +1100, David Gibson wrote:
>> When the balloon is inflated, we discard memory place in it using madvise()
>> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
>> sounds like it makes sense but is actually unnecessary.
>>
>> The misleadingly named MADV_DONTNEED just discards the memory in question,
>> it doesn't set any persistent state on it in-kernel; all that's necessary
>> to bring the memory back is to touch it.  MADV_WILLNEED in contrast
>> specifically says that the memory will be used soon and faults it in.
>>
>> Memory that's being given back to the guest by deflating the balloon
>> *might* be used soon, but it equally could just sit around in the guest's
>> pools until it actually needs it.  And, over the general timescale that
>> memory ballooning operates, it seems unlikely that one extra fault for the
>> guest will be a vast performance issue.
> 
> Thinking about it, it might be for RT guests.
> 
> So I suspect if you want to drop MADV_WILLNEED you need a flag
> telling qemu that's not the usecase.
> 

As far as I know RT guests

1. mlock all memory
2. use huge pages

"MADV_DONTNEED cannot be applied to locked pages, Huge TLB pages, or
VM_PFNMAP pages."

As DONTNEED never succeeded, WILLNEED will do nothing. (e.g. postcopy
temporarily has to unlock all memory in order to make DONTNEED work)

(And "Expect access in the near future." does not sound like guarantees
to me either way)

So I think this can go.
Michael S. Tsirkin Oct. 15, 2018, 10:43 a.m. UTC | #8
On Mon, Oct 15, 2018 at 08:54:27AM +0200, David Hildenbrand wrote:
> On 12/10/2018 20:05, Michael S. Tsirkin wrote:
> > On Fri, Oct 12, 2018 at 02:24:27PM +1100, David Gibson wrote:
> >> When the balloon is inflated, we discard memory place in it using madvise()
> >> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> >> sounds like it makes sense but is actually unnecessary.
> >>
> >> The misleadingly named MADV_DONTNEED just discards the memory in question,
> >> it doesn't set any persistent state on it in-kernel; all that's necessary
> >> to bring the memory back is to touch it.  MADV_WILLNEED in contrast
> >> specifically says that the memory will be used soon and faults it in.
> >>
> >> Memory that's being given back to the guest by deflating the balloon
> >> *might* be used soon, but it equally could just sit around in the guest's
> >> pools until it actually needs it.  And, over the general timescale that
> >> memory ballooning operates, it seems unlikely that one extra fault for the
> >> guest will be a vast performance issue.
> > 
> > Thinking about it, it might be for RT guests.
> > 
> > So I suspect if you want to drop MADV_WILLNEED you need a flag
> > telling qemu that's not the usecase.
> > 
> 
> As far as I know RT guests
> 
> 1. mlock all memory
> 2. use huge pages
> 
> "MADV_DONTNEED cannot be applied to locked pages, Huge TLB pages, or
> VM_PFNMAP pages."
> 
> As DONTNEED never succeeded, WILLNEED will do nothing. (e.g. postcopy
> temporarily has to unlock all memory in order to make DONTNEED work)


Hmm you are right.
Document this in commit log?


> (And "Expect access in the near future." does not sound like guarantees
> to me either way)

Should we be concerned about impact on performance though?

Thinking aloud it might have an impact.  But there is also a benefit.
Let's document known effects so people know what to look for
if they see issues?

	WILLNEED aggressively faults all memory in, removing it
	will cause just as much memory as needed to be allocated on access,
	slowing down the CPU at that point but conserving host memory.

	With this patch device assignment (e.g. with VFIO) hotplug will take longer
	as it needs to allocate all this memory on hotplug.
	While this happens all VM is paused right now.


> So I think this can go.
> 
> -- 
> 
> Thanks,
> 
> David / dhildenb
David Hildenbrand Oct. 15, 2018, 11:14 a.m. UTC | #9
>> (And "Expect access in the near future." does not sound like guarantees
>> to me either way)
> 
> Should we be concerned about impact on performance though?

Yes, it really depends on the use case (and most of the time, the memory
won't be directly reused). Let's document it in the commit log just as
you suggest.

> 
> Thinking aloud it might have an impact.  But there is also a benefit.
> Let's document known effects so people know what to look for
> if they see issues?
> 
> 	WILLNEED aggressively faults all memory in, removing it
> 	will cause just as much memory as needed to be allocated on access,
> 	slowing down the CPU at that point but conserving host memory.
> 
> 	With this patch device assignment (e.g. with VFIO) hotplug will take longer
> 	as it needs to allocate all this memory on hotplug.
> 	While this happens all VM is paused right now.
David Gibson Dec. 4, 2018, 4:26 a.m. UTC | #10
On Mon, Oct 15, 2018 at 06:43:06AM -0400, Michael S. Tsirkin wrote:
> On Mon, Oct 15, 2018 at 08:54:27AM +0200, David Hildenbrand wrote:
> > On 12/10/2018 20:05, Michael S. Tsirkin wrote:
> > > On Fri, Oct 12, 2018 at 02:24:27PM +1100, David Gibson wrote:
> > >> When the balloon is inflated, we discard memory place in it using madvise()
> > >> with MADV_DONTNEED.  And when we deflate it we use MADV_WILLNEED, which
> > >> sounds like it makes sense but is actually unnecessary.
> > >>
> > >> The misleadingly named MADV_DONTNEED just discards the memory in question,
> > >> it doesn't set any persistent state on it in-kernel; all that's necessary
> > >> to bring the memory back is to touch it.  MADV_WILLNEED in contrast
> > >> specifically says that the memory will be used soon and faults it in.
> > >>
> > >> Memory that's being given back to the guest by deflating the balloon
> > >> *might* be used soon, but it equally could just sit around in the guest's
> > >> pools until it actually needs it.  And, over the general timescale that
> > >> memory ballooning operates, it seems unlikely that one extra fault for the
> > >> guest will be a vast performance issue.
> > > 
> > > Thinking about it, it might be for RT guests.
> > > 
> > > So I suspect if you want to drop MADV_WILLNEED you need a flag
> > > telling qemu that's not the usecase.
> > > 
> > 
> > As far as I know RT guests
> > 
> > 1. mlock all memory
> > 2. use huge pages
> > 
> > "MADV_DONTNEED cannot be applied to locked pages, Huge TLB pages, or
> > VM_PFNMAP pages."
> > 
> > As DONTNEED never succeeded, WILLNEED will do nothing. (e.g. postcopy
> > temporarily has to unlock all memory in order to make DONTNEED work)
> 
> 
> Hmm you are right.
> Document this in commit log?
> 
> 
> > (And "Expect access in the near future." does not sound like guarantees
> > to me either way)
> 
> Should we be concerned about impact on performance though?
> 
> Thinking aloud it might have an impact.  But there is also a benefit.
> Let's document known effects so people know what to look for
> if they see issues?

Ok, my new text in the commit message is:

This patch simplify's the balloon operation by dropping the madvise()                                   
on deflate.  This might have an impact on performance - it will move a                                  
delay at deflate time until that memory is actually touched, which                                      
might be more latency sensitive.  However:                                                              
                                                                                                        
  * Memory that's being given back to the guest by deflating the                                        
    balloon *might* be used soon, but it equally could just sit around                                  
    in the guest's pools until needed (or even be faulted out again if                                  
    the host is under memory pressure).                                                                 
                                                                                                        
  * Usually, the timescale over which you'll be adjusting the balloon                                   
    is long enough that a few extra faults after deflation aren't                                       
    going to make a difference.                           

Does that seem like it covers what it should?

> 	WILLNEED aggressively faults all memory in, removing it
> 	will cause just as much memory as needed to be allocated on access,
> 	slowing down the CPU at that point but conserving host memory.
> 
> 	With this patch device assignment (e.g. with VFIO) hotplug will take longer
> 	as it needs to allocate all this memory on hotplug.
> 	While this happens all VM is paused right now.

I'm not sure what you mean by this.  This patch is very specific to
the virtio-balloon deflate path.  "True" memory hotplug (e.g. via ACPI
or PAPR dynamic reconfiguration) is unaffected.  And if VFIO is in
play, all guest memory will generally be locked, just like with
realtime.
diff mbox series

Patch

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 1728e4f83a..6ec4bcf4e1 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -35,9 +35,8 @@ 
 
 static void balloon_page(void *addr, int deflate)
 {
-    if (!qemu_balloon_is_inhibited()) {
-        qemu_madvise(addr, BALLOON_PAGE_SIZE,
-                deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
+    if (!qemu_balloon_is_inhibited() && !deflate) {
+        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
     }
 }