Patchwork [RFC] virtio: Report new guest memory statistics pertinent to memory ballooning

login
register
mail settings
Submitter Adam Litke
Date Nov. 5, 2009, 10:50 p.m.
Message ID <1257461425.3121.22.camel@aglitke>
Download mbox | patch
Permalink /patch/37800/
State New
Headers show

Comments

Adam Litke - Nov. 5, 2009, 10:50 p.m.
[RFC] virtio: Report new guest memory statistics pertinent to memory ballooning
    
    When using ballooning to manage overcommitted memory on a host, a system for
    guests to communicate their memory usage to the host can provide information
    that will minimize the impact of ballooning on the guests.  The current method
    employs a daemon running in each guest that communicates memory statistics to a
    host daemon at a specified time interval.  The host daemon aggregates this
    information and inflates and/or deflates balloons according to the level of
    host memory pressure.  This approach is effective but overly complex since a
    daemon must be installed inside each guest and coordinated to communicate with
    the host.  A simpler approach is to collect memory statistics in the virtio
    balloon driver and communicate them to the host via the device config space.
    
    This patch implements the qemu side of the communication channel.  I will post
    the kernel driver modifications in-reply to this message.
    
    I'd like to pose a few questions concerning my implementation:
    
     * Is there a better way to use feature codes than using one per variable?
    
     * This patch causes the 'info balloon' monitor command to generate output like
       the following:
    
    	(qemu) info balloon
    	balloon: actual=1024 MB
    	balloon: pswapin=0 pages
    	balloon: pswapout=0 pages
    	balloon: panon=3928 KB
    	balloon: pgmajfault=0
    	balloon: pgminfault=247914
    	balloon: memfree=987032 KB
    	balloon: memtot=1020812 KB
    
       Is this agreeable?
    
    Thank you for your comments...
    
    Signed-off-by: Adam Litke <agl@us.ibm.com>
Anthony Liguori - Nov. 5, 2009, 11:36 p.m.
agl@linux.vnet.ibm.com wrote:
>     [RFC] virtio: Report new guest memory statistics pertinent to memory ballooning
>     
>     When using ballooning to manage overcommitted memory on a host, a system for
>     guests to communicate their memory usage to the host can provide information
>     that will minimize the impact of ballooning on the guests.  The current method
>     employs a daemon running in each guest that communicates memory statistics to a
>     host daemon at a specified time interval.  The host daemon aggregates this
>     information and inflates and/or deflates balloons according to the level of
>     host memory pressure.  This approach is effective but overly complex since a
>     daemon must be installed inside each guest and coordinated to communicate with
>     the host.  A simpler approach is to collect memory statistics in the virtio
>     balloon driver and communicate them to the host via the device config space.
>     
>     This patch implements the qemu side of the communication channel.  I will post
>     the kernel driver modifications in-reply to this message.
>     
>     I'd like to pose a few questions concerning my implementation:
>     
>      * Is there a better way to use feature codes than using one per variable?
>   

I think it's the right granularity personally.

>      * This patch causes the 'info balloon' monitor command to generate output like
>        the following:
>     
>     	(qemu) info balloon
>     	balloon: actual=1024 MB
>     	balloon: pswapin=0 pages
>     	balloon: pswapout=0 pages
>     	balloon: panon=3928 KB
>     	balloon: pgmajfault=0
>     	balloon: pgminfault=247914
>     	balloon: memfree=987032 KB
>     	balloon: memtot=1020812 KB
>   

We could probably use less obscure names for these stats like 
total_memory.  I know we're facing a global shortage of bits but I think 
this is excessive conservation :-)

We typically print these out as:

 balloon: actual=1024,pswapin=0,pswapout=0,panon=3928...

It's a bit less readable admittedly, but it's how things tend to work.

>    
>        Is this agreeable?
>     
>     Thank you for your comments...
>     
>     Signed-off-by: Adam Litke <agl@us.ibm.com>\
>   

The indent is pretty strange here.  git-send-email should help.

> diff --git a/balloon.h b/balloon.h
> index 60b4a5d..3f67021 100644
> --- a/balloon.h
> +++ b/balloon.h
> @@ -14,14 +14,21 @@
>  #ifndef _QEMU_BALLOON_H
>  #define _QEMU_BALLOON_H
>
> +#include "hw/virtio-balloon.h"
>  #include "cpu-defs.h"
>
> -typedef ram_addr_t (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
> +struct QEMUBalloonState {
> +    ram_addr_t actual;
> +    struct virtio_balloon_stats stats;
> +};
>   

We like to typedef the struct away from structures in qemu.  See 
CodingStyle.

The rest looks good.
Avi Kivity - Nov. 8, 2009, 9:02 a.m.
On 11/06/2009 12:50 AM, Adam Litke wrote:
>      [RFC] virtio: Report new guest memory statistics pertinent to memory ballooning
>
>      When using ballooning to manage overcommitted memory on a host, a system for
>      guests to communicate their memory usage to the host can provide information
>      that will minimize the impact of ballooning on the guests.  The current method
>      employs a daemon running in each guest that communicates memory statistics to a
>      host daemon at a specified time interval.  The host daemon aggregates this
>      information and inflates and/or deflates balloons according to the level of
>      host memory pressure.  This approach is effective but overly complex since a
>      daemon must be installed inside each guest and coordinated to communicate with
>      the host.  A simpler approach is to collect memory statistics in the virtio
>      balloon driver and communicate them to the host via the device config space.
>
>      This patch implements the qemu side of the communication channel.  I will post
>      the kernel driver modifications in-reply to this message.
>
>      I'd like to pose a few questions concerning my implementation:
>
>       * Is there a better way to use feature codes than using one per variable?
>
>       * This patch causes the 'info balloon' monitor command to generate output like
>         the following:
>
>      	(qemu) info balloon
>      	balloon: actual=1024 MB
>      	balloon: pswapin=0 pages
>      	balloon: pswapout=0 pages
>      	balloon: panon=3928 KB
>      	balloon: pgmajfault=0
>      	balloon: pgminfault=247914
>      	balloon: memfree=987032 KB
>      	balloon: memtot=1020812 KB
>
>         Is this agreeable?
>    

It's important that these statistics be cross platform.  Can you (or 
someone knowledgeable in Windows) verify that all of these are 
meaningful there?  I'd expect it to be so with the possible exception of 
panon.

Can you explain how the host would use these?  I can see how major 
faults are interesting, but how is the minor faults statistic useful?

Normally I'd ask that the values be in machine readable form, but with 
the machine monitor protocol, that's not necessary.
Jamie Lokier - Nov. 8, 2009, 11:27 p.m.
Avi Kivity wrote:
> >     	(qemu) info balloon
> >     	balloon: actual=1024 MB
> >     	balloon: pswapin=0 pages
> >     	balloon: pswapout=0 pages
> >     	balloon: panon=3928 KB
> >     	balloon: pgmajfault=0
> >     	balloon: pgminfault=247914
> >     	balloon: memfree=987032 KB
> >     	balloon: memtot=1020812 KB
> 
> It's important that these statistics be cross platform.  Can you (or 
> someone knowledgeable in Windows) verify that all of these are 
> meaningful there?  I'd expect it to be so with the possible exception of 
> panon.

pswapin, pswapout, pgmajfault and pgminfault are not meaningful even
on some Linux targets (those without MMU, and therefore no page
faults).  Yet even those could, perhaps, use virtio-balloon.

It's probably best to have a way to omit values that are not
meaningful to a guest, or otherwise indicate "not defined by the
current guest".

-- Jamie
Anthony Liguori - Nov. 9, 2009, 1:59 p.m.
Avi Kivity wrote:
>>
>>
>>          (qemu) info balloon
>>          balloon: actual=1024 MB
>>          balloon: pswapin=0 pages
>>          balloon: pswapout=0 pages
>>          balloon: panon=3928 KB
>>          balloon: pgmajfault=0
>>          balloon: pgminfault=247914
>>          balloon: memfree=987032 KB
>>          balloon: memtot=1020812 KB
>>
>>         Is this agreeable?
>>    
>
> It's important that these statistics be cross platform.  Can you (or 
> someone knowledgeable in Windows) verify that all of these are 
> meaningful there?  I'd expect it to be so with the possible exception 
> of panon.

They don't have to all be cross platform.  We can expose platform 
specific metrics via feature bits.  We should only disable features that 
the guest has acknowledged.

But of course, whenever possible, we want metrics to be meaningful cross 
platform.

Patch

diff --git a/balloon.h b/balloon.h
index 60b4a5d..3f67021 100644
--- a/balloon.h
+++ b/balloon.h
@@ -14,14 +14,21 @@ 
 #ifndef _QEMU_BALLOON_H
 #define _QEMU_BALLOON_H
 
+#include "hw/virtio-balloon.h"
 #include "cpu-defs.h"
 
-typedef ram_addr_t (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
+struct QEMUBalloonState {
+    ram_addr_t actual;
+    struct virtio_balloon_stats stats;
+};
+
+typedef void (QEMUBalloonEvent)(void *opaque, ram_addr_t target,
+                                      struct QEMUBalloonState *state);
 
 void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque);
 
 void qemu_balloon(ram_addr_t target);
 
-ram_addr_t qemu_balloon_status(void);
+int qemu_balloon_status(struct QEMUBalloonState *s);
 
 #endif
diff --git a/hw/virtio-balloon.c b/hw/virtio-balloon.c
index cfd3b41..1dc8345 100644
--- a/hw/virtio-balloon.c
+++ b/hw/virtio-balloon.c
@@ -30,6 +30,7 @@  typedef struct VirtIOBalloon
     VirtQueue *ivq, *dvq;
     uint32_t num_pages;
     uint32_t actual;
+    struct virtio_balloon_stats stats;
 } VirtIOBalloon;
 
 static VirtIOBalloon *to_virtio_balloon(VirtIODevice *vdev)
@@ -111,8 +112,15 @@  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
-
-    memcpy(config_data, &config, 8);
+    config.stats.pswapin = cpu_to_le32(dev->stats.pswapin);
+    config.stats.pswapout = cpu_to_le32(dev->stats.pswapout);
+    config.stats.panon = cpu_to_le32(dev->stats.panon);
+    config.stats.pgmajfault = cpu_to_le32(dev->stats.pgmajfault);
+    config.stats.pgminfault = cpu_to_le32(dev->stats.pgminfault);
+    config.stats.memfree = cpu_to_le32(dev->stats.memfree);
+    config.stats.memtot = cpu_to_le32(dev->stats.memtot);
+
+    memcpy(config_data, &config, sizeof(config));
 }
 
 static void virtio_balloon_set_config(VirtIODevice *vdev,
@@ -120,16 +128,39 @@  static void virtio_balloon_set_config(VirtIODevice *vdev,
 {
     VirtIOBalloon *dev = to_virtio_balloon(vdev);
     struct virtio_balloon_config config;
-    memcpy(&config, config_data, 8);
+    memcpy(&config, config_data, sizeof(config));
     dev->actual = config.actual;
+    dev->stats.pswapin = config.stats.pswapin;
+    dev->stats.pswapout = config.stats.pswapout;
+    dev->stats.panon = config.stats.panon;
+    dev->stats.pgmajfault = config.stats.pgmajfault;
+    dev->stats.pgminfault = config.stats.pgminfault;
+    dev->stats.memfree = config.stats.memfree;
+    dev->stats.memtot = config.stats.memtot;
 }
 
 static uint32_t virtio_balloon_get_features(VirtIODevice *vdev)
 {
-    return 0;
+    uint32_t features = 0;
+
+    features |= (1 << VIRTIO_BALLOON_F_RPT_SWAP_IN)  |
+                (1 << VIRTIO_BALLOON_F_RPT_SWAP_OUT) |
+                (1 << VIRTIO_BALLOON_F_RPT_ANON)     |
+                (1 << VIRTIO_BALLOON_F_RPT_MAJFLT)   |
+                (1 << VIRTIO_BALLOON_F_RPT_MINFLT)   |
+                (1 << VIRTIO_BALLOON_F_RPT_MEMFREE)  |
+                (1 << VIRTIO_BALLOON_F_RPT_MEMTOT);
+		
+    return features;
+}
+
+static inline int has_feature(VirtIOBalloon *dev, int feature)
+{
+	return (dev->vdev.features & 1<<feature);
 }
 
-static ram_addr_t virtio_balloon_to_target(void *opaque, ram_addr_t target)
+static void virtio_balloon_to_target(void *opaque, ram_addr_t target,
+                                     struct QEMUBalloonState *s)
 {
     VirtIOBalloon *dev = opaque;
 
@@ -141,7 +172,23 @@  static ram_addr_t virtio_balloon_to_target(void *opaque, ram_addr_t target)
         virtio_notify_config(&dev->vdev);
     }
 
-    return ram_size - (dev->actual << VIRTIO_BALLOON_PFN_SHIFT);
+    if (s) {
+        s->actual = ram_size - (dev->actual << VIRTIO_BALLOON_PFN_SHIFT);
+        s->stats.pswapin = has_feature(dev, VIRTIO_BALLOON_F_RPT_SWAP_OUT) ?
+                                       dev->stats.pswapin : -1;
+        s->stats.pswapout = has_feature(dev, VIRTIO_BALLOON_F_RPT_SWAP_OUT) ?
+                                        dev->stats.pswapout : -1;  
+        s->stats.panon = has_feature(dev, VIRTIO_BALLOON_F_RPT_ANON) ?
+                                     dev->stats.panon : -1;
+        s->stats.pgmajfault = has_feature(dev, VIRTIO_BALLOON_F_RPT_MAJFLT) ?
+                                          dev->stats.pgmajfault : -1;
+        s->stats.pgminfault = has_feature(dev, VIRTIO_BALLOON_F_RPT_MINFLT) ?
+                                          dev->stats.pgminfault : -1;
+        s->stats.memfree = has_feature(dev, VIRTIO_BALLOON_F_RPT_MEMFREE) ?
+                                       dev->stats.memfree : -1;
+        s->stats.memtot = has_feature(dev, VIRTIO_BALLOON_F_RPT_MEMTOT) ?
+                                      dev->stats.memtot : -1;
+    }
 }
 
 static void virtio_balloon_save(QEMUFile *f, void *opaque)
@@ -174,8 +221,9 @@  VirtIODevice *virtio_balloon_init(DeviceState *dev)
     VirtIOBalloon *s;
 
     s = (VirtIOBalloon *)virtio_common_init("virtio-balloon",
-                                            VIRTIO_ID_BALLOON,
-                                            8, sizeof(VirtIOBalloon));
+                             VIRTIO_ID_BALLOON,
+                             sizeof(struct virtio_balloon_config),
+                             sizeof(VirtIOBalloon));
 
     s->vdev.get_config = virtio_balloon_get_config;
     s->vdev.set_config = virtio_balloon_set_config;
diff --git a/hw/virtio-balloon.h b/hw/virtio-balloon.h
index 9a0d119..c93909e 100644
--- a/hw/virtio-balloon.h
+++ b/hw/virtio-balloon.h
@@ -25,16 +25,37 @@ 
 
 /* The feature bitmap for virtio balloon */
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
+                                        /* Guest memory statistic reporting */
+#define VIRTIO_BALLOON_F_RPT_SWAP_IN  1   /* Number of pages swapped in */
+#define VIRTIO_BALLOON_F_RPT_SWAP_OUT 2   /* Number of pages swapped out */
+#define VIRTIO_BALLOON_F_RPT_ANON     3   /* Number of anonymous pages in use */
+#define VIRTIO_BALLOON_F_RPT_MAJFLT   4   /* Number of major faults */
+#define VIRTIO_BALLOON_F_RPT_MINFLT   5   /* Number of minor faults */
+#define VIRTIO_BALLOON_F_RPT_MEMFREE  6   /* Total amount of free memory */
+#define VIRTIO_BALLOON_F_RPT_MEMTOT   7   /* Total amount of memory */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+struct virtio_balloon_stats
+{
+    uint32_t pswapin;     /* pages swapped in */
+    uint32_t pswapout;    /* pages swapped out */
+    uint32_t panon;       /* anonymous pages in use */
+    uint32_t pgmajfault;  /* Major page faults */
+    uint32_t pgminfault;  /* Minor page faults */
+    uint32_t memfree;     /* Total amount of free memory (in kb) */
+    uint32_t memtot;      /* Total amount of memory (in kb) */
+};
+
 struct virtio_balloon_config
 {
     /* Number of pages host wants Guest to give up. */
     uint32_t num_pages;
     /* Number of pages we've actually got in balloon. */
     uint32_t actual;
+    /* Memory statistics */
+    struct virtio_balloon_stats stats;
 };
 
 #endif
diff --git a/monitor.c b/monitor.c
index ed1ce6e..36cff11 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1581,18 +1581,35 @@  static void do_balloon(Monitor *mon, int value)
     qemu_balloon(target << 20);
 }
 
+static inline void print_stat(Monitor *mon, uint32_t val, const char *label,
+                              const char *units)
+{
+    if (val != -1)
+        monitor_printf(mon, "balloon: %s=%u %s\n", label, val, units);
+}
+
 static void do_info_balloon(Monitor *mon)
 {
-    ram_addr_t actual;
+    struct QEMUBalloonState s;
+    int ret;
 
-    actual = qemu_balloon_status();
+    ret = qemu_balloon_status(&s);
     if (kvm_enabled() && !kvm_has_sync_mmu())
         monitor_printf(mon, "Using KVM without synchronous MMU, "
                        "ballooning disabled\n");
-    else if (actual == 0)
+    else if (ret < 0)
         monitor_printf(mon, "Ballooning not activated in VM\n");
-    else
-        monitor_printf(mon, "balloon: actual=%d\n", (int)(actual >> 20));
+    else {
+        monitor_printf(mon, "balloon: actual=%d MB\n", (int)(s.actual >> 20));
+        print_stat(mon, s.stats.pswapin, "pswapin", "pages");
+        print_stat(mon, s.stats.pswapout, "pswapout", "pages");
+        print_stat(mon, s.stats.panon, "panon", "KB");
+        print_stat(mon, s.stats.pgmajfault, "pgmajfault", "");
+        print_stat(mon, s.stats.pgminfault, "pgminfault", "");
+        print_stat(mon, s.stats.memfree, "memfree", "KB");
+        print_stat(mon, s.stats.memtot, "memtot", "KB");
+    }
+        
 }
 
 static qemu_acl *find_acl(Monitor *mon, const char *name)
diff --git a/vl.c b/vl.c
index 710d52e..de3b4ae 100644
--- a/vl.c
+++ b/vl.c
@@ -334,13 +334,13 @@  void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque)
 void qemu_balloon(ram_addr_t target)
 {
     if (qemu_balloon_event)
-        qemu_balloon_event(qemu_balloon_event_opaque, target);
+        qemu_balloon_event(qemu_balloon_event_opaque, target, NULL);
 }
 
-ram_addr_t qemu_balloon_status(void)
+int qemu_balloon_status(struct QEMUBalloonState *s)
 {
     if (qemu_balloon_event)
-        return qemu_balloon_event(qemu_balloon_event_opaque, 0);
+        qemu_balloon_event(qemu_balloon_event_opaque, 0, s);
     return 0;
 }