Message ID | 1425020265-25939-3-git-send-email-den@openvz.org |
---|---|
State | New |
On Fri, 2015-02-27 at 09:57 +0300, Denis V. Lunev wrote:
> Excessive virtio_balloon inflation can cause invocation of OOM-killer,
> when Linux is under severe memory pressure. Various mechanisms are
> responsible for correct virtio_balloon memory management. Nevertheless it
> is often the case that these control tools does not have enough time to
> react on fast changing memory load. As a result OS runs out of memory and
> invokes OOM-killer. The balancing of memory by use of the virtio balloon
> should not cause the termination of processes while there are pages in the
> balloon. Now there is no way for virtio balloon driver to free memory at
> the last moment before some process get killed by OOM-killer.
>
> This does not provide a security breach as balloon itself is running
> inside Guest OS and is working in the cooperation with the host. Thus
> some improvements from Guest side should be considered as normal.
>
> To solve the problem, introduce a virtio_balloon callback which is
> expected to be called from the oom notifier call chain in out_of_memory()
> function. If virtio balloon could release some memory, it will make the
> system return and retry the allocation that forced the out of memory
> killer to run.
>
> This behavior should be enabled if and only if appropriate feature bit
> is set on the device. It is off by default.
>
> This functionality was recently merged into vanilla Linux.
>
> commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5
> Author: Raushaniya Maksudova <rmaksudova@parallels.com>
> Date: Mon Nov 10 09:36:29 2014 +1030
>
> This patch adds respective control bits into QEMU. It introduces
> deflate-on-oom option for balloon device which does the trick.

What's the status on this, please? It's been over a month since this
was posted with no further review feedback, so I think it's ready.
Getting this into qemu is blocking our next step which would be adding
the feature bit to the virtio spec.

James
On Wed, Apr 01, 2015 at 12:44:28PM +0300, James Bottomley wrote:
> What's the status on this, please? It's been over a month since this
> was posted with no further review feedback, so I think it's ready.
> Getting this into qemu is blocking our next step which would be adding
> the feature bit to the virtio spec.
>
> James

This was posted after soft feature freeze for 2.3, so it'll have to go
into 2.4. I don't see why would this block your work on the spec: you
should make progress on this meanwhile.
On Wed, 2015-04-01 at 11:50 +0200, Michael S. Tsirkin wrote:
> This was posted after soft feature freeze for 2.3, so it'll have to go
> into 2.4. I don't see why would this block your work on the spec: you
> should make progress on this meanwhile.

I can do that ... I just thought the spec was trailing edge, so I was
waiting to have the patch accepted, which confirms the implementation.
I didn't want to write it into the spec and have the actual
implementation changed by review later.

James
On Wed, Apr 01, 2015 at 12:51:42PM +0300, James Bottomley wrote:
> I can do that ... I just thought the spec was trailing edge, so I was
> waiting to have the patch accepted, which confirms the implementation.
> I didn't want to write it into the spec and have the actual
> implementation changed by review later.
>
> James

It's up to you really, I would just like to point out two things:
- spec process is a long one, assuming we accept a spec change,
  we go through a public review period, multiple votes etc.
  About half a year to release a spec revision with
  new features.
  So time enough to make minor changes.
- oasis process works like this (roughly):
    spec is written
    spec goes through a public review process
    community standard is published
    3 implementations are reported
    spec becomes an oasis standard
  so implementations aren't required at early stages
On 01/04/15 13:18, Michael S. Tsirkin wrote:
> It's up to you really, I would just like to point out two things:
> - spec process is a long one, assuming we accept a spec change,
>   we go through a public review period, multiple votes etc.
>   About half a year to release a spec revision with
>   new features.
>   So time enough to make minor changes.
> - oasis process works like this (roughly):
>     spec is written
>     spec goes through a public review process
>     community standard is published
>     3 implementations are reported
>     spec becomes an oasis standard
>   so implementations aren't required at early stages

2.3 is done, 2.4 window is opened....

The patch is applicable for both
git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
and vanilla qemu.

How can we proceed?
On Mon, 2015-05-04 at 12:47 +0300, Denis V. Lunev wrote:
> 2.3 is done, 2.4 window is opened....
>
> The patch is applicable for both
> git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git
> and vanilla qemu.
>
> How can we proceed?

The spec update supporting this feature is published for review:

https://www.oasis-open.org/committees/download.php/55709/virtio-v1.0-csprd04.zip

It's probably a good idea to have the implementation there as well. Do
we need to resend these patches?

James
On Mon, Jun 08, 2015 at 07:54:42AM -0700, James Bottomley wrote:
> The spec update supporting this feature is published for review:
>
> https://www.oasis-open.org/committees/download.php/55709/virtio-v1.0-csprd04.zip
>
> It's probably a good idea to have the implementation there as well. Do
> we need to resend these patches?
>
> James

Yes, please do.
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 21e449a..cbc5f7f 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -305,8 +305,8 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
 
 static uint32_t virtio_balloon_get_features(VirtIODevice *vdev, uint32_t f)
 {
-    f |= (1 << VIRTIO_BALLOON_F_STATS_VQ);
-    return f;
+    VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
+    return f | (1u << VIRTIO_BALLOON_F_STATS_VQ) | dev->host_features;
 }
 
 static void virtio_balloon_stat(void *opaque, BalloonInfo *info)
@@ -409,6 +409,8 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp)
 }
 
 static Property virtio_balloon_properties[] = {
+    DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 4ab8f54..7f49b1f 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -36,6 +36,7 @@ typedef struct VirtIOBalloon {
     QEMUTimer *stats_timer;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
+    uint32_t host_features;
 } VirtIOBalloon;
 
 #endif
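For readers who want to see the feature-bit arithmetic from the diff in isolation, here is a minimal standalone sketch. It is not QEMU code: `struct balloon_sketch` and both helper functions are hypothetical stand-ins for `VirtIOBalloon`, the `deflate-on-oom` property setter, and `virtio_balloon_get_features()`. The bit positions are the ones used by the virtio balloon device headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit positions of the virtio balloon device. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0
#define VIRTIO_BALLOON_F_STATS_VQ       1
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2

/* Hypothetical stand-in for the host_features field added to VirtIOBalloon. */
struct balloon_sketch {
    uint32_t host_features;
};

/* Models what a bit property such as "deflate-on-oom" does when it is set:
 * it sets or clears one bit in host_features. */
static void balloon_set_deflate_on_oom(struct balloon_sketch *b, bool on)
{
    uint32_t mask = 1u << VIRTIO_BALLOON_F_DEFLATE_ON_OOM;

    if (on) {
        b->host_features |= mask;
    } else {
        b->host_features &= ~mask;
    }
}

/* Mirrors the patched get_features logic: the stats queue bit is always
 * offered, and any bits enabled through properties are OR-ed in. */
static uint32_t balloon_get_features(const struct balloon_sketch *b, uint32_t f)
{
    return f | (1u << VIRTIO_BALLOON_F_STATS_VQ) | b->host_features;
}
```

With the property left at its default (off), only the stats bit is offered to the guest. Assuming the property name from this patch, enabling it on the command line would look something like `-device virtio-balloon-pci,deflate-on-oom=on`.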
Excessive virtio_balloon inflation can cause invocation of the OOM-killer
when Linux is under severe memory pressure. Various mechanisms are
responsible for correct virtio_balloon memory management. Nevertheless, it
is often the case that these control tools do not have enough time to
react to a fast-changing memory load. As a result, the OS runs out of memory
and invokes the OOM-killer. The balancing of memory by use of the virtio
balloon should not cause the termination of processes while there are pages
in the balloon. Currently there is no way for the virtio balloon driver to
free memory at the last moment before some process gets killed by the
OOM-killer.

This does not introduce a security breach, as the balloon itself runs
inside the guest OS and works in cooperation with the host. Thus
improvements on the guest side should be considered normal.

To solve the problem, introduce a virtio_balloon callback which is
expected to be called from the oom notifier call chain in the
out_of_memory() function. If the virtio balloon can release some memory,
the system returns and retries the allocation that forced the out of
memory killer to run.

This behavior should be enabled if and only if the appropriate feature bit
is set on the device. It is off by default.

This functionality was recently merged into vanilla Linux.

  commit 5a10b7dbf904bfe01bb9fcc6298f7df09eed77d5
  Author: Raushaniya Maksudova <rmaksudova@parallels.com>
  Date: Mon Nov 10 09:36:29 2014 +1030

This patch adds the respective control bits to QEMU. It introduces a
deflate-on-oom option for the balloon device which does the trick.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Raushaniya Maksudova <rmaksudova@parallels.com>
CC: Anthony Liguori <aliguori@amazon.com>
CC: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c         | 6 ++++--
 include/hw/virtio/virtio-balloon.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)
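The guest-side behavior described in the commit message can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the Linux driver: `struct balloon_model`, `balloon_oom_notify()`, `out_of_memory_model()` and the `OOM_DEFLATE_CHUNK` value are all hypothetical names and numbers chosen for illustration. The idea it shows is that an OOM callback releases balloon pages when the feature was negotiated, and a nonzero count of freed pages lets the allocation be retried instead of a process being killed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: how many pages the model releases per OOM event. */
#define OOM_DEFLATE_CHUNK 256

/* Hypothetical model of the guest balloon state. */
struct balloon_model {
    bool deflate_on_oom;  /* feature bit was negotiated with the device */
    size_t num_pages;     /* pages currently held in the balloon */
};

/* Models the OOM notifier callback: returns the number of pages given
 * back to the guest. Does nothing unless the feature is enabled. */
static size_t balloon_oom_notify(struct balloon_model *b)
{
    size_t n;

    if (!b->deflate_on_oom) {
        return 0;
    }
    n = b->num_pages < OOM_DEFLATE_CHUNK ? b->num_pages : OOM_DEFLATE_CHUNK;
    b->num_pages -= n;
    return n;
}

/* Models the out-of-memory path: returns true if a process would have
 * to be killed, false if the allocation can be retried because the
 * notifier freed some pages. */
static bool out_of_memory_model(struct balloon_model *b)
{
    size_t freed = balloon_oom_notify(b);

    return freed == 0;
}
```

In this model a balloon holding 300 pages survives two OOM events (releasing 256 pages, then 44) before the killer would run, while a balloon without the feature bit keeps all its pages and the killer runs immediately.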