Patchwork whitelist host virtio networking features [was Re: qemu-kvm-0.11 regression, crashes on older ...]

login
register
mail settings
Submitter Dustin Kirkland
Date Oct. 29, 2009, 3:34 p.m.
Message ID <1256830455.25064.155.camel@x200>
Download mbox | patch
Permalink /patch/37200/
State New
Headers show

Comments

Dustin Kirkland - Oct. 29, 2009, 3:34 p.m.
whitelist host virtio networking features

This patch is a followup to 8eca6b1bc770982595db2f7207c65051572436cb,
fixing crashes when guests with 2.6.25 virtio drivers have saturated
virtio network connections.

https://bugs.edge.launchpad.net/ubuntu/+source/qemu-kvm/+bug/458521

That patch should have been whitelisting *_HOST_* rather than the the
*_GUEST_* features.

I tested this by running an Ubuntu 8.04 Hardy guest (2.6.24 kernel +
2.6.25-virtio driver).  I saturated both the incoming, and outgoing
network connection with nc, seeing sustained 6MB/s up and 6MB/s down
bitrates for ~20 minutes.  Previously, this crashed immediately.  Now,
the guest does not crash and maintains network connectivity throughout
the test.

Signed-off-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Tested-by: Dustin Kirkland <kirkland@canonical.com>
Dustin Kirkland - Oct. 30, 2009, 9:15 p.m.
On Thu, Oct 29, 2009 at 10:34 AM, Dustin Kirkland
<kirkland@canonical.com> wrote:
> whitelist host virtio networking features
>
> This patch is a followup to 8eca6b1bc770982595db2f7207c65051572436cb,
> fixing crashes when guests with 2.6.25 virtio drivers have saturated
> virtio network connections.
>
> https://bugs.edge.launchpad.net/ubuntu/+source/qemu-kvm/+bug/458521
>
> That patch should have been whitelisting *_HOST_* rather than the the
> *_GUEST_* features.
>
> I tested this by running an Ubuntu 8.04 Hardy guest (2.6.24 kernel +
> 2.6.25-virtio driver).  I saturated both the incoming, and outgoing
> network connection with nc, seeing sustained 6MB/s up and 6MB/s down
> bitrates for ~20 minutes.  Previously, this crashed immediately.  Now,
> the guest does not crash and maintains network connectivity throughout
> the test.
<snip>

FYI...

Canonical's Ubuntu Security Team will be filing a CVE on this issue,
since there is a bit of an attack vector here, and since
qemu-kvm-0.11.0 is generally available as an official release (and now
part of Ubuntu 9.10).

Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
network user flooding an open port on the guest.  The crash happens in
a manner that abruptly terminates the guest's execution (ie, without
shutting down cleanly).  This may affect the guest filesystem's
general happiness.

:-Dustin
Mark McLoughlin - Nov. 2, 2009, 2:38 p.m.
On Fri, 2009-10-30 at 16:15 -0500, Dustin Kirkland wrote:
> On Thu, Oct 29, 2009 at 10:34 AM, Dustin Kirkland
> <kirkland@canonical.com> wrote:
> > whitelist host virtio networking features
> >
> > This patch is a followup to 8eca6b1bc770982595db2f7207c65051572436cb,
> > fixing crashes when guests with 2.6.25 virtio drivers have saturated
> > virtio network connections.
> >
> > https://bugs.edge.launchpad.net/ubuntu/+source/qemu-kvm/+bug/458521
> >
> > That patch should have been whitelisting *_HOST_* rather than the the
> > *_GUEST_* features.
> >
> > I tested this by running an Ubuntu 8.04 Hardy guest (2.6.24 kernel +
> > 2.6.25-virtio driver).  I saturated both the incoming, and outgoing
> > network connection with nc, seeing sustained 6MB/s up and 6MB/s down
> > bitrates for ~20 minutes.  Previously, this crashed immediately.  Now,
> > the guest does not crash and maintains network connectivity throughout
> > the test.
> <snip>
> 
> FYI...

Thanks for the notice

> Canonical's Ubuntu Security Team will be filing a CVE on this issue,
> since there is a bit of an attack vector here, and since
> qemu-kvm-0.11.0 is generally available as an official release (and now
> part of Ubuntu 9.10).
> 
> Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
> top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
> network user flooding an open port on the guest.  The crash happens in
> a manner that abruptly terminates the guest's execution (ie, without
> shutting down cleanly).  This may affect the guest filesystem's
> general happiness.

IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
in the guest and the issue we're discussing here is just a hacky
workaround for the guest bug.

Cheers,
Mark.
Anthony Liguori - Nov. 2, 2009, 3:42 p.m.
Mark McLoughlin wrote:
>> Canonical's Ubuntu Security Team will be filing a CVE on this issue,
>> since there is a bit of an attack vector here, and since
>> qemu-kvm-0.11.0 is generally available as an official release (and now
>> part of Ubuntu 9.10).
>>
>> Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
>> top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
>> network user flooding an open port on the guest.  The crash happens in
>> a manner that abruptly terminates the guest's execution (ie, without
>> shutting down cleanly).  This may affect the guest filesystem's
>> general happiness.
>>     
>
> IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
> in the guest and the issue we're discussing here is just a hacky
> workaround for the guest bug.
>   

Yeah, I'm inclined to agree.  The guest generates bad data and we exit.  
exit()ing is probably not wonderful but it's a well understood behavior.

The fundamental bug here is in the guest, not in qemu.

Regards,

Anthony Liguori

> Cheers,
> Mark.
>
>
Jamie Lokier - Nov. 2, 2009, 3:52 p.m.
Anthony Liguori wrote:
> Mark McLoughlin wrote:
> >>Canonical's Ubuntu Security Team will be filing a CVE on this issue,
> >>since there is a bit of an attack vector here, and since
> >>qemu-kvm-0.11.0 is generally available as an official release (and now
> >>part of Ubuntu 9.10).
> >>
> >>Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
> >>top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
> >>network user flooding an open port on the guest.  The crash happens in
> >>a manner that abruptly terminates the guest's execution (ie, without
> >>shutting down cleanly).  This may affect the guest filesystem's
> >>general happiness.
> >>    
> >
> >IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
> >in the guest and the issue we're discussing here is just a hacky
> >workaround for the guest bug.
> >  
> 
> Yeah, I'm inclined to agree.  The guest generates bad data and we exit.  
> exit()ing is probably not wonderful but it's a well understood behavior.
> 
> The fundamental bug here is in the guest, not in qemu.

Guests should never be able to crash or terminate qemu, unless they
call something that is intentionally an "exit qemu" hook for the
guest.  And even that should be possible to disable.

What happens if the guest is running malicious code?  What if it's
been hacked?  The worst that should happen is the guest behaves like a
hacked machine.

What about running old machine images?  One of the major uses I've
seen of KVM is for running old machine images - images which are not
to be updated, so that they continue to have the same behaviour.

I agree that the 2.6.25 virtio drivers have a bug and ought to be
fixed, but a qemu which abruptly terminates is never good.

-- Jamie
Dustin Kirkland - Nov. 2, 2009, 4:58 p.m.
On Mon, Nov 2, 2009 at 8:38 AM, Mark McLoughlin <markmc@redhat.com> wrote:
> On Fri, 2009-10-30 at 16:15 -0500, Dustin Kirkland wrote:
>> Canonical's Ubuntu Security Team will be filing a CVE on this issue,
>> since there is a bit of an attack vector here, and since
>> qemu-kvm-0.11.0 is generally available as an official release (and now
>> part of Ubuntu 9.10).
>>
>> Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
>> top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
>> network user flooding an open port on the guest.  The crash happens in
>> a manner that abruptly terminates the guest's execution (ie, without
>> shutting down cleanly).  This may affect the guest filesystem's
>> general happiness.
>
> IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
> in the guest and the issue we're discussing here is just a hacky
> workaround for the guest bug.

Kees/Jamie/Marc-

I think Mark has a good point.  This bug has two parts.  Ultimately,
it's triggered by a buggy virtio-net implementation in the Ubuntu 8.04
kernel (as well as any others using the circa 2.6.25 virtio net code).
 The CVE should probably mention (or focus on) this too.

The qemu-kvm patch is still a good thing to do, as it shouldn't just
exit and terminate the VM, so that's needed as well.

:-Dustin
Michael Tokarev - Nov. 2, 2009, 6:20 p.m.
Jamie Lokier wrote:
> Anthony Liguori wrote:
>> Mark McLoughlin wrote:
>>>> Canonical's Ubuntu Security Team will be filing a CVE on this issue,
>>>> since there is a bit of an attack vector here, and since
>>>> qemu-kvm-0.11.0 is generally available as an official release (and now
>>>> part of Ubuntu 9.10).
>>>>
>>>> Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
>>>> top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
>>>> network user flooding an open port on the guest.  The crash happens in
>>>> a manner that abruptly terminates the guest's execution (ie, without
>>>> shutting down cleanly).  This may affect the guest filesystem's
>>>> general happiness.
>>>>    
>>> IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
>>> in the guest and the issue we're discussing here is just a hacky
>>> workaround for the guest bug.
>>>  
>> Yeah, I'm inclined to agree.  The guest generates bad data and we exit.  
>> exit()ing is probably not wonderful but it's a well understood behavior.
>>
>> The fundamental bug here is in the guest, not in qemu.
> 
> Guests should never be able to crash or terminate qemu, unless they
> call something that is intentionally an "exit qemu" hook for the
> guest.  And even that should be possible to disable.

Well, if your buggy NIC driver does something wrong programming
the hardware (like the famous r8169 did - it allocated less
buffer space than telling to the card, so the card were happily
overwriting unrelated kernel memory with content received from
network), you will most likely get a machine which does not
respond to external events, a stuck machine, until you hit
"reset" button (provided there is one) or toggle power.
Or just a reboot, depending on what exactly you've hit.

If you want kvm to behave like this, wrap it into a trivial
shell script that restarts the guest.

/mjt
Anthony Liguori - Nov. 2, 2009, 6:55 p.m.
Jamie Lokier wrote:
> Anthony Liguori wrote:
>   
>> Mark McLoughlin wrote:
>>     
>>>> Canonical's Ubuntu Security Team will be filing a CVE on this issue,
>>>> since there is a bit of an attack vector here, and since
>>>> qemu-kvm-0.11.0 is generally available as an official release (and now
>>>> part of Ubuntu 9.10).
>>>>
>>>> Guests running linux <= 2.6.25 virtio-net (e.g Ubuntu 8.04 hardy) on
>>>> top of qemu-kvm-0.11.0 can be remotely crashed by a non-privileged
>>>> network user flooding an open port on the guest.  The crash happens in
>>>> a manner that abruptly terminates the guest's execution (ie, without
>>>> shutting down cleanly).  This may affect the guest filesystem's
>>>> general happiness.
>>>>    
>>>>         
>>> IMHO, the CVE should be against the 2.6.25 virtio drivers - the bug is
>>> in the guest and the issue we're discussing here is just a hacky
>>> workaround for the guest bug.
>>>  
>>>       
>> Yeah, I'm inclined to agree.  The guest generates bad data and we exit.  
>> exit()ing is probably not wonderful but it's a well understood behavior.
>>
>> The fundamental bug here is in the guest, not in qemu.
>>     
>
> Guests should never be able to crash or terminate qemu, unless they
> call something that is intentionally an "exit qemu" hook for the
> guest.  And even that should be possible to disable.
>   

They can exit qemu via an ACPI shutdown.  I don't see the difference.

Regards,

Anthony Liguori
Dustin Kirkland - Nov. 2, 2009, 7:25 p.m.
On Mon, 2009-11-02 at 12:55 -0600, Anthony Liguori wrote:
> They can exit qemu via an ACPI shutdown.  I don't see the difference.

An ACPI shutdown is triggered by an authenticated user inside of the
guest.

The present exit is triggered by any other anonymous user on the
network, with the ability to send a lot of packets very quickly to the
VM guest.  The guest isn't able to handle this properly (and rightly
that guest's kernel should be fixed).  But I do see a difference.

:-Dustin
Jamie Lokier - Nov. 2, 2009, 7:39 p.m.
Michael Tokarev wrote:
> If you want kvm to behave like this, wrap it into a trivial
> shell script that restarts the guest.

True, kvm has enough crash-bugs elsewhere that I already have to deal
with that.  It'd be nice to distinguish kvm/qemu bugs from guest
bugs, though :-)

kvm/qemu also has lock-up bugs, where it's spinning at 100% and the
guest seems to be stuck (even though the VNC server continues to
work).  That's a bit harder to fix with a wrapper script.

-- Jamie
Anthony Liguori - Nov. 2, 2009, 8:50 p.m.
Dustin Kirkland wrote:
> On Mon, 2009-11-02 at 12:55 -0600, Anthony Liguori wrote:
>   
>> They can exit qemu via an ACPI shutdown.  I don't see the difference.
>>     
>
> An ACPI shutdown is triggered by an authenticated user inside of the
> guest.
>
> The present exit is triggered by any other anonymous user on the
> network, with the ability to send a lot of packets very quickly to the
> VM guest.  The guest isn't able to handle this properly (and rightly
> that guest's kernel should be fixed).  But I do see a difference.
>   

Well the problem is triggered by the guest kernel writing garbage to 
virtio-net's backend.  That's why we're suggesting it's really a guest 
kernel issue.

If the guest kernel writes something bad to qemu, we're may kill the 
guest.  That's not a qemu bug, it's the designed behavior.

Regards,

Anthony Liguori

> :-Dustin
>
Jamie Lokier - Nov. 5, 2009, 5:06 a.m.
Anthony Liguori wrote:
> Dustin Kirkland wrote:
> >On Mon, 2009-11-02 at 12:55 -0600, Anthony Liguori wrote:
> >  
> >>They can exit qemu via an ACPI shutdown.  I don't see the difference.
> >>    
> >
> >An ACPI shutdown is triggered by an authenticated user inside of the
> >guest.
> >
> >The present exit is triggered by any other anonymous user on the
> >network, with the ability to send a lot of packets very quickly to the
> >VM guest.  The guest isn't able to handle this properly (and rightly
> >that guest's kernel should be fixed).  But I do see a difference.
> >  
> 
> Well the problem is triggered by the guest kernel writing garbage to 
> virtio-net's backend.  That's why we're suggesting it's really a guest 
> kernel issue.
> 
> If the guest kernel writes something bad to qemu, we're may kill the 
> guest.  That's not a qemu bug, it's the designed behavior.

I'm ok with killing the guest, but think the right behaviour is
whatever the user's configured to happen when the guest is killed, in
the same way that ACPI shutdown does not shutdown if -no-shutdown is
used, and you can debug a guest which does that.

A guest writing garbage to virtio-net should not be able to use it to
escape being debugged :-)

How can a manager program determine if the exit is due to intentional
shutdown or crashing guest?

I'd be inclined to make those "exit() called from the device
emulation" cases do whatever -watchdog-action says, if the watchdog is
enabled.  With a great big log message.

-- Jamie

Patch

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ce8e6cb..27834fa 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -164,10 +164,10 @@  static uint32_t virtio_net_bad_features(VirtIODevice *vdev)
     /* Linux kernel 2.6.25.  It understood MAC (as everyone must),
      * but also these: */
     features |= (1 << VIRTIO_NET_F_MAC);
-    features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
-    features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
-    features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
-    features |= (1 << VIRTIO_NET_F_GUEST_ECN);
+    features |= (1 << VIRTIO_NET_F_CSUM);
+    features |= (1 << VIRTIO_NET_F_HOST_TSO4);
+    features |= (1 << VIRTIO_NET_F_HOST_TSO6);
+    features |= (1 << VIRTIO_NET_F_HOST_ECN);
 
     return features & virtio_net_get_features(vdev);
 }