
qemu-kvm-0.11 regression, crashes on older guests with virtio network

Message ID 1256815818-sup-7805@xpc65.scottt
State New

Commit Message

Scott Tsai Oct. 29, 2009, noon UTC
Excerpts from Mark McLoughlin's message of Thu Oct 29 17:16:43 +0800 2009:
> Assuming this is something like the virtio-net in 2.6.26, there was no
> receivable buffers support so (as Scott points out) it must be that
> we've read a packet from the tap device which is >1514 bytes (or >1524
> bytes with IFF_VNET_HDR) but the guest has not supplied buffers which
> are large enough to take it

> One thing to check is that the tap device is being initialized by
> qemu-kvm using TUNSETOFFLOAD with either zero or TUN_F_CSUM - i.e. GSO
> should not be enabled, because the guest cannot handle large GSO packets

> Another possibility is that the MTU on the bridge in the host is too
> large and that's what's causing the large packets to be sent

Using Dustin's image, I see:
	virtio_net_set_features(features: 0x00000930)
	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
being called, and I get an MTU of 1500 on virbr0 using his bridge.sh script.
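
For reference, a minimal sketch of how the four booleans in that trace could map onto the flag word passed to TUNSETOFFLOAD. The TUN_F_* values are the ones from <linux/if_tun.h>, repeated here so the snippet builds on any host; the helper name tap_offload_flags is made up for illustration and is not qemu's function.

```c
#include <stdbool.h>

/* Values as defined in <linux/if_tun.h>, repeated so the sketch is
 * self-contained. */
#define TUN_F_CSUM    0x01  /* tap may hand us unchecksummed packets */
#define TUN_F_TSO4    0x02  /* tap may hand us large TCPv4 (GSO) packets */
#define TUN_F_TSO6    0x04  /* tap may hand us large TCPv6 (GSO) packets */
#define TUN_F_TSO_ECN 0x08  /* TSO packets may carry ECN bits */

/* Hypothetical helper: build the argument for
 * ioctl(fd, TUNSETOFFLOAD, flags). */
unsigned int tap_offload_flags(bool csum, bool tso4, bool tso6, bool ecn)
{
    unsigned int flags = 0;

    if (csum) {
        flags |= TUN_F_CSUM;
        /* Assumption in this sketch: TSO is only requested on top of
         * checksum offload. */
        if (tso4)
            flags |= TUN_F_TSO4;
        if (tso6)
            flags |= TUN_F_TSO6;
        if ((tso4 || tso6) && ecn)
            flags |= TUN_F_TSO_ECN;
    }
    return flags;
}
```

With csum, tso4, tso6 and ecn all set, as in the trace, every flag is requested, so the tap device is allowed to deliver GSO packets larger than the MTU.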

virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
and the guest only had 1524 bytes of space in its input descriptors.

BTW, I can also reproduce this running Dustin's image inside Fedora 11's qemu-0.10.6-9.fc11.x86_64.

The patch I posted earlier actually only applies to the 0.10 branch, here's a patch that compiles for 0.11:

From 06aa7db0705cf747c35cbcbd09d0e37713f16fe4 Mon Sep 17 00:00:00 2001
From: Scott Tsai <scottt.tw@gmail.com>
Date: Thu, 29 Oct 2009 10:56:12 +0800
Subject: [PATCH] virtio-net: drop large packets when no mergable_rx_bufs

Currently virtio-net calls exit(1) when it receives a large packet and
the VIRTIO_NET_F_MRG_RXBUF feature isn't set.
Change it to drop the packet instead.

see: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/458521
---
 hw/virtio-net.c |    8 +++++++-
 hw/virtio.c     |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletions(-)
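
To illustrate the check this patch adds, here is a simplified, standalone sketch. It is not the qemu code: struct rx_desc and chain_can_hold are hypothetical stand-ins for the vring descriptor and for buffer_fits_in_virtqueue_top, and only the writable (device-to-guest) descriptors are counted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one vring descriptor; not qemu's VRingDesc. */
struct rx_desc {
    size_t len;       /* bytes of guest memory behind this descriptor */
    bool   writable;  /* plays the role of VRING_DESC_F_WRITE */
};

/* Return true if 'size' bytes fit in the writable descriptors of one
 * chain, i.e. the packet could be delivered without truncation. */
bool chain_can_hold(const struct rx_desc *chain, size_t n, size_t size)
{
    size_t room = 0;

    for (size_t i = 0; i < n; i++) {
        if (chain[i].writable)
            room += chain[i].len;
    }
    return room >= size;
}
```

With the numbers reported above, a chain offering 1524 writable bytes cannot hold a 1534 byte transfer (1524 'size' + 10 'virtio_net_hdr'), so the packet would be dropped rather than hitting exit(1).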

Comments

Mark McLoughlin Oct. 29, 2009, 12:16 p.m. UTC | #1
On Thu, 2009-10-29 at 20:00 +0800, Scott Tsai wrote:
> Excerpts from Mark McLoughlin's message of Thu Oct 29 17:16:43 +0800 2009:
> > Assuming this is something like the virtio-net in 2.6.26, there was no
> > receivable buffers support so (as Scott points out) it must be that
> > we've read a packet from the tap device which is >1514 bytes (or >1524
> > bytes with IFF_VNET_HDR) but the guest has not supplied buffers which
> > are large enough to take it
> 
> > One thing to check is that the tap device is being initialized by
> > qemu-kvm using TUNSETOFFLOAD with either zero or TUN_F_CSUM - i.e. GSO
> > should not be enabled, because the guest cannot handle large GSO packets
> 
> > Another possibility is that the MTU on the bridge in the host is too
> > large and that's what's causing the large packets to be sent
> 
> Using Dustin's image, I see:
> 	virtio_net_set_features(features: 0x00000930)

Hmm - 0x930 doesn't seem right. Is that 930 decimal, 0x3a2 hex?

> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
> 
> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
> and the guest only had 1524 bytes of space in its input descriptors.

Okay, that sounds like a bug in Dustin's version of the guest virtio-net
driver - if it is only supplying 1524 byte buffers, it should not be
saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature

Cheers,
Mark.
Scott Tsai Oct. 29, 2009, 12:21 p.m. UTC | #2
> Hmm - 0x930 doesn't seem right. Is that 930 decimal, 0x3a2 hex?

yup. printf format string typo.
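
For anyone decoding the value by hand: 930 decimal is 0x3a2, which has bits 1, 5, 7, 8 and 9 set. A small sketch follows; the bit numbers are the virtio-net feature numbers, and has_feature is a hypothetical helper, not a qemu function.

```c
#include <stdint.h>

/* virtio-net feature bit numbers relevant to this thread. */
enum {
    VIRTIO_NET_F_GUEST_CSUM = 1,
    VIRTIO_NET_F_MAC        = 5,
    VIRTIO_NET_F_GUEST_TSO4 = 7,
    VIRTIO_NET_F_GUEST_TSO6 = 8,
    VIRTIO_NET_F_GUEST_ECN  = 9,
    VIRTIO_NET_F_MRG_RXBUF  = 15
};

/* Hypothetical helper: nonzero if feature 'bit' is set in 'features'. */
int has_feature(uint32_t features, unsigned int bit)
{
    return (int)((features >> bit) & 1u);
}
```

So the guest acked GUEST_CSUM, MAC, GUEST_TSO4, GUEST_TSO6 and GUEST_ECN, and did not ack MRG_RXBUF.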
Anthony Liguori Oct. 29, 2009, 2:11 p.m. UTC | #3
Mark McLoughlin wrote:
>
>> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
>> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
>>
>> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
>> and the guest only had 1524 bytes of space in its input descriptors.
>>     
>
> Okay, that sounds like a bug in Dustin's version of the guest virtio-net
> driver - if it is only supplying 1524 byte buffers, it should not be
> saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature
>   

See:

commit 8eca6b1bc770982595db2f7207c65051572436cb
Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Date:   Sun Apr 5 17:40:08 2009 +0000

    Fix oops on 2.6.25 guest (Rusty Russell)
   
    I believe this is behind the following:
    https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128
   
    virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
    bit.  Fortunately, we can detect this.
   
    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

It looks like Rusty's fix wasn't enough.  If I change virtio-net to only 
advertise F_MAC, we don't run into this problem.

Regards,

Anthony Liguori
Mark McLoughlin Oct. 29, 2009, 2:25 p.m. UTC | #4
On Thu, 2009-10-29 at 09:11 -0500, Anthony Liguori wrote:
> Mark McLoughlin wrote:
> >
> >> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
> >> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
> >>
> >> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
> >> and the guest only had 1524 bytes of space in its input descriptors.
> >>     
> >
> > Okay, that sounds like a bug in Dustin's version of the guest virtio-net
> > driver - if it is only supplying 1524 byte buffers, it should not be
> > saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature
> >   
> 
> See:
> 
> commit 8eca6b1bc770982595db2f7207c65051572436cb
> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
> Date:   Sun Apr 5 17:40:08 2009 +0000
> 
>     Fix oops on 2.6.25 guest (Rusty Russell)
>    
>     I believe this is behind the following:
>     https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128
>    
>     virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
>     bit.  Fortunately, we can detect this.
>    
>     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> 
> It looks like Rusty's fix wasn't enough.  If I change virtio-net to only 
> advertise F_MAC, we don't run into this problem.

If it's not acking VBAD_FEATURE, then it doesn't sound like the same
issue

It's also not acking e.g. MRG_RXBUF, which suggests that it is
selectively acking features, and choosing to ack TSO4

A quick look through the guest driver code should clear up the
confusion. Dustin, got a pointer?

Thanks,
Mark.
Dustin Kirkland Oct. 29, 2009, 2:34 p.m. UTC | #5
On Thu, 2009-10-29 at 14:25 +0000, Mark McLoughlin wrote:
> On Thu, 2009-10-29 at 09:11 -0500, Anthony Liguori wrote:
> > Mark McLoughlin wrote:
> > >
> > >> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
> > >> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
> > >>
> > >> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
> > >> and the guest only had 1524 bytes of space in its input descriptors.
> > >>     
> > >
> > > Okay, that sounds like a bug in Dustin's version of the guest virtio-net
> > > driver - if it is only supplying 1524 byte buffers, it should not be
> > > saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature
> > >   
> > 
> > See:
> > 
> > commit 8eca6b1bc770982595db2f7207c65051572436cb
> > Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
> > Date:   Sun Apr 5 17:40:08 2009 +0000
> > 
> >     Fix oops on 2.6.25 guest (Rusty Russell)
> >    
> >     I believe this is behind the following:
> >     https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128
> >    
> >     virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
> >     bit.  Fortunately, we can detect this.
> >    
> >     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> >     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> > 
> > It looks like Rusty's fix wasn't enough.  If I change virtio-net to only 
> > advertise F_MAC, we don't run into this problem.
> 
> If it's not acking VBAD_FEATURE, then it doesn't sound like the same
> issue
> 
> It's also not acking e.g. MRG_RXBUF, which suggests that it is
> selectively acking features, and choosing to ack TSO4
> 
> A quick look through the guest driver code should clear up the
> confusion. Dustin, got a pointer?

Hi Mark,

I'm currently testing Scott's patch above.

In the meantime, Hardy's kernel is in git here:

http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=summary

Thanks,
:-Dustin
Anthony Liguori Oct. 29, 2009, 2:39 p.m. UTC | #6
Mark McLoughlin wrote:
> On Thu, 2009-10-29 at 09:11 -0500, Anthony Liguori wrote:
>   
>> Mark McLoughlin wrote:
>>     
>>>> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
>>>> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
>>>>
>>>> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
>>>> and the guest only had 1524 bytes of space in its input descriptors.
>>>>     
>>>>         
>>> Okay, that sounds like a bug in Dustin's version of the guest virtio-net
>>> driver - if it is only supplying 1524 byte buffers, it should not be
>>> saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature
>>>   
>>>       
>> See:
>>
>> commit 8eca6b1bc770982595db2f7207c65051572436cb
>> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
>> Date:   Sun Apr 5 17:40:08 2009 +0000
>>
>>     Fix oops on 2.6.25 guest (Rusty Russell)
>>    
>>     I believe this is behind the following:
>>     https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128
>>    
>>     virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
>>     bit.  Fortunately, we can detect this.
>>    
>>     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>
>> It looks like Rusty's fix wasn't enough.  If I change virtio-net to only 
>> advertise F_MAC, we don't run into this problem.
>>     
>
> If it's not acking VBAD_FEATURE, then it doesn't sound like the same
> issue
>   

It was acking VBAD_FEATURE when I tested it.

But if you look at the patch, it whitelists the following features:

    features |= (1 << VIRTIO_NET_F_MAC);
    features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
    features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
    features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
    features |= (1 << VIRTIO_NET_F_GUEST_ECN);

Which is why it's ack'ing TSO4.   Removing TSO4 didn't seem to fix it 
for me.

Regards,

Anthony Liguori
Dustin Kirkland Oct. 29, 2009, 2:39 p.m. UTC | #7
On Thu, 2009-10-29 at 20:00 +0800, Scott Tsai wrote:
> Excerpts from Mark McLoughlin's message of Thu Oct 29 17:16:43 +0800 2009:
> > Assuming this is something like the virtio-net in 2.6.26, there was no
> > receivable buffers support so (as Scott points out) it must be that
> > we've read a packet from the tap device which is >1514 bytes (or >1524
> > bytes with IFF_VNET_HDR) but the guest has not supplied buffers which
> > are large enough to take it
> 
> > One thing to check is that the tap device is being initialized by
> > qemu-kvm using TUNSETOFFLOAD with either zero or TUN_F_CSUM - i.e. GSO
> > should not be enabled, because the guest cannot handle large GSO packets
> 
> > Another possibility is that the MTU on the bridge in the host is too
> > large and that's what's causing the large packets to be sent
> 
> Using Dustin's image, I see:
> 	virtio_net_set_features(features: 0x00000930)
> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
> 
> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
> and the guest only had 1524 bytes of space in its input descriptors.
> 
> BTW, I can also reproduce this running Dustin's image inside Fedora 11's qemu-0.10.6-9.fc11.x86_64.
> 
> The patch I posted earlier actually only applies to the 0.10 branch, here's a patch that compiles for 0.11:


Hi Scott,

Thanks for this.  With this patch applied, kvm doesn't crash, and the
guest has working network connectivity until I saturate the network
connection with nc.  At that point, the guest loses network connectivity
altogether.  So the fix is not quite ideal yet.

:-Dustin
Dustin Kirkland Oct. 29, 2009, 2:46 p.m. UTC | #8
On Thu, 2009-10-29 at 09:34 -0500, Dustin Kirkland wrote:
> In the mean time, Hardy's kernel is in git here:
> 
> http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=summary

I'll save you a few clicks...

http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=blob;f=drivers/net/virtio_net.c;h=d1a200ff5fd266c05e9a876e5e4e550737f77d84;hb=HEAD

:-dustin
Mark McLoughlin Oct. 29, 2009, 2:48 p.m. UTC | #9
On Thu, 2009-10-29 at 09:39 -0500, Anthony Liguori wrote:
> Mark McLoughlin wrote:
> > On Thu, 2009-10-29 at 09:11 -0500, Anthony Liguori wrote:
> >   
> >> Mark McLoughlin wrote:
> >>     
> >>>> 	tap_set_offload(csum: 1, tso4: 1, tso6: 1, ecn: 1)
> >>>> being called and get an mtu of 1500 on virbr0 using his birdge.sh script.
> >>>>
> >>>> virtio_net_receive2 was trying to transfer a 1534 byte packet (1524 'size' + 10 'virtio_net_hdr')
> >>>> and the guest only had 1524 bytes of space in its input descriptors.
> >>>>     
> >>>>         
> >>> Okay, that sounds like a bug in Dustin's version of the guest virtio-net
> >>> driver - if it is only supplying 1524 byte buffers, it should not be
> >>> saying it supports the VIRTIO_NET_F_GUEST_TSO4 feature
> >>>   
> >>>       
> >> See:
> >>
> >> commit 8eca6b1bc770982595db2f7207c65051572436cb
> >> Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
> >> Date:   Sun Apr 5 17:40:08 2009 +0000
> >>
> >>     Fix oops on 2.6.25 guest (Rusty Russell)
> >>    
> >>     I believe this is behind the following:
> >>     https://bugs.edge.launchpad.net/ubuntu/jaunty/+source/linux/+bug/331128
> >>    
> >>     virtio_pci in 2.6.25 didn't do feature negotiation correctly: it acked every
> >>     bit.  Fortunately, we can detect this.
> >>    
> >>     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
> >>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> >>
> >> It looks like Rusty's fix wasn't enough.  If I change virtio-net to only 
> >> advertise F_MAC, we don't run into this problem.
> >>     
> >
> > If it's not acking VBAD_FEATURE, then it doesn't sound like the same
> > issue
> >   
> 
> It was acking VBAD_FEATURE when I tested it.
> 
> But if you look at the patch, it whitelists the following features:
> 
>     features |= (1 << VIRTIO_NET_F_MAC);
>     features |= (1 << VIRTIO_NET_F_GUEST_CSUM);
>     features |= (1 << VIRTIO_NET_F_GUEST_TSO4);
>     features |= (1 << VIRTIO_NET_F_GUEST_TSO6);
>     features |= (1 << VIRTIO_NET_F_GUEST_ECN);

Ah, it all makes sense now.

I was getting confused between HOST_* and GUEST_*

this should have been:

    features |= (1 << VIRTIO_NET_F_MAC);
    features |= (1 << VIRTIO_NET_F_HOST_CSUM);
    features |= (1 << VIRTIO_NET_F_HOST_TSO4);
    features |= (1 << VIRTIO_NET_F_HOST_TSO6);
    features |= (1 << VIRTIO_NET_F_HOST_ECN);

Could you try that Dustin?

> Which is why it's ack'ing TSO4.   Removing TSO4 didn't seem to fix it 
> for me.

Odd.

Cheers,
Mark.
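
To make the HOST_*/GUEST_* distinction concrete, here is a small sketch of the two masks. GUEST_* bits say what the guest is prepared to receive; HOST_* bits say what the host is prepared to receive from the guest. The bit numbers are the virtio-net feature numbers. Note that the host-side checksum bit is named VIRTIO_NET_F_CSUM (bit 0) in the virtio headers; "HOST_CSUM" above is shorthand. The two function names are hypothetical.

```c
#include <stdint.h>

/* virtio-net feature bit numbers. */
enum {
    VIRTIO_NET_F_CSUM       = 0,   /* host handles partial-checksum packets */
    VIRTIO_NET_F_GUEST_CSUM = 1,   /* guest handles partial-checksum packets */
    VIRTIO_NET_F_MAC        = 5,
    VIRTIO_NET_F_GUEST_TSO4 = 7,   /* guest can receive TSOv4 */
    VIRTIO_NET_F_GUEST_TSO6 = 8,
    VIRTIO_NET_F_GUEST_ECN  = 9,
    VIRTIO_NET_F_HOST_TSO4  = 11,  /* host can receive TSOv4 */
    VIRTIO_NET_F_HOST_TSO6  = 12,
    VIRTIO_NET_F_HOST_ECN   = 13
};

/* The whitelist as it was written: claims the guest can receive GSO. */
uint32_t guest_side_mask(void)
{
    return (1u << VIRTIO_NET_F_MAC)
         | (1u << VIRTIO_NET_F_GUEST_CSUM)
         | (1u << VIRTIO_NET_F_GUEST_TSO4)
         | (1u << VIRTIO_NET_F_GUEST_TSO6)
         | (1u << VIRTIO_NET_F_GUEST_ECN);
}

/* The whitelist as suggested: only claims the guest may send GSO. */
uint32_t host_side_mask(void)
{
    return (1u << VIRTIO_NET_F_MAC)
         | (1u << VIRTIO_NET_F_CSUM)
         | (1u << VIRTIO_NET_F_HOST_TSO4)
         | (1u << VIRTIO_NET_F_HOST_TSO6)
         | (1u << VIRTIO_NET_F_HOST_ECN);
}
```

The two masks share only the MAC bit, which is why acking the GUEST_* set lets the host deliver packets that a driver with MTU-sized receive buffers cannot take.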
Mark McLoughlin Oct. 29, 2009, 2:50 p.m. UTC | #10
On Thu, 2009-10-29 at 09:46 -0500, Dustin Kirkland wrote:
> On Thu, 2009-10-29 at 09:34 -0500, Dustin Kirkland wrote:
> > In the mean time, Hardy's kernel is in git here:
> > 
> > http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=summary
> 
> I'll save you a few clicks...
> 
> http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-hardy.git;a=blob;f=drivers/net/virtio_net.c;h=d1a200ff5fd266c05e9a876e5e4e550737f77d84;hb=HEAD

Actually, what would save more clicks is if:

  git://kernel.ubuntu.com/ubuntu/ubuntu-hardy.git

was listed in the web interface. Took me a while to find:

  https://wiki.ubuntu.com/KernelTeam/KernelGitGuide

:-)

Thanks,
Mark.
Scott Tsai Oct. 29, 2009, 11:22 p.m. UTC | #11
Hi, Dustin,
What's the easiest way to see the patches to qemu that Canonical
carries for the different Ubuntu releases?
(I think http://patches.ubuntu.com/ only diffs against Debian for the
last stable Ubuntu release?)

Also, is there a way for an outside developer to get email
notifications when a patch is added to Ubuntu's qemu package?

Patch

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ce8e6cb..2e6725b 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -502,6 +502,8 @@  static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
     return 0;
 }
 
+int buffer_fits_in_virtqueue_top(VirtQueue *vq, int size);
+
 static ssize_t virtio_net_receive2(VLANClientState *vc, const uint8_t *buf, size_t size, int raw)
 {
     VirtIONet *n = vc->opaque;
@@ -518,6 +520,10 @@  static ssize_t virtio_net_receive2(VLANClientState *vc, const uint8_t *buf, size
     hdr_len = n->mergeable_rx_bufs ?
         sizeof(struct virtio_net_hdr_mrg_rxbuf) : sizeof(struct virtio_net_hdr);
 
+    /* drop packet instead of truncating it */
+    if (!n->mergeable_rx_bufs && !buffer_fits_in_virtqueue_top(n->rx_vq, hdr_len + size))
+        return size;
+
     offset = i = 0;
 
     while (offset < size) {
@@ -531,7 +537,7 @@  static ssize_t virtio_net_receive2(VLANClientState *vc, const uint8_t *buf, size
             virtqueue_pop(n->rx_vq, &elem) == 0) {
             if (i == 0)
                 return -1;
-            fprintf(stderr, "virtio-net truncating packet\n");
+            fprintf(stderr, "virtio-net truncating packet: mergable_rx_bufs: %d\n", n->mergeable_rx_bufs);
             exit(1);
         }
 
diff --git a/hw/virtio.c b/hw/virtio.c
index 41e7ca2..d9e0353 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -356,6 +356,39 @@  int virtqueue_avail_bytes(VirtQueue *vq, int in_bytes, int out_bytes)
     return 0;
 }
 
+/* buffer_fits_in_virtqueue_top: returns true if a 'size' byte buffer could fit in the
+ * input descriptors that virtqueue_pop() would have returned
+ */
+int buffer_fits_in_virtqueue_top(VirtQueue *vq, int size);
+
+int buffer_fits_in_virtqueue_top(VirtQueue *vq, int size)
+{
+    unsigned int i, max;
+    int input_iov_len_sum;
+    target_phys_addr_t desc_pa;
+
+    if (!virtqueue_num_heads(vq, vq->last_avail_idx))
+        return 0;
+
+    desc_pa = vq->vring.desc;
+    max = vq->vring.num;
+    i = virtqueue_get_head(vq, vq->last_avail_idx);
+
+    if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_INDIRECT) {
+        /* loop over the indirect descriptor table */
+        max = vring_desc_len(desc_pa, i) / sizeof(VRingDesc);
+        desc_pa = vring_desc_addr(desc_pa, i);
+        i = 0;
+    }
+
+    input_iov_len_sum = 0;
+    do {
+        if (vring_desc_flags(desc_pa, i) & VRING_DESC_F_WRITE)
+            input_iov_len_sum += vring_desc_len(desc_pa, i);
+    } while ((i = virtqueue_next_desc(desc_pa, i, max)) != max);
+    return input_iov_len_sum >= size;
+}
+
 int virtqueue_pop(VirtQueue *vq, VirtQueueElement *elem)
 {
     unsigned int i, head, max;