
"virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE

Message ID 60cd312f-86f9-47e9-0c72-f4c2109e2f87@redhat.com
State RFC, archived
Delegated to: David Miller

Commit Message

Jason Wang Dec. 13, 2016, 3:43 a.m. UTC
On 2016-12-13 11:12, Theodore Ts'o wrote:
> On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
>> That's unfortunate, of course. It could be a hypervisor or
>> a guest kernel bug. Ideas:
>> - does host have mq capability? how many queues?
>> - how about # of msix vectors?
>> - after you send something on tx queues,
>>    are interrupts arriving on rx queues?
>> - is problem rx or tx?
>>    set ip and arp manually and send a packet to known MAC,
>>    does it get there?
> Sorry, I don't know how to debug virtio-net.  Given that it's in a
> cloud environment, I also can't set ip addresses manually, since ip
> addresses are assigned automatically.
>
> If you can send me a patch, I'm happy to apply it and send you back
> results.
>
> I can say that I've had _zero_ problems using pretty much any kernel
> from 3.10 to 4.9 using Google Compute Engine.  The commit I referenced
> caused things to stop working.  So in terms of regression, this is
> definitely a regression, and it's definitely caused by commit
> 449000102901.  Even if it is a hypervisor "bug", I'm pretty sure I
> know what Linus will say if I ask him to revert it.  Linux kernels are
> expected to work around hardware bugs, and breaking users just because
> hardware is "broken" by some definition is generally not considered
> friendly, especially when it has been working for years and years before
> some commit "fixed" things.
>
> I would very much like to work with you to fix it, but I will need
> your help, since virtio-net doesn't seem to print any informational
> messages during the boot sequence, and I don't know the best way to
> debug it.
>
> Cheers,
>
> 						- Ted

Thanks for reporting this issue. Looks like I blindly set the affinity 
instead of queues during probe. Could you please try the following patch 
to see if it works?

Comments

Theodore Ts'o Dec. 13, 2016, 4:19 a.m. UTC | #1
On Tue, Dec 13, 2016 at 11:43:00AM +0800, Jason Wang wrote:
> Thanks for reporting this issue. Looks like I blindly set the affinity
> instead of queues during probe. Could you please try the following patch to
> see if it works?

This fixed things, thanks!!

						- Ted
						


Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b425fa1..fe9f772 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1930,7 +1930,9 @@  static int virtnet_probe(struct virtio_device *vdev)
                 goto free_unregister_netdev;
         }

-       virtnet_set_affinity(vi);
+       rtnl_lock();
+       virtnet_set_queues(vi, vi->curr_queue_pairs);
+       rtnl_unlock();

         /* Assume link up if device can't report link status,
            otherwise get link status from config. */
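
Why setting only the affinity is not enough can be illustrated with a toy
model. This is a sketch, not kernel code: every name in it (toy_device,
toy_driver, toy_set_queues and so on) is hypothetical and made up for this
illustration. It assumes, in line with the virtio specification, that a
multiqueue device services only the first queue pair until the driver
announces how many pairs it will use (in the real driver this is the
VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET control command sent by
virtnet_set_queues()).

```c
#include <stdbool.h>

/* Toy model, not kernel code: every name here is hypothetical. */
struct toy_device {
    int max_pairs;     /* queue pairs the device offers */
    int active_pairs;  /* queue pairs the device actually services */
};

struct toy_driver {
    int curr_queue_pairs; /* queue pairs the driver spreads traffic over */
};

/* A multiqueue device starts out servicing only the first pair. */
static void toy_device_init(struct toy_device *dev, int max_pairs)
{
    dev->max_pairs = max_pairs;
    dev->active_pairs = 1;
}

/* Stand-in for the control command that announces the pair count. */
static bool toy_set_queues(struct toy_device *dev, int pairs)
{
    if (pairs < 1 || pairs > dev->max_pairs)
        return false;
    dev->active_pairs = pairs;
    return true;
}

/* Send one packet per flow; return how many the device picks up. */
static int toy_transmit(const struct toy_device *dev,
                        const struct toy_driver *drv, int flows)
{
    int serviced = 0;

    for (int flow = 0; flow < flows; flow++) {
        /* The driver spreads flows over all pairs it believes are in use. */
        int queue = flow % drv->curr_queue_pairs;

        /* The device ignores queues it was never told about. */
        if (queue < dev->active_pairs)
            serviced++;
    }
    return serviced;
}

/* Probe the toy device, optionally skipping the announcement. */
static int toy_probe_and_send(int max_pairs, bool announce, int flows)
{
    struct toy_device dev;
    struct toy_driver drv = { .curr_queue_pairs = max_pairs };

    toy_device_init(&dev, max_pairs);
    if (announce)
        toy_set_queues(&dev, drv.curr_queue_pairs);
    return toy_transmit(&dev, &drv, flows);
}
```

With four queue pairs and eight flows, toy_probe_and_send(4, false, 8)
returns 2, because only the flows that land on the first pair are serviced,
while toy_probe_and_send(4, true, 8) returns 8. The model ignores interrupts,
the RX path and locking; it only shows the mismatch between the number of
queues the driver uses and the number the device has been told about.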