From patchwork Tue Dec 13 03:43:00 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jason Wang
X-Patchwork-Id: 705283
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 3td5C52vKMz9t2T
 for ; Tue, 13 Dec 2016 14:43:13 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S932338AbcLMDnH (ORCPT ); Mon, 12 Dec 2016 22:43:07 -0500
Received: from mx1.redhat.com ([209.132.183.28]:41214 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S932176AbcLMDnG (ORCPT ); Mon, 12 Dec 2016 22:43:06 -0500
Received: from int-mx10.intmail.prod.int.phx2.redhat.com
 (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.redhat.com (Postfix) with ESMTPS id 7E6B5C04D2BE;
 Tue, 13 Dec 2016 03:43:05 +0000 (UTC)
Received: from [10.72.5.153] (vpn1-5-153.pek2.redhat.com [10.72.5.153])
 by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP
 id uBD3h1Ce032114
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Mon, 12 Dec 2016 22:43:03 -0500
Subject: Re: "virtio-net: enable multiqueue by default" in linux-next breaks
 networking on GCE
To: "Theodore Ts'o" , "Michael S.
 Tsirkin"
References: <20161212233343.q5xlv55rc5npqaqp@thunk.org>
 <20161213042057-mutt-send-email-mst@kernel.org>
 <20161213031243.avq5g5m5r5ylcnnk@thunk.org>
Cc: netdev@vger.kernel.org, nhorman@tuxdriver.com, davem@davemloft.net
From: Jason Wang
Message-ID: <60cd312f-86f9-47e9-0c72-f4c2109e2f87@redhat.com>
Date: Tue, 13 Dec 2016 11:43:00 +0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <20161213031243.avq5g5m5r5ylcnnk@thunk.org>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.23
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.5.110.31]); Tue, 13 Dec 2016 03:43:05 +0000 (UTC)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

On 2016-12-13 11:12, Theodore Ts'o wrote:
> On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
>> That's unfortunate, of course. It could be a hypervisor or
>> a guest kernel bug. ideas:
>> - does host have mq capability? how many queues?
>> - how about # of msix vectors?
>> - after you send something on tx queues,
>>   are interrupts arriving on rx queues?
>> - is problem rx or tx?
>>   set ip and arp manually and send a packet to known MAC,
>>   does it get there?
> Sorry, I don't know how to debug virtio-net. Given that it's in a
> cloud environment, I also can't set ip addresses manually, since ip
> addresses are set manually.
>
> If you can send me a patch, I'm happy to apply it and send you back
> results.
>
> I can say that I've had _zero_ problems using pretty much any kernel
> from 3.10 to 4.9 using Google Compute Engine. The commit I referenced
> caused things to stop working. So in terms of regression, this is
> definitely a regression, and it's definitely caused by commit
> 449000102901. Even if it is a hypervisor "bug", I'm pretty sure I
> know what Linus will say if I ask him to revert it.
> Linux kernels are
> expected to work around hardware bugs, and breaking users just because
> hardware is "broken" by some definition is generally not considered
> friendly, especially when has been working for years and years before
> some commit "fixed" things.
>
> I would very much like to work with you to fix it, but I will need
> your help, since virtio-net doesn't seem to print any informational
> during the boot sequence, and I don't know how the best way to debug
> it.
>
> Cheers,
>
> - Ted

Thanks for reporting this issue. Looks like I blindly set the affinity
instead of queues during probe. Could you please try the following patch
to see if it works?

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b425fa1..fe9f772 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1930,7 +1930,9 @@ static int virtnet_probe(struct virtio_device *vdev)
 		goto free_unregister_netdev;
 	}
 
-	virtnet_set_affinity(vi);
+	rtnl_lock();
+	virtnet_set_queues(vi, vi->curr_queue_pairs);
+	rtnl_unlock();
 
 	/* Assume link up if device can't report link status,
 	   otherwise get link status from config. */