From patchwork Thu Dec 7 17:56:04 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Fastabend
X-Patchwork-Id: 845740
X-Patchwork-Delegate: davem@davemloft.net
Subject: [net-next PATCH 07/14] net: sched: drop qdisc_reset from dev_graft_qdisc
From: John Fastabend
To: willemdebruijn.kernel@gmail.com, daniel@iogearbox.net,
 eric.dumazet@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, jiri@resnulli.us, xiyou.wangcong@gmail.com
Date: Thu, 07 Dec 2017 09:56:04 -0800
Message-ID: <20171207175603.5771.63274.stgit@john-Precision-Tower-5810>
In-Reply-To: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
References: <20171207173500.5771.41198.stgit@john-Precision-Tower-5810>
User-Agent: StGit/0.17.1-dirty
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

In dev_graft_qdisc a "new" qdisc is attached and the 'qdisc_destroy'
operation is called on the old qdisc. The destroy operation will wait
an RCU grace period and call qdisc_rcu_free().
At that point gso_cpu_skb is freed along with all stats, so there is no
need to zero stats and gso_cpu_skb from the graft operation itself.

Further, after dropping the qdisc locks we cannot continue to call
qdisc_reset before waiting an RCU grace period so that the qdisc is
detached from all CPUs. By removing the qdisc_reset() here we get
the correct property of waiting an RCU grace period and letting the
qdisc_destroy operation clean up the qdisc correctly.

Note, a refcnt greater than 1 would cause the destroy operation to
be aborted; however, if this ever happened the reference to the qdisc
would be lost and we would have a memory leak.

Signed-off-by: John Fastabend
---
 net/sched/sch_generic.c |   28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index dfeabe3..482ba22 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -819,10 +819,6 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 	root_lock = qdisc_lock(oqdisc);
 	spin_lock_bh(root_lock);
 
-	/* Prune old scheduler */
-	if (oqdisc && refcount_read(&oqdisc->refcnt) <= 1)
-		qdisc_reset(oqdisc);
-
 	/* ... and graft new one */
 	if (qdisc == NULL)
 		qdisc = &noop_qdisc;
@@ -977,6 +973,16 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 	return false;
 }
 
+static void dev_qdisc_reset(struct net_device *dev,
+			    struct netdev_queue *dev_queue,
+			    void *none)
+{
+	struct Qdisc *qdisc = dev_queue->qdisc_sleeping;
+
+	if (qdisc)
+		qdisc_reset(qdisc);
+}
+
 /**
  * dev_deactivate_many - deactivate transmissions on several devices
  * @head: list of devices to deactivate
@@ -987,7 +993,6 @@ static bool some_qdisc_is_busy(struct net_device *dev)
 void dev_deactivate_many(struct list_head *head)
 {
 	struct net_device *dev;
-	bool sync_needed = false;
 
 	list_for_each_entry(dev, head, close_list) {
 		netdev_for_each_tx_queue(dev, dev_deactivate_queue,
@@ -997,20 +1002,25 @@ void dev_deactivate_many(struct list_head *head)
 					     &noop_qdisc);
 
 		dev_watchdog_down(dev);
-		sync_needed |= !dev->dismantle;
 	}
 
 	/* Wait for outstanding qdisc-less dev_queue_xmit calls.
 	 * This is avoided if all devices are in dismantle phase :
 	 * Caller will call synchronize_net() for us
 	 */
-	if (sync_needed)
-		synchronize_net();
+	synchronize_net();
 
 	/* Wait for outstanding qdisc_run calls. */
-	list_for_each_entry(dev, head, close_list)
+	list_for_each_entry(dev, head, close_list) {
 		while (some_qdisc_is_busy(dev))
 			yield();
+		/* The new qdisc is assigned at this point so we can safely
+		 * unwind stale skb lists and qdisc statistics
+		 */
+		netdev_for_each_tx_queue(dev, dev_qdisc_reset, NULL);
+		if (dev_ingress_queue(dev))
+			dev_qdisc_reset(dev, dev_ingress_queue(dev), NULL);
+	}
 }
 
 void dev_deactivate(struct net_device *dev)
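
For readers following the ordering argument, here is a minimal user-space
sketch of the sequence the patch moves the reset to: detach the qdisc first,
wait until nothing is still running it, and only then reset it. It is not
kernel code. Every toy_* name is hypothetical, the "running" flag only stands
in for what some_qdisc_is_busy() checks, and the RCU grace period itself is
not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct Qdisc: only what the sketch needs. */
struct toy_qdisc {
	int qlen;	/* packets still queued */
	bool running;	/* true while a simulated qdisc_run() is in flight */
};

/* Hypothetical stand-in for struct netdev_queue. */
struct toy_queue {
	struct toy_qdisc *qdisc;		/* what the transmit path sees */
	struct toy_qdisc *qdisc_sleeping;	/* the configured qdisc */
};

static struct toy_qdisc toy_noop;	/* plays the role of noop_qdisc */

/* Step 1: detach. New transmit attempts now see the no-op qdisc. */
static void toy_deactivate(struct toy_queue *q)
{
	q->qdisc = &toy_noop;
}

/* Step 2: report whether an earlier user is still inside the old qdisc. */
static bool toy_busy(const struct toy_queue *q)
{
	return q->qdisc_sleeping && q->qdisc_sleeping->running;
}

/* Step 3: reset only once idle. Returns true if the reset was performed. */
static bool toy_reset(struct toy_queue *q)
{
	if (toy_busy(q))
		return false;
	if (q->qdisc_sleeping)
		q->qdisc_sleeping->qlen = 0;
	return true;
}

/* Walks the three steps in order and returns the final queue length. */
int toy_demo(void)
{
	struct toy_qdisc old = { .qlen = 3, .running = true };
	struct toy_queue q = { .qdisc = &old, .qdisc_sleeping = &old };

	toy_deactivate(&q);		/* new lookups now see toy_noop */
	assert(q.qdisc == &toy_noop);
	assert(!toy_reset(&q));		/* a runner is still in flight */
	assert(old.qlen == 3);		/* nothing was touched under it */

	old.running = false;		/* the last in-flight runner finishes */
	assert(toy_reset(&q));		/* now the reset is safe */
	return old.qlen;
}
```

Calling toy_demo() from a test program exercises the whole sequence. The
early toy_reset() call is refused while the old qdisc is still marked
running, which is the situation the removed qdisc_reset() in
dev_graft_qdisc could not rule out.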