From patchwork Thu Sep 7 04:26:06 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 810845 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="VFZGLSHi"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3xnnTB6QVMz9sNd for ; Thu, 7 Sep 2017 14:26:22 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753394AbdIGE0U (ORCPT ); Thu, 7 Sep 2017 00:26:20 -0400 Received: from mail-pf0-f195.google.com ([209.85.192.195]:33807 "EHLO mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750742AbdIGE0R (ORCPT ); Thu, 7 Sep 2017 00:26:17 -0400 Received: by mail-pf0-f195.google.com with SMTP id y68so3991095pfd.1 for ; Wed, 06 Sep 2017 21:26:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=OKEGvU6iY0SuCEaLL2h0fJnXxmty39ROSd+lHsn61qk=; b=VFZGLSHiAQlGBH7OY5BNIibGahk2I+rAgPqsZDoKMv52vTMNLlc86l75UB6yS+V+71 99ioVQbXxQhl5WAuaoaSANQ5GKOhjXgwRxURjMNcAmD8XZYcu49d6nGpxfd4uC6EX9Mf gT6HRgb0VpR7fLg1OWzyclPjLR5oMrXlYc8nEC/y/ESbfVFSleZbqOHz6EoqXIiFIPSe sDrvzNejaFLxKsIhMiIUUUg8i45ikwXb/Mu/2U2oKhos1XF0wJpvvd1QquJo3wvHrlEz Snp2jy4E41a4pJi9N+h9MNfVX3LByF9WRHyAQnLcOojeWY1F+ebC6T9347YTZ6M4vFTW +tIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=OKEGvU6iY0SuCEaLL2h0fJnXxmty39ROSd+lHsn61qk=; b=HMHeju3/hgba7eNuFVFEefUE/teN35VTUjuz2LH1F82XJAHS7y5t8jmNINshJC2tCI B0ciTQ1m6LFpeWm+LfFNGpckShIaHGZT3zg5MGTYauHIQehgDS3FrpSGsrSubw8O+SPL lyDwiih/AWIkbuxwy6BAcSiJL1W3SrWSwsNMnfdzCpWeLjePVnOSwPDkBl/P9q4BQit7 YI11Em67YRi4+2nWOu2DOIiM37dHdU3YmRqCzvfxEaBB2itgj3rb2Mvjq687Rpr99wqw mhP28cZxF6zSNTEnCJxHe2089mtynBrWsEM7+68nGAPtbu8k2LRJHlMqCY3wK5gbj8eu w+Hg== X-Gm-Message-State: AHPjjUg3EBVZrKnuDH0shh7ANm3HWZNb/xbE9SP8USefzZg/eEUFUJm9 qfit4GRv5aSWNSFbM1g= X-Google-Smtp-Source: ADKCNb5z5pGuTjq8aVmmh6HS11C2FfH/zbusx4uRecAwZIjjOtnr7cYHWSq8U6TkRub4usG0S5dUsg== X-Received: by 10.84.234.196 with SMTP id i4mr1591187plt.208.1504758376785; Wed, 06 Sep 2017 21:26:16 -0700 (PDT) Received: from tw-172-25-30-113.office.twttr.net ([8.25.197.25]) by smtp.gmail.com with ESMTPSA id i2sm1759997pfd.21.2017.09.06.21.26.15 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 06 Sep 2017 21:26:16 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: jakub.kicinski@netronome.com, Cong Wang , Jiri Pirko , Eric Dumazet Subject: [Patch net v2 1/2] net_sched: get rid of tcfa_rcu Date: Wed, 6 Sep 2017 21:26:06 -0700 Message-Id: <20170907042607.24413-1-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.9.4 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org gen estimator has been rewritten in commit 1c0d32fde5bd ("net_sched: gen_estimator: complete rewrite of rate estimators"), the caller is no longer needed to wait for a grace period. So this patch gets rid of it. This also completely closes a race condition between action free path and filter chain add/remove path for the following patch. Because otherwise the nested RCU callback can't be caught by rcu_barrier(). Cc: Jiri Pirko Cc: Eric Dumazet Signed-off-by: Cong Wang --- include/net/act_api.h | 2 -- net/sched/act_api.c | 12 +++--------- 2 files changed, 3 insertions(+), 11 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 26ffd8333f50..68218a5f8e72 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -38,7 +38,6 @@ struct tc_action { struct gnet_stats_queue tcfa_qstats; struct net_rate_estimator __rcu *tcfa_rate_est; spinlock_t tcfa_lock; - struct rcu_head tcfa_rcu; struct gnet_stats_basic_cpu __percpu *cpu_bstats; struct gnet_stats_queue __percpu *cpu_qstats; struct tc_cookie *act_cookie; @@ -55,7 +54,6 @@ struct tc_action { #define tcf_qstats common.tcfa_qstats #define tcf_rate_est common.tcfa_rate_est #define tcf_lock common.tcfa_lock -#define tcf_rcu common.tcfa_rcu static inline unsigned int tcf_hash(u32 index, unsigned int hmask) { diff --git a/net/sched/act_api.c b/net/sched/act_api.c index f2e9ed34a963..fc17d286a6a2 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -53,10 +53,8 @@ static void tcf_action_goto_chain_exec(const struct tc_action *a, res->goto_tp = rcu_dereference_bh(chain->filter_chain); } -static void free_tcf(struct rcu_head *head) +static void free_tcf(struct tc_action *p) { - struct tc_action *p = container_of(head, struct tc_action, tcfa_rcu); - free_percpu(p->cpu_bstats); free_percpu(p->cpu_qstats); @@ -76,11 +74,7 @@ static void tcf_hash_destroy(struct tcf_hashinfo *hinfo, struct tc_action *p) hlist_del(&p->tcfa_head); spin_unlock_bh(&hinfo->lock); gen_kill_estimator(&p->tcfa_rate_est); - /* - * gen_estimator est_timer() might access p->tcfa_lock - * or bstats, wait a RCU grace period before freeing p - */ - call_rcu(&p->tcfa_rcu, free_tcf); + free_tcf(p); } int __tcf_hash_release(struct tc_action *p, bool bind, bool strict) @@ -271,7 +265,7 @@ void tcf_hash_cleanup(struct tc_action *a, struct nlattr *est) { if (est) gen_kill_estimator(&a->tcfa_rate_est); - call_rcu(&a->tcfa_rcu, free_tcf); + free_tcf(a); } EXPORT_SYMBOL(tcf_hash_cleanup); From patchwork Thu Sep 7 04:26:07 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cong Wang X-Patchwork-Id: 810846 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="aEdhOjgO"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 3xnnTJ2fw2z9sNd for ; Thu, 7 Sep 2017 14:26:28 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753447AbdIGE01 (ORCPT ); Thu, 7 Sep 2017 00:26:27 -0400 Received: from mail-pf0-f194.google.com ([209.85.192.194]:38719 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753374AbdIGE0S (ORCPT ); Thu, 7 Sep 2017 00:26:18 -0400 Received: by mail-pf0-f194.google.com with SMTP id q76so3986514pfq.5 for ; Wed, 06 Sep 2017 21:26:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=QP2ih+6R2zUU3QnYXeel7vXIeTgLdhbU/uqmelCoUag=; b=aEdhOjgOQv+XaymnOaES961LQcPjbCEvDrI/pe8Yj48jzluplhbaUs9ZsJwVrXVq3w hLM6hp6Iwc0Ae5VC5ET95qRUCvuI80DYKolzo6ZbH5R7ilnn+01Cm7T5Fdjs8yHrT9uO OslU69PkNEiPvPnp2PWR0BauCb5JBIi5wnkp9S98erm0aBmomR2CVCyxiEm4uw3qGQKV 0gJpm5lhZPGrKIfau08srencn49wqVWTYCqmIoXa3X0HrmpuAsHsU3rfToN7dqkPYJPJ Ip5L4OXa8KDN+PoOz+4Wn/tk3ZASEfvHbQ+DbcDLI+26TqkAUlX3oB/UBM+8pXNIILvB NobA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=QP2ih+6R2zUU3QnYXeel7vXIeTgLdhbU/uqmelCoUag=; b=G+AN0i/V9PDuhA6waY9YFwlqIVsgeJas3EGkmjq45EwEyfCO+ZfyVmDtNaiuAo+qNz ADrFO/3H3wUt+b7PvBu78d7FkLB8ZyYYE8Q55s7HbMiS7Hb1Bu1KpV+4W2Gpx1UQvjnb e7yQNlJPEeYmz6UcuzpVWCjHMGoOdaSgb5Xdr/asTV1IAJLx6NCzXWIBmUaiTK7M4nHO LaqKiyunE3nesV2E+UmAy8b94tdUiK0FMM2LfpmKujAt+TqbbookD/fHGk2vMx2+U9ji D2aBu2wAasbXWnnuDqNhmdlnyNEjU9mnUw+EMdsKOiWJ4PvspE/FfNeDqFpdKYX+u9KV s1xg== X-Gm-Message-State: AHPjjUgK9G7yYYYWOUb6KjzZrU3T2SsFYu31+OatXTWNhV9Tr5U3eGQ3 eb5nVAcHhcTXfUEkHgs= X-Google-Smtp-Source: ADKCNb5f0aqpHkgAO5RUrvV17ZKYec2Wguv3bAD4LVXcNnNM1Mr10YF2zIkdCn7n07j0DPq4aeA2CQ== X-Received: by 10.84.129.226 with SMTP id b89mr1571237plb.84.1504758377895; Wed, 06 Sep 2017 21:26:17 -0700 (PDT) Received: from tw-172-25-30-113.office.twttr.net ([8.25.197.25]) by smtp.gmail.com with ESMTPSA id i2sm1759997pfd.21.2017.09.06.21.26.16 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 06 Sep 2017 21:26:17 -0700 (PDT) From: Cong Wang To: netdev@vger.kernel.org Cc: jakub.kicinski@netronome.com, Cong Wang , Jiri Pirko Subject: [Patch net v2 2/2] net_sched: fix all the madness of tc filter chain Date: Wed, 6 Sep 2017 21:26:07 -0700 Message-Id: <20170907042607.24413-2-xiyou.wangcong@gmail.com> X-Mailer: git-send-email 2.9.4 In-Reply-To: <20170907042607.24413-1-xiyou.wangcong@gmail.com> References: <20170907042607.24413-1-xiyou.wangcong@gmail.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch fixes the following madness of tc filter chain: 1) tcf_chain_destroy() is called by both tcf_block_put() and tcf_chain_put(). tcf_chain_put() is correctly refcnt'ed and paired with tcf_chain_get(), but tcf_block_put() is not, it should be paired with tcf_block_get() which means we still need to decrease the refcnt. Think it in another way: if we call tcf_bock_put() immediately after tcf_block_get(), could we get effectively a nop? This causes a memory leak as reported by Jakub. 2) tp proto should hold a refcnt to the chain too. This significantly simplifies the logic: 2a) Chain 0 is no longer special, it is created and refcnted by tp like any other chains. All the ugliness in tcf_chain_put() can be gone! 2b) No need to handle the flushing oddly, because block still holds chain 0, it can not be released, this guarantees block is the last user. 2c) The race condition with RCU callbacks is easier to handle with just a rcu_barrier()! Much easier to understand, nothing to hide! Thanks to the previous patch. Please see also the comments in code. 2d) Make the code understandable by humans, much less error-prone. Fixes: 744a4cf63e52 ("net: sched: fix use after free when tcf_chain_destroy is called multiple times") Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters") Reported-by: Jakub Kicinski Cc: Jiri Pirko Signed-off-by: Cong Wang --- net/sched/cls_api.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 6c5ea84d2682..e9060dc36519 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -209,21 +209,20 @@ static void tcf_chain_flush(struct tcf_chain *chain) RCU_INIT_POINTER(*chain->p_filter_chain, NULL); while ((tp = rtnl_dereference(chain->filter_chain)) != NULL) { RCU_INIT_POINTER(chain->filter_chain, tp->next); + tcf_chain_put(chain); tcf_proto_destroy(tp); } } static void tcf_chain_destroy(struct tcf_chain *chain) { - /* May be already removed from the list by the previous call. */ - if (!list_empty(&chain->list)) - list_del_init(&chain->list); + list_del(&chain->list); + kfree(chain); +} - /* There might still be a reference held when we got here from - * tcf_block_put. Wait for the user to drop reference before free. - */ - if (!chain->refcnt) - kfree(chain); +static void tcf_chain_hold(struct tcf_chain *chain) +{ + ++chain->refcnt; } struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index, @@ -233,7 +232,7 @@ struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index, list_for_each_entry(chain, &block->chain_list, list) { if (chain->index == chain_index) { - chain->refcnt++; + tcf_chain_hold(chain); return chain; } } @@ -246,10 +245,7 @@ EXPORT_SYMBOL(tcf_chain_get); void tcf_chain_put(struct tcf_chain *chain) { - /* Destroy unused chain, with exception of chain 0, which is the - * default one and has to be always present. - */ - if (--chain->refcnt == 0 && !chain->filter_chain && chain->index != 0) + if (--chain->refcnt == 0) tcf_chain_destroy(chain); } EXPORT_SYMBOL(tcf_chain_put); @@ -294,10 +290,18 @@ void tcf_block_put(struct tcf_block *block) if (!block) return; - list_for_each_entry_safe(chain, tmp, &block->chain_list, list) { + /* Standalone actions are not allowed to jump to any chain, and + * bound actions should be all removed after flushing. However, + * filters are destroyed in RCU callbacks, we have to flush and wait + * for them before releasing this refcnt, otherwise we race with RCU + * callbacks!!! + */ + list_for_each_entry(chain, &block->chain_list, list) tcf_chain_flush(chain); - tcf_chain_destroy(chain); - } + rcu_barrier(); + + list_for_each_entry_safe(chain, tmp, &block->chain_list, list) + tcf_chain_put(chain); kfree(block); } EXPORT_SYMBOL(tcf_block_put); @@ -375,6 +379,7 @@ static void tcf_chain_tp_insert(struct tcf_chain *chain, rcu_assign_pointer(*chain->p_filter_chain, tp); RCU_INIT_POINTER(tp->next, tcf_chain_tp_prev(chain_info)); rcu_assign_pointer(*chain_info->pprev, tp); + tcf_chain_hold(chain); } static void tcf_chain_tp_remove(struct tcf_chain *chain, @@ -386,6 +391,7 @@ static void tcf_chain_tp_remove(struct tcf_chain *chain, if (chain->p_filter_chain && tp == chain->filter_chain) RCU_INIT_POINTER(*chain->p_filter_chain, next); RCU_INIT_POINTER(*chain_info->pprev, next); + tcf_chain_put(chain); } static struct tcf_proto *tcf_chain_tp_find(struct tcf_chain *chain,