Message ID | 1275931108.2545.168.camel@edumazet-laptop |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, Jun 8, 2010 at 1:18 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > [PATCH net-2.6] pkt_sched: gen_estimator: add a new lock > > gen_kill_estimator() / gen_new_estimator() is not always called with > RTNL held. > > net/netfilter/xt_RATEEST.c is one user of these API that do not hold > RTNL, so random corruptions can occur between "tc" and "iptables" > > Add a new fine grained lock instead of trying to use RTNL in xt_RATEEST > Why not use RTNL in xt_RATEEST? It seems xt_RATEEST misuse the APIs. and I think gen_replace_estimator is expected to be an atomic operation. And gen_estimator_active() is also assumed to be called with RTNL locked.
Le mardi 08 juin 2010 à 09:00 +0800, Changli Gao a écrit : > On Tue, Jun 8, 2010 at 1:18 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > > > [PATCH net-2.6] pkt_sched: gen_estimator: add a new lock > > > > gen_kill_estimator() / gen_new_estimator() is not always called with > > RTNL held. > > > > net/netfilter/xt_RATEEST.c is one user of these API that do not hold > > RTNL, so random corruptions can occur between "tc" and "iptables" > > > > Add a new fine grained lock instead of trying to use RTNL in xt_RATEEST > > > > Why not use RTNL in xt_RATEEST? It seems xt_RATEEST misuse the APIs. > > and I think gen_replace_estimator is expected to be an atomic operation. > > And gen_estimator_active() is also assumed to be called with RTNL locked. > Thank you for asking this question. Because I want to kill RTNL when possible, I dont even want to try adding RTNL to xt_RATEEST and solve all lock dependencies it might raise. RTNL = Big and Horrible Network LOCK You never got blocked because of this RTNL thing, dont you ? I did. And it sucks, because when you hit this, you are in a hurry and locating the bottleneck takes lot of time. RTNL is the thing we must hold during device register / unregister. Its locked for long delays because of all synchronize_rcu() that must be done, and that is already a big problem on some setups. Every time someone adds a RTNL requirement, you can be sure another guy will zap it during following ten years. Let's do this right now, not later. For an example of horrible rtnl behavior, take a look at following construct : if (!rtnl_trylock()) return restart_syscall(); I saw hundred of udev looping, trying to get rtnl to dump some information. (Patrick added a rtnl requirement to all dump operations, and it sucks) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jun 8, 2010 at 12:30 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > Thank you for asking this question. > > Because I want to kill RTNL when possible, I dont even want to try > adding RTNL to xt_RATEEST and solve all lock dependencies it might > raise. > > RTNL = Big and Horrible Network LOCK > It is much like the BKL, and should be killed too. Thanks. :)
Le mardi 08 juin 2010 à 09:00 +0800, Changli Gao a écrit : > and I think gen_replace_estimator is expected to be an atomic operation. > > And gen_estimator_active() is also assumed to be called with RTNL locked. > My patch fixes a bug of new/kill operators, regardless of RTNL being held or not. Its should be small enough to be included in linux-2.6.35. If what you say is right, all gen_replace_estimator() / gen_estimator_active() callers should still holds RTNL. I didnt change this part. If you believe one caller doesnt hold RTNL, please submit another patch. Then, in net-next-2.6, we can probably cleanup this to remove RTNL requirement if possible for gen_replace_estimator() / gen_estimator_active() Yes, it sounds a bit difficult (three patches instead of a single one), but this is the how things should be done, step by step. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jun 8, 2010 at 12:58 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > Le mardi 08 juin 2010 à 09:00 +0800, Changli Gao a écrit : > >> and I think gen_replace_estimator is expected to be an atomic operation. >> >> And gen_estimator_active() is also assumed to be called with RTNL locked. >> > > My patch fixes a bug of new/kill operators, regardless of RTNL being > held or not. Its should be small enough to be included in linux-2.6.35. > > If what you say is right, all gen_replace_estimator() / > gen_estimator_active() callers should still holds RTNL. > I didnt change this part. > If you believe one caller doesnt hold RTNL, please submit another patch. > > Then, in net-next-2.6, we can probably cleanup this to remove RTNL > requirement if possible for gen_replace_estimator() / > gen_estimator_active() > > Yes, it sounds a bit difficult (three patches instead of a single one), > but this is the how things should be done, step by step. > IMO, this bug should be fixed by adding rtnl_lock to xt_RATEEST.c. Killing rtnl should be done in separated patches. They are different things. Your patch introduces another locks, and it is extra overhead for other users.
Le mardi 08 juin 2010 à 13:20 +0800, Changli Gao a écrit : > IMO, this bug should be fixed by adding rtnl_lock to xt_RATEEST.c. > Killing rtnl should be done in separated patches. They are different > things. Your patch introduces another locks, and it is extra overhead > for other users. > extra overhead, in new/kill estimators ? Are you kidding ? RTNL is taken, taking an extra-uncontended spinlock is free. Nope, I wont add rtnl lock to xt_RATEEST.c I believe you dont really understood what I patiently explained to you. Thats becoming rediculous. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c index cf8e703..3d11203 100644 --- a/net/core/gen_estimator.c +++ b/net/core/gen_estimator.c @@ -107,6 +107,7 @@ static DEFINE_RWLOCK(est_lock); /* Protects against soft lockup during large deletion */ static struct rb_root est_root = RB_ROOT; +static DEFINE_SPINLOCK(est_tree_lock); static void est_timer(unsigned long arg) { @@ -201,7 +202,6 @@ struct gen_estimator *gen_find_node(const struct gnet_stats_basic_packed *bstats * * Returns 0 on success or a negative error code. * - * NOTE: Called under rtnl_mutex */ int gen_new_estimator(struct gnet_stats_basic_packed *bstats, struct gnet_stats_rate_est *rate_est, @@ -222,6 +222,7 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats, if (est == NULL) return -ENOBUFS; + spin_lock(&est_tree_lock); idx = parm->interval + 2; est->bstats = bstats; est->rate_est = rate_est; @@ -242,6 +243,7 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats, list_add_rcu(&est->list, &elist[idx].list); gen_add_node(est); + spin_unlock(&est_tree_lock); return 0; } @@ -261,13 +263,13 @@ static void __gen_kill_estimator(struct rcu_head *head) * * Removes the rate estimator specified by &bstats and &rate_est. * - * NOTE: Called under rtnl_mutex */ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats, struct gnet_stats_rate_est *rate_est) { struct gen_estimator *e; + spin_lock(&est_tree_lock); while ((e = gen_find_node(bstats, rate_est))) { rb_erase(&e->node, &est_root); @@ -278,6 +280,7 @@ void gen_kill_estimator(struct gnet_stats_basic_packed *bstats, list_del_rcu(&e->list); call_rcu(&e->e_rcu, __gen_kill_estimator); } + spin_unlock(&est_tree_lock); } EXPORT_SYMBOL(gen_kill_estimator);