Message ID | 20180723072312.4153-4-jiri@resnulli.us |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | sched: introduce chain templates support with offloading to mlxsw | expand |
On Mon, Jul 23, 2018 at 12:25 AM Jiri Pirko <jiri@resnulli.us> wrote: > + switch (n->nlmsg_type) { > + case RTM_NEWCHAIN: > + /* In case the chain was successfully added, take a reference > + * to the chain. This ensures that an empty chain > + * does not disappear at the end of this function. > + */ > + tcf_chain_hold(chain); > + chain->explicitly_created = true; > + tc_chain_notify(chain, NULL, 0, NLM_F_CREATE | NLM_F_EXCL, > + RTM_NEWCHAIN, false); > + break; > + case RTM_DELCHAIN: > + /* Flush the chain first as the user requested chain removal. */ > + tcf_chain_flush(chain); > + /* In case the chain was successfully deleted, put a reference > + * to the chain previously taken during addition. > + */ > + tcf_chain_put_explicitly_created(chain); > + break; I don't see you send notification to user-space when deleting a chain, am I missing anything?
On Tue, Jul 24, 2018 at 3:30 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Mon, Jul 23, 2018 at 12:25 AM Jiri Pirko <jiri@resnulli.us> wrote: > > + switch (n->nlmsg_type) { > > + case RTM_NEWCHAIN: > > + /* In case the chain was successfully added, take a reference > > + * to the chain. This ensures that an empty chain > > + * does not disappear at the end of this function. > > + */ > > + tcf_chain_hold(chain); > > + chain->explicitly_created = true; > > + tc_chain_notify(chain, NULL, 0, NLM_F_CREATE | NLM_F_EXCL, > > + RTM_NEWCHAIN, false); > > + break; > > + case RTM_DELCHAIN: > > + /* Flush the chain first as the user requested chain removal. */ > > + tcf_chain_flush(chain); > > + /* In case the chain was successfully deleted, put a reference > > + * to the chain previously taken during addition. > > + */ > > + tcf_chain_put_explicitly_created(chain); > > + break; > > I don't see you send notification to user-space when deleting a chain, > am I missing anything? Oh, it is hidden in tcf_chain_put(): void tcf_chain_put(struct tcf_chain *chain) { if (--chain->refcnt == 0) { tc_chain_notify(chain, NULL, 0, 0, RTM_DELCHAIN, false); tc_chain_tmplt_del(chain); tcf_chain_destroy(chain); } } So, you only send out notification when the last refcnt is gone. If the chain that is being deleted by a user is still used by an action, you return 0 or -EPERM?
Wed, Jul 25, 2018 at 01:20:08AM CEST, xiyou.wangcong@gmail.com wrote: >On Tue, Jul 24, 2018 at 3:30 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: >> >> On Mon, Jul 23, 2018 at 12:25 AM Jiri Pirko <jiri@resnulli.us> wrote: >> > + switch (n->nlmsg_type) { >> > + case RTM_NEWCHAIN: >> > + /* In case the chain was successfully added, take a reference >> > + * to the chain. This ensures that an empty chain >> > + * does not disappear at the end of this function. >> > + */ >> > + tcf_chain_hold(chain); >> > + chain->explicitly_created = true; >> > + tc_chain_notify(chain, NULL, 0, NLM_F_CREATE | NLM_F_EXCL, >> > + RTM_NEWCHAIN, false); >> > + break; >> > + case RTM_DELCHAIN: >> > + /* Flush the chain first as the user requested chain removal. */ >> > + tcf_chain_flush(chain); >> > + /* In case the chain was successfully deleted, put a reference >> > + * to the chain previously taken during addition. >> > + */ >> > + tcf_chain_put_explicitly_created(chain); >> > + break; >> >> I don't see you send notification to user-space when deleting a chain, >> am I missing anything? > >Oh, it is hidden in tcf_chain_put(): > >void tcf_chain_put(struct tcf_chain *chain) >{ > if (--chain->refcnt == 0) { > tc_chain_notify(chain, NULL, 0, 0, RTM_DELCHAIN, false); > tc_chain_tmplt_del(chain); > tcf_chain_destroy(chain); > } >} > >So, you only send out notification when the last refcnt is gone. > >If the chain that is being deleted by a user is still used by an action, >you return 0 or -EPERM? 0 and the chain stays there until the action is removed. Hmm, do you thing that -EPERM should be returned in that case? The thing is, we have to flush the chain in order to see the action references are there. We would have to have 2 ref counters, one for filter, one for actions. What do you think?
On Tue, Jul 24, 2018 at 11:49 PM Jiri Pirko <jiri@resnulli.us> wrote: > > Wed, Jul 25, 2018 at 01:20:08AM CEST, xiyou.wangcong@gmail.com wrote: > >So, you only send out notification when the last refcnt is gone. > > > >If the chain that is being deleted by a user is still used by an action, > >you return 0 or -EPERM? > > 0 and the chain stays there until the action is removed. Hmm, do you thing > that -EPERM should be returned in that case? The thing is, we have to > flush the chain in order to see the action references are there. We would > have to have 2 ref counters, one for filter, one for actions. > What do you think? _If_ RTM_DELCHAIN does decrease the chain refcnt, then it is broken: # tc chain add X... (refcnt == 1) # tc action add ... goto chain X (refcnt==2) # tc chain del X ... (refcnt== 1) # tc chain del X ... (refcnt==0) RTM_DELCHAIN should just test if refcnt is 1, if it is, delete it, otherwise return -EPERM. This is how we handle tc standalone actions, see tcf_idr_delete_index(). Yes, you might need two refcnt's here.
Wed, Jul 25, 2018 at 06:40:44PM CEST, xiyou.wangcong@gmail.com wrote: >On Tue, Jul 24, 2018 at 11:49 PM Jiri Pirko <jiri@resnulli.us> wrote: >> >> Wed, Jul 25, 2018 at 01:20:08AM CEST, xiyou.wangcong@gmail.com wrote: >> >So, you only send out notification when the last refcnt is gone. >> > >> >If the chain that is being deleted by a user is still used by an action, >> >you return 0 or -EPERM? >> >> 0 and the chain stays there until the action is removed. Hmm, do you thing >> that -EPERM should be returned in that case? The thing is, we have to >> flush the chain in order to see the action references are there. We would >> have to have 2 ref counters, one for filter, one for actions. >> What do you think? > >_If_ RTM_DELCHAIN does decrease the chain refcnt, then it is >broken: > ># tc chain add X... (refcnt == 1) ># tc action add ... goto chain X (refcnt==2) ># tc chain del X ... (refcnt== 1) ># tc chain del X ... (refcnt==0) > >RTM_DELCHAIN should just test if refcnt is 1, if it is, delete it, >otherwise return -EPERM. This is how we handle tc standalone >actions, see tcf_idr_delete_index(). > >Yes, you might need two refcnt's here. Okay. Sounds good. I'm on it.
Thu, Jul 26, 2018 at 09:38:39AM CEST, jiri@resnulli.us wrote: >Wed, Jul 25, 2018 at 06:40:44PM CEST, xiyou.wangcong@gmail.com wrote: >>On Tue, Jul 24, 2018 at 11:49 PM Jiri Pirko <jiri@resnulli.us> wrote: >>> >>> Wed, Jul 25, 2018 at 01:20:08AM CEST, xiyou.wangcong@gmail.com wrote: >>> >So, you only send out notification when the last refcnt is gone. >>> > >>> >If the chain that is being deleted by a user is still used by an action, >>> >you return 0 or -EPERM? >>> >>> 0 and the chain stays there until the action is removed. Hmm, do you thing >>> that -EPERM should be returned in that case? The thing is, we have to >>> flush the chain in order to see the action references are there. We would >>> have to have 2 ref counters, one for filter, one for actions. >>> What do you think? >> >>_If_ RTM_DELCHAIN does decrease the chain refcnt, then it is >>broken: >> >># tc chain add X... (refcnt == 1) >># tc action add ... goto chain X (refcnt==2) >># tc chain del X ... (refcnt== 1) >># tc chain del X ... (refcnt==0) >> >>RTM_DELCHAIN should just test if refcnt is 1, if it is, delete it, >>otherwise return -EPERM. This is how we handle tc standalone >>actions, see tcf_idr_delete_index(). >> >>Yes, you might need two refcnt's here. > >Okay. Sounds good. I'm on it. Actually, I found an issue. The action to "goto chain" might be attached to a filter in the same chain. That is completely legitimate usage. When I do: # tc chain del X I expect the chain to be flushed and removed. If there is an action there with "goto" to the same chain, the command should be successful. However, I don't see any easy way to find out if the chain is referenced only by actions used by filters in the same chain :/ Thoughts?
Thu, Jul 26, 2018 at 12:06:14PM CEST, jiri@resnulli.us wrote: >Thu, Jul 26, 2018 at 09:38:39AM CEST, jiri@resnulli.us wrote: >>Wed, Jul 25, 2018 at 06:40:44PM CEST, xiyou.wangcong@gmail.com wrote: >>>On Tue, Jul 24, 2018 at 11:49 PM Jiri Pirko <jiri@resnulli.us> wrote: >>>> >>>> Wed, Jul 25, 2018 at 01:20:08AM CEST, xiyou.wangcong@gmail.com wrote: >>>> >So, you only send out notification when the last refcnt is gone. >>>> > >>>> >If the chain that is being deleted by a user is still used by an action, >>>> >you return 0 or -EPERM? >>>> >>>> 0 and the chain stays there until the action is removed. Hmm, do you thing >>>> that -EPERM should be returned in that case? The thing is, we have to >>>> flush the chain in order to see the action references are there. We would >>>> have to have 2 ref counters, one for filter, one for actions. >>>> What do you think? >>> >>>_If_ RTM_DELCHAIN does decrease the chain refcnt, then it is >>>broken: >>> >>># tc chain add X... (refcnt == 1) >>># tc action add ... goto chain X (refcnt==2) >>># tc chain del X ... (refcnt== 1) >>># tc chain del X ... (refcnt==0) >>> >>>RTM_DELCHAIN should just test if refcnt is 1, if it is, delete it, >>>otherwise return -EPERM. This is how we handle tc standalone >>>actions, see tcf_idr_delete_index(). >>> >>>Yes, you might need two refcnt's here. >> >>Okay. Sounds good. I'm on it. > >Actually, I found an issue. The action to "goto chain" might be attached >to a filter in the same chain. That is completely legitimate usage. >When I do: ># tc chain del X >I expect the chain to be flushed and removed. If there is an action >there with "goto" to the same chain, the command should be successful. >However, I don't see any easy way to find out if the chain is referenced >only by actions used by filters in the same chain :/ > >Thoughts? I'm now working on a patch that would treat empty chains implicitly created or deleted by user that only are referenced by action as a zombie ones. They won't be visible on dump. User won't know about them, they would only serve as a place holder for "goto chain" actions. I think it is reasonable. What do you think. Will send the RFC in few hours. >
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 86f4651784e8..81ec8276db9c 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -304,6 +304,7 @@ struct tcf_chain { struct tcf_block *block; u32 index; /* chain index */ unsigned int refcnt; + bool explicitly_created; }; struct tcf_block { diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 7d8502313c99..46399367627f 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -150,6 +150,13 @@ enum { RTM_NEWCACHEREPORT = 96, #define RTM_NEWCACHEREPORT RTM_NEWCACHEREPORT + RTM_NEWCHAIN = 100, +#define RTM_NEWCHAIN RTM_NEWCHAIN + RTM_DELCHAIN, +#define RTM_DELCHAIN RTM_DELCHAIN + RTM_GETCHAIN, +#define RTM_GETCHAIN RTM_GETCHAIN + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index d26fed194870..5225fc557e69 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -262,29 +262,57 @@ static void tcf_chain_hold(struct tcf_chain *chain) ++chain->refcnt; } -struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index, - bool create) +static struct tcf_chain *tcf_chain_lookup(struct tcf_block *block, + u32 chain_index) { struct tcf_chain *chain; list_for_each_entry(chain, &block->chain_list, list) { - if (chain->index == chain_index) { - tcf_chain_hold(chain); + if (chain->index == chain_index) return chain; - } + } + return NULL; +} + +static int tc_chain_notify(struct tcf_chain *chain, struct sk_buff *oskb, + u32 seq, u16 flags, int event, bool unicast); + +struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index, + bool create) +{ + struct tcf_chain *chain = tcf_chain_lookup(block, chain_index); + + if (chain) { + tcf_chain_hold(chain); + return chain; } - return create ? tcf_chain_create(block, chain_index) : NULL; + if (!create) + return NULL; + chain = tcf_chain_create(block, chain_index); + if (!chain) + return NULL; + tc_chain_notify(chain, NULL, 0, NLM_F_CREATE | NLM_F_EXCL, + RTM_NEWCHAIN, false); + return chain; } EXPORT_SYMBOL(tcf_chain_get); void tcf_chain_put(struct tcf_chain *chain) { - if (--chain->refcnt == 0) + if (--chain->refcnt == 0) { + tc_chain_notify(chain, NULL, 0, 0, RTM_DELCHAIN, false); tcf_chain_destroy(chain); + } } EXPORT_SYMBOL(tcf_chain_put); +static void tcf_chain_put_explicitly_created(struct tcf_chain *chain) +{ + if (chain->explicitly_created) + tcf_chain_put(chain); +} + static bool tcf_block_offload_in_use(struct tcf_block *block) { return block->offloadcnt; @@ -694,8 +722,10 @@ void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q, if (block->refcnt == 1) { /* At this point, all the chains should have refcnt >= 1. */ - list_for_each_entry_safe(chain, tmp, &block->chain_list, list) + list_for_each_entry_safe(chain, tmp, &block->chain_list, list) { + tcf_chain_put_explicitly_created(chain); tcf_chain_put(chain); + } block->refcnt--; if (list_empty(&block->chain_list)) @@ -1609,6 +1639,264 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb) return skb->len; } +static int tc_chain_fill_node(struct tcf_chain *chain, struct net *net, + struct sk_buff *skb, struct tcf_block *block, + u32 portid, u32 seq, u16 flags, int event) +{ + unsigned char *b = skb_tail_pointer(skb); + struct nlmsghdr *nlh; + struct tcmsg *tcm; + + nlh = nlmsg_put(skb, portid, seq, event, sizeof(*tcm), flags); + if (!nlh) + goto out_nlmsg_trim; + tcm = nlmsg_data(nlh); + tcm->tcm_family = AF_UNSPEC; + tcm->tcm__pad1 = 0; + tcm->tcm__pad2 = 0; + tcm->tcm_handle = 0; + if (block->q) { + tcm->tcm_ifindex = qdisc_dev(block->q)->ifindex; + tcm->tcm_parent = block->q->handle; + } else { + tcm->tcm_ifindex = TCM_IFINDEX_MAGIC_BLOCK; + tcm->tcm_block_index = block->index; + } + + if (nla_put_u32(skb, TCA_CHAIN, chain->index)) + goto nla_put_failure; + + nlh->nlmsg_len = skb_tail_pointer(skb) - b; + return skb->len; + +out_nlmsg_trim: +nla_put_failure: + nlmsg_trim(skb, b); + return -EMSGSIZE; +} + +static int tc_chain_notify(struct tcf_chain *chain, struct sk_buff *oskb, + u32 seq, u16 flags, int event, bool unicast) +{ + u32 portid = oskb ? NETLINK_CB(oskb).portid : 0; + struct tcf_block *block = chain->block; + struct net *net = block->net; + struct sk_buff *skb; + + skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (!skb) + return -ENOBUFS; + + if (tc_chain_fill_node(chain, net, skb, block, portid, + seq, flags, event) <= 0) { + kfree_skb(skb); + return -EINVAL; + } + + if (unicast) + return netlink_unicast(net->rtnl, skb, portid, MSG_DONTWAIT); + + return rtnetlink_send(skb, net, portid, RTNLGRP_TC, flags & NLM_F_ECHO); +} + +/* Add/delete/get a chain */ + +static int tc_ctl_chain(struct sk_buff *skb, struct nlmsghdr *n, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct nlattr *tca[TCA_MAX + 1]; + struct tcmsg *t; + u32 parent; + u32 chain_index; + struct Qdisc *q = NULL; + struct tcf_chain *chain = NULL; + struct tcf_block *block; + unsigned long cl; + int err; + + if (n->nlmsg_type != RTM_GETCHAIN && + !netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) + return -EPERM; + +replay: + err = nlmsg_parse(n, sizeof(*t), tca, TCA_MAX, NULL, extack); + if (err < 0) + return err; + + t = nlmsg_data(n); + parent = t->tcm_parent; + cl = 0; + + block = tcf_block_find(net, &q, &parent, &cl, + t->tcm_ifindex, t->tcm_block_index, extack); + if (IS_ERR(block)) + return PTR_ERR(block); + + chain_index = tca[TCA_CHAIN] ? nla_get_u32(tca[TCA_CHAIN]) : 0; + if (chain_index > TC_ACT_EXT_VAL_MASK) { + NL_SET_ERR_MSG(extack, "Specified chain index exceeds upper limit"); + return -EINVAL; + } + chain = tcf_chain_lookup(block, chain_index); + if (n->nlmsg_type == RTM_NEWCHAIN) { + if (chain) { + NL_SET_ERR_MSG(extack, "Filter chain already exists"); + return -EEXIST; + } + if (!(n->nlmsg_flags & NLM_F_CREATE)) { + NL_SET_ERR_MSG(extack, "Need both RTM_NEWCHAIN and NLM_F_CREATE to create a new chain"); + return -ENOENT; + } + chain = tcf_chain_create(block, chain_index); + if (!chain) { + NL_SET_ERR_MSG(extack, "Failed to create filter chain"); + return -ENOMEM; + } + } else { + if (!chain) { + NL_SET_ERR_MSG(extack, "Cannot find specified filter chain"); + return -EINVAL; + } + tcf_chain_hold(chain); + } + + switch (n->nlmsg_type) { + case RTM_NEWCHAIN: + /* In case the chain was successfully added, take a reference + * to the chain. This ensures that an empty chain + * does not disappear at the end of this function. + */ + tcf_chain_hold(chain); + chain->explicitly_created = true; + tc_chain_notify(chain, NULL, 0, NLM_F_CREATE | NLM_F_EXCL, + RTM_NEWCHAIN, false); + break; + case RTM_DELCHAIN: + /* Flush the chain first as the user requested chain removal. */ + tcf_chain_flush(chain); + /* In case the chain was successfully deleted, put a reference + * to the chain previously taken during addition. + */ + tcf_chain_put_explicitly_created(chain); + break; + case RTM_GETCHAIN: + break; + err = tc_chain_notify(chain, skb, n->nlmsg_seq, + n->nlmsg_seq, n->nlmsg_type, true); + if (err < 0) + NL_SET_ERR_MSG(extack, "Failed to send chain notify message"); + break; + default: + err = -EOPNOTSUPP; + NL_SET_ERR_MSG(extack, "Unsupported message type"); + goto errout; + } + +errout: + tcf_chain_put(chain); + if (err == -EAGAIN) + /* Replay the request. */ + goto replay; + return err; +} + +/* called with RTNL */ +static int tc_dump_chain(struct sk_buff *skb, struct netlink_callback *cb) +{ + struct net *net = sock_net(skb->sk); + struct nlattr *tca[TCA_MAX + 1]; + struct Qdisc *q = NULL; + struct tcf_block *block; + struct tcf_chain *chain; + struct tcmsg *tcm = nlmsg_data(cb->nlh); + long index_start; + long index; + u32 parent; + int err; + + if (nlmsg_len(cb->nlh) < sizeof(*tcm)) + return skb->len; + + err = nlmsg_parse(cb->nlh, sizeof(*tcm), tca, TCA_MAX, NULL, NULL); + if (err) + return err; + + if (tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK) { + block = tcf_block_lookup(net, tcm->tcm_block_index); + if (!block) + goto out; + /* If we work with block index, q is NULL and parent value + * will never be used in the following code. The check + * in tcf_fill_node prevents it. However, compiler does not + * see that far, so set parent to zero to silence the warning + * about parent being uninitialized. + */ + parent = 0; + } else { + const struct Qdisc_class_ops *cops; + struct net_device *dev; + unsigned long cl = 0; + + dev = __dev_get_by_index(net, tcm->tcm_ifindex); + if (!dev) + return skb->len; + + parent = tcm->tcm_parent; + if (!parent) { + q = dev->qdisc; + parent = q->handle; + } else { + q = qdisc_lookup(dev, TC_H_MAJ(tcm->tcm_parent)); + } + if (!q) + goto out; + cops = q->ops->cl_ops; + if (!cops) + goto out; + if (!cops->tcf_block) + goto out; + if (TC_H_MIN(tcm->tcm_parent)) { + cl = cops->find(q, tcm->tcm_parent); + if (cl == 0) + goto out; + } + block = cops->tcf_block(q, cl, NULL); + if (!block) + goto out; + if (tcf_block_shared(block)) + q = NULL; + } + + index_start = cb->args[0]; + index = 0; + + list_for_each_entry(chain, &block->chain_list, list) { + if ((tca[TCA_CHAIN] && + nla_get_u32(tca[TCA_CHAIN]) != chain->index)) + continue; + if (index < index_start) { + index++; + continue; + } + err = tc_chain_fill_node(chain, net, skb, block, + NETLINK_CB(cb->skb).portid, + cb->nlh->nlmsg_seq, NLM_F_MULTI, + RTM_NEWCHAIN); + if (err <= 0) + break; + index++; + } + + cb->args[0] = index; + +out: + /* If we did no progress, the error (EMSGSIZE) is real */ + if (skb->len == 0 && err) + return err; + return skb->len; +} + void tcf_exts_destroy(struct tcf_exts *exts) { #ifdef CONFIG_NET_CLS_ACT @@ -1825,6 +2113,10 @@ static int __init tc_filter_init(void) rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL, 0); rtnl_register(PF_UNSPEC, RTM_GETTFILTER, tc_get_tfilter, tc_dump_tfilter, 0); + rtnl_register(PF_UNSPEC, RTM_NEWCHAIN, tc_ctl_chain, NULL, 0); + rtnl_register(PF_UNSPEC, RTM_DELCHAIN, tc_ctl_chain, NULL, 0); + rtnl_register(PF_UNSPEC, RTM_GETCHAIN, tc_ctl_chain, + tc_dump_chain, 0); return 0; diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 7b7433a1a34c..74b951f55608 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -159,7 +159,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) switch (sclass) { case SECCLASS_NETLINK_ROUTE_SOCKET: /* RTM_MAX always point to RTM_SETxxxx, ie RTM_NEWxxx + 3 */ - BUILD_BUG_ON(RTM_MAX != (RTM_NEWCACHEREPORT + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_NEWCHAIN + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break;