Message ID | 1410448819-6694-3-git-send-email-pablo@netfilter.org |
---|---|
State | Accepted |
Delegated to: | Pablo Neira |
Headers | show |
On Thu, Sep 11, 2014 at 05:20:19PM +0200, Pablo Neira Ayuso wrote: > This patch exposes the ruleset generation ID in three ways: > > 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset > generation ID. This ID is incremented in every commit and it > should be large enough to avoid wraparound problems. > > 2) The less significant 16-bits of the generation ID is exposed through > the nfgenmsg->res_id header field. This allows us to quickly catch > if the ruleset has change between two consecutive list dumps from > different object lists (in this specific case I think the risk of > wraparound is unlikely). > > 3) Userspace subscribers may receive notifications of new rule-set > generation after every commit. This also provides an alternative > way to monitor the generation ID. If the events are lost, the > userspace process hits a overrun error, so it knows that it is > working with a stale ruleset anyway. Correct, there's just one thing to consider here, which is what happens once we add active ruleset state notifications, like counters, limit etc. At that point its not clear anymore whether changes have happened. OTOH it would be just a false positive, so at least things would keep working. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11 September 2014 17:20, Pablo Neira Ayuso <pablo@netfilter.org> wrote: > + > +static int nf_tables_gen_notify(struct net *net, struct sk_buff *skb, int event) > +{ > + struct nlmsghdr *nlh = nlmsg_hdr(skb); > + struct sk_buff *skb2; > + int err; > + > + if (nlmsg_report(nlh) && > + !nfnetlink_has_listeners(net, NFNLGRP_NFTABLES)) > + return 0; > + The above logic is different from the other functions. I don't know if that was deliberate. > + err = -ENOBUFS; > + skb2 = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); > + if (skb2 == NULL) > + goto err; > + > + err = nf_tables_fill_gen_info(skb2, net, NETLINK_CB(skb).portid, > + nlh->nlmsg_seq); > + if (err < 0) { > + kfree_skb(skb2); > + goto err; > + } > + > + err = nfnetlink_send(skb2, net, NETLINK_CB(skb).portid, > + NFNLGRP_NFTABLES, nlmsg_report(nlh), GFP_KERNEL); > +err: > + if (err < 0) { > + nfnetlink_set_err(net, NETLINK_CB(skb).portid, NFNLGRP_NFTABLES, > + err); > + } > + return err; > +} > + All the xx_notify() functions looks very similar. Do you think it worth to factorize the code?
On Thu, Sep 11, 2014 at 04:32:44PM +0100, Patrick McHardy wrote: > On Thu, Sep 11, 2014 at 05:20:19PM +0200, Pablo Neira Ayuso wrote: > > This patch exposes the ruleset generation ID in three ways: > > > > 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset > > generation ID. This ID is incremented in every commit and it > > should be large enough to avoid wraparound problems. > > > > 2) The less significant 16-bits of the generation ID is exposed through > > the nfgenmsg->res_id header field. This allows us to quickly catch > > if the ruleset has change between two consecutive list dumps from > > different object lists (in this specific case I think the risk of > > wraparound is unlikely). > > > > 3) Userspace subscribers may receive notifications of new rule-set > > generation after every commit. This also provides an alternative > > way to monitor the generation ID. If the events are lost, the > > userspace process hits a overrun error, so it knows that it is > > working with a stale ruleset anyway. > > Correct, there's just one thing to consider here, which is what happens > once we add active ruleset state notifications, like counters, limit > etc. At that point its not clear anymore whether changes have happened. > OTOH it would be just a false positive, so at least things would keep > working. Right, I can put the genid notification in a different nfnetlink multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if you like the idea, we have plenty of spare groups. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 05:46:57PM +0200, Arturo Borrero Gonzalez wrote: > On 11 September 2014 17:20, Pablo Neira Ayuso <pablo@netfilter.org> wrote: > > + > > +static int nf_tables_gen_notify(struct net *net, struct sk_buff *skb, int event) > > +{ > > + struct nlmsghdr *nlh = nlmsg_hdr(skb); > > + struct sk_buff *skb2; > > + int err; > > + > > + if (nlmsg_report(nlh) && > > + !nfnetlink_has_listeners(net, NFNLGRP_NFTABLES)) > > + return 0; > > + > > The above logic is different from the other functions. > > I don't know if that was deliberate. We don't have nft_ctx at that point. nlmsg_report(nlh) is basically what ctx->report usually constains. Unless I'm overlooking something obvious, this looks correct to me. > > + err = -ENOBUFS; > > + skb2 = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); > > + if (skb2 == NULL) > > + goto err; > > + > > + err = nf_tables_fill_gen_info(skb2, net, NETLINK_CB(skb).portid, > > + nlh->nlmsg_seq); > > + if (err < 0) { > > + kfree_skb(skb2); > > + goto err; > > + } > > + > > + err = nfnetlink_send(skb2, net, NETLINK_CB(skb).portid, > > + NFNLGRP_NFTABLES, nlmsg_report(nlh), GFP_KERNEL); > > +err: > > + if (err < 0) { > > + nfnetlink_set_err(net, NETLINK_CB(skb).portid, NFNLGRP_NFTABLES, > > + err); > > + } > > + return err; > > +} > > + > > All the xx_notify() functions looks very similar. Do you think it > worth to factorize the code? I prefer if we focus on adding missing features, fixing bugs and so on. We'll have the time to revisit things later on. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 06:10:40PM +0200, Pablo Neira Ayuso wrote: > On Thu, Sep 11, 2014 at 04:32:44PM +0100, Patrick McHardy wrote: > > On Thu, Sep 11, 2014 at 05:20:19PM +0200, Pablo Neira Ayuso wrote: > > > This patch exposes the ruleset generation ID in three ways: > > > > > > 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset > > > generation ID. This ID is incremented in every commit and it > > > should be large enough to avoid wraparound problems. > > > > > > 2) The less significant 16-bits of the generation ID is exposed through > > > the nfgenmsg->res_id header field. This allows us to quickly catch > > > if the ruleset has change between two consecutive list dumps from > > > different object lists (in this specific case I think the risk of > > > wraparound is unlikely). > > > > > > 3) Userspace subscribers may receive notifications of new rule-set > > > generation after every commit. This also provides an alternative > > > way to monitor the generation ID. If the events are lost, the > > > userspace process hits a overrun error, so it knows that it is > > > working with a stale ruleset anyway. > > > > Correct, there's just one thing to consider here, which is what happens > > once we add active ruleset state notifications, like counters, limit > > etc. At that point its not clear anymore whether changes have happened. > > OTOH it would be just a false positive, so at least things would keep > > working. > > Right, I can put the genid notification in a different nfnetlink > multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if > you like the idea, we have plenty of spare groups. I don't think that's a really good idea since the ordering between the rule notifications and the commit notification wouldn't be reliable. Same thing is probably true for state notifications, not entirely sure yet if they could reasonably be sent to a different group. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 05:45:58PM +0100, Patrick McHardy wrote: > On Thu, Sep 11, 2014 at 06:10:40PM +0200, Pablo Neira Ayuso wrote: > > On Thu, Sep 11, 2014 at 04:32:44PM +0100, Patrick McHardy wrote: > > > On Thu, Sep 11, 2014 at 05:20:19PM +0200, Pablo Neira Ayuso wrote: > > > > This patch exposes the ruleset generation ID in three ways: > > > > > > > > 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset > > > > generation ID. This ID is incremented in every commit and it > > > > should be large enough to avoid wraparound problems. > > > > > > > > 2) The less significant 16-bits of the generation ID is exposed through > > > > the nfgenmsg->res_id header field. This allows us to quickly catch > > > > if the ruleset has change between two consecutive list dumps from > > > > different object lists (in this specific case I think the risk of > > > > wraparound is unlikely). > > > > > > > > 3) Userspace subscribers may receive notifications of new rule-set > > > > generation after every commit. This also provides an alternative > > > > way to monitor the generation ID. If the events are lost, the > > > > userspace process hits a overrun error, so it knows that it is > > > > working with a stale ruleset anyway. > > > > > > Correct, there's just one thing to consider here, which is what happens > > > once we add active ruleset state notifications, like counters, limit > > > etc. At that point its not clear anymore whether changes have happened. > > > OTOH it would be just a false positive, so at least things would keep > > > working. > > > > Right, I can put the genid notification in a different nfnetlink > > multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if > > you like the idea, we have plenty of spare groups. > > I don't think that's a really good idea since the ordering between the > rule notifications and the commit notification wouldn't be reliable. > Same thing is probably true for state notifications, not entirely > sure yet if they could reasonably be sent to a different group. Indeed, we have to stick to one single group. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 06:57:31PM +0200, Pablo Neira Ayuso wrote: > On Thu, Sep 11, 2014 at 05:45:58PM +0100, Patrick McHardy wrote: > > On Thu, Sep 11, 2014 at 06:10:40PM +0200, Pablo Neira Ayuso wrote: > > > On Thu, Sep 11, 2014 at 04:32:44PM +0100, Patrick McHardy wrote: > > > > On Thu, Sep 11, 2014 at 05:20:19PM +0200, Pablo Neira Ayuso wrote: > > > > > This patch exposes the ruleset generation ID in three ways: > > > > > > > > > > 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset > > > > > generation ID. This ID is incremented in every commit and it > > > > > should be large enough to avoid wraparound problems. > > > > > > > > > > 2) The less significant 16-bits of the generation ID is exposed through > > > > > the nfgenmsg->res_id header field. This allows us to quickly catch > > > > > if the ruleset has change between two consecutive list dumps from > > > > > different object lists (in this specific case I think the risk of > > > > > wraparound is unlikely). > > > > > > > > > > 3) Userspace subscribers may receive notifications of new rule-set > > > > > generation after every commit. This also provides an alternative > > > > > way to monitor the generation ID. If the events are lost, the > > > > > userspace process hits a overrun error, so it knows that it is > > > > > working with a stale ruleset anyway. > > > > > > > > Correct, there's just one thing to consider here, which is what happens > > > > once we add active ruleset state notifications, like counters, limit > > > > etc. At that point its not clear anymore whether changes have happened. > > > > OTOH it would be just a false positive, so at least things would keep > > > > working. > > > > > > Right, I can put the genid notification in a different nfnetlink > > > multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if > > > you like the idea, we have plenty of spare groups. > > > > I don't think that's a really good idea since the ordering between the > > rule notifications and the commit notification wouldn't be reliable. > > Same thing is probably true for state notifications, not entirely > > sure yet if they could reasonably be sent to a different group. > > Indeed, we have to stick to one single group. Oh, you can subscribe to several groups from one single socket. So you get them notifications in order. IIRC, the grouping just provides a way to filter out what you don't want to listen. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 07:22:10PM +0200, Pablo Neira Ayuso wrote: > On Thu, Sep 11, 2014 at 06:57:31PM +0200, Pablo Neira Ayuso wrote: > > On Thu, Sep 11, 2014 at 05:45:58PM +0100, Patrick McHardy wrote: > > > > > > > > Right, I can put the genid notification in a different nfnetlink > > > > multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if > > > > you like the idea, we have plenty of spare groups. > > > > > > I don't think that's a really good idea since the ordering between the > > > rule notifications and the commit notification wouldn't be reliable. > > > Same thing is probably true for state notifications, not entirely > > > sure yet if they could reasonably be sent to a different group. > > > > Indeed, we have to stick to one single group. > > Oh, you can subscribe to several groups from one single socket. So you > get them notifications in order. IIRC, the grouping just provides a > way to filter out what you don't want to listen. > Correct, but at that point we're back to square one because we don't know why the error orginated :) -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 11, 2014 at 06:35:35PM +0100, Patrick McHardy wrote: > On Thu, Sep 11, 2014 at 07:22:10PM +0200, Pablo Neira Ayuso wrote: > > On Thu, Sep 11, 2014 at 06:57:31PM +0200, Pablo Neira Ayuso wrote: > > > On Thu, Sep 11, 2014 at 05:45:58PM +0100, Patrick McHardy wrote: > > > > > > > > > > Right, I can put the genid notification in a different nfnetlink > > > > > multicast group (NFNLGRP_NFTABLES_GENID) to avoid false positives if > > > > > you like the idea, we have plenty of spare groups. > > > > > > > > I don't think that's a really good idea since the ordering between the > > > > rule notifications and the commit notification wouldn't be reliable. > > > > Same thing is probably true for state notifications, not entirely > > > > sure yet if they could reasonably be sent to a different group. > > > > > > Indeed, we have to stick to one single group. > > > > Oh, you can subscribe to several groups from one single socket. So you > > get them notifications in order. IIRC, the grouping just provides a > > way to filter out what you don't want to listen. > > > > Correct, but at that point we're back to square one because we don't > know why the error orginated :) Right. I'm considering the (likely spamming) stateful notifications that we'll have at some point. Those we can put them in NFNLGRP_NFTABLES_STATES or something similar. So we leave NFNLGRP_NFTABLES for rule-set updates only (including the genid notification, of course). -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index 66d66dd..b72ccfe 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -51,6 +51,8 @@ enum nft_verdicts { * @NFT_MSG_NEWSETELEM: create a new set element (enum nft_set_elem_attributes) * @NFT_MSG_GETSETELEM: get a set element (enum nft_set_elem_attributes) * @NFT_MSG_DELSETELEM: delete a set element (enum nft_set_elem_attributes) + * @NFT_MSG_NEWGEN: announce a new generation, only for events (enum nft_gen_attributes) + * @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes) */ enum nf_tables_msg_types { NFT_MSG_NEWTABLE, @@ -68,6 +70,8 @@ enum nf_tables_msg_types { NFT_MSG_NEWSETELEM, NFT_MSG_GETSETELEM, NFT_MSG_DELSETELEM, + NFT_MSG_NEWGEN, + NFT_MSG_GETGEN, NFT_MSG_MAX, }; @@ -812,4 +816,16 @@ enum nft_masq_attributes { }; #define NFTA_MASQ_MAX (__NFTA_MASQ_MAX - 1) +/** + * enum nft_gen_attributes - nf_tables ruleset generation attributes + * + * @NFTA_GEN_ID: Ruleset generation ID (NLA_U32) + */ +enum nft_gen_attributes { + NFTA_GEN_UNSPEC, + NFTA_GEN_ID, + __NFTA_GEN_MAX +}; +#define NFTA_GEN_MAX (__NFTA_GEN_MAX - 1) + #endif /* _LINUX_NF_TABLES_H */ diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 8237460..a476b99 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -405,9 +405,9 @@ static const struct nla_policy nft_table_policy[NFTA_TABLE_MAX + 1] = { [NFTA_TABLE_FLAGS] = { .type = NLA_U32 }, }; -static int nf_tables_fill_table_info(struct sk_buff *skb, u32 portid, u32 seq, - int event, u32 flags, int family, - const struct nft_table *table) +static int nf_tables_fill_table_info(struct sk_buff *skb, struct net *net, + u32 portid, u32 seq, int event, u32 flags, + int family, const struct nft_table *table) { struct nlmsghdr *nlh; struct nfgenmsg *nfmsg; @@ -420,7 +420,7 @@ static int nf_tables_fill_table_info(struct sk_buff *skb, u32 portid, u32 seq, nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_TABLE_NAME, table->name) || nla_put_be32(skb, NFTA_TABLE_FLAGS, htonl(table->flags)) || @@ -448,8 +448,8 @@ static int nf_tables_table_notify(const struct nft_ctx *ctx, int event) if (skb == NULL) goto err; - err = nf_tables_fill_table_info(skb, ctx->portid, ctx->seq, event, 0, - ctx->afi->family, ctx->table); + err = nf_tables_fill_table_info(skb, ctx->net, ctx->portid, ctx->seq, + event, 0, ctx->afi->family, ctx->table); if (err < 0) { kfree_skb(skb); goto err; @@ -488,7 +488,7 @@ static int nf_tables_dump_tables(struct sk_buff *skb, if (idx > s_idx) memset(&cb->args[1], 0, sizeof(cb->args) - sizeof(cb->args[0])); - if (nf_tables_fill_table_info(skb, + if (nf_tables_fill_table_info(skb, net, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NFT_MSG_NEWTABLE, @@ -540,7 +540,7 @@ static int nf_tables_gettable(struct sock *nlsk, struct sk_buff *skb, if (!skb2) return -ENOMEM; - err = nf_tables_fill_table_info(skb2, NETLINK_CB(skb).portid, + err = nf_tables_fill_table_info(skb2, net, NETLINK_CB(skb).portid, nlh->nlmsg_seq, NFT_MSG_NEWTABLE, 0, family, table); if (err < 0) @@ -914,9 +914,9 @@ nla_put_failure: return -ENOSPC; } -static int nf_tables_fill_chain_info(struct sk_buff *skb, u32 portid, u32 seq, - int event, u32 flags, int family, - const struct nft_table *table, +static int nf_tables_fill_chain_info(struct sk_buff *skb, struct net *net, + u32 portid, u32 seq, int event, u32 flags, + int family, const struct nft_table *table, const struct nft_chain *chain) { struct nlmsghdr *nlh; @@ -930,7 +930,7 @@ static int nf_tables_fill_chain_info(struct sk_buff *skb, u32 portid, u32 seq, nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_CHAIN_TABLE, table->name)) goto nla_put_failure; @@ -988,8 +988,8 @@ static int nf_tables_chain_notify(const struct nft_ctx *ctx, int event) if (skb == NULL) goto err; - err = nf_tables_fill_chain_info(skb, ctx->portid, ctx->seq, event, 0, - ctx->afi->family, ctx->table, + err = nf_tables_fill_chain_info(skb, ctx->net, ctx->portid, ctx->seq, + event, 0, ctx->afi->family, ctx->table, ctx->chain); if (err < 0) { kfree_skb(skb); @@ -1031,7 +1031,8 @@ static int nf_tables_dump_chains(struct sk_buff *skb, if (idx > s_idx) memset(&cb->args[1], 0, sizeof(cb->args) - sizeof(cb->args[0])); - if (nf_tables_fill_chain_info(skb, NETLINK_CB(cb->skb).portid, + if (nf_tables_fill_chain_info(skb, net, + NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NFT_MSG_NEWCHAIN, NLM_F_MULTI, @@ -1090,7 +1091,7 @@ static int nf_tables_getchain(struct sock *nlsk, struct sk_buff *skb, if (!skb2) return -ENOMEM; - err = nf_tables_fill_chain_info(skb2, NETLINK_CB(skb).portid, + err = nf_tables_fill_chain_info(skb2, net, NETLINK_CB(skb).portid, nlh->nlmsg_seq, NFT_MSG_NEWCHAIN, 0, family, table, chain); if (err < 0) @@ -1647,8 +1648,9 @@ static const struct nla_policy nft_rule_policy[NFTA_RULE_MAX + 1] = { .len = NFT_USERDATA_MAXLEN }, }; -static int nf_tables_fill_rule_info(struct sk_buff *skb, u32 portid, u32 seq, - int event, u32 flags, int family, +static int nf_tables_fill_rule_info(struct sk_buff *skb, struct net *net, + u32 portid, u32 seq, int event, + u32 flags, int family, const struct nft_table *table, const struct nft_chain *chain, const struct nft_rule *rule) @@ -1668,7 +1670,7 @@ static int nf_tables_fill_rule_info(struct sk_buff *skb, u32 portid, u32 seq, nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_RULE_TABLE, table->name)) goto nla_put_failure; @@ -1724,8 +1726,8 @@ static int nf_tables_rule_notify(const struct nft_ctx *ctx, if (skb == NULL) goto err; - err = nf_tables_fill_rule_info(skb, ctx->portid, ctx->seq, event, 0, - ctx->afi->family, ctx->table, + err = nf_tables_fill_rule_info(skb, ctx->net, ctx->portid, ctx->seq, + event, 0, ctx->afi->family, ctx->table, ctx->chain, rule); if (err < 0) { kfree_skb(skb); @@ -1771,7 +1773,7 @@ static int nf_tables_dump_rules(struct sk_buff *skb, if (idx > s_idx) memset(&cb->args[1], 0, sizeof(cb->args) - sizeof(cb->args[0])); - if (nf_tables_fill_rule_info(skb, NETLINK_CB(cb->skb).portid, + if (nf_tables_fill_rule_info(skb, net, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, NFT_MSG_NEWRULE, NLM_F_MULTI | NLM_F_APPEND, @@ -1837,7 +1839,7 @@ static int nf_tables_getrule(struct sock *nlsk, struct sk_buff *skb, if (!skb2) return -ENOMEM; - err = nf_tables_fill_rule_info(skb2, NETLINK_CB(skb).portid, + err = nf_tables_fill_rule_info(skb2, net, NETLINK_CB(skb).portid, nlh->nlmsg_seq, NFT_MSG_NEWRULE, 0, family, table, chain, rule); if (err < 0) @@ -2321,7 +2323,7 @@ static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx, nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = ctx->afi->family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(ctx->net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_SET_TABLE, ctx->table->name)) goto nla_put_failure; @@ -2925,7 +2927,7 @@ static int nf_tables_dump_set(struct sk_buff *skb, struct netlink_callback *cb) nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = ctx.afi->family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(ctx.net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_SET_ELEM_LIST_TABLE, ctx.table->name)) goto nla_put_failure; @@ -3006,7 +3008,7 @@ static int nf_tables_fill_setelem_info(struct sk_buff *skb, nfmsg = nlmsg_data(nlh); nfmsg->nfgen_family = ctx->afi->family; nfmsg->version = NFNETLINK_V0; - nfmsg->res_id = 0; + nfmsg->res_id = htons(ctx->net->nft.base_seq & 0xffff); if (nla_put_string(skb, NFTA_SET_TABLE, ctx->table->name)) goto nla_put_failure; @@ -3293,6 +3295,87 @@ static int nf_tables_delsetelem(struct sock *nlsk, struct sk_buff *skb, return err; } +static int nf_tables_fill_gen_info(struct sk_buff *skb, struct net *net, + u32 portid, u32 seq) +{ + struct nlmsghdr *nlh; + struct nfgenmsg *nfmsg; + int event = (NFNL_SUBSYS_NFTABLES << 8) | NFT_MSG_NEWGEN; + + nlh = nlmsg_put(skb, portid, seq, event, sizeof(struct nfgenmsg), 0); + if (nlh == NULL) + goto nla_put_failure; + + nfmsg = nlmsg_data(nlh); + nfmsg->nfgen_family = AF_UNSPEC; + nfmsg->version = NFNETLINK_V0; + nfmsg->res_id = htons(net->nft.base_seq & 0xffff); + + if (nla_put_be32(skb, NFTA_GEN_ID, htonl(net->nft.base_seq))) + goto nla_put_failure; + + return nlmsg_end(skb, nlh); + +nla_put_failure: + nlmsg_trim(skb, nlh); + return -EMSGSIZE; +} + +static int nf_tables_gen_notify(struct net *net, struct sk_buff *skb, int event) +{ + struct nlmsghdr *nlh = nlmsg_hdr(skb); + struct sk_buff *skb2; + int err; + + if (nlmsg_report(nlh) && + !nfnetlink_has_listeners(net, NFNLGRP_NFTABLES)) + return 0; + + err = -ENOBUFS; + skb2 = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb2 == NULL) + goto err; + + err = nf_tables_fill_gen_info(skb2, net, NETLINK_CB(skb).portid, + nlh->nlmsg_seq); + if (err < 0) { + kfree_skb(skb2); + goto err; + } + + err = nfnetlink_send(skb2, net, NETLINK_CB(skb).portid, + NFNLGRP_NFTABLES, nlmsg_report(nlh), GFP_KERNEL); +err: + if (err < 0) { + nfnetlink_set_err(net, NETLINK_CB(skb).portid, NFNLGRP_NFTABLES, + err); + } + return err; +} + +static int nf_tables_getgen(struct sock *nlsk, struct sk_buff *skb, + const struct nlmsghdr *nlh, + const struct nlattr * const nla[]) +{ + struct net *net = sock_net(skb->sk); + struct sk_buff *skb2; + int err; + + skb2 = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb2 == NULL) + return -ENOMEM; + + err = nf_tables_fill_gen_info(skb2, net, NETLINK_CB(skb).portid, + nlh->nlmsg_seq); + if (err < 0) + goto err; + + return nlmsg_unicast(nlsk, skb2, NETLINK_CB(skb).portid); +err: + kfree_skb(skb2); + return err; +} + static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = { [NFT_MSG_NEWTABLE] = { .call_batch = nf_tables_newtable, @@ -3369,6 +3452,9 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = { .attr_count = NFTA_SET_ELEM_LIST_MAX, .policy = nft_set_elem_list_policy, }, + [NFT_MSG_GETGEN] = { + .call = nf_tables_getgen, + }, }; static void nft_chain_commit_update(struct nft_trans *trans) @@ -3526,6 +3612,8 @@ static int nf_tables_commit(struct sk_buff *skb) call_rcu(&trans->rcu_head, nf_tables_commit_release_rcu); } + nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN); + return 0; }
This patch exposes the ruleset generation ID in three ways: 1) The new command NFT_MSG_GETGEN that exposes the 32-bits ruleset generation ID. This ID is incremented in every commit and it should be large enough to avoid wraparound problems. 2) The less significant 16-bits of the generation ID is exposed through the nfgenmsg->res_id header field. This allows us to quickly catch if the ruleset has change between two consecutive list dumps from different object lists (in this specific case I think the risk of wraparound is unlikely). 3) Userspace subscribers may receive notifications of new rule-set generation after every commit. This also provides an alternative way to monitor the generation ID. If the events are lost, the userspace process hits a overrun error, so it knows that it is working with a stale ruleset anyway. Patrick spotted that rule-set transformations in userspace may take quite some time. In that case, it annotates the 32-bits generation ID before fetching the rule-set, then: 1) it compares it to what we obtain after the transformation to make sure it is not working with a stale rule-set and no wraparound has ocurred. 2) it subscribes to ruleset notifications, so it can watch for new generation ID. This is complementary to the NLM_F_DUMP_INTR approach, which allows us to detect an interference in the middle one single list dumping. There is no way to explicitly check that an interference has occurred between two list dumps from the kernel, since it doesn't know how many lists the userspace client is actually going to dump. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- Change from v1: Expose the 32-bit generation ID through NFT_MSG_GETGEN and add event notification after the commit to notify the new generation ID to subscribers. include/uapi/linux/netfilter/nf_tables.h | 16 ++++ net/netfilter/nf_tables_api.c | 140 ++++++++++++++++++++++++------ 2 files changed, 130 insertions(+), 26 deletions(-)