Message ID | 149935174634.19966.16018870027671610502.stgit@nfdev2.cica.es |
---|---|
State | Changes Requested |
Delegated to: | Pablo Neira |
Headers | show |
Hi, On Thu, Jul 06, 2017 at 04:36:45PM +0200, Arturo Borrero Gonzalez wrote: > If you add set elements to interval sets, the output is wrong. > Fix this by caching first element of the range (first event), > then wait for the second element of the range (second event) to > print them both at the same time. As promised, I am preparing my own solution for side-by-side comparison. Though I'm running into problems and want to use the occasion to discuss them first: What I wasn't able to solve yet are half-open ranges, like so: | nft add set ip t portrange { type inet_service; flags interval; } | nft add element ip t portrange { 1024-65535 } In this case there is only a single element with value 1024 which doesn't have EXPR_F_INTERVAL_END set. Looking at interval_map_decompose(), this is identified to be a range till the end of the scope if it's the last element in the set. In monitor code though, I can't predict whether an interval end element will come afterwards or not, so I end up caching the element and everything turns into a mess. I'm pretty sure your solution has the same problem, could you check that? Right now, I only see two ways to get this sorted: 1) Change kernel code to always include both start end end of a range in a single notification. This would eliminate the need for any caching in netlink_events_setelem_cb() altogether! 2) Change monitor code to cache all events until the final NFTA_GEN_ID message, then handle all messages at once. What do you think? Thanks, Phil -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11 July 2017 at 20:11, Phil Sutter <phil@nwl.cc> wrote: > Hi, > > On Thu, Jul 06, 2017 at 04:36:45PM +0200, Arturo Borrero Gonzalez wrote: >> If you add set elements to interval sets, the output is wrong. >> Fix this by caching first element of the range (first event), >> then wait for the second element of the range (second event) to >> print them both at the same time. > > As promised, I am preparing my own solution for side-by-side comparison. > Though I'm running into problems and want to use the occasion to discuss > them first: > > What I wasn't able to solve yet are half-open ranges, like so: > > | nft add set ip t portrange { type inet_service; flags interval; } > | nft add element ip t portrange { 1024-65535 } > > In this case there is only a single element with value 1024 which > doesn't have EXPR_F_INTERVAL_END set. Looking at > interval_map_decompose(), this is identified to be a range till the end > of the scope if it's the last element in the set. > > In monitor code though, I can't predict whether an interval end element > will come afterwards or not, so I end up caching the element and > everything turns into a mess. I'm pretty sure your solution has the same > problem, could you check that? > > Right now, I only see two ways to get this sorted: > > 1) Change kernel code to always include both start end end of a range in > a single notification. This would eliminate the need for any caching > in netlink_events_setelem_cb() altogether! > > 2) Change monitor code to cache all events until the final NFTA_GEN_ID > message, then handle all messages at once. > > What do you think? > We should avoid touching the kernel for this. Anyway, my patch doesn't solve the same issue for deleting range elements. In this patch I added the logic in netlink_events_setelem_cb() and probably a better place for this is in the netlink_events_cache_update() routine. I'm sending a patch to add a bit of debugging to the monitor code path meanwhile we solve this issue. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Wed, Jul 12, 2017 at 01:24:24PM +0200, Arturo Borrero Gonzalez wrote: > On 11 July 2017 at 20:11, Phil Sutter <phil@nwl.cc> wrote: > > Hi, > > > > On Thu, Jul 06, 2017 at 04:36:45PM +0200, Arturo Borrero Gonzalez wrote: > >> If you add set elements to interval sets, the output is wrong. > >> Fix this by caching first element of the range (first event), > >> then wait for the second element of the range (second event) to > >> print them both at the same time. > > > > As promised, I am preparing my own solution for side-by-side comparison. > > Though I'm running into problems and want to use the occasion to discuss > > them first: > > > > What I wasn't able to solve yet are half-open ranges, like so: > > > > | nft add set ip t portrange { type inet_service; flags interval; } > > | nft add element ip t portrange { 1024-65535 } > > > > In this case there is only a single element with value 1024 which > > doesn't have EXPR_F_INTERVAL_END set. Looking at > > interval_map_decompose(), this is identified to be a range till the end > > of the scope if it's the last element in the set. > > > > In monitor code though, I can't predict whether an interval end element > > will come afterwards or not, so I end up caching the element and > > everything turns into a mess. I'm pretty sure your solution has the same > > problem, could you check that? > > > > Right now, I only see two ways to get this sorted: > > > > 1) Change kernel code to always include both start end end of a range in > > a single notification. This would eliminate the need for any caching > > in netlink_events_setelem_cb() altogether! > > > > 2) Change monitor code to cache all events until the final NFTA_GEN_ID > > message, then handle all messages at once. > > > > What do you think? > > > > We should avoid touching the kernel for this. I found a userspace solution yesterday: I created a new callback for NFT_MSG_NEWGEN, which will take care of the remaining cached element. Patch follows later. Thanks, Phil -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Arturo, On Thu, Jul 06, 2017 at 04:36:45PM +0200, Arturo Borrero Gonzalez wrote: > If you add set elements to interval sets, the output is wrong. > Fix this by caching first element of the range (first event), > then wait for the second element of the range (second event) to > print them both at the same time. > > We also avoid printing the first null element required in the RB tree. > > Before this patch: > > % nft add element t s {10-20, 30-40} > add element ip t s { 0 } > add element ip t s { 10 } > add element ip t s { ftp } > add element ip t s { 30 } > add element ip t s { 41 } > > After this patch: > > % nft add element t s {10-20, 30-40} > add element ip t s { 10-20 } > add element ip t s { 30-40 } > > CC: Phil Sutter <phil@nwl.cc> > Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org> > --- > > This was discussed during Netfilter Workshop 2017 in Faro, Portugal. > I think Phil has another patch to address this issue from a different > approach. I like this approach, it's simple. Still you mentioned it doesn't work for several cases. One of them are deletions. Also Phil has found a number of corner cases that still need special handling, problems are: 1) nft flush set gets elements in reverse order, we can probably fix this from the kernel (flush set support is quite recent). Or just use the mergesort function to get them sorted out from userspace. 2) Half-open ranges, it's just a single element with the end flag interval set on, in such case, we just print what we have. Anything else? There's a list of threads discussing this split in several patches, it's making it a bit hard to follow you guys ;-). > include/rule.h | 2 ++ > src/netlink.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 52 insertions(+) > > diff --git a/include/rule.h b/include/rule.h > index 7424b21..1b44e4c 100644 > --- a/include/rule.h > +++ b/include/rule.h > @@ -217,6 +217,7 @@ extern struct rule *rule_lookup(const struct chain *chain, uint64_t handle); > * @datalen: mapping data len > * @objtype: mapping object type > * @init: initializer > + * @rg_cache: cached range element (left) > * @policy: set mechanism policy > * @desc: set mechanism desc > */ > @@ -234,6 +235,7 @@ struct set { > unsigned int datalen; > uint32_t objtype; > struct expr *init; > + struct expr *rg_cache; Why not just add this element to the set, instead of caching it here in this new field? I mean, place it in set->init? > uint32_t policy; > struct { > uint32_t size; -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/rule.h b/include/rule.h index 7424b21..1b44e4c 100644 --- a/include/rule.h +++ b/include/rule.h @@ -217,6 +217,7 @@ extern struct rule *rule_lookup(const struct chain *chain, uint64_t handle); * @datalen: mapping data len * @objtype: mapping object type * @init: initializer + * @rg_cache: cached range element (left) * @policy: set mechanism policy * @desc: set mechanism desc */ @@ -234,6 +235,7 @@ struct set { unsigned int datalen; uint32_t objtype; struct expr *init; + struct expr *rg_cache; uint32_t policy; struct { uint32_t size; diff --git a/src/netlink.c b/src/netlink.c index 880502c..ad0e712 100644 --- a/src/netlink.c +++ b/src/netlink.c @@ -2198,6 +2198,46 @@ out: return MNL_CB_OK; } +/* returns true if the event should be ignored (i.e. null element) */ +static bool netlink_event_ignore_range_event(struct nftnl_set_elem *nlse) +{ + uint32_t flags = 0; + + if (nftnl_set_elem_is_set(nlse, NFTNL_SET_ELEM_FLAGS)) + flags = nftnl_set_elem_get_u32(nlse, NFTNL_SET_ELEM_FLAGS); + if (!(flags & NFT_SET_ELEM_INTERVAL_END)) + return false; + + if (nftnl_set_elem_get_u32(nlse, NFTNL_SET_ELEM_KEY) != 0) + return false; + + return true; +} + +/* returns true if the we cached the range element */ +static bool netlink_event_range_cache(struct set *cached_set, + struct set *dummyset) +{ + struct expr *elem; + + /* not an interval ? */ + if (!(cached_set->flags & NFT_SET_INTERVAL)) + return false; + + /* cache first element of the range */ + elem = list_entry(dummyset->init->expressions.prev, struct expr, list); + if (!(elem->flags & EXPR_F_INTERVAL_END)) { + cached_set->rg_cache = expr_clone(elem); + return true; + } + + /* we have all the information now */ + compound_expr_add(dummyset->init, cached_set->rg_cache); + interval_map_decompose(dummyset->init); + + return false; +} + static int netlink_events_setelem_cb(const struct nlmsghdr *nlh, int type, struct netlink_mon_handler *monh) { @@ -2240,6 +2280,11 @@ static int netlink_events_setelem_cb(const struct nlmsghdr *nlh, int type, nlse = nftnl_set_elems_iter_next(nlsei); while (nlse != NULL) { + if (netlink_event_ignore_range_event(nlse)) { + set_free(dummyset); + nftnl_set_elems_iter_destroy(nlsei); + goto out; + } if (netlink_delinearize_setelem(nlse, dummyset) < 0) { set_free(dummyset); nftnl_set_elems_iter_destroy(nlsei); @@ -2251,6 +2296,10 @@ static int netlink_events_setelem_cb(const struct nlmsghdr *nlh, int type, switch (type) { case NFT_MSG_NEWSETELEM: + if (netlink_event_range_cache(set, dummyset)) { + set_free(dummyset); + goto out; + } printf("add "); break; case NFT_MSG_DELSETELEM: @@ -2264,6 +2313,7 @@ static int netlink_events_setelem_cb(const struct nlmsghdr *nlh, int type, expr_print(dummyset->init, monh->ctx->octx); printf("\n"); + expr_free(set->rg_cache); set_free(dummyset); break; case NFTNL_OUTPUT_XML:
If you add set elements to interval sets, the output is wrong. Fix this by caching first element of the range (first event), then wait for the second element of the range (second event) to print them both at the same time. We also avoid printing the first null element required in the RB tree. Before this patch: % nft add element t s {10-20, 30-40} add element ip t s { 0 } add element ip t s { 10 } add element ip t s { ftp } add element ip t s { 30 } add element ip t s { 41 } After this patch: % nft add element t s {10-20, 30-40} add element ip t s { 10-20 } add element ip t s { 30-40 } CC: Phil Sutter <phil@nwl.cc> Signed-off-by: Arturo Borrero Gonzalez <arturo@netfilter.org> --- This was discussed during Netfilter Workshop 2017 in Faro, Portugal. I think Phil has another patch to address this issue from a different approach. include/rule.h | 2 ++ src/netlink.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html