Message ID | 20211209163926.25563-1-fw@strlen.de |
---|---|
State | Accepted |
Delegated to: | Pablo Neira |
Headers | show |
Series | [nf] netfilter: ctnetlink: remove expired entries first | expand |
On Thu, Dec 9, 2021 at 4:39 PM Florian Westphal <fw@strlen.de> wrote: > > When dumping conntrack table to userspace via ctnetlink, check if the ct has > already expired before doing any of the 'skip' checks. > > This expires dead entries faster. > /proc handler also removes outdated entries first. > > Reported-by: Vitaly Zuevsky <vzuevsky@ns1.com> > Signed-off-by: Florian Westphal <fw@strlen.de> > --- > Vitaly, I suspect this might be related to the issue you reported, > I suspect we skip the NAT-clash entries instead of evicting them from > ctnetlink path too. > > net/netfilter/nf_conntrack_netlink.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > index 81d03acf68d4..ec4164c32d27 100644 > --- a/net/netfilter/nf_conntrack_netlink.c > +++ b/net/netfilter/nf_conntrack_netlink.c > @@ -1195,8 +1195,6 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) > } > hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[cb->args[0]], > hnnode) { > - if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) > - continue; > ct = nf_ct_tuplehash_to_ctrack(h); > if (nf_ct_is_expired(ct)) { > if (i < ARRAY_SIZE(nf_ct_evict) && > @@ -1208,6 +1206,9 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) > if (!net_eq(net, nf_ct_net(ct))) > continue; > > + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) > + continue; > + > if (cb->args[1]) { > if (ct != last) > continue; > -- > 2.32.0 > Florian, thanks for prompt turnaround on this. Seeing conntrack -C 107530 mandates the check what flows consume this many entries. I cannot do this if conntrack -L skips anything while kernel defaults to not exposing conntrack table via /proc. This server is not supposed to NAT anything by the way. We use some -j NOTRACK rules and I'd like to see what flows evade those rules and consume so many slots. Could you perhaps recommend a way to achieve this other than reconfiguring the kernel?
Vitaly Zuevsky <vzuevsky@ns1.com> wrote: > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > > index 81d03acf68d4..ec4164c32d27 100644 > > --- a/net/netfilter/nf_conntrack_netlink.c > > +++ b/net/netfilter/nf_conntrack_netlink.c > > @@ -1195,8 +1195,6 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) > > } > > hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[cb->args[0]], > > hnnode) { > > - if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) > > - continue; > > ct = nf_ct_tuplehash_to_ctrack(h); > > if (nf_ct_is_expired(ct)) { > > if (i < ARRAY_SIZE(nf_ct_evict) && > > @@ -1208,6 +1206,9 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) > > if (!net_eq(net, nf_ct_net(ct))) > > continue; > > > > + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) > > + continue; > > + > > if (cb->args[1]) { > > if (ct != last) > > continue; > > -- > > 2.32.0 > > > > Florian, thanks for prompt turnaround on this. Seeing > conntrack -C > 107530 > mandates the check what flows consume this many entries. I cannot do > this if conntrack -L skips anything while kernel defaults to not > exposing conntrack table via /proc. This server is not supposed to NAT > anything by the way. Then this patch doesn't change anything. Maybe 'conntrack -L unconfirmed' or 'conntrack -L dying' show something?
On Thu, Dec 9, 2021 at 5:11 PM Florian Westphal <fw@strlen.de> wrote: > > > -- > > > 2.32.0 > > > > > > > Florian, thanks for prompt turnaround on this. Seeing > > conntrack -C > > 107530 > > mandates the check what flows consume this many entries. I cannot do > > this if conntrack -L skips anything while kernel defaults to not > > exposing conntrack table via /proc. This server is not supposed to NAT > > anything by the way. > > Then this patch doesn't change anything. > > Maybe 'conntrack -L unconfirmed' or 'conntrack -L dying' show something? Are you saying that was a patch? v2.32.0? Mind sharing a link for downloading the source and/or packaged release? I would like to test it just in case, and if no luck, what do i do to file it as a bug?
On Thu, Dec 09, 2021 at 05:39:26PM +0100, Florian Westphal wrote: > When dumping conntrack table to userspace via ctnetlink, check if the ct has > already expired before doing any of the 'skip' checks. > > This expires dead entries faster. > /proc handler also removes outdated entries first. Applied, thanks
Hi Florian Do you have any news on this? Meanwhile I cloned the repo git://git.netfilter.org/conntrack-tools, ran ./autogen.sh to produce configure, and the latter failed with: checking for rpc/rpc_msg.h... yes ./configure: line 13329: syntax error near unexpected token `LIBTIRPC,' ./configure: line 13329: ` PKG_CHECK_MODULES(LIBTIRPC, libtirpc >= 0.1)' Interestingly, PKG_CHECK_MODULES was never defined there. Is that repository for production code - I am confused? Thank you. Br Vitaly On Thu, Dec 9, 2021 at 6:23 PM Vitaly Zuevsky <vzuevsky@ns1.com> wrote: > > On Thu, Dec 9, 2021 at 5:11 PM Florian Westphal <fw@strlen.de> wrote: > > > > -- > > > > 2.32.0 > > > > > > > > > > Florian, thanks for prompt turnaround on this. Seeing > > > conntrack -C > > > 107530 > > > mandates the check what flows consume this many entries. I cannot do > > > this if conntrack -L skips anything while kernel defaults to not > > > exposing conntrack table via /proc. This server is not supposed to NAT > > > anything by the way. > > > > Then this patch doesn't change anything. > > > > Maybe 'conntrack -L unconfirmed' or 'conntrack -L dying' show something? > > Are you saying that was a patch? v2.32.0? Mind sharing a link for > downloading the source and/or packaged release? > I would like to test it just in case, and if no luck, what do i do to > file it as a bug?
Vitaly Zuevsky <vzuevsky@ns1.com> wrote: > Hi Florian > > Do you have any news on this? > Meanwhile I cloned the repo git://git.netfilter.org/conntrack-tools, > ran ./autogen.sh to produce configure, and the latter failed with: > > checking for rpc/rpc_msg.h... yes > ./configure: line 13329: syntax error near unexpected token `LIBTIRPC,' > ./configure: line 13329: ` PKG_CHECK_MODULES(LIBTIRPC, libtirpc >= 0.1)' > > Interestingly, PKG_CHECK_MODULES was never defined there. Is that > repository for production code - I am confused? Sure. But the patch is for the kernel. I already mentioned that this doesn't handle anything for non-nat case. > > > Maybe 'conntrack -L unconfirmed' or 'conntrack -L dying' show something? Still stands. Also, is there really a discrepancy? Please show output of conntrack -C conntrack -L | wc -l conntrack -C "conntrack -L" reclaims dead/timed-out entries, conntrack -C currently does not.
On Fri, Dec 17, 2021 at 7:04 PM Florian Westphal <fw@strlen.de> wrote: > Sure. But the patch is for the kernel. > I already mentioned that this doesn't handle anything for non-nat case. > > > > > Maybe 'conntrack -L unconfirmed' or 'conntrack -L dying' show something? > > Still stands. > > Also, is there really a discrepancy? Please show output of > > conntrack -C > conntrack -L | wc -l > conntrack -C > > "conntrack -L" reclaims dead/timed-out entries, conntrack -C currently > does not. Of course... It is an order of magnitude difference: # conntrack -L unconfirmed conntrack v1.4.4 (conntrack-tools): 0 flow entries have been shown. # conntrack -L dying conntrack v1.4.4 (conntrack-tools): 0 flow entries have been shown. # conntrack -C 88064 # conntrack -L | wc -l conntrack v1.4.4 (conntrack-tools): 7641 flow entries have been shown. 7641 # conntrack -C 87706 # cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=18.04 DISTRIB_CODENAME=bionic DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
Hi Florian Further to our investigation, I have identified the cause of confusion: # conntrack -C 141412 # conntrack -L >/dev/null conntrack v1.4.4 (conntrack-tools): 8342 flow entries have been shown. # conntrack -f ipv6 -L >/dev/null conntrack v1.4.4 (conntrack-tools): 124358 flow entries have been shown. so -C option implicitly sums up ipv4 and ipv6 counts, and -L option defaults to ipv4. This is neither a consistent UX practice nor properly documented. I am happy to bring it up - Christmas present xD
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 81d03acf68d4..ec4164c32d27 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1195,8 +1195,6 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) } hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[cb->args[0]], hnnode) { - if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) - continue; ct = nf_ct_tuplehash_to_ctrack(h); if (nf_ct_is_expired(ct)) { if (i < ARRAY_SIZE(nf_ct_evict) && @@ -1208,6 +1206,9 @@ ctnetlink_dump_table(struct sk_buff *skb, struct netlink_callback *cb) if (!net_eq(net, nf_ct_net(ct))) continue; + if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL) + continue; + if (cb->args[1]) { if (ct != last) continue;
When dumping conntrack table to userspace via ctnetlink, check if the ct has already expired before doing any of the 'skip' checks. This expires dead entries faster. /proc handler also removes outdated entries first. Reported-by: Vitaly Zuevsky <vzuevsky@ns1.com> Signed-off-by: Florian Westphal <fw@strlen.de> --- Vitaly, I suspect this might be related to the issue you reported, I suspect we skip the NAT-clash entries instead of evicting them from ctnetlink path too. net/netfilter/nf_conntrack_netlink.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)