diff mbox

[nf-next,2/3] netfilter: nf_ct_helper: use nf_ct_iterate_cleanup to unlink helper objs

Message ID 1495345149-57674-3-git-send-email-zlpnobody@163.com
State Changes Requested
Delegated to: Pablo Neira
Headers show

Commit Message

Liping Zhang May 21, 2017, 5:39 a.m. UTC
From: Liping Zhang <zlpnobody@gmail.com>

When we unlink the helper objects, we will iterate the nf_conntrack_hash,
iterate the unconfirmed list, handle the hash resize situation, etc.

Actually this logic is same as the nf_ct_iterate_cleanup, so we can use it
to remove these copy & paste codes.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
---
 net/netfilter/nf_conntrack_helper.c | 46 ++++---------------------------------
 1 file changed, 4 insertions(+), 42 deletions(-)

Comments

Florian Westphal May 21, 2017, 8:15 a.m. UTC | #1
Liping Zhang <zlpnobody@163.com> wrote:
> From: Liping Zhang <zlpnobody@gmail.com>
> 
> When we unlink the helper objects, we will iterate the nf_conntrack_hash,
> iterate the unconfirmed list, handle the hash resize situation, etc.
> 
> Actually this logic is same as the nf_ct_iterate_cleanup, so we can use it
> to remove these copy & paste codes.

Agree, this is a good idea.  However:

> --- a/net/netfilter/nf_conntrack_helper.c
> +++ b/net/netfilter/nf_conntrack_helper.c
> @@ -274,16 +274,16 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
>  EXPORT_SYMBOL_GPL(__nf_ct_try_assign_helper);
>  
>  /* appropriate ct lock protecting must be taken by caller */
> -static inline int unhelp(struct nf_conntrack_tuple_hash *i,
> -			 const struct nf_conntrack_helper *me)
> +static int unhelp(struct nf_conn *ct, void *me)
>  {
> -	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(i);
>  	struct nf_conn_help *help = nfct_help(ct);
>  
>  	if (help && rcu_dereference_raw(help->helper) == me) {
>  		nf_conntrack_event(IPCT_HELPER, ct);
>  		RCU_INIT_POINTER(help->helper, NULL);
>  	}

this is broken for unconfirmed conntracks, as
other cpu can reallocate the extension area.

For the module removal case, we have no choice but to toss the
unconfirmed conntracks.

Same for patch #3.

I plan to submit my patches soon, perhaps its best if I only
submit the first couple of patches so you can rebase on top of that?

Alternatively, I'm fine if your patches go in first, I can also
just rebase on top.

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liping Zhang May 21, 2017, 10:17 a.m. UTC | #2
Hi Florian,

2017-05-21 16:15 GMT+08:00 Florian Westphal <fw@strlen.de>:
[...]
> this is broken for unconfirmed conntracks, as
> other cpu can reallocate the extension area.

Right, I missed this point, thanks for your reminder.

> For the module removal case, we have no choice but to toss the
> unconfirmed conntracks.
>
> Same for patch #3.
>
> I plan to submit my patches soon, perhaps its best if I only
> submit the first couple of patches so you can rebase on top of that?

I read your nfct_iterate_cleanup_15 patch series just now.
Your patch set did more jobs, also including all the jobs which
my patch set did. :)

I think it's better to do these things together, so I'm fine if you
can mark my patch set as Superseded. :)

> Alternatively, I'm fine if your patches go in first, I can also
> just rebase on top.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 21, 2017, 10:31 a.m. UTC | #3
Liping Zhang <zlpnobody@gmail.com> wrote:
> Hi Florian,
> 
> 2017-05-21 16:15 GMT+08:00 Florian Westphal <fw@strlen.de>:
> [...]
> > this is broken for unconfirmed conntracks, as
> > other cpu can reallocate the extension area.
> 
> Right, I missed this point, thanks for your reminder.
> 
> > For the module removal case, we have no choice but to toss the
> > unconfirmed conntracks.
> >
> > Same for patch #3.
> >
> > I plan to submit my patches soon, perhaps its best if I only
> > submit the first couple of patches so you can rebase on top of that?
> 
> I read your nfct_iterate_cleanup_15 patch series just now.
> Your patch set did more jobs, also including all the jobs which
> my patch set did. :)
> 
> I think it's better to do these things together, so I'm fine if you
> can mark my patch set as Superseded. :)

What about this: I will submit first half of my patches, then you can
rebase your two patches on top and send them, then I can rebase again
the rest.  What do you think?

BTW, I found another bug just now, but I don't have time to address it
right now:

nf_nat_proto_clean() does:

ct->status &= ~IPS_NAT_DONE_MASK;

Thats also broken(racy).  We have to audit all the non-atomic writes of
ct->status and change them to set/clear_bit()...

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liping Zhang May 21, 2017, 11:05 a.m. UTC | #4
Hi Florian,

2017-05-21 18:31 GMT+08:00 Florian Westphal <fw@strlen.de>:
> Liping Zhang <zlpnobody@gmail.com> wrote:
>> Hi Florian,
>>
>> 2017-05-21 16:15 GMT+08:00 Florian Westphal <fw@strlen.de>:
>> [...]
>> > this is broken for unconfirmed conntracks, as
>> > other cpu can reallocate the extension area.
>>
>> Right, I missed this point, thanks for your reminder.
>>
>> > For the module removal case, we have no choice but to toss the
>> > unconfirmed conntracks.
>> >
>> > Same for patch #3.
>> >
>> > I plan to submit my patches soon, perhaps its best if I only
>> > submit the first couple of patches so you can rebase on top of that?
>>
>> I read your nfct_iterate_cleanup_15 patch series just now.
>> Your patch set did more jobs, also including all the jobs which
>> my patch set did. :)
>>
>> I think it's better to do these things together, so I'm fine if you
>> can mark my patch set as Superseded. :)
>
> What about this: I will submit first half of my patches, then you can
> rebase your two patches on top and send them, then I can rebase again
> the rest.  What do you think?

OK. Fine with me.

> BTW, I found another bug just now, but I don't have time to address it
> right now:
>
> nf_nat_proto_clean() does:
>
> ct->status &= ~IPS_NAT_DONE_MASK;

Yes, here we should use clear_bit(IPS_SRC_NAT_DONE_BIT, &ct->status);
(For IPS_DST_NAT_DONE, we don't care about it, so we can
leave it unchanged.)

> Thats also broken(racy).  We have to audit all the non-atomic writes of
> ct->status and change them to set/clear_bit()...

I audited the related codes just now, this seems to be the last
ct->status writer which use non-atomic bit operation(of course,
except these unconfirmed ct->status writer).

I will have a further and closer check. If you are not opposed to,
I can send a related patch to fix this. :)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Florian Westphal May 21, 2017, 11:10 a.m. UTC | #5
Liping Zhang <zlpnobody@gmail.com> wrote:
> Yes, here we should use clear_bit(IPS_SRC_NAT_DONE_BIT, &ct->status);
> (For IPS_DST_NAT_DONE, we don't care about it, so we can
> leave it unchanged.)

Oh, right.

> > Thats also broken(racy).  We have to audit all the non-atomic writes of
> > ct->status and change them to set/clear_bit()...
> 
> I audited the related codes just now, this seems to be the last
> ct->status writer which use non-atomic bit operation(of course,
> except these unconfirmed ct->status writer).
> 
> I will have a further and closer check. If you are not opposed to,
> I can send a related patch to fix this. :)

That would be great, thanks Liping!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 3a60efa..32cd36d 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -274,16 +274,16 @@  int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
 EXPORT_SYMBOL_GPL(__nf_ct_try_assign_helper);
 
 /* appropriate ct lock protecting must be taken by caller */
-static inline int unhelp(struct nf_conntrack_tuple_hash *i,
-			 const struct nf_conntrack_helper *me)
+static int unhelp(struct nf_conn *ct, void *me)
 {
-	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(i);
 	struct nf_conn_help *help = nfct_help(ct);
 
 	if (help && rcu_dereference_raw(help->helper) == me) {
 		nf_conntrack_event(IPCT_HELPER, ct);
 		RCU_INIT_POINTER(help->helper, NULL);
 	}
+
+	/* We are not intended to delete this conntrack. */
 	return 0;
 }
 
@@ -425,32 +425,10 @@  int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_helper_register);
 
-static void __nf_conntrack_helper_unregister(struct nf_conntrack_helper *me,
-					     struct net *net)
-{
-	struct nf_conntrack_tuple_hash *h;
-	const struct hlist_nulls_node *nn;
-	int cpu;
-
-	/* Get rid of expecteds, set helpers to NULL. */
-	for_each_possible_cpu(cpu) {
-		struct ct_pcpu *pcpu = per_cpu_ptr(net->ct.pcpu_lists, cpu);
-
-		spin_lock_bh(&pcpu->lock);
-		hlist_nulls_for_each_entry(h, nn, &pcpu->unconfirmed, hnnode)
-			unhelp(h, me);
-		spin_unlock_bh(&pcpu->lock);
-	}
-}
-
 void nf_conntrack_helper_unregister(struct nf_conntrack_helper *me)
 {
-	struct nf_conntrack_tuple_hash *h;
 	struct nf_conntrack_expect *exp;
 	const struct hlist_node *next;
-	const struct hlist_nulls_node *nn;
-	unsigned int last_hsize;
-	spinlock_t *lock;
 	struct net *net;
 	unsigned int i;
 
@@ -481,24 +459,8 @@  void nf_conntrack_helper_unregister(struct nf_conntrack_helper *me)
 
 	rtnl_lock();
 	for_each_net(net)
-		__nf_conntrack_helper_unregister(me, net);
+		nf_ct_iterate_cleanup(net, unhelp, me, 0, 0);
 	rtnl_unlock();
-
-	local_bh_disable();
-restart:
-	last_hsize = nf_conntrack_htable_size;
-	for (i = 0; i < last_hsize; i++) {
-		lock = &nf_conntrack_locks[i % CONNTRACK_LOCKS];
-		nf_conntrack_lock(lock);
-		if (last_hsize != nf_conntrack_htable_size) {
-			spin_unlock(lock);
-			goto restart;
-		}
-		hlist_nulls_for_each_entry(h, nn, &nf_conntrack_hash[i], hnnode)
-			unhelp(h, me);
-		spin_unlock(lock);
-	}
-	local_bh_enable();
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_helper_unregister);