
[nf] netfilter: ctnetlink: remove unnecessary nf_conntrack_expect_lock protection

Message ID 1491056064-19687-1-git-send-email-zlpnobody@163.com
State Accepted
Delegated to: Pablo Neira

Commit Message

Liping Zhang April 1, 2017, 2:14 p.m. UTC
From: Liping Zhang <zlpnobody@gmail.com>

Currently, ctnetlink_change_helper() is always protected by _expect_lock.
This is unnecessary, since the operations are unrelated to _expect_lock.

Also this will cause a deadlock when deleting the helper from a conntrack,
as _expect_lock will be locked again by nf_ct_remove_expectations():

         CPU0
        ----
  lock(nf_conntrack_expect_lock);
  lock(nf_conntrack_expect_lock);

  *** DEADLOCK ***
  May be due to missing lock nesting notation

  2 locks held by lt-conntrack_gr/12853:
  #0:  (&table[i].mutex){+.+.+.}, at: [<ffffffffa05e2009>]
       nfnetlink_rcv_msg+0x399/0x6a9 [nfnetlink]
  #1:  (nf_conntrack_expect_lock){+.....}, at: [<ffffffffa05f2c1f>]
       ctnetlink_new_conntrack+0x17f/0x408 [nf_conntrack_netlink]

  Call Trace:
   dump_stack+0x85/0xc2
   __lock_acquire+0x1608/0x1680
   ? ctnetlink_parse_tuple_proto+0x10f/0x1c0 [nf_conntrack_netlink]
   lock_acquire+0x100/0x1f0
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   _raw_spin_lock_bh+0x3f/0x50
   ? nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   nf_ct_remove_expectations+0x32/0x90 [nf_conntrack]
   ctnetlink_change_helper+0xc6/0x190 [nf_conntrack_netlink]
   ctnetlink_new_conntrack+0x1b2/0x408 [nf_conntrack_netlink]
   nfnetlink_rcv_msg+0x60a/0x6a9 [nfnetlink]
   ? nfnetlink_rcv_msg+0x1b9/0x6a9 [nfnetlink]
   ? nfnetlink_bind+0x1a0/0x1a0 [nfnetlink]
   netlink_rcv_skb+0xa4/0xc0
   nfnetlink_rcv+0x87/0x770 [nfnetlink]

So remove these _expect_lock calls now.

Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
---
 net/netfilter/nf_conntrack_netlink.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

Comments

Pablo Neira Ayuso April 8, 2017, 9:16 p.m. UTC | #1
On Sat, Apr 01, 2017 at 10:14:24PM +0800, Liping Zhang wrote:
> @@ -1960,9 +1955,7 @@ static int ctnetlink_new_conntrack(struct net *net, struct sock *ctnl,
>  	err = -EEXIST;
>  	ct = nf_ct_tuplehash_to_ctrack(h);
>  	if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
> -		spin_lock_bh(&nf_conntrack_expect_lock);
>  		err = ctnetlink_change_conntrack(ct, cda);
> -		spin_unlock_bh(&nf_conntrack_expect_lock);

We used to have a central spinlock here.

        spin_lock_bh(&nf_conntrack_lock);

that was removed some time ago, so this got converted to use
nf_conntrack_expect_lock.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liping Zhang April 9, 2017, 4:21 a.m. UTC | #2
Hi Pablo,

2017-04-09 5:16 GMT+08:00 Pablo Neira Ayuso <pablo@netfilter.org>:
> On Sat, Apr 01, 2017 at 10:14:24PM +0800, Liping Zhang wrote:
>> @@ -1960,9 +1955,7 @@ static int ctnetlink_new_conntrack(struct net *net, struct sock *ctnl,
>>       err = -EEXIST;
>>       ct = nf_ct_tuplehash_to_ctrack(h);
>>       if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
>> -             spin_lock_bh(&nf_conntrack_expect_lock);
>>               err = ctnetlink_change_conntrack(ct, cda);
>> -             spin_unlock_bh(&nf_conntrack_expect_lock);
>
> We used to have a central spinlock here.
>
>         spin_lock_bh(&nf_conntrack_lock);
>
> that was removed some time ago, so this got converted to use
> nf_conntrack_expect_lock.

This patch should add:

Fixes: ca7433df3a67 ("netfilter: conntrack: seperate expect locking from nf_conntrack_lock")

Commit ca7433df3a67 added spin_lock_bh(&nf_conntrack_expect_lock) to
nf_ct_remove_expectations(), but we also take _expect_lock before calling
ctnetlink_change_conntrack(), so a deadlock will happen:

 spin_lock_bh(&nf_conntrack_expect_lock):
->err = ctnetlink_change_conntrack(ct, cda)
-->ctnetlink_change_helper
--->if (!strcmp(helpname, "")) nf_ct_remove_expectations()
---->spin_lock_bh(&nf_conntrack_expect_lock); // takes _expect_lock again: deadlock!

Since ctnetlink_change_conntrack() is unrelated to nf_conntrack_expect_lock,
removing the lock fixes this issue.
Pablo Neira Ayuso April 10, 2017, 12:02 p.m. UTC | #3
On Sun, Apr 09, 2017 at 12:21:22PM +0800, Liping Zhang wrote:
> Hi Pablo,
> 
> 2017-04-09 5:16 GMT+08:00 Pablo Neira Ayuso <pablo@netfilter.org>:
> > On Sat, Apr 01, 2017 at 10:14:24PM +0800, Liping Zhang wrote:
> >> @@ -1960,9 +1955,7 @@ static int ctnetlink_new_conntrack(struct net *net, struct sock *ctnl,
> >>       err = -EEXIST;
> >>       ct = nf_ct_tuplehash_to_ctrack(h);
> >>       if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
> >> -             spin_lock_bh(&nf_conntrack_expect_lock);
> >>               err = ctnetlink_change_conntrack(ct, cda);
> >> -             spin_unlock_bh(&nf_conntrack_expect_lock);
> >
> > We used to have a central spinlock here.
> >
> >         spin_lock_bh(&nf_conntrack_lock);
> >
> > that was removed some time ago, so this got converted to use
> > nf_conntrack_expect_lock.
> 
> This patch should add:
> 
> Fixes: ca7433df3a67 ("netfilter: conntrack: seperate expect locking from nf_conntrack_lock")
> 
> Commit ca7433df3a67 added spin_lock_bh(&nf_conntrack_expect_lock) to
> nf_ct_remove_expectations(), but we also take _expect_lock before calling
> ctnetlink_change_conntrack(), so a deadlock will happen:
> 
>  spin_lock_bh(&nf_conntrack_expect_lock):
> ->err = ctnetlink_change_conntrack(ct, cda)
> -->ctnetlink_change_helper
> --->if (!strcmp(helpname, "")) nf_ct_remove_expectations()
> ---->spin_lock_bh(&nf_conntrack_expect_lock); // takes _expect_lock again: deadlock!

I agree this is fixing the deadlock but see below.

> Since ctnetlink_change_conntrack() is unrelated to nf_conntrack_expect_lock,
> removing the lock fixes this issue.

But packets may be updating a conntrack at the same time that we're
mangling via ctnetlink, right?
Liping Zhang April 10, 2017, 1:53 p.m. UTC | #4
Hi Pablo,

2017-04-10 20:02 GMT+08:00 Pablo Neira Ayuso <pablo@netfilter.org>:
[...]
>> Since ctnetlink_change_conntrack() is unrelated to nf_conntrack_expect_lock,
>> removing the lock fixes this issue.
>
> But packets may be updating a conntrack at the same time that we're
> mangling via ctnetlink, right?

Yes, but the packet processing path uses rcu_read_lock(), so taking
spin_lock_bh(&nf_conntrack_expect_lock) here won't help.

As a quick summary (just for reference):
1. For CTA_TIMEOUT, there's no problem.
2. For CTA_MARK, no problem either.
3. For CTA_PROTOINFO, spin_lock_bh(&ct->lock) will be held, so no problem either.
4. For CTA_LABELS, it may race with the packet path, but that does not seem
   to be a big problem.
5. For CTA_SEQ_ADJ_ORIG... we should hold &ct->lock when updating seqadj
   (this one requires a new patch).
6. For CTA_HELP, updating helpinfo may be a problem (I am not sure about
   this part).
7. For CTA_STATUS, I think it may cause a big problem: the bit set operation
   via ctnetlink_change_status is not atomic, so it may clear IPS_DYING_BIT.
   For example:
    CPU0 (update CTA_STATUS)          CPU1 (packet path, set _DYING_)
    ctnetlink_change_status           --
    olds = ct->status                 --
    --                                set_bit(IPS_DYING_BIT...
    ct->status = olds | new status    --> Here DYING_BIT will be cleared!

But I think we can convert
"ct->status |= status & ~(IPS_NAT_DONE_MASK | IPS_NAT_MASK);"
to a series of atomic bit set operations to solve the 7th issue.

And none of the issues listed above is solved by holding _expect_lock,
so I think we should get rid of _expect_lock first.
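
The lost update described in item 7 can be replayed deterministically in a
single thread. This is a hedged sketch, not kernel code: the flag values and
function names are illustrative assumptions, and C11 atomic_fetch_or() stands
in for the kernel's atomic bit operations such as set_bit().

```c
#include <assert.h>
#include <stdatomic.h>

/* Illustrative flag values; only the fact that they differ matters here. */
#define STATUS_ASSURED	(1UL << 2)
#define STATUS_DYING	(1UL << 9)

/* Replays the interleaving from the table with a plain load/store pair:
 * the writer saves the old status, the packet path sets DYING, and the
 * writer then stores a value computed from the stale copy. */
static unsigned long lossy_update(unsigned long initial, unsigned long new_bits)
{
	unsigned long status = initial;
	unsigned long olds;

	olds = status;			/* CPU0: olds = ct->status */
	status |= STATUS_DYING;		/* CPU1: set_bit(IPS_DYING_BIT, ...) */
	status = olds | new_bits;	/* CPU0: overwrites, DYING is lost */
	return status;
}

/* Same two updates, each done as one indivisible read-modify-write.
 * Whatever order they run in, neither can undo the other's bits. */
static unsigned long atomic_update(unsigned long initial, unsigned long new_bits)
{
	atomic_ulong status;

	atomic_init(&status, initial);
	atomic_fetch_or(&status, STATUS_DYING);	/* CPU1 */
	atomic_fetch_or(&status, new_bits);	/* CPU0 */
	return atomic_load(&status);
}
```

With an initial status of 0 and STATUS_ASSURED as the new bits,
lossy_update() returns a value without STATUS_DYING, while atomic_update()
keeps both bits set.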
Pablo Neira Ayuso April 13, 2017, 9:52 p.m. UTC | #5
On Sat, Apr 01, 2017 at 10:14:24PM +0800, Liping Zhang wrote:
> From: Liping Zhang <zlpnobody@gmail.com>
> 
> Currently, ctnetlink_change_helper() is always protected by _expect_lock.
> This is unnecessary, since the operations are unrelated to _expect_lock.
> 
> Also this will cause a deadlock when deleting the helper from a conntrack,
> as _expect_lock will be locked again by nf_ct_remove_expectations():

Applied.
Pablo Neira Ayuso April 13, 2017, 10:03 p.m. UTC | #6
On Mon, Apr 10, 2017 at 09:53:17PM +0800, Liping Zhang wrote:
> Hi Pablo,
>
> 2017-04-10 20:02 GMT+08:00 Pablo Neira Ayuso <pablo@netfilter.org>:
> [...]
> >> Since ctnetlink_change_conntrack() is unrelated to nf_conntrack_expect_lock,
> >> removing the lock fixes this issue.
> >
> > But packets may be updating a conntrack at the same time that we're
> > mangling via ctnetlink, right?
>
> Yes, but the packet processing path uses rcu_read_lock(), so taking
> spin_lock_bh(&nf_conntrack_expect_lock) here won't help.
>
> As a quick summary (just for reference):
> 1. For CTA_TIMEOUT, there's no problem.
> 2. For CTA_MARK, no problem either.
> 3. For CTA_PROTOINFO, spin_lock_bh(&ct->lock) will be held, so no problem either.
> 4. For CTA_LABELS, it may race with the packet path, but that does not seem
>    to be a big problem.
> 5. For CTA_SEQ_ADJ_ORIG... we should hold &ct->lock when updating seqadj
>    (this one requires a new patch).
> 6. For CTA_HELP, updating helpinfo may be a problem (I am not sure about
>    this part).
> 7. For CTA_STATUS, I think it may cause a big problem: the bit set operation
>    via ctnetlink_change_status is not atomic, so it may clear IPS_DYING_BIT.
>    For example:
>     CPU0 (update CTA_STATUS)          CPU1 (packet path, set _DYING_)
>     ctnetlink_change_status           --
>     olds = ct->status                 --
>     --                                set_bit(IPS_DYING_BIT...
>     ct->status = olds | new status    --> Here DYING_BIT will be cleared!
>
> But I think we can convert
> "ct->status |= status & ~(IPS_NAT_DONE_MASK | IPS_NAT_MASK);"
> to a series of atomic bit set operations to solve the 7th issue.
>
> And none of the issues listed above is solved by holding _expect_lock,
> so I think we should get rid of _expect_lock first.

I'm tossing this. I would like to see a patch series to address all
issues with conntrack updates in one go.

When the central spinlock was removed, this was incorrectly
converted to be safe. It has been broken ever since.

The fix should refer to commit ca7433df3a67.

Patch

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 7b83bbf..f776314 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1514,14 +1514,9 @@  static int ctnetlink_change_helper(struct nf_conn *ct,
 					    nf_ct_protonum(ct));
 	if (helper == NULL) {
 #ifdef CONFIG_MODULES
-		spin_unlock_bh(&nf_conntrack_expect_lock);
-
-		if (request_module("nfct-helper-%s", helpname) < 0) {
-			spin_lock_bh(&nf_conntrack_expect_lock);
+		if (request_module("nfct-helper-%s", helpname) < 0)
 			return -EOPNOTSUPP;
-		}
 
-		spin_lock_bh(&nf_conntrack_expect_lock);
 		helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
 						    nf_ct_protonum(ct));
 		if (helper)
@@ -1960,9 +1955,7 @@  static int ctnetlink_new_conntrack(struct net *net, struct sock *ctnl,
 	err = -EEXIST;
 	ct = nf_ct_tuplehash_to_ctrack(h);
 	if (!(nlh->nlmsg_flags & NLM_F_EXCL)) {
-		spin_lock_bh(&nf_conntrack_expect_lock);
 		err = ctnetlink_change_conntrack(ct, cda);
-		spin_unlock_bh(&nf_conntrack_expect_lock);
 		if (err == 0) {
 			nf_conntrack_eventmask_report((1 << IPCT_REPLY) |
 						      (1 << IPCT_ASSURED) |
@@ -2357,11 +2350,7 @@  ctnetlink_glue_parse(const struct nlattr *attr, struct nf_conn *ct)
 	if (ret < 0)
 		return ret;
 
-	spin_lock_bh(&nf_conntrack_expect_lock);
-	ret = ctnetlink_glue_parse_ct((const struct nlattr **)cda, ct);
-	spin_unlock_bh(&nf_conntrack_expect_lock);
-
-	return ret;
+	return ctnetlink_glue_parse_ct((const struct nlattr **)cda, ct);
 }
 
 static int ctnetlink_glue_exp_parse(const struct nlattr * const *cda,