Message ID | 1416870887-5285-2-git-send-email-pablo@netfilter.org |
---|---|
State | Accepted |
Delegated to: | Pablo Neira |
Headers | show |
On Tue, 25 Nov 2014 00:14:47 +0100 Pablo Neira Ayuso <pablo@netfilter.org> wrote: > Commit 5195c14c8b27c ("netfilter: conntrack: fix race in > __nf_conntrack_confirm against get_next_corpse") aimed to resolve the > race condition between the confirmation (packet path) and the flush > command (from control plane). However, it introduced a crash when > several packets race to add a new conntrack, which seems easier to > reproduce when nf_queue is in place. > > Fix this race, in __nf_conntrack_confirm(), by removing the CT > from unconfirmed list before checking the DYING bit. In case > race occured, re-add the CT to the dying list Re-adding it (to some list) case the hash race occurs should fix the problem (the reverted patch introduced). > This patch also changes the verdict from NF_ACCEPT to NF_DROP when > we lose race. Basically, the confirmation happens for the first packet > that we see in a flow. If you just invoked conntrack -F once (which > should be the common case), then this is likely to be the first packet > of the flow (unless you already called flush anytime soon in the past). > This should be hard to trigger, but better drop this packet, otherwise > we leave things in inconsistent state since the destination will likely > reply to this packet, but it will find no conntrack, unless the origin > retransmits. > > The change of the verdict has been discussed in: > https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2 > > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> > --- > Note: I will check and re-test this again tomorrow morning with fresh > mind, compile tested only. > > net/netfilter/nf_conntrack_core.c | 20 +++++++++----------- > 1 file changed, 9 insertions(+), 11 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c > index 5016a69..c588012 100644 > --- a/net/netfilter/nf_conntrack_core.c > +++ b/net/netfilter/nf_conntrack_core.c > @@ -611,16 +611,15 @@ __nf_conntrack_confirm(struct sk_buff *skb) > */ > NF_CT_ASSERT(!nf_ct_is_confirmed(ct)); > pr_debug("Confirming conntrack %p\n", ct); > - /* We have to check the DYING flag inside the lock to prevent > - a race against nf_ct_get_next_corpse() possibly called from > - user context, else we insert an already 'dead' hash, blocking > - further use of that particular connection -JM */ > + /* We have to check the DYING flag after unlink to prevent > + * a race against nf_ct_get_next_corpse() possibly called from > + * user context, else we insert an already 'dead' hash, blocking > + * further use of that particular connection -JM. > + */ > + nf_ct_del_from_dying_or_unconfirmed_list(ct); > > - if (unlikely(nf_ct_is_dying(ct))) { > - nf_conntrack_double_unlock(hash, reply_hash); > - local_bh_enable(); > - return NF_ACCEPT; > - } > + if (unlikely(nf_ct_is_dying(ct))) > + goto out; > > /* See if there's one in the list already, including reverse: > NAT could have grabbed it without realizing, since we're > @@ -636,8 +635,6 @@ __nf_conntrack_confirm(struct sk_buff *skb) > zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h))) > goto out; > > - nf_ct_del_from_dying_or_unconfirmed_list(ct); > - > /* Timer relative to confirmation time, not original > setting time, otherwise we'd get timer wrap in > weird delay cases. */ > @@ -673,6 +670,7 @@ __nf_conntrack_confirm(struct sk_buff *skb) > return NF_ACCEPT; > > out: > + nf_ct_add_to_dying_list(ct); > nf_conntrack_double_unlock(hash, reply_hash); > NF_CT_STAT_INC(net, insert_failed); > local_bh_enable(); The change looks good to me. And we only re-add it to dying_list in the goto "out" error cases. I know Florian had some ideas of howto get rid of the unconfirmed list all together (which would be a fairly good optimization), but I guess we want a fairly simple patch for just fixing the bug first? Thanks for taking care of this Pablo.
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 5016a69..c588012 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -611,16 +611,15 @@ __nf_conntrack_confirm(struct sk_buff *skb) */ NF_CT_ASSERT(!nf_ct_is_confirmed(ct)); pr_debug("Confirming conntrack %p\n", ct); - /* We have to check the DYING flag inside the lock to prevent - a race against nf_ct_get_next_corpse() possibly called from - user context, else we insert an already 'dead' hash, blocking - further use of that particular connection -JM */ + /* We have to check the DYING flag after unlink to prevent + * a race against nf_ct_get_next_corpse() possibly called from + * user context, else we insert an already 'dead' hash, blocking + * further use of that particular connection -JM. + */ + nf_ct_del_from_dying_or_unconfirmed_list(ct); - if (unlikely(nf_ct_is_dying(ct))) { - nf_conntrack_double_unlock(hash, reply_hash); - local_bh_enable(); - return NF_ACCEPT; - } + if (unlikely(nf_ct_is_dying(ct))) + goto out; /* See if there's one in the list already, including reverse: NAT could have grabbed it without realizing, since we're @@ -636,8 +635,6 @@ __nf_conntrack_confirm(struct sk_buff *skb) zone == nf_ct_zone(nf_ct_tuplehash_to_ctrack(h))) goto out; - nf_ct_del_from_dying_or_unconfirmed_list(ct); - /* Timer relative to confirmation time, not original setting time, otherwise we'd get timer wrap in weird delay cases. */ @@ -673,6 +670,7 @@ __nf_conntrack_confirm(struct sk_buff *skb) return NF_ACCEPT; out: + nf_ct_add_to_dying_list(ct); nf_conntrack_double_unlock(hash, reply_hash); NF_CT_STAT_INC(net, insert_failed); local_bh_enable();
Commit 5195c14c8b27c ("netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse") aimed to resolve the race condition between the confirmation (packet path) and the flush command (from control plane). However, it introduced a crash when several packets race to add a new conntrack, which seems easier to reproduce when nf_queue is in place. Fix this race, in __nf_conntrack_confirm(), by removing the CT from unconfirmed list before checking the DYING bit. In case race occured, re-add the CT to the dying list This patch also changes the verdict from NF_ACCEPT to NF_DROP when we lose race. Basically, the confirmation happens for the first packet that we see in a flow. If you just invoked conntrack -F once (which should be the common case), then this is likely to be the first packet of the flow (unless you already called flush anytime soon in the past). This should be hard to trigger, but better drop this packet, otherwise we leave things in inconsistent state since the destination will likely reply to this packet, but it will find no conntrack, unless the origin retransmits. The change of the verdict has been discussed in: https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- Note: I will check and re-test this again tomorrow morning with fresh mind, compile tested only. net/netfilter/nf_conntrack_core.c | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-)