Message ID: 4A3A6143.3040607@gmail.com
State: Not Applicable, archived
Delegated to: David Miller
Eric Dumazet wrote:
> In my own analysis, I found death_by_timeout() might be problematic
> with RCU and lockless lookups.
>
> static void death_by_timeout(unsigned long ul_conntrack)
> {
> 	struct nf_conn *ct = (void *)ul_conntrack;
>
> 	if (!test_bit(IPS_DYING_BIT, &ct->status) &&
> 	    unlikely(nf_conntrack_event(IPCT_DESTROY, ct) < 0)) {
> 		/* destroy event was not delivered */
> 		nf_ct_delete_from_lists(ct);
> 		<< HERE >>
> 		nf_ct_insert_dying_list(ct);
> 		return;
> 	}
> 	set_bit(IPS_DYING_BIT, &ct->status);
> 	nf_ct_delete_from_lists(ct);
> 	nf_ct_put(ct);
> }
>
> We delete ct from a list and insert it in a new list.
>
> I believe a reader could *catch* ct while doing a lookup and miss the
> end of its chain. (The nulls algo checks the null value at the end of a
> lookup and can decide to restart the lookup if the null value is not
> the expected one.)
>
> We need to change nf_conntrack_init_net() and use different "null"
> values, guaranteed not to be used in regular lists.

Good catch. This is a new bug, but it shouldn't matter in this case since
nf_conntrack_event() can't fail unless you have a userspace listener that
makes use of reliable delivery, which I think hasn't even been released yet.

> Patch follows :

Looks good. If you send me a Signed-off-by: I'll apply it.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Patrick McHardy wrote:
> Eric Dumazet wrote:
>> In my own analysis, I found death_by_timeout() might be problematic
>> with RCU and lockless lookups.
>>
>> static void death_by_timeout(unsigned long ul_conntrack)
>> {
>> 	struct nf_conn *ct = (void *)ul_conntrack;
>>
>> 	if (!test_bit(IPS_DYING_BIT, &ct->status) &&
>> 	    unlikely(nf_conntrack_event(IPCT_DESTROY, ct) < 0)) {
>> 		/* destroy event was not delivered */
>> 		nf_ct_delete_from_lists(ct);
>> 		<< HERE >>
>> 		nf_ct_insert_dying_list(ct);
>> 		return;
>> 	}
>> 	set_bit(IPS_DYING_BIT, &ct->status);
>> 	nf_ct_delete_from_lists(ct);
>> 	nf_ct_put(ct);
>> }
>>
>> We delete ct from a list and insert it in a new list.
>>
>> I believe a reader could *catch* ct while doing a lookup and miss the
>> end of its chain. (The nulls algo checks the null value at the end of a
>> lookup and can decide to restart the lookup if the null value is not
>> the expected one.)
>>
>> We need to change nf_conntrack_init_net() and use different "null"
>> values, guaranteed not to be used in regular lists.
>
> Good catch. This is a new bug, but it shouldn't matter in this case
> since nf_conntrack_event() can't fail unless you have a userspace
> listener that makes use of reliable delivery, which I think hasn't
> even been released yet.

Indeed. I didn't include user-space support for this in my tree yet, so
this should not be a problem. Thanks for the catch anyway!
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..5276a2d 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1267,13 +1267,19 @@ err_cache:
 	return ret;
 }
 
+/*
+ * We need to use special "null" values, not used in hash table
+ */
+#define UNCONFIRMED_NULLS_VAL	((1<<30)+0)
+#define DYING_NULLS_VAL		((1<<30)+1)
+
 static int nf_conntrack_init_net(struct net *net)
 {
 	int ret;
 
 	atomic_set(&net->ct.count, 0);
-	INIT_HLIST_NULLS_HEAD(&net->ct.unconfirmed, 0);
-	INIT_HLIST_NULLS_HEAD(&net->ct.dying, 0);
+	INIT_HLIST_NULLS_HEAD(&net->ct.unconfirmed, UNCONFIRMED_NULLS_VAL);
+	INIT_HLIST_NULLS_HEAD(&net->ct.dying, DYING_NULLS_VAL);
 	net->ct.stat = alloc_percpu(struct ip_conntrack_stat);
 	if (!net->ct.stat) {
 		ret = -ENOMEM;