Message ID | 1395060178-11833-37-git-send-email-pablo@netfilter.org
---|---
State | Awaiting Upstream
From: Pablo Neira Ayuso
> +#define CONNLIMIT_SLOTS 256
> +#define CONNLIMIT_LOCK_SLOTS 32

You might want to make these explicitly unsigned to ensure
the divisions are unsigned (so can be done with a simple mask).

	David

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2014-03-17 at 13:42 +0100, Pablo Neira Ayuso wrote:
> From: Florian Westphal <fw@strlen.de>
>
> connlimit currently suffers from spinlock contention, example for
> 4-core system with rps enabled:

> +#define CONNLIMIT_SLOTS 256
> +#define CONNLIMIT_LOCK_SLOTS 32

32 spinlocks use 2 cache lines (assuming 4 bytes per spinlock, and 64
byte cache lines).

So I guess this probably should be increased to have less false sharing.

Note: This can be done later, I do not want to block this patch series at
all!

I believe this hash table of spinlocks could be global, not in each
struct xt_connlimit_data.
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2014-03-17 at 13:42 +0100, Pablo Neira Ayuso wrote:
> > From: Florian Westphal <fw@strlen.de>
> >
> > connlimit currently suffers from spinlock contention, example for
> > 4-core system with rps enabled:
>
> > +#define CONNLIMIT_SLOTS 256
> > +#define CONNLIMIT_LOCK_SLOTS 32
>
> 32 spinlocks use 2 cache lines (assuming 4 bytes per spinlock, and 64
> byte cache lines)
>
> So I guess this probably should be increased to have less false sharing.

True, Jesper pointed out the same thing to me.

> I believe this hash table of spinlocks could be global, not in each
> struct xt_connlimit_data.

Good point. Indeed, this can be global.

I did not increase it since more locks than tree slots is illegal (we need
exclusive access to each rbtree at this time). I guess we could align it,
or increase the number of lock and rbtree slots (after moving the lock
slots out of connlimit data).

Thanks for the heads-up Eric, I'll see about addressing this later this
week if no one beats me to it.
David Laight <David.Laight@ACULAB.COM> wrote:
> From: Pablo Neira Ayuso
> > +#define CONNLIMIT_SLOTS 256
> > +#define CONNLIMIT_LOCK_SLOTS 32
>
> You might want to make these explicitly unsigned to ensure
> the divisions are unsigned (so can be done with a simple mask).

I can do this but I checked that at least on my machine gcc already
emits a 'add' instruction to reduce the hash value.
From: Florian Westphal
> David Laight <David.Laight@ACULAB.COM> wrote:
> > From: Pablo Neira Ayuso
> > > +#define CONNLIMIT_SLOTS 256
> > > +#define CONNLIMIT_LOCK_SLOTS 32
> >
> > You might want to make these explicitly unsigned to ensure
> > the divisions are unsigned (so can be done with a simple mask).
>
> I can do this but I checked that at least on my machine gcc already
> emits a 'add' instruction to reduce the hash value.

I think you meant 'and' :-)

Yes, the hash itself is probably unsigned, so it forces an unsigned
modulus. It just seemed better to force it from the other argument as
well.

	David
On Mon, 17 Mar 2014 07:00:08 -0700 Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On Mon, 2014-03-17 at 13:42 +0100, Pablo Neira Ayuso wrote:
> > From: Florian Westphal <fw@strlen.de>
> >
> > connlimit currently suffers from spinlock contention, example for
> > 4-core system with rps enabled:
>
> > +#define CONNLIMIT_SLOTS 256
> > +#define CONNLIMIT_LOCK_SLOTS 32
>
> 32 spinlocks use 2 cache lines (assuming 4 bytes per spinlock, and 64
> byte cache lines)

Hehe, I actually also pointed this out during my internal review, but
we never got around to fixing this.

> So I guess this probably should be increased to have less false sharing.
>
> Note: This can be done later, I do not want to block this patch series at
> all!

Yes, let's fix it up later.
On Tue, 2014-03-18 at 14:46 +0100, Jesper Dangaard Brouer wrote:
> Hehe, I actually also pointed this out during my internal review, but
> we never got around to fixing this.

BTW, sizeof(spinlock_t) can be 2 bytes if NR_CPUS is below 128 on x86,
so the whole hash table fits in a single cache line.
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index a8eaabb..ad290cc 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -31,6 +31,9 @@
 #include <net/netfilter/nf_conntrack_tuple.h>
 #include <net/netfilter/nf_conntrack_zones.h>
 
+#define CONNLIMIT_SLOTS		256
+#define CONNLIMIT_LOCK_SLOTS	32
+
 /* we will save the tuples of all connections we care about */
 struct xt_connlimit_conn {
 	struct hlist_node		node;
@@ -39,8 +42,8 @@ struct xt_connlimit_conn {
 };
 
 struct xt_connlimit_data {
-	struct hlist_head	iphash[256];
-	spinlock_t		lock;
+	struct hlist_head	iphash[CONNLIMIT_SLOTS];
+	spinlock_t		locks[CONNLIMIT_LOCK_SLOTS];
 };
 
 static u_int32_t connlimit_rnd __read_mostly;
@@ -48,7 +51,8 @@ static struct kmem_cache *connlimit_conn_cachep __read_mostly;
 
 static inline unsigned int connlimit_iphash(__be32 addr)
 {
-	return jhash_1word((__force __u32)addr, connlimit_rnd) & 0xFF;
+	return jhash_1word((__force __u32)addr,
+			   connlimit_rnd) % CONNLIMIT_SLOTS;
 }
 
 static inline unsigned int
@@ -61,7 +65,8 @@ connlimit_iphash6(const union nf_inet_addr *addr,
 	for (i = 0; i < ARRAY_SIZE(addr->ip6); ++i)
 		res.ip6[i] = addr->ip6[i] & mask->ip6[i];
 
-	return jhash2((u32 *)res.ip6, ARRAY_SIZE(res.ip6), connlimit_rnd) & 0xFF;
+	return jhash2((u32 *)res.ip6, ARRAY_SIZE(res.ip6),
+		      connlimit_rnd) % CONNLIMIT_SLOTS;
 }
 
 static inline bool already_closed(const struct nf_conn *conn)
@@ -183,7 +188,7 @@ static int count_them(struct net *net,
 
 	hhead = &data->iphash[hash];
 
-	spin_lock_bh(&data->lock);
+	spin_lock_bh(&data->locks[hash % CONNLIMIT_LOCK_SLOTS]);
 	count = count_hlist(net, hhead, tuple, addr, mask, family, &addit);
 	if (addit) {
 		if (add_hlist(hhead, tuple, addr))
@@ -191,7 +196,7 @@ static int count_them(struct net *net,
 		else
 			count = -ENOMEM;
 	}
-	spin_unlock_bh(&data->lock);
+	spin_unlock_bh(&data->locks[hash % CONNLIMIT_LOCK_SLOTS]);
 
 	return count;
 }
@@ -227,7 +232,6 @@ connlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
 
 	connections = count_them(net, info->data, tuple_ptr, &addr,
 	                         &info->mask, par->family);
-
 	if (connections < 0)
 		/* kmalloc failed, drop it entirely */
 		goto hotdrop;
@@ -268,7 +272,9 @@ static int connlimit_mt_check(const struct xt_mtchk_param *par)
 		return -ENOMEM;
 	}
 
-	spin_lock_init(&info->data->lock);
+	for (i = 0; i < ARRAY_SIZE(info->data->locks); ++i)
+		spin_lock_init(&info->data->locks[i]);
+
 	for (i = 0; i < ARRAY_SIZE(info->data->iphash); ++i)
 		INIT_HLIST_HEAD(&info->data->iphash[i]);
 
@@ -309,6 +315,10 @@ static struct xt_match connlimit_mt_reg __read_mostly = {
 static int __init connlimit_mt_init(void)
 {
 	int ret;
+
+	BUILD_BUG_ON(CONNLIMIT_LOCK_SLOTS > CONNLIMIT_SLOTS);
+	BUILD_BUG_ON((CONNLIMIT_SLOTS % CONNLIMIT_LOCK_SLOTS) != 0);
+
 	connlimit_conn_cachep = kmem_cache_create("xt_connlimit_conn",
 					   sizeof(struct xt_connlimit_conn),
 					   0, 0, NULL);